At work, we tried to get all SQL queries for a given transaction to use prepared statements. This was my first time using prepared statements in Rails, and was surprised by what Rails does not prepare.

Let’s dive into what Rails 8.1 prepares, what it does not, and what we might do about it.

Identifying prepared statements

First, we need a way to identify which SQL query runs as a prepared statement. With Postgres, we have it easy: we can use the built-in sql.active_record instrumentation hook and read the :statement_name key.

ActiveSupport::Notifications.subscribe("sql.active_record") do |event|
  next if event.payload[:name] == "SCHEMA"

  if event.payload[:statement_name].nil?
    puts "❌ NOT PREPARED: #{event.payload[:sql]}"
  end
end

For other adapters, Ben Sheldon’s trick of using ActiveRecord::Connection#to_sql_and_binds works for ActiveRecord::Relation, but falls short for methods that do not return relations like #pluck or #count.

State of things

Using the above instrumentation, let’s see which queries get prepared.

Where

#where gets prepared when using a Hash or sending any ?-bound params, but not when sending a plain String, or for IN clauses.

Post.where(title: "Hello world").load
# ↳ ✅ SELECT "posts".* FROM "posts" WHERE "posts"."title" = $1

Post.where("title = ?", "Hello world").load
# ↳ ✅ SELECT "posts".* FROM "posts" WHERE (title = $1)

Post.where("title = 'Hello world'").load
# ↳ ❌ SELECT "posts".* FROM "posts" WHERE (title = 'Hello world')

Post.where("public IS NOT TRUE").load
# ↳ ❌ SELECT "posts".* FROM "posts" WHERE (public IS NOT TRUE)

Post.where(title: %w[foo bar]).load
# ↳ ❌ SELECT "posts".* FROM "posts" WHERE "posts"."title" IN ($1, $2)

Select

#select gets prepared when each argument matches exactly a column name or the table.column format. Anything else does not.

Same logic applies to the various ActiveRecord::Calculations methods (e.g. #sum).

Post.select('id').load
Post.select(:id).load
Post.select('posts.id').load
# ↳ ✅ SELECT "posts"."id" FROM "posts"

Post.select('"posts"."id"').load
# ↳ ❌ SELECT "posts"."id" FROM "posts"

Post.select('id, views').load
# ↳ ❌ SELECT id, views FROM "posts"

Post.select('COALESCE(public, false) AS public').load
# ↳ ❌ SELECT COALESCE(public, false) AS public FROM "posts"

Order

#order gets prepared only when using Symbol or Hash-style col: :direction.

Post.order(:views).load
Post.order(views: :asc).load
# ↳ ✅ SELECT "posts".* FROM "posts" ORDER BY "posts"."views" ASC

Post.order('views').load
# ↳ ❌ SELECT "posts".* FROM "posts" ORDER BY views

Joins

Joins get prepared when using associations, not when using SQL fragments as Strings.

Post.joins(:author).load
# ↳ ✅ SELECT "posts".* FROM "posts" INNER JOIN "authors" ON "authors"."id" = "posts"."author_id"

Post.joins("LEFT JOIN authors ON posts.author_id = authors.id").load
# ↳ ❌ SELECT "posts".* FROM "posts" LEFT JOIN authors ON posts.author_id = authors.id

Subqueries

Subqueries get prepared too, except when using the IN (?) form.

Post.where(author_id: Author.select(:id).where("id > ?", 1)).load
# ↳ ✅ SELECT "posts".* FROM "posts" WHERE "posts"."author_id" IN (SELECT "authors"."id" FROM "authors" WHERE (id > $1))

Post.where("author_id IN (?)", Author.select(:id).where("id > ?", 1)).load
# ↳ ❌ SELECT "posts".* FROM "posts" WHERE (author_id IN (SELECT "authors"."id" FROM "authors" WHERE (id > 1)))

Misc

#any? and #count do not get prepared.

Post.any?
# ↳ ❌ SELECT 1 AS one FROM "posts" LIMIT $1
Post.count
# ↳ ❌ SELECT COUNT(*) FROM "posts"

#distinct is fine.

Post.distinct.pluck(:id)
# ↳ ✅ SELECT DISTINCT "posts"."id" FROM "posts"

But why?

ActiveRecord walks the tree of Arel nodes. Each node type can set the resulting query as not-preparable. SQL string literal and IN / NOT IN are the only offenders:

def visit_Arel_Nodes_SqlLiteral(o, collector)
  collector.preparable = false # <-- the culprit!
  collector.retryable &&= o.retryable
  collector << o.to_s
end

But then again, why? I don’t know for sure, but I bet it’s because Strings can contain arbitrary arguments thus cause the number of prepared statements to balloon. Contrived example: Post.where("id = #{Integer(params[:id])}").

What might we do about it?

Since Rails 7.2 (#51336), Arel.sql accepts a retryable: bool kwarg. Knowing whether a query is retryable is similar to knowing whether it is preparable; a similar implementation would make sense.

Arel.sql helps for the arbitrary String piece, but in order to get to 100% prepared-statement coverage, we still need to handle the IN case. For that, we can leverage the Postgres-specific col = ANY(ARRAY[]) syntax, turning a multi-binds col IN (?, ?, ...) into a single-bind col = ANY(?). There’s an open PR from 2023 that does just that, but it appears stuck.


That’s all I had. Now that I’ve taken the time to research and write all that, I may as well turn it into a pull request. Stay tuned!