~/posts/when-ruby-graphql-requests-are-slow

When Ruby GraphQL requests get slow

GraphQL-Ruby is unusually good at hiding allocation problems behind code that looks harmless. You can fix the obvious N+1 issues and other database gotchas and, after profiling, still see ugly p99 spikes. Often the root of a slow query isn’t the query at all: the request allocates a pile of short-lived objects that force Ruby’s GC to stop and clean up at exactly the wrong time.

The real problem

A single field in GraphQL-Ruby can touch some or all of the following:

  1. AST objects
  2. Authorization checks
  3. Field wrappers
  4. Application models
  5. Presenters
  6. JSON hashes

All before anything leaves the process.

None of that has to be individually slow to become expensive as a whole. The real problem is the object churn in aggregate. GraphQL-Ruby wraps a lot of structure around your business logic, and your application code adds another layer on top. A list field that returns 500 nodes can mean instantiating 500 Active Record objects, then 500 GraphQL objects, then a pile of nested arrays and strings as the child fields resolve.

The difficult part with debugging is that short-lived garbage doesn’t leave a nice signature you can point to. If each resolver is just a little wasteful, no single method dominates the time profile. The request just slowly creates enough heap pressure that the tail gets ugly. Each line item seems fine, but the whole request shape is not.
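You can get a first feel for this churn without any gems. Here is a rough sketch using GC.stat to count objects allocated inside a block; the helper name and the two workloads are made up for illustration:

```ruby
# Count objects allocated while a block runs, using GC.stat.
# Crude compared to a real profiler, but dependency-free and cheap.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# A resolver-ish workload: at least one fresh string per "node".
churny = allocations_during { Array.new(500) { |i| "node-#{i}".upcase } }

# The same loop shape without per-node object building.
frugal = allocations_during { Array.new(500) { |i| i * 2 } }
```

Neither number is exact, since the VM allocates a little on its own, but the ratio between two code paths is usually telling.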

Profiling the schema

The question you want to answer isn’t “which method was hottest” but “which code path allocated the most stuff while this exact GraphQL operation ran.”

MemoryProfiler is a great tool for this. It gives you allocation counts and retained-memory summaries in a form that’s easy to make sense of. It isn’t something you want to leave on for normal traffic, though: allocation profiling has serious overhead. The good news is that you can gate it behind an explicit opt-in and an authorization check, so diagnostic data only comes back on a response when someone who should see it asks for it.

Here’s an example for a GraphQL controller. It wraps execution only when an admin asks for it, runs MemoryProfiler.report around the real schema call, and appends a small summary to extensions.

class GraphqlController < ApplicationController
  def execute
    result = if should_profile_allocations?
      profile_allocations
    else
      AppSchema.execute(
        params[:query],
        variables: ensure_hash(params[:variables]),
        context: graphql_context,
        operation_name: params[:operationName]
      )
    end
    render json: result
  end

  private

  def should_profile_allocations?
    current_user.admin? && params[:profile_allocations].present?
  end

  def profile_allocations
    query = params[:query]
    variables = ensure_hash(params[:variables])
    context = graphql_context
    operation_name = params[:operationName]
    graphql_result = nil

    report = MemoryProfiler.report do
      graphql_result = AppSchema.execute(
        query,
        variables: variables,
        context: context,
        operation_name: operation_name
      )
    end

    # Schema.execute returns a GraphQL::Query::Result with string keys;
    # convert it to a plain hash before merging in the profile summary.
    response = graphql_result.to_h
    response["extensions"] ||= {}
    response["extensions"]["allocation_profile"] = {
      total_allocated: report.total_allocated,
      total_retained: report.total_retained,
      allocated_memory: report.total_allocated_memsize,
      retained_memory: report.total_retained_memsize,
      top_locations: report.allocated_memory_by_location.first(10),
      top_classes: report.allocated_objects_by_class.first(10)
    }
    response
  end
end

The full report can expose internals you don’t want floating around outside a debugging session, so keep the payload small. A summary in extensions is usually enough to point you at the hottest locations and object classes.

Making sense of the report

Start with total allocated objects and total allocated memory for the request. Then look at retained memory, but treat it as a signal, not the source of truth. Usually the first win is not subtle. One file, serializer, or resolver path is often responsible for an unnecessary amount of churn.

Here’s what a report for a slow request might look like:

memory_profile_report.txt
Total allocated: 146.82 MB (1482611 objects)
Total retained: 9.74 MB (37412 objects)
Allocated memory by location:
app/graphql/types/user_type.rb:41 46.18 MB
graphql/execution/interpreter/runtime/graphql_result.rb:180 12.44 MB
graphql/execution/interpreter/runtime.rb:200 11.97 MB
graphql/execution/interpreter/runtime.rb:201 9.86 MB
graphql/execution/interpreter/runtime.rb:753 9.86 MB
app/graphql/types/user_type.rb:36 8.12 MB
app/graphql/types/user_type.rb:28 5.44 MB
graphql/schema/field.rb:897 2.21 MB
Allocated objects by class:
String 1041228
Array 163904
Hash 119887
GraphQL::Execution::Interpreter::Runtime::GraphQLResultArray 10201
GraphQL::Execution::Interpreter::Runtime::GraphQLResultHash 10201
UserType 10000

The numbers themselves are less important than how you read them:

  1. The first thing I care about is the shape: huge allocated memory but much smaller retained memory. That indicates churn. The request is making Ruby work hard even if it isn’t leaking the whole time.

  2. app/graphql/types/user_type.rb:41 is at the top of the list and probably where the request is reshaping data or building strings per node.

  3. app/graphql/types/user_type.rb:28 being high too is a hint that repeated formatting work is part of the problem.

  4. The GraphQL runtime lines are real, but I would start with the app’s own code first. Framework internals matter less than the allocation patterns you control.

  5. String dominating the class list is another clue. If strings are flagged, look for formatting, reshaping, JSON-like hashes, and repeated value building before you go hunting in the database.

Scenario | What the flamegraph shows | What allocations show | Fix
.all.map(&:name) on many records | Nothing stands out, time is spread thin | Tons of Active Record model objects | Use pluck(:name) to skip the model layer
Resolver builds a hash, then GraphQL reshapes it again | Time split across serializer and resolver code | High object counts in your serializer files | Pick one place to shape the output
Per-node strftime or string formatting | Tiny repeated calls, easy to overlook | Flood of String allocations | Cache or precompute the formatted value

A bad resolver can look simple and still be the problem

The worst offenders are fields that only need a couple of scalar values but load a pile of models anyway. It usually starts as convenience, calling .all or some broad relation, and then child methods reach into each model to build the final response shape.

# before
field :tag_names, [String], null: false

def tag_names
  object.tags.visible.order(:name).all.map(&:name)
end

# after
field :tag_names, [String], null: false

def tag_names
  object.tags.visible.order(:name).pluck(:name)
end

Both of these versions hit the database once. That’s what makes this easy to miss. The query count doesn’t change, and a flamegraph may not show anything dramatic. But the first version instantiates every Tag model, plus whatever those models drag in through attribute wrappers and related associations. Then it throws almost all of it away. The second version asks the database for exactly the values the GraphQL field needs and skips the model layer entirely.
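The difference is easy to demonstrate without a database. A stand-in sketch, where Tag is a plain Struct rather than a real Active Record model, that mirrors what map(&:name) versus pluck(:name) does with the same rows:

```ruby
# A fake "model" row: id, name, plus the baggage a real model would carry.
Tag = Struct.new(:id, :name, :created_at, :metadata)

rows = Array.new(1_000) { |i| [i, "tag-#{i}", nil, {}] }

# Before: materialize a full object per row, read one field, discard the rest.
names_via_models = rows.map { |r| Tag.new(*r) }.map(&:name)

# After: take only the column you need, which is what pluck(:name) does in SQL.
names_via_pluck = rows.map { |r| r[1] }
```

Both produce the same array of names; the first also builds and throws away a thousand structs, and a real Active Record model allocates far more per instance than a Struct does.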

Strings are another offender. Suppose a field formats timestamps for every node in a list.

field :created_at_label, String, null: false

def created_at_label
  object.created_at.utc.strftime("%Y-%m-%d %H:%M:%S UTC")
end

That looks harmless until it runs a few thousand times in one response and the report starts showing a flood of String allocations from formatting code. If the output format is fixed, cache it, precompute it, or move the work somewhere that doesn’t run once per field resolution.
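One way to do that is a small per-request cache keyed by the underlying timestamp, so equal instants are formatted once. A sketch, with a made-up class name; in a real app this might hang off the GraphQL context:

```ruby
# Hypothetical per-request cache: format each distinct timestamp once,
# then hand out the same frozen string to every node that shares it.
class TimestampLabelCache
  def initialize
    @labels = {}
  end

  def label_for(time)
    # Key by epoch seconds so equal instants share one cached string.
    @labels[time.to_i] ||= time.utc.strftime("%Y-%m-%d %H:%M:%S UTC").freeze
  end
end

cache = TimestampLabelCache.new
t = Time.utc(2024, 5, 1, 12, 30, 0)
first  = cache.label_for(t)
second = cache.label_for(t)
```

Here first and second are the same object, so a list of nodes created in the same batch costs one formatted string instead of thousands.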

For list fields, the first question is whether you need full models at all. If the field only returns ids, names, or counts, pluck or a narrower select will skip the model layer entirely.

For serialization-heavy paths, look out for duplicate transformations. A resolver that builds a hash, then later a GraphQL object that re-derives just a part of that hash, is a code smell. Pick one layer for shaping output and keep it there.

If the flamegraph looks flat and the latency is still bad, stop shaving CPU time and start counting objects. The answer is usually not that any one piece of code is slow. It’s that the request as a whole creates too much garbage for Ruby to clean up quietly.