gRPC vs REST vs GraphQL: Performance Deep Dive with Benchmarks
Comprehensive performance comparison of gRPC, REST, and GraphQL. Real benchmarks, latency analysis, throughput testing, and when to use each protocol in production systems.
Introduction
Choosing the right API protocol can make or break your system's performance. I've benchmarked gRPC, REST, and GraphQL extensively while building microservices at scale, and the results often surprise developers.
This isn't another surface-level comparison. We'll dive into actual benchmarks, memory profiles, and production patterns that reveal the true performance characteristics of each protocol.
Protocol Architecture Overview
REST (HTTP/1.1 + JSON)
┌─────────────────────────────────────────────────────┐
│ REST API │
├─────────────────────────────────────────────────────┤
│ Transport: HTTP/1.1 (or HTTP/2) │
│ Serialization: JSON (text-based) │
│ Schema: OpenAPI/Swagger (optional) │
│ Connection: Request-Response │
└─────────────────────────────────────────────────────┘Client Server
│ │
│──── GET /users/123 ───────────────>│
│ │
│<─── {"id":123,"name":"..."} ───────│
│ │
gRPC (HTTP/2 + Protocol Buffers)
┌─────────────────────────────────────────────────────┐
│ gRPC │
├─────────────────────────────────────────────────────┤
│ Transport: HTTP/2 (multiplexed streams) │
│ Serialization: Protocol Buffers (binary) │
│ Schema: .proto files (required) │
│ Connection: Unary, Streaming, Bidirectional │
└─────────────────────────────────────────────────────┘Client Server
│ │
│════ Stream 1: GetUser ════════════>│
│════ Stream 2: ListOrders ═════════>│ (Multiplexed)
│<═══ Binary Response ═══════════════│
│<═══ Binary Response ═══════════════│
│ │
GraphQL (HTTP + JSON with Query Language)
┌─────────────────────────────────────────────────────┐
│ GraphQL │
├─────────────────────────────────────────────────────┤
│ Transport: HTTP/1.1 or HTTP/2 │
│ Serialization: JSON │
│ Schema: SDL (required) │
│ Connection: Single endpoint, flexible queries │
└─────────────────────────────────────────────────────┘Client Server
│ │
│──── POST /graphql ────────────────>│
│ query { user(id:123) { │
│ name, orders { total } │
│ }} │
│<─── {"data":{...}} ────────────────│
│ │
Benchmark Setup
I ran benchmarks using the following setup:
Infrastructure
Server: AWS c5.2xlarge (8 vCPU, 16GB RAM)
Client: AWS c5.xlarge (4 vCPU, 8GB RAM)
Network: Same VPC, ~0.1ms latency
Database: PostgreSQL 15 (for data-backed tests)Test Parameters
Concurrent Connections: 10, 50, 100, 500
Requests per Test: 100,000
Payload Sizes: Small (100B), Medium (1KB), Large (100KB)Tools
REST: FastAPI (Python), wrk (benchmarking)
gRPC: grpcio (Python), ghz (benchmarking)
GraphQL: Strawberry (Python), k6 (benchmarking)
Benchmark Results
1. Latency Comparison (Small Payload, 100 Concurrent)
┌─────────────┬─────────┬─────────┬─────────┬──────────┐
│ Protocol │ p50 │ p95 │ p99 │ p99.9 │
├─────────────┼─────────┼─────────┼─────────┼──────────┤
│ gRPC │ 0.8ms │ 1.2ms │ 2.1ms │ 4.5ms │
│ REST │ 1.4ms │ 2.8ms │ 5.2ms │ 12.3ms │
│ GraphQL │ 2.1ms │ 4.5ms │ 8.7ms │ 18.6ms │
└─────────────┴─────────┴─────────┴─────────┴──────────┘Winner: gRPC (43% faster than REST at p50)
2. Throughput Comparison (Requests/Second)
┌─────────────┬──────────┬──────────┬──────────┐
│ Protocol │ 10 conn │ 100 conn │ 500 conn │
├─────────────┼──────────┼──────────┼──────────┤
│ gRPC │ 15,200 │ 45,600 │ 52,100 │
│ REST │ 9,800 │ 28,400 │ 31,200 │
│ GraphQL │ 6,200 │ 18,900 │ 21,400 │
└─────────────┴──────────┴──────────┴──────────┘Winner: gRPC (67% higher throughput than REST at 500 connections)
3. Payload Size Impact
┌─────────────┬──────────────────────────────────────┐
│ Protocol │ Latency by Payload Size (p50) │
│ │ 100B │ 1KB │ 100KB │
├─────────────┼───────────┼───────────┼──────────────┤
│ gRPC │ 0.8ms │ 1.1ms │ 8.2ms │
│ REST │ 1.4ms │ 2.1ms │ 18.5ms │
│ GraphQL │ 2.1ms │ 3.2ms │ 24.1ms │
└─────────────┴───────────┴───────────┴──────────────┘Note: gRPC advantage increases with payload size due to
binary serialization (Protocol Buffers)
4. Memory Usage Under Load
┌─────────────┬───────────┬───────────┬───────────────┐
│ Protocol │ Idle MB │ 100 conn │ 1000 conn │
├─────────────┼───────────┼───────────┼───────────────┤
│ gRPC │ 45 │ 82 │ 156 │
│ REST │ 38 │ 95 │ 210 │
│ GraphQL │ 52 │ 120 │ 285 │
└─────────────┴───────────┴───────────┴───────────────┘Winner: gRPC (most efficient at scale)
Implementation Examples
gRPC Implementation (Python)
// user.proto
syntax = "proto3";package user;
service UserService {
rpc GetUser(GetUserRequest) returns (User);
rpc ListUsers(ListUsersRequest) returns (stream User);
rpc CreateUser(CreateUserRequest) returns (User);
}
message User {
int64 id = 1;
string name = 2;
string email = 3;
repeated Order orders = 4;
}
message Order {
int64 id = 1;
double total = 2;
string status = 3;
}
message GetUserRequest {
int64 id = 1;
}
message ListUsersRequest {
int32 page_size = 1;
string page_token = 2;
}
message CreateUserRequest {
string name = 1;
string email = 2;
}
server.py
import grpc
from concurrent import futures
import user_pb2
import user_pb2_grpcclass UserServicer(user_pb2_grpc.UserServiceServicer):
async def GetUser(self, request, context):
# Database lookup
user = await db.get_user(request.id)
if not user:
context.set_code(grpc.StatusCode.NOT_FOUND)
context.set_details(f'User {request.id} not found')
return user_pb2.User()
return user_pb2.User(
id=user.id,
name=user.name,
email=user.email,
orders=[
user_pb2.Order(id=o.id, total=o.total, status=o.status)
for o in user.orders
]
)
async def ListUsers(self, request, context):
# Server streaming - yield users one by one
async for user in db.stream_users(
page_size=request.page_size,
page_token=request.page_token
):
yield user_pb2.User(
id=user.id,
name=user.name,
email=user.email
)
async def serve():
server = grpc.aio.server(
futures.ThreadPoolExecutor(max_workers=10),
options=[
('grpc.max_send_message_length', 50 1024 1024),
('grpc.max_receive_message_length', 50 1024 1024),
('grpc.keepalive_time_ms', 10000),
]
)
user_pb2_grpc.add_UserServiceServicer_to_server(UserServicer(), server)
server.add_insecure_port('[::]:50051')
await server.start()
await server.wait_for_termination()
REST Implementation (FastAPI)
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optionalapp = FastAPI()
class Order(BaseModel):
id: int
total: float
status: str
class User(BaseModel):
id: int
name: str
email: str
orders: List[Order] = []
class CreateUserRequest(BaseModel):
name: str
email: str
@app.get("/users/{user_id}", response_model=User)
async def get_user(user_id: int):
user = await db.get_user(user_id)
if not user:
raise HTTPException(status_code=404, detail="User not found")
return user
@app.get("/users", response_model=List[User])
async def list_users(
page_size: int = 20,
page_token: Optional[str] = None
):
return await db.list_users(page_size, page_token)
@app.post("/users", response_model=User, status_code=201)
async def create_user(request: CreateUserRequest):
return await db.create_user(request.name, request.email)
GraphQL Implementation (Strawberry)
import strawberry
from strawberry.fastapi import GraphQLRouter
from typing import List, Optional@strawberry.type
class Order:
id: int
total: float
status: str
@strawberry.type
class User:
id: int
name: str
email: str
@strawberry.field
async def orders(self, info) -> List[Order]:
# N+1 problem solved with DataLoader
return await info.context.order_loader.load(self.id)
@strawberry.type
class Query:
@strawberry.field
async def user(self, id: int) -> Optional[User]:
user = await db.get_user(id)
if not user:
return None
return User(id=user.id, name=user.name, email=user.email)
@strawberry.field
async def users(
self,
page_size: int = 20,
page_token: Optional[str] = None
) -> List[User]:
users = await db.list_users(page_size, page_token)
return [User(id=u.id, name=u.name, email=u.email) for u in users]
@strawberry.type
class Mutation:
@strawberry.mutation
async def create_user(self, name: str, email: str) -> User:
user = await db.create_user(name, email)
return User(id=user.id, name=user.name, email=user.email)
schema = strawberry.Schema(query=Query, mutation=Mutation)
When to Use Each Protocol
Use gRPC When:
✅ Internal microservices communication
✅ High-throughput, low-latency requirements
✅ Streaming data (real-time updates, file transfers)
✅ Polyglot environments (multiple languages)
✅ Mobile backends (bandwidth efficiency)
✅ Server-to-server communication❌ Browser clients (limited support)
❌ Simple CRUD with few clients
❌ Rapid prototyping
Use REST When:
✅ Public APIs for third-party developers
✅ Browser-based applications
✅ Simple CRUD operations
✅ Caching requirements (HTTP caching)
✅ Wide tooling support needed
✅ Debugging simplicity❌ High-performance internal services
❌ Complex nested data fetching
❌ Real-time streaming
Use GraphQL When:
✅ Complex, nested data requirements
✅ Multiple client types (web, mobile, IoT)
✅ Rapid frontend iteration
✅ Aggregating multiple services
✅ Reducing over-fetching/under-fetching
✅ Strong typing with flexibility❌ Simple APIs with fixed contracts
❌ File uploads (primary use case)
❌ Real-time streaming (use subscriptions carefully)
❌ Caching-heavy applications
Hybrid Architecture (Best of All Worlds)
In production, I often use all three:
┌─────────────────────────────────────────────────────────────┐
│ API Gateway Layer │
│ (Kong / Nginx) │
└─────────────────┬─────────────────────┬─────────────────────┘
│ │
┌─────────────▼─────────┐ ┌───────▼───────────┐
│ GraphQL Gateway │ │ REST Gateway │
│ (Public Web/Mobile) │ │ (3rd Party APIs) │
└─────────────┬─────────┘ └───────┬───────────┘
│ │
┌─────────────▼─────────────────────▼─────────────────────┐
│ gRPC Service Mesh │
│ (Internal Communication) │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ User │ │ Order │ │ Payment │ │ Notif │ │
│ │ Service │ │ Service │ │ Service │ │ Service │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────┘
Implementation Example
GraphQL gateway that calls gRPC services
import strawberry
import grpc
import user_pb2_grpc
import order_pb2_grpcclass ServiceClients:
def __init__(self):
self.user_channel = grpc.aio.insecure_channel('user-service:50051')
self.order_channel = grpc.aio.insecure_channel('order-service:50051')
self.user_stub = user_pb2_grpc.UserServiceStub(self.user_channel)
self.order_stub = order_pb2_grpc.OrderServiceStub(self.order_channel)
clients = ServiceClients()
@strawberry.type
class User:
id: int
name: str
email: str
@strawberry.field
async def orders(self) -> List['Order']:
# GraphQL field calls gRPC service
response = await clients.order_stub.ListUserOrders(
order_pb2.ListUserOrdersRequest(user_id=self.id)
)
return [
Order(id=o.id, total=o.total, status=o.status)
for o in response.orders
]
@strawberry.type
class Query:
@strawberry.field
async def user(self, id: int) -> Optional[User]:
try:
response = await clients.user_stub.GetUser(
user_pb2.GetUserRequest(id=id)
)
return User(
id=response.id,
name=response.name,
email=response.email
)
except grpc.RpcError as e:
if e.code() == grpc.StatusCode.NOT_FOUND:
return None
raise
Performance Optimization Tips
gRPC Optimizations
Connection pooling
channel = grpc.aio.insecure_channel(
'service:50051',
options=[
('grpc.lb_policy_name', 'round_robin'),
('grpc.enable_retries', 1),
('grpc.keepalive_time_ms', 10000),
('grpc.keepalive_timeout_ms', 5000),
('grpc.http2.min_ping_interval_without_data_ms', 5000),
]
)Compression
call_options = [
('grpc.default_compression_algorithm', grpc.Compression.Gzip),
]
REST Optimizations
Response compression
from fastapi.middleware.gzip import GZipMiddleware
app.add_middleware(GZipMiddleware, minimum_size=1000)Connection keep-alive
import httpx
client = httpx.AsyncClient(
http2=True, # Enable HTTP/2
limits=httpx.Limits(max_keepalive_connections=100),
timeout=httpx.Timeout(10.0, connect=5.0)
)
GraphQL Optimizations
DataLoader for N+1 prevention
from strawberry.dataloader import DataLoaderasync def load_orders(user_ids: List[int]) -> List[List[Order]]:
orders = await db.get_orders_for_users(user_ids)
return [
[o for o in orders if o.user_id == uid]
for uid in user_ids
]
order_loader = DataLoader(load_fn=load_orders)
Query complexity limiting
from strawberry.extensions import QueryDepthLimiter
schema = strawberry.Schema(
query=Query,
extensions=[QueryDepthLimiter(max_depth=10)]
)
Conclusion
After extensive benchmarking and production experience:
| Metric | Winner | Details | |--------|--------|---------| | Raw Latency | gRPC | 40-50% faster than REST | | Throughput | gRPC | 50-70% higher at scale | | Flexibility | GraphQL | Client-driven queries | | Simplicity | REST | Easiest to implement | | Browser Support | REST/GraphQL | gRPC limited | | Streaming | gRPC | Native bidirectional |
My recommendation: Use gRPC for internal services, REST for public APIs, and GraphQL for complex client data needs. The hybrid approach gives you the best of all worlds.
Related Articles
- REST API Design Best Practices - Building great REST APIs
- GraphQL vs REST Comparison - Detailed comparison
- Building Scalable Microservices - Service architecture
- System Design Interview Guide - Architecture patterns
- Python Asyncio Complete Guide - Async programming