GraphQL Overview, Part 3: The Infrastructure and Summary

Published in SoftwareMill Tech Blog, Feb 22, 2021

Train station hall — photo by Rick Ligthelm

This series was co-authored by Kamil Kupczyński, Piotr Jasiak, and Sebastian Rabiej.

This is the third part of our series about GraphQL. We’ll cover caching, load balancing, and API gateways, and then summarize the whole series.
Previous parts:

Caching

Caching GraphQL responses is much more difficult than caching the responses of a regular REST API because all requests go to the same endpoint. They differ only in payload, so the URL alone cannot serve as a cache key. There are three main concerns:

Duplication: more than one cache key stores the same result. GraphQL queries may differ in syntax yet produce the same response (e.g. properties may be listed in a different order, whitespace may differ, or several queries may be combined in one request).
Overlapping: portions of the cached responses are almost identical, with only minimal differences. GraphQL queries may request different properties, and one query may request a subset of the properties of another. The result is two cache entries containing almost the same data.
Cache TTL: a TTL reduces traffic while keeping cached data fresh. Problems appear when wrong TTL values are used or the TTL is ignored. Running multiple queries in one request may cause longer TTLs to be ignored just because those queries were run together with queries that have shorter TTLs. This reduces the cache-hit ratio and makes caching less efficient.
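The TTL concern can be illustrated with a small sketch. It assumes that a combined response is stored under a single cache entry and that this entry has to expire as soon as its shortest-lived part does. The query names and TTL values below are hypothetical and exist only for illustration.

```python
# Hypothetical per-query TTLs in seconds; names and values are illustrative only.
QUERY_TTLS = {"countries": 86_400, "stockPrice": 5}


def combined_ttl(queries):
    """TTL of one cache entry holding the whole response: the shortest TTL wins."""
    return min(QUERY_TTLS[name] for name in queries)


def per_query_ttls(queries):
    """TTLs when every query is cached under its own entry."""
    return {name: QUERY_TTLS[name] for name in queries}


if __name__ == "__main__":
    request = ["countries", "stockPrice"]
    print(combined_ttl(request))    # the day-long TTL of "countries" is lost
    print(per_query_ttls(request))  # each query keeps its own TTL
```

In this sketch the rarely changing `countries` data is evicted every few seconds only because it travelled together with `stockPrice`, which is the drop in cache-hit ratio described above.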

To solve the problems described above, an efficient GraphQL cache must be able to sanitize and normalize the cache key, i.e. the GraphQL call, by normalizing whitespace and ordering recursively. This is a much more complicated task than implementing a cache server for a REST API, which is straightforward and does not require parsing or normalizing the request payload. Caching might be provided by the GraphQL server (e.g. Apollo Server implements server-side caching) or by a CDN provider (e.g. Akamai), but it requires “computation at the edge” to derive the cache key.
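A minimal sketch of such cache-key normalization is shown below. It is not how Apollo Server or any CDN does it. It assumes a tiny subset of GraphQL (anonymous queries made of plain field names and nested selection sets, with no arguments, aliases, fragments, or variables), and the function names are hypothetical. A real cache would rely on a full GraphQL parser.

```python
import hashlib
import re

# Tokens of the supported subset: braces and names. Whitespace and commas are skipped.
TOKEN = re.compile(r"[{}]|[A-Za-z_][A-Za-z0-9_]*")


def parse_selection(tokens, pos):
    """Parse the tokens following an opening '{' into a {field: sub-selection} dict."""
    fields = {}
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "}":
            return fields, pos + 1
        if tok == "{":
            raise ValueError("selection set without a field name")
        pos += 1
        if pos < len(tokens) and tokens[pos] == "{":
            fields[tok], pos = parse_selection(tokens, pos + 1)
        else:
            fields.setdefault(tok, None)
    raise ValueError("unbalanced braces")


def render(fields):
    """Print fields in sorted order, recursively, with single spaces between them."""
    parts = []
    for name in sorted(fields):
        sub = fields[name]
        parts.append(name if sub is None else f"{name}{{{render(sub)}}}")
    return " ".join(parts)


def normalize(query):
    """Return the canonical text of a query in the supported subset."""
    tokens = TOKEN.findall(query)
    if tokens and tokens[0] == "query":  # optional keyword of an anonymous query
        tokens = tokens[1:]
    if not tokens or tokens[0] != "{":
        raise ValueError("unsupported query shape")
    fields, pos = parse_selection(tokens, 1)
    if pos != len(tokens):
        raise ValueError("trailing tokens after the query")
    return "{" + render(fields) + "}"


def cache_key(query):
    """Hash the canonical form so that equivalent queries share one cache entry."""
    return hashlib.sha256(normalize(query).encode("utf-8")).hexdigest()
```

With this sketch, `{ user { name id } posts }` and `{posts user{id,name}}` normalize to the same text and therefore map to the same key, which addresses the duplication concern. It does nothing for overlapping queries, which need a field-level cache rather than a response-level one.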

Load balancing

Due to the nature of GraphQL, load balancing incoming requests is more complicated than it is for a REST API. All requests share the same URL/path and differ only in payload, so it is impossible to forward requests for a given query to specific nodes without analyzing the content of the request (e.g. when there is a query/resource that we want to process on a designated group of nodes). GraphQL queries may have different content yet ask for the same data, and a single request may contain multiple queries.
However, if you don’t need such sophisticated load balancing, you can still use Nginx or any other load balancer that you would use with a REST API.
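The content-based routing discussed in this section could look roughly like the sketch below. It assumes the common convention of a JSON request body with a `query` property, and it reads only top-level field names with a naive tokenizer that ignores arguments and string literals. The pool names and the `reports` field are hypothetical.

```python
import json
import re

TOKEN = re.compile(r"[{}]|[A-Za-z_][A-Za-z0-9_]*")

# Hypothetical mapping from top-level fields to designated node pools.
POOLS = {"reports": "analytics-nodes"}
DEFAULT_POOL = "general-nodes"


def top_level_fields(query):
    """Collect the names that appear directly inside the outermost braces."""
    depth = 0
    fields = []
    for tok in TOKEN.findall(query):
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
        elif depth == 1:
            fields.append(tok)
    return fields


def choose_pool(body):
    """Pick a node pool for a request body; fall back to the default when unsure."""
    query = json.loads(body)["query"]
    pools = {POOLS.get(field, DEFAULT_POOL) for field in top_level_fields(query)}
    # A request that spans several pools cannot be pinned to a single one.
    return pools.pop() if len(pools) == 1 else DEFAULT_POOL
```

Note that the balancer has to parse every request body before it can route it, which is exactly the extra work a path-based REST setup avoids.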

API Gateway

An API gateway is a common pattern in distributed system design. It is a component that proxies client calls to the services that handle them. For a REST API, the gateway is again quite a simple proxy that just forwards HTTP calls to services based on the URL path. With GraphQL that is impossible, so the API gateway has to be a server that parses the request payload and forwards calls to the right services. The gateway must also combine the responses from the services and return a single response to the client. In simple cases, such as a request with a single query, this might be trivial. When there are multiple queries within a single request, the gateway may have to forward calls to multiple services and then combine their responses into the one response that is sent to the client.
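The fan-out and merge step can be sketched as follows. This is a simplified illustration, not how Hasura or Apollo Federation work internally. It assumes the request has already been parsed into a list of top-level field names, and it replaces the downstream services with hypothetical in-process functions that return canned data.

```python
def users_service(field):
    """Stand-in for a call to a hypothetical users service."""
    return [{"id": 1, "name": "Ada"}]


def orders_service(field):
    """Stand-in for a call to a hypothetical orders service."""
    return [{"id": 7, "userId": 1}]


# Hypothetical routing table: which service resolves which top-level field.
ROUTES = {"users": users_service, "orders": orders_service}


def handle(top_level_fields):
    """Forward each top-level field to its service and merge the partial results."""
    data = {}
    errors = []
    for field in top_level_fields:
        service = ROUTES.get(field)
        if service is None:
            errors.append({"message": f"No service registered for field '{field}'"})
            continue
        data[field] = service(field)
    response = {"data": data}
    if errors:
        response["errors"] = errors
    return response


if __name__ == "__main__":
    print(handle(["users", "orders"]))
```

A production gateway would additionally forward the relevant sub-selection to each service, call the services concurrently, and handle partial failures, all of which are left out here.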

Hasura GraphQL Engine

Hasura is an open-source GraphQL engine that lets you build a GraphQL API on top of different sources, e.g. databases, REST APIs, or other GraphQL services. It can be used as an API gateway in a microservices environment. Hasura offers many advanced features such as authorization and query performance analysis.

Apollo Federation

Apollo Federation is an approach to building an API gateway for GraphQL services. It offers a JS library that makes creating an API gateway simple. It also provides managed federation, which implements a schema registry that helps to safely validate, coordinate, deploy, and monitor changes to the schema graph.

Summary

To sum up all parts of the series:

GraphQL is a powerful and flexible query language for fetching data in a type-safe manner.
GraphQL is a promising technology with a solid toolbox on both the client side and the server side.
GraphQL is not just a better REST. It’s a different approach to communication with clients. It might remind you of SOAP and XSD Schema, or of RPC. This approach has its own pros and cons.
It fits frontends and public APIs best. However, for systems with heavy traffic, caching or load balancing might become an issue and introduce additional complexity, overhead, or latency.
The other side of the coin is that GraphQL is not well suited to communication between microservices. In that use case, its advantages (such as a flexible query API) do not outweigh its complexity. Good old REST or gRPC are simply better for that.

It’s crucial to understand that GraphQL is not just REST on steroids. It’s a different approach to communication that solves some of REST’s issues but comes at a cost.