Jared Short in serverless 6 minutes to read

A Look at Serverless GraphQL

GraphQL, currently one of the biggest squares on the buzzword bingo card, has been gaining a lot of steam recently, and serverless sits right next to it. I have had the joy (and sorrow) of working with both technologies in tandem as “serverless GraphQL”.

Why GraphQL, you ask? It turns out to simplify quite a few things, and it brings several advantages to the serverless world:

  • Provides an easy-to-understand, uniform API layer for all your resources
  • API clients request and receive exactly the data they need in a single request, which avoids both over-fetching and extra round trips
  • One endpoint for your API, or one per logical division of resources (think “services”), with no need to create dozens of REST endpoint and method mappings
  • API versioning and managing many connecting client versions may be simplified
  • A GraphQL endpoint could talk to a REST API, a database, and an object store (S3), and that complexity is irrelevant to the client… use the best tool for the job
  • Because GraphQL uses a defined, typed schema, interactive query and mutation IDEs with features like autocomplete already exist and will work with your endpoint

It’s important to realize that GraphQL defines both the read (query) AND write (mutation) interfaces and can serve them all on the same endpoint. A client sends a query or mutation, written in the GraphQL query language, typically via a POST request to the GraphQL endpoint. Unlike in a RESTful API, the endpoint URL and HTTP method do not define the operation or the resource; the request body does.
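As a rough sketch of what that looks like on the wire, the snippet below builds the POST requests for one read and one write against the same URL. It assumes the common convention of a JSON body with a `query` field (and optional `variables`); the endpoint URL and the helper name `build_graphql_request` are made up for illustration, and nothing is actually sent.

```python
import json
import urllib.request

# Hypothetical endpoint URL, for illustration only.
ENDPOINT = "https://example.com/graphql"


def build_graphql_request(query, variables=None):
    """Return the JSON body for a GraphQL request (assumed {"query", "variables"} shape)."""
    payload = {"query": query}
    if variables is not None:
        payload["variables"] = variables
    return json.dumps(payload)


def as_post(body):
    """Wrap a body in an HTTP POST to the single endpoint (the request is built, not sent)."""
    return urllib.request.Request(
        ENDPOINT,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


READ_QUERY = "{ posts { id title } }"
WRITE_QUERY = 'mutation { createComment(pid: "abc123", content: "Nice!") { id } }'

read_body = build_graphql_request(READ_QUERY)
write_body = build_graphql_request(WRITE_QUERY)

# Both operations target the same URL with the same HTTP method;
# only the body tells the server whether this is a read or a write.
read_request = as_post(read_body)
write_request = as_post(write_body)
```

Passing either request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is fictional.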

If you want to dive more into GraphQL, check out this awesome talk from one of the creators.

For a quick understanding, let’s look at a simple blog as an example… a blog has posts and comments.

# Get the IDs and titles of all posts available for the blog
{
  posts {
    id
    title
  }
}

# Example Response
{
  "data": {
    "posts": [
      {
        "id": "pid123",
        "title": "Hello World"
      },
      {
        "id": "pid124",
        "title": "Hello World 2"
      }
    ]
  }
}

# Get a post's ID and title, plus all of its comments and their content
{
  post(pid: "pid123") {
    id
    title
    comments {
      id
      author
      content
    }
  }
}

# Example Response:
{
  "data": {
    "post": {
      "id": "pid123",
      "title": "Hello World",
      "comments": [
        {
          "id": "cid123",
          "author": "Moe",
          "content": "I agree."
        },
        {
          "id": "cid124",
          "author": "Larry",
          "content": "I disagree."
        },
        {
          "id": "cid125",
          "author": "Curly",
          "content": "You are both wrong."
        }
      ]
    }
  }
}

Mutations (or writes) are also straightforward… let’s say we want to create a comment.

# Create a comment on a given post
mutation {
  createComment(pid: "abc123", content: "This is pretty cool!") {
    id
  }
}

# This will write to the database and, if successful, return the ID of the new comment
{
  "data": {
    "createComment": {
      "id": "cid126"
    }
  }
}

That’s a trivial blog implementation, and it doesn’t address the complexities involved with authentication, authorization, validation, limitations, and pagination, for starters. However, the vast majority of those complexities should not live at the GraphQL layer, but rather in your business logic layer. Facebook engineers suggest that GraphQL should be a very thin access layer on top of your actual business logic.
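To make the “thin layer” idea concrete, here is a minimal sketch under some loud assumptions: the in-memory store, the `NotAuthorized` error, the validation rule, and the resolver signature are all invented for illustration and are not tied to any particular GraphQL library. The point is only that the resolver unpacks arguments and delegates, while the rules live in the business logic.

```python
# --- Business logic layer: owns authorization and validation (hypothetical rules) ---

class NotAuthorized(Exception):
    """Raised when an anonymous caller tries to write."""


_COMMENTS = {}  # in-memory stand-in for a real database


def create_comment(user, pid, content):
    if user is None:
        raise NotAuthorized("You must be logged in to comment")
    if not content.strip():
        raise ValueError("Comment content must not be empty")
    cid = f"cid{len(_COMMENTS) + 1}"
    _COMMENTS[cid] = {"id": cid, "pid": pid, "author": user, "content": content}
    return _COMMENTS[cid]


# --- GraphQL layer: the resolver only unpacks arguments and delegates ---

def resolve_create_comment(context, args):
    return create_comment(context.get("user"), args["pid"], args["content"])
```

Because the resolver holds no rules of its own, the same `create_comment` function could sit behind a REST handler or a batch job without duplicating the checks.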

So what about serverless GraphQL?

Boy, am I glad you asked! We’ve found that there are a couple of key benefits to wrapping up a GraphQL endpoint in a serverless architecture. Sure, you get the scalability and cost effectiveness of API Gateway and Lambda, and you can leverage the simplicity of deployment managed by the Serverless Framework, but let’s look a bit beyond the obvious.
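For a sense of how small the glue can be, below is a sketch of a Lambda handler behind an API Gateway proxy integration. The event and response shapes (`body`, `statusCode`, `headers`) follow that integration; `execute_graphql` is a hypothetical placeholder for whatever GraphQL executor you choose, and it simply echoes the query here.

```python
import json


def execute_graphql(query, variables):
    """Hypothetical stand-in for a real GraphQL executor; echoes the query back."""
    return {"data": {"echo": query}}


def _response(status, payload):
    # Response shape expected by an API Gateway Lambda proxy integration.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Single entry point: every query and mutation arrives here as a POST body."""
    try:
        request = json.loads(event.get("body") or "{}")
        query = request["query"]
    except (ValueError, KeyError, TypeError):
        return _response(
            400,
            {"errors": [{"message": "Body must be JSON with a 'query' field"}]},
        )
    return _response(200, execute_graphql(query, request.get("variables")))
```

Everything specific to your schema would live behind `execute_graphql`, so the handler itself rarely needs to change.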

Monitoring is drastically simplified. You don’t need to alarm on CPU, disk space, bandwidth, or any of the other usual host-level concerns. Because each GraphQL query runs in its own context, there is no concern of a single query dominating a machine’s resources (however, you may still need to monitor your downstream dependencies, such as databases). In many cases, your monitoring can be as straightforward as looking at latency and invocation errors.
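Lambda already reports duration and error counts for each function, so often there is nothing to write. If you wanted one structured record per request as well, a wrapper along these lines is one possible sketch; the record fields and the `monitored` name are assumptions, not a standard.

```python
import functools
import json
import time


def monitored(handler, emit=print):
    """Wrap a Lambda-style handler and emit one latency/outcome record per call."""

    @functools.wraps(handler)
    def wrapper(event, context):
        start = time.perf_counter()
        outcome = "ok"
        try:
            return handler(event, context)
        except Exception:
            outcome = "error"
            raise  # let the platform count the invocation error too
        finally:
            latency_ms = (time.perf_counter() - start) * 1000
            emit(json.dumps({
                "metric": "graphql_invocation",  # hypothetical record name
                "outcome": outcome,
                "latency_ms": round(latency_ms, 2),
            }))

    return wrapper
```

With the default `emit=print`, each record goes to standard output, which is where a Lambda function’s logs are collected from.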

Function Run Isolation gives us some safety in case someone maliciously or accidentally hammers away at our API with intense operations. Computationally expensive or memory-hungry queries are contained because each Lambda invocation gets its own container’s resources, with no sharing between concurrent requests. Unlike on a shared EC2 instance, this means that rogue requests can’t as easily impact the good requests. Now, you still have to worry about downstream services, but serverless is a good head start.

Versioning a service can be simplified, as you no longer need to worry about whether adding fields is going to break clients. As a by-product of GraphQL, the client only ever asks for and receives exactly what it needs. You only need to update your Lambda function and you are good to go. You can continue to add fields and features without breaking clients that depend on pre-existing fields and features. If you change how a particular field works, sure, you may need to warn API consumers, but that is fairly rare in a well-thought-out API design.
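The reason additive changes are safe can be shown with a toy projection. This is not how a GraphQL executor is implemented; `select_fields` and the new `author` field on a post are hypothetical, and the sketch only mimics the “return what was asked for” behavior.

```python
def select_fields(record, requested):
    """Return only the fields the client asked for, mimicking GraphQL field selection."""
    return {name: record[name] for name in requested}


# The post as the API originally exposed it.
post_v1 = {"id": "pid123", "title": "Hello World"}

# The same post after the API grows a new field (hypothetical "author").
post_v2 = {**post_v1, "author": "Moe"}

# An old client still asks for the fields it always asked for...
old_client_fields = ["id", "title"]

# ...and gets an identical response before and after the change.
before = select_fields(post_v1, old_client_fields)
after = select_fields(post_v2, old_client_fields)
```

A newer client can opt in by adding `author` to its selection, while older clients never see the field at all.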

Self-Documenting / Explorable API: this is more a function of GraphQL than of serverless, but you can hand out an API explorer with little upfront effort on your end. This is made especially easy by a project called GraphiQL. Developers and other API consumers can visually tinker with your API, learn how it works, and construct queries and mutations. For a running example, check out this Star Wars GraphQL API.

Standing up the Second Endpoint is even easier than the first. In some circumstances you may have a different logical service or business case for a second GraphQL endpoint. You can leverage all the same components in your Serverless Framework configs, and just add a second endpoint if it makes sense. This can be a huge win if you are used to standing up new boxes for new services. (Side note: if you are doing that… look into CloudFormation, Terraform, or other Infrastructure as Code options!)

Lambda Fanout, while more advanced, could be used for particularly intense queries and jobs. Perhaps a query needs to gather data from several sub-systems, or from several CSVs spread across S3 buckets. With a Lambda fanout pattern, the GraphQL function splits that work across many worker invocations that run in parallel and then merges their results, so capacity bursts only when a query needs it. No large auto-scaling group on standby “just in case”.
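A minimal sketch of the fanout shape, with assumptions stated up front: the worker here is a local stand-in function (`count_rows`) so the example runs anywhere, and `fan_out`, `resolve_total_rows`, and the payload format are invented names. In a real deployment the `invoke` callable would wrap a call to a worker Lambda instead.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(invoke, payloads, max_workers=8):
    """Call invoke(payload) concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(invoke, payloads))


def count_rows(payload):
    # Local stand-in for a worker Lambda. It pretends the row count of a CSV
    # is the length of its key, purely so the example is deterministic.
    return {"key": payload["key"], "rows": len(payload["key"])}


def resolve_total_rows(keys, invoke=count_rows):
    """Resolver-side merge: one worker call per S3 key, then sum the partial results."""
    partials = fan_out(invoke, [{"key": key} for key in keys])
    return sum(partial["rows"] for partial in partials)
```

Passing the worker in as `invoke` keeps the resolver testable locally and leaves the choice of how workers are actually invoked to the deployment.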

You can also check out an example of serverless GraphQL with the Serverless Framework!

We’ll call that a wrap, but serverless GraphQL sure has me excited. The paradigm is powerful and we’ve seen it driving a lot of interest here at Trek10. I think if you give it a spin, you’ll come to some of the same conclusions we have.