How to GraphQL

11 min read

GraphQL

These notes are based off the How to GraphQL courses that I'm going through with the Party Corgi adventure club!

Introduction

GraphQL (gql): a new API standard, meant to be a more efficient & flexible alternative to REST.

Gql enables declarative data fetching & exposes a single endpoint that responds to queries. It's meant to be a way for a client to ask specifically for the data it needs, and get that information back from one endpoint instead of through a number of different requests.

An API is a way for client devices to get & load data from a server.

Gql is a query language, not a database! It's db agnostic & can be used effectively with any API context.

3 factors that impact how APIs are designed in today's landscape:

  • Increased mobile usage. Gql minimizes the amount of data that needs to be transferred, making it more efficient
  • Front end framework variety. It's difficult to make APIs fit the requirements of so many different options - gql helps
  • Fast development. Continuous deployment & rapid iterations can get tricky with REST options

Gql isn't just for React - can be used with any language or framework!


GraphQL is the better REST

REST was a good start w/ some solid ideas like stateless servers and structured access to resources. But it's a rigid system, and is often misconstrued and not used in the way it was designed. So gql is meant to be a new methodology, built for flexibility and efficiency.

Main difference - with REST you often make multiple requests to different endpoints, and each response returns everything that endpoint exposes rather than only the data you want. You can tailor the API to match your designs, but that adds extra work whenever the design changes or you need other information, since you have to update both the front end code & the server.

With gql, you only need to send 1 request (not multiple), and specify exactly which data you want, with limits/specific query options. When you've got changes in design, you only need to modify your query request, not the actual API.

gql also has its own schema system, which acts as a contract between server & client - once it's set, it's easier for front end and back end developers to work independently.

gql also has the ability to monitor performance & requested queries, so you can see if certain data fields aren't being used anymore, and can notice bottlenecks or issues with your structure.


Core Concepts

The Schema Definition Language (SDL)

Basic example:

type Person {
  id: ID! # unique id generated by server
  name: String!
  age: Int!
}

This defines the name of the object, as well as the possible fields and what type of data those fields are. The ! means the field is required (it can never be null).

If we want to add a relation between two types, we can do that like this:

type Post {
  title: String!
  author: Person!
}

type Person {
  name: String!
  age: Int!
  posts: [Post!]!
}

This ties each post to a specific person, and then in the Person type, we have a list [] of posts, which holds every Post the person is tied to as the author.

The structure of the data returned from gql is not fixed. A gql server usually only exposes one endpoint. So we fetch data using queries, which we can adjust to match whatever data we need.

{
  allPersons { # root field
    name # payload
  }
}

# can also add arguments to request
{
   allPersons(last: 2) {
     name
   }
}

# can get nested information
{
  allPersons {
    name
    posts { # this was our array field
      title
    }
  }
}

When we need to make changes to the data, we do so with mutations. They generally follow the same structure as queries, but start with the keyword mutation.

Gql has 3 kinds of mutations:

  • creating new data
  • updating existing data
  • deleting existing data

mutation { # saying we want to change something
  createPerson(name: "Bob", age: 36) { # root
    id # can also request payload with mutations. So if we had an id field on our Person, we can add a new Person and get the unique id back in the response
  }
}

If we need real time updates when something on the server changes, we can subscribe to certain events. When a client subscribes, it opens and holds a steady connection to the server, and when new info comes in, the server automatically pushes that data to the client. Subscriptions work as a stream of information, rather than a request/response cycle.

subscription {
  newPerson {
    name
    age
  }
}
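
The stream idea can be sketched without any networking. This is a minimal in-memory publish/subscribe sketch, not how any particular gql server implements subscriptions; PubSub, subscribe, and publish are hypothetical names, and a real setup would push events over a long-lived connection such as WebSockets.

```python
from collections import defaultdict


class PubSub:
    """Hypothetical in-memory event bus standing in for a subscription transport."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event, listener):
        # The client registers interest once and stays connected.
        self._listeners[event].append(listener)

    def publish(self, event, payload):
        # The server pushes each new payload to every subscriber.
        for listener in self._listeners[event]:
            listener(payload)


received = []
bus = PubSub()
bus.subscribe("newPerson", received.append)

# Simulate the server creating two people after the client subscribed.
bus.publish("newPerson", {"name": "Bob", "age": 36})
bus.publish("newPerson", {"name": "Alice", "age": 29})
```

The client never asks again after subscribing; each publish call delivers the next item in the stream.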

The GraphQL Schema

The schema defines capabilities of the API by specifying how a client can fetch and update data. It's a collection of gql types with special root types.

type Query {
  allPersons(last: Int): [Person!]! # allows last parameter, and lists type of what the query will return
}

type Mutation {
  createPerson(name: String!, age: Int!): Person! # takes in arguments we want to add for mutation, and returns a single Person object
}

type Subscription {
  newPerson: Person! # shows that when subscribed, this is the type of object we're watching for
}

Big Picture

Gql is actually only a specification - just a long document describing how a gql server should behave. You have to build the server yourself if you want to use it.

3 possible setups:

  • server w/ connected database
  • server that integrates with existing system
  • hybrid - connected database and integration into existing system

  1. server with connected db

  • uses a single web server that implements gql
  • server resolves queries and constructs responses with data it fetches from the db
  • transport layer agnostic - can be used with any network protocol
  • could also use any type of db you'd like

  2. server w/ existing system

  • great for companies with legacy systems and lots of different APIs
  • gql can unify existing systems and hide the complexity of the data fetching logic, providing a nice single endpoint
  • server doesn't care about where the data sources are

  3. hybrid approach

  • combines both options, so when a query is received, it will either fetch the data from the connected systems, or resolve it from the connected db, and then send back the data

resolver functions - each one retrieves the data for its corresponding field - gql server has exactly one resolver function per field.

Resolvers can take either provided arguments or implicit arguments, depending on what data they need to build their response.

Resolver functions with an example
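
A minimal sketch of the one-resolver-per-field idea, assuming in-memory lists in place of a real database. The (parent, args) signature, the RESOLVERS table, and the sample data are all hypothetical and not tied to any gql library.

```python
# Hypothetical in-memory data standing in for a database.
PERSONS = [
    {"id": "1", "name": "Bob", "age": 36},
    {"id": "2", "name": "Alice", "age": 29},
]
POSTS = [
    {"title": "Hello gql", "author_id": "1"},
    {"title": "Schemas", "author_id": "1"},
]


def resolve_all_persons(parent, args):
    # `args` holds the provided arguments, e.g. allPersons(last: 2).
    last = args.get("last")
    return PERSONS[-last:] if last else PERSONS


def resolve_person_posts(parent, args):
    # `parent` is the implicit argument: the Person this field belongs to.
    return [post for post in POSTS if post["author_id"] == parent["id"]]


# One resolver per field, keyed by (type name, field name).
RESOLVERS = {
    ("Query", "allPersons"): resolve_all_persons,
    ("Person", "posts"): resolve_person_posts,
}

last_person = RESOLVERS[("Query", "allPersons")](None, {"last": 1})
bob_posts = RESOLVERS[("Person", "posts")](PERSONS[0], {})
```

The root resolver only looks at the provided argument, while the posts resolver only needs its parent object.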

gql is great for front end, as data fetching can be pushed to server side. Don't need to care where data is coming from, so logic can be abstracted away.

Storage of data, request specifics, and all other steps besides describing the data you want and displaying it are handled for you. This is called declarative data fetching. Imperative data fetching is the full process of constructing the HTTP request, parsing the response, storing the data, and then displaying it. Lots of gql client libraries exist to allow you to use declarative fetching and focus on the data itself.

Clients

You'll often use a client to handle sending the HTTP requests. So instead of using fetch or writing it out yourself, you'll just write the query or mutation and hand it to the client, and the client will handle the networking part.

Then, when the client gets the server response, depending on what framework you're using, there are ways to update the UI with the data. Works especially well with functional reactive techniques, where the view declares what data it needs and the UI is wired to update automatically when that data arrives.
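
For a sense of what the client takes off your hands, here is a rough sketch of the imperative part using only the Python standard library. It assumes the common convention of POSTing JSON with a query key; build_graphql_request is a hypothetical helper and https://example.com/graphql is a placeholder endpoint. The request is built but never sent.

```python
import json
import urllib.request


def build_graphql_request(endpoint, query, variables=None):
    """Build (but do not send) a POST request carrying a gql operation."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


QUERY = """
{
  allPersons(last: 2) {
    name
  }
}
"""

# Placeholder endpoint, not a real API.
request = build_graphql_request("https://example.com/graphql", QUERY)
payload = json.loads(request.data)
```

A gql client library wraps this step, plus sending the request and parsing the response, behind a single call.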

Caching note - often best to flatten a query result and store that in a local store, that you can reference by id when needed.
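
A small sketch of that flattening step, assuming every object carries an id that is unique across all types. The normalize function and the ref marker are hypothetical; real client caches have their own formats.

```python
def normalize(result, store):
    """Flatten nested objects into `store`, keyed by id; return a reference."""
    if isinstance(result, list):
        return [normalize(item, store) for item in result]
    if isinstance(result, dict):
        flat = {key: normalize(value, store) for key, value in result.items()}
        if "id" in flat:
            # Merge into any existing record so repeated objects share one entry.
            store.setdefault(flat["id"], {}).update(flat)
            return {"ref": flat["id"]}
        return flat
    return result


store = {}
query_result = {
    "allPersons": [
        {"id": "p1", "name": "Bob", "posts": [{"id": "post1", "title": "Hello gql"}]},
        {"id": "p2", "name": "Alice", "posts": []},
    ]
}
root = normalize(query_result, store)
```

After this, store holds one flat record per id, and nested objects are replaced by references that can be looked up when needed.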

A client can also help to validate and optimize queries at build time, if the build environment has access to the schema. It can parse all the gql queries in the project and check for typos & optimization opportunities during the build.

Server

Big benefit for the server side is allowing it to focus on describing the data, rather than how the endpoints are implemented.

Basically, when a query is received, the server algorithm goes field by field and executes a resolver for each one.

Every field in a query has a specific type it's associated with - either a declared type, or the type of data it holds (string, int, etc). This lets the server know which resolver to run for each field. Execution runs breadth-first, so it will start at the top, resolve that, then pass the result to its child, and so on until all the data is collected. Then the algorithm puts it all into the correct shape and returns it.

Lots of server implementations will provide default resolvers, too - so you don't have to specify a resolver for every field if the parent object contains a field with the right name.
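
A toy version of that field-by-field walk, with a default resolver that reads a same-named key from the parent. Everything here is an assumption made for illustration: the query is a nested dict rather than parsed gql, FIELD_TYPES and RESOLVERS are hypothetical tables, and the walk recurses depth-first for brevity rather than breadth-first.

```python
# Hypothetical schema: (type, field) -> type of the field's value.
FIELD_TYPES = {
    ("Query", "allPersons"): "Person",
    ("Person", "name"): "String",
    ("Person", "posts"): "Post",
    ("Post", "title"): "String",
}

DATA = [{"name": "Bob", "posts": [{"title": "Hello gql"}]}]

# Only the root field needs an explicit resolver here.
RESOLVERS = {("Query", "allPersons"): lambda parent: DATA}


def default_resolver(field):
    # Fall back to reading a same-named key from the parent object.
    return lambda parent: parent[field]


def execute(selection, type_name, parent):
    result = {}
    for field, sub_selection in selection.items():
        resolver = RESOLVERS.get((type_name, field), default_resolver(field))
        value = resolver(parent)
        field_type = FIELD_TYPES[(type_name, field)]
        if not sub_selection:  # scalar leaf: nothing more to resolve
            result[field] = value
        elif isinstance(value, list):  # list of objects: resolve each child
            result[field] = [execute(sub_selection, field_type, v) for v in value]
        else:
            result[field] = execute(sub_selection, field_type, value)
    return result


query = {"allPersons": {"name": {}, "posts": {"title": {}}}}
response = execute(query, "Query", None)
```

The response comes back in the same shape as the query, and name, posts, and title never needed hand-written resolvers.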

This can sometimes be a bit naive and could result in multiple calls for the same data. There are options to batch requests or create functions that wait for all the resolvers to run, then fetch each item only once.
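
One way to picture the batching option: resolvers queue up the keys they need, and a single fetch runs afterwards. This is a simplified, synchronous sketch; BatchLoader and fetch_users are hypothetical names, and real batching utilities are usually asynchronous.

```python
class BatchLoader:
    """Hypothetical loader: collect keys first, then fetch each one once."""

    def __init__(self, batch_fetch):
        self._batch_fetch = batch_fetch
        self._pending = []
        self._cache = {}

    def load(self, key):
        # Resolvers call this instead of hitting the data source directly.
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def dispatch(self):
        # Called once after all resolvers have queued their keys.
        if self._pending:
            self._cache.update(self._batch_fetch(self._pending))
            self._pending = []

    def get(self, key):
        return self._cache[key]


fetch_calls = []


def fetch_users(ids):
    # Stand-in for one database or API call that accepts many ids.
    fetch_calls.append(list(ids))
    return {user_id: {"id": user_id, "name": f"user-{user_id}"} for user_id in ids}


loader = BatchLoader(fetch_users)
for author_id in [1, 2, 1, 1, 2]:  # five posts, two distinct authors
    loader.load(author_id)
loader.dispatch()
```

Five posts ask for their author, but the data source is hit once, with each distinct id appearing a single time.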

More Concepts

Fragments - improve structure and reusability by making a collection of fields from a specific type.

# if we have this type:
type User {
  name: String!
  age: Int!
  email: String!
  street: String!
  zipcode: String!
  city: String!
}

# we can make a fragment for the address like so:
fragment addressDetails on User {
  name
  street
  zipcode
  city
}

# can then refer to the fragment & save typing while still getting all that data
{
  allUsers {
    ...addressDetails
  }
}

Fields can take arguments, and we can specify default values for these if desired.

type Query {
  allUsers(olderThan: Int = -1): [User!]!
}

Named query results - can assign aliases to fields, so you can request the same field multiple times with different arguments in one query

{
  first: User(id: 1) {
    name
  }
  second: User(id: 2) {
    name
  }
}

Advanced Schema Things

Gql has two different kinds of types:

  • scalar - concrete units of data; string, int, float, boolean, ID
  • object - composable fields, like User and Post, objects we create

Can create your own of both kinds. A common custom scalar is Date.

Enumeration types (enums) - special kind of scalar type, a way to define the semantics of a type that has a fixed set of values (could do a type Weekday and list all the days)

Interfaces - specify a set of fields that any type that implements the interface should have.

interface Node {
  id: ID!
}

type User implements Node {
  id: ID!
  name: String!
}

Union types - declare that a type is exactly one of a set of other types

type Adult {
  name: String!
  work: String!
}

type Child {
  name: String!
  school: String!
}

union Person = Adult | Child

If we want to get information on a child but only have a Person type, we can use conditional fragments to access the Child fields when the object actually is a Child.

{
  allPersons {
    ... on Child {
      name # a union has no fields of its own, so name is selected per type
      school
    }
    ... on Adult {
      name
      work
    }
  }
}

Tooling

Introspection - clients can ask the server for information about the schema. Can query __schema, which is always available on the root of a query. It shows which types exist in the schema, and can go into detail about the fields on a specific type.
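
As a sketch, an introspection request is an ordinary query sent the usual way. The __schema and __type fields below come from the gql spec; the Person type name is from these notes, and the JSON body follows the common query-key convention, which is an assumption about the server.

```python
import json

# __schema lists every type; __type goes into detail on one type by name.
INTROSPECTION_QUERY = """
{
  __schema {
    types {
      name
    }
  }
  __type(name: "Person") {
    name
    fields {
      name
    }
  }
}
"""

# The request body a client would POST to the single gql endpoint.
body = json.dumps({"query": INTROSPECTION_QUERY})
```

Tools like schema explorers and autocomplete are built on this kind of query.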

Security

Timeout - defends against large queries, sets the max time allowed for a query.

Max query depth - can set a max depth, and reject if query goes deeper than that. Typically analyzed statically so won't add load to the server.
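
A rough sketch of a static depth check. It approximates depth by counting nested braces in the query text, which is an assumption made for brevity: a real implementation would walk the parsed query, and this version would also count braces from object-valued arguments and does not handle escaped quotes. MAX_DEPTH is a hypothetical limit.

```python
def query_depth(query):
    """Return the deepest level of nested selection sets in a query string."""
    depth = max_depth = 0
    in_string = False
    for char in query:
        if char == '"':
            in_string = not in_string  # ignore braces inside string arguments
        elif in_string:
            continue
        elif char == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif char == "}":
            depth -= 1
    return max_depth


MAX_DEPTH = 3  # hypothetical limit


def check_depth(query):
    depth = query_depth(query)
    if depth > MAX_DEPTH:
        raise ValueError(f"query depth {depth} exceeds limit {MAX_DEPTH}")
    return depth


shallow = "{ allPersons { name posts { title } } }"
deep = "{ allPersons { posts { author { posts { title } } } } }"
```

Here check_depth(shallow) passes, while check_depth(deep) raises before any resolver runs.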

Query complexity - can restrict queries with a max complexity. Common to set a default of 1, then increase it for more complex fields or based on arguments (more complex to get 5 posts than 1 user)
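
A minimal sketch of a complexity score, assuming the query is already available as nested dicts. FIELD_COSTS and the cost of 5 for posts are made-up numbers for illustration.

```python
# Hypothetical per-field costs; anything not listed costs the default of 1.
FIELD_COSTS = {"posts": 5}
DEFAULT_COST = 1


def complexity(selection):
    """Sum field costs over a selection given as nested dicts."""
    total = 0
    for field, sub_selection in selection.items():
        total += FIELD_COSTS.get(field, DEFAULT_COST)
        total += complexity(sub_selection)
    return total


query = {"allPersons": {"name": {}, "posts": {"title": {}}}}
score = complexity(query)
```

The server would compare score against its max complexity and reject the query if it is too high.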

Throttling - For gql, often based on server time (how long it takes the server to complete the query). Can also set throttle based on query complexity.
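
Throttling on server time can be pictured as a leaky bucket. A sketch under assumed numbers: ServerTimeThrottle is a hypothetical class, and the 1000 ms budget and 100 ms per second refill are illustrative values only.

```python
class ServerTimeThrottle:
    """Hypothetical leaky bucket measured in milliseconds of server time."""

    def __init__(self, max_ms, refill_ms_per_second):
        self.max_ms = max_ms
        self.refill = refill_ms_per_second
        self.available = max_ms

    def tick(self, seconds):
        # Budget flows back in over time, capped at the maximum.
        self.available = min(self.max_ms, self.available + self.refill * seconds)

    def allow(self, query_cost_ms):
        # Reject the query if the client has used up its server-time budget.
        if query_cost_ms > self.available:
            return False
        self.available -= query_cost_ms
        return True


throttle = ServerTimeThrottle(max_ms=1000, refill_ms_per_second=100)
results = [throttle.allow(200) for _ in range(6)]  # 6 queries of 200 ms each
throttle.tick(2)  # 2 seconds pass, so 200 ms of budget returns
after_wait = throttle.allow(200)
```

The first five queries use up the budget, the sixth is rejected, and after waiting the client can query again.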

Common Questions

  • Gql is a query language for APIs - not a database. Can be used with any database or even none.

  • Also not just for JS or React - can be used anywhere you use an API, with any client language/framework that can use HTTP, and any server that is used to build for the web.

  • Auth - Can do authentication (user login) with common patterns like OAuth. Authorization (permission rules) is best handled in the business logic of your app, not by gql.

  • Errors - Successful queries return a data object with your data. If something goes wrong, the response will include an errors object.
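
On the client side, handling the data and errors keys might look like the sketch below. GraphQLResponseError and unwrap are hypothetical names, and raising on any error is a simplification, since a response can carry partial data alongside errors.

```python
import json


class GraphQLResponseError(Exception):
    """Hypothetical exception raised when a response carries errors."""


def unwrap(response_text):
    response = json.loads(response_text)
    errors = response.get("errors")
    if errors:
        # Each entry in errors is an object with at least a message.
        messages = "; ".join(error.get("message", "unknown error") for error in errors)
        raise GraphQLResponseError(messages)
    return response["data"]


ok = unwrap('{"data": {"allPersons": [{"name": "Bob"}]}}')
```

Calling unwrap on a response that contains an errors list raises GraphQLResponseError with the joined messages.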
