A developer-friendly GraphQL client

21 December 2018

Jasper van Heijst

With a growing microservice landscape, aggregation layers are becoming more important. For some of these aggregation services we are trying Facebook’s GraphQL, and two of them now run in our production environment. One drawback we found with GraphQL is that both the domain models and the GraphQL queries have to be maintained. In this blog we share the problem we encountered and our way of working around it. An additional benefit of our solution is that developers no longer need to worry about the GraphQL query syntax. We are still exploring the possibilities, but we would already like to share our preliminary findings.

GraphQL driver

GraphQL stands for Graph Query Language and was developed by Facebook. On the GraphQL website you can read: "GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data". The main concept is that the client is in charge of the data it retrieves from the server.

At bol.com, we use GraphQL for, among other things, aggregation services that combine a lot of data from different sources. This gives clients a single point to request data from, and lets each client determine which data fields it wants to get back from the GraphQL service.

To do this, we need to create a GraphQL query and send it to the GraphQL service with an HTTP POST request. Parameters can be added to the query as well: if you want to retrieve certain fields of an object with id X, you define the fields and add the id X as a query parameter. A basic example from the GraphQL documentation:

1. A GraphQL query (left) and its possible response (right), from the GraphQL documentation

In the example above (image 1), the left side shows the GraphQL query. It defines that the fields "name" and "height" are requested from the object "human" with id "1000". The right side shows the response from the server, which returns exactly the data that was requested.
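As a rough illustration of the HTTP side, the sketch below wraps such a query in the JSON envelope commonly used for GraphQL over HTTP (a "query" property) and builds a POST request with the JDK's `java.net.http` API. The class name, the endpoint URL and the hand-written escaping are assumptions made for this example; they are not part of our driver, and a real client would normally let a JSON library do the escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class GraphQLPostSketch {

    // Wraps a GraphQL query in a JSON body of the form {"query": "..."}.
    // Minimal escaping only; a JSON library would handle all control characters.
    static String toJsonBody(String query) {
        StringBuilder escaped = new StringBuilder();
        for (char c : query.toCharArray()) {
            switch (c) {
                case '"':  escaped.append("\\\""); break;
                case '\\': escaped.append("\\\\"); break;
                case '\n': escaped.append("\\n");  break;
                case '\r': escaped.append("\\r");  break;
                case '\t': escaped.append("\\t");  break;
                default:   escaped.append(c);
            }
        }
        return "{\"query\": \"" + escaped + "\"}";
    }

    // Builds (but does not send) the POST request. The endpoint is a placeholder.
    static HttpRequest buildRequest(String endpoint, String query) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJsonBody(query)))
                .build();
    }

    public static void main(String[] args) {
        String query = "{ human(id: \"1000\") { name height } }";
        HttpRequest request = buildRequest("https://example.invalid/graphql", query);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(toJsonBody(query));
    }
}
```

Sending the request would be a matter of passing it to an `HttpClient`; that step is left out here so the sketch runs without a server.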

APIs return data in some kind of generic data format; most of the time this is JSON. Usually, the client maps this JSON to a domain model that represents the fields needed by the client. The mapping defines which fields in the JSON are represented by which fields in the client's object. Popular Java libraries for parsing JSON are GSON and Jackson. Most libraries that map JSON to Java objects provide a way to do this with annotations, so the name of the JSON field can be defined in the Java class itself. The relation between the JSON and the object can be 1:1, but this is not necessary: not all fields in the JSON need to be present in the object (depending on the settings of the mapper).

Described above are two concepts:

For a GraphQL service you can create a query to determine which fields will be present in the response.
In the client you can map fields from the response to a Java model. One way to do this is using a library and annotations.

Keeping this in mind, it is logical to generate the GraphQL query from the Java model as well. All fields in the model will then be present in the GraphQL query, the service will return a value for each of them, and those values can be mapped back onto the same model. By doing this, you have a single source of truth.

2. Architecture of the proposed solution

In image 2, you see that the input for the driver is the Java model and a search query. These are combined and transformed into a GraphQL query that is sent to a service. The response is JSON, which an object mapper maps onto your Java object.

Example

The GraphQL query shown earlier requests the data of a Human with ID 1000. Here the Human is the Java object and the ID is the search input. Below is a working example in Java; the output of this code is the GraphQL query from the earlier example!

public class Example {
    
    public static void main(String[] args) {
        String query = GraphQLParser.parse(Human.class,
                new GraphQLSearchField(Human.class, "id", "1000"));
        System.out.println(query);
    }

    @GraphQLField("human")
    private class Human {
        private String name;
        private double height;
    }

}

// output:
// {
//     human(id: "1000") {
//         name
//         height
//     }
// }

This is a very simple Java model, a Human object with two fields (name and height). One annotation is added, since the Java class is capitalized as is conventional, but we want the output to have a different name (human, uncapitalized). This model is the input for the parser. The second input parameter is a "GraphQLSearchField", where the search query for this specific ID (1000) is defined. More annotations are available, for when a field in the Java object needs to be ignored or a class needs to be treated as a different class (for example a date class that needs to be treated as a String), but we won't go into detail on that.

During development, working like this proved to be very easy. If you want an extra field, you just add it to the Java model; it will then be requested from the service and parsed into your model, so the next time your request is processed you have your new field. There is no need to keep an object and a GraphQL query in sync with each other when the specs change (and usually, they will at some point in time).

Performance

Under the hood, this driver uses reflection to build a GraphQL query from the classes. Reflection is a technique that allows a running Java program to examine or "introspect" upon itself. By using this, all fields in a class can be retrieved together with their annotations. This technique does come at a cost: performance-wise, reflection is expensive. This is no reason not to use it, but it is important to be aware of. Premature optimization is the root of all evil, so we first tested the performance of the system before worrying about the cost of reflection.
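To make the reflection step a bit more concrete, here is a minimal sketch of how a flat model class could be turned into a query. It is not the driver's actual code: the `QueryName` annotation and the `buildQuery` method are hypothetical stand-ins invented for this example, and nesting, ignored fields and type overrides are left out.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class ReflectionQuerySketch {

    // Hypothetical annotation that overrides the root name used in the query.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface QueryName {
        String value();
    }

    @QueryName("human")
    static class Human {
        private String name;
        private double height;
    }

    // Builds a query for a flat model: one search argument, one line per declared field.
    static String buildQuery(Class<?> model, String searchField, String searchValue) {
        QueryName annotation = model.getAnnotation(QueryName.class);
        String root = annotation != null ? annotation.value() : model.getSimpleName();

        StringBuilder query = new StringBuilder();
        query.append("{\n  ").append(root)
             .append("(").append(searchField).append(": \"").append(searchValue).append("\") {\n");
        // getDeclaredFields() does not promise a particular order.
        for (Field field : model.getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue; // skip compiler-generated and static fields
            }
            query.append("    ").append(field.getName()).append("\n");
        }
        query.append("  }\n}");
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(Human.class, "id", "1000"));
    }
}
```

Every call walks the class again through reflection, which is exactly the cost the measurements in this section are meant to quantify.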

To test the performance we did a simple test: two objects were each parsed to a GraphQL query a million times on a developer’s computer (2015 MacBook Pro). The first object is very simple and has 10 fields without any nesting, but with a search field. This object was chosen because it represents very simple calls, where you request a simple object with a certain ID from a service. It is comparable to the “human” object example that has been used throughout this blog, but with some additional fields. The second object has three layers of nesting and the total number of fields in the query is 10 times higher, so 100. This object represents a more extensive data call. The parse time and the relation between the two parsing times are interesting metrics, and they can give us trust (or distrust) in our solution.

Object                    Time to parse 1,000,000 times
Simple 10 field object    2577 ms
Big 100 field object      24395 ms

In the table the time to parse the model one million times is displayed. We can draw two conclusions from this data:

The ratio between the two is a factor of about 9.5 (24395 / 2577), while the number of fields differs by a factor of 10. Parsing time therefore grows slightly less than linearly with the number of fields, so we do not really need to worry when our model grows.
Parsing the big 100 field object takes roughly 24 microseconds per query (24395 ms divided by one million runs) on a developer's computer. This is maybe not exactly the time it would take on the node the code runs on, but it indicates that parsing is fast and that the time is probably negligible compared to the actual API call to the service.
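A measurement of this kind can be set up with a simple timing loop. The sketch below is only an illustration of the approach, not the test code we ran: the class and method names are made up, the builder in `main` is a placeholder for the reflection-based parser, and there is no JIT warm-up, so a benchmark harness such as JMH would give more reliable numbers.

```java
import java.util.function.Supplier;

class ParseTimingSketch {

    // Runs the query builder `iterations` times and returns the elapsed time in milliseconds.
    static long timeMillis(Supplier<String> queryBuilder, int iterations) {
        long sink = 0; // consume every result so the work cannot be optimized away
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += queryBuilder.get().length();
        }
        long elapsedNanos = System.nanoTime() - start;
        if (sink == 0) {
            System.out.println("builder produced only empty strings");
        }
        return elapsedNanos / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder builder; the real test would call the parser on a model class here.
        Supplier<String> builder = () -> "{ human(id: \"1000\") { name height } }";
        int runs = 1_000_000;
        long millis = timeMillis(builder, runs);
        System.out.println(runs + " runs took " + millis + " ms");
        // total milliseconds -> microseconds, divided by the number of runs
        System.out.println("average per run: " + (millis * 1000.0 / runs) + " microseconds");
    }
}
```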

Encouraged by these findings, we chose to test this in an acceptance environment, and later in a production environment. The cost of parsing the model was not really noticeable. Had it been a problem, the compilation of the model could be moved to build time, generating a template where only the search parameters need to be added at runtime. This is a nice possibility for the future, but for now there is no need for it.
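The template idea can be approximated without a build step by caching the generated query text per class. The sketch below is an assumption-laden illustration rather than something we have built: it reflects on a class once, stores a template with a `%s` placeholder, and only fills in the search value on each call. All names are hypothetical, and the search value is inserted without escaping.

```java
import java.lang.reflect.Field;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class QueryTemplateSketch {

    static class Human {
        private String name;
        private double height;
    }

    // One template per model class, created on first use.
    private static final Map<Class<?>, String> TEMPLATES = new ConcurrentHashMap<>();

    // The expensive part: reflection runs once per class; %s marks the search value.
    static String templateFor(Class<?> model) {
        return TEMPLATES.computeIfAbsent(model, m -> {
            StringBuilder template = new StringBuilder("{ ")
                    .append(m.getSimpleName().toLowerCase(Locale.ROOT))
                    .append("(id: \"%s\") {");
            for (Field field : m.getDeclaredFields()) {
                if (!field.isSynthetic()) {
                    template.append(' ').append(field.getName());
                }
            }
            return template.append(" } }").toString();
        });
    }

    // The cheap part: only the search parameter is filled in per request.
    static String queryFor(Class<?> model, String id) {
        return String.format(templateFor(model), id);
    }

    public static void main(String[] args) {
        System.out.println(queryFor(Human.class, "1000"));
        System.out.println(queryFor(Human.class, "1001"));
    }
}
```

A true build-time variant would generate the same template during compilation, for example with an annotation processor, so that no reflection is needed at runtime at all.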

Conclusion

While trying out GraphQL we encountered a small problem, but the proposed solution makes the lives of developers a little easier again: they don’t need to worry about the syntax of the query, and only one object instead of two needs to be maintained. With the use of annotations, a Java model can be used to generate the GraphQL query, and the result of this query is mapped onto the same object. By doing this, you have a single source of truth. As a result, adding or removing a field is very simple: you only have to do it in one place. This comes at a cost since reflection is expensive, but that cost was found to be negligible for our use case.
