Sushi: A DSL for Conversational Flow Design

TL;DR

We share our experience of building a conversational bot with Finite State Machines. That approach turned out to have its own shortcomings, so we built a Domain-Specific Language (DSL) to design our flows and called it “Sushi”.

Sushi is a more flexible and straightforward tool for building complicated flows.

A Conversational Bot

At bol.com we want to help our customers with their questions as quickly as possible, and preferably automatically. So we developed "ContactBot"! The product is built with one goal in mind: saving money in Customer Service by redirecting questions meant for bol.com partners directly to those partners, without involving bol.com call agents.

bol.com: a platform for 20,000 partner shops

The bol.com webshop sells its own products but is also a platform for more than 20,000 partner sellers. Currently, customers with a question always contact bol.com customer service first. If the question is best answered by a partner seller, the customer is redirected to that partner. To improve the customer experience and to save costs, we want to determine automatically which party is most suitable to answer the question.

ContactBot grew out of the idea of routing customers more efficiently to the channels where their issues get solved. We do this by showing users the appropriate contact channels and encouraging them to pick one, based on the information they gave us in previous steps.

Building the bot

We started from the design of a product that holds a conversation-like question-and-answer session with customers, without free-text input. In each step we try to understand what the customer wants to achieve, and this has to work across many different combinations of customer-specific situations. We translated the problem into a Finite State Machine (FSM), where each state represents the current situation of the conversation in terms of the functional design. There is a clear mapping between the states of a conversation, and each state change is triggered by a certain event. In each state the bot performs a certain job, and a transition then occurs based on the input we get from the user.

We tried some libraries for FSM implementations and ran into several shortcomings, e.g. states that must be declared statically as enums and limited flexibility in defining the types of actions performed on state changes.

Eventually, we decided to go with our own simple FSM implementation. This was a bit tricky in the beginning, since some states need user interaction and have to wait for incoming events. The whole state machine for each conversation also needs to be served under an endpoint.
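To make the state, event and transition vocabulary concrete, here is a minimal sketch of a table-driven FSM in Kotlin. It is not ContactBot's implementation: the state and event names, the `StateMachine` class and `buildMachine` are all hypothetical, and states are plain strings so they do not have to be declared statically.

```kotlin
// Hypothetical sketch of a table-driven FSM; not ContactBot's actual code.
class StateMachine(
    initial: String,
    private val transitions: Map<Pair<String, String>, String>
) {
    var current: String = initial
        private set

    // Apply an incoming event; unknown (state, event) pairs are rejected.
    fun fire(event: String): String {
        current = transitions[current to event]
            ?: error("No transition from '$current' on '$event'")
        return current
    }
}

// Illustrative conversation: ask for a topic, then an order, then show channels.
fun buildMachine() = StateMachine(
    initial = "ask-topic",
    transitions = mapOf(
        ("ask-topic" to "topic-chosen") to "ask-order",
        ("ask-order" to "order-chosen") to "show-channels",
        ("show-channels" to "channel-chosen") to "done"
    )
)

fun main() {
    val fsm = buildMachine()
    listOf("topic-chosen", "order-chosen", "channel-chosen")
        .forEach { event -> println("$event -> ${fsm.fire(event)}") }
}
```

Even in this small form, every new step means a new state name, a new event name and a new row in the transition table, which is where the bookkeeping starts to grow.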

We faced some challenges during development. One of them was making the actions we defined in the conversations reusable while still being able to change one flow easily without breaking other flows. We solved this with appropriate isolation in the definitions and with abstraction.

During the project, we noticed that the complexity of the FSM kept growing and that maintainability would become an issue. Following the flow from the code became time-consuming. We also visualized our flow, and as Figure 1 shows, even visually it is hard to follow what is happening. This made us think about a better way of defining and visualizing flows.

Figure 1.

State Machine Visualization

Sushi is better than Spaghetti code

Keeping track of state names and translating incoming user events into transitions makes the flow hard to follow as the number of states and transitions increases.

We didn't necessarily need to define states since every state is mapped to the execution of an action. We can just define actions and connect them to each other based on user interaction, but at the same time keep track of where we are in this graph of actions.

We decided to take a different approach from FSMs and changed the concept of events: there are no events anymore, and all transitions are defined in each step of the execution.
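The idea of dropping events and letting each step decide what comes next can be sketched as follows. This is an illustration only, not Sushi's engine: `Step`, `execute` and the step ids are hypothetical names. Each step does its work on a shared context and returns the id of the next step, or null to stop.

```kotlin
// Hypothetical sketch; Step and execute are not part of Sushi's API.
// A step performs its action and returns the id of the next step (null = stop).
class Step(val id: String, val run: (MutableMap<String, Any>) -> String?)

// Walk the graph from startId and return the ids of the steps that were executed.
// Note: a flow containing a cycle would loop forever in this simple version.
fun execute(
    steps: List<Step>,
    startId: String,
    context: MutableMap<String, Any>
): List<String> {
    val byId = steps.associateBy { it.id }
    val visited = mutableListOf<String>()
    var currentId: String? = startId
    while (currentId != null) {
        val step = byId[currentId] ?: error("Unknown step: $currentId")
        visited += step.id
        currentId = step.run(context)
    }
    return visited
}

fun sampleTrace(): List<String> {
    val steps = listOf(
        Step("ask-menu") { ctx -> ctx["menu"] = "maguro"; "check-maguro" },
        Step("check-maguro") { ctx -> if (ctx["menu"] == "maguro") "order" else null },
        Step("order") { _ -> null }
    )
    return execute(steps, "ask-menu", mutableMapOf())
}

fun main() {
    println(sampleTrace())
}
```

The current position in the graph is simply the id of the step being executed, so there are no separate state names or event types to maintain.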

Now we are ready to introduce Sushi.

What is Sushi?

Sushi is a Kotlin-based flow design DSL, built to create complicated flows with simplicity in mind. Flows are easy to define in Sushi; in fact, you do not even need programming knowledge.

Sushi is made of blocks. You define blocks with their types and connect them. There are three types of blocks:

  1. Action
  2. Branch
  3. Container

1. Actions

These blocks are the main ingredients for making delicious sushi. Each action has a type, which determines what it actually does. Figure 2 below shows how an action works.

Figure 2.

Action block

2. Branches

If you want to control your flow based on conditions you can use Branch blocks.

3. Containers

What if we want to reuse a group of blocks that are connected to each other? Containers make this possible. Using one is similar to a function call in a programming language: it lets us build more complicated flows by isolating the complexity in different layers.
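The function-call analogy can be illustrated with a small composition sketch. The `Block` interface and `Container` class below are hypothetical and only mirror the idea, not Sushi's actual API: a container wraps a sequence of blocks and is itself a block, so it can be dropped into a larger flow as a single step.

```kotlin
// Hypothetical sketch of the container idea; not Sushi's actual API.
// A block takes the current context and returns an updated one.
fun interface Block {
    fun run(context: Map<String, String>): Map<String, String>
}

// A container runs its inner blocks in order and is itself a Block,
// so callers can treat it like any single action.
class Container(private val inner: List<Block>) : Block {
    override fun run(context: Map<String, String>): Map<String, String> =
        inner.fold(context) { ctx, block -> block.run(ctx) }
}

fun demo(): Map<String, String> {
    val goToStore = Block { it + ("place" to "store") }
    val askMenu = Block { it + ("menu" to "maguro") }
    // Two connected blocks packaged as one reusable unit.
    val visitRestaurant = Container(listOf(goToStore, askMenu))
    val order = Block { it + ("order" to it.getValue("menu")) }
    // The container is used as a single step in a larger flow.
    return Container(listOf(visitRestaurant, order)).run(emptyMap())
}

fun main() {
    println(demo())
}
```

Because the outer flow only sees `visitRestaurant` as one block, its inner blocks can change without touching the flows that use it.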

Sushi comes in two artifacts: core and service. If you want to use Sushi as a library in your own project and define your own actions with Sushi's API, go with sushi-core. If you would rather have Sushi take care of everything as a self-contained server that manages persistence and runs the flows for you, sushi-service is your choice.

A DSL for building flows

The beauty of Sushi becomes clear when you actually start using it. There are three ways to build flows in Sushi:

  1. Programmatically, using Kotlin DSL. (You need to know Kotlin)
  2. Using TOML configuration files. (You need to know how to configure blocks)
  3. Using UI to generate the flow. (You just drag and drop visually)

1. Define Flows Programmatically

You only need to create a list of blocks and specify what would be the next blocks. Then Sushi's engine will take care of wiring and validating your flow.

Just as easy as ordering sushi in a restaurant.

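// Each block has an id; nextBlocks lists the ids of the blocks that follow it.
val flows = mutableListOf(
    Action().apply {
        name = "go to the store"
        id = "first-action-id"
        type = "go-to-store"
        source = true
        nextBlocks = mutableListOf("second-action-id")
    }, Action().apply {
        name = "ask the waitress for the menu"
        id = "second-action-id"
        type = "ask-menu"
        // Points at the id of the branch block defined next.
        nextBlocks = mutableListOf("check-maguro")
    }, Branch().apply {
        name = "check if they have Maguro Teriyaki"
        id = "check-maguro"
        on = "menu"
        // Same mapping as in the TOML version of this flow.
        mapping = mapOf(
            "branch-1" to "has-maguro",
            "branch-2" to "6",
            "branch-3" to "7"
        )
    }
)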

flowEngine.wire(flows)
flowEngine.executeFlow()
flowEngine.await()

You can even inject objects using params of actions to do complicated tasks with dependencies. The engine will take care of all the complexity and wiring the blocks with each other.

2. Using TOMLs

TOML is an easily readable configuration format, and Sushi uses it for flow definitions. It is very easy to define tables and key-value pairs in TOML. You can read more about the syntax in the TOML documentation.

Keywords that you need to consider in defining blocks in Sushi:

  • You define your actions simply by their type.
  • then shows where to go after this block; it can list multiple blocks.
  • You can use the keyword depends when your action depends on other actions.
  • Define as many parameters as you want.

# The table headers below are assumed from the block types in the Kotlin example.
[[action]]
    name = "go to the store"
    source = true
    type = "go-to-store"
    id = "first-action-id"
    next = ["second-action-id"]

[[action]]
    name = "ask the waitress for the menu"
    type = "ask-menu"
    id = "second-action-id"
    next = ["check-maguro"]

[[branch]]
    name = "check if they have Maguro Teriyaki"
    id = "check-maguro"
    on = "menu"
    [branch.mapping]
        branch-1 = "has-maguro"
        branch-2 = "6"
        branch-3 = "7"

3. Using UI

For generating and adjusting TOMLs, you can also use our UI:

  • Just add blocks visually and connect them to each other.
  • Sushi Engine will take care of validations of the flow.
  • You can execute the flows from UI and get the results.

The UI is connected to the backend through WebSockets, so it receives updates from the backend on the fly while the actions execute. This is very useful for debugging your flow while you develop it.

You can search the library of actions, which consists of custom-made public actions by other users. Any change to the graphical flow updates the TOML configuration synchronously. You can also edit actions, change their parameters, and edit source blocks from the UI.

Figure 3.

Visualization of the flow

Containers

The real power of Sushi comes with Containers. They allow us to define a complex combination of blocks with specific parameters and reuse it easily. Anyone can define custom containers and make them public or private. By defining useful public containers, you help others reuse them as custom actions.

This is what really sets Sushi apart from other libraries and tools. Being able to abstract complexity into different layers is exactly what you want when defining a complicated flow.

Status and Future Work

Sushi is continuously under development, but we are already able to build microbots with it.

Sushi can be described as a hybrid solution for building flows, since it supports both state machine capabilities and slot filling at the same time. This makes it a good choice for building bots that are based on dynamic, non-linear flows.

Acknowledgments

I would like to thank Emiel Ubink, Daniël Heres, Sjors van Berkel, and all my awesome team members for their valuable feedback.

Amin Dorostanian
