State Timeouts with gen_statem

A few years ago, when I was first getting into Elixir, I wanted to learn some Erlang as well. While browsing through the Erlang docs, I discovered gen_statem. gen_statem is a behaviour in Erlang/OTP for building state machines. In this post, I’ll explore what I learned experimenting with gen_statem by stepping through a ticketing prototype application.

Getting Started

Since we’re in Elixir land, we could interoperate with gen_statem directly via its atom name: :gen_statem.start_link(). However, there’s a great Elixir wrapper around gen_statem, GenStateMachine, that provides all the conveniences we expect from a behaviour. For the sake of convenience, I’ll be using GenStateMachine in this post; see its readme for install instructions.

After a quick look at the docs for gen_statem, it appears to offer a couple of things:

  1. A state machine
  2. Timeouts for states

What does a state machine do? I like to think of a state machine as a predefined set of states. The transition from one state to another happens only when certain conditions are met.
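To make that idea concrete, here is a minimal sketch of a seat's states written as a plain transition function, with no process involved. The module name SeatTransitions, the :release and :purchase events, and the :sold state are assumptions made up for illustration; they are not part of the BoxOffice code in this post.

```elixir
defmodule SeatTransitions do
  # Hypothetical helper for illustration only.
  # Each clause lists one allowed {state, event} pair and the state it leads to.
  def next(:available, :hold), do: {:ok, :held}
  def next(:held, :release), do: {:ok, :available}
  def next(:held, :purchase), do: {:ok, :sold}

  # Any pair not listed above is not a valid transition.
  def next(state, event), do: {:error, {:invalid_transition, state, event}}
end
```

Calling SeatTransitions.next(:available, :hold) returns {:ok, :held}, while an unlisted pair such as SeatTransitions.next(:sold, :hold) returns an error tuple.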

Timeouts are really easy to do with a GenServer by sending a message to the current process like so:

Process.send_after(self(), {:transition_state, :available}, 1000)

Adding one timer to a process is easy enough. Adding several timers adds complexity, because each one has to be tracked and cancelled when it is no longer needed. Luckily, gen_statem makes state-based timers easy.
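For comparison, here is a rough sketch of what the manual wiring might look like in a plain GenServer with a single hold timer. The module name ManualHoldTimer and its release/1 function are assumptions for illustration and do not appear in the BoxOffice code; the point is only to show the bookkeeping involved.

```elixir
defmodule ManualHoldTimer do
  # Hypothetical GenServer that tracks one hold timer by hand.
  use GenServer

  def start_link(timeout_ms), do: GenServer.start_link(__MODULE__, timeout_ms)
  def hold(pid), do: GenServer.call(pid, :hold)
  def release(pid), do: GenServer.call(pid, :release)
  def current_state(pid), do: GenServer.call(pid, :current_state)

  @impl true
  def init(timeout_ms) do
    {:ok, %{state: :available, timer: nil, timeout_ms: timeout_ms}}
  end

  @impl true
  def handle_call(:hold, _from, %{state: :available} = s) do
    # Start the timer and remember its reference so it can be cancelled later.
    timer = Process.send_after(self(), :hold_timeout, s.timeout_ms)
    {:reply, {:ok, :held}, %{s | state: :held, timer: timer}}
  end

  def handle_call(:hold, _from, s), do: {:reply, {:error, s.state}, s}

  def handle_call(:release, _from, %{state: :held} = s) do
    # Leaving :held early means the timer has to be cancelled by hand.
    Process.cancel_timer(s.timer)
    {:reply, :ok, %{s | state: :available, timer: nil}}
  end

  def handle_call(:release, _from, s), do: {:reply, {:error, s.state}, s}

  def handle_call(:current_state, _from, s), do: {:reply, s.state, s}

  @impl true
  def handle_info(:hold_timeout, %{state: :held} = s) do
    {:noreply, %{s | state: :available, timer: nil}}
  end

  # Ignore a timeout message that arrives when the seat is no longer held.
  # A more careful version would also tag each message with a unique
  # reference to tell a stale timeout apart from the current one.
  def handle_info(:hold_timeout, s), do: {:noreply, s}
end
```

Even with one timer there is a reference to store, a cancel call to remember, and a stale-message case to think about. Each additional timer repeats that work.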

Box Office Demo

One way to model the usage for GenStateMachine is to use a ticket purchasing model. In this basic example a user attempting to buy a ticket can hold the ticket, but the hold must expire after a set interval.

Here’s the first pass at our module:

defmodule BoxOffice.ShowSeatState do
  use GenStateMachine

  alias BoxOffice.{Customer, ShowSeat}

  ### Client API
  def start_link(show_seat = %ShowSeat{}, opts \\ []) do
    %ShowSeat{current_state: current_state} = show_seat

    default_state = Keyword.get(opts, :default_state, :available)

    data = %{default_state: default_state, show_seat: show_seat}

    GenStateMachine.start_link(__MODULE__, {current_state, data})
  end

  @doc """
  Get the current state and data from the process.
  """
  def get_state(pid) do
    :sys.get_state(pid)
  end

  @doc """
  Get the current state of the seat.
  """
  def current_state(pid) do
    {current_state, _data} = :sys.get_state(pid)
    current_state
  end

  @doc """
  Hold a seat temporarily for a customer.
  """
  def hold(pid, customer = %Customer{}) do
    GenStateMachine.call(pid, {:hold, customer})
  end

  ### Server API

  @doc """
  State can transition from `available` to `held`.
  """
  def handle_event({:call, from}, {:hold, customer}, :available, data) do
    %{state_timeout: state_timeout} = data

    data = Map.put(data, :current_customer, customer)

    {:next_state, :held, data, [{:reply, from, {:ok, :held}}]}
  end
end

Most of the above code is pretty vanilla GenServer except for the handle_event callback. Instead of a {:reply, reply, state} or {:noreply, state}, the tuple starts with {:next_state, ...}. Let’s break this down some more.

The first element in the tuple :next_state indicates that the state is changing.

The second element :held is the new state of the ticket.

The third element data is what you would call the state in a GenServer. The data is anything (string, tuple, map, etc) that persists as the “state” of the process. This is not the state of the ticket / state machine.

The fourth and last element of the tuple is a list of actions (or it could be just one action). In this case, it’s a {:reply, ...} tuple because the client used a GenStateMachine.call.
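As a small illustration of the two action shapes, the sketch below talks to :gen_statem directly so that it runs without any dependency. The module name ReplyShapes and the {:error, :already_held} reply are assumptions for this example. The first clause passes the actions as a list; the second passes a single action and uses :keep_state_and_data, another return value from the gen_statem docs, to leave the state and data untouched.

```elixir
defmodule ReplyShapes do
  # Hypothetical module for illustration; uses the Erlang behaviour directly.
  @behaviour :gen_statem

  def start_link, do: :gen_statem.start_link(__MODULE__, :available, [])
  def hold(pid), do: :gen_statem.call(pid, :hold)

  @impl true
  def callback_mode, do: :handle_event_function

  @impl true
  def init(state), do: {:ok, state, %{}}

  @impl true
  def handle_event({:call, from}, :hold, :available, data) do
    # Actions given as a list, as in the hold example above.
    {:next_state, :held, data, [{:reply, from, {:ok, :held}}]}
  end

  def handle_event({:call, from}, :hold, :held, _data) do
    # A single action, not wrapped in a list. State and data stay as they are.
    {:keep_state_and_data, {:reply, from, {:error, :already_held}}}
  end
end
```

Calling ReplyShapes.hold(pid) twice on a fresh process returns {:ok, :held} the first time and {:error, :already_held} the second.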

How did I learn all this? Well there’s some information in the GenStateMachine documentation. The documentation shows :gen_statem.event_handler_result(state()) for the return type, and then it has a link to the Erlang documentation.

{next_state, NextState :: StateType, NewData :: data(), Actions :: [action()] | action()}

From the Erlang documentation, I had to dig into each type for the elements of the tuple. If you’re like me, you don’t read a lot of Erlang documentation. I find Erlang interesting, but it’s not always the easiest to understand. We’ll come back to the return type in a bit.

Here are a few tests to get the ball rolling:

defmodule BoxOffice.ShowSeatStateTest do
  use ExUnit.Case
  alias BoxOffice.{Customer, ShowSeat, ShowSeatState}

  setup do
    show_seat = %ShowSeat{id: 1, theater_id: 1, seat_id: 1, current_state: :available}
    customer = %Customer{id: 2, first_name: "Joe", last_name: "Blow"}

    {:ok, %{show_seat: show_seat, customer: customer}}
  end

  test "process holds the full state", context do
    %{show_seat: show_seat, customer: _customer} = context

    {:ok, pid} = ShowSeatState.start_link(show_seat)

    full_state = ShowSeatState.get_state(pid)

    assert full_state ==
             {:available, %{default_state: :available, show_seat: show_seat}}
  end

  test "spawn a process to track the current state", context do
    %{show_seat: show_seat, customer: _customer} = context

    {:ok, pid} = ShowSeatState.start_link(show_seat)

    assert ShowSeatState.current_state(pid) == :available
  end

  test "holds a seat for a set interval and resets state on timeout", context do
    %{show_seat: show_seat, customer: customer} = context

    {:ok, pid} = ShowSeatState.start_link(show_seat, state_timeout: 1_000)

    assert ShowSeatState.current_state(pid) == :available

    assert ShowSeatState.hold(pid, customer) == {:ok, :held}
    assert {:held, %{current_customer: ^customer}} = ShowSeatState.get_state(pid)

    Process.sleep(1_001)

    assert {:available, %{current_customer: nil}} = ShowSeatState.get_state(pid)
  end
end

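Looks like this last test fails, and for good reason: the state timeout is not implemented. The handle_event callback already pattern matches on a :state_timeout key in data, but start_link() never puts one there, so the match fails and the process crashes.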

  1) test holds a seat for a set interval and resets state on timeout (BoxOffice.ShowSeatStateTest)
     test/show_seat_state_test.exs:31
     ** (EXIT from #PID<0.191.0>) an exception was raised:
         ** (MatchError) no match of right hand side value: %{default_state: :available, show_seat: %BoxOffice.ShowSeat{current_state: :available, id: 1, seat_id: 1, theater_id: 1}}

Coming back to the documentation, let’s take a look at the last element of the tuple, the actions. In the Erlang docs for event_handler_result, you can click the action() type, which is hyperlinked. At first glance, the action() type has 3 subtypes:

action() =
    postpone |
    {postpone, Postpone :: postpone()} |
    {next_event,
     EventType :: event_type(),
     EventContent :: term()} |
    enter_action()

No wait, there are 4 subtypes. The last one, enter_action(), almost got past me the first few times I looked at the docs. Clicking enter_action() takes you to yet another type!

This is what enter_action() type looks like:

enter_action() =
  hibernate |
  {hibernate, Hibernate :: hibernate()} |
  (Timeout :: event_timeout()) |
  {timeout, Time :: event_timeout(), EventContent :: term()} |
  {timeout,
   Time :: event_timeout(),
   EventContent :: term(),
   Options :: timeout_option() | [timeout_option()]} |
  {state_timeout,
   Time :: state_timeout(),
   EventContent :: term()} |
  {state_timeout,
   Time :: state_timeout(),
   EventContent :: term(),
   Options :: timeout_option() | [timeout_option()]} |
  reply_action()

Ok, so there are quite a few options here. The one that looks most appropriate for triggering a timeout on the held state is state_timeout.

{state_timeout, Time :: state_timeout(), EventContent :: term()}

There’s also reply_action(), which is why a {reply, ...} tuple is allowed in this list of actions. Now that we better understand how to set a state timeout, let’s update the code.

Here are the changes to start_link(), which set a default state timeout and allow a custom timeout value to be passed in.

def start_link(show_seat = %ShowSeat{}, opts \\ []) do
  %ShowSeat{current_state: current_state} = show_seat

  default_state = Keyword.get(opts, :default_state, :available)
  state_timeout = Keyword.get(opts, :state_timeout, 5000)

  data = %{default_state: default_state, state_timeout: state_timeout, show_seat: show_seat}

  GenStateMachine.start_link(__MODULE__, {current_state, data})
end

This is the updated handle_event() for transitioning the state from available to held.

def handle_event({:call, from}, {:hold, customer}, :available, data) do
  %{state_timeout: state_timeout} = data

  data = Map.put(data, :current_customer, customer)

  {:next_state, :held, data,
   [
     {:reply, from, {:ok, :held}},
     {:state_timeout, state_timeout, :hold_timeout}
   ]}
end

Notice the last element of the tuple is a list of action() values. The first action is {:reply, from, {:ok, :held}}, so the client gets a response after calling the client API hold() function. Then there’s the newest addition: {:state_timeout, state_timeout, :hold_timeout}. The first element, the :state_timeout atom, indicates that a timeout should be set for the state being entered, :held. The second element is a variable bound to the number of milliseconds the timer runs for. The third and last element is the EventContent, here :hold_timeout, which will be used to identify and handle the timeout. If the state changes before the timer fires, gen_statem cancels the state timeout automatically.

Above we set a timeout for the held state. Now it’s time to handle the timeout and transition the state back to available.

  @doc """
  Timeout is triggered when the current state is `held`.
  State resets to the `default_state`.
  """
  def handle_event(:state_timeout, :hold_timeout, :held, data) do
    %{default_state: default_state} = data

    data =
      data
      |> Map.put(:current_customer, nil)

    {:next_state, default_state, data}
  end

Essentially, the code above looks up the default_state (available) that was set in start_link() and returns {:next_state, default_state, data}. Because the tuple starts with :next_state, the state transitions back to available. Now when the tests are run, the held state times out after the 1 second state_timeout passed to start_link().
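To see the whole timeout flow in one place, here is a condensed, dependency-free sketch that uses :gen_statem directly instead of GenStateMachine. The module name SeatTimeoutSketch, the purchase/1 function, and the :sold state are assumptions added for illustration; the post's BoxOffice code does not define them. The :purchase clause is there to show that leaving :held before the timer fires needs no manual cancel, since gen_statem drops a state timeout when the state changes.

```elixir
defmodule SeatTimeoutSketch do
  # Hypothetical, trimmed-down variant of the seat state machine.
  @behaviour :gen_statem

  def start_link(timeout_ms), do: :gen_statem.start_link(__MODULE__, timeout_ms, [])
  def hold(pid), do: :gen_statem.call(pid, :hold)
  def purchase(pid), do: :gen_statem.call(pid, :purchase)

  def current_state(pid) do
    {state, _data} = :sys.get_state(pid)
    state
  end

  @impl true
  def callback_mode, do: :handle_event_function

  @impl true
  def init(timeout_ms), do: {:ok, :available, %{state_timeout: timeout_ms}}

  @impl true
  def handle_event({:call, from}, :hold, :available, data) do
    # Reply to the caller and start the state timeout for :held.
    {:next_state, :held, data,
     [{:reply, from, {:ok, :held}}, {:state_timeout, data.state_timeout, :hold_timeout}]}
  end

  def handle_event({:call, from}, :purchase, :held, data) do
    # Changing state cancels the pending state timeout automatically.
    {:next_state, :sold, data, [{:reply, from, {:ok, :sold}}]}
  end

  def handle_event(:state_timeout, :hold_timeout, :held, data) do
    # The hold expired, so the seat becomes available again.
    {:next_state, :available, data}
  end
end
```

With a 50 ms timeout, a seat that is held and then left alone goes back to :available, while a seat that is held and then purchased stays :sold. The sketch only handles the events shown; any other call would crash the process.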

test "holds a seat for a set interval and resets state on timeout", context do
  %{show_seat: show_seat, customer: customer} = context

  {:ok, pid} = ShowSeatState.start_link(show_seat, state_timeout: 1_000)

  assert ShowSeatState.current_state(pid) == :available

  assert ShowSeatState.hold(pid, customer) == {:ok, :held}
  assert {:held, %{current_customer: ^customer}} = ShowSeatState.get_state(pid)

  Process.sleep(1_001)

  assert {:available, %{current_customer: nil}} = ShowSeatState.get_state(pid)
end

Wrapping up

It took a bit of digging, but using GenStateMachine (gen_statem) can simplify code that uses timers for states. The behaviour provides a well-defined structure for managing state transitions. Yes, the same can be done with a GenServer, though you have to do the wiring yourself. Overall, it was an interesting exercise to explore gen_statem and learn a bit more about Erlang too. The full source can be viewed on GitHub.
