
Performance Tests for Your Minimal Viable Product (MVP)

You are excited: you have just finished the Minimal Viable Product (MVP) of your web application and are ready to go live! You have prepared this MVP well: the requirements have been gathered through extensive market research, you have found some early customers to test a prototype, and you have set up a marketing strategy supporting the launch date.

And then the big moment arrives: you launch the MVP and kick off the marketing campaign. Within minutes of the newsletter going out, 10 people click the link to your application... And boom, everything slows to a crawl!

So what happened? You find a database query that is only slow when several users add data to your application at the same time. It is code you wrote early on and have overlooked ever since. You fix it now, but those 10 people won't come back.

This could have been avoided with some simple performance testing before the launch. This tutorial shows how to tackle it and applies to all kinds of web applications, whether they are serverless, monolithic or built on microservices.

Step 1: define the user actions

Define the actions you expect your users to perform. If you are already live, it is better to actually measure this.

As an example, we take a web application I'm still maintaining: Winston-Analytics, a speech coaching bot. User actions for our app are:

  • A user logs in

  • A user lists their speeches

  • A user uploads a new speech

  • A user views a speech report

  • A user shares a speech report

Note that we are only interested in the load on the backend systems. The performance in the browser is out of scope for this tutorial. Examples for our coaching bot that are browser-only: scrolling through the report, playback of an already loaded audio file...

There is no need to define all actions – this is not a functional test – but it is important to have a realistic impact on all subsystems:

  • Cover all complex database queries

  • Cover all CPU-intensive tasks (e.g. speech analysis)

  • Cover all internal and external services (e.g. authentication service)

Step 2: define the load of the user actions

Try to estimate (or measure if you are already live) the following numbers:

  1. What is the maximum number of users on your application at the same time?

  2. How long are users on your application?

  3. How many times does each user execute each action?

Example for our speech coaching bot:

  1. There are at most 360 users at the same time

  2. They use the application for 30 minutes

  3. Each user does:

    • 1 login

    • 4 speech listings

    • 2 speech uploads

    • 3 speech report views

    • 1 speech report sharing
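These numbers translate directly into the request rates (throughput) we will configure in the load tool in step 3. As a rough sketch of the arithmetic, assuming the load is spread evenly over the 30-minute window:

# requests per second = (actions per user * 360 users) / (30 min * 60 s)
# logins:              1 * 360 / 1800 = 0.2 req/s
# speech listings:     4 * 360 / 1800 = 0.8 req/s
# speech uploads:      2 * 360 / 1800 = 0.4 req/s
# speech report views: 3 * 360 / 1800 = 0.6 req/s
# report sharings:     1 * 360 / 1800 = 0.2 req/s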

Step 3: implement the user actions

We will use a tool to generate the load on our application. In this tutorial, we use Taurus as a wrapper around JMeter; both are free to use.

We need to mimic the requests that the user's browser sends to the backend. You can use the Network tab of your browser's developer console to identify them. Only include the requests that actually do something on the backend (so no static files):

  • If your backend is only a REST interface, then you can filter for the XHR requests.

  • If your backend serves templates generated on the server, then you need to know which HTML files are generated, and which ones are just static files.

Keep in mind that you will need to find out how authentication is done (e.g. via a session cookie).

In Taurus, we define our test script in a YAML file. Each request can be defined, and you can even extract data from previous requests to use as a variable in subsequent ones (e.g. a session cookie or a user id). If you have a lot of scenarios, or if you would need to parse HTML to get this data, you can use a fixed data set and "hard code" the values instead. Either way, you will need to prepare some data in your system up front (see step 5 below); for our speech coaching bot, this would be: users, speech reports, etc.
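To get a feel for the file format, here is a minimal sketch with a single authenticated request. The file name, the session cookie value and the user id are placeholders taken from the test data further below; adapt them to your own system:

# minimal-test.yaml: one authenticated request, just to get started
execution:
- scenario: smoke
  concurrency: 1
  iterations: 1

scenarios:
  smoke:
    cookies:
    - name: session_id
      value: 6c5729b1-53a4-4028-abf3-695fe2accd9b  # must exist in the system under test
      domain: speech.winston-analytics.com
    requests:
    - label: list-speeches
      url: https://speech.winston-analytics.com/speeches?userId=f293u8uf29
      method: GET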

If you have a look at the Taurus documentation, you can choose (A) to treat each user action as a separate scenario, or (B) to group all actions into one scenario. Both options have their pros and cons.

Option A: one scenario for each user action

Pro: you can easily adjust the load of each user action independently.

Con: you cannot pass data from one scenario to another. For example, if you have a separate login scenario, you cannot pass its session cookie to the other scenarios; you need to provide valid session cookies in their test data instead.

Let's take the following three actions as an example: login, list speeches, view speech report. We provide Taurus with the test data in CSV files. user_credentials.csv:

username,password
test001@example.com,pass1234
test002@example.com,pass1234
(...)

user_data.csv (we provide three different speeches per user session):

user_id,session_id,speech_id
f293u8uf29,6c5729b1-53a4-4028-abf3-695fe2accd9b,6c5729b153a4
f293u8uf29,6c5729b1-53a4-4028-abf3-695fe2accd9b,of0923rods04
f293u8uf29,6c5729b1-53a4-4028-abf3-695fe2accd9b,if02d97g00ss
keo230lr2s,c9facb95-6367-4fb3-917b-48f2a3f82a88,902uf9pkdfui
keo230lr2s,c9facb95-6367-4fb3-917b-48f2a3f82a88,kdkdiii92332
keo230lr2s,c9facb95-6367-4fb3-917b-48f2a3f82a88,90eiujr28999
(...)

The actual Taurus configuration file:

execution:
- scenario: login
  concurrency: 1
  hold-for: 5m
  throughput: 0.2  # = 1 * 360 req / 30 min
- scenario: list-speeches
  concurrency: 1
  hold-for: 5m
  throughput: 0.8  # = 4 * 360 req / 30 min
- scenario: view-speech-report
  concurrency: 1
  hold-for: 5m
  throughput: 0.6  # = 3 * 360 req / 30 min

settings:
  env:
    domain: speech.winston-analytics.com
    base_url: https://speech.winston-analytics.com

scenarios:

  login:
    data-sources:
    # Extract ${username} and ${password}:
    - user_credentials.csv
    requests:
    - label: login
      url: ${base_url}/login
      method: POST
      headers:
        Content-Type: application/json
      body:
        username: ${username}
        password: ${password}

  list-speeches:
    data-sources:
    # Extract ${session_id} and ${user_id}:
    - user_data.csv
    cookies:
    - name: session_id
      value: ${session_id}
      domain: ${domain}
    requests:
    - label: list-speeches
      url: ${base_url}/speeches?userId=${user_id}
      method: GET

  view-speech-report:
    data-sources:
    # Extract ${session_id} and ${speech_id}:
    - user_data.csv
    cookies:
    - name: session_id
      value: ${session_id}
      domain: ${domain}
    requests:
    - label: view-speech-report
      url: ${base_url}/speeches/${speech_id}
      method: GET

Option B: group all user actions in one scenario

Pro: you can easily pass data (e.g. session cookie, report id) from one user action to the next one.

Con: each user executes their actions sequentially, meaning that every thread starts with a login call, then the next action, and so on. If you hit the limits of your backend, you will need to tweak the ramp-up (to spread the logins over time) and possibly define several scenarios with different action orders.

As user data, we only need user_credentials.csv (see option A).

The actual Taurus configuration file:

execution:
- scenario: all-actions
  concurrency: 60  # 60 threads = 360 users / 30 min * 5 min
  ramp-up: 5m
  hold-for: 10m
  throughput: 0.2  # = 1 * 360 req / 30 min

scenarios:

  all-actions:

    variables:
      base_url: https://speech.winston-analytics.com

    data-sources:
    # Extract ${username} and ${password}:
    - user_credentials.csv

    store-cookie: true  # Is actually true by default

    requests:
    - label: login
      url: ${base_url}/login
      method: POST
      headers:
        Content-Type: application/json
      body:
        username: ${username}
        password: ${password}
      extract-jsonpath:
        user_id:
          # userId is a top-level key in the response body
          jsonpath: $.userId

    - label: list-speeches
      url: ${base_url}/speeches?userId=${user_id}
      method: GET
      extract-jsonpath:
        speech_ids:
          # Get the first three speeches from the response:
          jsonpath: $.speeches[:3].id

    - foreach: speech_id in speech_ids
      do:
      - label: view-speech-report
        url: ${base_url}/speeches/${speech_id}
        method: GET

Step 4: define your metrics

It is important to define some metrics before running your tests. There are two categories:

  • Metrics related to the HTTP calls, like latency, duration and failure rate. Those will be generated by Taurus, so there is no need to worry about them now.

  • Metrics related to the backend systems, like CPU utilisation, memory usage, etc. Preferably, you already have monitoring in place, so you can easily see how the system behaves during the load tests. If you haven't, you can log in to your system and watch the metrics live (e.g. by opening top on your servers), or let Taurus collect them for you, as sketched below.
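Taurus can collect such backend metrics via its monitoring service. This is only a sketch: it assumes you have installed the PerfMon ServerAgent on the backend host and opened its port, and the address below is a placeholder:

services:
- module: monitoring
  server-agent:
  - address: backend.example.com:4444  # host running the ServerAgent (placeholder)
    metrics:
    - cpu
    - memory
    - disks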

Step 5: run the tests

If you are already live in production and have a staging environment, it is recommended to use that environment for the performance tests, but only if it has infrastructure similar to the production environment.

In step 3, we prepared test data for our load tests, like user IDs, session cookies, etc. Make sure that this data is valid for the system under test. You might need to write a small script to add it to your databases.

Before running the full performance test, you can first try with only a few calls, e.g.:

execution:
- scenario: <scenario_name>
  concurrency: 1
  iterations: 1

You can launch the test with:

bzt performance_test.yaml -report

In the console output, there will be a link to an online report. If everything looks fine, you can change your executions back to a more realistic load, and run the tests again.

Note that you need to keep an eye on the CPU and network load of the computer where you are running Taurus. If this becomes a bottleneck, you can use a test server in the same subnet as your backend. Depending on the size of that machine, you can easily simulate hundreds of requests per second with one Taurus command.

You will probably need to play with the load settings to see how your backend handles them. If you are using a non-production environment, you can also push the load until something breaks. This gives you an indication of when you will need to work on the scalability of your product.
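One way to find that breaking point is to step up the load gradually in the execution settings and watch your metrics until they degrade. A sketch with made-up numbers:

execution:
- scenario: all-actions
  concurrency: 300  # well above the expected 60 concurrent users
  ramp-up: 30m      # increase the load gradually...
  steps: 10         # ...in 10 discrete steps
  hold-for: 10m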

Afterthoughts

Performance is a requirement like any other, but it is often overlooked. Doing some simple performance tests is good practice before going live with your product.

Once you have done these tests, it is easy to redo them in the future, especially when making infrastructure changes. You can also add them to your continuous integration pipelines, so they run on a regular basis.
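For such automated runs it helps when the test fails on its own once response times or error rates get too high. Taurus offers a pass/fail reporting module for this; the thresholds below are arbitrary examples to tune to your own requirements:

reporting:
- module: passfail
  criteria:
  - avg-rt>500ms for 30s, stop as failed  # average response time too high
  - fail>5% for 30s, stop as failed       # too many failed requests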