JMeter Load Testing Tutorial: Regular Expressions

Using regular expressions for JMeter can be daunting

This is a guide to using the JMeter regular expression extractor, to help correlate response values with future request parameters. It introduces you to regular expressions and how they're implemented in JMeter. We'll take a look at common scenarios we see on Flood and answer common support questions.

You may have heard the term correlation used by performance testers, but what does it mean?

Correlate: to have a mutual relationship or connection, in which one thing affects or depends on another.

Setting aside the science of correlation and causation, in load testing language this usually refers to the act of correlating a dynamic value from a response with a value in a future request. If you are converting from LoadRunner to JMeter then you will know what I mean. If not, then read on!

Examples of dynamic data in response bodies

  • CSRF Tokens: the server sends a unique authenticity token in the body of a response, which is then used by the browser when it submits a form in the next request.
bash
first response
<<<<<<<<<<<<<<<<<<<<<<<<
<meta content="authenticity_token" name="csrf-param" />
<meta content="VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=" name="csrf-token" />
next request
>>>>>>>>>>>>>>>>>>>>>>>>
Content-Disposition: form-data; name="authenticity_token"
VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=
  • __VIEWSTATE: Microsoft® ASP.NET web applications persist changes to the state of a form in the body of a response, which is used on each subsequent postback to the server.
bash
first response
<<<<<<<<<<<<<<<<<<<<<<<<
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="qUJAF651pJ8WTL0r7dcB+HwCIu5roI89rxbRJhCalaDd5WwuJBR4XnqrSL+1ntHDz4JXmZX3J+uH1Z0yMMqHoN9lwc4qsduHqyB5IkMPTQtH7R7RLf3y+0JrfvE48s10Jo2WZ5X6kc3QM2jbBzG2VR1Fbnn9ZN9IV5nbN7Jc4+UQ3O8PuqpY+vG6hLdWsZOzo6FXAVa5ibL57KW7pPcUDzO+Zzi196o0WTz79HVUf2eQVK9uZEX4kWHOJcNmUcd8kyTU+Xobex3z0jnc29axbnsFZbbRnLLUeZw0Nnycn50qN7pafSBsEe2xG8FRGPdzVi6KNNfCLm7V/FGkJiDbFeopvkBNXXHx/gwJs1UONXqQm/YhTNraTb2B0fKzfbHcJ/2lTco+jQ/fPbNr6rPWrg+DGC5DohH1MdXb9Rtw9LDzJ0SUPC5B7kf+6uswY3jkrQPKYnp9hrhdqwvygjMe55Df0t6Um9voXv/vRR8HK60EZQQ8tYcG0Qnot/fYZw+Kt9lkY47mEZjL9YXoJDgQlNHK2uFBzCKB+L3CU1v7TanITLqWrOf7n6nujpUiQ5J0hbY9iPQpyvsyntA/cHGYr3vjF2OxmvurWAoZFA4f0r1Y2Ig0X16bz6OFJnY5IuJrzJuKTnGHHlTMyRkbbtSqnDN2yxYttHOQfrgXe5Y8pRK05vEHLtk8wBsZHl9xxzRJcBt6Gz+XcIFNa/5BiI8+cbiSP6ssdCYHsqumKevMotKFHdLRwY2OVijtKBrJFSDhKbtBPP3RM8zzc2KtY11+PKQGN58=" />
next request
>>>>>>>>>>>>>>>>>>>>>>>>
Content-Disposition: form-data; name="__VIEWSTATE"
qUJAF651pJ8WTL0r7dcB+HwCIu5roI89rxbRJhCalaDd5WwuJBR4XnqrSL+1ntHDz4JXmZX3J+uH1Z0yMMqHoN9lwc4qsduHqyB5IkMPTQtH7R7RLf3y+0JrfvE48s10Jo2WZ5X6kc3QM2jbBzG2VR1Fbnn9ZN9IV5nbN7Jc4+UQ3O8PuqpY+vG6hLdWsZOzo6FXAVa5ibL57KW7pPcUDzO+Zzi196o0WTz79HVUf2eQVK9uZEX4kWHOJcNmUcd8kyTU+Xobex3z0jnc29axbnsFZbbRnLLUeZw0Nnycn50qN7pafSBsEe2xG8FRGPdzVi6KNNfCLm7V/FGkJiDbFeopvkBNXXHx/gwJs1UONXqQm/YhTNraTb2B0fKzfbHcJ/2lTco+jQ/fPbNr6rPWrg+DGC5DohH1MdXb9Rtw9LDzJ0SUPC5B7kf+6uswY3jkrQPKYnp9hrhdqwvygjMe55Df0t6Um9voXv/vRR8HK60EZQQ8tYcG0Qnot/fYZw+Kt9lkY47mEZjL9YXoJDgQlNHK2uFBzCKB+L3CU1v7TanITLqWrOf7n6nujpUiQ5J0hbY9iPQpyvsyntA/cHGYr3vjF2OxmvurWAoZFA4f0r1Y2Ig0X16bz6OFJnY5IuJrzJuKTnGHHlTMyRkbbtSqnDN2yxYttHOQfrgXe5Y8pRK05vEHLtk8wBsZHl9xxzRJcBt6Gz+XcIFNa/5BiI8+cbiSP6ssdCYHsqumKevMotKFHdLRwY2OVijtKBrJFSDhKbtBPP3RM8zzc2KtY11+PKQGN58=

How do I know if I need to correlate response values with future requests?

An experienced tester will have an eye for this. When you look at network activity in your favourite proxy recorder or network debug tool, well-known parameters such as VIEWSTATE or JSESSIONID, as well as timestamps and tokens, should stand out like the proverbial. Other parameters may be more subtle to detect, especially single characters or obfuscated parameters that result from some form of JavaScript parsing or execution.

The only way to know for sure is to adapt the old proverb:

Measure twice and DIFF!

By that we mean take a snapshot or recording of your transaction twice with the same user, then use some form of file comparison tool to detect differences. If you can't do that, the Mark I Human Eyeball will often suffice.
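As a rough illustration of "measure twice and diff", here is a minimal Java sketch that compares two recordings line by line and reports the lines that changed. The class name, method name and sample recordings are hypothetical. It only compares lines at the same position, so unlike a real diff tool it will not realign content after an inserted line.

```java
import java.util.ArrayList;
import java.util.List;

class RecordingDiff {
    // Returns a description of each line position where the two recordings differ.
    static List<String> differingLines(String first, String second) {
        String[] a = first.split("\n");
        String[] b = second.split("\n");
        List<String> diffs = new ArrayList<>();
        int max = Math.max(a.length, b.length);
        for (int i = 0; i < max; i++) {
            // Treat a missing line in the shorter recording as empty.
            String left = i < a.length ? a[i] : "";
            String right = i < b.length ? b[i] : "";
            if (!left.equals(right)) {
                diffs.add("line " + (i + 1) + ": " + left + " | " + right);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        // Two hypothetical recordings of the same page for the same user.
        String run1 = "<title>Login</title>\n<meta content=\"aaa111\" name=\"csrf-token\" />";
        String run2 = "<title>Login</title>\n<meta content=\"bbb222\" name=\"csrf-token\" />";
        differingLines(run1, run2).forEach(System.out::println);
    }
}
```

Any line reported here is a candidate for correlation: its value changed between two otherwise identical runs.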

Correlation using JMeter

Using the above CSRF token example, let's take a look at how it's done in JMeter.

We'd make our first request with an HTTP Request sampler:

bash
Thread Group
 HTTP Request
   Server Name or IP: flood.io
   Path: /

To that request, we'd add a Regular Expression Extractor

bash
Thread Group
 HTTP Request
   -> Regular Expression Extractor
     Reference Name: authenticity_token
     Regular Expression: content="(.+?)" name="csrf-token"
     Template: $1$
     Match No. (0 for Random): 1
     Default Value:

The Reference Name is the name of the JMeter variable that will store the result of the expression, in this case ${authenticity_token}.

Your First Regular Expression

If you don't know any regex, take heart, you can get started with some basics. Consider the following expression:

`content="(.+?)" name="csrf-token"`

All regular expressions are pattern matching. This expression says:

  1. match the characters content=" literally (case sensitive)
  2. then capture the 1st group in brackets (.+?)
  3. inside that group, match any character .+? between one and unlimited times, as few times as possible, expanding as needed (otherwise known as lazy)
  4. then match the characters " name="csrf-token" literally (case sensitive)

Visually the expression looks like this:

Now that we have a regular expression, we specify a template for using it with $1$. This means the variable ${authenticity_token} will be populated with the first matched group only.

Our Match Number is set to 1 because we want the first instance of this matched string; there could be multiple matches on a page.
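To experiment with the expression outside JMeter, the following Java sketch applies it to the CSRF example and returns the first captured group, falling back to a default value. It uses java.util.regex as a stand-in; that is an assumption, since JMeter's own regex engine may differ, although this simple pattern should behave the same way. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsrfTokenSketch {
    // Same expression as in the extractor above.
    static final Pattern TOKEN = Pattern.compile("content=\"(.+?)\" name=\"csrf-token\"");

    // Returns group 1 of the first match (Template $1$, Match No. 1),
    // or the default value when nothing matches.
    static String extract(String body, String defaultValue) {
        Matcher m = TOKEN.matcher(body);
        return m.find() ? m.group(1) : defaultValue;
    }

    public static void main(String[] args) {
        String body = "<meta content=\"authenticity_token\" name=\"csrf-param\" />\n"
                    + "<meta content=\"VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=\" name=\"csrf-token\" />";
        System.out.println(extract(body, "NOT_FOUND"));
    }
}
```

One thing to watch: a lazy group still starts at the leftmost possible position. In this sketch the two meta tags sit on separate lines and `.` does not match a newline, so the first tag is skipped. If both tags were on a single line, `(.+?)` would begin at the first `content="` and capture everything up to the token's closing quote. A narrower group such as `([^"]+)` avoids that.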

Some Variations on Regex

The JMeter manual has some useful regular expressions which you can familiarise yourself with. Following is an example which fleshes out some of these concepts.

Often you will need to extract multiple attributes from an HTML tag, for example the ID and Name attributes associated with a particular class.

bash
<input class="cats" id="meow" name="Buster">
<input class="cats" id="purr" name="Mac">
<input class="cats" id="roar" name="Sooky">

Consider the following regular expression extractor:

bash
Thread Group
 HTTP Request
   -> Regular Expression Extractor
     Reference Name: cat
     Regular Expression: class="cats" id="(.+?)" name="(.+?)"
     Template: $2$ says $1$
     Match No. (0 for Random): 1
     Default Value:

This would yield the following results:

bash
cat=Buster says meow
cat_g=2
cat_g0=class="cats" id="meow" name="Buster"
cat_g1=meow
cat_g2=Buster

What does that all mean? We stored the results of the expression class="cats" id="(.+?)" name="(.+?)" in a JMeter variable called ${cat}. The template we used was the 2nd group, followed by the string says, followed by the 1st group. So in effect ${cat} now equals Buster says meow.

JMeter also breaks the expression up into other variables, which is handy when we only want parts of the matched expression. For example, ${cat_g1} holds the 1st group, meow, and likewise ${cat_g2} holds the 2nd group, Buster. We can also get the entire matched string from the regular expression via ${cat_g0}.
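The variable naming described above can be imitated in plain Java. This is only a sketch of the behaviour shown in the results, not JMeter's implementation: it uses java.util.regex as a stand-in engine, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateSketch {
    // Replaces each $n$ placeholder in the template with capture group n.
    static String applyTemplate(String template, Matcher m) {
        Matcher placeholder = Pattern.compile("\\$(\\d+)\\$").matcher(template);
        StringBuffer out = new StringBuffer();
        while (placeholder.find()) {
            String group = m.group(Integer.parseInt(placeholder.group(1)));
            placeholder.appendReplacement(out, Matcher.quoteReplacement(group));
        }
        placeholder.appendTail(out);
        return out.toString();
    }

    // Builds name, name_g, name_g0, name_g1, ... for the first match only.
    static Map<String, String> extractFirst(String name, String regex, String template, String body) {
        Map<String, String> vars = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(body);
        if (m.find()) {
            vars.put(name, applyTemplate(template, m));
            vars.put(name + "_g", String.valueOf(m.groupCount()));
            for (int g = 0; g <= m.groupCount(); g++) {
                vars.put(name + "_g" + g, m.group(g));
            }
        }
        return vars;
    }

    public static void main(String[] args) {
        String body = "<input class=\"cats\" id=\"meow\" name=\"Buster\">\n"
                    + "<input class=\"cats\" id=\"purr\" name=\"Mac\">";
        extractFirst("cat", "class=\"cats\" id=\"(.+?)\" name=\"(.+?)\"", "$2$ says $1$", body)
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Running it on the sample inputs prints the same five variables listed in the results above.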

What happens if we wanted all the matches on a page? This is where Match No. comes into play. In previous examples we used the 1st match found on the page.

Match No. is not a zero based index! If you specify 0 then you will get a random match from the page.

So this expression:

bash
Thread Group
 HTTP Request
   -> Regular Expression Extractor
     Reference Name: cat
     Regular Expression: class="cats" id=".+?" name="(.+?)"
     Template: $1$
     Match No. (0 for Random): 0
     Default Value:

Would yield the following results:

bash
cat=Sooky
cat_g=1
cat_g0=class="cats" id="roar" name="Sooky"
cat_g1=Sooky

In this case we only captured one group and used a random match on the page, so in this iteration ${cat} equals Sooky.

In further iterations, we would see any random value from Buster, Sooky and Mac.
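The Match No. rules for positive values and zero can be sketched in Java as follows. This is an illustration of the behaviour described here, assuming java.util.regex as a stand-in engine; the class and method names are hypothetical, and negative values are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchNumberSketch {
    // Collects group 1 of every match in the body, in page order.
    static List<String> allMatches(String regex, String body) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(body);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    // matchNo > 0 picks the nth match (1-based); matchNo == 0 picks a random one.
    // Returns the default value when the requested match does not exist.
    // Negative values (all matches) are not handled in this sketch.
    static String pick(List<String> matches, int matchNo, String defaultValue, Random random) {
        if (matches.isEmpty() || matchNo < 0 || matchNo > matches.size()) {
            return defaultValue;
        }
        int index = (matchNo == 0) ? random.nextInt(matches.size()) : matchNo - 1;
        return matches.get(index);
    }

    public static void main(String[] args) {
        String body = "<input class=\"cats\" id=\"meow\" name=\"Buster\">\n"
                    + "<input class=\"cats\" id=\"purr\" name=\"Mac\">\n"
                    + "<input class=\"cats\" id=\"roar\" name=\"Sooky\">";
        List<String> cats = allMatches("class=\"cats\" id=\".+?\" name=\"(.+?)\"", body);
        System.out.println("first:  " + pick(cats, 1, "", new Random()));
        System.out.println("random: " + pick(cats, 0, "", new Random()));
    }
}
```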

The other trick we might like to do is return all matches on the page.

This expression:

bash
Thread Group
 HTTP Request
   -> Regular Expression Extractor
     Reference Name: cat
     Regular Expression: class="cats" id="(.+?)" name="(.+?)"
     Template: $1$
     Match No. (0 for Random): -1
     Default Value:

Yields the following results:

bash
cat=Mac
cat_1=meow
cat_1_g=2
cat_1_g0=class="cats" id="meow" name="Buster"
cat_1_g1=meow
cat_1_g2=Buster
cat_2=purr
cat_2_g=2
cat_2_g0=class="cats" id="purr" name="Mac"
cat_2_g1=purr
cat_2_g2=Mac
cat_3=roar
cat_3_g=2
cat_3_g0=class="cats" id="roar" name="Sooky"
cat_3_g1=roar
cat_3_g2=Sooky
cat_matchNr=3

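There's a new variable present called ${cat_matchNr}, which equals 3. As the name suggests, this is the total number of matches on the page. (With a negative Match No., ${cat} itself is not filled from a match; the JMeter manual says it is set to the default value, so the cat=Mac shown above is most likely left over from an earlier extraction.) The match count can be quite handy for a number of reasons. For example, if we wanted to loop through all the matches in the response and include them in the next request, we could do something like this using a BeanShell PreProcessor.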

The following BeanShell script executes a basic for loop, from 1 up to cat_matchNr, which equals 3, and for each iteration of the loop adds an HTTP request parameter as follows:

bash
Thread Group
 HTTP Request
   -> BeanShell PreProcessor
     Script:
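       // cat_matchNr holds the number of matches found by the extractor
       int count = Integer.parseInt(vars.get("cat_matchNr"));

       for (int i = 1; i <= count; i++)
       {
         // group 1 is the id (e.g. meow), group 2 is the name (e.g. Buster)
         String says = vars.get("cat_" + i + "_g1");
         String cat = vars.get("cat_" + i + "_g2");
         // adds e.g. Buster_says=meow to the request parameters
         sampler.addArgument(cat + "_says", says);
       }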

This yields the following request:

`GET http://wheres.my.kitten.com/?Buster_says=meow&Mac_says=purr&Sooky_says=roar`
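To try the whole flow outside JMeter, the sketch below imitates the Match No. -1 variables and then runs the same loop as the BeanShell script, building the query string by hand instead of calling sampler.addArgument. It assumes java.util.regex as a stand-in engine, uses hypothetical class and method names, and does no URL encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllMatchesSketch {
    // Imitates Match No. -1 with Template $1$: stores name_n, name_n_g,
    // name_n_g0, name_n_g1, ... for every match, plus name_matchNr.
    static Map<String, String> extractAll(String name, String regex, String body) {
        Map<String, String> vars = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(body);
        int n = 0;
        while (m.find()) {
            n++;
            String prefix = name + "_" + n;
            vars.put(prefix, m.group(1));
            vars.put(prefix + "_g", String.valueOf(m.groupCount()));
            for (int g = 0; g <= m.groupCount(); g++) {
                vars.put(prefix + "_g" + g, m.group(g));
            }
        }
        vars.put(name + "_matchNr", String.valueOf(n));
        return vars;
    }

    // Same loop as the BeanShell script, joined into a query string.
    static String toQuery(Map<String, String> vars) {
        int count = Integer.parseInt(vars.get("cat_matchNr"));
        StringJoiner query = new StringJoiner("&");
        for (int i = 1; i <= count; i++) {
            String says = vars.get("cat_" + i + "_g1");
            String cat = vars.get("cat_" + i + "_g2");
            query.add(cat + "_says=" + says);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String body = "<input class=\"cats\" id=\"meow\" name=\"Buster\">\n"
                    + "<input class=\"cats\" id=\"purr\" name=\"Mac\">\n"
                    + "<input class=\"cats\" id=\"roar\" name=\"Sooky\">";
        Map<String, String> vars = extractAll("cat", "class=\"cats\" id=\"(.+?)\" name=\"(.+?)\"", body);
        System.out.println(toQuery(vars)); // Buster_says=meow&Mac_says=purr&Sooky_says=roar
    }
}
```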

Extracting from the Header

Sometimes the value you are after does not exist in the response body but in the response headers.

It's easy to do this: just change the Response Field to Check to Headers.

So this expression:

bash
Thread Group
 HTTP Request
   -> Regular Expression Extractor
     Response Field to Check: Headers
     Reference Name: auth_token
     Regular Expression: token":"([^"]+)"
     Template: $1$
     Match No. (0 for Random): 1

Would yield the following results:

bash
auth_token=abcd1234
auth_token_g=1
auth_token_g0=token":"abcd1234"
auth_token_g1=abcd1234
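The article does not show the header that produced this result, so the sketch below uses a hypothetical X-Session header carrying a small JSON value. It applies the same expression to the raw header text, with java.util.regex as an assumed stand-in for JMeter's engine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderTokenSketch {
    // Same expression as in the extractor above: everything up to the next quote.
    static final Pattern TOKEN = Pattern.compile("token\":\"([^\"]+)\"");

    // Returns group 1 of the first match, or the default value when nothing matches.
    static String extract(String headers, String defaultValue) {
        Matcher m = TOKEN.matcher(headers);
        return m.find() ? m.group(1) : defaultValue;
    }

    public static void main(String[] args) {
        // Hypothetical response headers; only the token value comes from the example above.
        String headers = "HTTP/1.1 200 OK\n"
                       + "Content-Type: text/html\n"
                       + "X-Session: {\"token\":\"abcd1234\"}";
        System.out.println(extract(headers, "NOT_FOUND")); // abcd1234
    }
}
```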

Extracting values from JSON

Flood supports the use of JMeter plugins which make it very simple to extract JSON values from a response body.

Consider the following response body from a typical HTTP response containing JSON:

bash
{
 "ok" : true,
 "status" : 200,
 "name" : "Armory",
 "version" : {
   "number" : "0.19.8",
   "snapshot_build" : false
 },
 "tagline" : "You Know, for Search"
}

This expression:

bash
Thread Group
 HTTP Request
   -> jp@gc - JSON Path Extractor
     Name: name
     JSON Path: $.name

   -> jp@gc - JSON Path Extractor
     Name: version_number
     JSON Path: $.version.number

Would yield the following results:

bash
name=Armory
version_number=0.19.8

It doesn't get much simpler than that!

The examples we've shown on this page are probably enough to get you started. Feel free to contact support with more specific questions.

