Ask a Flooder 22: Señor Performo on Load Testing Scenarios, Part 2

Breakpoint/stress, soak/endurance, and spike tests: Leandro Melendez (aka Señor Performo) discusses the differences between them and which one is right for your requirements

Señor Performo is back with the second part of his load testing scenarios video. This time, he talks about the differences between breakpoint tests, soak tests, and spike tests.

Hello, Nicole and Flood amigos!

I am glad that you allowed me to come back and finish up describing some of the most common load test scenarios.

You will see that the wait was worth it. Most probably you have been left hanging since my last visit here, eager to find out about the rest of them.

But... before we move on, let's do a super quick recap of the ones that we saw in the last episode.

First, we saw the tiniest scenario, the single thread, single user, the lone ranger, etc. This one hits the processes one at a time, no load or micro loads.

Then we saw the average load scenario. We model this one from one average hour in production using an average volume of load.

And last, we mentioned the super average or busiest day scenario (also called the stress scenario, among other names), which pushes the average load scenario further while still staying within common circumstances.

Aaaand now, if you are ready... We will get moving over the rest of the most common load scenarios. Let's get to the 4th scenario!

Breakpoint scenario

4--This one also goes by many names, but one of the most common is breakpoint. The breakpoint scenario!

Some of its other aliases are stress (a bit confusing with the previous ones, I know), ceiling, and some others. Here I do like the breaking point name. This is because, as the name says, this scenario wants to see how far the system can go in terms of load before breaking.

When you are modeling and defining this scenario, you must be clear on what you mean when you say "how far it can go."

In other words, what will you count as "far"?

It could be the number of active users working regularly.
Or the number of transactions for a single process.
Or maybe the number of transactions from various processes.
Or... well, many many things.

But a way to simplify it, and a common practice, is to gradually increase the threads or virtual users in the average load scenario. Each thread or virtual user might be doing the same as it was in the average load scenario, but we will keep adding more until the system goes boom boom.

Here we just have ramp-up time. We start the threads and let them keep working while we keep starting more.

If you were diligent and executed the scenarios we mentioned in order, the system should by now have survived the average and super average loads.

Now, as mentioned, you can just use the ramp-up of the super average scenario and keep increasing the load similarly, but without the instruction to stop the increase. Just keep it going like the battery bunny... going and going and going. You only stop adding threads when you reach the limits of the system. Again, these limits can vary a bit, depending on what you are interested in finding as the ceiling or limit.

This could be the moment that your responses and metrics become unacceptable. In other words, when the SLAs are not met. This could also be when the response times or other metrics go bonkers and reach incredibly high levels, or when parts of the system start to fail and give you weird responses, error messages, or plain nonresponsive elements. And you could go as far as pretty much when the server starts to melt, is on fire, or a meteor falls on it.

Last, you could also do all of the above and take note of the load at which each breaking moment happens.
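
To make that last idea concrete, here is a minimal Python sketch of the keep-going-until-it-breaks loop. It is only an illustration under stated assumptions: `measure`, `fake_system`, the step size, and every threshold are made-up names and numbers, not part of any particular load testing tool. In a real test, `measure` would be replaced by whatever runs your scenario at a given number of virtual users and reports back the metrics.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical signature: given a number of virtual users, return
# (average response time in seconds, error rate between 0 and 1).
Measure = Callable[[int], Tuple[float, float]]


def find_breaking_points(
    measure: Measure,
    start_users: int = 10,
    step: int = 10,
    max_users: int = 1000,
    sla_seconds: float = 2.0,
    max_error_rate: float = 0.05,
) -> Dict[str, Optional[int]]:
    """Keep adding users and note the load at which each limit is first crossed."""
    breaks: Dict[str, Optional[int]] = {"sla": None, "errors": None}
    users = start_users
    while users <= max_users:
        response_time, error_rate = measure(users)
        if breaks["sla"] is None and response_time > sla_seconds:
            breaks["sla"] = users  # first load where the SLA is not met
        if breaks["errors"] is None and error_rate > max_error_rate:
            breaks["errors"] = users  # first load where errors pile up
        if all(value is not None for value in breaks.values()):
            break  # every limit we care about has been found
        users += step
    return breaks


def fake_system(users: int) -> Tuple[float, float]:
    """Toy stand-in for a real test run; the numbers are invented."""
    response_time = 0.5 + (users / 100.0) ** 2  # slows down as load grows
    error_rate = 0.0 if users < 200 else (users - 200) / 1000.0
    return response_time, error_rate


print(find_breaking_points(fake_system))
```

A limit that is never reached before `max_users` stays as `None`, which is also useful to know: the system did not break that way within the load you tried.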

Ok, enough of this one, let's move on with more scenarios!

Endurance or soak test scenario

5--The following one also goes by many names, such as endurance, soak, degradation, and others. I have even heard it being called a stamina scenario.

Crazy disconnect on names, right? I think all of us performers should get together and agree on an official name for each. Maybe the problem is that some scenarios can become another one with small tweaks or could be a sub-family of the previous one.

Focus, FOCUS, Leandro! You are explaining endurance scenarios.

The main characteristic of the endurance scenario is that it is executed for a considerably long period of time. The goal is to check that the system will survive extended periods of use without needing a break, a cleanup, a reboot, or a pat on the back.

There there... you can do it man... keep going!

The durations for these scenarios vary as well. I think the shortest is 4 hours, more or less, but the general standard is around 8 hours.

Although I have seen some last 24 hours, 48 hours! FIVE days! And the longest I have seen is a full week.

The main things that we will try to clear up here are degradation over time of response times, hardware metrics, storage space, and most commonly, memory management.

The general recommendation for this one is to use the average load scenario and just extend its duration to the number of hours I mentioned. You may also want to run the super average scenario for an extended period of time, which is a valid test only if you have already run that super average scenario on its own.
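
One way to spot that kind of degradation is to sample a metric at regular intervals during the run and look at its trend. The Python sketch below fits a plain least-squares line and reports the change per hour. Treat it as an assumption-heavy illustration: the function names, the tolerance, and the sample values are all invented, and a real analysis would use the metrics your own monitoring collects.

```python
from typing import Sequence


def trend_per_hour(hours: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope: how much the metric changes per hour of test."""
    n = len(hours)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired samples")
    mean_h = sum(hours) / n
    mean_v = sum(values) / n
    covariance = sum((h - mean_h) * (v - mean_v) for h, v in zip(hours, values))
    variance = sum((h - mean_h) ** 2 for h in hours)
    if variance == 0:
        raise ValueError("samples must span more than one point in time")
    return covariance / variance


def is_degrading(
    hours: Sequence[float], values: Sequence[float], tolerance_per_hour: float
) -> bool:
    """True when the metric grows faster than the tolerance we chose."""
    return trend_per_hour(hours, values) > tolerance_per_hour


# Invented samples from an 8-hour run, taken once per hour.
hours = [0, 1, 2, 3, 4, 5, 6, 7, 8]
memory_mb = [512, 530, 548, 566, 584, 602, 620, 638, 656]  # keeps climbing
response_ms = [210, 205, 212, 208, 211, 207, 209, 210, 208]  # stays flat

print("memory degrading:", is_degrading(hours, memory_mb, tolerance_per_hour=5.0))
print("response degrading:", is_degrading(hours, response_ms, tolerance_per_hour=1.0))
```

A steadily climbing memory line under a constant load is the classic hint of a leak, which is why longer runs make this kind of trend easier to see.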

There are lots of mixes in the scenarios, right? Well, there are even more... let's keep moving.

Spike test scenario

6--The next one does not have that many names, as most of the time I have seen it named the same... spike. I have heard a few other weird ones, such as sudden death. But the most common name is "spike".

I kinda liked the sudden death name, or even Mexican standoff... got it? :P But let's keep it as spike which is a bit more official.

This one is not as common, as it is used only when you expect a sudden increase in load in a tiny amount of time.

This increase can be overwhelmingly greater than the average or super average scenario, and the jump from zero to full load can happen in just a few seconds. These types of situations are common for ticket sales sites, popular conventions, hot sales, and even yearly events like Black Friday.

Get ready, cause, to explain this one, here comes another example. Imagine that Ticketmaster revives the Beatles for just one concert. What do you think will happen to their servers the instant they open the ticket sales for that concert? My guess is that almost the entire world would try to buy their ticket in that instant, right?

So for this test, the main difference is that the ramp-up period will be zero or just a very short period of time, going up like crazy to an estimated high number, and staying there just for a bit, doing probably only the purchase process.
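
The shape of that load can be sketched in a few lines of Python. This is just an illustration of the profile described above, with hypothetical names and numbers: `spike_users`, the peak of 5,000 users, and the durations are assumptions for the example, not values from any real event or tool.

```python
def spike_users(
    t: float, peak_users: int, ramp_seconds: float, hold_seconds: float
) -> int:
    """Target number of virtual users t seconds into a spike test."""
    if t < 0:
        return 0
    if ramp_seconds > 0 and t < ramp_seconds:
        # Very short ramp: climb linearly from zero to the peak.
        return int(peak_users * t / ramp_seconds)
    if t < ramp_seconds + hold_seconds:
        # Stay at the peak just for a bit.
        return peak_users
    return 0  # spike is over


# Zero ramp-up: everybody arrives the instant sales open.
print(spike_users(0, peak_users=5000, ramp_seconds=0, hold_seconds=60))

# Ten-second ramp-up: print the target every 5 seconds.
for second in range(0, 80, 5):
    print(second, spike_users(second, peak_users=5000, ramp_seconds=10, hold_seconds=60))
```

Compare this with the breakpoint scenario, where the ramp is slow and never told to stop. Here the ramp is almost a vertical wall and the plateau is short.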

On this one as well, the mix of things that we will include will be very different, as most of the time we may want to check only that the key processes survive. We may not care much about the other processes. Just sales processes for the revived Beatles concert. (Sorry Nickelback, we do not care much about your concerts... or at least not at that moment.)

So, only key elements, or just the single most important ones. We do not include things such as the ticket printing process, reviewing your account, and others. We can review those later. For now, we want to make sure that we survive the avalanche of receiving payments and selling.

Maybe the ticketing process is also important to survive. Whatever you define as REALLY important given the event. You may even want to decommission anything else or make it super light to not obstruct or slow down the Beatles!
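
As a rough illustration of a key-processes-only mix, the Python sketch below picks what each virtual user does from a short weighted list. The process names and weights are hypothetical, chosen only to match the Beatles example. The point is simply that anything not in the list never runs during the spike.

```python
import random
from typing import Dict, List, Optional

# Hypothetical key processes and weights for the on-sale moment.
# Ticket printing and account review are left out on purpose.
SPIKE_MIX: Dict[str, int] = {
    "search_concert": 20,
    "select_seats": 30,
    "pay_for_tickets": 50,
}


def pick_processes(
    mix: Dict[str, int], count: int, seed: Optional[int] = None
) -> List[str]:
    """Choose a process for each of `count` virtual users, by weight."""
    rng = random.Random(seed)
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=count)


picks = pick_processes(SPIKE_MIX, count=1000, seed=1)
for name in SPIKE_MIX:
    print(name, picks.count(name))
```

If you decide that another process really matters for the event, it gets a line and a weight in the mix. Everything else stays out.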

Phew... alright, even splitting this topic into two episodes, we have gone on for too long, and believe it or not, we have barely scratched the surface.

So amigos, as you can see there are so many types of scenarios that you can do as a load test. Each of them clears up a load risk area. There is no silver bullet scenario that covers every performance and load risk, so plan and execute accordingly.

And last, I will repeat my recommendation from the last episode. DO NOT TRY TO MIX THEM. Do not mix the scenarios. Each one has its own purpose and mixing them would be like searching for needles in multiple haystacks at the same time.

I hope all of them are clear and that you can now explain them. For now, we close up this topic. Thank you, my Flooder amigos, for making it this far!

And thank you very much, Nicole!
See you next time!
Adios, amigos!
