Preparing Your E-Commerce Site for a Holiday Flash Sale

How to run load tests to make sure your site stays up

Black Friday, Cyber Monday and other flash sale events have become significant sales opportunities for e-commerce sites. Newer online flash sale events such as Click Frenzy that target consumers by luring them with the promise of huge bargains have also become exceedingly popular.

When Click Frenzy started in 2014, it resulted in $189 million profit and over 1 million visitors, and these numbers continue to grow every year. Unfortunately, many of the businesses involved, and the Click Frenzy site itself, are generally woefully underprepared for the number of visitors wanting to snap up a bargain.

Any downtime for a participating online store during these events will significantly impact sales, generate negative publicity for the brand and effectively negate any financial expenditure and time spent in preparing for the event.

Why Websites Fail Under the Pressure of Flash Sales

A significant reason that websites and online businesses say "we are ready" but then fail miserably when the event goes live is that their testing did not simulate the user behaviour typical of these sales events. Most of these online events are very similar to a Black Friday sale at a bricks-and-mortar store in that they have the following characteristics:

  1. Customers clamoring at the front door before the sale opening time.
  2. Once the doors open, customers stream into the store all at once.

The same scenario for an online sales event has almost identical characteristics:

  1. Customers constantly hitting F5 Refresh on a site's dedicated landing page before the sale opening time.
  2. Once the sales site opens, customers stream into the online store all at once.

This exact behaviour can be reproduced using a load testing scenario called a 'spike test', and we'll show you how to create a simple one using the Flood Element browser-level load testing tool.

Setting Up Your Spike Test Script

The script itself is fairly straightforward and aims to simulate the scenario described above. The entire sample script can be downloaded here and then run using Tricentis Flood.

Creating the Pre-Sale Refresh Barrage

The waitForSaleTime function's sole purpose is to refresh the page until a specific time is reached. Every user in the test runs this function until the specified sale time arrives. This behaviour is very similar to what a real user would do before an actual sales event.
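
The original listing is not reproduced here, so the following is only a minimal sketch of how such a refresh loop could look. The RefreshableBrowser interface is a stand-in that assumes the Flood Element browser object offers compatible visit() and wait() methods; the parameter names and the injectable clock are our own additions for illustration.

```typescript
// Stand-in for the browser object passed to a step. It is assumed that
// Flood Element's browser exposes compatible visit() and wait() methods.
interface RefreshableBrowser {
  visit(url: string): Promise<unknown>;
  wait(seconds: number): Promise<unknown>;
}

// Reload the landing page until the sale time is reached and return the
// number of refreshes performed. `now` is injectable (an assumption of this
// sketch) so the loop can be exercised without waiting in real time.
async function waitForSaleTime(
  browser: RefreshableBrowser,
  landingPageUrl: string,
  saleTimeMs: number,
  refreshIntervalSeconds: number = 1,
  now: () => number = Date.now,
): Promise<number> {
  let refreshCount = 0;
  while (now() < saleTimeMs) {
    await browser.visit(landingPageUrl); // the equivalent of pressing F5
    refreshCount += 1;
    await browser.wait(refreshIntervalSeconds); // pause before refreshing again
  }
  return refreshCount;
}
```

Note that saleTimeMs is in milliseconds, so a Unix timestamp in seconds needs to be multiplied by 1000 before it is passed in.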

Simulating users consistently refreshing the browser before the Flash Sale begins

Most sales events will have a temporary landing page such as the following which is made available to the customer before the actual sale site goes live.

As the timer counts down to 00:00, users will generally refresh the page fairly consistently so they can be among the first to view the sale items and make a purchase once the sale goes live.

This continual refreshing by existing users, combined with new users navigating to the landing page, tends to place a very heavy load on a website's infrastructure even before the sale starts. It is therefore important to test this phase as well as the resulting 'spike' of traffic hitting the actual sales page.

There are many real-life examples where a sales event website has gone offline due to this exact pre-sale traffic behaviour, before the actual sale even started.

Aligning our Test to our Business Requirements

In order to run this script we need the following information:

1. A sale event time. This is the exact time that the sale event site goes live. The date and time need to be converted to a Unix timestamp, for example using a site such as epochconverter.com.

2. A sales landing page URL. This is the URL where users will 'wait' until the sale time is reached.

3. The actual sales site URL. This is a URL where the sale items on offer are presented to the user for purchase.
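
As an alternative to an online converter, the three inputs above can be captured in a small configuration object and the timestamp computed in code. This is a sketch only: the toEpochSeconds helper, the SaleConfig shape, the example date and the shop.example.com URLs are all hypothetical and not taken from the sample script.

```typescript
// Convert an ISO 8601 date-time string into a Unix timestamp in seconds.
// Including an explicit offset (here "Z" for UTC) avoids any ambiguity
// about the time zone of the machine running the test.
function toEpochSeconds(isoDateTime: string): number {
  const ms = Date.parse(isoDateTime);
  if (isNaN(ms)) {
    throw new Error('Unparseable date: ' + isoDateTime);
  }
  return Math.floor(ms / 1000);
}

// Hypothetical container for the three pieces of information listed above.
interface SaleConfig {
  saleTimeEpochSeconds: number; // 1. the sale event time
  landingPageUrl: string;       // 2. where users wait and refresh
  salePageUrl: string;          // 3. where the sale items are presented
}

// Example values only; replace them with your own event details.
const saleConfig: SaleConfig = {
  saleTimeEpochSeconds: toEpochSeconds('2025-11-28T00:00:00Z'),
  landingPageUrl: 'https://shop.example.com/',
  salePageUrl: 'https://shop.example.com/sale',
};
```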

We have three Flood Element steps that form the complete end-to-end scenario, and these exact steps are reported on the Tricentis Flood dashboard where we will be running this test.

The first step is the initial navigation to the sales landing page (in this case just the front page of our online sales store). The second step simply executes the __waitForSale function, which does the bulk of the script's work. The third and final step is the navigation to the actual sales site (which in this case is just a subpage of our online sales store).
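
The three phases described above can be sketched as a plain async function. This is not the Flood Element script itself: in a real script each phase would be registered as a step, whereas here a small local timing helper stands in for that so the flow can be read and run on its own. The ScenarioBrowser interface and the first step label are assumptions; the other two labels follow the names used in this article.

```typescript
// Stand-in for the browser object; Flood Element's browser is assumed to
// provide compatible visit() and wait() methods.
interface ScenarioBrowser {
  visit(url: string): Promise<unknown>;
  wait(seconds: number): Promise<unknown>;
}

interface StepTiming {
  label: string;
  durationMs: number;
}

// Run the three phases in order and record how long each one took.
async function runFlashSaleScenario(
  browser: ScenarioBrowser,
  landingPageUrl: string,
  salePageUrl: string,
  saleTimeMs: number,
  now: () => number = Date.now,
): Promise<StepTiming[]> {
  const timings: StepTiming[] = [];
  const timed = async (label: string, action: () => Promise<void>): Promise<void> => {
    const start = now();
    await action();
    timings.push({ label: label, durationMs: now() - start });
  };

  // Step 1: initial navigation to the sales landing page.
  await timed('1. Navigate to landing page', async () => {
    await browser.visit(landingPageUrl);
  });

  // Step 2: keep refreshing the landing page until the sale opens.
  await timed('__waitForSale', async () => {
    while (now() < saleTimeMs) {
      await browser.visit(landingPageUrl);
      await browser.wait(1);
    }
  });

  // Step 3: every user heads for the sale page as soon as the wait ends.
  await timed('2. Navigate to sale page', async () => {
    await browser.visit(salePageUrl);
  });

  return timings;
}
```

Because every virtual user leaves the second phase at the same wall-clock time, the third phase is where the spike occurs.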

Evaluating the Application with Real World Load

Now we want to run the scenario against our site to see how it handles a flash sale. We have chosen a sale time a few minutes in the future so we don't have to wait too long to see how things go, while still leaving enough time for all of our users to ramp up and be waiting on the landing page together.

As an aside, because Flood Element runs real browser-level users, we can see a real-time view of user traffic using something like Google Analytics. Browser-level users (as opposed to protocol-level users in tools like JMeter and Gatling) execute the client-side JavaScript that these services often require you to embed on your site. This is really cool and very helpful for confirming how many users are actively on your site at any time.

Google Analytics showing Flood Element users as real users currently on the site

Once the test starts, we can see the initial behaviour of users ramping up, then waiting on the landing page and refreshing it every second, as seen below:

You can see that all 400 users completed ramp-up and were able to access the initial landing page with an average time of ~6.5 seconds, which in this case is measured up to the point where the page is fully loaded in the browser.

There is a period of time where all users are refreshing and waiting for the sale to begin; this can be seen in the period marked __waitForSale duration. Once this period has completed, the sale begins and all users stampede at once to the specific sale page.

Selecting and drilling down into the transaction label '2. Navigate to sale page' shows the spike test in action. You can see the spike in response times (blue line) as every user tries to navigate to the same page at once, pushing the maximum response time of the page to over 12 seconds.

Making Sense of Your Test Results to Improve Performance

We can easily see that the stampede of users caused some issues for our site, with a number of failed transactions (red line) observed at the same time as the sale went live.

This brings both good and bad news for us. The bad news is that our site cannot support this number of users at once. The good news is that we can learn from these errors and make performance enhancements to the site so that it can successfully support this level of user concurrency.

Because we take screenshots for our main script steps, we can actually see the exact problem and error presented to one of our poor users. We can also use server-level monitoring to see how much of our website's infrastructure resources were being used at the time of the spike. However, just from the captured screenshots we can see a potential bottleneck related to our site's database infrastructure.

Conclusion: Flash Sales Require a Spike Test

So what have we learnt from testing this type of scenario?

Let's start with the main difference between a spike scenario and a normal 'steady-state' scenario. A steady-state scenario is not representative of this type of sales event. If we had simply run 400 concurrent users all transacting against the sale page for an extended period, we probably wouldn't have observed the errors raised above and the test would probably have been considered a success.
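
To make the difference concrete, the sketch below compares the peak number of sale-page requests arriving in any one second under the two patterns. The 400 users come from our test; the 60-second cycle used for the steady-state case and the evenly staggered arrivals are illustrative assumptions, as are all of the function names.

```typescript
// Largest number of arrivals that fall within any single one-second bucket.
function peakArrivalsPerSecond(arrivalTimesMs: number[]): number {
  const buckets: { [second: string]: number } = {};
  let peak = 0;
  for (const t of arrivalTimesMs) {
    const key = String(Math.floor(t / 1000));
    const count = (buckets[key] || 0) + 1;
    buckets[key] = count;
    if (count > peak) {
      peak = count;
    }
  }
  return peak;
}

// Steady state (assumed model): each user requests the page once per cycle,
// with the users evenly staggered across that cycle.
function steadyStateArrivals(users: number, cycleSeconds: number): number[] {
  const arrivals: number[] = [];
  for (let i = 0; i < users; i++) {
    arrivals.push((i * cycleSeconds * 1000) / users);
  }
  return arrivals;
}

// Spike: every user requests the page at the moment the sale opens.
function spikeArrivals(users: number, saleTimeMs: number): number[] {
  const arrivals: number[] = [];
  for (let i = 0; i < users; i++) {
    arrivals.push(saleTimeMs);
  }
  return arrivals;
}

const steadyPeak = peakArrivalsPerSecond(steadyStateArrivals(400, 60));
const spikePeak = peakArrivalsPerSecond(spikeArrivals(400, 0));
```

Under these assumptions the staggered schedule peaks at 7 requests in any one second, while the spike delivers all 400 in the same second, even though both tests would be described as "400 concurrent users".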

A spike scenario also has a very different impact on your target infrastructure in terms of resource utilization, and running one should be mandatory for any online business preparing for a major sales event.
