Set up A/B tests

What Is A/B Testing?

A/B testing is a method to determine whether specific configuration changes are impactful. By creating two revisions, “A” and “B”, and running each revision on 50% of the site traffic, publishers can measure which revision performs better.

For example, if you want to test whether adding an additional bidder improves monetization, you can set up an A/B test that runs 50% of the traffic with the new bidder and 50% without. Comparing the eCPM of each version provides tangible data for making such decisions. Results of the A/B test can be evaluated in GAM using key-values.
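
As a quick, illustrative sketch (the numbers below are made up, not real campaign data), the eCPM comparison is simply revenue per thousand impressions for each revision:

```javascript
// Illustrative only: made-up numbers showing how the A/B comparison works.
// eCPM = (revenue / impressions) * 1000
const revisionA = { impressions: 500000, revenue: 650.0 }; // "A" - baseline
const revisionB = { impressions: 500000, revenue: 710.0 }; // "B" - new bidder enabled

const ecpm = ({ revenue, impressions }) => (revenue / impressions) * 1000;

console.log('A eCPM:', ecpm(revisionA).toFixed(2)); // 1.30
console.log('B eCPM:', ecpm(revisionB).toFixed(2)); // 1.42 -> "B" performed better
```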

Typical uses of A/B testing include measuring the impact of:

  • Adding / removing ad networks (bidders)
  • Adding new ad units / modifying ad unit sizes
  • Changing ad refresh rates
  • Eager vs. lazy loading

Step 1 - Create revisions to be compared

First, create two or more revisions whose performance you want to compare.

  1. Determine which revision will be the test BASELINE; usually this is the revision that's currently in production.
  2. Make a copy of the BASELINE revision. Edit the copy to enable/disable the behavior or feature that you want to test.
  3. Save this “modified” revision and add a short note about what was changed.

Now you should have two versions saved:

  • A - the “baseline” revision (that is, the original version)
  • B - the “modified” revision (which includes the changes you want to test)

Step 2 - Create A/B Test Revision

Next, create an “A/B Test Revision”. This is a special type of revision that tells HTL BID which configurations are being compared in the A/B test.

  1. Open the latest revision
  2. Click "AB Testing" in "Tech Config" > Services
  3. Go to AB Testing in the sidebar menu
  4. From the drop-down menu, select at least 2 revisions to be tested.
    1. One revision should be “A”, the “baseline” revision from Step 1
    2. One revision should be “B”, the “copy” revision from Step 1
  5. After selecting all the revisions to be tested, assign the "Custom Allocation" (the weighted ratio at which the versions are served). This ratio determines what percentage of the daily traffic each revision receives. For example, with a 50/50 allocation, the CDN will serve each revision 50% of the time.
💡
A/B tests are most easily understood if each revision is weighted equally. If you choose unequal weights, additional analysis may be required to normalize the data.
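
For intuition, a weighted allocation behaves like a weighted random pick on each page load. The sketch below is purely illustrative and is not HTL BID's or the CDN's actual selection logic:

```javascript
// Illustrative sketch of how a Custom Allocation behaves; not the actual
// HTL BID / CDN selection code.
const allocation = [
  { revision: 'A (baseline)', weight: 50 },
  { revision: 'B (modified)', weight: 50 },
];

function pickRevision(allocation) {
  const total = allocation.reduce((sum, r) => sum + r.weight, 0);
  let roll = Math.random() * total;
  for (const r of allocation) {
    if (roll < r.weight) return r.revision;
    roll -= r.weight;
  }
  return allocation[allocation.length - 1].revision; // guard against float rounding
}

// Over many page loads, each revision is served in proportion to its weight.
console.log(pickRevision(allocation));
```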

Step 3 - Create the key-value in GAM

  1. Create the htlbidid key-value in GAM that will be used for reporting.
  2. Set the key-value settings as:
    • Dynamic
    • Include values in reporting
💡
The key-value must be configured in GAM before the A/B Test starts running in order to access the results. Otherwise, the test will run but no data will be reported for it.

Step 4 - Deploy and QA

Deploy the “A/B Test” revision.

Once deployed, you can verify that the chosen revisions are randomly loaded according to the A/B Test weights.

Verify that revisions are switching by opening the Chrome Dev Tools on the site and typing htlbid.versionId in the Console. This will print the Version ID that loaded on the page. Reload and repeat a few times; you should see that different versions load.

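For example, run the following in the Console. The first line comes straight from the step above; the googletag check assumes the wrapper exposes the A/B value as page-level GPT targeting (if your setup differs, only the first line applies):

```javascript
// Run in the Chrome DevTools Console on a page serving the A/B Test revision.

// Prints the revision ID that loaded on this page view.
console.log('Loaded revision:', htlbid.versionId);

// Assumption: the wrapper sets the A/B value as page-level GPT targeting.
// If so, this shows the htlbidid value that GAM will report on.
console.log('htlbidid targeting:', googletag.pubads().getTargeting('htlbidid'));

// Reload a few times; the logged revision should alternate roughly
// according to the Custom Allocation weights.
```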

Step 5 - Review A/B Test Results

Use GAM reporting to compare A/B Test results. The simplest report is the following:

  1. Filter by:
    • Key-Value CONTAINS htlbidid
  2. Date range
  3. Dimensions:
    • Date
    • Order
    • Key-Value
  4. Metrics:
    • Total Impressions
    • Total CPM and CPC revenue
    • Total average eCPM
  5. Create a Pivot Table to look at the effects of different revisions:
  6. Compare the eCPM to determine which version performed better
💡
If your “test weights” are not 50/50, you may need to do some extra math to normalize metrics such as impressions and revenue. For example, if the test weights are A=20% and B=80% instead of 50%/50%, then you would expect “B” to have 4x as many impressions as “A” (see the sketch below). Depending on what you are testing, you may want to add extra metrics and dimensions to the report, such as Device, Ad Unit, Creative Size, or Viewability. The reporting time period should match the time period when the A/B Test revision was live. Best practice is to run each test for several days to ensure clean reporting data.
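
As a rough sketch with made-up numbers, normalization just divides volume metrics by each revision's weight; eCPM itself is already a per-thousand-impressions rate and needs no adjustment:

```javascript
// Illustrative numbers only. Test weights: A = 20%, B = 80%.
const results = {
  A: { weight: 0.2, impressions: 100000, revenue: 130.0 },
  B: { weight: 0.8, impressions: 400000, revenue: 560.0 },
};

for (const [name, r] of Object.entries(results)) {
  const ecpm = (r.revenue / r.impressions) * 1000;                  // already comparable
  const normalizedImpressions = Math.round(r.impressions / r.weight); // "as if" 100% of traffic
  const normalizedRevenue = r.revenue / r.weight;
  console.log(name, {
    ecpm: ecpm.toFixed(2),                           // A: 1.30, B: 1.40
    normalizedImpressions,                           // A: 500000, B: 500000
    normalizedRevenue: normalizedRevenue.toFixed(2), // A: 650.00, B: 700.00
  });
}
```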

Step 6 - Ending the test

You can end the A/B test by deploying a different revision.

Alternatively, edit the A/B test revision itself:

  • Navigate to the Setup Options screen
  • Un-check the “AB Testing” checkbox
  • Continue to edit and deploy the revision normally

Troubleshooting

Reports are empty

If you are not getting any data with the htlbidid filter, it could be that the key-value does not have reporting enabled. In Google Ad Manager, under Inventory > Key-values, select the “htlbidid” key and ensure the “Include values in reporting” setting is selected.

Wrong key-value

By default, the key-value for A/B testing is named htlbidid. If your configuration has modified the “AMS Global Var” setting in the user interface, your key-value might be named something different (this is rare).

Best Practices

Only test one thing at a time

Data analysis is more difficult if you make multiple configuration changes in the “B” version. For example, if you enable a new bidder, change a timeout, and add a new identity module all at once, it becomes difficult to separate the effect of each change. If you want to test multiple changes, the best practice is to create multiple revisions (“A”, “B”, “C”, “D”), each testing a single change.

Designate A/B tests using the “notes” field

A good “note” for an A/B test includes the ID numbers of the revisions being compared. For example, a test comparing the performance of versions 57 and 58 might say:

  • “A/B Test - 57, 58”