Flexible Attribution · Digital Fuel Capital

08 Dec 2020, 00:00

Challenges

Attribution is essential to understanding the variable contribution margin of a marketing channel, campaign, product category, or any other entity. However, we often have to choose either a single method with known flaws or a combination of methods that conflict with one another. Let’s quickly review a few problems we encounter:

First-click attribution ignores the effect of closing channels, and last-click attribution ignores the effect of top-of-funnel channels
Platform-specific attribution (e.g. Facebook’s) may be able to track a customer better, but each platform claims conversions with a different level of aggression. Platforms may also end up claiming the same customers, therefore: sum of platform conversions > actual total conversions
A uniform approach of assigning each order to a channel through GA or internal tracking may penalize certain channels, particularly when that channel encourages browser switching. For example, a user will often see an ad on the FB app, click on it through the FB browser, but then return via their device’s native browser such as Safari. This user “never clicked” FB because the cookie isn’t there!
Offline channels can’t be tracked very well and hence inherit attribution in some parallel universe
What really drove the customer? Who cares if she clicked on an ad when she was referred by a friend, or inversely if she visited via direct but only knew about the website from a TV ad?

Simple Solution

Ultimately, you’d like to combine a user’s full click path with self-reported data on how the customer believes they found your site. Theoretically, you could have within your data warehouse a table with click records assigned to a transaction ID as well as the user’s How Did You Hear About Us (HDYHAU) response. Couldn’t you then have a rule to merge that data?

I’ve done this in the past with some success. Given both data sets, you choose how to weight them by creating a single mapping file of [click, HDYHAU] to attributed channel. The marketer controls this file, it gets read into your data warehouse, and its weights can be modified at any time. For channels that are unlikely to show up in click tracking - certainly TV and word of mouth, but also FB - give much more weight to the HDYHAU response when it points to that channel.

You would have the following for every combination of click channel and HDYHAU, assigning each combination to a fraction of a channel. So, if I click paid search and answer TV, I would be 75% TV and 25% paid search:

##   click_channel        HDYHAU attributed_channel weight
## 1   Paid Search            TV        Paid Search   0.25
## 2   Paid Search            TV                 TV   0.75
## 3        Direct Friend/Family      Word of Mouth      1
## 4           SEO        Social             Social    0.6
## 5           SEO        Social                SEO    0.4
## 6           ...           ...                ...    ...

To be clear, if you had 10 defined click channels, 8 HDYHAU responses, and 9 possible attributed channels (just making these numbers up), the fully expanded table would have 10*8*9 = 720 rows, most of them with a weight of 0, and the weights within any given [click, HDYHAU] combination would sum to 1.
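As a rough illustration of how such a mapping could be applied, here is a minimal Python sketch. The `MAPPING` dictionary mirrors the example rows above, while the order records, field names, and function names are hypothetical and only there for the example. In practice this would more likely be a SQL join in the warehouse.

```python
from collections import defaultdict

# Hypothetical in-memory version of the mapping file:
# (click_channel, hdyhau) -> list of (attributed_channel, weight).
# Rows with a weight of 0 are simply left out.
MAPPING = {
    ("Paid Search", "TV"): [("Paid Search", 0.25), ("TV", 0.75)],
    ("Direct", "Friend/Family"): [("Word of Mouth", 1.0)],
    ("SEO", "Social"): [("Social", 0.6), ("SEO", 0.4)],
}


def validate(mapping, tol=1e-9):
    """Raise if the weights for any combination do not sum to 1."""
    for key, splits in mapping.items():
        total = sum(weight for _, weight in splits)
        if abs(total - 1.0) > tol:
            raise ValueError(f"weights for {key} sum to {total}, not 1")


def attribute_orders(orders, mapping):
    """Return fractional order counts per attributed channel."""
    totals = defaultdict(float)
    for order in orders:
        key = (order["click_channel"], order["hdyhau"])
        for channel, weight in mapping[key]:
            totals[channel] += weight
    return dict(totals)


# Made-up orders, one per mapping combination shown above.
orders = [
    {"transaction_id": "t1", "click_channel": "Paid Search", "hdyhau": "TV"},
    {"transaction_id": "t2", "click_channel": "Direct", "hdyhau": "Friend/Family"},
    {"transaction_id": "t3", "click_channel": "SEO", "hdyhau": "Social"},
]

validate(MAPPING)
attributed = attribute_orders(orders, MAPPING)
```

Because every combination's weights sum to 1, the attributed totals always add back up to the number of orders, which is the property that platform-reported conversions lack.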

Automation

We’ll tackle this in a future post of the Data Warehousing Series, but this can all be a simple part of an ETL process. Using the AWS ecosystem as an example:

Set up a GA API call through AWS Lambda to pull transaction ID and channel and dump the data into S3 (will cost you a couple cents per month)
Use AWS Glue to scan S3, do any calculations, and reformat the data (again, a few cents)
Run a one-line script in Redshift or your database tool to copy the table from S3 (data transfer is free)
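To make the first and last steps a little more concrete, the sketch below shows what the file written to S3 and the one-line Redshift script might look like. It deliberately leaves out the GA API call and the S3 upload. The table name, bucket, key layout, column names, and IAM role are all placeholders assumed for illustration, not values from any real setup.

```python
import csv
import io


def to_csv(rows):
    """Serialize (transaction_id, click_channel) rows to CSV text with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["transaction_id", "click_channel"])
    writer.writerows(rows)
    return buf.getvalue()


def s3_key(run_date):
    """Hypothetical date-partitioned key so each daily pull lands in its own folder."""
    return f"ga/clicks/dt={run_date}/clicks.csv"


def copy_statement(table, bucket, key, iam_role):
    """Build a Redshift COPY statement that loads the CSV and skips its header row."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV IGNOREHEADER 1;"
    )


# Made-up rows standing in for what the GA pull would return.
csv_text = to_csv([("t1", "Paid Search"), ("t2", "Direct")])
key = s3_key("2020-12-08")
sql = copy_statement(
    table="staging.ga_clicks",           # placeholder table name
    bucket="example-attribution-bucket",  # placeholder bucket
    key=key,
    iam_role="arn:aws:iam::123456789012:role/example-redshift-role",  # placeholder role
)
```

Inside the Lambda function, `csv_text` would be written to S3 under `key`, and `sql` is the statement you would schedule in Redshift.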

Once in your data warehouse, you can write logic in your BI tool to automate the views you need to see.

…again, these are things we can help with.

Enhancements

The above implies a single click channel. This need not be the case - companies in the portfolio are already beginning to use Google’s Multi-Channel Funnels Reporting API in the aforementioned automation step #1 to pull the full click path of transactions. The mapping logic then just follows another step of chaining to apply the attribution methodology to each click in the path (all written in SQL you don’t need to worry about).
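One possible way to chain the mapping over a click path is sketched below. The post does not specify how credit is split across the clicks in a path, so this example assumes an even split per click and falls back to the click channel itself when the mapping has no rule. Both choices, along with the function name and the sample weights, are assumptions for illustration.

```python
from collections import defaultdict


def attribute_path(click_path, hdyhau, mapping):
    """Split one order across its click path, then apply the mapping per click.

    Assumption: every click in the path gets an equal share of the order.
    """
    if not click_path:
        raise ValueError("click_path must contain at least one click")
    share = 1.0 / len(click_path)
    totals = defaultdict(float)
    for click in click_path:
        # Assumption: with no mapping rule, the click channel keeps its full share.
        splits = mapping.get((click, hdyhau), [(click, 1.0)])
        for channel, weight in splits:
            totals[channel] += share * weight
    return dict(totals)


# Hypothetical weights; only the first rule appears in the table above.
mapping = {
    ("Paid Search", "TV"): [("Paid Search", 0.25), ("TV", 0.75)],
    ("SEO", "TV"): [("SEO", 0.5), ("TV", 0.5)],
}

result = attribute_path(["Paid Search", "SEO"], "TV", mapping)
```

Here the order ends up 62.5% TV, 25% SEO, and 12.5% paid search, and the shares still sum to one order. A position-weighted split (more credit to the first or last click) would only change how `share` is computed.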

Nuances

You can have null responses to HDYHAU. Just have some SQL written to calculate the distribution of non-null HDYHAU responses for that click channel that week, and allocate the null-response orders across HDYHAU answers in the same proportions. Don’t worry about this; it isn’t that complicated, and we can help.
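The null-handling logic could look roughly like the following Python sketch, which stands in for the SQL. The field names (`week`, `click_channel`, `hdyhau`) and the sample data are made up. Orders with no answered peers in the same week and click channel are left unallocated here, which is one possible choice rather than a recommendation.

```python
from collections import Counter, defaultdict


def response_distribution(orders):
    """Share of each non-null HDYHAU answer, per (week, click_channel)."""
    counts = defaultdict(Counter)
    for order in orders:
        if order["hdyhau"] is not None:
            counts[(order["week"], order["click_channel"])][order["hdyhau"]] += 1
    return {
        key: {resp: n / sum(counter.values()) for resp, n in counter.items()}
        for key, counter in counts.items()
    }


def impute(order, dist):
    """Return [(hdyhau, fraction)] for one order, spreading nulls by the distribution."""
    if order["hdyhau"] is not None:
        return [(order["hdyhau"], 1.0)]
    # Empty list if nobody in that week and click channel answered the question.
    return list(dist.get((order["week"], order["click_channel"]), {}).items())


def make(hdyhau):
    """Build a made-up Paid Search order for a single example week."""
    return {"week": "2020-W49", "click_channel": "Paid Search", "hdyhau": hdyhau}


# Three TV answers, one Social answer, and one null response.
orders = [make("TV"), make("TV"), make("TV"), make("Social"), make(None)]
dist = response_distribution(orders)
imputed = dict(impute(orders[-1], dist))
```

The null order is split 75% TV and 25% Social, after which each fraction can be run through the [click, HDYHAU] mapping like any other response.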

But I need to optimize campaigns, not just macro channels

Yes, this framework can be used exactly for that. Let’s say that Marketing Platform A reports 500 conversions but this internal attribution assigns it only 400. All you need to do is scale each campaign within that platform by the same ratio, 400/500 = 0.8:

##   campaign reported total_reported total_attributed adjusted
## 1       A1      100            500              400       80
## 2       A2      350            500              400      280
## 3       A3       50            500              400       40

You can do this ad hoc or, since you’re at it, have additional API scripts in your ETL process to pull down these conversion numbers so it’s all automated. We’ve done this and can help you with it.
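The scaling itself is a one-line calculation. The sketch below reproduces the table above in Python, and the function name is just illustrative.

```python
def scale_campaigns(reported, total_attributed):
    """Scale platform-reported campaign conversions to the internally attributed total."""
    total_reported = sum(reported.values())
    if total_reported == 0:
        raise ValueError("platform reported no conversions to scale")
    return {
        campaign: count * total_attributed / total_reported
        for campaign, count in reported.items()
    }


# Reported conversions per campaign and the attributed total from the example.
adjusted = scale_campaigns({"A1": 100, "A2": 350, "A3": 50}, total_attributed=400)
```

This assumes the platform over-claims evenly across its campaigns. If some campaigns are known to over-claim more than others, a single ratio will not capture that.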

HDYHAU Question

This probably deserves its own topic at another point, but a few recommendations:

Randomize the ordering of the choices
If you want, you can add an open text subquestion, but make sure you capture that primary categorical response
Some people are skeptical of the accuracy of the user’s response. Throw in a dummy response (something that doesn’t exist, such as billboard ads) to measure the % of responses that are junk
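As a small sketch of the dummy-response check, the code below measures the share of answered responses that picked the dummy option. The `"Billboard"` label, the sample data, and the function names are hypothetical. The second function goes one step beyond the recommendation above: it assumes junk respondents pick uniformly at random among all choices, so the observed dummy share understates total junk by a factor of the number of choices.

```python
def dummy_share(responses, dummy="Billboard"):
    """Fraction of answered (non-null) responses that chose the dummy option."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return 0.0
    return sum(r == dummy for r in answered) / len(answered)


def estimated_junk_share(share, n_choices):
    """Rough junk estimate, ASSUMING junk answers are uniform over all choices."""
    return min(1.0, share * n_choices)


# Made-up data: 49 real answers, 1 dummy answer, and 5 skipped questions.
responses = ["TV"] * 49 + ["Billboard"] + [None] * 5
share = dummy_share(responses)
junk = estimated_junk_share(share, n_choices=10)
```

With 1 dummy pick out of 50 answered responses and 10 choices, the dummy share is 2% and the rough junk estimate is 20%. Randomizing the choice order, as recommended above, is what makes the uniform assumption more plausible.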

Wrap up

This was a quick intro to a way to control your own attribution, combine different sources, and avoid some common pitfalls. Maintaining this data pipeline really isn’t rocket science, and we can help you set it up.