A Six-Step Approach To Evaluate And Manage Algorithmic Performance

Understanding how to evaluate and manage algorithmic performance could be the difference between success and failure. This article outlines a six-step approach for defining what to measure and monitor. Central to this approach is working out where the waste is by measuring failure states, which are critical to monitoring performance, prioritizing enhancements, and understanding whether actions are actually improving performance.

A U.S. retailer was spending $50 million a year bidding across a million keywords on Google. This spend drove $500 million of sales (equivalent to a ROAS, or return on ad spend, of 10). They were very pleased with the results and were planning to increase their spend.

But when I helped the retailer analyze performance at the keyword level, we uncovered a different picture: While the overall performance was good, they were spending $7 million a year on thousands of long-tail keywords that generated zero sales. Buried in their Google bidding algorithm was a parameter that determined how much spend was acceptable before a specific keyword was paused. Simply by changing the value of this single parameter, the business saved $7 million a year with no impact on sales.
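To make the mechanics concrete, here is a minimal Python sketch of the kind of pause rule described above. The parameter name, threshold, and data are hypothetical, not the retailer's actual configuration; the point is simply that a single, reviewable parameter governs when an unproductive keyword stops spending.

```python
# A minimal sketch of the kind of pause rule described above.
# The parameter name, threshold, and data are hypothetical.

MAX_SPEND_BEFORE_PAUSE = 500.0  # dollars a keyword may spend with zero sales before being paused

def should_pause(keyword_stats: dict) -> bool:
    """Pause a keyword once cumulative spend exceeds the threshold without generating any sales."""
    return keyword_stats["sales"] == 0 and keyword_stats["spend"] > MAX_SPEND_BEFORE_PAUSE

keywords = [
    {"keyword": "blue widget xxl sale", "spend": 1200.0, "sales": 0.0},
    {"keyword": "widget", "spend": 9000.0, "sales": 85000.0},
]

for kw in keywords:
    if should_pause(kw):
        print(f"Pause: {kw['keyword']} (${kw['spend']:.0f} spent, no sales)")
```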

It is tempting when confronted with millions of keywords to retreat to a simplified aggregated average such as ROAS (which is, after all, the objective here). But looking at aggregated outcomes is insufficient to manage and optimize an algorithm that is operating at an atomic level. For this, we need a new approach for understanding and performance-managing algorithms.

Digitalization Requires Algorithms

Companies used to operate at an aggregate level because the mechanisms they had were blunt, but digital technologies now enable surgical micro-decisions across the enterprise. The volume of these decisions requires automation, and algorithms define the assumptions, data and logic that determine how the micro-decisions are made.

These algorithms are increasingly ubiquitous in digitally driven businesses and make decisions across a wide range of areas, including pricing, promotions, inventory allocation, supply chain, resource scheduling, credit decisioning, fraud detection, digital marketing, product sort orders, recommendations, and personalization. A common characteristic of these decisions is that they have some inherent uncertainty that requires a tradeoff to be made. For example, digital marketing decisions have a volume-profit tradeoff. Supply-chain decisions have a waste-availability tradeoff. Resourcing decisions have a service-cost tradeoff.

Business leaders recognize the need to manage and optimize these new decision systems, both to navigate the tradeoffs and drive continuous improvement. But how? Metrics and KPIs are the mechanism for management control, but traditional approaches to reporting don’t work for these new algorithmically powered decisions, as we explain below.

The Anatomy Of An Algorithm

To manage a decision algorithm, you need to start by understanding how it’s constructed. It’s helpful to break down the characteristics of an algorithm into the 4Ps and highlight the tradeoffs to be made in each:

Purpose: What Is The Objective Of The Algorithm?

Define the primary objective and any guardrails or constraints. There is typically a tradeoff between selecting a complex enterprise objective (e.g., profit) and a simpler, more siloed proxy (e.g., ROAS). For example, choosing to maximize sales within ROAS guardrails or to maximize profit will lead to very different decision logic.

Precision: How Personalized Is The Micro-Decision?

For example, for the retailer bidding across a million keywords, should they set a single ROAS target for all keywords, a “personalized” target for each individual keyword, or something in between? The tradeoff is about management: Is the human effort required to set and maintain atomic-level strategies worth it?
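One way to picture the precision tradeoff is as a hierarchy of targets: a single global default, coarser segment-level overrides, and fully atomic keyword-level overrides. The sketch below is illustrative only; the names, segments, and values are assumptions, not the retailer's setup.

```python
# Hypothetical sketch of the precision tradeoff: one global ROAS target,
# optionally overridden at segment or keyword level. Names and values are illustrative.

GLOBAL_ROAS_TARGET = 10.0
SEGMENT_TARGETS = {"brand_terms": 6.0, "long_tail": 14.0}   # coarser "personalization"
KEYWORD_TARGETS = {"blue widget xxl sale": 18.0}            # fully atomic personalization

def roas_target(keyword: str, segment: str) -> float:
    """The most specific target wins; fall back to the segment, then the global default."""
    return KEYWORD_TARGETS.get(keyword, SEGMENT_TARGETS.get(segment, GLOBAL_ROAS_TARGET))

print(roas_target("blue widget xxl sale", "long_tail"))  # 18.0 - keyword-level override
print(roas_target("red widget", "brand_terms"))          # 6.0  - segment-level target
print(roas_target("green widget", "unclassified"))       # 10.0 - global default
```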

Prediction: How Is The Uncertainty Modeled?

This could be a simple extrapolation or an extremely complex AI/ML model. In addition, prediction models can be simplified as micro-decisions become more frequent. For example, a European airline used to set prices on a weekly basis and had developed very sophisticated forecast models to predict fill rates. As they evolved, they moved from weekly price changes to repricing every few minutes and were able to respond to actual demand rather than rely on prediction models. The tradeoffs here are between forecast accuracy, interpretability, and decision frequency.

Policy: What Are The Rules/Logic/Maths That Determine The Actual Micro-Decision?

The tradeoff here is between a simple, easy-to-understand algorithm and a more complex formula that delivers a better outcome but can only be understood by experts. For example, we could define keyword bids with a simple rule, or a complex regression formula.
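As an illustration of the policy tradeoff, the sketch below contrasts a rule anyone can explain with a fitted formula that blends several signals. Both functions and all coefficients are made up for illustration; a real bidding policy would be calibrated on the business's own data.

```python
# Two hypothetical bid policies, contrasting a simple rule with a more complex formula.
# All coefficients and inputs are invented for illustration.

def bid_simple_rule(expected_revenue_per_click: float, roas_target: float) -> float:
    """Easy to explain: bid the revenue a click is expected to return, divided by the ROAS target."""
    return expected_revenue_per_click / roas_target

def bid_fitted_formula(conversion_rate: float, avg_order_value: float,
                       margin: float, seasonality_index: float) -> float:
    """Harder to explain: a fitted formula blending several signals (coefficients are made up)."""
    return max(0.0, 0.85 * conversion_rate * avg_order_value * margin * seasonality_index - 0.02)

print(bid_simple_rule(expected_revenue_per_click=5.0, roas_target=10.0))  # 0.50
print(bid_fitted_formula(0.04, 120.0, 0.30, 1.1))                         # ~1.33
```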

These examples bring to life the concept of “satisficing” developed by the Nobel Prize-winning economist Herbert Simon: the choice between finding optimal solutions for a simplified world and finding satisfactory solutions for a more realistic world. There is no “perfect” algorithm. But we now have the data and tools to think differently and find “good enough” solutions for the real world.

As we’ve seen, the nature of algorithms requires new types of tradeoff, both at the micro-decision level and at the algorithm level. A critical role for leaders is to navigate these tradeoffs, both when the algorithm is designed and on an ongoing basis. Improving algorithms is increasingly a matter of changing rules or parameters in software, more like tuning the knobs on a graphic equalizer than rearchitecting a physical plant or deploying a new IT system.

The New Metrics Of Algorithms

Algorithms are often treated with an undeserved reverence: Because they’re clever, they must be right. And in many cases, algorithms have been a huge improvement on “what was being done before.” But this sort of cognitive bias can lull managers into complacency.

Metrics are critical not only for measuring algorithm performance but also for highlighting opportunities for improvement, in particular where tradeoffs may be sub-optimizing the outcome. The different nature of the data created by these digital decision systems motivates a new approach.

Traditional methods for defining metrics focus on management control, with insight treated as a separate, ad hoc activity. But now metrics can be designed to drive a continuous improvement cycle and, as the speed of reporting increases, can create increasingly autonomous feedback loops. This requires a mindset shift for managers used to weekly management meetings where their colleagues explain performance.

A further complexity is the interdependency created by atomization, which requires performance measurement that crosses silos. For example, in the pre-digital world, retail marketers could measure the performance of TV advertising independently from the sales performance of pricing and promotions. Now, advertising on Google (determined by a marketing algorithm) drives traffic directly to products and interacts with pricing and promotion decisions (determined by a sales algorithm). Measuring the performance of each algorithm independently can give a misleading view. There are three critical adaptations to metrics when performance-managing algorithms:

From Top-Down To Bottom-Up Metrics

  • Traditional: corporate objectives cascading down into functional silos.
  • New Approach: the low-level metrics needed to evaluate algorithms must roll up to corporate KPIs.

From Outcomes To Input Metrics

  • Traditional: outcome-centric metrics focused on aggregates and averages.
  • New Approach: input-centric metrics focused on distributions and de-averaging. In the pre-digital world, the challenge was to manage “to the average” and handle outliers. Now atomization allows businesses to exploit heterogeneity and take advantage of outliers (see the sketch below).

From Reporting To Actionable Metrics

  • Traditional: a reporting mindset, with static reports requiring human review and interpretation.
  • New Approach: action-oriented, recognizing that as the latency from decision to insight shrinks, it creates the opportunity to semi- or fully automate a feedback loop.
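The sketch below ties the de-averaging and feedback-loop ideas together: the aggregate ROAS looks healthy, while the keyword-level view flags specific keywords for action against a guardrail. The data and the guardrail value are invented for illustration.

```python
# Illustrative only: the aggregate ROAS looks healthy, while the de-averaged,
# keyword-level view flags specific keywords for action. Data and guardrail are made up.

keywords = [
    {"keyword": "widget",        "spend": 40000.0, "sales": 480000.0},
    {"keyword": "widget deals",  "spend": 8000.0,  "sales": 20000.0},
    {"keyword": "wdget blu xxl", "spend": 2000.0,  "sales": 0.0},
]

MIN_ROAS = 2.0  # guardrail below which a keyword is flagged for action

aggregate_roas = sum(k["sales"] for k in keywords) / sum(k["spend"] for k in keywords)
print(f"Aggregate ROAS: {aggregate_roas:.1f}")  # 10.0 - looks fine on average

# De-averaged view: the feedback loop acts on each keyword rather than reporting the mean.
for k in keywords:
    roas = k["sales"] / k["spend"] if k["spend"] else 0.0
    if roas < MIN_ROAS:
        print(f"Flag for action: {k['keyword']} (ROAS {roas:.1f}, spend ${k['spend']:,.0f})")
```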

How To Manage Your Algorithm

So what do you do? Below is a six-step approach for managing algorithms and defining what to measure and monitor. Central to this approach is working out where the waste is by measuring failure states. These are critical to monitoring performance, prioritizing enhancements, and understanding whether actions are improving performance.

Define The Enterprise Objective

It’s fundamental to identify the enterprise objective that the algorithm influences. This could be profit, return on investment or customer lifetime value. For example, a retailer decided that customer lifetime value was the objective for its digital marketing algorithm.

Identify The Atomic Level

Understand the level at which the micro-decision is currently being made, and the lowest level at which it could be made. For example, a retailer was bidding at a keyword level but recognized it was possible to bid based on geography, device, time of day, and other customer characteristics.

Define Success/Guardrails

Determine what level of performance is acceptable. This will always be a judgment for business leadership and should typically lead to a conversation about risk appetite. This step sets guardrails at the micro-decision level. For example, a retailer decided that it wanted its advertising to pay back within six months, so it was willing to invest as long as it remained confident about the return.

Quantify Failure States

Understand the cost of being wrong. This is the critical step that assigns a dollar value to performance outside the guardrails. Sometimes this will be a simple measure of wasted spend, but in other cases a model may be required to estimate the missed opportunity. For example, a retailer identified two key areas of waste: underspending on keywords that were driving a high volume of profitable customers, and overspending on keywords with a greater-than-six-month payback.
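A minimal sketch of how these two failure states might be costed, assuming each keyword record carries its spend, an estimated payback period, and a modeled estimate of unexploited "headroom". All field names, thresholds, and figures are illustrative assumptions rather than the retailer's actual model.

```python
# A minimal sketch of costing two failure states against the payback guardrail above.
# Field names, thresholds, and figures are illustrative assumptions; "headroom" stands in
# for a modeled estimate of additional value available on an underspent keyword.

PAYBACK_GUARDRAIL_MONTHS = 6

keywords = [
    {"keyword": "widget",        "spend": 40000.0, "payback_months": 2,  "headroom": 15000.0},
    {"keyword": "cheap widgets", "spend": 6000.0,  "payback_months": 14, "headroom": 0.0},
]

wasted_spend = 0.0        # dollars spent on keywords outside the payback guardrail
missed_opportunity = 0.0  # estimated value left on the table on keywords well inside it

for kw in keywords:
    if kw["payback_months"] > PAYBACK_GUARDRAIL_MONTHS:
        wasted_spend += kw["spend"]           # failure state 1: overspending
    elif kw["headroom"] > 0:
        missed_opportunity += kw["headroom"]  # failure state 2: underspending

print(f"Wasted spend: ${wasted_spend:,.0f}")              # $6,000
print(f"Missed opportunity: ${missed_opportunity:,.0f}")  # $15,000
```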

Measure Performance

What is the total waste and the missed opportunity? The next step is to analyze performance at the lowest level identified in step 2 to understand the total business value sitting outside the guardrails. A common failure here is to analyze performance at the same level at which you are currently operating; this is typically self-fulfilling and hides both issues and opportunities. For example, a retailer was able to identify wasted spend of $Xm and missed opportunity of $Ym of customer lifetime value.

Understand What Is Driving The Waste

Finally, analyze performance through the lens of the 4Ps (purpose, precision, prediction, and policy) to understand which element of the algorithm is the biggest driver of waste and missed opportunity. For example, a retailer found that there were opportunities to improve each aspect of its bidding algorithm, with precision being the single biggest root cause of missed opportunity.

This is HR for algorithms. Understanding how to evaluate and manage algorithmic performance could be the difference between success and failure.

Originally posted on hbr.org by Michael Ross.

About Author: Michael Ross is a cofounder of DynamicAction, which provides cloud-based data analytics to retail companies, and an executive fellow at London Business School.