How to Optimize Your Marketing Spend

There are formal methods and tools you can use to optimize marketing spend, including software from SAS, IBM and HP (among others).  The usefulness of these methods, however, depends on basic disciplines that are missing from many Marketing organizations.

In this post I’d like to propose some informal rules for marketing optimization.  These do not exclude using formal methods as well — think of them as organizing principles to put in place before you go shopping for optimization software.

(1) Ignore your agency’s “metrics”.

You use agencies to implement your Marketing campaigns, and they will be more than happy to provide you with analysis that shows you how much value you’re getting from the agency.  Ignore these.  Asking your agency to measure results of the campaigns they implement is like asking Bernie Madoff to serve as the custodian for your investments.

Every agency analyst understands that the role of analytics is to make the account team look good.   This influences the analytic work product in a number of ways, from use of bogus and irrelevant metrics to cherry-picking the numbers.

Digital media agencies are very good at execution, and they should play a role in developing strategy.  But if you are serious about getting the most from your Marketing effort, you should have your own people measure campaign results, or engage an independent analytics firm to perform this task for you.

(2) Use market testing to measure every campaign.

Market testing isn’t the gold standard for campaign measurement; it’s the only standard.  The principle is straightforward: you assign marketing treatments to prospects at random, including a control group who receive no treatment.  You then measure subsequent buying behavior among members of the treatment and control groups; the campaign impact is the difference between the two.

The beauty of test marketing is that you do not need a hard link between impressions and revenue at the point of sale, nor do you need to control for other impressions or market noise.  If treatments and controls are assigned at random, any differences in buying behavior are attributable to effects of the campaign.

Testing takes more effort to design and implement, which is one reason your agency will object to it.  The other reason is that rigorous testing often shows that brilliant creative concepts have no impact on sales.  Agency strategists tend to see themselves as advocates for creative “branding”; they oppose metrics that expose them as gasbags.  That is precisely why you should insist on it.

(3) Kill campaigns that do not cover media costs.

Duh, you think.  Should be obvious, right?  Think again.

A couple of years ago, I reviewed the digital media campaigns for a big retailer we shall call Big Brand Stores.  Big Brand ran forty-two digital campaigns per fiscal year; stunningly, exactly one campaign — a remarketing campaign — showed incremental revenue sufficient to cover media costs.  (This analysis made no attempt to consider other costs, including creative development, site-side development, program management or, for that matter, cost of goods sold.)

There is a technical term for campaigns that do not cover media costs.  They’re called “losers”.

The client’s creative and media strategists had a number of excuses for running these campaigns, such as:

  • “We’re investing in building the brand.”
  • “We’re driving traffic into the stores.”
  • “Our revenue attribution is faulty.”

Building a brand is a worthy project; you do it by delivering great products and services over time, not by spamming everyone you can find.

It’s possible that some of the shoppers rummaging through the marked-down sweaters in your bargain basement saw your banner ad this morning.  Possible, but not likely; it’s more likely they’re there because they know exactly when you mark down sweaters every season.

Complaints about revenue attribution usually center on the “last click” versus “full-funnel” debate, a tiresome argument you can avoid by insisting on measurement through market testing.

If you can’t measure the gain, don’t do the campaign.

(4) Stop doing one-off campaigns.

Out of Big Brand’s forty-two campaigns, thirty-nine were one-offs: campaigns run once and never again.  A few of these made strategic sense: store openings, special competitive situations and so forth.  The majority were simply “concepts”, based on the premise that the client needed to “do something” in April.

The problem with one-off campaigns is that you learn little or nothing from them.  The insight you get from market testing enables you to tune and improve one campaign with a well-defined value proposition targeted to a particular audience.  You get the most value from that insight when you repeat the campaign.  Marketing organizations stuck in the one-off trap never build the knowledge and insight needed to compete effectively.  They spend much, but learn nothing.

Allocate no more than ten percent of your Marketing spend to one-off campaigns.  Hold this as a reserve for special situations — an unexpected competitive threat, product recall or natural disaster.  Direct the rest of your budget toward ongoing programs defined by strategy.  For more on that, read the next section.

(5) Drive campaign concepts from strategy.

Instead of spending your time working with the agency to decide which celebrity endorsement to tout in April, develop ongoing programs that address key strategic business problems.  For example, among a certain segment of consumers, awareness and trial of your product may be low; for a credit card portfolio, share of revolving balances may be lagging competing cards among certain key segments.

The exact challenge depends on your business situation; what matters is that you choose initiatives that (a) can be influenced through a sustained program of marketing communications and (b) will make a material impact to your business.

Note that “getting lots of clicks in April” satisfies the former but not the latter.

This principle assumes that you have a strategic segmentation in place, because segmentation is to Marketing what maneuver is to warfare.  You simply cannot expect to succeed by attempting to appeal to all consumers in the same way.  Your choice of initiatives should also demonstrate some awareness of the customer lifecycle; for example, you don’t address current customers in the same way that you address prospective customers or former customers.

When doing this, keep the second and third principles in mind: a campaign concept is only a concept until it is tested.  A particular execution may fail market testing, but if you have chosen your initiatives well you will try again using a different approach.  Keep in mind that you learn as much from failed market tests as from successful market tests.

Automated Predictive Modeling

A colleague asks: can we automate predictive modeling?

How we answer the question depends on the context.   Consider the two variations on the question below, with more precise wording:

  1. Can we completely eliminate the need for expertise in predictive modeling — so that an “ordinary business user” can do it?
  2. Can we make expert analysts more productive by automating certain repetitive tasks?

The first form of the question — the search for “business user” analytics — is a common vision among software marketing folk and industry analysts; it is based on the premise that expert analysts are the key bottleneck limiting enterprise adoption of predictive analytics.   That premise is largely false, for reasons that warrant a separate blog post; for now, let’s just stipulate that the answer is no, it is not possible to eliminate human expertise from predictive modeling, for the same reason that robotic surgery does not eliminate the need for cardiologists.

However, if we focus on the second form of the question and concentrate on how to make expert analysts more productive, the situation is much more promising.  Many data preparation tasks are easy to automate; these include such tasks as detecting and eliminating zero-variance columns, treating missing values and handling outliers.  The most promising area for automation, however, is in model testing and assessment.

Optimizing a predictive model requires experimentation and tuning.  For any given problem, there are many available modeling techniques, and for each technique there are many ways to specify and parameterize a model.  For the most part, trial and error is the only way identify the best model for a given problem and data set. (The No Free Lunch theorem formalizes this concept).

Since the best predictive model depends on the problem and the data, the analyst must search a very large set of feasible options to find the best model.  In applied predictive analytics, however, the analyst’s time is strictly limited; a client in the marketing services industry reports an SLA of thirty minutes or less to build a predictive model.  Strict time constraints do not permit much time for experimentation.

Analysts tend to deal with this problem by settling for sub-optimal models, arguing that models need only be “good enough,” or defending use of one technique above all others.  As clients grow more sophisticated, however, these tactics become ineffective.  In high-stakes hard-money analytics — such as trading algorithms, catastrophic risk analysis and fraud detection — small improvements in model accuracy have a bottom line impact, and clients demand the best possible predictions.

Automated modeling techniques are not new.  Before Unica launched its successful suite of marketing automation software, the company’s primary business was advanced analytics, with a particular focus on neural networks.  In 1995, Unica introduced Pattern Recognition Workbench (PRW), a software package that used automated trial and error to optimize a predictive model.   Three years later, Unica partnered with Group 1 Software (now owned by Pitney Bowes) to market Model 1, a tool that automated model selection over four different types of predictive models.  Rebranded several times, the original PRW product remains as IBM PredictiveInsight, a set of wizards sold as part of IBM’s Enterprise Marketing Management suite.

Two other commercial attempts at automated predictive modeling date from the late 1990s.  The first, MarketSwitch, was less than successful.  MarketSwitch developed and sold a solution for marketing offer optimization, which included an embedded “automated” predictive modeling capability (“developed by Russian rocket scientists”); in sales presentations, MarketSwitch promised customers its software would allow them to “fire their SAS programmers”.  Experian acquired MarketSwitch in 2004, repositioned the product as a decision engine and replaced the “automated modeling” capability with outsourced analytic services.

KXEN, a company founded in France in 1998, built its analytics engine around an automated model selection technique called structural risk minimization.   The original product had a rudimentary user interface, depending instead on API calls from partner applications; more recently, KXEN repositioned itself as an easy-to-use solution for Marketing analytics, which it attempted to sell directly to C-level executives.  This effort was modestly successful, leading to sale of the company in 2013 to SAP for an estimated $40 million.

In the last several years, the leading analytic software vendors (SAS and IBM SPSS) have added automated modeling features to their high-end products.  In 2010, SAS introduced SAS Rapid Modeler, an add-in to SAS Enterprise Miner.  Rapid Modeler is a set of macros implementing heuristics that handle tasks such as outlier identification, missing value treatment, variable selection and model selection.  The user specifies a data set and response measure; Rapid Modeler determines whether the response is continuous or categorical, and uses this information together with other diagnostics to test a range of modeling techniques.  The user can control the scope of techniques to test by selecting basic, intermediate or advanced methods.

IBM SPSS Modeler includes a set of automated data preparation features as well as Auto Classifier, Auto Cluster and Auto Numeric nodes.  The automated data preparation features perform such tasks as missing value imputation, outlier handling, date and time preparation, basic value screening, binning and variable recasting.   The three modeling nodes enable the user to specify techniques to be included in the test plan, specify model selection rules and set limits on model training.

All of the software products discussed so far are commercially licensed.  There are two open source projects worth noting: the caret package in open source R and the MLBase project.  The caret package includes a suite of productivity tools designed to accelerate model specification and tuning for a wide range of techniques.   The package includes pre-processing tools to support tasks such as dummy coding, detecting zero variance predictors, identifying correlated predictors as well as tools to support model training and tuning.  The training function in caret currently supports 149 different modeling techniques; it supports parameter optimization within a selected technique, but does not optimize across techniques.  To implement a test plan with multiple modeling techniques, the user must write an R script to run the required training tasks and capture the results.

MLBase, a joint project of the UC Berkeley AMPLab and the Brown University Data Management Research Group is an ambitious effort to develop a scalable machine learning platform on Apache Spark.  The ML Optimizer seeks to simplify machine learning problems for end users by automating the model selection task so that the user need only specify a response variable and set of predictors.   The Optimizer project is still in active development, with Alpha release expected in 2014.

What have we learned from various attempts to implement automated predictive modeling?  Commercial startups like KXEN and MarketSwitch only marginally succeeded because they tried to oversell the concept as a means to replace the analyst altogether.  Most organizations understand that human judgement plays a key role in analytics, and they aren’t willing to entrust hard money analytics entirely to a black box.

What will the next generation of automated modeling platforms look like?  There are seven key features that are critical for an automated modeling platform:

  • Automated model-dependent data transformations
  • Optimization across and within techniques
  • Intelligent heuristics to limit the scope of the search
  • Iterative bootstrapping to expedite search
  • Massively parallel design
  • Platform agnostic design
  • Custom algorithms

Some methods require data to be transformed in certain specific ways; neural nets, for example, typically work with standardized predictors, while Naive Bayes and CHAID require all predictors to be categorical.  The analyst should not have to perform these operations manually; instead, the transformation operations should be built into the test plan script and run automatically; this ensures the maximum number of possible techniques for any data set.

To find the best predictive model, we need to be able to search across techniques and to tune parameters within techniques.  Potentially, this can mean a massive number of model train-and-test cycles to run; we can use heuristics to limit the scope of techniques to be evaluated based on characteristics of the response measure and the predictors.   (For example, a categorical response measure rules out a number of techniques, and a continuous response measure rules out a different set of techniques).  Instead of a brute force search for the best technique and parameterization, a “bootstrapping” approach can use information from early iterations to specify subsequent tests.

Even with heuristics and bootstrapping, a comprehensive experimental design may require thousands of model train-and-test cycles; this is a natural application for massively parallel computing.  Moreover, the highly variable workload inherent in the development phase of predictive analytics is a natural application for cloud (a point that deserves yet another blog post of its own).  The next generation of automated predictive modeling will be in the cloud from its inception.

Ideally, the model automation wrapper should be agnostic to specific implementations of machine learning techniques; the user should be able to optimize across software brands and versions.  Realistically, commercial vendors such as SAS and IBM will never permit their software to run under an optimizer that they do not own; hence, as a practical matter we should assume that the next generation predictive modeling platform will work with open source machine learning libraries, such as R or Python.

We can’t eliminate the need for human expertise from predictive modeling.   But we can build tools that enable analysts to build better models.

How Important is Model Accuracy?

Go to a trade show for predictive analytics and listen to the presentations; most will focus on building more accurate predictive models.  Presenters will differ on how this should be done: some will tell you to purchase their brand of software, others will encourage you to adopt one method or another, but most will agree: accuracy isn’t everything, it’s the only thing.

I’m not going to argue in this post that accuracy isn’t a good thing (all other things equal), but consider the following scenario: you have a business problem that can be mitigated with a predictive model.  You ask three consultants to submit proposals, and here’s what you get:

  • Consultant A proposes to spend one day and promises to produce a model that is more accurate than a coin flip
  • Consultant B proposes to spend one week, and promises to produce a model that is more accurate than Consultant A’s model
  • Consultant C proposes to spend one year, and promises to produce the most accurate model of all

Which one will you choose?

This is an extreme example, of course, but my point is that one rarely hears analysts talk about the time and effort needed to achieve a given level of accuracy. or the time and effort needed to implement a predictive model in production.  But in real enterprises, there are essential trade-offs that must be factored into the analytics process.  As we evaluate these three proposals, consider the following points:

(1) We can’t know how accurate a prediction will be; we can only know how accurate it was.

We judge the accuracy of a prediction after the event of interest has occurred.  In practice, we evaluate the accuracy of a predictive model by examining how accurate it is with historical data.  This is a pretty good method, since past patterns often persist into the future.  The key word is “often”, as in “not always”; the world isn’t in a steady state, and black swans happen.   This does not mean we should abandon predictive modeling, but it does mean we should treat very small differences in model back-testing with skepticism.

(2) Overall model accuracy is irrelevant.

We live in an asymmetrical world, and errors in prediction are not all alike.   Let’s suppose that your doctor thinks that you may have a disease that is deadly, but can be treated with an experimental treatment that is painful and expensive.  The doctor gives you the option of two different tests, and tells you that Test X has an overall accuracy rate of 60%, while Test Y has an overall accuracy of 40%.

Hey, you think, that’s a no-brainer; give me Test X.

What the doctor did not tell you is that all of the errors for Test X are false negatives: the test says you don’t have the disease when you actually do.  Test Y, on the other hand, produces a lot of false positives, but also correctly predicts all of the actual disease cases.

If you chose Test X, congratulations!  You’re dead.

(3) We can’t know the value of model accuracy without understanding the differential cost of errors.

In the previous example, the differential cost of errors is starkly exaggerated: on the one hand, we risk death, on the other hand, we undergo painful and expensive treatment.  In commercial analytics, the economics tend to be more subtle:  in Marketing, for example, we send a promotional message to a customer who does not respond (false positive) or decline a credit line for a prospective customer who would have paid on time (false negative).  The actual measure of a model, however isn’t its statistical accuracy, but its economic accuracy: the overall impact of the predictive model on the business process it is designed to support.

Taking these points into consideration, Consultant A’s quick and dirty approach looks a lot better, for three reasons:

  • Better results in back-testing for Consultants B and C may or may not be sustainable in production
  • Consultant A’s model benefits the business sooner than the other models
  • Absent data on the cost of errors, it’s impossible to say whether Consultants B and C add more value

A fourth point goes to the heart of what Agile Analytics is all about.  While Consultants B and C continue to tinker in the sandbox, Consultant A has in-market results to use in building a better predictive model.

The bottom line is this: the first step in any predictive modeling effort must always focus on understanding the economics of the business process to be supported.  Unless and until the analyst knows the business value of a correct prediction — and the cost of incorrect predictions — it is impossible to say which predictive model is “best”.

What Business Practices Enable Agile Analytics?

Part four in a four-part series.

We’ve mentioned some of the technical innovations that support an Agile approach to analytics; there are also business practices to consider.   Some practices in Agile software development apply equally well to analytics as any other project, including the need for a sustainable development pace; close collaboration; face-to-face conversation; motivated and trustworthy contributors, and continuous attention to technical excellence.  Additional practices pertinent to analytics include:

  • Commitment to open standards architecture
  • Rigorous selection of the right tool for the task
  • Close collaboration between analysts and IT
  • Focus on solving the client’s business problem

More often than not, customers with serious cycle time issues are locked into closed single-vendor architecture.  Lacking an open architecture to interface with data at the front end and back end of the analytics workflow, these organizations are forced into treating the analytics tool as a data management tool and decision engine; this is comparable to using a toothbrush to paint your house.  Server-based analytic software packages are very good at analytics, but perform poorly as databases and decision engines.

Agile analysts take a flexible, “best-in-class” approach to solving the problem at hand.  No single vendor offers “best-in-class” tools for every analytic method and algorithm.  Some vendors, like KXEN, offer unique algorithms that are unavailable from other vendors; others, like Salford Systems, have specialized experience and intellectual property that enables them to offer a richer feature set for certain data mining methods.  In an Agile analytics environment, analysts freely choose among commercial, open source and homegrown software, using a mashup of tools as needed.

While it may seem like a platitude to call for collaboration between an organization’s analytics community and the IT organization, we frequently see customers who have developed complex processes for analytics that either duplicate existing IT processes, or perform tasks that can be done more efficiently by IT. Analysts should spend their time doing analysis, not data movement, management, enhancement, cleansing or scoring; but surveyed analysts typically report that they spend much of their time performing these tasks.  In some cases, this is because IT has failed to provide the needed support; in other cases, the analytics team insists on controlling the process.   Regardless of the root cause, IT and analytics leadership alike need to recognize the need for collaboration, and an appropriate division of labor.

Focusing the analytics effort on the client’s business problem is essential for the practice of Agile analytics.  Organizations frequently get stuck on issues that are difficult to resolve because the parties are focused on different goals; in the analytics world, this takes the form of debates over tools, methods and procedures.  Analysts should bear in mind that clients are not interested in winning prizes for the “best” model, and they don’t care about the analyst’s advanced degrees.   Business requires speed, agility and clarity, and analysts who can’t deliver on these expectations will not survive.

What Is Driving Interest in Agile Analytics?

Part three in a four-part series.

A combination of market forces and technical innovation drive interest in Agile methods for analytics:

  • Clients require more timely and actionable analytics
  • Data warehouses have reduced latency in the data used by predictive models
  • Innovation directly impacts the analytic workflow itself

Business requirements for analytics are changing rapidly, and clients demand predictive analytics that can support decisions today.  For example, consider direct marketing:  ten years ago, firms relied mostly on direct mail and outbound telemarketing; marketing campaigns were served by batch-oriented systems, and analytic cycle times were measured in months or even years.  Today, firms have shifted that marketing spend to email, web media and social media, where cycle times are measured in days, hours or even minutes.  The analytics required to support these channels are entirely different, and must operate at a digital cadence.

Organizations have also substantially reduced the latency built into data warehouses.  Ten years ago, analysts frequently worked with monthly snapshot data, delivered a week or more into the following month.  While this is still the case for some organizations, data warehouses with daily, inter-day and real-time updates are increasingly common.  A predictive model score is as timely as the data it consumes; as firms drive latency from data warehousing processes, analytical processes are exposed as cumbersome and slow.

Numerous innovations in analytics create the potential to reduce cycle time:

  • In-database analytics eliminate the most time-consuming tasks, data marshalling and model scoring
  • Tighter database integration by vendors such as SAS and SPSS enable users to achieve hundred-fold runtime improvements for front-end processing
  • Enhancements to the PMML standard make it possible for firms to integrate a wide variety of end-user analytic tools with high performance data warehouses

All of these factors taken together add up to radical reductions in time to deployment for predictive models.  Organizations used to take a year or more to build and deploy models; a major credit card issuer I worked with in the 1990s needed two years to upgrade its behavior scorecards.  Today, IBM Netezza customers who practice Agile methods can reduce this cycle to a day or less.

What Is Agile Analytics?

This post is the second in a four-part series.

Agile Analytics is an approach to predictive analytics that emphasizes:

  • Client satisfaction through rapid delivery of usable predictions
  • Focus on model performance when deployed “in market”
  • Iterative and evolutionary approach to model development
  • Rapid cycle time through radical reduction in time to deployment

The Agile approach focuses on the client’s end goal: using data-driven predictions to make better decisions that impact the business.  In contrast, conventional approaches to predictive modeling (such as the well-known SEMMA[1] model) tend to focus on the model development process, with minimal attention given to either the client’s business problem or how the model will be deployed.

Since Agile Analytics is most concerned with how well the predictive model supports the client’s decision-making process, the analyst evaluates the model based on how well it serves this purpose when deployed under market conditions.  In practice, this means that the analyst evaluates model accuracy in production together with score latency, deployment cost and interpretability – a critical factor when building predictive analytics into a human process.   Conventional approaches typically evaluate predictive models solely on model accuracy when back-tested on a sample, a measure that often overstates the accuracy that the model will achieve when deployed under market conditions.

Agile analysts stress rapid deployment and iterative learning; they assume that the knowledge produced from tracking an initial model after it is deployed enables enhancements in subsequent iterations, and they build this expectation into the modeling process.  An Agile analyst quickly develops a predictive model using fast, robust methods and available data, deploys the model, monitors the model in production and improves it as soon as possible.  A conventional analyst tends to take extra time perfecting an initial model prior to deployment, and may pay no attention to in-market performance unless the client complains about anomalies.

Reducing cycle time is critical for the Agile analyst, since every iteration produces new knowledge.  The Agile analyst aggressively looks for ways to reduce the time needed to develop and deploy models, and factors cycle time into the choice of analytic methods.  Conventional analysts are often strikingly unengaged with what happens outside of the model development task; larger analytic teams often delegate tasks like data marshalling, cleansing and scoring to junior members, who perform the “grunt” work with programming tools.

[1] Sample, Explore, Modify, Model, Assess

Agile Analytics: Overview

Is this the year of Agile Analytics?  Recent publications show growing interest in the application of Agile methods to analytics:

  • Ken Collier, an Agile pioneer, tackles analytics in his aptly named new book Agile Analytics .
  • A quick Google search surfaces a number of recent blogs and articles (here, here and here)
  • Curt Monash recently published an excellent two-part blog on the subject (here and here)

I’ve commented in the past on IBM’s Big Data Hub about techniques that contribute to Agile Analytics, such as in-database analyticsopen source analytics and tighter integration with commercial packages like SAS.  In addition, I’ve commented on some of the barriers to agility, such as limitations of the PMML standard.

In this series, I’ll cover these topics

(1) What is Agile Analytics?

(2) What’s driving interest in Agile Analytics?

(3) What business practices enable Agile Analytics?