Analytic users are not all the same; in most organizations, there are a number of different user “personalities”, or personas, with distinct needs. If you develop an analytics architecture for your organization or develop analytic software to sell to others, it is important to understand these personas. In this essay, I profile four personas:
- Power Analyst
- Data Scientist
- Business Analyst
- Analytic Consumer
Your organization may or may not include all four personas; for example, if your organization consistently outsources predictive model building, you may have no Power Analysts or Data Scientists. Moreover, if your organization is large enough, it may be valuable for you to recognize distinct subclasses of users within each persona. In any event, your success depends on how well you understand the diverse needs of prospective users.
The Power Analyst
The Power Analyst sees advanced analytics as a full-time job, and holds a position such as Statistician or Actuary in an organization with significant investments in analytics, or works as a consultant in an organization that provides analytic services. The Power Analyst understands conventional statistics and machine learning, and has considerable working experience in applied analytics.
Power Analysts prefer to work in an analytic programming language such as Legacy SAS or R. They have enough training and working experience with the language to be productive, and consider analytic programming languages to be more flexible and powerful than analytic software packages with graphical user interfaces. They do not need analytics to be “easy”, and may look down on those who do.
The “right” analytic method is extremely important to Power Analysts; they tend to be more concerned with using the “correct” methodology than with actual differences in business results achieved with different methods. This means, for example, if a particular analytic problem calls for a specific method or class of methods, such as Survival Analysis, the Power Analyst will go to great lengths to use this method even if the improvement to predictive accuracy is very small.
In practice, since working Power Analysts tend to work with highly diverse problems and cannot always predict the nature of the problems they will need to address, they place a premium on being able to use a wide variety of analytic methods and techniques. The need for a particular method or technique may be rare, but Power Analysts want to be able to use it if the need arises.
Since data preparation is critical to successful predictive analytics, Power Analysts need to be able to understand and control the data they work with. This does not mean that Power Analysts want to manage the data or perform ETL tasks; it means that they need the data management processes to be transparent and responsive. In organizations where IT does not place a premium on supporting predictive analytics, Power Analysts will take over data management and ETL to meet their own needs, but this is not necessarily the working model they prefer.
The work product of Power Analysts may be a management report of some kind showing the results of an analysis, a predictive model specification to be recoded for production, a predictive model object (such as a PMML document) or an actual executable scoring function written in a programming language such as Java or C. Power Analysts do not want to be heavily involved in production deployment or routine model scoring, though they may be forced into this role if the organization has not invested in tooling for model score deployment.
Power Analysts are highly engaged in the specific brand, release and version of analytic software. In organizations where the analytics team has significant influence, they play a decisive role in selecting analytic software. They also want control over the technical infrastructure supporting the analytic software, though they tend to be indifferent about specific brands of hardware, databases, storage and so forth.
In many organizations, the Power Analyst provides an “attest” function to validate that analytics are correctly performed; hence, they tend to have disproportionate authority in analytic matters based on their reputation and expertise.
The Data Scientist
As the Google Trends graph below illustrates, the term “Data Scientist” is of recent origin, hardly used at all prior to 2011 but rapidly increasing since then.
The Data Scientist is similar in many respects to the Power Analyst. Both share a lack of interest in “easy to use” tooling, and a desire to engage at a granular level with the data.
The principal differences between Data Scientists and Power Analysts relate to background, training and approach. Power Analysts tend to understand statistical methods, bring a statistical orientation to analytics, and tend to prefer working with higher-level languages with built-in analytic syntax; Data Scientists, on the other hand, tend to come from a machine learning, engineering or computer science background. Consequently, they tend to prefer working with programming languages such as C, Java or Python and tend to be much better equipped to work with SQL and MapReduce.
It is no accident that the growing usage of the Data Scientist label correlates with expanded deployment and use of Hadoop. Data Scientists tend to have working experience with Hadoop, and this may be their preferred working environment. They are comfortable working with MapReduce or Apache Spark, and will develop their own code on these platforms if there is no available “off-the-shelf” software that meets their needs.
Data Scientists’ machine learning roots influence their methods, techniques and approach, which affect their requirements for analytic tooling. The machine learning discipline tends to focus less on choosing the “right” analytic method, and places the focus on results of the predictive analytics process, including the predictive power of the model produced by the process. Hence, they are much more open to various forms of “brute force” learning, and choose methods that may be difficult to defend within the statistical paradigm but demonstrate good results.
Data Scientists tend to have low regard for existing analytic software vendors, especially those like SAS and IBM who cater to business customers by soft-pedaling technical details; instead, they tend to prefer open source tooling. They seek the best “technical” solution, one with sufficient flexibility to support innovation. Data Scientists tend to engage directly in the process of “productionizing” their analytic findings; Power Analysts, in contrast, tend to prefer an entirely “hands-off” role in the process.
Since the Data Scientist role has recently emerged, it may lack the sapiential authority enjoyed by the Power Analyst in conservative organizations. In some organizations, “Data Science” is perceived negatively, and
The Business Analyst
The Business Analyst uses analytics within the context of a role in the organization where analytics is important but not the exclusive responsibility. Business Analysts hold a range of titles, such as Loan Officer, Marketing Analyst or Merchandising Specialist.
Business Analysts are familiar with analytics and may have some training and experience. Nevertheless, they prefer an easy-to-use interface and software such as SAS Enterprise Guide, SAS Enterprise Miner, SPSS Statistics or similar products.
While Power Analysts are very concerned with choosing the “right” method for the problem, Business Analysts tend to prefer a simpler approach. For example, they may be familiar with regression analysis, but they are unlikely to be interested in all of the various kinds of regression and the details of how regression models are calculated. They value “wizard” tooling that guides the selection of methods and techniques within a problem-solving framework.
The Business Analyst may be aware that data is important to the success of analytics, but does not want to deal with it directly. Instead, the Business Analyst prefers to work with data that has been certified as correct by others in the organization. Face validity matters to the Business Analyst; data should be internally consistent and align with the analyst’s understanding of the business.
In most cases, the work product of a Business Analyst is a report summarizing the results of an analysis. The work product may also be a decision of some kind, ranging from the volume of merchandise to order to a complex loan decision. Business Analysts rarely produce predictive models for production deployment, because their working methods tend to lack the rigor and exhaustiveness of those used by Power Analysts.
Business Analysts value good customer-friendly Technical Support, and tend to prefer to use software from vendors with demonstrated credibility in analytics.
The Analytic Consumer
Analytic Consumers are fully focused on business questions and issues and do not engage directly in the “production” of analytics; instead they use the results of analytics in the form of automated decisions, forecasts and other forms of intelligence that are embedded into the business processes in which they engage.
Analytic Consumers are not necessarily “top management” or any other specific level in the organization; they are simply not professionally engaged in the “sausage-making” of forecasts, automated decisions, and so forth.
While Analytic Consumers may not engage with mathematical computations, they are concerned with the overall utility, performance and reliability of the systems they use. For example, a customer service rep in a credit card call center may not be concerned with the analytic method used to determine a decision, but will be very concerned if the system takes a long time to reach a decision. The rep may also object if the system does not provide reasonable explanations when it declines a credit request, or appears to decline too many customers who seem to be good risks.
In most organizations, Analytic Consumers are the largest group of prospective users. Since the range of possible ways that analytics can positively affect business processes is large and growing rapidly, and since embedded analytics have few barriers to use, this group of users also has the greatest growth potential.
In most organizations, there are many more prospective Analytic Consumers and Business Analysts than Power Analysts and Data Scientists; on the surface, this means that a strategy of appealing to Analytic Consumers and Business Analysts offers the greatest potential for business value. However, few organizations are willing to entrust “hard money” analytic applications (such as fraud, credit risk or trading) to analytic novices; since the best and brightest analysts tend to be Power Analysts or Data Scientists, they tend to carry the most weight in decision-making about analytics.