
Analytics, Humans and Black Boxes

I am always interested in the interactions between business intelligence, analytics and people. That's what led me to reexamine the BI space eight years ago after a long hiatus. At that time many organizations were arguing that you didn't need people for BI and analytics. Data mining was becoming popular, and the conventional wisdom seemed to be that you could just turn computers and software loose on a database, stand back, and wait for the profound results to emerge. Voila!

I doubted this was true, and indeed, when several co-authors and I did a research project on highly analytical companies, we found that they all had lots of smart analytical people nearby. People seemed to be necessary to create hypotheses, tell the software where to look in the data, interpret results, and discuss them with decision-makers. Even when analytics eventually became embedded into systems for automated decision-making, the model development stage seemed to require some crack analysts. Then, in more recent research for my new book with Jeanne Harris (Competing on Analytics: The New Science of Winning), we found the same relationship between humans and analytical excellence. For me, the issue was settled.

However, the question of whether humans are really necessary for good analytics has come up once again. I've talked with a couple of analytical software vendors who have lately described their offerings as eliminating - or drastically reducing - the need to have smart people around. Though they use different technology and modeling approaches, both vendors' systems allow organizations to throw in basically any variable that might possibly be related to the ones they want to predict. After a bit of hands-off cranking, the systems come up with the model that produces the greatest statistical fit to the data. Representatives of both vendors mentioned to me that organizations with many statisticians in place don't tend to be interested in their software because it would mean their "quant jock" services were no longer needed. The same claims had been made earlier about neural network software, which similarly employs a "black box" approach to model fitting.
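To make the hands-off approach concrete, here is a minimal Python sketch of "throw in every variable and keep whatever fits best." It is an illustration only, not either vendor's method: the data is randomly generated, and the names `tenure`, `spend`, and `misc_0` through `misc_199` are hypothetical. The search scores every candidate variable by its in-sample fit to the target and returns the winner, with no account of why it won.

```python
import random

def r_squared(xs, ys):
    """Squared Pearson correlation: in-sample fit of a one-variable linear model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)

def best_fit_variable(candidates, target):
    """'Black box' step: score every candidate and keep the best in-sample fit."""
    scores = {name: r_squared(values, target) for name, values in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

rng = random.Random(0)
n = 25  # deliberately few rows relative to the number of candidate variables

# Hypothetical data: the analyst's hypothesis is that tenure drives spend.
tenure = [rng.uniform(0, 10) for _ in range(n)]
spend = [2.0 * t + rng.gauss(0, 8) for t in tenure]

# The hands-off approach adds every other variable on hand (pure noise here).
candidates = {"tenure": tenure}
for i in range(200):
    candidates[f"misc_{i}"] = [rng.gauss(0, 1) for _ in range(n)]

best, scores = best_fit_variable(candidates, spend)
print("hypothesis-driven fit (tenure):", round(scores["tenure"], 3))
print("black-box pick:", best, "fit:", round(scores[best], 3))
```

By construction, the black-box pick always fits the sample at least as well as the analyst's variable. With many candidates and few rows, though, the winner can be a variable with no real connection to the outcome, and the search offers no way to tell the difference.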

In short, we have dueling approaches to the human role in analytics. There's the human-centered, hypothesis-driven (only humans can have hypotheses, right?) school that requires knowledge of the difference between a coefficient and a chi-square. These people tend to like traditional statistical tools like SAS and SPSS. Then there's the black box school that requires only minimal human intervention. This approach tends to feature fairly esoteric analytical approaches: genetic algorithms, machine learning, and neural nets.

Perhaps there is a third school that is even more human-intensive than the hypothesis-driven approach. It involves substantial visual analysis of the data. John Tukey, the Princeton statistician who is also credited with coining the terms "bit" and "software," was a strong advocate of visual methods for exploring data. This school is represented by visual analytics vendors such as Spotfire in BI, and Visual Sciences in the Web metrics field.

So which school is "right"? Of course, each approach can be right under particular circumstances. Usually those circumstances are described in technical terms. Neural networks, for example, tend to require large amounts of data. The visual exploration approach is good when you have relatively small amounts of data and only want to look at a small number of independent and dependent variables. You get the picture.

But I would argue that we should think about analytics in sociological and psychological - i.e., human - terms as well. One obvious way to do so is to consider whether your company employs the sophisticated analysts who are required for the hypothesis-driven approach before going very far down that path. For example, at Genalytics, a genetic algorithms software company with a focus on marketing applications, the primary target is mid-sized firms that may lack the analytical hardware, software, and "wetware" to predict who will be their best customers. Genetic algorithms are a black box-like technology, so if you're selling that approach you might as well look for customers who don't have a human alternative.

But the role of humans in analytics doesn't stop at creating the models. People are also useful in communicating and discussing the results of analyses, and that's tough to do with black box approaches. The "why" of a black box approach simply doesn't exist; all that matters is that a good fit is achieved. In some cases the reasoning behind a black box outcome will be obvious; in others it won't make any sense at all.

I spoke recently with the Chief Risk Officer of a large U.S. bank who had tried all of the different points along the continuum of human involvement. Each, he believed, could generate useful results. However, he said that black box analytics have not led to action as often as human-intensive approaches have. "We've had good results with neural networks, for example," he said. "But no one can explain why they come up with the results they get. As a result, our executives just don't trust them, and they don't tend to act on them."

I discussed a similar issue with the head of analytics at an insurance company that sells policies to small business owners. Most of their policies are issued on the basis of automated recommendations from an analytical underwriting model. He also said that being able to explain the results was particularly important - in this case to their customers. "When we have to turn down a customer's application, it's very important we explain to them in straightforward terms why we don't want their business. We've had models that used some very strange variable combinations to achieve a good fit to the data, but they were virtually impossible to explain logically. So we use a more straightforward model coming out of a hypothesis from our underwriters."

Some decision-makers or customers won't care how a system came up with the models it developed as long as it performs well. However, organizations that are planning an analytical application should think carefully about the entire lifecycle of the system and its results. If humans are necessary to create, communicate, or defend analytical results, make sure you're using analytical methods that make such a human role possible, and that you've got the right people on board to do the job.


Tom Davenport is Professor and director of research, Babson Executive Education, Babson College. He can be contacted at tdavenport@babson.edu.
