A Presentation from Big Data, 22 February 2013
Big Data Analytics: avoiding the pitfalls with robust analytics
Steve Cohen In4mation insights
All copyright owned by The Future Place and the presenters of the material.
For more information about NewMR events visit NewMR.org
Big Data Analytics: avoiding the pitfalls Steve Cohen Partner, in4mation insights
[email protected] www.in4ins.com
Steve Cohen, in4mation insights, Boston, MA USA Big Data, 22 February 2013
Agenda
• What Big Data is NOT
• The danger of Big Data
• New methods for Big Data
• Robust analytics for deep dives on Big Data
Harness Big Data Big Value
1. Cut time to market and improve quality
2. Quantify variability and improve performance
3. Segment to customize action
4. Improve decision making and minimize risk
5. Create new products and services
Big Data is driving the demand for skilled problem solvers
Source: McKinsey Global Institute Report (May 2011)
What is Big Data?
Three V’s of Big Data
Volume
Source: Doug Laney, Gartner
Solving the Big Data Problem
Machines
Source: UC Berkeley AMP Lab & McKinsey
Where is all of the buzz?
Dominated by H & H?
The Long Tail
(Figure: sales plotted against products)
The fourth V
Variability
Apophenia
Some hints
“Nothing is so alien to the human mind as the idea of randomness.”
John Cohen
Statistics is sexy!
“The sexy job in the next ten years will be statisticians … The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill.” Hal Varian, chief economist at Google
No more samples
“I’m talking about the notion of ‘whole-population analytics’ against the entire population of data, rather than just the traditional capacity-constrained samples/subsets.”
James Kobielus, IBM
What skills are needed for Big Data?
Bayesian statistical models facilitate micro-marketing.
Discover and quantify all sources of variability in market response or in customer behavior at the level of the individual SKU or the individual consumer.
Bayesian statistics ≠ Bayesian networks
Hierarchical Bayesian statistics
• Complex systems of linear or nonlinear equations
• Often no analytic solution
• Monte Carlo simulation
• Predict quantitative or qualitative outcomes
• Incorporate sensible prior beliefs or knowledge
• Different coefficient for each unit of analysis at the “lower” level
• “Upper” level = “why behind the what”
• “Borrow” information across units when data are sparse
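The "borrow when sparse" idea above can be illustrated with a deliberately tiny sketch. This is not the presenter's model: it assumes a two-level normal model with known variances (`sigma`, `tau`), a flat prior on the upper-level mean, and a hypothetical data set `sales_by_sku` invented for illustration. A simple Gibbs sampler (one kind of Monte Carlo simulation) alternates between the upper-level mean and each unit's own coefficient.

```python
import random

# Hypothetical data: two well-observed units and one sparse unit ("C").
sales_by_sku = {
    "A": [10.1, 9.8, 10.3, 9.9, 10.0, 10.2],
    "B": [12.0, 11.7, 12.2, 11.9, 12.1, 12.3],
    "C": [20.0],
}

def gibbs_partial_pooling(data, sigma=1.0, tau=2.0,
                          n_iter=5000, burn_in=1000, seed=0):
    """Posterior means of unit effects theta_j under the assumed model
    y_ij ~ N(theta_j, sigma^2), theta_j ~ N(mu, tau^2), flat prior on mu.
    sigma and tau are treated as known, which is a simplification."""
    rng = random.Random(seed)
    units = list(data)
    # Start each unit effect at its raw sample mean.
    theta = {u: sum(data[u]) / len(data[u]) for u in units}
    totals = {u: 0.0 for u in units}
    kept = 0
    for it in range(n_iter):
        # Upper level: draw the shared mean given the current unit effects.
        mu = rng.gauss(sum(theta.values()) / len(units),
                       tau / len(units) ** 0.5)
        # Lower level: draw each unit effect given its data and the shared mean.
        for u in units:
            n = len(data[u])
            precision = n / sigma**2 + 1 / tau**2
            cond_mean = (sum(data[u]) / sigma**2 + mu / tau**2) / precision
            theta[u] = rng.gauss(cond_mean, precision ** -0.5)
        if it >= burn_in:
            kept += 1
            for u in units:
                totals[u] += theta[u]
    return {u: totals[u] / kept for u in units}

posterior_means = gibbs_partial_pooling(sales_by_sku)
```

In this sketch the sparse unit "C" is pulled toward the other units much more strongly than the well-observed unit "A", because its single observation carries little weight against the upper level.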
What could affect sales of SKUs in a store?
Lower Model: Base Price, Discounted Price, Feature, Display, Seasonality, Holidays, Weather
Lower Model: National TV, Local TV, Radio, Outdoor, Magazines, Newspapers, Social media activity, Website & search, Coupons
Upper Model: Channel, Form, Size, Geography, Ingredients, Location at point of sale, Store size, Store age, Store format, Company vs. franchise, Demographics of trading area
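One common way to write a two-level model of this kind is sketched below. The notation is an assumption made for illustration, not taken from the slides: $y_{kt}$ is a (possibly transformed) sales measure for SKU $k$ in week $t$, $x_{kt}$ is the vector of $P$ lower-model drivers, $z_k$ is the vector of $Q$ upper-model attributes, and the normal error terms are a simplifying choice. The store index is omitted for brevity.

```latex
% Lower model: each SKU k has its own coefficient vector \beta_k \in \mathbb{R}^{P}
y_{kt} = x_{kt}^{\top} \beta_k + \varepsilon_{kt},
\qquad \varepsilon_{kt} \sim N(0, \sigma^2)

% Upper model: attributes z_k \in \mathbb{R}^{Q} explain why coefficients differ
\beta_k = \Theta^{\top} z_k + u_k,
\qquad u_k \sim N(0, \Sigma),
\qquad \Theta \in \mathbb{R}^{Q \times P}
```

Under this reading, the lower level holds $P \times K$ coefficients for $K$ SKUs and the upper level holds $P \times Q$ coefficients in $\Theta$.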
Big Data in. Big Data out.
Over 1,700 stores, 208 weeks of data, ~3,000 SKUs = 1.06 billion sales numbers
Lower coefficients = Lower × N SKUs = 50 × 3,000 = 150,000
Upper coefficients = Lower × Upper = 50 × 100 = 5,000
At every iteration from 1 … 5,000 (or more)!!
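The counts on this slide can be checked with a few lines of arithmetic. The variable names are illustrative, and reading "50" and "100" as the numbers of lower-model and upper-model variables is an interpretation of the slide. The final figure (total coefficient draws over 5,000 iterations) is derived here and does not appear in the original.

```python
# Figures quoted on the slide (the store count is a lower bound: "over 1,700").
stores, weeks, skus = 1_700, 208, 3_000
lower_vars, upper_vars = 50, 100   # assumed meaning of "Lower" and "Upper"
iterations = 5_000

sales_numbers = stores * weeks * skus            # store-week-SKU observations
lower_coefficients = lower_vars * skus           # one set of lower coefficients per SKU
upper_coefficients = lower_vars * upper_vars     # upper coefficients per lower variable

# Every iteration redraws all coefficients, so the sampler's workload
# scales with (lower + upper coefficients) x iterations.
total_draws = (lower_coefficients + upper_coefficients) * iterations

print(f"{sales_numbers:,} sales numbers")
print(f"{lower_coefficients:,} lower and {upper_coefficients:,} upper coefficients")
print(f"{total_draws:,} coefficient draws over {iterations:,} iterations")
```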
Why doesn't everyone use hierarchical Bayesian statistics on Big Data?
Average & base price across sizes and channels over time
Price elasticity across sizes and channels over time
Avoid Big Data pitfalls
• Danger in Big Data is Variability
• Avoid apophenia
• Use theory & statistics & avoid mindless data mining
• Full dataset analytics, not samples
• Hierarchical Bayesian statistics quantify variability and permit very deep dives on marketing elasticities
• Move Big Data analytics beyond a hardware and software solution to a change in business philosophy where decisions are data-driven
Q&A
Steve Cohen In4mation insights
Ray Poynter Vision Critical University
in4mation insights & Steve Cohen
in4mation insights
• Marketing analytics, research, and technology consulting firm
• Marketing Mix Modeling, Price/Promotion Optimization, advanced Choice models, Assortment Optimization, Consumer and Market Segmentation, and Customer Lifetime Value modeling
• Hierarchical Bayesian statistical models, parallel code written in C++, & high-performance computation cluster applied to Big Data
Steve Cohen
• Winner 2010 AMA Parlin Award for lifetime achievement in marketing research
• Winner 2012 NextGen MR Award as Individual Disruptive Innovator
• First to conduct Choice-based Conjoint Analysis in USA (1983)
• Introduced Menu-based Conjoint Analysis for BYO tasks (2001)
• Won 3 awards for introducing Maximum Difference Scaling (2002)
Steve Cohen
office: 781-444-1237 x104
mobile: 617-510-2144
web: www.in4ins.com
LinkedIn: www.linkedin.com/in/stevenhcohen