What is Predictive Modeling?



Predictive Modeling for Donor Acquisition August 8, 2017

Rodger Devine • Senior Executive Director of Business Intelligence at the Dornsife College of Letters, Arts and Sciences, University of Southern California • 15+ years in Higher Education • Previously Director of Information, Analytics and Annual Giving at Michigan Ross School of Business • M.S. in Information, Analysis and Retrieval and B.A. from University of Michigan Page 1

University of Southern California • Founded in 1880 • Los Angeles, California • Over 375,000 living alumni • 50%+ alumni in California • 21 Schools and Units • 42% USC Undergraduate Alumni Participation Rate – Dornsife is the largest academic unit with over 80 departments, programs, centers and institutes – 32% Dornsife Undergraduate Alumni Participation Rate • Mascot: Traveler Page 2

Agenda • Understanding the basics • Building a model using Excel • Applying scores to acquisition segments • Allocating resources based on the model • Assessing the model and your acquisition efforts

Page 3

UNDERSTANDING THE BASICS Page 4

POLL: How would you describe your overall level of familiarity with Predictive Modeling?

• No Experience. Help! • Familiar with the concepts • Have some experience with statistical tools and techniques • Regularly build predictive models

Page 5

What is Data Analytics?

Source: https://en.wikipedia.org/wiki/DIKW_pyramid Page 6

What is Analytics Maturity? • Descriptive (Hindsight) – What happened? Why? • Predictive (Insight) – What will happen? • Prescriptive (Foresight) – What should we do?

Page 7

What is Predictive Modeling? • A process that uses statistical analysis and data mining techniques on past data (observations) to predict future outcomes (expectancies) • A decision support tool that can help guide annual giving segmentation strategy, planning, and resource allocation

Page 8

Why Use Predictive Modeling? • Provides actionable insights to help identify, compare, and prioritize prospects, outreach and solicitation efforts • Promotes increased revenue and efficiencies by: – Finding prospects most likely to give – Guiding strategy and ask amounts – Helping to identify new opportunities and prioritize activities

Page 9

What is Linear Regression? • Statistical method for analyzing a dataset and building a predictive model • Offers many practical uses for annual giving prediction and forecasting • Linear regression models describe the relationship between a dependent (outcome) variable Y and independent (input) variable X

Page 10

What is a Linear Regression Model? • An equation that describes a line with y-intercept α and slope β

Ŷi = α + βXi • Ŷi = predicted value of Y for observation i • α (“alpha”) = y-intercept (level of Y when X is 0) • β (“beta”) = slope or coefficient: the increase or decrease (↑↓) in Ŷi for each one-unit increase in Xi

Page 11

What is Multiple Linear Regression? • A linear equation that predicts the value of a dependent variable (outcome) using two or more independent variables (predictors, inputs, explanatory, covariates, regressors, etc.)

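Ŷi = α + β1X1 + β2X2 + … + βkXk • Ŷi = predicted value of Y for observation i • α (“alpha”) = y-intercept • β1 (“beta 1”) = coefficient: the increase or decrease (↑↓) in Ŷi for each one-unit increase in X1, holding the other variables constant • k = number of independent variables Page 12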

Dependent vs. Independent Variables • Examples of Dependent Variables (Outcomes): – Total Annual Giving Amount (Predicted) – Number of Annual Gifts (Expected) • Examples of Independent Variables (Predictors): – Age – Level of Degree Attainment (Undergraduate, Graduate) – Contact and/or Engagement Frequency – Marital Status – Gender Page 13

What is the Method of Least Squares? • In linear regression, we use the least squares method to obtain the line or equation of best fit • Least squares is one of the most commonly used fitting techniques: it chooses the intercept and coefficients that minimize the sum of squared differences between observed and predicted values • Each coefficient is a numerical estimate of the change in the dependent variable for a one-unit increase in that independent variable
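As a minimal sketch of what least squares computes for a single predictor, the closed-form solution can be written in a few lines of Python. The function name and the sample ages and giving totals are made up for illustration; they are not taken from the webinar spreadsheet.

```python
def fit_simple_least_squares(x, y):
    """Return (alpha, beta) minimizing the sum of squared residuals for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical illustration data (not from the webinar workbook).
ages = [25, 30, 35, 40, 45]
giving = [100, 150, 200, 250, 300]
alpha, beta = fit_simple_least_squares(ages, giving)
print(alpha, beta)
```

Because the sample data lie exactly on a line, the fitted slope and intercept reproduce that line; real giving data will leave non-zero residuals.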

Page 14

Multiple Linear Regression: Line of Best Fit

Ŷi = α + β1X1 + β2X2 + … + βkXk
Dependent Variable (Predicted Outcome) = Intercept + Independent Variable1 * Coefficient1 + Independent Variable2 * Coefficient2
Total Annual Giving = Intercept + Age * Coefficient (example) + Degree Level * Coefficient (example) Page 15

Important Things to Keep in Mind • Correlation does not equal causation • Linear regression assumes a linear relationship between the outcome variable and the independent variables • Linear regression assumes the error term (“residuals”, “noise”, “random error”), the deviation between predicted and observed values, has constant variance across observations • Significance tests in multiple regression (p-values, F-test) assume the residuals are approximately normally distributed; the variables themselves do not need to be Page 16

BUILDING A MODEL USING EXCEL Page 17

Define Your Business Purpose and Context • Annual Giving Scenario: – Use past annual giving data (training data) to build a predictive model that can help guide donor acquisition • Predictive Model Purpose: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) Page 18

Translate Purpose to Specification • Purpose: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) • Specification: – Dependent Variable (Outcome): Predicted giving total ($) – Independent Variables (Predictors): ? – Method: Multiple Linear Regression (two or more independent variables) ? – Data: Annual giving data (training and test set) ? – Goal: Create annual acquisition segments & recommendations Page 19

Collect, Sort & Document Data • Examples of variables (“features”) available: – Age – Degree Type – Contact and/or Degree of Engagement – Marital Status – Gender – Total Giving

Page 20

Prepare Your Data: Workflow Guidelines • 80% of your effort will typically involve pulling, preparing, cleaning and organizing your data for analysis • 20% of your effort will involve conducting analysis, building, testing, evaluating and interpreting your models • It is useful to document all data variables and methods of analysis to allow others to reproduce your results and guide future work Page 21

Inspect Your Data 1. Open the “Predictive Model Data” spreadsheet 2. A set of training data is provided in the worksheet labeled “Training Data” to build your regression model Page 22

Summarize Your Data: Part 1 1. Select the “Data Analysis” Tool 2. Select “Descriptive Statistics”

Page 23

Summarize Your Data: Part 2 3. Select “Input Range” 4. Select “Labels in First Row”

Page 24

Summarize Your Data: Part 3 5. Open “SUM” Worksheet 6. Review Summary Statistics

Page 25

Analyze Your Training Data

Data Insights (Examples): • Average age (Mean) is 31, ranging from 22 to 72 • Annual giving prospect count (N=250) • Average contact count is 1.1 • Average giving = $408, Median = $200, Max = $15,000 • Total giving for this training data set of 250 donors = $102,000 Page 26
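For readers working outside Excel, the same kind of summary can be sketched with Python's standard `statistics` module. The `summarize` helper and the five giving amounts are hypothetical; the real values live in the “Training Data” worksheet.

```python
import statistics


def summarize(values):
    """Return the basic descriptive statistics reviewed on the summary worksheet."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
    }


# Hypothetical giving totals, for illustration only.
giving = [50, 100, 200, 250, 1400]
summary = summarize(giving)
print(summary)
```

As in the training data, a single large gift pulls the mean well above the median, which is why both are worth reviewing.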

Analyze Variable Relationships: Part 1 1. Select the “Data Analysis” Tool 2. Select “Correlation”

Page 27

Analyze Variable Relationships: Part 2 3. Select “Input Range” 4. Select “Labels in First Row”

Page 28

Analyze Variable Relationships: Part 3 5. Open “COR” Worksheet 6. Review Correlation Analysis

Correlation Value Range: -1 (negative) to 1 (positive)
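The correlation values reviewed here are Pearson correlation coefficients. A minimal sketch of the calculation, with made-up numbers and a hypothetical function name, shows why the result always falls between -1 and 1:

```python
import math


def pearson_correlation(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Co-variation of x and y, scaled by the variation of each variable.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: a perfectly increasing and a perfectly decreasing relationship.
rising = pearson_correlation([1, 2, 3], [2, 4, 6])
falling = pearson_correlation([1, 2, 3], [6, 4, 2])
print(rising, falling)
```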

Page 29

Interpret Your Variable Relationships

Total Giving Variable Correlation Analysis (sorted top down): • Age: 0.627 • Contact Count: 0.539 • Level of Degree Attainment: 0.527 • Gender: 0.127 • Marital Status: 0.063 Data Insight: • Age, contact activity and level of degree attainment are most strongly associated with total giving in this training dataset Page 30

Split Your Data into Training and Test Sets 1. Open the spreadsheet called “Predictive Model Data” 2. A set of training data is provided in the worksheet labeled “Training Data” to build your regression model 3. A set of test data is provided in a separate spreadsheet called “Acquisition Test Data” 4. You will build your regression model in the “Predictive Model Data” spreadsheet and use your predictive model (equation) on the test data to predict annual giving expectancies based on indicators available in the training data

Page 31

Build Your Model: Part 1 1. Select the “Data Analysis” Tool 2. Select “Regression”

Page 32

Build Your Model: Part 2 3. Select the “Input Y Range” (Dependent Variable) 4. Select “Input X Range” (Independent Variables)

Page 33

Inspect Your Model

Page 34

Review Your Regression Output

Page 35

Determine Equation of Best Fit

• In linear regression, we use the least squares method to obtain the equation of best fit – Dependent Variable = Total Giving – Independent Variables = Age, Degree Level, Contact Count, Marital Status, Gender
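To make the fitting step less of a black box, here is a rough pure-Python sketch of multiple linear regression via the normal equations. It is not how Excel implements its Regression tool; the function names and the six sample rows are assumptions made for illustration, and the sample outcome is constructed from known coefficients so the result can be checked.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [intercept, coef_1, ..., coef_k] that minimize squared error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical (age, degree_level) rows; totals built from known coefficients.
rows = [(25, 1), (30, 2), (35, 1), (40, 2), (50, 1), (45, 2)]
totals = [-100 + 10 * age + 50 * degree for age, degree in rows]
coefficients = fit_multiple_regression(rows, totals)
print(coefficients)
```

The fit recovers the intercept and the two coefficients used to build the sample totals, which is a quick sanity check before trusting the approach on real data.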

Page 36

Selecting Variables for Your Model

• Age and Degree Level variables are statistically significant in the training data set (p < 0.05) • We select Age and Degree Level as the independent variables (predictors) for our model

Page 37

Identify Equation of Best Fit

• In linear regression, we use the least squares method to obtain the equation of best fit:
Ŷi = α + β1X1 + β2X2 + … + βkXk
Predicted Total Giving = Intercept + AGE * (60.086) + Degree Level * (319.923)

Page 38

Translate Equation into Excel Formula • Predicted Total Giving = Intercept (-1925.81) + AGE * (60.086) + Degree Level * (319.923)
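The same equation can be sketched as a small Python function. The intercept and coefficients are the ones stated above; the numeric coding of Degree Level (here 1 for one level, 2 for the next) is an assumption, since the slides do not say how the workbook encodes it.

```python
INTERCEPT = -1925.81
AGE_COEFFICIENT = 60.086
DEGREE_LEVEL_COEFFICIENT = 319.923


def predict_total_giving(age, degree_level):
    """Apply the equation of best fit; degree_level is assumed to be numerically coded."""
    return INTERCEPT + age * AGE_COEFFICIENT + degree_level * DEGREE_LEVEL_COEFFICIENT


# Hypothetical prospect: age 40, degree level coded as 1.
example = predict_total_giving(age=40, degree_level=1)
print(round(example, 2))  # 797.55
```

Worked by hand: -1925.81 + 40 * 60.086 + 1 * 319.923 = -1925.81 + 2403.44 + 319.923 = 797.553. Note that the equation returns negative values for young prospects (for example, age 22), so scores are best read as a ranking aid rather than literal gift amounts.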

Page 39

APPLYING SCORES TO ACQUISITION SEGMENTS Page 40

POLL: What area are you most interested in using analytics to help?

• Selecting acquisition segments • Assigning ask amounts • Everything

Page 41

Collect Your Test Data • Open “Acquisition Test Data” spreadsheet • FYI, there are 100 non-donor prospects in the test data

Page 42

Apply Your Model to Test Data

Page 43

Apply Your Model to Test Data 1. Open “Acquisition Test Data” spreadsheet 2. Open “AcquisitionTestData” worksheet 3. Open Cell G2 and review Equation of Best Fit 4. Equation of Best Fit = Predictive Model Ŷi = α + β1X1 + β2X2 + … + βkXk

Predicted Total Giving = Intercept + AGE * (60.086) + Degree Level * (319.923) Page 44

Organize Your Annual Giving Segments • Retention (LYBUNTS, SYBUNTS, etc.) • Upgrade (LYBUNTS, SYBUNTS, etc.) • Acquisition (Non-donors) • Reactivation (Lapsed donors)

Page 45

Identify Acquisition Channels and Goals • Acquisition: Outreach or solicitation of prospects to acquire new members or donors – Typically a more expensive, long-term investment with lower initial ROI than retention and upgrade segments • Channels: – Direct Mail – Email – Phone • Predictive Modeling Goals: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) Page 46

Define Acquisition Strategy – Examples • Increase Direct Mail Revenue – Leadership vs. Mass Appeal, Peer Solicitation • Improve Email Targeting and Relevance – E-Solicitation, Event Invitation • Leverage Phone Outreach – Engagement Survey, Event Reminder

Page 47

Define Acquisition Segments – Examples • Constituent Type (Alumni, Parent, Friend) • Degree Type (Undergraduate/Graduate) • Alumni Major, Affinity Groups, etc. • Geography or Region • Class/Reunion (Degree Year) • Age (Generation) • Other Bio/Demographic Features • Work and/or Industry Page 48

Map Channel, Strategy and Segments

Channels • Direct Mail • Email • Phone

Strategy • Leadership Solicitation • Event Invitation • Peer Solicitation

Segments • Age (Generation) • Giving Level

Page 49

ALLOCATING RESOURCES BASED ON THE MODEL Page 50

Apply the Model • Reminder – Predictive Modeling Goals: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) • Predictive Modeling Application: – Use Predictive Modeling Scores (e.g. estimated total giving values) to SORT, RANK and PRIORITIZE your prospects into different segmentation strategies, channels, and program activities Page 51
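The sort-and-rank step can be sketched in Python as follows. The coefficients are the ones from the equation of best fit; the prospect records, their field names, and the numeric Degree Level coding are hypothetical.

```python
def predict_total_giving(age, degree_level):
    # Coefficients as stated in the deck; numeric degree_level coding is an assumption.
    return -1925.81 + age * 60.086 + degree_level * 319.923


def rank_prospects(prospects):
    """Score each prospect and return them sorted from highest to lowest score."""
    scored = [
        dict(p, predicted_giving=predict_total_giving(p["age"], p["degree_level"]))
        for p in prospects
    ]
    return sorted(scored, key=lambda p: p["predicted_giving"], reverse=True)


# Hypothetical non-donor prospects (not from the test data workbook).
prospects = [
    {"id": "A", "age": 30, "degree_level": 1},
    {"id": "B", "age": 55, "degree_level": 2},
    {"id": "C", "age": 40, "degree_level": 2},
]
ranked = rank_prospects(prospects)
print([p["id"] for p in ranked])  # ['B', 'C', 'A']
```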

Allocate Resources: Segmentation Strategy • Open “Acquisition Test Data” spreadsheet • Open “AcquisitionTestDataSegmentation” worksheet • Review “Segmentation Strategy” data-driven recommendations in Column I

Page 52

Allocate Resources: Leadership Event Invitation

• Leadership Event Invitation w/ Reception, Solicitation and Outreach Segment (Non-Donor Prospects with Predictive Modeling Scores of $1,000+ Annual Giving Expectancy) • Leadership Event Invitation, Upgrade Solicitation and Outreach Segment (Non-Donor Prospects with Predictive Modeling Scores of $500+ Annual Giving Expectancy) Page 53

Allocate Resources: Direct Mail & Email

• Direct Mail Solicitation and Event Invitation (Non-Donor Prospects with Predictive Modeling Scores of $100-$500 Annual Giving Expectancy) • E-solicitation and Event Invitation (Non-Donor Prospects with Predictive Modeling Scores of $50-$100 Annual Giving Expectancy) Page 54

Allocate Resources: Engagement Survey

Event Invitation and Alumni Engagement Survey (Non-Donor Prospects with Predictive Modeling Scores of $0-50 Annual Giving Expectancy)
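The score bands from the preceding slides can be expressed as a simple lookup. This is a sketch under two assumptions that the slides leave open: each band includes its lower bound (so exactly $500 counts as $500+), and prospects with negative predicted scores fall into the lowest band.

```python
# (minimum score, segment) pairs, checked from the highest band down.
SEGMENT_BANDS = [
    (1000, "Leadership Event Invitation w/ Reception, Solicitation and Outreach"),
    (500, "Leadership Event Invitation, Upgrade Solicitation and Outreach"),
    (100, "Direct Mail Solicitation and Event Invitation"),
    (50, "E-solicitation and Event Invitation"),
]
LOWEST_SEGMENT = "Event Invitation and Alumni Engagement Survey"


def assign_segment(score):
    """Map a predicted annual giving score to an acquisition segment."""
    for minimum, segment in SEGMENT_BANDS:
        if score >= minimum:  # assumption: lower bounds are inclusive
            return segment
    return LOWEST_SEGMENT  # assumption: also used for negative scores


print(assign_segment(1200))
print(assign_segment(75))
print(assign_segment(-20))
```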

Page 55

ASSESSING THE MODEL AND YOUR ACQUISITION EFFORTS Page 56

Evaluate Your Model

Page 57

Evaluate Your Model with R² • The “R²” or “R-Squared” statistic (or Coefficient of Determination) quantifies the overall strength of the linear relationship in your data • R² describes the proportion of variation observed in the dependent (outcome or response) variable that is explained by the independent (input) variables • The closer the R² value is to 1, the larger the share of that variation your model explains Page 58
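A short sketch of the R² calculation, using made-up observed and predicted values and a hypothetical function name:

```python
def r_squared(observed, predicted):
    """Return 1 minus the ratio of unexplained variation to total variation."""
    mean_observed = sum(observed) / len(observed)
    # Variation the model leaves unexplained.
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation of the outcome around its mean.
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total


# Hypothetical values, for illustration only.
observed = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
r2 = r_squared(observed, predicted)
print(round(r2, 3))  # 0.992
```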

Keep R² in Context • You may encounter low R² values (below 0.5) even when predictors are statistically significant • Fields such as the social sciences, or those attempting to predict human behavior (e.g. philanthropy), frequently encounter R² values below 0.5 since human behavior is often more difficult to predict than natural or physical processes • It is advisable to use the Adjusted R² value when using more than one independent variable, since it accounts for the number of predictors Page 59
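Adjusted R² follows from R², the number of observations n, and the number of predictors k. In the sketch below, n = 250 and k = 2 echo the training set size and the two selected predictors, while the R² of 0.45 is purely hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Penalize R-squared for the number of predictors k, given n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical R-squared of 0.45 with 250 observations and 2 predictors.
adjusted = adjusted_r_squared(0.45, n=250, k=2)
print(round(adjusted, 4))  # 0.4455
```

Adding predictors that do not improve the fit lowers the adjusted value, which is what makes it the safer number to compare across models.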

Evaluate Your Model with F-test • F-test indicates whether the proposed relationship between the dependent and independent variables in your model is statistically reliable • A significant F-test indicates the observed R² value is reliable and not the spurious result of random noise in the data set • F-test is useful when your goal is to make predictions, forecasts, etc. Page 60
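For the overall F-test, the test statistic can be derived from R², the number of observations n, and the number of predictors k. The sketch below stops at the statistic: turning it into a p-value requires the F distribution, which Excel's regression output reports as “Significance F”. The R² value used is hypothetical.

```python
def f_statistic(r2, n, k):
    """Overall F statistic: explained variation per predictor over unexplained per residual degree of freedom."""
    explained = r2 / k
    unexplained = (1 - r2) / (n - k - 1)
    return explained / unexplained


# Hypothetical R-squared of 0.45 with 250 observations and 2 predictors.
f_value = f_statistic(0.45, n=250, k=2)
print(round(f_value, 2))  # 101.05
```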

Evaluate Your Model with RMSE • Root Mean Square Error (RMSE) is the square root of the variance of the residuals • Residuals are the differences between the observed and predicted outcomes • RMSE is a metric, expressed in the same units as the outcome (here, dollars), that describes how close the observed data points are to the predicted values
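A minimal RMSE sketch with made-up values follows. It divides by the number of observations; some tools (including the standard error in regression output) divide by the residual degrees of freedom instead, so small differences between the two are expected.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: square root of the average squared residual."""
    squared_residuals = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_residuals) / len(squared_residuals))


# Hypothetical values, in dollars.
observed = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
error = rmse(observed, predicted)
print(error)  # 10.0
```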

Page 61

R², F-test and RMSE Metrics • R² provides a relative measure of the linear response variation explained by your model • A significant F-test indicates that the observed R² value is reliable • RMSE provides an absolute measure of linear fit • RMSE is a useful measure of how accurately the model predicts the response

Page 62

Evaluate Your Acquisition Efforts • Define your acquisition segments, channels, and strategy recommendations • Tag a pool of non-donor prospects (if possible) and experiment with your acquisition strategies, channels, and segments • Track your acquisition results for non-donor segments in terms of new donor counts and dollars • Compare your predicted annual giving amount vs. observed results at the individual donor level for all acquisition segments
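Tracking predicted versus observed results by segment can be sketched as a simple aggregation. The record layout (`segment`, `predicted`, `observed`) and the sample values are hypothetical; a prospect counts as a new donor here when the observed amount is above zero.

```python
def summarize_segments(records):
    """Total prospects, new donors, predicted dollars, and observed dollars per segment."""
    summary = {}
    for record in records:
        totals = summary.setdefault(
            record["segment"],
            {"prospects": 0, "new_donors": 0, "predicted": 0.0, "observed": 0.0},
        )
        totals["prospects"] += 1
        if record["observed"] > 0:  # assumption: any gift makes a new donor
            totals["new_donors"] += 1
        totals["predicted"] += record["predicted"]
        totals["observed"] += record["observed"]
    return summary


# Hypothetical tracking records for tagged non-donor prospects.
records = [
    {"segment": "Direct Mail", "predicted": 200.0, "observed": 150.0},
    {"segment": "Direct Mail", "predicted": 300.0, "observed": 0.0},
    {"segment": "Email", "predicted": 75.0, "observed": 100.0},
]
results = summarize_segments(records)
print(results)
```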

Page 63

Acquisition Model Recommendations • Seek to update your predictive model on a regularly scheduled basis (e.g. yearly) • Review your acquisition segments in terms of new donors and dollars on a monthly basis to identify new trends, patterns and correlations • If one segment yields results well beyond what the model predicted, analyze all available annual giving data to identify other significant variable correlations and indicators you can potentially incorporate into your predictive model • Refine annual giving program strategy for acquisition segments based on results and ROI Page 64

Key Takeaways • Start where you are and work with what you have available to you • Define the business purpose and context of your analysis before building your model • Document assumptions, data, challenges (current state), and enhancement ideas (desired state) • Become a learn-it-all, not a know-it-all

Page 65

Additional Takeaways • Use your predictive model results to guide strategy and annual giving segmentation recommendations • Evaluate the power and effectiveness of your models and your acquisition efforts by identifying segments and tracking results • Invite other data-minded colleagues to join you on your predictive modeling adventures • Be patient and remember predictive modeling is detective work and an iterative process of discovery • Build, Test, Run, Evaluate… and Repeat! Page 66

Additional Methods and Resources • Linear Regression Assumptions and Diagnostics – https://en.wikipedia.org/wiki/Linear_regression • Method of Least Squares – https://en.wikipedia.org/wiki/Least_squares • RFM Scores – https://en.wikipedia.org/wiki/RFM_(customer_value) • Logistic Regression for Predicting Categorical Outcomes – https://en.wikipedia.org/wiki/Logistic_regression Page 67
