Predictive Modeling for Donor Acquisition August 8, 2017
Rodger Devine • Senior Executive Director of Business Intelligence at the Dornsife College of Letters, Arts and Sciences, University of Southern California • 15+ years in Higher Education • Previously Director of Information, Analytics and Annual Giving at Michigan Ross School of Business • M.S. in Information, Analysis and Retrieval and B.A. from University of Michigan Page 1
University of Southern California • Founded in 1880 • Los Angeles, California • Over 375,000 living alumni • 50%+ alumni in California • 21 Schools and Units • 42% USC Undergraduate Alumni Participation Rate – Dornsife is the largest academic unit with over 80 departments, programs, centers and institutes – 32% Dornsife Undergraduate Alumni Participation Rate • Mascot: Traveler Page 2
Agenda • Understanding the basics • Building a model using Excel • Applying scores to acquisition segments • Allocating resources based on the model • Assessing the model and your acquisition efforts
Page 3
UNDERSTANDING THE BASICS Page 4
POLL: How would you describe your overall level of familiarity with Predictive Modeling?
• No Experience. Help! • Familiar with the concepts • Have some experience with statistical tools and techniques • Regularly build predictive models
Page 5
What is Data Analytics?
Source: https://en.wikipedia.org/wiki/DIKW_pyramid Page 6
What is Analytics Maturity?
• Descriptive (Hindsight) – What happened? Why?
• Predictive (Insight) – What will happen?
• Prescriptive (Foresight) – What should we do?
Page 7
What is Predictive Modeling? • A process that uses statistical analysis and data mining techniques on past data (observations) to predict future outcomes (expectancies) • A decision support tool that can help guide annual giving segmentation strategy, planning, and resource allocation
Page 8
Why Use Predictive Modeling? • Provides actionable insights to help identify, compare, and prioritize prospects, outreach and solicitation efforts • Promotes increased revenue and efficiencies by: – Finding prospects most likely to give – Guiding strategy and ask amounts – Helping to identify new opportunities and prioritize activities
Page 9
What is Linear Regression? • Statistical method for analyzing a dataset and building a predictive model • Offers many practical uses for annual giving prediction and forecasting • Linear regression models describe the relationship between a dependent (outcome) variable Y and independent (input) variable X
Page 10
What is a Linear Regression Model? • An equation that describes a line with y-intercept α and slope β
yi = α + βxi • yi = predicted value of Y for observation i • α (“alpha”) = y-intercept (level of Y when X is 0) • β (“beta”) = slope or coefficient: the amount yi increases (↑) or decreases (↓) for each one-unit increase in xi
Page 11
What is Multiple Linear Regression? • A linear equation that predicts the value of a dependent variable (outcome) using two or more independent variables (predictors, inputs, explanatory, covariates, regressors, etc.)
Ŷi = α + β1X1 + β2X2 + … + βkXk • Ŷi = predicted values of Y • α (“alpha”) = y-intercept • β1 (“beta 1”) = coefficient: the amount Ŷi increases (↑) or decreases (↓) for each one-unit increase in X1, holding the other variables constant Page 12
Dependent vs. Independent Variables • Examples of Dependent Variables (Outcomes): – Total Annual Giving Amount (Predicted) – Number of Annual Gifts (Expected) • Examples of Independent Variables (Predictors): – Age – Level of Degree Attainment (Undergraduate, Graduate) – Contact and/or Engagement Frequency – Marital Status – Gender Page 13
What is the Method of Least Squares? • In linear regression, we use the least squares method to obtain the line or equation of best fit • Least squares is one of the most commonly used techniques for fitting a line to data: it chooses the intercept and coefficients that minimize the sum of squared differences between observed and predicted values • Each coefficient is a numerical estimate of how much the dependent variable changes for a one-unit increase in that independent variable
Page 14
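A minimal sketch of least squares for a single predictor, using the closed-form solution for the slope and intercept. The function name and the ages and giving amounts are hypothetical, made up only to illustrate the arithmetic; they are not the workshop data.

```python
def fit_simple_ols(xs, ys):
    """Return (alpha, beta) that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of x and y divided by variation of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    # The fitted line always passes through the point of means
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical training data: age and total annual giving ($)
ages = [25, 30, 35, 40, 45]
giving = [100, 150, 200, 250, 300]

alpha, beta = fit_simple_ols(ages, giving)
print(f"giving = {alpha:.2f} + {beta:.2f} * age")
```

Because the made-up points fall exactly on a line, the fit recovers that line; real giving data will scatter around it.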
Multiple Linear Regression: Line of Best Fit
Ŷi = α + β1X1 + β2X2 + … + βkXk
Dependent Variable (Predicted Outcome) = Intercept + Independent Variable1 * Coefficient1 + Independent Variable2 * Coefficient2
Total Annual Giving = Intercept + Age * Coefficient (example) + Degree Level * Coefficient (example) Page 15
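The equation above is just an intercept plus a sum of variable-times-coefficient terms. A small sketch, assuming a hypothetical helper named predict and made-up coefficient values:

```python
def predict(intercept, coefficients, values):
    """Evaluate intercept + b1*x1 + b2*x2 + ... + bk*xk."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical coefficients for (age, degree level); not fitted values
example_intercept = 50.0
example_coefficients = [10.0, 100.0]

# A prospect aged 30 with degree level 2
example_prediction = predict(example_intercept, example_coefficients, [30, 2])
print(example_prediction)
```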
Important Things to Keep in Mind • Correlation does not equal causation • Linear regression assumes a linear relationship between the outcome variable and the independent variables • Linear regression assumes the error term (“residuals”, “noise”, “random error”), the deviation between predicted and observed values, has constant variance across the range of predictions • Multiple regression assumes that the residuals are approximately normally distributed Page 16
BUILDING A MODEL USING EXCEL Page 17
Define Your Business Purpose and Context • Annual Giving Scenario: – Use past annual giving data (training data) to build a predictive model that can help • Predictive Model Purpose: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) Page 18
Translate Purpose to Specification • Purpose: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) • Specification: – Dependent Variable (Outcome): Predicted giving total ($) – Independent Variables (Predictors): ? – Method: Multiple Linear Regression (two or more variables) ? – Data: Annual giving data (training and test set) ? – Goal: Create annual acquisition segments & recommendations Page 19
Collect, Sort & Document Data • Examples of variables (“features”) available: – Age – Degree Type – Contact and/or Degree of Engagement – Marital Status – Gender – Total Giving
Page 20
Prepare Your Data: Workflow Guidelines • 80% of your effort will typically involve pulling, preparing, cleaning and organizing your data for analysis • 20% of your effort will involve conducting analysis, building, testing, evaluating and interpreting your models • It is useful to document all data variables and methods of analysis to allow others to reproduce your results and guide future work Page 21
Inspect Your Data 1. Open the “Predictive Model Data” spreadsheet 2. A set of training data is provided in the worksheet labeled “Training Data” to build your regression model Page 22
Summarize Your Data: Part 1 1. Select the “Data Analysis” Tool 2. Select “Descriptive Statistics”
Page 23
Summarize Your Data: Part 2 3. Select “Input Range” 4. Select “Labels in First Row”
Page 24
Summarize Your Data: Part 3 5. Open “SUM” Worksheet 6. Review Summary Statistics
Page 25
Analyze Your Training Data
Data Insights (Examples): • Average age (Mean) is 31, ranging between 22 and 72 • Annual giving prospect count (N=250) • Average contact count is 1.1 • Average giving = $408, Median = $200, Max = $15,000 • Total giving for this training data set of 250 donors = $102,000 Page 26
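For readers working outside Excel, the same kind of summary can be sketched with Python's standard statistics module. The giving values below are hypothetical and much smaller than the 250-row training set.

```python
import statistics

# Hypothetical total giving amounts ($) for six donors
giving = [50, 100, 200, 200, 250, 1000]

summary = {
    "count": len(giving),
    "mean": statistics.mean(giving),
    "median": statistics.median(giving),
    "min": min(giving),
    "max": max(giving),
    "sum": sum(giving),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note how one large gift pulls the mean well above the median, the same pattern seen in the insights above (average $408 vs. median $200).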
Analyze Variable Relationships: Part 1 1. Select the “Data Analysis” Tool 2. Select “Correlation”
Page 27
Analyze Variable Relationships: Part 2 3. Select the “Input Range” 4. Select “Labels in First Row”
Page 28
Analyze Variable Relationships: Part 3 5. Open “COR” Worksheet 6. Review Correlation Analysis
Correlation Value Range: -1 (negative) to 1 (positive)
Page 29
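A sketch of the Pearson correlation coefficient, the standard measure behind a -1 to 1 correlation table. The function name and sample values are assumptions for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: giving rises with age, falls with "years since contact"
ages = [25, 30, 35, 40, 45]
giving = [100, 150, 200, 250, 300]
years_since_contact = [5, 4, 3, 2, 1]

print(pearson_r(ages, giving))
print(pearson_r(years_since_contact, giving))
```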
Interpret Your Variable Relationships
Total Giving Variable Correlation Analysis (sorted top down): • Age: 0.627 • Contact Count: 0.539 • Level of Degree Attainment: 0.527 • Gender: 0.127 • Marital Status: 0.063 Data Insight: • Age, contact activity and level of degree attainment are most strongly associated with total giving in this training dataset Page 30
Split Your Data into Training and Test Sets 1. Open the spreadsheet called “Predictive Model Data” 2. A set of training data is provided in the worksheet labeled “Training Data” to build your regression model 3. A set of test data is provided in a separate spreadsheet called “Acquisition Test Data” 4. You will build your regression model in the “Predictive Model Data” spreadsheet and use your predictive model (equation) on the test data to predict annual giving expectancies based on indicators available in the training data
Page 31
Build Your Model: Part 1 1. Select the “Data Analysis” Tool 2. Select “Regression”
Page 32
Build Your Model: Part 2 3. Select the “Input Y Range” (Dependent Variable) 4. Select “Input X Range” (Independent Variables)
Page 33
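To show what a regression tool computes from the Y and X ranges, here is a rough pure-Python sketch that solves the least squares normal equations by Gauss-Jordan elimination. It is not how Excel implements its Regression tool, and the function name and data are hypothetical.

```python
def fit_ols(rows, ys):
    """Least squares fit. Returns [intercept, b1, ..., bk]."""
    # Prepend a column of 1s so the first coefficient is the intercept
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented normal equations: (X'X) b = X'y
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(X, ys))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical training rows: (age, degree level)
rows = [(25, 1), (30, 2), (35, 1), (40, 2), (45, 1), (50, 2)]
# Synthetic outcome built from a known equation so the fit can be checked
ys = [-100 + 10 * age + 50 * degree for age, degree in rows]

coefficients = fit_ols(rows, ys)
print(coefficients)
```

Because the outcome was generated from a known equation with no noise, the fit should return values very close to -100, 10 and 50.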
Inspect Your Model
Page 34
Review Your Regression Output
Page 35
Determine Equation of Best Fit
• In linear regression, we use the least squares method to obtain the equation of best fit – Dependent Variable = Total Giving – Independent Variables = Age, Degree Level, Contact Count, Marital Status, Gender
Page 36
Selecting Variables for Your Model
• Age and Degree Level variables are statistically significant in the training data set (p < 0.05) • We select Age and Degree Level as the independent variables (predictors) for our model
Page 37
Identify Equation of Best Fit
• In linear regression, we use the least squares method to obtain the equation of best fit:
Ŷi = α + β1X1 + β2X2 + … + βkXk
Predicted Total Giving = Intercept + AGE * (60.086) + Degree Level * (319.923)
Page 38
Translate Equation into Excel Formula • Predicted Total Giving = Intercept (-1925.81) + AGE * (60.086) + Degree Level * (319.923)
Page 39
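The equation of best fit can be sketched as a small function using the intercept and coefficients from the slide. In a worksheet the same arithmetic would look something like =-1925.81 + B2*60.086 + C2*319.923, where the column letters are placeholders. The prospect records and the degree level coding (1 = undergraduate, 2 = graduate) are assumptions, not taken from the workshop files.

```python
# Values from the regression output shown on the slide
INTERCEPT = -1925.81
AGE_COEFFICIENT = 60.086
DEGREE_COEFFICIENT = 319.923


def predicted_total_giving(age, degree_level):
    """Apply the equation of best fit to one prospect."""
    return INTERCEPT + age * AGE_COEFFICIENT + degree_level * DEGREE_COEFFICIENT


# Hypothetical non-donor prospects; degree coding is an assumption
prospects = [
    {"id": "A", "age": 30, "degree_level": 1},
    {"id": "B", "age": 45, "degree_level": 2},
]

for prospect in prospects:
    score = predicted_total_giving(prospect["age"], prospect["degree_level"])
    prospect["score"] = round(score, 2)
    print(prospect["id"], prospect["score"])
```

Because the intercept is negative, younger prospects can receive a negative score. Treat such scores as a low ranking rather than a literal dollar amount.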
APPLYING SCORES TO ACQUISITION SEGMENTS Page 40
POLL: What area are you most interested in using analytics to help?
• Selecting acquisition segments • Assigning ask amounts • Everything
Page 41
Collect Your Test Data • Open “Acquisition Test Data” spreadsheet • FYI, there are 100 non-donor prospects in the test data
Page 42
Apply Your Model to Test Data
Page 43
Apply Your Model to Test Data 1. Open “Acquisition Test Data” spreadsheet 2. Open “AcquisitionTestData” worksheet 3. Open Cell G2 and review Equation of Best Fit 4. Equation of Best Fit = Predictive Model Ŷi = α + β1X1 + β2X2 + … + βkXk
Predicted Total Giving = Intercept + AGE * (60.086) + Degree Level * (319.923) Page 44
Organize Your Annual Giving Segments • Retention (LYBUNTS, SYBUNTS, etc.) • Upgrade (LYBUNTS, SYBUNTS, etc.) • Acquisition (Non-donors) • Reactivation (Lapsed donors)
Page 45
Identify Acquisition Channels and Goals • Acquisition: Outreach or solicitation of prospects to acquire new members or donors – Typically a more expensive, long-term investment with lower initial ROI than retention and upgrade segments • Channels: – Direct Mail – Email – Phone • Predictive Modeling Goals: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) Page 46
Define Acquisition Strategy – Examples • Increase Direct Mail Revenue – Leadership vs. Mass Appeal, Peer Solicitation • Improve Email Targeting and Relevance – E-Solicitation, Event Invitation • Leverage Phone Outreach – Engagement Survey, Event Reminder
Page 47
Define Acquisition Segments – Examples • Constituent Type (Alumni, Parent, Friend) • Degree Type (Undergraduate/Graduate) • Alumni Major, Affinity Groups, etc. • Geography or Region • Class/Reunion (Degree Year) • Age (Generation) • Other Bio/Demographic Features • Work and/or Industry Page 48
Map Channel, Strategy and Segments
Channels: • Direct Mail • Email • Phone
Strategy: • Leadership Solicitation • Event Invitation • Peer Solicitation
Segments: • Age (Generation) • Giving Level
Page 49
ALLOCATING RESOURCES BASED ON THE MODEL Page 50
Apply the Model • Reminder – Predictive Modeling Goals: – Predict annual giving amounts for current non-donors (test data) – Prioritize donor acquisition segments at the individual donor level (test data) • Predictive Modeling Application: – Use Predictive Modeling Scores (e.g. estimated total giving values) to SORT, RANK and PRIORITIZE your prospects into different segmentation strategies, channels, and program activities Page 51
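Sorting and ranking by score is a one-line operation in most tools. A minimal sketch, assuming hypothetical prospect IDs and scores:

```python
# Hypothetical (prospect_id, predicted_total_giving) pairs
scores = [("P1", 120.0), ("P2", 1417.91), ("P3", 45.5)]

# Highest predicted giving first
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)

for rank, (prospect_id, score) in enumerate(ranked, start=1):
    print(rank, prospect_id, score)
```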
Allocate Resources: Segmentation Strategy • Open “Acquisition Test Data” spreadsheet • Open “AcquisitionTestDataSegmentation” worksheet • Review “Segmentation Strategy” data-driven recommendations in Column I
Page 52
Allocate Resources: Leadership Event Invitation
• Leadership Event Invitation w/ Reception, Solicitation and Outreach Segment (Non-Donor Prospects with Predictive Modeling Scores of $1,000+ Annual Giving Expectancy)
• Leadership Event Invitation, Upgrade Solicitation and Outreach Segment (Non-Donor Prospects with Predictive Modeling Scores of $500+ Annual Giving Expectancy) Page 53
Allocate Resources: Direct Mail & Email
• Direct Mail Solicitation and Event Invitation (Non-Donor Prospects with Predictive Modeling Scores of $100-$500 Annual Giving Expectancy)
• E-solicitation and Event Invitation (Non-Donor Prospects with Predictive Modeling Scores of $50-$100 Annual Giving Expectancy) Page 54
Allocate Resources: Engagement Survey
Event Invitation and Alumni Engagement Survey (Non-Donor Prospects with Predictive Modeling Scores of $0-50 Annual Giving Expectancy)
Page 55
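The score thresholds on the preceding slides can be sketched as a simple lookup. The slides do not say which segment a boundary value belongs to, so assigning exactly $500, $100 or $50 to the higher tier is an assumption, as is placing negative scores in the lowest tier. The function name is hypothetical.

```python
def assign_segment(score):
    """Map a predicted annual giving score ($) to a segmentation strategy."""
    if score >= 1000:
        return "Leadership event invitation with reception"
    if score >= 500:
        return "Leadership event invitation and upgrade solicitation"
    if score >= 100:
        return "Direct mail solicitation and event invitation"
    if score >= 50:
        return "E-solicitation and event invitation"
    # Scores below $50, including negative predictions (assumption)
    return "Event invitation and alumni engagement survey"


for example_score in (1417.91, 620.0, 196.69, 75.0, -103.7):
    print(example_score, "->", assign_segment(example_score))
```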
ASSESSING THE MODEL AND YOUR ACQUISITION EFFORTS Page 56
Evaluate Your Model
Page 57
Evaluate Your Model with R² • The “R²” or “R-Squared” statistic (or Coefficient of Determination) quantifies the overall strength of linear relationship in your data • R² describes the % of variation observed in the dependent (outcome or response) variable that is explained by the independent (input) variables • The closer the R² value is to 1, the larger the proportion of variation explained by your model Page 58
Keep R² in Context • You will encounter lower R² values (below 0.5) with statistically significant predictors • Fields such as the social sciences or those attempting to predict human behavior (e.g. philanthropy) frequently encounter R² values below 0.5 since human behavior is often more difficult to predict than other natural or physical processes • It is advisable to use Adjusted-R² value when using more than one independent variable Page 59
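A sketch of how R² and Adjusted-R² are calculated from observed and predicted values, using the standard formulas. The function names and data are hypothetical; n is the number of observations and k the number of independent variables.

```python
def r_squared(observed, predicted):
    """Share of variation in the observed values explained by the model."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n, k):
    """Penalize R-squared for the number of predictors (k) given n rows."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observed vs. predicted giving ($)
observed_giving = [100, 200, 300, 400]
predicted_giving = [110, 190, 310, 390]

r2 = r_squared(observed_giving, predicted_giving)
adj_r2 = adjusted_r_squared(r2, n=len(observed_giving), k=1)
print(round(r2, 3), round(adj_r2, 3))
```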
Evaluate Your Model with F-test • F-test indicates whether the proposed relationship between the dependent and independent variables in your model is statistically reliable • A significant F-test indicates the observed R² value is reliable and not the spurious result of random noise in the data set • F-test is useful when your goal is to make predictions, forecasts, etc. Page 60
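For reference, the overall F statistic can be derived from R², the number of observations n and the number of predictors k. This sketch computes only the statistic; the p-value (reported by Excel as "Significance F") requires the F distribution and is not computed here. The function name and input values are hypothetical.

```python
def f_statistic(r2, n, k):
    """Overall regression F statistic from R-squared, n rows, k predictors."""
    explained_per_predictor = r2 / k
    unexplained_per_degree_of_freedom = (1 - r2) / (n - k - 1)
    return explained_per_predictor / unexplained_per_degree_of_freedom


# Hypothetical model: R-squared of 0.5 on 250 rows with 2 predictors
example_f = f_statistic(r2=0.5, n=250, k=2)
print(example_f)
```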
Evaluate Your Model with RMSE • Root Mean Square Error (RMSE) is the square root of the variance of the residuals • Residuals are the differences between the predicted and observed outcomes • RMSE is a metric that describes how close the observed data points are to predicted values
Page 61
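A minimal RMSE sketch, assuming the plain definition (square root of the mean squared residual). Regression tools may instead divide by the residual degrees of freedom, which gives a slightly larger value. The data are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the outcome ($)."""
    squared_residuals = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_residuals) / len(squared_residuals))


# Hypothetical observed vs. predicted giving ($)
observed_giving = [100, 200, 300, 400]
predicted_giving = [110, 190, 310, 390]

print(rmse(observed_giving, predicted_giving))
```

Here every prediction is off by $10, so the RMSE is $10.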
R², F-test and RMSE Metrics • R² provides a relative measure of the linear response variation explained by your model • A significant F-test indicates that the observed R² value is reliable • RMSE provides an absolute measure of linear fit • RMSE is a useful measure of how accurately the model predicts the response
Page 62
Evaluate Your Acquisition Efforts • Define your acquisition segments, channels, and strategy recommendations • Tag a pool of non-donor prospects (if possible) and experiment with your acquisition strategies, channels, and segments • Track your acquisition results for non-donor segments in terms of new donor counts and dollars • Compare your predicted annual giving amount vs. observed results at the individual donor level for all acquisition segments
Page 63
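One way to track results is to roll up predicted and observed giving by segment. This is a sketch under assumed field names (segment, predicted, observed) and made-up rows; a real tracking file will look different.

```python
from collections import defaultdict

# Hypothetical tracking rows, one per solicited non-donor prospect
results = [
    {"segment": "Direct Mail", "predicted": 150.0, "observed": 100.0},
    {"segment": "Direct Mail", "predicted": 200.0, "observed": 0.0},
    {"segment": "Email", "predicted": 75.0, "observed": 50.0},
]

summary = defaultdict(lambda: {"new_donors": 0, "dollars": 0.0, "predicted": 0.0})

for row in results:
    totals = summary[row["segment"]]
    totals["predicted"] += row["predicted"]
    totals["dollars"] += row["observed"]
    if row["observed"] > 0:
        # Any gift from a former non-donor counts as a new donor
        totals["new_donors"] += 1

for segment, totals in summary.items():
    print(segment, totals)
```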
Acquisition Model Recommendations • Seek to update your predictive model on a regularly scheduled basis (e.g. yearly) • Review your acquisition segments in terms of new donors and dollars on a monthly basis to identify new trends, patterns and correlations • If one segment yields results well beyond what the model predicted, analyze all available annual giving data to identify other significant variable correlations and indicators you can potentially incorporate into your predictive model • Refine annual giving program strategy for acquisition segments based on results and ROI Page 64
Key Takeaways • Start where you are and work with what you have available to you • Define the business purpose and context of your analysis before building your model • Document assumptions, data, challenges (current state), and enhancement ideas (desired state) • Become a learn-it-all, not a know-it-all
Page 65
Additional Takeaways • Use your predictive model results to guide strategy and annual giving segmentation recommendations • Evaluate the power and effectiveness of your models and your acquisition efforts by identifying segments and tracking results • Invite other data-minded colleagues to join you on your predictive modeling adventures • Be patient and remember predictive modeling is detective work and an iterative process of discovery • Build, Test, Run, Evaluate… and Repeat! Page 66
Additional Methods and Resources • Linear Regression Assumptions and Diagnostics – https://en.wikipedia.org/wiki/Linear_regression • Method of Least Squares – https://en.wikipedia.org/wiki/Least_squares • RFM Scores – https://en.wikipedia.org/wiki/RFM_(customer_value) • Logistic Regression for Predicting Categorical Outcomes – https://en.wikipedia.org/wiki/Logistic_regression Page 67