John Elder Workshop - Predict 2015
Intended Audience: Practitioners interested in the true nuts and bolts of predictive modeling.
Knowledge Level: Familiar with the basics of predictive modeling.
Predictive analytics has proven capable of enormous returns across industries – but, with so many core methods for predictive modeling, there are some tough questions that need answering.
What you will learn:
- The tremendous value of learning from data
- How to create valuable predictive models for your business
- Best practices, by seeing their flip side: worst practices
This one-day session surveys standard and advanced methods for predictive modeling.
Dr. Elder will describe the key inner workings of leading algorithms, demonstrate their performance with business case studies, compare their merits, and show you how to pick the method and tool best suited to each predictive analytics project. Methods covered include classical regression, decision trees, neural networks, ensemble methods, uplift modeling and more.
The key to successfully leveraging these methods is to avoid “worst practices”. It's all too easy to go too far in one's analysis and “torture the data until it confesses” or otherwise doom predictive models to fail where they really matter: on new situations.
Dr. Elder will share his (often humorous) stories from real-world applications, highlighting the Top 10 common, but deadly, mistakes. Come learn how to avoid these pitfalls by laughing (or gasping) at stories of barely averted disaster.
If you'd like to become a practitioner of predictive analytics, or if you already are one and would like to hone your knowledge across methods and best practices, this workshop is for you.
Date: Thursday, 17 September 2015
Time: 9:00am to 5:00pm
Location: RDS, Dublin (Room TBA)
Fee: See pricing information
Extra: Attendees will receive a free copy of Dr. Elder's book, Handbook of Statistical Analysis and Data Mining Applications.
Course Outline:
I. Pattern Discovery: An Executive Summary
- Data Mining or Data Dredging?
- Computer vs. Human: Mining and Visualization
- Example Projects from Science and Business
- Ingredients for Success
- Modern Modeling Algorithms
- Bundling Models to Increase Accuracy
- Example: Identify Bat Species
II. Getting Going
- 5 Technical disciplines contribute
- 6 Stages of an analytic project
- Setting up the data file
- Example project: Fraud Detection
- Lift Charts to display model quality
- Decision Trees to fit data
III. Clustering and Nearness
- Commercial Products’ Algorithms
- Unsupervised Learning
- Clustering
- Principal Components
- Nearest Neighbor
- Mahalanobis distance
IV. Neural Networks
- Logistic (sigmoidal) transformation
- Example
V. Re-Sampling: Essential for Validation
- The danger of over-fit and over-search
- Cross-Validation
- Bootstrap
- Target Shuffling
- Example: Find the Sweet Spot for Strikes in Baseball
VI. Visualization
- Projections and projection pursuit
- Visualizing numbers, text, and links
- Density graphs: Drug discovery application
VII. Ensembles
- Bagging (with CART example)
- Boosting
- Bundling different models (with Credit Scoring example)
VIII. Top 10 Data Mining Mistakes
- Lack data
- Focus on Training
- Rely on 1 technique
- Ask the wrong question
- Listen (only) to the data
- Future leakage
- Discount pesky cases
- Extrapolate
- Answer every inquiry
- Sample without care
- Believe the best model
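
As a taste of the re-sampling topics in Part V, the idea behind target shuffling can be sketched in a few lines. This is a minimal illustration and not the workshop's own material: the function names, the use of Pearson correlation as the "model quality" measure, and the synthetic data are all assumptions made for the example. The target values are shuffled many times to break any real relationship with the input, and the apparent result on real data is compared with what chance alone produces.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def target_shuffle_p_value(xs, ys, n_shuffles=1000, seed=0):
    """Estimate how often shuffled targets match or beat the observed result.

    Hypothetical helper: the quality measure here is |correlation|;
    in practice it would be whatever score the real model is judged by.
    """
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any true link between input and target
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Synthetic data (assumed for illustration): one real signal, one pure noise.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
signal = [2 * x + rng.gauss(0, 1) for x in xs]
noise = [rng.gauss(0, 1) for _ in xs]

p_signal = target_shuffle_p_value(xs, signal)
p_noise = target_shuffle_p_value(xs, noise)
print(f"signal target: p = {p_signal:.4f}")
print(f"noise target:  p = {p_noise:.4f}")
```

A small value for the signal target suggests the relationship is unlikely to be a product of chance, while the value for the noise target shows how a result can look when there is nothing real to find.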