Part 1: Data Scientist Foundation
Predictive Modeling Foundations
This section introduces the fundamentals of predictive modeling, emphasizing binary classification concepts and the logistic regression technique. By splitting data for training and testing, evaluating model metrics, and interpreting regression coefficients, learners gain the skills to develop initial predictive workflows that can guide business or research decisions.
FUNDAMENTALS OF PREDICTIVE MODELING & LOGISTIC REGRESSION
Learning Objectives
Understand binary classification metrics (accuracy, precision, recall, F1, ROC/AUC)
Build logistic regression models; interpret coefficients as log-odds/odds ratios
Split data (train/test) and assess performance with confusion matrices, ROC curves
Indicative Content
Classification Essentials
Confusion matrix, threshold choices, sensitivity/specificity
Logistic Regression
Sigmoid function, logit transformation, coefficient implications
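As a minimal sketch of the relationship between the sigmoid function and the logit transformation, the following standard-library example shows that the two are inverses: the logit maps a probability to log-odds, and the sigmoid maps log-odds back to a probability. The function names are chosen for illustration and do not come from any particular library.

```python
import math


def sigmoid(z: float) -> float:
    """Map a log-odds value z to a probability in (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p: float) -> float:
    """Map a probability p in (0, 1) to log-odds; the inverse of sigmoid."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


# Round trip: probability -> log-odds -> probability.
for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(f"p={p:.1f}  log-odds={z:+.3f}  sigmoid(log-odds)={sigmoid(z):.3f}")
```

A probability of 0.5 corresponds to log-odds of 0, which is why a linear score of zero sits exactly on the default 0.5 decision threshold.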
Model Evaluation
Precision-recall, ROC curve, AUC, threshold tuning
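The evaluation tools listed above might be exercised along the following lines. This is a sketch on a small hand-made set of labels and scores (not real data), using scikit-learn's `roc_curve`, `roc_auc_score`, and `precision_recall_curve`. The threshold choice shown, maximizing Youden's J (TPR minus FPR), is only one common heuristic and is an assumption here, not a method prescribed by this module.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, roc_auc_score, roc_curve

# Toy labels and predicted scores, invented for illustration.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.1, 0.3, 0.45, 0.6, 0.4, 0.55, 0.7, 0.9])

# ROC curve: false positive rate and true positive rate at each threshold.
fpr, tpr, roc_thresholds = roc_curve(y_true, scores)

# AUC: probability that a random positive outscores a random negative.
auc = roc_auc_score(y_true, scores)

# Precision-recall curve; precision and recall have one more entry
# than pr_thresholds.
precision, recall, pr_thresholds = precision_recall_curve(y_true, scores)

# One possible tuning rule: pick the threshold that maximizes TPR - FPR.
youden_j = tpr - fpr
best_threshold = roc_thresholds[np.argmax(youden_j)]

print(f"AUC = {auc:.4f}")
print(f"Threshold maximizing Youden's J: {best_threshold}")
```

In practice the tuning rule should reflect the relative cost of false positives and false negatives in the application, and the threshold should be chosen on validation data rather than on the test set.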
TOOLS & METHODOLOGIES (PREDICTIVE MODELING FOUNDATIONS)
Python Libraries
Machine Learning: scikit-learn for train/test splits and logistic regression
Evaluation Tools: Libraries/functions for confusion matrices, ROC plots, AUC calculation
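A minimal end-to-end sketch of this workflow with scikit-learn follows. It assumes synthetic data generated by `make_classification`, since the outline does not name a dataset. The split ratio, random seeds, and `max_iter` value are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Hold out 25% of the rows for testing; stratify to keep class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit logistic regression on the training portion only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Hard class predictions (default 0.5 threshold) and positive-class probabilities.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

print("Confusion matrix:")
print(cm)
print(f"Test accuracy: {accuracy:.3f}")
```

Keeping `y_prob` alongside `y_pred` makes it possible to revisit the decision threshold later without refitting the model.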
Binary Classification
Key metrics (accuracy, precision, recall, F1)
Assessing threshold adjustments for sensitivity vs. specificity
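To make the metric definitions and the sensitivity/specificity trade-off concrete, here is a small pure-Python sketch. The labels and scores are invented for illustration, and the helper names `confusion_counts` and `metrics` are hypothetical, not library functions.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (TP, FP, TN, FN) when scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Compute the key binary classification metrics from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


# Toy data, invented for illustration.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.45, 0.6, 0.4, 0.55, 0.7, 0.9]

for threshold in (0.5, 0.35):
    counts = confusion_counts(y_true, scores, threshold)
    print(threshold, counts, metrics(*counts))
```

On this toy data, lowering the threshold from 0.5 to 0.35 raises recall (sensitivity) from 0.75 to 1.0 while specificity falls from 0.75 to 0.5, which is the trade-off that threshold adjustment controls.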
Logistic Regression
Log-odds interpretation, intercept vs. coefficient meaning
Potential for threshold tuning, model calibration
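The intercept and coefficient interpretation can be illustrated with a one-feature model. The values below are hypothetical, picked only to show the arithmetic; they are not fitted from any dataset. The sketch checks that a one-unit increase in the feature multiplies the odds by exp(coefficient), whatever the starting value.

```python
import math

# Hypothetical fitted values, chosen only for illustration.
intercept = -1.5  # log-odds of the positive class when x = 0
coef = 0.7        # change in log-odds per one-unit increase in x


def probability(x: float) -> float:
    """Predicted probability of the positive class at feature value x."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * x)))


def odds(x: float) -> float:
    """Odds p / (1 - p) of the positive class at feature value x."""
    p = probability(x)
    return p / (1.0 - p)


# The odds ratio for a one-unit increase in x is exp(coef).
odds_ratio = math.exp(coef)

# The intercept alone fixes the baseline probability at x = 0.
baseline_probability = probability(0.0)

print(f"odds ratio per unit of x: {odds_ratio:.3f}")
print(f"baseline probability at x = 0: {baseline_probability:.3f}")
print(f"odds(3) / odds(2): {odds(3.0) / odds(2.0):.3f}")
print(f"odds(1) / odds(0): {odds(1.0) / odds(0.0):.3f}")
```

The odds ratio is constant across the range of x, but the change in probability is not: the same one-unit step moves the probability by different amounts depending on where it starts on the sigmoid curve.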