Abstract

When machine learning systems bridge from prediction to intervention—such as in statistical profiling of job seekers—seemingly minor modeling decisions can have profound consequences for who ultimately receives support. This talk examines how different choices in the data science pipeline affect not just predictive accuracy, but the actual composition of individuals flagged for intervention.
Using German administrative labor market data, I present a comparative analysis of regression and machine-learning approaches for predicting long-term unemployment risk. While our models achieve comparable predictive performance (ROC-AUC 0.70–0.77), they disagree strikingly over which individuals are classified as high-risk, with Jaccard similarities as low as 0.45 between equally accurate models. These differences cascade through the intervention pipeline: classification thresholds, feature importance patterns, and model architectures each reshape the demographic and socioeconomic profile of those targeted for support.
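To make the distinction concrete, here is a minimal sketch (not the talk's actual code; all scores and figures are hypothetical) of how two models with similar accuracy can nonetheless flag largely different people, measured by the Jaccard similarity of their high-risk sets:

```python
def flagged(scores, threshold):
    """IDs whose predicted long-term-unemployment risk meets the threshold."""
    return {i for i, s in enumerate(scores) if s >= threshold}

def jaccard(a, b):
    """Overlap of two flagged sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Hypothetical risk scores from two models for the same ten job seekers.
model_a = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1, 0.3, 0.4, 0.85, 0.15]
model_b = [0.2, 0.8, 0.1, 0.9, 0.7, 0.6, 0.3, 0.4, 0.85, 0.15]

a, b = flagged(model_a, 0.5), flagged(model_b, 0.5)
print(sorted(a))                 # people flagged by model A
print(sorted(b))                 # people flagged by model B
print(round(jaccard(a, b), 2))   # overlap well below 1.0
```

Both models flag five people each, yet they share only three of them: aggregate metrics such as ROC-AUC are blind to exactly this kind of disagreement over who gets intervened upon.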
This work highlights a critical challenge at the prediction-intervention interface: the data we use may be sufficient for forecasting outcomes accurately, but the choices we make in constructing models from those data introduce new forms of variation that directly shape intervention allocation. I discuss implications for documentation, transparency, and fairness in algorithmic decision-making systems, emphasizing that "letting the data speak" still requires researchers to make consequential choices about which predictive voice to amplify. The talk concludes with reflections on how the prediction-to-intervention pipeline demands richer evaluation frameworks that account for both accuracy and equity in resource allocation.