Provocations

Abstract not available

Provocations

Abstract not available

Provocations: "Evaluation in the Wild: The Need for Engagement"

Abstract not available

Non-parametric Causal Inference in Dynamic Thresholding Designs

Consider a setting where we regularly monitor patients' fasting blood sugar, and declare them to have prediabetes (and encourage preventative care) if this number crosses a pre-specified threshold. The sharp, threshold-based treatment policy suggests that we should be able to estimate the long-term benefit of this preventative care by comparing the health trajectories of patients with blood sugar measurements just above and just below the threshold. A naive regression-discontinuity analysis, however, is not applicable here, as it ignores the temporal dynamics of the problem: for example, a patient just below the threshold on one visit may become prediabetic (and receive treatment) following their next visit. Here, we study thresholding designs in general dynamic systems, and show that simple reduced-form characterizations remain available for a relevant causal target, namely a dynamic marginal policy effect at the treatment threshold. We develop a local-linear-regression approach for estimation and inference of this estimand, and demonstrate the promise of our approach in numerical experiments. More broadly, we highlight the promise of policy-gradient methods for causal inference in observational studies.
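As a point of reference for the local-linear-regression idea mentioned above, here is a minimal sketch of the static building block: a kernel-weighted linear fit on each side of a threshold, with the difference of the two intercepts taken as the jump. This is the naive, single-visit regression-discontinuity calculation, not the dynamic estimator the abstract proposes. The function names, the triangular kernel, and the bandwidth handling are assumptions made for illustration.

```python
def intercept_at_threshold(points, threshold, bandwidth):
    """Weighted least-squares line through (x - threshold, y), evaluated at the threshold.

    Uses a triangular kernel (an assumption): weight = max(0, 1 - |x - threshold| / bandwidth).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in points:
        d = x - threshold
        w = max(0.0, 1.0 - abs(d) / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("not enough distinct points inside the bandwidth")
    # Closed-form intercept of the weighted linear fit.
    return (s2 * t0 - s1 * t1) / det


def local_linear_jump(points, threshold, bandwidth):
    """Difference between the right-side and left-side fitted values at the threshold."""
    right = [(x, y) for x, y in points if x >= threshold]
    left = [(x, y) for x, y in points if x < threshold]
    return (intercept_at_threshold(right, threshold, bandwidth)
            - intercept_at_threshold(left, threshold, bandwidth))
```

On noiseless data with a linear trend and a fixed jump at the threshold, this recovers the jump up to floating-point error. It does not account for units that cross the threshold on later visits, which is the complication the abstract addresses.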

Estimating the Value of Personalization

From medicine to marketing to social sciences, the promise of tailoring interventions to individual characteristics is undeniable. However, personalization often comes with costs, from logistical challenges to lack of shared context to concerns about fairness. In addition, personalized decision policies can be more fragile, because they typically require more data to learn accurately than identifying a single best intervention for all. In this talk, I’ll introduce a new statistical estimator that quantifies, given historical data, whether there is evidence that a personalized intervention policy provides significantly superior expected outcomes compared to deploying the best single overall intervention. We present results across four diverse datasets to highlight the wide range of settings where quantifying the impact of personalization can be helpful, and the strength of our proposed estimator over prior related approaches. Joint work with Zhaoqi Li.
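To make the comparison concrete, the sketch below computes a plain inverse-propensity-weighted (IPW) estimate of a personalized policy's value and subtracts the value of the best single intervention. This is a standard off-policy baseline swapped in for illustration, not the estimator introduced in the talk, and it performs no significance test. The log format `(context, action, reward, propensity)` and all function names are assumptions.

```python
def ipw_value(logs, policy):
    """IPW estimate of the expected reward of `policy`.

    `logs` is a list of (context, action, reward, propensity) tuples, where
    `propensity` is the probability with which the logged action was chosen.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        if policy(context) == action:
            total += reward / propensity
    return total / len(logs)


def personalization_gap(logs, personalized_policy, actions):
    """Estimated value of the personalized policy minus the best constant policy."""
    best_constant = max(
        ipw_value(logs, lambda _context, a=a: a) for a in actions
    )
    return ipw_value(logs, personalized_policy) - best_constant


# Toy example: reward is 1 only when the action matches the context,
# and actions were logged uniformly at random (propensity 0.5).
toy_logs = [
    (0, 0, 1.0, 0.5),
    (0, 1, 0.0, 0.5),
    (1, 0, 0.0, 0.5),
    (1, 1, 1.0, 0.5),
]
toy_gap = personalization_gap(toy_logs, lambda context: context, actions=[0, 1])
```

Selecting the best constant action on the same data used to evaluate it biases its estimated value upward, so a gap computed this way is only a rough diagnostic.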

LLM-Guided Reinforcement Learning for Mastery Learning: Large-Scale Field Evidence

We describe a large-scale RCT in collaboration with the Taipei Department of Education for ~1000 high school students obtaining Python certification. All students had access to an LLM-powered AI tutor, and had to solve a required number of weekly practice problems. Half the students were randomized to a fixed practice sequence, while half were randomized to a personalized practice sequence that uses a POMDP framework to infer the student's mastery before moving on to more difficult problems. Existing POMDP formulations have limited visibility into student progress, hindering their ability to effectively provide personalized support. We show that student interactions on the practice platform provide a powerful view into their mastery, so we can significantly improve performance by using LLM-extracted features from platform interactions (e.g., meaningful code edits) as the POMDP observations. At the end of the semester, all students took a written exam with no AI assistance to receive certification. Students in the personalized arm performed 0.15 SD better, equivalent to several months of additional schooling.
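The idea of inferring mastery from observations can be illustrated with a two-state belief update in the style of Bayesian knowledge tracing. This is a generic sketch, not the model used in the study: the binary mastered/unmastered state, the observation likelihoods, the learning probability, and the advancement threshold are all hypothetical. In the study's setting, the observation would be an LLM-extracted feature such as a meaningful code edit.

```python
def update_mastery_belief(belief, likelihood_if_mastered,
                          likelihood_if_unmastered, learn_prob):
    """One belief-update step for a two-state (mastered / unmastered) model.

    belief: prior probability that the skill is mastered.
    likelihood_if_*: probability of the observed feature under each state.
    learn_prob: assumed chance of moving from unmastered to mastered after practice.
    """
    # Bayes' rule: condition the belief on the observation.
    numerator = belief * likelihood_if_mastered
    evidence = numerator + (1.0 - belief) * likelihood_if_unmastered
    posterior = numerator / evidence
    # Transition: an unmastered student may learn from this practice step.
    return posterior + (1.0 - posterior) * learn_prob


def ready_to_advance(belief, threshold=0.95):
    """Hypothetical rule: move to harder problems once the belief is high enough."""
    return belief >= threshold


updated = update_mastery_belief(0.5, 0.8, 0.2, 0.1)
```

Richer observations sharpen the likelihood ratio between the two states, which is one way to read the abstract's claim that platform interactions give a better view of mastery than correctness alone.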