We study RCTs evaluating service interventions—proactive outreach by teachers, medication adherence support from healthcare providers, or social worker home visits—where treatment is delivered by capacity-constrained resources. When participants share finite service capacity, adding more participants can reduce the timeliness or intensity of service that others receive, introducing interference and hidden variation in treatment that we term "operational dosage." Using queueing theory, we show that treatment effects are both capacity- and sample-size-dependent, and can decrease once sample size exceeds a critical threshold. Consequently, statistical power in service intervention RCTs can peak at intermediate sample sizes, contradicting conventional power calculations. Simulations calibrated to a tuberculosis intervention trial in Kenya demonstrate that high-capacity/small-sample designs can achieve the same power as low-capacity/large-sample designs. Our results highlight the importance of capacity selection in experiment design and provide a mechanism for replication failures and implementation challenges at scale.
Risk assessment instruments, or "risk scores," are widely used in high-stakes decision-making settings such as medicine and the criminal justice system. A risk score predicts the likelihood of an undesired outcome if no intervention is made. Thus, a sufficiently high score is often interpreted as a recommendation to intervene. However, risk scores fail to account for what would happen if a decision-maker does intervene. This is problematic because effective decision-making requires consideration of how the intervention affects outcomes. We propose "triage scores," which generalize risk scores using counterfactual utilities. Unlike risk scores, triage scores incorporate counterfactual outcomes under alternative decisions, enabling decision-makers to incorporate a wide range of ethical and practical factors. We illustrate the use of triage scores with an application to our own randomized controlled trial evaluating a pre-trial risk assessment instrument. Our analysis demonstrates that triage scores are able to capture richer utility structures than risk scores and yield substantively distinct results regarding policy evaluation and learning.
Machine learning models are often assessed by the quality of their predictions, yet their real-world impact extends far beyond these metrics. Models function as interventions within complex social systems, influencing stakeholders, infrastructure, and...