Abstract

We develop a method to estimate a game’s primitives in complex dynamic environments. Because of the environment’s complexity, agents may not know or understand some key features of their interaction. Instead of imposing equilibrium assumptions, we impose an asymptotic ε-regret (ε-AR) condition on the observed play. Under ε-AR, the time average of the counterfactual increase in past payoffs, had each agent replaced each past play of a given action with its best replacement in hindsight, becomes small in the long run. We first prove that observed play satisfies ε-AR if and only if the time average of play converges to the set of Bayes correlated ε-equilibrium predictions of the stage game. Next, we use this static limiting model to construct a set estimator of the parameters of interest. The estimator’s coverage properties follow directly from the theoretical convergence results. The method applies to panel data as well as to cross-sectional data interpreted as long-run outcomes of learning dynamics. We apply the method to pricing data from an online marketplace and recover bounds on the distribution of sellers’ marginal costs that are useful for informing policy experiments.
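As a rough illustration, and not necessarily in the paper’s exact notation, the ε-AR condition can be written in the standard internal-regret form below, where $u_i$ is agent $i$’s stage payoff, $A_i$ the action set, $a_i^t$ the action played at time $t$, and $a_{-i}^t$ the opponents’ profile (all symbols here are assumed for exposition; the paper’s incomplete-information version would additionally condition on agents’ signals):

$$
\max_{a,\;a' \in A_i}\;\frac{1}{T}\sum_{t=1}^{T} \mathbf{1}\{a_i^t = a\}\,\bigl[u_i(a', a_{-i}^t) - u_i(a, a_{-i}^t)\bigr]\;\le\;\varepsilon + o(1)
\qquad \text{as } T \to \infty,\ \text{for every agent } i.
$$

This is the conditional (internal) regret whose vanishing is classically linked to convergence of the empirical distribution of play to correlated equilibria; the abstract’s result replaces that limiting set with Bayes correlated ε-equilibrium predictions in the incomplete-information setting.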
