Abstract

We consider revenue maximization in online auctions and pricing. A seller sells an identical item in each period to a new buyer, or a new set of buyers. For the online posted pricing problem, we show regret bounds that scale with the \emph{best fixed price}, rather than the range of the values. We also show regret bounds that are \emph{almost scale free}, and match the offline sample complexity, when comparing to a benchmark that requires a \emph{lower bound on the market share}. These results are obtained by generalizing the classical learning from experts and multi-armed bandit problems to their \emph{multi-scale} versions. In this version, the reward of each action is in a  \emph{different range}, and the regret w.r.t. a given action scales with its \emph{own range}, rather than the maximum range.