Abstract
Machine learning (ML) and statistical techniques are crucial for transforming Big Data into actionable knowledge. However, the complexity of existing ML algorithms is often overwhelming. Many end-users do not understand the trade-offs and challenges involved in choosing among learning techniques and setting their parameters. Furthermore, existing scalable systems that support ML are typically not accessible to ML developers who lack a strong background in distributed systems and low-level primitives. In this talk, I will provide an overview of MLbase, a system designed to make it easier to (1) use machine learning and (2) implement distributed ML algorithms. I will explain MLbase's declarative approach to ML as well as the high-level operators that enable ML developers to implement a wide range of ML methods in a scalable way without deep systems knowledge.