These days, articles commonly appear about individuals, universities, and law firms facing what is called "a new McCarthyism." Steve Smale was a student and Chandler Davis a faculty member at the University of Michigan during the actual McCarthy era. They were also members of the Communist Party. This talk will describe the challenges they faced and overcame at an institution where the administration and many faculty felt impelled to sabotage the career of anyone on the far left.
In this talk, I will reflect on my journey spanning over two decades in the exploration of machine learning and intelligence—a path that began as the final PhD student of Professor Steve Smale at UC Berkeley. Inspired by Smale's 18th problem, posed in 1998, my research has been driven by a deep curiosity to understand the fundamental limitations of intelligence and the distinctions between artificial and human intelligence. Along this journey, I have delved into the intersections of dynamical algorithms, topology learning, and the development of trustworthy AI. This presentation is also a tribute to Steve Smale on his 95th birthday, a milestone that brings to mind our first meeting during his 69th—a moment that profoundly shaped my career and perspective.
Is mathematics invented or discovered? Recent progress in formulating the fundamental principles underlying the stunning success of deep learning (DL) sheds new light on this age-old question. I will discuss why deep networks seem to escape the curse of dimensionality. The answer lies in a key property of all functions that are efficiently Turing computable: they are compositionally sparse. This property enables the effective use of deep (and sparse) networks, the engine powering DL and related systems such as LLMs. It is, however, difficult to select a "good" decomposition exploiting sparse compositionality: each efficiently Turing-computable function admits many sparse decompositions. How then can deep networks learn reusable sparse decompositions? One way is to impose a curriculum, similar to a chain of thought, in which the constituent functions are common across different tasks.
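As a minimal illustration of compositional sparsity (a hypothetical example, not from the talk): a function of many variables is compositionally sparse when it can be computed as a tree of constituent functions, each depending on only a few inputs. The particular constituent `g` below is an arbitrary assumption; any smooth low-arity function would do.

```python
import math

def g(a, b):
    # A generic 2-ary constituent function (an illustrative choice).
    return math.tanh(a + b)

def f(x):
    # f: R^8 -> R, computed as a depth-3 binary composition tree.
    # Every node depends on only 2 inputs, so the effective dimension
    # at each stage is 2, regardless of the total input dimension.
    l1 = [g(x[i], x[i + 1]) for i in range(0, 8, 2)]   # 4 intermediate values
    l2 = [g(l1[0], l1[1]), g(l1[2], l1[3])]            # 2 intermediate values
    return g(l2[0], l2[1])                             # scalar output

print(f([0.1] * 8))
```

A deep network whose layers mirror this tree needs only to approximate 2-ary constituents, which is how, in this picture, depth lets networks sidestep the curse of dimensionality.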
A trained Large Language Model (LLM) contains much of human knowledge. Remarkably, many concepts can be recovered from the internal activations of neural networks via linear "probes", which are, mathematically, single-index models. I will discuss how such probes can be constructed and used based on Recursive Feature Machines, a feature-learning kernel method originally designed for extracting relevant features from tabular data.
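As a hedged, self-contained sketch of the single-index-model view of probing (the synthetic data and least-squares fit below are illustrative assumptions, not the talk's method): a concept direction is planted in fake "activations", the target is a single-index function of them, and a linear probe recovers the direction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 2000

# Planted unit-norm concept direction (an assumption for this toy example).
w_true = rng.normal(size=d)
w_true /= np.linalg.norm(w_true)

A = rng.normal(size=(n, d))   # stand-in for internal network activations
y = np.tanh(A @ w_true)       # single-index model: y = sigma(<w, a>)

# Fit a linear probe by least squares: w_hat = argmin ||A w - y||^2.
w_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
w_hat /= np.linalg.norm(w_hat)

cos = abs(w_hat @ w_true)
print(f"cosine similarity with planted direction: {cos:.3f}")
```

For Gaussian inputs, even this plain least-squares probe aligns with the planted direction (by Stein's lemma the regression coefficient is proportional to `w_true`); probes on real LLM activations are typically classifiers trained on labeled hidden states, and Recursive Feature Machines go further by learning which feature directions matter.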