Yahoo Web Search

Search results

  1. Meta AI / FAIR Labs - Cited by 15,592 - Language Models - Machine Learning - Optimization - Algorithms - Theory.

  2. Apr 9, 2024 · Zeyuan ALLEN-ZHU is a principal researcher at Microsoft Research Redmond, where he studies the physics of language models and AI. He has a Ph.D. in computer science from MIT and a B.S. in mathematics and physics from Tsinghua, and has won several awards in algorithm competitions and research.

  3. About. My current research focuses on investigating the physics of language models and AI in a broader sense. This involves designing experiments to elucidate the...

    • Meta
    • Research Interests
    • Acknowledgements

    My current research focuses on investigating the physics of language models and AI in a broader sense. This involves designing experiments to elucidate the underlying fundamental principles governing how transformers/GPTs learn to accomplish diverse AI tasks. By probing into the neurons of the pre-trained transformers, my goal is to uncover and com...

    I would love to thank my wonderful collaborators without whom these results below would never have been accomplished. In inverse chronological order: Cathy Li (1), Emily Wenger (1), Francois Charton (1), Kristin Lauter (1), Edward J. Hu (1), Yelong Shen (1), Phillip Wallis (1), Shean Wang (1), Faeze Ebrahimian (1), Yingyu Liang (1), Zhao Song (2), Mi...

  4. Nov 9, 2018 · Zeyuan Allen-Zhu is a co-author of a paper that proves the global convergence of stochastic gradient descent on over-parameterized deep neural networks. The paper applies to various network architectures and activation functions, and shows polynomial time complexity.

    • Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
    • 2018

  5. Zeyuan Allen-Zhu received a B.S. (summa cum laude) degree in math and physics from Tsinghua University in 2010, and M.S. and Sc.D. degrees in computer science from MIT in 2012 and 2015 respectively. From 2015 to 2017, he was a postdoctoral researcher jointly hosted by Princeton University and the Institute for Advanced Study.

  6. Zeyuan Allen-Zhu is a postdoctoral researcher jointly hosted by Princeton University and the Institute for Advanced Study. He obtained his B.S. with highest honors in math and physics from Tsinghua University, and he earned his S.M. and Sc.D. in computer science from the Massachusetts Institute of Technology, under the supervision of Jonathan ...