About Me
Howdy! My name is Yufeng Yang (杨钰峰, pronounced as u-feng iang). I’m a third-year Ph.D. student in the CSE Department at Texas A&M University, advised by Prof. Yi Zhou. My general research interests revolve around the foundations of machine learning. Specifically, I’m interested in the following fields:
Stochastic Optimization: I have been working on this topic since the beginning of my Ph.D. program. My research focuses on designing stochastic algorithms (typically first-order or zeroth-order methods) to solve tractable formulations inspired by large-scale machine learning, such as distributionally robust optimization (DRO), multi-objective optimization, and stochastic programming with heavy-tailed noise.
Mathematics of Deep Learning: In the era of large language models (LLMs), I am particularly interested in the mathematical insights behind preconditioned optimization methods such as Muon (and Shampoo) and Adam (and its variants), as well as heuristics like learning-rate scheduling and warm-up strategies. I am also passionate about exploring the mathematical foundations and mechanistic interpretations of deep networks, especially transformer-based LLMs, including phenomena such as transformer training dynamics, LLM grokking, in-context learning, and scaling laws.
Reinforcement Learning Theory and Post-Training of Agentic LLMs: I am interested in reinforcement learning (RL) post-training/alignment methods, especially evolving on-policy methods such as PPO and GRPO. I’m also interested in RL theory at its intersection with stochastic optimization, including distributional RL and multi-objective/agentic RL methods.
If you believe our research interests align, or if you would like to know more about me, our research group, the department, or life in College Station, feel free to drop me an email at ynyang94@tamu.edu or add me on WeChat: ynyang94 (please indicate your purpose when connecting).
I’m actively looking for machine learning engineering, data science, and quantitative internship opportunities for summer 2026. Here is my brief CV.
News
📄Papers
Nested SGD for Sinkhorn Distance-Regularized Distributionally Robust Optimization [arXiv], [code], submitted; [short version] accepted at the OPT Workshop, NeurIPS 2024 [poster].
Adaptive Gradient Normalization and Independent Sampling for (Stochastic) Generalized-Smooth Optimization [arXiv], [code], [slides], TMLR.
📚Education
– Ph.D. in Computer Science, Texas A&M University, 2024-present
– Ph.D. (Transfer Out) in Electrical Engineering, University of Utah, 2023-2024
– M.S. in Computational Science and Engineering, Rice University, 2021-2023
– B.S. in Applied Math (SSE), Chinese University of Hong Kong (Shenzhen), 2017-2021
Prior to university, I grew up and completed my early education in Jiayuguan, a small town in Gansu Province, China, located near the western end of the Great Wall.
Academic Services
– Conference Reviewer: AISTATS
– Journal Reviewer: Journal of Combinatorial Optimization; IEEE Transactions on Signal Processing
– Workshop Reviewer: NeurIPS-OPT workshop
Teaching
– At Rice: ELEC241, Fundamentals of Electrical Engineering I (Role: Grader).