I am a researcher working on the training of generative artificial intelligence. My research
interests cover computational techniques for learning from data
that scale well with the available compute. More specifically, I am working on optimizers
and model architecture scaling schemes that would allow predictable
and efficient training of Transformer models containing hundreds of billions of parameters.
In 2022, I graduated with a Ph.D. from the group of Professor Javad Lavaei at the Department of
Industrial Engineering and Operations Research at the University of California, Berkeley, where I worked
on problems of algorithmic analysis and optimal control of complex safety-critical systems, such as power systems,
transportation and telecommunication networks, AI recommendation and navigation systems, robotic systems, and others.
My research spanned the theory of non-convex and conic optimization, stochastic control, machine learning,
and computational and sampling complexity of learning algorithms. I designed data processing algorithms
that are robust to noise and highly scalable with the amount of available computational resources.
August 2024: I am participating in the AI Panel on October 11, 2024.
July 2024: At noon on 1 Aug, I am giving a talk “Scaling generative modeling across model sizes and data modalities” at the lab meeting of the New York Genome Center.
July 2024: At noon on 26 July, HawAII is happy to host Nikita Zhivotovsky from UC Berkeley who will deliver a talk on Improving Risk Bounds with Unbounded Losses via Data-Dependent Priors. Join us in POST 302!
June 2024: Our paper on the economic trajectory of AI development “What if synthetic data is all you need” is available online!
May 2024: A new HawAII reading group just started! Please join us on Fridays 12:00-13:05 in POST 302 or reach out for a Zoom link.
April 2024: Our paper Effective Long-Context Scaling of Foundation Models will be presented at the NAACL 2024 conference!
March 2024: We are thrilled to announce an upcoming collaboration with Epoch AI on the technical and economic aspects of scaling Artificial Intelligence systems.
March 2024: At 4 PM on 24 Apr, HawAII is happy to host Daniel Abramovitch (Agilent Labs), Richard Braatz (MIT), and Kam Leang (U. Utah) in Webster Hall 203 for a talk on automatic control! (Zoom link; recording)
February 2024: Our research group has been awarded compute resources by the Google Cloud Research Credits Program
February 2024: Our paper HaSa: Hardness and Structure-Aware Contrastive Knowledge Graph Embedding has been accepted at The Web Conference 2024
January 2024: Our research group has been awarded compute resources by the ACCESS Allocations program.
January 2024: I am presenting the overview of my research to the Department of Information and Computer Sciences at UH Manoa on Tuesday, Jan 9, 2024, 12:00-13:00, in Keller Hall 103. (recording)
January 2024: We are delighted to announce the Hawai'i Artificial Intelligence Initiative aiming to accelerate AI research at UHM by fostering collaboration
December 2023: On 18 Dec, we are happy to host Prof. Xiao Li from CUHK with a talk on Convergence Guarantees for SGD with Random Reshuffling. Please join us at HH 386 at 11 am! (recording)
October 2023: I am chairing session WA67 “Challenges in the Large-scale Model Training” at the INFORMS Annual Meeting 2023!
October 2023: I presented a talk on the practice of large language modeling at the CS Department seminar of UH Manoa: recording, slides
September 2023: A new paper on extending the context length of a trained RoPE transformer: Effective Long-Context Scaling of Foundation Models
July 2023: Our paper Llama 2: Open Foundation and Fine-Tuned Chat Models landed with a splash, making it to the top of Hacker News.
May 2023: I will present the paper A Theory on Adam Instability in Large-Scale Machine Learning at the FAIR <> GenAI Workshop. slides
April 2023: Our paper Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points got accepted at the 2023 International Conference on Machine Learning (ICML).
March 2023: New paper A Theory on Adam Instability in Large-Scale Machine Learning is accessible online.
January 2023: New paper Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points is accessible online.
December 2022: I am going to join the Department of Electrical and Computer Engineering at the University of Hawai'i at Manoa as a faculty member!
October 2022: I am invited to organize a session “Large-scale smooth optimization for Generative AI” at the INFORMS Annual Meeting 2023.
June 2022: I am joining Meta AI (FAIR) as a Research Scientist specializing in large-scale optimization.
May 2022: I defended my thesis and graduated with a Ph.D. in Engineering!
April 2022: My thesis The complexity of non-convex and conic optimization problems in data science applications is available online.
February 2022: I will give a talk at the Department of Electrical and Computer Engineering of the University of Hawai'i at Manoa. slides
December 2021: I will give a talk “Computation-information complexity trade-off in Tensor PCA” at the Random Matrices and Random Landscapes seminar at the Mathematical Sciences Research Institute, Berkeley, CA. (slides)
November 2021: I will give a talk “Topological complexity of polynomials” at the Control and Optimization seminar, IEOR Department at the University of California, Berkeley. (slides)
September 2021: I am organising the session “Reaching Global Optimum in Non-Convex Optimization Problems” at the INFORMS Annual Meeting 2021.
July 2021: I will present the paper When Does MAML Objective Have Benign Landscape? at the 2021 IEEE Conference on Control Technology and Applications (CCTA).
May 2021: I will present the paper No spurious solutions in non-convex matrix sensing: Structure compensates for isometry at the 2021 American Control Conference (ACC).
December 2020: Our paper Role of sparsity and structure in the optimization landscape of non-convex matrix sensing got accepted for publication in Mathematical Programming.
September 2020: Our paper Conic Optimization for Quadratic Regression Under Sparse Noise got accepted for publication in the Journal of Machine Learning Research.
June 2020: Our new paper Global convergence of MAML for LQR is accessible online.
August 2019: I gave a talk “Frontiers of Deep Learning: overview of Simon's Institute summer workshops” at the Control and Optimization seminar, IEOR Department at the University of California, Berkeley. (slides)
July 2019: The paper Towards Robust and Scalable Power System State Estimation will appear in Proc. 58th IEEE Conference on Decision and Control.
May 2019: New paper on non-convex learning: No Spurious Solutions in Non-convex Matrix Sensing: Structure Compensates for Isometry
January 2019: New paper on data analytics: Conic Optimization for Robust Quadratic Regression
December 2018: Our paper On Sampling Complexity of the Semidefinite Affine Rank Feasibility Problem has been selected for oral presentation at the Thirty-Third AAAI Conference on Artificial Intelligence.
November 2018: I will give a talk “Conic Optimization For Robust Quadratic Regression” at the 57th IEEE Conference on Decision and Control
October 2018: Our paper On Sampling Complexity of the Semidefinite Affine Rank Feasibility Problem was accepted at the Thirty-Third AAAI Conference on Artificial Intelligence.
September 2018: I successfully passed the Doctoral Qualifying Examination.
September 2018: I gave a talk “Geometry of SDP relaxations for rank constrained problems” at the Power Systems Seminar for the IEOR Department at the University of California, Berkeley.
September 2018: New paper on SDP relaxations of rank-constrained problems: On Sampling Complexity of the Semidefinite Affine Rank Feasibility Problem.
July 2018: Our paper Conic Optimization for Robust Quadratic Regression: Deterministic Bounds and Statistical Analysis will appear at the IEEE Conference on Decision and Control, 2018.
July 2018: I will give a talk on “Conic Optimization For Robust State Estimation: Deterministic Bounds And Statistical Analysis” at the INFORMS Annual Meeting.
May 2018: I successfully passed the Ph.D. Preliminary Exam.
April 2018: I gave a talk on “Conic Relaxations for State Estimation under Sparse Noise” at the Power Systems Seminar for the IEOR Department at the University of California, Berkeley.
March 2018: New paper on robust nonlinear regression for bad data detection: Conic Optimization for Robust Quadratic Regression: Deterministic Bounds and Statistical Analysis.
August 2017: I joined the Department of Industrial Engineering and Operations Research at the University of California, Berkeley as an M.Sc./Ph.D. scholar.