Welcome, it’s a pleasure to connect with you! I’m Bingcong Li, a postdoctoral researcher at ETH Zurich, working with Prof. Niao He and the ODI group. Before that, I received my doctoral degree from the University of Minnesota under the supervision of Prof. Georgios B. Giannakis, and then gained industry experience, spending a year working on LLMs.

Find me via bingtsongli[@]gmail.com or bingcong.li[@]inf.ethz.ch.

General interests

My primary focus is to understand the optimization dynamics of neural networks and use these insights to make pretraining, fine-tuning, and inference of LLMs more efficient. Unlike black-box approaches, my work incorporates architectural characteristics, such as attention and normalization layers, into the optimization process to accelerate convergence. By bridging theoretical foundations with practical algorithm design, I aim to advance the development of scalable, efficient, and reliable LLM systems.

I enjoy cycling 🚴🏻 and skiing 🎿 outside the office.

Recent updates

  • 10/2025. We are hosting the Efficient LLMs Fine-tuning (ELF) Track at the AI+X Summit. See you in Zurich in October!
  • 07/2025. I will talk about Riemannian optimization and its provable merits for fine-tuning LLMs at EUROPT 2025.
  • 06/2025. I will talk about “LoRA surgery” at the Efficient Machine Learning Reading Group.
  • 06/2025. [New paper] LoRA does not use its allocated rank effectively. This can be addressed with PoLAR, a co-design of architecture and optimizer. Check out our paper.
  • 06/2025. [New paper] RefLoRA optimally rescales/refactorizes LoRA at each training step to make LLM fine-tuning faster. Check out our paper.
  • 06/2025. [New paper] Zeroth-order methods provably find flat minima. Check it out here.
  • 05/2025. [ICML 2025] Transfer learning provably benefits RLHF. Check out our paper.
  • 04/2025. Talked about “Fine-tuning LLMs cost-efficiently” at Peking University.
  • 01/2025. [ICLR 2025] We prove that initialization exponentially impacts the convergence behavior of ScaledGD on LoRA-type problems (i.e., linear → quadratic rates).
  • 12/2024. Talked about “Architecture-Aware Optimization” at ELLIS UnConference.
  • 12/2024. [ICASSP 2025] A new variant of SAM is released.
  • 09/2024. [NeurIPS 2024] We study the implicit regularization of sharpness-aware minimization (SAM) and make it explicit to alleviate SAM’s computational burden. The resulting approach is useful for fine-tuning LLMs with LoRA.
  • 05/2024. [ICML 2024] Memory-efficient private fine-tuning for LLMs. We also have a paper at the Theoretical Foundations of Foundation Models (TF2M) workshop.
  • 01/2024. Started as a postdoc at ETH Zurich, working with Prof. Niao He.
  • 12/2023. [ICASSP 2024] A universal ‘preconditioner’ for meta-learning.
  • 09/2023. [NeurIPS 2023] Improving generalization by refining the optimization of sharpness-aware minimization; see here.