I am Weiwei Qi (齐巍巍), a second-year PhD student in Cyberspace Security at Zhejiang University, advised by Tianhang Zheng, Zhan Qin, and Kui Ren. I study how to make large language models safer and more reliable, working on both attack and defense: jailbreak attacks, safety alignment, and efficient post-training for LLMs and agents. My goal is to help build safe, reliable, and trustworthy intelligent systems.

LLM Safety | Safety Alignment | Reinforcement Learning | Agent Safety | Jailbreak | Red-Teaming

🔥 News

  • 🎉 2026.04.06: Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models was accepted to ACL 2026 Findings.
  • 🥳 2025.11.08: MAJIC: Markovian Adaptive Jailbreaking via Iterative Composition of Diverse Innovative Strategies was accepted to AAAI 2026.

📝 Publications

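Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models
Weiwei Qi, Zefeng Wu, Tianhang Zheng, Zikang Zhang, Xiaojun Jia, Zhan Qin, Kui Ren
ACL 2026 Findings (CCF-A) | First Author

MAJIC: Markovian Adaptive Jailbreaking via Iterative Composition of Diverse Innovative Strategies
Weiwei Qi, Shuo Shao, Wei Gu, Tianhang Zheng, Puning Zhao, Zhan Qin, Kui Ren
AAAI 2026 (CCF-A) | First Author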

🎓 Education

  • Zhejiang University: PhD in Cyberspace Security (ongoing, Year 2).
  • Harbin Institute of Technology: BEng in Computer Science and Technology.

🛠️ Projects

  • 2026.04: Auto-Resubmit is an open-source tool for lossless migration of LaTeX paper projects across conference templates such as ACL, NeurIPS, ICML, ICLR, CVPR, and AAAI. It helps automate content extraction, template reassembly, compilation, and packaging for paper resubmission.
  • 2025.06: DeepSeek-R1-Safe is a safety-enhanced reasoning LLM jointly developed with Huawei on Ascend and MindSpeed-LLM. I contributed to 1024-GPU large-scale training, safety data construction, safety-oriented supervised fine-tuning, and scalable deployment. The project emphasizes multidimensional safety corpus design, safety core reasoning pre-alignment, dynamic efficiency compensation during SFT, and fine-grained safety RL to jointly optimize safety, alignment, and general reasoning capability.

🏆 Awards

  • 2025.10: Outstanding Graduate Student (Academic Innovation Award).
  • 2023.05: Outstanding Prize, 2nd National College Students Olympic Mathematics Competition (Summer).
  • 2023.03: Second Prize, National College Students Mathematics Competition Finals.
  • 2021.11: First Prize, 13th National College Students Mathematics Competition.
  • 2021.11: First Prize (Heilongjiang), National College Students Mathematical Contest in Modeling.
  • 2021.02 and 2022.01: Honorable Mention, MCM/ICM.