MICA: An End-to-End Compiler Stack for Mesh Accelerators

Yeqi Huang, Congjie He, Haocheng Xiao, Yanwei Ye, Yi-Chieh Wang, Boyao Song, Ziming Miao, Lingxiao Ma, Fan Yang, Luo Mai

2026

Swarmpilot: A Scheduler Agent Framework for Large Agentic Workflow Clusters

Yeqi Huang, Yanwei Ye, Guomin Chen, Wenhao Su, Bin Gong, Jialian Li, Yao Fu, Yinsicheng Jiang, Xuan Sun, Le Xu, Luo Mai

2026

BARSA: An Adaptive Test-Time Scaling Strategy for Mathematical Reasoning under Global Compute Budgets

Yufan Zhao, Yinsicheng Jiang, Cheng Deng, Yeqi Huang, Tairan Xu, Zhan Lu, Luo Mai, Wenda Li

ICML 2026 Workshop

2026

Ryze: Evidence-Enriched Data Synthesis from Biomedical Papers

Yeqi Huang, Yue Chen, Yanwei Ye, Guanhao Su, Luo Mai

ACL 2026 Demo

2026

ContextPilot: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai

MLSys 2026

2025

WaferLLM: Large Language Model Inference at Wafer Scale

Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai

OSDI 2025

2025

LLM-Monitor: Efficient Privacy Violation Monitoring for LLMs

Chuanming Zha, Jiamin Zheng, Yeqi Huang, et al.

2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

Yao Fu, Yinsicheng Jiang, Yeqi Huang, Ping Nie, Zhan Lu, Leyang Xue, Congjie He, Man-Kit Sit, Jilong Xue, Li Dong, Ziming Miao, Dayou Du, Tairan Xu, Kai Zou, Edoardo Ponti, Luo Mai

NeurIPS 2025

2024

ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

OSDI 2024

2024

Symplectic Structure-Preserving Particle-in-Cell Whole-Volume Simulation of Tokamak Plasmas

Xiao-Long Chen, Lin-Feng Wang, Yeqi Huang, et al.

SC 2021

2021