MICA: An End-to-End Compiler Stack for Mesh Accelerators

Yeqi Huang, Congjie He, Haocheng Xiao, Yanwei Ye, Yi-Chieh Wang, Boyao Song, Ziming Miao, Lingxiao Ma, Fan Yang, Luo Mai

Submitted to the 20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 26); under review

2026

SwarmX: A Scheduler Agent Framework for Large Agentic Workflow Clusters

Yeqi Huang, Yanwei Ye, Guomin Chen, Wenhao Su, Bin Gong, Jialian Li, Yao Fu, Yinsicheng Jiang, Xuan Sun, Le Xu, Luo Mai

Submitted to the 20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 26); under review

2026

ContextPilot: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai

9th Conference on Machine Learning and Systems (MLSys 2026)

2025

WaferLLM: Large Language Model Inference at Wafer Scale

Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai

19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)

2025

BioVLM: Evidence-Enriched Data Synthesis from Biomedical Papers

Yeqi Huang, Yue Chen, Yanwei Ye, et al.

Submitted to ACL 2025 System Demonstrations; under review

2025

LLM-Monitor: Efficient Privacy Violation Monitoring for LLMs

Chuanming Zha, Jiamin Zheng, Yeqi Huang, et al.

Submitted to ACL 2025 System Demonstrations; under review

2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

Yao Fu, Yinsicheng Jiang, Yeqi Huang, Ping Nie, Zhan Lu, Leyang Xue, Congjie He, Man-Kit Sit, Jilong Xue, Li Dong, Ziming Miao, Dayou Du, Tairan Xu, Kai Zou, Edoardo Ponti, Luo Mai

Advances in Neural Information Processing Systems (NeurIPS 2025)

2024

ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24)

2024

Symplectic Structure-Preserving Particle-in-Cell Whole-Volume Simulation of Tokamak Plasmas

Xiao-Long Chen, Lin-Feng Wang, Yeqi Huang, et al.

SC21: International Conference for High Performance Computing, Networking, Storage and Analysis

2021