Publications
MICA: An End-to-End Compiler Stack for Mesh Accelerators
Submitted to the 20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 26) OSDI 2026 (Under Review)
2026
SwarmX: A Scheduler Agent Framework for Large Agentic Workflow Clusters
Submitted to the 20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 26) OSDI 2026 (Under Review)
2026
ContextPilot: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse
9th Conference on Machine Learning and Systems (MLSys) MLSys 2026
2025
WaferLLM: Large Language Model Inference at Wafer Scale
19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25) OSDI 2025
2025
BioVLM: Evidence-Enriched Data Synthesis from Biomedical Papers
Submitted to ACL 2025 System Demonstrations ACL 2025 Demo (Under Review)
2025
LLM-Monitor: Efficient Privacy Violation Monitoring for LLMs
Submitted to ACL 2025 System Demonstrations ACL 2025 Demo (Under Review)
2025
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
Advances in Neural Information Processing Systems (NeurIPS) NeurIPS 2025
2024
ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24) OSDI 2024
2024
Symplectic Structure-Preserving Particle-in-Cell Whole-Volume Simulation of Tokamak Plasmas
SC21: International Conference for High Performance Computing, Networking, Storage and Analysis SC 2021
2021