Yeqi Huang

黄业琦

PhD Student in Serverless Systems

University of Edinburgh

About Me

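I am Yeqi Huang, a PhD student in the Edinburgh-AISys research group at the University of Edinburgh, working on AI systems. My research aims to improve system capabilities to support large-scale AI applications, covering both training and inference. I enjoy taking part in all kinds of AI projects. However, today's AI is still not powerful enough. My goal is to make AI more accessible to everyone and to have it play a substantive role in real industrial applications.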

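Achieving this ambitious goal requires a systematic approach. My previous research focused on problems in high-performance computing, which gave me a deep understanding of frontier problems in science and industry. To bring AI into these fields, we need to serve larger AI models and improve the performance of model training and inference. To address this, I work at a finer granularity and focus on improving systems and compiler infrastructure. My recent research targets the latest generation of AI chips with 2D mesh architectures, such as TPU and Cerebras.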

In addition, I am an open-source developer. I enjoy taking part in hackathons and sharing my ideas on GitHub. Here is my blog: Personal Blog

Skills

Technical

Machine Learning / Deep Learning
Data Science
Systems / Architecture
Physics

Hobbies

Hiking
Reading
Writing
Cloud herding (云牧)

Research Interests

  • Computer Systems
  • Distributed Machine Learning
  • Serverless Systems

Education

PhD in Serverless Systems, 2023

University of Edinburgh

BSc in Computer Science, 2017

University of Science and Technology of China

Recent Publications

SwarmX: A Scheduler Agent Framework for Large Agentic Workflow Clusters

Yeqi Huang, Yanwei Ye, Guomin Chen, Wenhao Su, Bin Gong, Jialian Li, Yao Fu, Yinsicheng Jiang, Xuan Sun, Le Xu, Luo Mai

Submitted to the 20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 26), under review

2026

MICA: An Efficient Compiler for Mesh-Based AI Accelerators

Yeqi Huang, Congjie He, Haocheng Xiao, Yanwei Ye, Yi-Chieh Wang, Boyao Song, Ziming Miao, Lingxiao Ma, Fan Yang, Luo Mai

Submitted to the 20th USENIX Symposium on Operating Systems Design and Implementation (OSDI 26), under review

2026

RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai

Submitted to the 8th Conference on Machine Learning and Systems (MLSys 2026), under review

2025

WaferLLM: Large Language Model Inference at Wafer Scale

Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai

19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)

2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

Yinsicheng Jiang, Yao Fu, Yeqi Huang, Ping Nie, Zhan Lu, Leyang Xue, Congjie He, Man-Kit Sit, Jilong Xue, Li Dong, Ziming Miao, Dayou Du, Tairan Xu, Kai Zou, Edoardo Ponti, Luo Mai

arXiv preprint

2024

(OSDI 2024) ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

2024

Projects

BTMR-Paper

Insanely Fast Paper Reading Tool - An AI-powered web application for extracting, analyzing, and summarizing academic papers

AI, LLM, Research, Python, React

Bili-Investigate

A Streamlit-based web application for tracking Bilibili content creators' video updates with smart incremental fetching

Python, Web Application, Streamlit, Data Collection

YA-PapersWithCode

Yet Another Papers With Code - A modern recreation of the Papers With Code platform with AI-powered semantic search

AI, LLM, Research, Python, React, TypeScript

ContextKeeper

AI assistant for RTX GPU users with extensible plugin ecosystem - A community fork of NVIDIA G-Assist

AI, Python, C++, Plugin System, Voice Assistant

Autoreader

Semantic search over daily arXiv papers.

LLM, VectorDB

Nerif

LLM-powered Python: write Python code with natural language.

LLM

Contact

yeqi.huang@ed.ac.uk

10 Crichton Street, Edinburgh, United Kingdom

Informatics Forum Room 1.43