Yeqi Huang

黄业琦

PhD Student in Serverless Systems

University of Edinburgh

Biography

I am Yeqi Huang, a PhD student specializing in AI systems at the Edinburgh-AISys Group. My research focuses on improving the systems that support large-scale AI applications, covering both training and inference. I enjoy working on a wide range of AI projects. However, today's AI is still not capable enough. My goal is to make AI accessible to everyone and to make it genuinely useful in practical industrial applications.

This ambitious goal requires a methodical approach. My earlier research focused on High Performance Computing, so I am familiar with the cutting-edge problems in real-world science and industry. To bring AI into those fields, we need to serve larger AI models and to improve the performance of model training and inference. I approach this problem at a lower level, by improving the system and compiler infrastructure. My recent research focuses on the latest generation of AI chips with 2D mesh architectures, such as TPU and Cerebras.

BTW, I am an open-source developer. I love taking part in hackathons and sharing my ideas on GitHub. This is my blog: Personal Blog

Skills

TECHNICAL

MACHINE LEARNING/DEEP LEARNING
DATA SCIENCE
SYSTEM/ARCHITECTURE
PHYSICS

HOBBIES

HIKING
READING
WRITING
CLOUDHERD

Interests

  • Computer Systems
  • Distributed Machine Learning
  • Serverless Systems

Education

PhD in Serverless System, 2023

University of Edinburgh

BSc in Computer Science, 2017

University of Science and Technology of China

Recent Publications

RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai

Submitted to the 8th Conference on Machine Learning and Systems (MLSys 2026), under review

2025

WaferLLM: Large Language Model Inference at Wafer Scale

Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai

19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25)

2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems

Yinsicheng Jiang, Yao Fu, Yeqi Huang, Ping Nie, Zhan Lu, Leyang Xue, Congjie He, Man-Kit Sit, Jilong Xue, Li Dong, Ziming Miao, Dayou Du, Tairan Xu, Kai Zou, Edoardo Ponti, Luo Mai

arXiv preprint

2024

(OSDI 2024) ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

2024

Projects

BTMR-Paper

Insanely Fast Paper Reading Tool - An AI-powered web application for extracting, analyzing, and summarizing academic papers

AI, LLM, Research, Python, React

Bili-Investigate

A Streamlit-based web application for tracking Bilibili content creators' video updates with smart incremental fetching

Python, Web Application, Streamlit, Data Collection

YA-PapersWithCode

Yet Another Papers With Code - A modern recreation of the Papers With Code platform with AI-powered semantic search

AI, LLM, Research, Python, React, TypeScript

ContextKeeper

AI assistant for RTX GPU users with extensible plugin ecosystem - A community fork of NVIDIA G-Assist

AI, Python, C++, Plugin System, Voice Assistant

Autoreader

Semantic search over daily arXiv papers.

LLM, VectorDB

Nerif

LLM-powered Python: write Python code using natural language.

LLM

Contact

yeqi.huang@ed.ac.uk

10 Crichton Street, Edinburgh, United Kingdom

Informatics Forum Room 1.43