Hi, I'm Lucas.

Current Track: AI Systems

Research Portfolio / Systems-Oriented Builder

AI & SYSTEMS

AI Systems / Efficient Inference / Hardware-Aware Execution

AI Systems Researcher and Engineer

I study and build efficient AI systems across models, inference pipelines, and hardware-aware execution. My current interests include multimodal learning, CUDA optimization, runtime scheduling, and edge-oriented AI infrastructure.

Efficient Inference / CUDA & Systems / Multimodal AI

Beyond model usage, toward systems thinking.

I am interested in how AI models actually run in practice, including inference efficiency, memory behavior, runtime scheduling, and the coupling between models and hardware.

My work ranges from CUDA kernel optimization and RAG-based code analysis to multimodal research and system-level inference design.

5+ Selected technical projects

1 Manuscript in preparation

50+ Technical blog posts

3.7/4.0 Master's CAP

Yiming Huang / Research + Engineering / Model to Hardware

Focus

Research Focus

My current focus is how modern AI models can run more efficiently in real systems, especially under constraints of memory, latency, and hardware resources.

From inference efficiency to compiler passes and hardware realities, I focus on how systems actually behave once deployed.

Efficient AI Inference

I care about real deployment behavior, not only benchmark numbers. That includes latency, memory movement, and inference efficiency under practical constraints.

Compiler and Runtime Co-optimization

I am interested in treating quantization, operator fusion, code generation, and runtime scheduling as one connected systems problem.

Hardware-Aware Systems Thinking

My perspective is shaped by memory hierarchy, data locality, register pressure, instruction throughput, and the realities of edge and GPU platforms.

Work

Selected Projects

These projects show how I approach AI systems problems through both research reasoning and hands-on implementation.

View Full Projects

AI Systems

CUDA GEMM Optimization and Architectural Analysis

Implemented and optimized GEMM kernels with shared memory tiling, register blocking, and profiling-guided analysis to improve arithmetic intensity and execution performance.

CUDA / Profiling / Memory Hierarchy

AI Systems

LLM + RAG Code Architecture Analysis System

Built a repository analysis system that combines LLMs, AST-based chunking, vector retrieval, and CUDA-aware parsing for structured code understanding.

LLM / RAG / AST

AI Systems

Multimodal Video Captioning Research

Designed a multimodal video-to-text pipeline with transformer-based alignment and efficiency-oriented system thinking.

Multimodal / ViT / Inference

Profile

Background

My background combines research-oriented study with practical engineering experience across data systems, maintenance, networking, and technical tool building.

Capability Thread

01 Master's research in multimodal AI and efficient inference
02 Publication manuscript under preparation
03 Engineering experience in data pipelines and automation
04 Earlier systems and network operations experience
05 Ongoing study in CSAPP, system programming, and performance analysis

Links

Materials

Currently focused on machine learning systems, low-level operator optimization, GPU programming, and AI infrastructure engineering.

Formal materials are kept accessible here alongside my technical output.

GitHub / Blog / Email

WeChat

Tech Insights

A small window into my ongoing technical notes, system-level observations, and engineering reflections. Scan the QR code to follow the account.

