Research Overview
This page captures the current technical direction behind the site: efficient inference, system optimization, and hardware-aware execution. My research centers on efficient AI inference, especially on how model-level choices interact with compiler passes, runtime policies, and hardware realities.

Theme: Memory Wall in Edge Inference
I am interested in how transformer-based perception modules suffer from memory-bandwidth and locality bottlenecks on edge devices such as Jetson-class platforms.
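To make the memory-wall framing concrete, here is a minimal roofline check in Python. The peak-throughput and bandwidth numbers are illustrative placeholders I chose for the sketch, not measured Jetson figures:

```python
# Minimal roofline sketch: is a kernel compute-bound or memory-bound?
# Hardware numbers used below are illustrative assumptions, not specs.

def roofline_bound(flops, bytes_moved, peak_flops, peak_bw):
    """Classify a kernel and return (bound, attainable FLOP/s)."""
    intensity = flops / bytes_moved        # arithmetic intensity (FLOPs/byte)
    ridge = peak_flops / peak_bw           # intensity where the roofline bends
    attainable = min(peak_flops, intensity * peak_bw)
    bound = "memory" if intensity < ridge else "compute"
    return bound, attainable

# Example: a 1024 x 1024 square GEMM in fp16.
n = 1024
flops = 2 * n**3                           # one multiply-add per output MAC
bytes_moved = 3 * n * n * 2                # read A, B and write C once, 2 B/elem
bound, perf = roofline_bound(flops, bytes_moved,
                             peak_flops=10e12,  # assumed 10 TFLOP/s peak
                             peak_bw=100e9)     # assumed 100 GB/s DRAM BW
print(bound)  # -> compute (intensity ~341 FLOPs/B, ridge at 100)
```

The same check flags a low-intensity op such as an elementwise add as memory-bound, which is the regime most transformer inference on edge devices lives in.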
Theme: Quantization and Fusion as a Shared Search Space
Rather than treating quantization and fusion as isolated stages, I want to study them as a shared search space shaped by instruction-level and hardware-level constraints.
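A toy sketch of what "shared search space" means in practice: enumerate per-layer bitwidth and fusion choices jointly, score them with one cost model, and pick the fastest configuration within an accuracy budget. All cost and accuracy numbers here are made up for illustration:

```python
# Toy joint search over (bitwidth, fusion) per layer.
# Latency and accuracy models are assumed, not measured.
from itertools import product

LAYERS = ["attn", "mlp"]
BITWIDTHS = [8, 4]
FUSE = [False, True]

def latency_ms(cfg):
    """Assumed cost model: lower bitwidth moves fewer bytes;
    fusion removes an intermediate DRAM round-trip."""
    total = 0.0
    for layer, (bits, fused) in cfg.items():
        traffic = {"attn": 4.0, "mlp": 6.0}[layer] * bits / 8
        if fused:
            traffic *= 0.7                 # assumed fusion saving
        total += traffic
    return total

def accuracy_drop(cfg):
    """Assumed: 4-bit costs 0.5 points per layer, 8-bit is free."""
    return sum(0.5 for bits, _ in cfg.values() if bits == 4)

def search(budget=0.5):
    """Exhaustively pick the fastest config within the accuracy budget."""
    best = None
    for choice in product(product(BITWIDTHS, FUSE), repeat=len(LAYERS)):
        cfg = dict(zip(LAYERS, choice))
        if accuracy_drop(cfg) <= budget:
            lat = latency_ms(cfg)
            if best is None or lat < best[0]:
                best = (lat, cfg)
    return best
```

Even in this tiny space the winner spends its one allowed 4-bit layer on the heaviest operator and fuses everything, a coupling that a stage-by-stage pipeline (quantize first, fuse later) would not see.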
Theme: Runtime Policies and Deterministic Execution
I care about how runtime policies, especially CPU-GPU coordination and dynamic scheduling, affect the determinism of inference timing in resource-constrained systems.
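As a small illustration of why the overlap policy matters, here is a sketch comparing a strictly sequential CPU-then-GPU loop with an overlapped one, under assumed constant stage times (all numbers hypothetical; real stage times vary and are where the determinism question gets interesting):

```python
# Sketch: per-frame completion times under two CPU-GPU policies.
# Stage durations are assumed constants, not measurements.

def sequential(cpu_ms, gpu_ms, frames):
    """CPU preprocesses, then GPU runs inference, one frame at a time."""
    t, done = 0.0, []
    for _ in range(frames):
        t += cpu_ms + gpu_ms
        done.append(t)
    return done

def pipelined(cpu_ms, gpu_ms, frames):
    """CPU preprocessing overlaps GPU inference
    (unbounded staging queue assumed for simplicity)."""
    cpu_free, gpu_free, done = 0.0, 0.0, []
    for _ in range(frames):
        cpu_done = cpu_free + cpu_ms       # preprocessing finishes
        cpu_free = cpu_done                # CPU moves on to the next frame
        start = max(cpu_done, gpu_free)    # GPU needs input and a free engine
        gpu_free = start + gpu_ms
        done.append(gpu_free)
    return done

print(sequential(2.0, 5.0, 4)[-1])  # 28.0
print(pipelined(2.0, 5.0, 4)[-1])   # 22.0
```

With fixed stage times both policies are perfectly deterministic; the overlapped one simply settles to the period of the slower stage. The research question is what happens when stage times jitter and a dynamic scheduler reorders or batches work.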
Agenda
Outputs