-
Thoughts on TTT-E2E
Just read the TTT-E2E paper and had some informal thoughts.
-
Recent Trend in Architectures
A recurring recipe seems to be “more parallel computation paths, each narrower.”
-
Published 4 Papers!
A quick update on four 2025 papers.
-
November & December 2025 Reading List
This November and December, I was interested in MoE efficiency, speculative decoding, and efficient attention methods.
-
Discrete Diffusion for Text Infilling
Project page and paper for flexible-length text infilling with discrete diffusion models.
-
Multimodal Paper Reviews
Paper reviews from my graduate multimodal class.