Everything
-
Thought on TTT-E2E
-
Recent Trend in Architectures
A recurring recipe seems to be "more parallel computation paths, each narrower."
-
November & December 2025 Reading List
This November and December, I was interested in MoE efficiency, speculative decoding, and efficient attention methods.
-
Published 4 Papers!
-
Discrete Diffusion for Text Infilling
-
Multimodal Paper Reviews