Andrew Zhang
AWS Trainium inference
Virginia Tech BS/MS
azhang42 [at] vt [dot] edu
I currently work on accelerating inference on AWS Trainium devices. I completed a four-year BS/MS at Virginia Tech under Dr. Chris Thomas. I am interested in multi-token generation (discrete diffusion and speculative decoding) and hardware-aware algorithms.
You can find me on LinkedIn, GitHub, X, and Google Scholar.
Get in Touch
Please email me at azhang42 [at] vt [dot] edu or DM me on X.
latest posts
| Jan 13, 2026 | Thought on TTT-E2E |
|---|---|
| Jan 06, 2026 | Recent Trend in Architectures |
| Dec 04, 2025 | Published 4 Papers! |
| Dec 04, 2025 | November & December 2025 Reading List |
| Oct 21, 2025 | Discrete Diffusion for Text Infilling |
| Mar 25, 2025 | Multimodal Paper Reviews |
selected publications
- Flexible-length text infilling for discrete diffusion models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
- SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models. In Findings of the Association for Computational Linguistics (EMNLP Findings), 2025
- Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models. In Findings of the Association for Computational Linguistics (EMNLP Findings), 2025
- Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025