Andrew Zhang

AWS Trainium inference

Virginia Tech BS/MS

azhang42 [at] vt [dot] edu

I currently work on accelerating inference on AWS Trainium devices. I completed a four-year BS/MS at Virginia Tech, advised by Dr. Chris Thomas. I am interested in multi-token generation (discrete diffusion and speculative decoding) and hardware-aware algorithms.

You can find me on LinkedIn, GitHub, X, and Google Scholar.

Get in Touch

Please email me at azhang42 [at] vt [dot] edu or DM me on X.

latest posts

selected publications

  1. Flexible-length text infilling for discrete diffusion models
    Andrew Zhang, Anushka Sivakumar, Chia-Wei Tang, and Chris Thomas
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
  2. SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models
    Anushka Sivakumar, Andrew Zhang, Zaber Ibn Abdul Hakim, and Chris Thomas
    In Findings of the Association for Computational Linguistics: EMNLP 2025 (EMNLP Findings), 2025
  3. Zero-Shot Fine-Grained Image Classification Using Large Vision-Language Models
    Md. Atabuzzaman, Andrew Zhang, and Chris Thomas
    In Findings of the Association for Computational Linguistics: EMNLP 2025 (EMNLP Findings), 2025
  4. Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval
    Hani Alomari, Anushka Sivakumar, Andrew Zhang, and Chris Thomas
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025