publications

Since I don’t find the typical CV-style dump of publication references with no elaboration particularly helpful for actually getting to know someone, I’ve instead provided tl;drs and topic identifiers for a selection of my works below; this should give you a proper, more self-contained overview of me.

For a full publication list, please refer to my Google Scholar — be warned that a large share of my citations comes from surveys I didn’t lead.

Selected Publications

2025

  1. Preprint
    Position: Want Better ML Reviews? Stop Asking Nicely and Start Incentivizing with a Credit System
    Shaochen (Henry) Zhong
  2. Preprint
    Sweeping Promptable Spoofs under the DirtyRAG: A Practical, Query-Blind RAG Attack Done Right
    Shaochen (Henry) Zhong*, Jiamu Zhang*, Hoang Anh Duy Le, and 15 more authors
  3. Preprint
    FAFO: Lossless KV Cache Compression with Draftless Fumble Decoding
    Hoang Anh Duy Le*, Shaochen (Henry) Zhong*, Yifan Lu, and 7 more authors
  4. EMNLP 2025 Main Oral
    Word Salad Chopper: Reasoning Models Waste A Ton Of Decoding Budget On Useless Repetitions, Self-Knowingly
    Wenya Xie*, Shaochen (Henry) Zhong*, Hoang Anh Duy Le, and 3 more authors
  5. EMNLP 2025 Findings
    LoRATK: LoRA Once, Backdoor Everywhere in the Share-and-Play Ecosystem
    Hongyi Liu*, Shaochen (Henry) Zhong*, Xintong Sun*, and 12 more authors
  6. NeurIPS 2025
    70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
    Tianyi Zhang, Mohsen Hariri, Shaochen (Henry) Zhong, and 4 more authors

2024

  1. ICLR 2025 Spotlight
    MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations
    Shaochen (Henry) Zhong*, Yifan Lu*, Lize Shao, and 13 more authors
  2. EMNLP 2025 Findings
    KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
    Jiayi Yuan*, Hongyi Liu*, Shaochen (Henry) Zhong*, and 9 more authors
  3. ICML 2024
    KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
    Zirui Liu*, Jiayi Yuan*, Hongye Jin, and 5 more authors
  4. EMNLP 2024 Main
    Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion
    Guanchu Wang*, Yu-Neng Chuang*, Ruixiang Tang, and 8 more authors
  5. ICML 2024
    GNNs Also Deserve Editing, and They Need It More Than Once
    Shaochen (Henry) Zhong*, Hoang Anh Duy Le*, Zirui Liu, and 10 more authors

2023

  1. NeurIPS 2023
    One Less Reason for Filter Pruning: Gaining Free Adversarial Robustness with Structured Grouped Kernel Pruning
    Shaochen (Henry) Zhong, Zaichuan You, Jiamu Zhang, and 7 more authors

2021

  1. ICLR 2022
    Revisit Kernel Pruning with Lottery Regulated Grouped Convolutions
    Shaochen (Henry) Zhong, Guanqun Zhang, Ningjia Huang, and 1 more author