Ji Shiyu

CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis

AAAI 2026

Xu, Yuzhuang and Han, Xu and Zhang, Yuanchi and Wang, Yixuan and Liu, Yijun and Ji, Shiyu and Zhu, Qingfu and Che, Wanxiang

Judge Q: Trainable Queries for Optimized Information Retention in KV Cache Eviction

AAAI 2026

Liu, Yijun and Wang, Yixuan and Xu, Yuzhuang and Ji, Shiyu and Xu, Yang and Zhu, Qingfu and Che, Wanxiang

CRVQ: Channel-Relaxed Vector Quantization for Extreme Compression of LLMs

Transactions of the Association for Computational Linguistics

Xu, Yuzhuang and Ji, Shiyu and Zhu, Qingfu and Che, Wanxiang

Lookahead Q-Cache: Achieving More Consistent KV Cache Eviction via Pseudo Query

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 34146–34162, 2025.

Wang, Yixuan and Ji, Shiyu and Liu, Yijun and Xu, Yuzhuang and Xu, Yang and Zhu, Qingfu and Che, Wanxiang
