#Quantization
2 entries in total
Papers (2)
BinaryAttention
Prune-then-Quantize or Quantize-then-Prune? Understanding the Impact of Compression Order in Joint Model Compression