Popular repositories

  1. dflash (Public)

    DFlash: Block Diffusion for Flash Speculative Decoding

    Python 5.2k 374

  2. paroquant (Public)

    [ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference

    Python 309 30

  3. sparselora (Public)

    [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity

    Python 76 6

  4. flash-colreduce (Public)

    Fast, memory-efficient attention column reduction (e.g., sum, mean, max)

    Python 48 2
