chen-ace/LLM-Prefill-Decode-Benchmark

By chen-ace

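Experimentally compares the throughput of the Prefill and Decoding stages of LLM inference, exposes the performance bottleneck, and explains the principle behind PD (Prefill/Decode) disaggregation. Includes test scripts for CUDA and Apple MPS (M-series chips).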
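The gap between the two stages can be illustrated with a back-of-the-envelope roofline estimate. The sketch below is not taken from the repository's scripts: the function names, the 7B-parameter model size, and the hardware figures (100 TFLOP/s peak compute, 1 TB/s memory bandwidth, 2 bytes per parameter) are all hypothetical, and the model ignores attention FLOPs and KV-cache traffic. It only shows why a batch-1 decode step tends to be limited by memory bandwidth while prefill tends to be limited by compute.

```python
def forward_pass_seconds(tokens, params, bytes_per_param, peak_flops, mem_bandwidth):
    """Roofline-style lower bound on the time of one forward pass over `tokens` tokens."""
    # Roughly 2 FLOPs (one multiply, one add) per parameter per token.
    compute_s = 2.0 * params * tokens / peak_flops
    # The weights are streamed from memory once per pass, however many tokens share it.
    memory_s = params * bytes_per_param / mem_bandwidth
    # The pass cannot finish faster than the slower of the two limits.
    return max(compute_s, memory_s)


def estimate_throughput(prompt_len, params=7e9, bytes_per_param=2,
                        peak_flops=100e12, mem_bandwidth=1e12):
    """Return (prefill_tokens_per_s, decode_tokens_per_s) for batch size 1.

    All default values are assumed, illustrative numbers, not measurements.
    """
    hw = (params, bytes_per_param, peak_flops, mem_bandwidth)
    prefill_s = forward_pass_seconds(prompt_len, *hw)  # whole prompt in one pass
    decode_s = forward_pass_seconds(1, *hw)            # one new token per pass
    return prompt_len / prefill_s, 1.0 / decode_s


if __name__ == "__main__":
    prefill_tps, decode_tps = estimate_throughput(prompt_len=512)
    print(f"prefill: {prefill_tps:,.0f} tok/s, decode: {decode_tps:,.0f} tok/s")
```

Under these assumed numbers the prefill pass is compute-bound and each decode step is bandwidth-bound, so the estimated prefill throughput comes out about 100 times higher. That asymmetry is the usual motivation for PD disaggregation: the two stages are served on separate resources so each can be sized and scheduled for its own bottleneck.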

GitHub Python MIT License Updated 22 May 2026

About this open-source project

Default Branch main
