changjonathanc/flex-nano-vllm
By changjonathanc
FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference.
Python
MIT License
Updated 25 May 2026
Live Snapshot
Stars: 347
Forks: 19
License: MIT License
Type: Python
About this open-source project
Live information fetched from GitHub.
Default Branch: main
Open Issues: 3
Watchers: 347