🐙 GitHub Detail
patilyashvardhan2002-byte/lazy-moe
By patilyashvardhan2002-byte
A GPU-free LLM inference engine. It combines lazy expert loading with TurboQuant KV compression to run models that would not otherwise fit on your hardware. Built from scratch, fully local, zero cloud.
Live Snapshot
Stars: 23
Forks: 4
License: Unknown
Type: Python
About this open-source project
Live information fetched from GitHub.
Default Branch: main
Open Issues: 0
Watchers: 23