patilyashvardhan2002-byte/lazy-moe

By patilyashvardhan2002-byte

The GPU-free LLM inference engine. Combines lazy expert loading + TurboQuant KV compression to run models that shouldn't fit on your hardware. Built from scratch, fully local, zero cloud.

GitHub · Python · Updated 25 May 2026

Live Snapshot

License: Unknown

Type: Python

About this open-source project

Default Branch: main

Project Details

Source: GitHub

Owner: patilyashvardhan2002-byte

License: Unknown

Updated: 25 May 2026
