
LightWeight

Run the world's heaviest AI models right on your everyday laptop. No expensive servers, no privacy worries.

Contributors: Abdullayev L (@zecoryx), Abu Bakr (@nafderlin)

01. Why LightWeight?

Running a massive AI like Llama or Qwen usually requires a $5,000 server. We built an engine that squeezes these giants so they run cool and fast on your normal laptop.

Model class          Previously needed                      With LightWeight
Big AI Model         64GB+ VRAM (costs ~$5,000+)            16GB RAM, your normal laptop
Ultra-Heavy Model    140GB+ VRAM (a massive server room)    32GB RAM, your personal computer
Giant Model (Kimi)   300GB+ VRAM (a massive server room)    24-32GB RAM, your personal computer

02. How is it possible?

01 / Built for Your PC

Our engine scans your RAM and GPU to create a custom 'fit'. It's like a tailor cutting a custom suit to your laptop's exact capabilities. (A toy sketch of the idea follows this list.)

02 / No Heat, No Noise

We intelligently limit CPU usage so your fans don't scream. Your laptop stays cool and silent even during deep AI thinking.

03 / Smart Memory Logic

LightWeight doesn't hog your RAM. If other apps need space, it instantly releases memory back to your computer. No more crashing.

04 / Truly Portable

One file, zero setup headaches. It works anywhere from a basic office laptop to a high-end gaming beast without extra drivers.
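To make the "custom fit" idea concrete, here is a toy Python sketch. This is not the actual LightWeight engine (its internals aren't published on this page); the psutil-based RAM check is real, but the quantization tier names and thresholds are illustrative assumptions.

# Toy illustration of hardware-aware "fitting" -- NOT the real LightWeight engine.
# Assumption: the quantization tiers and RAM thresholds below are made up.
import psutil

def pick_quantization() -> str:
    """Choose a hypothetical quantization level from currently available RAM."""
    avail_gb = psutil.virtual_memory().available / (1024 ** 3)
    if avail_gb >= 32:
        return "q8_0"  # plenty of headroom: higher-precision weights
    if avail_gb >= 16:
        return "q4_K"  # typical modern laptop: 4-bit weights
    return "q2_K"      # tight memory: aggressive compression

print(f"Suggested quantization: {pick_quantization()}")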

03. Real-world Speed

Your Device          AI Type                Feeling
Old Office Laptop    Medium (7B-8B)         Perfectly readable
Modern Laptop        Large (14B-32B)        Fast human typing
Gaming PC / Mac      Giant (70B-671B)       Instant responses

04. Simple Start

Windows (PowerShell):

$ irm https://lightweight.zecoryx.uz/install.ps1 | iex

macOS / Linux:

$ curl -fsSL https://lightweight.zecoryx.uz/install.sh | sh

Pull a model, then start chatting immediately:

$ lightweight pull qwen:32b
$ lightweight chat qwen:32b

View all downloaded models on your machine:

$ lightweight list

Remove a model to free up disk space:

$ lightweight rm llama3:8b

Turn your machine into an OpenAI-compatible local API endpoint:

$ lightweight serve --port 8000
# Server starts at http://0.0.0.0:8000

Send a request from any app:

$ curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "qwen:32b", "messages": [{"role": "user", "content": "Hello!"}]}'
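Because the server speaks the OpenAI API, standard OpenAI client libraries should work against it too. Here is a minimal sketch using the official openai Python package; the placeholder API key, and the assumption that a local server ignores it, are ours rather than from the LightWeight docs:

# Minimal sketch: calling the local LightWeight server via the openai package.
# Assumption: the server accepts any API key string (locally it is likely ignored).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # the local server started above
    api_key="not-needed",                 # placeholder; assumed to be ignored
)

response = client.chat.completions.create(
    model="qwen:32b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)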

Endpoints

POST /v1/chat/completions

GET /v1/models

GET /v1/health
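The two GET endpoints are handy for scripting. Here is a short sketch with the requests package; the exact response shapes are assumptions (we presume /v1/models follows the standard OpenAI list format, and we only check the HTTP status of /v1/health):

# Probe the read-only endpoints. Response shapes are assumed, not documented here.
import requests

BASE = "http://localhost:8000"

# Assumption: a 2xx status from /v1/health means the server is up.
print("Server healthy:", requests.get(f"{BASE}/v1/health").ok)

# Assumption: /v1/models returns the OpenAI list shape {"data": [{"id": ...}]}.
for model in requests.get(f"{BASE}/v1/models").json().get("data", []):
    print("Available model:", model.get("id"))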

Options

--port 8000 — custom port

--host 0.0.0.0 — network access

--help — all options

Check if your hardware can run a specific model:

$ lightweight check llama3:70b

Get detailed information about any model:

$ lightweight info qwen:32b

Configure global CLI settings:

$ lightweight config --edit

05. Coming Soon

01 / Voice Tasks

02 / Image to Text

03 / Image to Video

04 / Better CLI

05 / Improved Performance