Quick Start
This tutorial walks you through setting up Gambi from scratch. By the end, you’ll have a hub running with participants sharing LLMs on your network.
What You’ll Need
- A machine to run the hub: any computer on the network (it doesn’t need a GPU, it just routes traffic)
- At least one LLM endpoint: Ollama, LM Studio, vLLM, or any OpenAI-compatible API
- Node.js only if you plan to install the CLI through npm
Installation
The CLI lets you start hubs, create rooms, and join as a participant. Pre-built binaries are available for Linux (x64/arm64), macOS (Apple Silicon and Intel), and Windows (x64).

Linux and macOS:

    curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/install.sh | bash

The script auto-detects your OS and architecture, downloads the correct binary, and installs it to /usr/local/bin.

Windows (PowerShell):

    irm https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/install.ps1 | iex

The script downloads the binary to %LOCALAPPDATA%\gambi\ and adds it to your user PATH.

npm or Bun:

    # Via npm
    npm install -g gambi

    # Via bun
    bun add -g gambi

The published gambi package is a wrapper that installs only the matching platform binary for your machine. npm users only need Node.js; Bun is not required at runtime.
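The platform list above can be expressed as a small lookup, which is handy if you want to check a machine before running an installer. This is only a sketch: `SUPPORTED_TARGETS` and `hasPrebuiltBinary` are hypothetical names, not part of Gambi, and the keys assume Node.js naming for platforms and architectures (`darwin`, `win32`, `x64`, `arm64`).

```typescript
// Supported targets, taken from the list of pre-built binaries above.
// Keys follow Node.js naming (process.platform / process.arch); this mapping is an assumption.
const SUPPORTED_TARGETS: Record<string, string[]> = {
  linux: ["x64", "arm64"],
  darwin: ["x64", "arm64"], // macOS on Intel and Apple Silicon
  win32: ["x64"],
};

// Hypothetical helper: true when the docs list a pre-built binary for this pair.
function hasPrebuiltBinary(platform: string, arch: string): boolean {
  return (SUPPORTED_TARGETS[platform] ?? []).includes(arch);
}
```

In Node.js you could call `hasPrebuiltBinary(process.platform, process.arch)` to check the current machine.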
Verify the installation:

    gambi --version

Uninstallation
Linux and macOS:

    curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/uninstall.sh | bash

Windows (PowerShell):

    irm https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/uninstall.ps1 | iex

npm or Bun:

    # If installed via npm
    npm uninstall -g gambi

    # If installed via bun
    bun remove -g gambi

The SDK provides Vercel AI SDK integration for using shared LLMs in your TypeScript/JavaScript applications. It defaults to the Responses API and also supports Chat Completions.

    npm install gambi-sdk
    # or
    bun add gambi-sdk

Basic Usage
All CLI commands support interactive mode: run a command without flags and you’ll be guided through each option. Flags still work for scripting.
1. Start the Hub Server
Pick a machine on the network to be the hub. It doesn’t need a GPU, because the hub only routes requests between participants.

    # Interactive: prompts for port, host, mDNS
    gambi hub serve

    # Or with flags:
    gambi hub serve --port 3000 --mdns

The --mdns flag enables auto-discovery so other machines on the network can find the hub automatically.
2. Create a Room

    # Interactive: prompts for room name and password
    gambi room create

    # Or with flags:
    gambi room create --name "My Room"
    # Output: Room created! Code: ABC123

Share this code with everyone who wants to join, via chat, projector, sticky note, whatever works.
You can also create a password-protected room:

    gambi room create --name "Private Room" --password secret123

3. Join with Your LLM
Each person with an LLM endpoint joins the room:

    gambi participant join \
      --room ABC123 \
      --participant-id joao-1 \
      --model llama3

With flags, the default endpoint is http://localhost:11434 (Ollama). For other providers, use --endpoint:

    # LM Studio
    gambi participant join \
      --room ABC123 \
      --participant-id joao-lmstudio \
      --model mistral \
      --endpoint http://localhost:1234

    # vLLM
    gambi participant join \
      --room ABC123 \
      --participant-id joao-vllm \
      --model llama3 \
      --endpoint http://localhost:8000

The CLI probes your local endpoint, detects available models and protocol capabilities, registers the participant, opens a tunnel back to the hub, and keeps the session alive until interrupted.
Important implication: your provider endpoint can stay on localhost, even when the hub is running on another machine in the same trusted network. You no longer need to publish a LAN-reachable provider URL just to join a room. For the reasoning behind this, see How Tunnels Work.
It also shares your machine specs (CPU, RAM, GPU) automatically. Use --no-specs if you prefer not to share them.
Once joined, your LLM is available to everyone in the room.
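If you script the join step for several machines, the flags shown above can be assembled programmatically. The following is a minimal sketch, not part of Gambi: `buildJoinArgs`, `JoinOptions`, and `EXAMPLE_ENDPOINTS` are hypothetical names, and the endpoint URLs are simply the ones used in the examples in this section.

```typescript
// Local endpoints as used in the examples above (your ports may differ).
const EXAMPLE_ENDPOINTS = {
  ollama: "http://localhost:11434",
  lmstudio: "http://localhost:1234",
  vllm: "http://localhost:8000",
} as const;

type Provider = keyof typeof EXAMPLE_ENDPOINTS;

interface JoinOptions {
  room: string;
  participantId: string;
  model: string;
  provider: Provider;
  shareSpecs?: boolean; // set to false to add --no-specs
}

// Hypothetical helper: returns the argument list for the `gambi` binary.
function buildJoinArgs(opts: JoinOptions): string[] {
  const args: string[] = [
    "participant", "join",
    "--room", opts.room,
    "--participant-id", opts.participantId,
    "--model", opts.model,
  ];
  // Ollama is the documented default endpoint, so --endpoint is only added for other providers.
  if (opts.provider !== "ollama") {
    args.push("--endpoint", EXAMPLE_ENDPOINTS[opts.provider]);
  }
  if (opts.shareSpecs === false) {
    args.push("--no-specs");
  }
  return args;
}
```

In Node.js, the result could be passed to `spawn("gambi", buildJoinArgs(...), { stdio: "inherit" })` from `node:child_process`, which keeps the session attached to your terminal until you interrupt it.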
4. Use the SDK
Now anyone can use the shared LLMs from their code:

    import { createGambi } from "gambi-sdk";
    import { generateText } from "ai";

    const gambi = createGambi({
      roomCode: "ABC123",
      hubUrl: "http://localhost:3000",
    });

    // Send to any available participant
    const result = await generateText({
      model: gambi.any(),
      prompt: "Hello, Gambi!",
    });

    console.log(result.text);

You can also target specific models or participants:

    // Use a specific model
    const byModel = await generateText({
      model: gambi.model("llama3"),
      prompt: "Explain quantum computing",
    });

    // Use a specific participant (the ID given to --participant-id when joining)
    const byParticipant = await generateText({
      model: gambi.participant("joao-1"),
      prompt: "Write a haiku",
    });

To use Chat Completions instead of the default Responses API:

    const gambi = createGambi({
      roomCode: "ABC123",
      hubUrl: "http://localhost:3000",
      defaultProtocol: "chatCompletions",
    });

5. Use the API Directly
No SDK needed. The hub is an OpenAI-compatible API, so you can use it from any language or tool:

    curl -X POST http://localhost:3000/rooms/ABC123/v1/responses \
      -H "Content-Type: application/json" \
      -d '{ "model": "*", "input": "Hello!" }'

Any tool that accepts a custom OpenAI base URL works: Lovable, Cursor, Open WebUI, Python’s openai library, etc. Just point it at:

    http://<hub-ip>:<port>/rooms/<ROOM_CODE>/v1

See the API Reference for all available endpoints.
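The same request can be made from TypeScript without the SDK. The sketch below mirrors the curl example and the base URL format shown in this section; `roomBaseUrl` and `sendPrompt` are hypothetical helper names, and the response is returned as `unknown` because its shape is not described here. It assumes a runtime with a global `fetch` (for example Node.js 18 or later).

```typescript
// Builds the room-scoped base URL: http://<hub-ip>:<port>/rooms/<ROOM_CODE>/v1
function roomBaseUrl(hubHost: string, port: number, roomCode: string): string {
  return `http://${hubHost}:${port}/rooms/${roomCode}/v1`;
}

// Sends one prompt to the room's /responses endpoint, using the same JSON body as the curl example.
async function sendPrompt(baseUrl: string, input: string): Promise<unknown> {
  const response = await fetch(`${baseUrl}/responses`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "*", input }),
  });
  if (!response.ok) {
    throw new Error(`Hub returned HTTP ${response.status}`);
  }
  return response.json();
}
```

For example, `await sendPrompt(roomBaseUrl("localhost", 3000, "ABC123"), "Hello!")` targets the same URL as the curl command.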
Next Steps
- Learn about CLI commands
- Explore SDK usage
- See the full API Reference
- Using cloud LLMs? See Remote Providers
- Building multi-LLM experiences? See Multi-LLM Patterns