
Quick Start

This tutorial walks you through setting up Gambi from scratch. By the end, you’ll have a hub running with participants sharing LLMs on your network. Here’s what you’ll need:

  • A machine to run the hub — any computer on the network (doesn’t need a GPU, it just routes traffic)
  • At least one LLM endpoint — Ollama, LM Studio, vLLM, or any OpenAI-compatible API
  • Node.js only if you plan to install the CLI through npm

The CLI allows you to start hubs, create rooms, and join as a participant. Pre-built binaries are available for Linux (x64/arm64), macOS (Apple Silicon and Intel), and Windows (x64).

curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/install.sh | bash

The script auto-detects your OS and architecture, downloads the correct binary, and installs it to /usr/local/bin.

Verify the installation:

gambi --version
To uninstall later, run:

curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/uninstall.sh | bash

The SDK provides Vercel AI SDK integration for using shared LLMs in your TypeScript/JavaScript applications. It defaults to the Responses API and also supports Chat Completions.

npm install gambi-sdk
# or
bun add gambi-sdk

All CLI commands support interactive mode — run without flags and you’ll be guided through each option. Flags still work for scripting.

Pick a machine on the network to be the hub. It doesn’t need a GPU — the hub only routes requests between participants.

# Interactive — prompts for port, host, mDNS:
gambi serve
# Or with flags:
gambi serve --port 3000 --mdns

The --mdns flag enables auto-discovery so other machines on the network can find the hub automatically.

# Interactive — prompts for room name and password:
gambi create
# Or with flags:
gambi create --name "My Room"
# Output: Room created! Code: ABC123

Share this code with everyone who wants to join — via chat, projector, sticky note, whatever works.

You can also create a password-protected room:

gambi create --name "Private Room" --password secret123

Each person with an LLM endpoint joins the room:

# Interactive — select provider, model, set nickname:
gambi join
# Or with flags:
gambi join --code ABC123 --model llama3

In interactive mode, you’ll select your LLM provider from a list (Ollama, LM Studio, vLLM, or custom URL), then pick from detected models. You can also set a nickname and room password.

With flags, the default endpoint is http://localhost:11434 (Ollama). For other providers, use --endpoint:

# LM Studio
gambi join --code ABC123 --model mistral --endpoint http://localhost:1234
# vLLM
gambi join --code ABC123 --model llama3 --endpoint http://localhost:8000

The CLI will probe your local endpoint, detect available models and protocol capabilities, and register you in the room. If your endpoint is localhost but the hub runs on another machine, Gambi automatically tries to publish a LAN-reachable URL instead. Use --network-endpoint only when you need to override that published URL manually.

It also shares your machine specs (CPU, RAM, GPU) automatically — use --no-specs if you prefer not to share.
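For instance, to publish a specific LAN address manually and opt out of specs sharing in one command (the IP addresses below are placeholders — substitute your machine's actual addresses):

```shell
# Placeholder addresses: local Ollama endpoint plus the LAN URL other
# machines should use to reach it. Adjust both for your network.
gambi join --code ABC123 --model llama3 \
  --endpoint http://localhost:11434 \
  --network-endpoint http://192.168.1.42:11434 \
  --no-specs
```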

Once joined, your LLM is available to everyone in the room.

Now anyone can use the shared LLMs from their code:

import { createGambi } from "gambi-sdk";
import { generateText } from "ai";

const gambi = createGambi({
  roomCode: "ABC123",
  hubUrl: "http://localhost:3000",
});

// Send to any available participant
const result = await generateText({
  model: gambi.any(),
  prompt: "Hello, Gambi!",
});

console.log(result.text);

You can also target specific models or participants:

// Use a specific model
const explanation = await generateText({
  model: gambi.model("llama3"),
  prompt: "Explain quantum computing",
});

// Use a specific participant
const haiku = await generateText({
  model: gambi.participant("joao"),
  prompt: "Write a haiku",
});

To use Chat Completions instead of the default Responses API:

const gambi = createGambi({
  roomCode: "ABC123",
  hubUrl: "http://localhost:3000",
  defaultProtocol: "chatCompletions",
});

No SDK needed. The hub is an OpenAI-compatible API — use it from any language or tool:

curl -X POST http://localhost:3000/rooms/ABC123/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "*",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Any tool that accepts a custom OpenAI base URL works — Lovable, Cursor, Open WebUI, Python’s openai library, etc. Just point it at:

http://<hub-ip>:<port>/rooms/<ROOM_CODE>/v1
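As a sketch, here’s how that base URL could be assembled and used with Python’s openai library. The hub IP, port, and room code are placeholders, and the assumption that the hub accepts any API key is untested — check the API Reference:

```python
def room_base_url(hub_ip: str, port: int, room_code: str) -> str:
    """Build the OpenAI-compatible base URL for a Gambi room."""
    return f"http://{hub_ip}:{port}/rooms/{room_code}/v1"

# With Python's openai library (placeholder hub address and room code):
# from openai import OpenAI
# client = OpenAI(
#     base_url=room_base_url("192.168.1.10", 3000, "ABC123"),
#     api_key="not-needed",  # assumption: the hub ignores the key
# )
# reply = client.chat.completions.create(
#     model="*",  # "*" routes to any available participant, as in the curl example
#     messages=[{"role": "user", "content": "Hello!"}],
# )
```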

See the API Reference for all available endpoints.