
Quick Start

This tutorial walks you through setting up Gambi from scratch. By the end, you’ll have a hub running with participants sharing LLMs on your network. Here’s what you’ll need:

  • A machine to run the hub — any computer on the network (doesn’t need a GPU, it just routes traffic)
  • At least one LLM endpoint — Ollama, LM Studio, vLLM, or any OpenAI-compatible API
  • Node.js only if you plan to install the CLI through npm

The CLI allows you to start hubs, create rooms, and join as a participant. Pre-built binaries are available for Linux (x64/arm64), macOS (Apple Silicon and Intel), and Windows (x64).

curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/install.sh | bash

The script auto-detects your OS and architecture, downloads the correct binary, and installs it to /usr/local/bin.

Verify the installation:

gambi --version
To uninstall later, run:

curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/uninstall.sh | bash

The SDK provides Vercel AI SDK integration for using shared LLMs in your TypeScript/JavaScript applications. It defaults to the Responses API and also supports Chat Completions.

npm install gambi-sdk
# or
bun add gambi-sdk

All CLI commands support interactive mode — run without flags and you’ll be guided through each option. Flags still work for scripting.

Pick a machine on the network to be the hub. It doesn’t need a GPU — the hub only routes requests between participants.

# Interactive — prompts for port, host, mDNS:
gambi serve
# Or with flags:
gambi serve --port 3000 --mdns

The --mdns flag enables auto-discovery so other machines on the network can find the hub automatically.

# Interactive — prompts for room name and password:
gambi create
# Or with flags:
gambi create --name "My Room"
# Output: Room created! Code: ABC123

Share this code with everyone who wants to join — via chat, projector, sticky note, whatever works.

You can also create a password-protected room:

gambi create --name "Private Room" --password secret123

Each person with an LLM endpoint joins the room:

# Interactive — select provider, model, set nickname:
gambi join
# Or with flags:
gambi join --code ABC123 --model llama3

In interactive mode, you’ll select your LLM provider from a list (Ollama, LM Studio, vLLM, or custom URL), then pick from detected models. You can also set a nickname and room password.

With flags, the default endpoint is http://localhost:11434 (Ollama). For other providers, use --endpoint:

# LM Studio
gambi join --code ABC123 --model mistral --endpoint http://localhost:1234
# vLLM
gambi join --code ABC123 --model llama3 --endpoint http://localhost:8000

The CLI will probe your local endpoint, detect available models and protocol capabilities, and register you in the room. If your endpoint is localhost but the hub runs on another machine, Gambi automatically tries to publish a LAN-reachable URL instead. Use --network-endpoint only when you need to override that published URL manually.

It also shares your machine specs (CPU, RAM, GPU) automatically — use --no-specs if you prefer not to share.
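For instance, to publish a specific LAN address manually and opt out of specs sharing in one command (the IP addresses below are placeholders — substitute your machine's actual addresses):

```shell
# Placeholder addresses: local Ollama endpoint plus the LAN URL other
# machines should use to reach it. Adjust both for your network.
gambi join --code ABC123 --model llama3 \
  --endpoint http://localhost:11434 \
  --network-endpoint http://192.168.1.42:11434 \
  --no-specs
```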

Once joined, your LLM is available to everyone in the room.

Now anyone can use the shared LLMs from their code:

import { createGambi } from "gambi-sdk";
import { generateText } from "ai";

const gambi = createGambi({
  roomCode: "ABC123",
  hubUrl: "http://localhost:3000",
});

// Send to any available participant
const result = await generateText({
  model: gambi.any(),
  prompt: "Hello, Gambi!",
});

console.log(result.text);

You can also target specific models or participants:

// Use a specific model
const explanation = await generateText({
  model: gambi.model("llama3"),
  prompt: "Explain quantum computing",
});

// Use a specific participant
const haiku = await generateText({
  model: gambi.participant("joao"),
  prompt: "Write a haiku",
});

To use Chat Completions instead of the default Responses API:

const gambi = createGambi({
  roomCode: "ABC123",
  hubUrl: "http://localhost:3000",
  defaultProtocol: "chatCompletions",
});

No SDK needed. The hub is an OpenAI-compatible API — use it from any language or tool:

curl -X POST http://localhost:3000/rooms/ABC123/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "*",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Any tool that accepts a custom OpenAI base URL works — Lovable, Cursor, Open WebUI, Python’s openai library, etc. Just point it at:

http://<hub-ip>:<port>/rooms/<ROOM_CODE>/v1
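As a sketch, here’s how that base URL could be assembled and used with Python’s openai library. The hub IP, port, and room code are placeholders, and the assumption that the hub accepts any API key is untested — check the API Reference:

```python
def room_base_url(hub_ip: str, port: int, room_code: str) -> str:
    """Build the OpenAI-compatible base URL for a Gambi room."""
    return f"http://{hub_ip}:{port}/rooms/{room_code}/v1"

# With Python's openai library (placeholder hub address and room code):
# from openai import OpenAI
# client = OpenAI(
#     base_url=room_base_url("192.168.1.10", 3000, "ABC123"),
#     api_key="not-needed",  # assumption: the hub ignores the key
# )
# reply = client.chat.completions.create(
#     model="*",  # "*" routes to any available participant, as in the curl example
#     messages=[{"role": "user", "content": "Hello!"}],
# )
```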

See the API Reference for all available endpoints.