knot.ai
Docs

From zero to live chatbot.

Four steps, two minutes. No code beyond pasting one snippet, but a full API and SSE stream wait for you when you want to go deeper.

Quickstart

The whole flow, at a glance.

From visitor question → grounded answer in under 300ms. Here's every hop.

Visitor: asks question
Semantic embed: query → vector
Vector search: top-K chunks
Prompt build: grounded context
LLM stream: SSE tokens
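To make the hops concrete, here is a minimal sketch of the embed → search → prompt-build stages. Everything in it is an assumption for illustration: the keyword-count `embed` stands in for a real embedding model, the in-memory array stands in for the vector namespace, and the prompt wording is invented. It is not how Knot implements these stages.

```typescript
// Toy stand-ins for the embedding model and vector index (assumptions).
type Chunk = { text: string; vector: number[] };

// Hypothetical embedding: keyword counts. A real system calls an embedding model.
const VOCAB = ["refund", "shipping", "return", "days", "order"];
function embed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

// Cosine similarity; returns 0 when either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Vector search: keep the k chunks most similar to the query vector.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Prompt build: number the retrieved chunks and place them ahead of the question.
function buildPrompt(question: string, context: Chunk[]): string {
  const sources = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Answer using only the context below.\n\nContext:\n${sources}\n\nQuestion: ${question}`;
}

const texts = [
  "Refund requests are accepted within 30 days of the order.",
  "Shipping takes 3 to 5 business days.",
  "Gift cards never expire.",
];
const chunkIndex: Chunk[] = texts.map((text) => ({ text, vector: embed(text) }));
const visitorQuestion = "What is your refund policy?";
const groundedPrompt = buildPrompt(
  visitorQuestion,
  topK(embed(visitorQuestion), chunkIndex, 2),
);
```

The resulting `groundedPrompt` is what would be handed to the LLM, whose output is then streamed back as SSE tokens.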
Step 1

Create a chatbot

From the dashboard, click New chatbot. Give it a name, a brief description, and pick a tone. You can change everything later.

New chatbot
Name: Support · Acme
Description: Front-line support for our online store.
Tone: Friendly · Concise (default)
Step 2

Add knowledge

Drop in documents or structured FAQs. Each item is chunked, semantically embedded, and indexed in a vector namespace dedicated to your chatbot.

Knowledge sources
product-faq.pdf: indexed, 142 chunks
shipping-policy.md: indexed, 38 chunks
https://acme.com/help: indexing, 0 chunks
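If you're curious what "chunked" can look like, here is one common approach sketched in TypeScript: fixed-size word windows with a small overlap, so a sentence that straddles a boundary still appears whole in one chunk. The function name, window size, and overlap are assumptions for illustration; the chunking strategy Knot actually uses isn't specified here.

```typescript
// Hypothetical chunker: sliding word windows of `size` words,
// each sharing `overlap` words with the previous window.
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("need size > 0 and 0 <= overlap < size");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // this window already reached the end
  }
  return chunks;
}

// Tiny window so the overlap is easy to see.
const sampleChunks = chunkWords("a b c d e f g", 3, 1);
```

Each chunk would then be embedded and written to the chatbot's vector namespace.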
Step 3

Embed on your site

Copy the <script> snippet from the Embed tab and paste it before </body>. Or use the iframe variant for in-page placement. The widget inherits your theme.

index.html
html
<script
  src="https://knot.ai/widget.js"
  data-chatbot-id="cb_abc123"
  data-api="https://api.knot.ai"
  defer
></script>
Step 4

Watch it work

Conversations, messages, fallback rate and every embedded site appear in real time under Analytics and Embedded sites.

Conversations · 7d: 1,284 (+24%)
API keys

Bring your own provider key.

The dashboard's API keys page is the single place to attach a provider key to your workspace. While a key is active, every chat your bots serve routes through your provider account instead of ours, and the monthly message cap is boosted 5×.

Test before save

When you paste a key and click Test & save, we hit the provider with one tiny probe. Invalid keys are rejected before they ever land in storage — no broken chats.

Masked + revocable

Only the first 4 and last 4 chars of the key are returned to the browser. Remove the key any time — chats drop back to the platform default on the next request.
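As an illustration of the masking rule, a helper along these lines keeps the first and last four characters and hides the rest. The name `maskKey`, the `*` placeholder, and the handling of very short keys are assumptions, not the dashboard's actual code.

```typescript
// Hypothetical helper: show only the first and last `visible` characters.
function maskKey(key: string, visible = 4): string {
  // Keys too short to mask safely are hidden entirely (an assumption).
  if (key.length <= visible * 2) return "*".repeat(key.length);
  const hidden = "*".repeat(key.length - visible * 2);
  return key.slice(0, visible) + hidden + key.slice(-visible);
}

// Made-up key value, for illustration only.
const maskedExample = maskKey("sk-abcdefghijklmnop");
```

Only the masked string would ever be sent to the browser; the full key stays server-side.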

5× message boost

With BYOK active, your plan's monthly message ceiling multiplies by 5. The Usage page surfaces the boosted cap and how much you've consumed against it.

Strict precedence

Resolution per chat: per-chatbot key (legacy) → your workspace BYOK → platform default. Nothing else is consulted.
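The precedence order reads naturally as a short chain of checks. The sketch below assumes hypothetical field names (`chatbotKey`, `workspaceKey`, `platformKey`) and is meant only to show the order, not the real resolver.

```typescript
type KeySource = "chatbot" | "workspace" | "platform";

// Hypothetical inputs: whichever keys exist for the chat being served.
interface KeyContext {
  chatbotKey?: string;   // legacy per-chatbot key
  workspaceKey?: string; // workspace BYOK
  platformKey: string;   // platform default, always present
}

// First match wins: per-chatbot key, then workspace BYOK, then platform default.
function resolveKey(ctx: KeyContext): { source: KeySource; key: string } {
  if (ctx.chatbotKey) return { source: "chatbot", key: ctx.chatbotKey };
  if (ctx.workspaceKey) return { source: "workspace", key: ctx.workspaceKey };
  return { source: "platform", key: ctx.platformKey };
}
```

Removing a workspace key simply makes the second check fail, so the next request falls through to the platform default.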

Plan limits

Limits you can plan around.

Every write endpoint enforces your plan's caps server-side. Volume is counted from an append-only ledger, so deleting conversations doesn't refund the count. The Usage page in the dashboard shows live gauges against the current cap.

Messages · monthly

One visitor message = one ledger row. Visible in real time on the Usage page. BYOK multiplies the cap by 5.
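Here is a rough sketch of how a ledger-based count and the 5× BYOK multiplier could fit together. The row shape, the UTC calendar-month window, and the example plan cap of 1,000 are all assumptions for illustration.

```typescript
// Hypothetical ledger row: one per visitor message, never updated or deleted.
interface LedgerRow {
  workspaceId: string;
  createdAt: Date;
}

// Count this workspace's rows in the same UTC calendar month as `now`.
function monthlyUsage(ledger: readonly LedgerRow[], workspaceId: string, now: Date): number {
  return ledger.filter(
    (row) =>
      row.workspaceId === workspaceId &&
      row.createdAt.getUTCFullYear() === now.getUTCFullYear() &&
      row.createdAt.getUTCMonth() === now.getUTCMonth(),
  ).length;
}

// BYOK multiplies the plan's monthly cap by 5.
function effectiveCap(planCap: number, byokActive: boolean): number {
  return byokActive ? planCap * 5 : planCap;
}

const ledger: LedgerRow[] = [
  { workspaceId: "ws_1", createdAt: new Date(Date.UTC(2024, 4, 2)) },
  { workspaceId: "ws_1", createdAt: new Date(Date.UTC(2024, 4, 20)) },
  { workspaceId: "ws_1", createdAt: new Date(Date.UTC(2024, 3, 30)) }, // previous month
  { workspaceId: "ws_2", createdAt: new Date(Date.UTC(2024, 4, 5)) },  // other workspace
];
const used = monthlyUsage(ledger, "ws_1", new Date(Date.UTC(2024, 4, 25)));
const cap = effectiveCap(1000, true);
const remaining = Math.max(0, cap - used);
```

Because usage is derived from ledger rows rather than from conversations, deleting a conversation leaves the count unchanged.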

Chatbots

Soft-capped per plan. Hitting the limit returns 402 with a structured error code so the UI can prompt an upgrade.
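On the client side, a 402 can be turned into an upgrade prompt with a small mapping function. The response body shape used here (`{ error: { code } }`) and the example code string are assumptions; check the actual payload before relying on either.

```typescript
// Minimal view of an API response: status plus the parsed JSON body.
interface ApiResult {
  status: number;
  body: unknown;
}

type Outcome =
  | { kind: "ok" }
  | { kind: "upgrade"; code: string }
  | { kind: "error"; status: number };

// Map a response to a UI outcome; 402 means a plan limit was hit.
function interpret(result: ApiResult): Outcome {
  if (result.status >= 200 && result.status < 300) return { kind: "ok" };
  if (result.status === 402) {
    // Assumed body shape: { error: { code: string } }
    const body = result.body as { error?: { code?: unknown } } | null;
    const code = body?.error?.code;
    return { kind: "upgrade", code: typeof code === "string" ? code : "unknown" };
  }
  return { kind: "error", status: result.status };
}
```

A UI would typically show the upgrade dialog for `kind: "upgrade"` and a generic error message otherwise.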

Knowledge items

Total across all of your chatbots. Re-ingesting an existing item doesn't count against the cap.

Embedded sites

Counts distinct origins that have loaded your widget. Deactivating an origin from the dashboard frees the slot.
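The slot arithmetic can be pictured as a set difference: distinct origins seen, minus the ones you've deactivated. This is a sketch with assumed inputs (plain origin strings), not the server's implementation.

```typescript
// Count distinct origins that still occupy a slot.
function usedSlots(seenOrigins: readonly string[], deactivated: readonly string[]): number {
  const off = new Set(deactivated);
  const active = new Set(seenOrigins.filter((o) => !off.has(o)));
  return active.size;
}

// Example widget loads; the domains are made up.
const widgetLoads = [
  "https://acme.com",
  "https://acme.com",
  "https://shop.acme.com",
  "https://staging.acme.com",
];
const slotsBefore = usedSlots(widgetLoads, []);
const slotsAfter = usedSlots(widgetLoads, ["https://staging.acme.com"]);
```

Repeat loads from the same origin count once, and deactivating the staging origin frees its slot.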

API & SSE

Stream answers from anywhere.

Every chatbot is reachable over a simple POST endpoint that streams tokens via Server-Sent Events. Drop it into your app, Slack bot, CLI — whatever.

curl
bash
curl -N "https://api.knot.ai/chat" \
  -H "Content-Type: application/json" \
  -H "x-chatbot-id: cb_abc123" \
  -d '{
    "message": "What is your refund policy?",
    "session_id": "anon-9f23"
  }'
JavaScript
ts
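// Runs on Node 18+ (global fetch); save as an ES module so top-level await works.
const message = "What is your refund policy?";

const res = await fetch("https://api.knot.ai/chat", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-chatbot-id": "cb_abc123",
  },
  body: JSON.stringify({ message, session_id: "anon-9f23" }),
});
if (!res.ok || !res.body) throw new Error(`Chat request failed: ${res.status}`);

const reader = res.body.getReader();
const dec = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  process.stdout.write(dec.decode(value, { stream: true }));
}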
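The loop above prints the raw stream, SSE framing included. If you want the individual events instead, a small incremental parser does the job. The sketch below follows the standard SSE wire format (`event:` and `data:` lines, blank line between events) and assumes LF line endings; the event name `token` in the usage note is a placeholder, since Knot's actual event types aren't listed here.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event: line is present
  data: string;
}

// Incremental parser: feed decoded text chunks, get back any completed events.
// Assumes LF line endings; incomplete frames stay buffered until the next chunk.
function createSseParser(): (chunk: string) => SseEvent[] {
  let buffer = "";
  return function feed(chunk: string): SseEvent[] {
    buffer += chunk;
    const events: SseEvent[] = [];
    let boundary = buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let event = "message";
      const data: string[] = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
      }
      if (data.length > 0) events.push({ event, data: data.join("\n") });
      boundary = buffer.indexOf("\n\n");
    }
    return events;
  };
}
```

To use it, create one parser per request and call it with each decoded chunk, for example `feed(dec.decode(value, { stream: true }))`, then handle the returned events (such as a hypothetical `token` event) one by one.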
REST endpoints

CRUD for chatbots, knowledge, conversations, embedded sites. Bearer-token auth.

SSE streaming

Token-by-token responses with retries, abort, and structured event types.

Webhooks (Pro)

Push every conversation, message, or fallback event into your stack in real time.

Analytics

See what your bot is actually doing.

Conversations, messages, fallback rate, and every embedded site appear in real time under Analytics. Filter by time, source, or origin.

Conversations · 7d: 1,284 (+24%)
Messages · 7d: 8,902 (+18%)
Fallback rate: 3.4% (−1.2%)
Security

Isolation, end to end.

Row-level isolation

Every database table enforces row-level isolation scoped to the signed-in user. No bypass route — admin credentials never leave the backend.

Per-bot vector namespace

Each chatbot writes & reads inside its own vector namespace. Cross-tenant retrieval is structurally impossible.

Origin allow-list

Restrict the chat endpoint to a list of trusted origins. Empty list = open; one entry = locked down.
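The rule is simple enough to sketch in a few lines. The function name and the normalization (case-insensitive, trailing slash ignored) are assumptions; the real check may compare origins differently.

```typescript
// Normalize an origin for comparison: lowercase, no trailing slash (assumed rules).
function normalizeOrigin(value: string): string {
  return value.toLowerCase().replace(/\/$/, "");
}

// Empty allow-list = open to every origin; otherwise the origin must be listed.
function isOriginAllowed(requestOrigin: string, allowList: readonly string[]): boolean {
  if (allowList.length === 0) return true;
  const wanted = normalizeOrigin(requestOrigin);
  return allowList.some((entry) => normalizeOrigin(entry) === wanted);
}
```

With an empty list every origin passes; as soon as you add `https://acme.com`, a request from any other origin is rejected.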

Bring-your-own LLM key

Pro and Enterprise can route everything through their own provider keys — we never see the prompts or responses on the LLM side.

FAQ

Frequently asked.

When a visitor asks a question, we embed it as a semantic vector, retrieve the top-K most similar chunks from your private vector namespace, build a grounded prompt with those chunks, and stream the LLM response token-by-token over Server-Sent Events.

Ready to ship? Two minutes.

The first paste is the only one that matters.