Datalumo

API v1 Reference

Base URL: /api/v1

All endpoints require a bearer token unless noted otherwise.

Authentication

Include your API token in every request:

Authorization: Bearer YOUR_TOKEN
Accept: application/json
Content-Type: application/json

If authentication fails, the API returns 401 Unauthorized.
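
As a rough illustration, the three headers above could be attached to a request with Python's standard library. The helper name build_request is a choice made for this sketch, and the full base URL https://datalumo.app/api/v1 is taken from the curl examples in this reference; neither is a published client API.

```python
import json
import urllib.request

BASE_URL = "https://datalumo.app/api/v1"  # host taken from the curl examples


def build_request(token, method, path, body=None):
    """Return a urllib Request carrying the documented auth and JSON headers."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
    )


# Building a request does not send it; pass it to urllib.request.urlopen to do so.
req = build_request("YOUR_TOKEN", "POST", "/collections", {"name": "Support Articles"})
```

Keeping the header construction in one place means a rotated token only has to change at a single call site.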

Pagination

Endpoints marked (paginated) accept the following query parameters:

Param      Type      Default   Description
page       integer   1         Page number to return
per_page   integer   15        Results per page. Maximum 100; values above the maximum are clamped, not rejected.

curl "https://datalumo.app/api/v1/collections?page=2&per_page=50" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"

Every paginated response includes links and meta blocks alongside data:

{
  "data": [],
  "links": { "first": "...", "last": "...", "prev": null, "next": null },
  "meta": { "current_page": 1, "last_page": 1, "per_page": 15, "total": 0 }
}

The search endpoint also accepts page and per_page, but caps per_page at 50 rather than 100.
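
One way to walk every page is to keep requesting until meta.current_page reaches meta.last_page. The sketch below assumes a caller-supplied fetch_page function (hypothetical, not part of the API) that performs the HTTP call and returns the decoded JSON body; a stand-in is used here so the loop can be read in isolation.

```python
def iter_items(fetch_page, per_page=100):
    """Yield every item from a paginated endpoint.

    fetch_page(page, per_page) is a caller-supplied function (hypothetical)
    that performs the HTTP request and returns the decoded JSON body.
    """
    page = 1
    while True:
        body = fetch_page(page, per_page)
        yield from body["data"]
        if body["meta"]["current_page"] >= body["meta"]["last_page"]:
            break
        page += 1


# Stand-in for a real HTTP call: three items served per_page at a time.
def fake_fetch(page, per_page):
    items = ["a", "b", "c"]
    start = (page - 1) * per_page
    last_page = -(-len(items) // per_page)  # ceiling division
    return {
        "data": items[start:start + per_page],
        "meta": {"current_page": page, "last_page": last_page,
                 "per_page": per_page, "total": len(items)},
    }


all_items = list(iter_items(fake_fetch, per_page=2))
```

For the search endpoint, pass per_page=50 or lower, since that endpoint caps the page size at 50.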

Endpoint Index

Collections & Entries (Data Management)

Method   Path                                                                 Description
GET      /collections                                                         List collections (paginated)
POST     /collections                                                         Create a collection
PATCH    /collections/{collection}                                            Update a collection
DELETE   /collections/{collection}                                            Delete a collection
GET      /collections/{collection}/entries                                    List entries (paginated)
POST     /collections/{collection}/entries                                    Create an entry
PUT      /collections/{collection}/entries/upsert                             Upsert an entry by source
POST     /collections/{collection}/entries/batch                              Batch upsert entries
PATCH    /collections/{collection}/entries/{entry}                            Update an entry
DELETE   /collections/{collection}/entries/{entry}                            Delete an entry
DELETE   /collections/{collection}/entries/by-source/{sourceType}/{sourceId}  Delete an entry by source

Integrations (Search, Chat & Widgets)

Method   Path                                   Description
GET      /integrations                          List integrations (paginated)
POST     /integrations                          Create an integration
GET      /integrations/{integration}            Get integration details
PATCH    /integrations/{integration}            Update an integration
DELETE   /integrations/{integration}            Delete an integration
GET      /integrations/{integration}/search     Search via integration
GET      /integrations/{integration}/summarise  Summarise via integration
POST     /integrations/{integration}/chat       Chat via integration
POST     /integrations/{integration}/events     Record integration events

Collections

Collections are containers for related entries. Use collections and entries to manage your data. All collection endpoints use the collection's UUID as the URL parameter.

List Collections

GET /collections

List all collections (paginated).

curl https://datalumo.app/api/v1/collections \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"

Response 200

{
  "data": [
    {
      "id": "9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c",
      "organisation_id": 12,
      "name": "Support Articles",
      "created_at": "2026-03-10T08:00:00.000000Z",
      "updated_at": "2026-03-10T08:00:00.000000Z"
    }
  ],
  "links": { "first": "...", "last": "...", "prev": null, "next": null },
  "meta": { "current_page": 1, "last_page": 1, "per_page": 15, "total": 1 }
}

Create Collection

POST /collections

Create a collection.

{
  "name": "Support Articles"
}

Response 201

{
  "data": {
    "id": "9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c",
    "organisation_id": 12,
    "name": "Support Articles",
    "created_at": "2026-03-10T08:00:00.000000Z",
    "updated_at": "2026-03-10T08:00:00.000000Z"
  }
}

Update Collection

PATCH /collections/{collection}

Update a collection's name.

{
  "name": "Updated Name"
}

Delete Collection

DELETE /collections/{collection}

Delete a collection and all its entries. Returns 204 No Content.

Entries

Entries are the individual pieces of content inside a collection. Each entry is automatically chunked and embedded for semantic search.

Entries support two types of metadata:

  • meta - stored with the entry and available for filtering, but not included in the search index.
  • searchable_meta - stored with the entry and included in the search index. Use this for fields that should influence search results (e.g. SKUs, brand names, tags).

List Entries

GET /collections/{collection}/entries

List entries in a collection (paginated).

curl https://datalumo.app/api/v1/collections/{collection}/entries \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"

Response 200

{
  "data": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "collection_id": "9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c",
      "title": "Refund policy",
      "raw_text": "Refunds are available for 30 days...",
      "meta": { "category": "policy" },
      "searchable_meta": { "tags": ["billing", "refunds"] },
      "source_url": "https://example.com/refunds",
      "source_type": "web",
      "source_id": "refund-policy-v2",
      "created_at": "2026-03-10T08:00:00.000000Z",
      "updated_at": "2026-03-10T08:00:00.000000Z"
    }
  ],
  "links": { "first": "...", "last": "...", "prev": null, "next": null },
  "meta": { "current_page": 1, "last_page": 1, "per_page": 15, "total": 1 }
}

Create Entry

POST /collections/{collection}/entries

Create an entry in the collection.

{
  "title": "Refund policy",
  "raw_text": "Refunds are available for 30 days...",
  "meta": { "category": "policy" },
  "searchable_meta": { "tags": ["billing", "refunds"] },
  "source_url": "https://example.com/refunds",
  "source_type": "web",
  "source_id": "refund-policy-v2"
}

Only raw_text is required. All other fields are optional.

Response 201 - returns the created entry resource.

Upsert Entry

PUT /collections/{collection}/entries/upsert

Create or update an entry matched by source_type + source_id. If a matching entry exists, it is updated; otherwise a new entry is created.

{
  "title": "Refund policy",
  "raw_text": "Updated refund policy content...",
  "source_type": "web",
  "source_id": "refund-policy-v2"
}

raw_text, source_type, and source_id are required.

Returns 201 Created for new entries or 200 OK for updates.
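
A client might check the required fields before sending and then branch on the status code. The following is a minimal sketch; the function names check_upsert_payload and upsert_outcome are illustrative and not part of the API.

```python
REQUIRED_UPSERT_FIELDS = ("raw_text", "source_type", "source_id")


def check_upsert_payload(payload):
    """Raise ValueError if a field the upsert endpoint requires is missing."""
    missing = [f for f in REQUIRED_UPSERT_FIELDS if not payload.get(f)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return payload


def upsert_outcome(status_code):
    """Translate the documented status codes into a readable outcome."""
    outcomes = {201: "created", 200: "updated"}
    if status_code not in outcomes:
        raise ValueError(f"unexpected status for upsert: {status_code}")
    return outcomes[status_code]
```

Because the match key is source_type + source_id, sending the same payload twice should report "created" the first time and "updated" the second.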

Batch Upsert

POST /collections/{collection}/entries/batch

Upsert up to 50 entries in a single request. Each entry is matched by source_type + source_id.

{
  "entries": [
    {
      "title": "Refund policy",
      "raw_text": "Refunds are available for 30 days...",
      "source_type": "web",
      "source_id": "refund-policy-v2"
    },
    {
      "title": "Shipping info",
      "raw_text": "We ship worldwide within 5 business days...",
      "source_type": "web",
      "source_id": "shipping-info"
    }
  ]
}

Response 200

{
  "created": 1,
  "updated": 1
}
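
Larger imports have to be split to respect the 50-entry limit. The sketch below shows one possible way to chunk a list and add up the created and updated counters from each response; the helper names are hypothetical.

```python
def batches(entries, size=50):
    """Split entries into lists no longer than the documented batch limit."""
    if size < 1 or size > 50:
        raise ValueError("size must be between 1 and 50")
    return [entries[i:i + size] for i in range(0, len(entries), size)]


def merge_counts(responses):
    """Sum the created/updated counters from several batch responses."""
    return {
        "created": sum(r["created"] for r in responses),
        "updated": sum(r["updated"] for r in responses),
    }
```

Each list returned by batches would be sent as the "entries" array of its own request.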

Update Entry

PATCH /collections/{collection}/entries/{entry}

Update an existing entry.

{
  "title": "Updated title",
  "raw_text": "Updated content..."
}

Delete Entry

DELETE /collections/{collection}/entries/{entry}

Delete an entry. Returns 204 No Content.

Delete Entry by Source

DELETE /collections/{collection}/entries/by-source/{sourceType}/{sourceId}

Delete an entry by its source type and source ID. Useful when you don't have the entry's UUID but know its external source reference. Returns 204 No Content.

curl -X DELETE https://datalumo.app/api/v1/collections/{collection}/entries/by-source/web/refund-policy-v2 \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"
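
Source IDs often come from external systems and may contain characters that are not safe in a URL path. The sketch below percent-encodes each path segment before building the URL. Whether the API accepts encoded slashes or spaces in a source ID is not stated in this reference, so treat that part as an assumption.

```python
from urllib.parse import quote

BASE_URL = "https://datalumo.app/api/v1"  # host taken from the curl examples


def delete_by_source_url(collection_id, source_type, source_id):
    """Build the delete-by-source URL, escaping each path segment."""
    segments = [collection_id, source_type, source_id]
    c, t, s = (quote(str(x), safe="") for x in segments)
    return f"{BASE_URL}/collections/{c}/entries/by-source/{t}/{s}"
```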

Integrations

Integrations are the consumption layer in Datalumo. While collections and entries manage your data, integrations are how you search, summarise, and chat with that data. Each integration connects to one or more collections and can be configured with its own persona, settings, and access controls.

All integration endpoints use the integration's UUID as the URL parameter.

List Integrations

GET /integrations

List all integrations (paginated). Optionally filter by type or project.

curl "https://datalumo.app/api/v1/integrations?type=chatbot" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"

Response 200

{
  "data": [
    {
      "id": "b81d17eb-99c7-408a-8081-09319dff64e0",
      "name": "Support Bot",
      "project": "example.com",
      "type": "chatbot",
      "allowed_domains": ["example.com"],
      "is_active": true,
      "settings": {
        "welcome_message": "Ask me anything about support.",
        "instructions": "Only answer from provided collection data.",
        "persona": "customer-support",
        "unknown_answer_message": null
      },
      "collection_ids": ["9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c"],
      "created_at": "2026-03-10T08:00:00.000000Z",
      "updated_at": "2026-03-10T08:00:00.000000Z"
    }
  ],
  "links": { "first": "...", "last": "...", "prev": null, "next": null },
  "meta": { "current_page": 1, "last_page": 1, "per_page": 15, "total": 1 }
}

Create Integration

POST /integrations

Create an integration. The type determines which settings fields are relevant.

Chatbot example

{
  "type": "chatbot",
  "name": "Support Bot",
  "allowed_domains": ["example.com"],
  "collection_ids": ["9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c"],
  "welcome_message": "Ask me anything about support.",
  "instructions": "Only answer from provided collection data.",
  "persona": "customer-support"
}

Search Box example

{
  "type": "search_box",
  "name": "Docs Search",
  "allowed_domains": ["example.com"],
  "collection_ids": ["9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c"],
  "placeholder": "Search our docs...",
  "results_display": "summary_results"
}

Required fields

Field            Type     Description
type             string   chatbot, search_box, or custom
name             string   Display name
collection_ids   array    Array of collection UUIDs to connect

Optional fields (all types)

Field             Type          Description
project           string|null   Project name for grouping integrations
allowed_domains   array|null    Restrict embed to these domains
is_active         boolean       Enable/disable the integration (default: true)

Chatbot fields

Field                    Type          Description
welcome_message          string|null   Initial greeting message
instructions             string|null   System instructions for the AI
persona                  string        customer-support, sales-assistant, onboarding-guide, technical-expert, or custom
unknown_answer_message   string|null   Message shown when the AI cannot find an answer

Search Box fields

Field             Type     Description
placeholder       string   Search input placeholder text (default: Search...)
results_display   string   results, summary, or summary_results

Custom fields

Field   Type          Description
label   string|null   Label for the custom integration

Show Integration

GET /integrations/{integration}

Get a single integration's details. Returns the same shape as list items above.

Update Integration

PATCH /integrations/{integration}

Update an integration. Any field from the create request can be sent (except type).

Delete Integration

DELETE /integrations/{integration}

Delete an integration. Returns 204 No Content.

Search

GET /integrations/{integration}/search

Semantic search across all collections connected to an integration.

Query Parameters

Param       Type      Required   Description
query       string    Yes        The search query
threshold   float     No         Similarity threshold 0-1 (default: 0.4)
meta        array     No         Filter by metadata (see filtering below)
boost       array     No         Reorder results by meta signals (see boosting below)
sort        array     No         Replace relevance ordering with an explicit field order (see sorting below)
per_page    integer   No         Results per page, max 50 (default: 15)
page        integer   No         Page number (default: 1)

curl "https://datalumo.app/api/v1/integrations/{integration}/search?query=how+do+refunds+work&threshold=0.3" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"

Filtering by metadata

Filter results by values in meta or searchable_meta fields. Filters match against both fields automatically.

Exact match (single value):

meta[category]=billing

OR (match any of the values):

meta[category][or][]=billing&meta[category][or][]=support

AND (match all values, useful for array fields):

meta[tags][and][]=coffee&meta[tags][and][]=decaf

Combining filters (different keys are always AND):

meta[post_type][or][]=post&meta[post_type][or][]=page&meta[source]=blog

This matches entries where post_type is "post" or "page", AND source is "blog".
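
Writing the bracketed keys by hand gets error-prone once several filters are combined. The helper below is a sketch of how they might be generated from a dictionary; the function name and the dictionary shape are assumptions made for this example. Note that urlencode percent-encodes the square brackets, which is an equivalent form of the same query string.

```python
from urllib.parse import parse_qsl, urlencode


def meta_filter_pairs(filters):
    """Turn a dict of filters into (key, value) query pairs.

    A plain value becomes an exact match; a dict such as
    {"or": [...]} or {"and": [...]} becomes one pair per listed value.
    """
    pairs = []
    for key, condition in filters.items():
        if isinstance(condition, dict):
            for operator, values in condition.items():
                if operator not in ("or", "and"):
                    raise ValueError(f"unsupported operator: {operator}")
                pairs.extend((f"meta[{key}][{operator}][]", v) for v in values)
        else:
            pairs.append((f"meta[{key}]", condition))
    return pairs


# The combined example above: post_type is "post" or "page", AND source is "blog".
pairs = meta_filter_pairs({
    "post_type": {"or": ["post", "page"]},
    "source": "blog",
})
query_string = urlencode([("query", "refunds")] + pairs)
```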

Response 200

{
  "data": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "collection_id": "9e3a7c2f-1b4d-4e8a-9f2c-3d5e6f7a8b9c",
      "title": "Refund policy",
      "raw_text": "Refunds are available for 30 days...",
      "meta": { "category": "policy" },
      "searchable_meta": { "tags": ["billing", "refunds"] },
      "source_url": "https://example.com/refunds",
      "source_type": "web",
      "source_id": "refund-policy-v2",
      "created_at": "2026-03-10T08:00:00.000000Z",
      "updated_at": "2026-03-10T08:00:00.000000Z"
    }
  ],
  "summarisable": true,
  "links": { "first": "...", "last": "...", "prev": null, "next": null },
  "meta": { "current_page": 1, "last_page": 1, "per_page": 15, "total": 1 }
}

The summarisable field indicates whether the results are good candidates for AI summarisation.

Each search result also includes a score_breakdown object (omitted from the example above) with similarity and final_score fields, so you can inspect how each entry was ranked.

For background on when to use boost versus sort, how decay works, and missing-field behaviour, see the Search Ranking guide.

Boost

Reorder results by meta signals while keeping relevance in charge. Send up to 3 boosts.

type          Field type   Extra parameters
recency       ISO date     scale optional, default "90d"
prefer_low    number       scale required
prefer_high   number       scale required
near_value    number       target required, scale required

All types accept optional strength: low, medium (default), or high.

{
  "query": "your search query",
  "boost": [
    { "field": "published_at", "type": "recency", "strength": "medium" }
  ]
}

Sort

Replace relevance ordering with an explicit field order. Send up to 3 sort keys.

Field       Required   Values
field       Yes        meta key
direction   Yes        asc, desc
type        No         date, number, text (default text)

{
  "query": "your search query",
  "sort": [
    { "field": "published_at", "direction": "desc", "type": "date" }
  ]
}

Boost and sort cannot be combined in the same request. A request with both returns 422 Unprocessable Entity.
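
The boost and sort rules can be checked on the client before a request is sent, which avoids a round trip that would end in a 422. The sketch below encodes only the rules stated in this section; the function name check_ranking is hypothetical, and the server may enforce further rules that are not listed here.

```python
BOOST_REQUIRED = {
    "recency": [],                      # scale is optional, default "90d"
    "prefer_low": ["scale"],
    "prefer_high": ["scale"],
    "near_value": ["target", "scale"],
}
STRENGTHS = {"low", "medium", "high"}
SORT_TYPES = {"date", "number", "text"}


def check_ranking(boost=None, sort=None):
    """Raise ValueError for ranking options this reference describes as invalid."""
    boost, sort = boost or [], sort or []
    if boost and sort:
        raise ValueError("boost and sort cannot be combined")
    if len(boost) > 3 or len(sort) > 3:
        raise ValueError("at most 3 boosts or 3 sort keys are allowed")
    for b in boost:
        if b.get("type") not in BOOST_REQUIRED:
            raise ValueError(f"unknown boost type: {b.get('type')}")
        missing = [p for p in BOOST_REQUIRED[b["type"]] if p not in b]
        if missing:
            raise ValueError(f"{b['type']} boost is missing: {', '.join(missing)}")
        if b.get("strength", "medium") not in STRENGTHS:
            raise ValueError(f"unknown strength: {b['strength']}")
    for s in sort:
        if "field" not in s or s.get("direction") not in ("asc", "desc"):
            raise ValueError("each sort key needs a field and a direction of asc or desc")
        if s.get("type", "text") not in SORT_TYPES:
            raise ValueError(f"unknown sort type: {s['type']}")
```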

Summarise

GET /integrations/{integration}/summarise

Generate an AI summary of matching entries across the integration's collections.

Query Parameters

Param       Type      Required   Description
query       string    Yes        The question or topic to summarise
threshold   float     No         Similarity threshold 0-1 (default: 0.4)
stream      boolean   No         Stream the response via SSE (default: true)
format      string    No         markdown (default) or html
locale      string    No         Response language (e.g. en, nl)

curl "https://datalumo.app/api/v1/integrations/{integration}/summarise?query=what+is+the+refund+window&stream=false" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Accept: application/json"

Response 200 (stream=false)

{
  "has_relevant_results": true,
  "summary": "Refunds are available up to 30 days after purchase...",
  "references": [
    {
      "ref": 1,
      "title": "Refund policy",
      "url": "https://example.com/refunds"
    }
  ],
  "data": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "title": "Refund policy",
      "raw_text": "Refunds are available for 30 days..."
    }
  ]
}

When stream=true (default), the response is returned as Server-Sent Events (SSE). See Streaming for the event format.

Chat

POST /integrations/{integration}/chat

Have a conversation grounded in the integration's content. Supports multi-turn conversations via conversation_id.

{
  "message": "How long do customers have to request a refund?",
  "conversation_id": null,
  "stream": false
}

Response 200 (stream=false)

{
  "conversation_id": "conv_abc123",
  "message": "Customers can request a refund within 30 days."
}

To continue a conversation, pass the conversation_id from a previous response:

{
  "message": "What about digital products?",
  "conversation_id": "conv_abc123",
  "stream": false
}

When stream=true (default), the response is returned as Server-Sent Events (SSE). The X-Conversation-Id header contains the conversation ID for continuing the conversation.
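
For non-streaming chats, a small wrapper can remember the conversation_id between turns. This is a sketch only: the Conversation class and the send callable are hypothetical, and send stands in for whatever code actually POSTs the payload and decodes the JSON response.

```python
class Conversation:
    """Track conversation_id across turns of a non-streaming chat.

    send is a caller-supplied function (hypothetical) that POSTs the payload
    to /integrations/{integration}/chat and returns the decoded JSON body.
    """

    def __init__(self, send):
        self._send = send
        self.conversation_id = None

    def ask(self, message):
        payload = {
            "message": message,
            "conversation_id": self.conversation_id,
            "stream": False,
        }
        reply = self._send(payload)
        self.conversation_id = reply["conversation_id"]
        return reply["message"]


# Stand-in transport that records payloads instead of calling the API.
sent = []


def fake_send(payload):
    sent.append(payload)
    return {"conversation_id": "conv_abc123", "message": f"echo: {payload['message']}"}


chat = Conversation(fake_send)
first = chat.ask("How long do customers have to request a refund?")
second = chat.ask("What about digital products?")
```

The first payload carries a null conversation_id and the second reuses the ID returned by the first reply, matching the two request bodies shown above.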

Streaming

The summarise and chat endpoints support streaming via Server-Sent Events (SSE) when stream=true (the default). Each event is sent as a line that starts with the prefix data: followed by a JSON object. The stream ends with the line data: [DONE].

Event types

stream_start - stream initialised

data: {"type":"stream_start","provider":"openai","model":"gpt-4o-mini","timestamp":1711270800}

text_delta - incremental text chunk (most frequent)

data: {"type":"text_delta","delta":"Refunds are ","message_id":"msg-uuid","timestamp":1711270800}

citation - source reference

data: {"type":"citation","citation":{"title":"Refund policy","url":"https://example.com/refunds"},"timestamp":1711270800}

stream_end - stream complete

data: {"type":"stream_end","reason":"stop","usage":{"prompt_tokens":150,"completion_tokens":45},"timestamp":1711270800}

End of stream

data: [DONE]
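
The event format above can be consumed line by line. The following is a minimal sketch of a reader that joins text_delta chunks and collects citations; it takes any iterable of decoded lines, so the HTTP layer is left out. The second text_delta in the sample is invented for illustration, while the other events are copied from this section.

```python
import json


def read_stream(lines):
    """Collect text and citations from an iterable of SSE lines."""
    text_parts, citations = [], []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        if event["type"] == "text_delta":
            text_parts.append(event["delta"])
        elif event["type"] == "citation":
            citations.append(event["citation"])
    return "".join(text_parts), citations


sample = [
    'data: {"type":"stream_start","provider":"openai","model":"gpt-4o-mini","timestamp":1711270800}',
    '',
    'data: {"type":"text_delta","delta":"Refunds are ","message_id":"msg-uuid","timestamp":1711270800}',
    'data: {"type":"text_delta","delta":"available.","message_id":"msg-uuid","timestamp":1711270800}',
    'data: {"type":"citation","citation":{"title":"Refund policy","url":"https://example.com/refunds"},"timestamp":1711270800}',
    'data: {"type":"stream_end","reason":"stop","usage":{"prompt_tokens":150,"completion_tokens":45},"timestamp":1711270800}',
    'data: [DONE]',
]
text, citations = read_stream(sample)
```

Events of other types, such as stream_start and stream_end, are ignored here; a fuller client might read usage from stream_end.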

Integration Events

POST /integrations/{integration}/events

Record events (e.g. clicks, feedback) for an integration. Used for analytics tracking.

{
  "event_type": "thumbs_up",
  "meta": {
    "url": "https://example.com/page"
  }
}

Event Type    Description
click         User clicked a result link
thumbs_up     Positive feedback on a response
thumbs_down   Negative feedback on a response

Response 200

{
  "ok": true
}

Errors

422 Unprocessable Entity

{
  "message": "The name field is required.",
  "errors": {
    "name": ["The name field is required."]
  }
}

401 Unauthorized

{
  "message": "Unauthenticated."
}

402 Payment Required

{
  "message": "Usage quota exceeded."
}

404 Not Found

{
  "message": "Not found."
}

429 Too Many Requests

{
  "message": "Too many requests."
}
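
Client code usually wants one place that turns these bodies into something readable. The sketch below is one possible approach. The function name describe_error and the hint texts are suggestions made for this example and are not returned by the API.

```python
def describe_error(status_code, body):
    """Return a short, human-readable description of an error response."""
    message = body.get("message", "")
    if status_code == 422:
        # Validation errors map each field name to a list of messages.
        details = [
            f"{field}: {problem}"
            for field, problems in body.get("errors", {}).items()
            for problem in problems
        ]
        return "validation failed; " + "; ".join(details)
    hints = {
        401: "check the bearer token",
        402: "usage quota exceeded",
        404: "check the UUIDs in the URL",
        429: "slow down and retry later",
    }
    hint = hints.get(status_code, "unexpected error")
    return f"{status_code} {message} ({hint})"
```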

Enum Values

  • type: chatbot, search_box, custom
  • persona: customer-support, sales-assistant, onboarding-guide, technical-expert, custom
  • results_display: results, summary, summary_results
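
These enum values, together with the required fields of the create-integration request, can be checked locally before a payload is sent. The following is a sketch under the assumption that the lists in this reference are complete; the name check_integration is hypothetical.

```python
ENUMS = {
    "type": {"chatbot", "search_box", "custom"},
    "persona": {"customer-support", "sales-assistant", "onboarding-guide",
                "technical-expert", "custom"},
    "results_display": {"results", "summary", "summary_results"},
}
REQUIRED = ("type", "name", "collection_ids")


def check_integration(payload):
    """Return a list of problems found in a create-integration payload."""
    problems = [f"{f} is required" for f in REQUIRED if not payload.get(f)]
    for field, allowed in ENUMS.items():
        if field in payload and payload[field] not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    return problems
```

An empty list means the payload passed these local checks; the server remains the final authority and may still answer with 422.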