Real-Time Content Ingestion API Guide

The Real-Time Ingestion API lets developers upload text content directly for immediate processing and vectorization. Unlike batch upload methods, it processes content in real time, which suits dynamic content that must be available for search or retrieval right away.

API Endpoint

POST /ingestion/v1/real_time_upload

Base URL: https://platform.ai.gloo.com

Authentication

The API requires a JWT token with specific claims to authorize access.

The token must be associated with a Client ID that has access to the specified publisher. The system validates that the organization associated with your Client ID has permission to access the publisher specified in your request.

Headers

Authorization: Bearer <your_access_token>
Content-Type: application/json
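
As an illustration of how the endpoint, base URL, and headers fit together, the following is a minimal sketch that prepares the request with the Python standard library without sending it. The function name build_request is hypothetical, and the access token is assumed to have been obtained elsewhere.

```python
import json
import urllib.request

BASE_URL = "https://platform.ai.gloo.com"
ENDPOINT = "/ingestion/v1/real_time_upload"


def build_request(access_token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the ingestion endpoint.

    build_request is an illustrative helper, not part of any official client.
    """
    return urllib.request.Request(
        url=BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the shape of the response is not described in this guide, so it is left out of the sketch.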

JWT Token Requirements

Your JWT token must include:

- A sub claim containing your API client ID
- A scope claim that includes api/access

The API client must be associated with an API key that belongs to the same organization as the publisher you're uploading content for.
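
Before sending a request, it can help to confirm locally that a token carries the claims listed above. The sketch below decodes the token payload without verifying the signature, so it is only a debugging aid. It assumes the scope claim is either a space-delimited string (the common OAuth convention) or a list; the guide does not state the exact format. The helper names are hypothetical.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments omit base64 padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def has_required_claims(token: str) -> bool:
    """Check for a non-empty sub claim and the api/access scope."""
    claims = jwt_claims(token)
    scope = claims.get("scope", "")
    # Assumption: scope is a space-delimited string or a list of strings.
    scopes = scope.split() if isinstance(scope, str) else list(scope)
    return bool(claims.get("sub")) and "api/access" in scopes
```

A token that passes this check can still be rejected by the server, which also validates the signature and the organization's access to the publisher.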

📘
NOTE: You can only send content to your own publisher, so double-check the publisher_id field in the request body for accuracy.

Request Body

The request body should be a JSON object with the following structure:

{
  "content": "This is the full text content that needs to be processed and indexed for search. It can be as long as needed to represent the document content.",
  "filename": "sample_document_name.txt",
  "producer_id": "producer-123",
  "publisher_id": "550e8400-e29b-41d4-a716-446655440000",
  "denomination": "Catholic",
  "evergreen": true,
  "drm": ["aspen", "kallm"],
  "author": ["Jane Doe", "John Smith"],
  "isbn": "978-3-16-148410-0",
  
  "item_title": "Main Document Title",
  "item_subtitle": "An Informative Subtitle",
  "item_image": "https://example.com/images/document-cover.jpg",
  "item_url": "https://example.com/original-content",
  "item_file": "https://example.com/downloads/document.pdf",
  "item_summary": "A brief summary of the document's content and purpose.",
  "item_number": "DOC-2023-001",
  "item_extra": "Additional information about this item",
  "item_tags": ["documentation", "api", "tutorial", "reference"],
  
  "h2_title": "Section Heading",
  "h2_subtitle": "Section Subheading",
  "h2_image": "https://example.com/images/section-image.jpg",
  "h2_url": "https://example.com/section",
  "h2_file": "https://example.com/downloads/section.pdf",
  "h2_summary": "Summary of this specific section",
  "h2_number": "2.1",
  "h2_extra": "Additional section metadata",
  
  "h3_title": "Subsection Heading",
  "h3_subtitle": "Subsection Subheading",
  "h3_image": "https://example.com/images/subsection-image.jpg",
  "h3_url": "https://example.com/subsection",
  "h3_file": "https://example.com/downloads/subsection.pdf",
  "h3_summary": "Summary of this specific subsection",
  "h3_number": "2.1.3",
  "h3_extra": "Additional subsection metadata",
  
  "type": "Article",
  "duration": "15 minutes",
  "pages": "12",
  "publication_date": "2023-07-15",
  "hosted_url": "https://cdn.example.com/hosted-content",
  "pub_type": "technical"
}
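
A body like the one above can be assembled programmatically. The following is a minimal sketch, assuming that optional fields with no value are best left out rather than sent as null; the guide does not say how null values are handled. build_payload is a hypothetical helper name.

```python
import json


def build_payload(publisher_id: str, content: str, **metadata) -> str:
    """Serialize the two core fields plus any optional metadata to JSON."""
    body = {"publisher_id": publisher_id, "content": content}
    # Assumption: omit optional fields whose value is None instead of sending null.
    body.update({key: value for key, value in metadata.items() if value is not None})
    return json.dumps(body, ensure_ascii=False)
```

For example, build_payload(pid, text, item_title="Main Document Title", isbn=None) produces a JSON object with publisher_id, content, and item_title only.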

Required and Optional Fields

Core Required Fields

Field	Type	Description
publisher_id	UUID string	UUID of the publisher associated with the content. The publisher must belong to your organization.
content	String	The actual text content to be ingested and chunked
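
The two required fields can be checked on the client side before a request is made. This is a sketch under stated assumptions: it treats empty or whitespace-only content as invalid, which the guide does not specify, and it uses Python's uuid module, which accepts a few UUID spellings beyond the hyphenated form. validate_core_fields is a hypothetical name.

```python
import uuid


def validate_core_fields(payload: dict) -> list[str]:
    """Return a list of problems with the required fields (empty if none)."""
    errors = []
    content = payload.get("content")
    # Assumption: blank content is not useful to ingest, so flag it.
    if not isinstance(content, str) or not content.strip():
        errors.append("content must be a non-empty string")
    try:
        uuid.UUID(str(payload.get("publisher_id")))
    except ValueError:
        errors.append("publisher_id must be a UUID string")
    return errors
```

This check only covers the format of publisher_id; whether the publisher belongs to your organization is decided by the server.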

All Available Metadata

Metadata Fields

The following metadata fields make the content easier to retrieve later.

Field	Type	Required	Description
content	String	Yes	The actual text content to be ingested and chunked
filename	String	No	Custom filename for this content
type	String	No	Content type (e.g., article, blog, tutorial)
item_title	String	No*	Title of the content item
item_subtitle	String	No	Subtitle of the content item
item_summary	String	No	Brief summary of the content
item_image	String	No	URL to image associated with the content
item_url	String	No	URL to the original content
item_file	String	No	URL to a file associated with the content
item_number	String	No	Identifying number for the content item
item_extra	String	No	Additional information about the content item
item_tags	Array or String	No	Tags associated with the content (can be array or comma-separated string)
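
Because item_tags accepts either an array or a comma-separated string, client code may want to normalize both forms to one shape before building a request. A minimal sketch follows, with the hypothetical helper name as_list. Trimming whitespace and dropping empty entries are assumptions, not documented server behavior.

```python
def as_list(value) -> list[str]:
    """Normalize an array or comma-separated string to a list of trimmed strings."""
    if isinstance(value, str):
        value = value.split(",")
    # Assumption: surrounding whitespace and empty entries carry no meaning.
    return [item.strip() for item in value if item.strip()]
```

Note that splitting on commas would break a value that legitimately contains one, so passing an array is the safer choice in that case.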

Author and Publishing Information

Field	Type	Required	Description
author	Array or String	No	Author(s) of the content (can be array or comma-separated string)
isbn	String	No	ISBN if content is from a book
publication_date	String	No	Date when the content was published (recommended format: YYYY-MM-DD)
producer_id	String	No	ID of the content producer
denomination	String	No	Religious denomination (if applicable)
pub_type	String	No	Publication type
hosted_url	String	No	URL where the content is hosted
pages	String	No	Number of pages (for documents)
duration	String	No	Duration (for audio/video content)
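
Since publication_date is a free-form string with a recommended YYYY-MM-DD format, a client can check the value before sending it. The sketch below is one way to do that; is_recommended_date is a hypothetical name, and the API itself may accept other formats.

```python
import re
from datetime import date

_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_recommended_date(value: str) -> bool:
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    if not _DATE_RE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        # Correct shape but not a real date, for example 2023-02-30.
        return False
    return True
```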

Hierarchical Structure (for organized content)

Field	Type	Required	Description
h2_title	String	No	Title for level 2 heading/section
h2_subtitle	String	No	Subtitle for level 2 heading/section
h2_image	String	No	Image URL for level 2 heading/section
h2_url	String	No	URL for level 2 heading/section
h2_file	String	No	File URL for level 2 heading/section
h2_summary	String	No	Summary for level 2 heading/section
h2_number	String	No	Number for level 2 heading/section
h2_extra	String	No	Additional info for level 2 heading/section
h3_title	String	No	Title for level 3 heading/section
h3_subtitle	String	No	Subtitle for level 3 heading/section
h3_image	String	No	Image URL for level 3 heading/section
h3_url	String	No	URL for level 3 heading/section
h3_file	String	No	File URL for level 3 heading/section
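h3_summary	String	No	Summary for level 3 heading/section
h3_number	String	No	Number for level 3 heading/section
h3_extra	String	No	Additional info for level 3 heading/section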
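
The hierarchical fields follow a regular naming pattern: a level prefix (h2 or h3) joined to an attribute name. The sketch below builds those keys from keyword arguments. heading_fields is a hypothetical helper, and the attribute list is taken from the field names in the request body example.

```python
SUFFIXES = ("title", "subtitle", "image", "url", "file", "summary", "number", "extra")


def heading_fields(level: int, **values: str) -> dict:
    """Build h2_*/h3_* keys for one heading level from keyword arguments."""
    if level not in (2, 3):
        raise ValueError("only h2 and h3 fields are documented")
    unknown = set(values) - set(SUFFIXES)
    if unknown:
        raise ValueError(f"unknown heading attributes: {sorted(unknown)}")
    return {f"h{level}_{name}": value for name, value in values.items()}
```

For instance, heading_fields(2, title="Section Heading", number="2.1") yields a dict with the keys h2_title and h2_number, which can be merged into the request body.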
