Data Ingestion

Fundamentals of loading data from files, URLs, and JSON into the system.

I/O Module

Modes of Ingestion

Getting data into the system is the first step. The SDK supports structured data (JSON) and unstructured data (text, PDFs, URLs).

Unstructured

Files, URLs, and raw text. The system automatically chunks the content and extracts entities from it.
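To make "chunking" concrete, here is a minimal sketch of one common approach: fixed-size word windows with a small overlap. The function name chunk_text and its parameters are hypothetical, and the system's actual chunking strategy is not documented here, so treat this only as an illustration of the general idea.

```python
def chunk_text(text, max_words=200, overlap=20):
    """Illustrative only: split text into overlapping word windows.

    This is NOT the SDK's chunker; names and defaults are assumptions.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help downstream entity extraction.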

Structured

JSON data. Preserves the schema and relationships better than unstructured ingestion does.

Unified

ingest_unified inspects the input and selects the appropriate ingestion method automatically.
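How the auto-detection works internally is not described here. As a rough sketch of what such a dispatch decision could look like, the hypothetical helper below (detect_ingestion_mode, not part of the SDK) classifies a payload as structured data, a URL, or raw text using only the Python standard library.

```python
import json
from urllib.parse import urlparse


def detect_ingestion_mode(payload):
    """Hypothetical helper: guess which ingestion path fits a payload.

    Returns "structured", "url", or "text". This is an assumption about
    how auto-detection might work, not the SDK's actual logic.
    """
    # Python dicts and lists are already structured data.
    if isinstance(payload, (dict, list)):
        return "structured"

    if not isinstance(payload, str):
        raise TypeError(f"unsupported payload type: {type(payload).__name__}")

    text = payload.strip()

    # An http(s) URL with a host is treated as a page to fetch.
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"

    # A string that parses to a JSON object or array counts as structured.
    try:
        value = json.loads(text)
    except ValueError:
        return "text"
    return "structured" if isinstance(value, (dict, list)) else "text"
```

Under these assumptions, a caller would route "structured" payloads to ingest_json, "url" payloads to upload_url, and everything else to the raw-text path.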

Code Example

# Assumes `client` is an already-initialized SDK client.

# 1. Ingest a URL (Wikipedia)
# The system scrapes and cleans the page, then builds a KG from it
client.ingestion.upload_url(
    url="https://en.wikipedia.org/wiki/Marathon",
    kg_name="KG_Companion_Demo",
    source_name="Wiki: Marathon"
)

# 2. Ingest structured JSON
# Best for data that already has a well-defined structure
training_data = {
    "week_1": "Run 1 mile",
    "week_2": "Run 2 miles"
}
client.ingestion.ingest_json(
    data=training_data,
    kg_name="KG_Companion_Demo",
    source_name="Training Plan",
    processing_mode="schema_aware"
)

Key Takeaways

  • Ingestion is asynchronous: these methods return a job_id rather than waiting for processing to finish.
  • Everything ingested ends up in a Knowledge Graph (specified by kg_name).
  • When your data is already structured, use ingest_json to avoid parsing errors.
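Because ingestion only returns a job_id, callers usually need some way to wait for the job to finish. This page does not document a status endpoint, so the sketch below takes the status lookup as a parameter: get_status is a hypothetical callable you would supply, and the status strings "completed" and "failed" are assumptions as well.

```python
import time


def wait_for_job(get_status, job_id, poll_interval=2.0, timeout=300.0):
    """Poll a job until it reaches a terminal state or the timeout expires.

    get_status is a caller-supplied function (hypothetical; not an SDK
    method) that maps a job_id to a status string. The terminal status
    names used here are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still '{status}' after {timeout}s")
        time.sleep(poll_interval)
```

Passing the lookup in as a function keeps the polling loop independent of whichever status call the SDK actually exposes, and makes it easy to test with a stand-in.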