
Basic Concepts

Understanding the core concepts of Functor will help you build more effective applications. This guide covers knowledge graphs, vector databases, memory systems, and agent integration.

Knowledge Graphs

Knowledge graphs are structured representations of information as entities and their relationships. In Functor, they serve as the foundation for intelligent data retrieval.

Components

Entities

The "nodes" representing concepts, people, places, or things.

{ id: 'ML_001', type: 'Concept', name: 'Machine Learning' }

Relations

The "edges" connecting entities with semantic meaning.

{ from: 'ML_001', to: 'AI_001', type: 'IS_SUBSET_OF' }

Attributes

Properties providing additional context about entities.

{ definition: '...', confidence: 0.95, source: 'doc_123' }
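Taken together, entities, relations, and attributes form a traversable graph. As a minimal sketch in plain Python (independent of the SDK, reusing the example records above):

# Minimal in-memory sketch of the structures above (illustrative, not the SDK)
entities = {
    "ML_001": {"type": "Concept", "name": "Machine Learning",
               "definition": "...", "confidence": 0.95, "source": "doc_123"},
    "AI_001": {"type": "Concept", "name": "Artificial Intelligence"},
}
relations = [
    {"from": "ML_001", "to": "AI_001", "type": "IS_SUBSET_OF"},
]

# Traverse the outgoing edges of an entity
for rel in relations:
    if rel["from"] == "ML_001":
        target = entities[rel["to"]]["name"]
        print(f"Machine Learning {rel['type']} {target}")
# -> Machine Learning IS_SUBSET_OF Artificial Intelligence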

Knowledge Graph Structure

# Example: Querying knowledge graph structure
from functor_sdk import FunctorClient
client = FunctorClient()
# Get KG statistics
kgs = client.knowledge_graphs.list(include_stats=True)
for kg in kgs:
    print(f"{kg.name}:")
    print(f"  Entities: {kg.entities_count:,}")
    print(f"  Relations: {kg.relations_count:,}")
    print(f"  Sources: {kg.sources_count}")

Vector Databases

Vector databases store high-dimensional embeddings that capture semantic meaning. They enable similarity search and context-aware retrieval.

How Embeddings Work

Text → Embedding → Search

1. Input Text: "Machine learning algorithms"
2. Embedding: [0.23, -0.45, 0.89, ...] (768 dimensions)
3. Search: Find similar vectors in database
4. Results: Semantically similar content
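
Similarity between embeddings is typically measured with cosine similarity. The sketch below (plain Python, not part of the Functor SDK, with made-up 4-dimensional vectors) shows the comparison that a vector database performs at scale:

# Sketch: the similarity computation behind vector search (illustrative)
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = [0.23, -0.45, 0.89, 0.10]           # embedding of the query
doc_vec = [0.25, -0.40, 0.85, 0.05]             # embedding of a stored chunk
unrelated = [-0.90, 0.10, -0.20, 0.70]          # embedding of unrelated text

print(cosine_similarity(query_vec, doc_vec))    # close to 1.0 -> retrieved
print(cosine_similarity(query_vec, unrelated))  # much lower -> skipped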

Semantic Search

# Semantic search finds conceptually similar content
query = "How do neural networks learn?"
# These documents would be retrieved even without exact keyword matches:
# - "Training deep learning models with backpropagation"
# - "Gradient descent optimization in AI"
# - "Supervised learning algorithms"
result = client.queries.execute(
    query=query,
    max_results=5
)

# Results are ranked by semantic similarity
for citation in result.citations:
    print(f"Relevance: {citation.relevance_score}")
    print(f"Content: {citation.chunk_text[:100]}...")

Memory Systems

Functor implements three types of memory inspired by cognitive science:

1. Episodic Memory

Stores experiences and events

Records what happened, when, and in what context. Used for conversation history, user interactions, and temporal reasoning.

from functor_sdk import FunctorClient
client = FunctorClient(base_url="http://localhost:8000")
client.connect()
# Store an episode
episode_id = client.add_episode(
    content="User asked about machine learning applications",
    user_id="alice",
    metadata={
        "timestamp": "2024-01-01T10:30:00Z",
        "session_id": "sess_123",
        "intent": "question"
    }
)

# Retrieve episodes
episodes = client.retrieve_episodes(
    user_id="alice",
    limit=10
)
for episode in episodes:
    print(f"[{episode.timestamp}] {episode.content}")

2. Semantic Memory

Stores facts and knowledge

General knowledge independent of personal experience. Used for factual information, definitions, and relationships.

# Store semantic facts
client.add_fact(
    fact="Machine learning is a subset of artificial intelligence",
    category="definitions",
    metadata={"domain": "AI", "confidence": 0.95}
)

# Retrieve related facts
facts = client.get_semantic_facts(
    query="What is machine learning?",
    category="definitions",
    limit=5
)
for fact in facts:
    print(f"Fact: {fact.content}")
    print(f"Confidence: {fact.confidence}")

3. Procedural Memory

Stores how-to knowledge

Skills and procedures for performing tasks. Used for workflows, strategies, and action sequences.

# Store a procedure
client.add_procedure(
    name="document_analysis_workflow",
    steps=[
        "Extract text from document",
        "Identify key entities",
        "Build knowledge graph",
        "Generate summary"
    ],
    metadata={"category": "data_processing"}
)

# Retrieve and execute
procedure = client.get_procedure("document_analysis_workflow")
for step in procedure.steps:
    print(f"Step: {step}")

Data Ingestion Pipeline

Understanding the ingestion pipeline helps you optimize document processing:

Processing Stages

1. Document Upload: File or URL submitted to the ingestion API
2. Text Extraction: Content extracted from PDF, HTML, DOCX, etc.
3. Chunking: Text split into semantic chunks (typically 500-1000 tokens)
4. Entity Extraction: NER identifies entities (people, places, concepts)
5. Relation Extraction: Relationships between entities identified
6. Embedding Generation: Vector embeddings created for semantic search
7. Storage: Data stored in Neo4j (graph), Qdrant (vectors), SQLite (metadata)
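
Stage 3 is often the one you tune most. As a rough illustration of fixed-size chunking with overlap (a simplified stand-in for Functor's semantic chunker; it counts whitespace-separated words rather than real tokenizer tokens):

# Sketch: fixed-size chunking with overlap (simplified, not the SDK's chunker)
def chunk_text(text, chunk_size=750, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size "tokens".

    Whitespace splitting is a crude proxy for a real tokenizer; semantic
    chunkers also respect sentence and section boundaries.
    """
    tokens = text.split()
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        start += chunk_size - overlap  # step back so adjacent chunks share context
    return chunks

document = "Machine learning is a subset of artificial intelligence. " * 200
for i, chunk in enumerate(chunk_text(document)):
    print(f"Chunk {i}: {len(chunk.split())} tokens")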

Query Processing

When you execute a query, multiple systems work together:

Query Flow

1. Query Understanding: Intent classification and entity recognition
2. Pipeline Selection: Choose the appropriate retrieval strategy (knowledge graph, vector search, hybrid)
3. Knowledge Retrieval: Search the knowledge graph and vector database
4. Context Assembly: Combine retrieved information with user context
5. LLM Generation: Generate a natural language response
6. Citation & Validation: Add source citations and validate answer quality

# Query with full context
result = client.queries.execute(
    query="What are the applications of machine learning?",
    user_id="alice",                    # User context
    max_results=10,                     # Retrieval depth
    validate_answer=True,               # Answer validation
    include_citations=True,             # Source tracking
    kg_names=["KG_Universal", "KG_AI"]  # Target KGs
)

# Understand the response
print(f"Pipeline: {result.pipeline_used}")  # Which strategy was used
print(f"Confidence: {result.confidence}")   # Answer quality score
print(f"Processing: {result.processing_time_ms}ms")
print(f"\nAnswer: {result.answer}")

# Examine sources
print(f"\nCitations ({len(result.citations)}):")
for citation in result.citations:
    print(f"- {citation.source} (relevance: {citation.relevance_score})")

Agent Integration

Functor can be integrated into agent frameworks through the SDK:

LangChain Integration

from functor_sdk import FunctorClient
from functor_sdk import LangChainAdapter  # import path may differ in your SDK version

# Create Functor client
client = FunctorClient(base_url="http://localhost:8000")

# Create LangChain tools
langchain_tools = LangChainAdapter.create_tools(client)

# Use in LangChain agent
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
agent = initialize_agent(
    tools=langchain_tools,
    llm=llm,
    agent="zero-shot-react-description"
)

# Agent can now use Functor for memory and retrieval
response = agent.run("Remember that Alice likes machine learning")
response = agent.run("What does Alice like?")

Multi-Agent Systems

# Multiple agents can share memory through Functor
agent1 = FunctorClient(base_url="http://localhost:8000")
agent2 = FunctorClient(base_url="http://localhost:8000")

# Agent 1 stores knowledge
agent1.connect()
agent1.add_fact(
    fact="Project deadline is January 15",
    category="project_info"
)

# Agent 2 retrieves it
agent2.connect()
facts = agent2.get_semantic_facts(
    query="When is the deadline?",
    category="project_info"
)
print(f"Retrieved: {facts[0].content}")

Best Practices

Knowledge Graph Design

  • Organize by domain: Create separate KGs for different domains (medical, technical, etc.)
  • Use consistent naming: Standardize entity and relation types (see the sketch after this list)
  • Add rich metadata: Include confidence scores, sources, timestamps
  • Monitor growth: Track entity and relation counts
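
One lightweight way to keep naming consistent is to define the allowed entity and relation types in a single place and validate against them before ingestion. This is a convention sketch in plain Python, not an SDK feature:

# Sketch: a shared vocabulary for entity and relation types (convention only)
from enum import Enum

class EntityType(str, Enum):
    CONCEPT = "Concept"
    PERSON = "Person"
    PLACE = "Place"

class RelationType(str, Enum):
    IS_SUBSET_OF = "IS_SUBSET_OF"
    LOCATED_IN = "LOCATED_IN"

def validate_relation(rel):
    allowed = {r.value for r in RelationType}
    if rel["type"] not in allowed:
        raise ValueError(f"Unknown relation type: {rel['type']}")

validate_relation({"from": "ML_001", "to": "AI_001", "type": "IS_SUBSET_OF"})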

Memory Management

  • Use appropriate memory types: Episodic for events, semantic for facts, procedural for workflows
  • Set retention policies: Configure TTLs for short-term memory
  • Consolidate regularly: Archive old episodes to long-term memory (a sketch follows this list)
  • Add context: Include user_id, session_id, timestamps
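
For consolidation, one pattern is to periodically distill old episodes into semantic facts using the retrieve_episodes and add_fact calls shown earlier; the summarization step below is a placeholder for a real LLM call:

# Sketch: consolidate old episodes into long-term semantic memory.
# retrieve_episodes and add_fact are the SDK calls shown earlier;
# the one-line "summary" stands in for a real LLM summarization step.
episodes = client.retrieve_episodes(user_id="alice", limit=100)
for episode in episodes:
    summary = episode.content  # placeholder: summarize with an LLM here
    client.add_fact(
        fact=summary,
        category="consolidated_episodes",
        metadata={"archived_from": episode.timestamp, "user": "alice"}
    )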

Query Optimization

  • Be specific: More specific queries yield better results
  • Use appropriate max_results: Don't retrieve more than needed
  • Target specific KGs: Specify kg_names when possible
  • Monitor performance: Track processing_time_ms (a sketch follows this list)
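
All of these points can be checked from fields that queries.execute already returns (see the Query Processing section). A small wrapper, with an arbitrary example threshold:

# Sketch: flag slow queries using fields returned by queries.execute.
# The 2000 ms threshold is an arbitrary example value.
SLOW_QUERY_MS = 2000

def run_and_monitor(query, **kwargs):
    result = client.queries.execute(query=query, **kwargs)
    if result.processing_time_ms > SLOW_QUERY_MS:
        print(f"SLOW ({result.processing_time_ms}ms, "
              f"pipeline={result.pipeline_used}): {query}")
    return result

result = run_and_monitor(
    "What are the applications of machine learning?",
    max_results=5,           # retrieve only what you need
    kg_names=["KG_AI"]       # target a specific KG to keep retrieval focused
)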

Next Steps