
Basic Concepts

Understanding the core concepts of Functor will help you build more effective applications. This guide covers knowledge graphs, vector databases, memory systems, and agent integration.

Knowledge Graphs

Knowledge graphs are structured representations of information as entities and their relationships. In Functor, they serve as the foundation for intelligent data retrieval.

Components

Entities

The "nodes" representing concepts, people, places, or things.

{ id: 'ML_001', type: 'Concept', name: 'Machine Learning' }

Relations

The "edges" connecting entities with semantic meaning.

{ from: 'ML_001', to: 'AI_001', type: 'IS_SUBSET_OF' }

Attributes

Properties providing additional context about entities.

{ definition: '...', confidence: 0.95, source: 'doc_123' }
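Taken together, entities, relations, and attributes form a traversable graph. As a minimal sketch in plain Python (independent of the SDK, reusing the example records above):

# Minimal in-memory sketch of the structures above (illustrative, not the SDK)
entities = {
    "ML_001": {"type": "Concept", "name": "Machine Learning",
               "definition": "...", "confidence": 0.95, "source": "doc_123"},
    "AI_001": {"type": "Concept", "name": "Artificial Intelligence"},
}
relations = [
    {"from": "ML_001", "to": "AI_001", "type": "IS_SUBSET_OF"},
]

# Traverse the outgoing edges of an entity
for rel in relations:
    if rel["from"] == "ML_001":
        target = entities[rel["to"]]["name"]
        print(f"Machine Learning {rel['type']} {target}")
# -> Machine Learning IS_SUBSET_OF Artificial Intelligence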

Knowledge Graph Structure

# Example: Querying knowledge graph structure
from functor_sdk import FunctorClient
client = FunctorClient()
# Get KG statistics
kgs = client.knowledge_graphs.list(include_stats=True)
for kg in kgs:
    print(f"{kg.name}:")
    print(f"  Entities: {kg.entities_count:,}")
    print(f"  Relations: {kg.relations_count:,}")
    print(f"  Sources: {kg.sources_count}")

Vector Databases

Vector databases store high-dimensional embeddings that capture semantic meaning. They enable similarity search and context-aware retrieval.

How Embeddings Work

Text → Embedding → Search

1. Input Text: "Machine learning algorithms"
2. Embedding: [0.23, -0.45, 0.89, ...] (768 dimensions)
3. Search: Find similar vectors in database
4. Results: Semantically similar content
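
Similarity between embeddings is typically measured with cosine similarity. The sketch below (plain Python, not part of the Functor SDK, with made-up 4-dimensional vectors) shows the comparison that a vector database performs at scale:

# Sketch: the similarity computation behind vector search (illustrative)
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = [0.23, -0.45, 0.89, 0.10]           # embedding of the query
doc_vec = [0.25, -0.40, 0.85, 0.05]             # embedding of a stored chunk
unrelated = [-0.90, 0.10, -0.20, 0.70]          # embedding of unrelated text

print(cosine_similarity(query_vec, doc_vec))    # close to 1.0 -> retrieved
print(cosine_similarity(query_vec, unrelated))  # much lower -> skipped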

Semantic Search

# Semantic search finds conceptually similar content
query = "How do neural networks learn?"
# These documents would be retrieved even without exact keyword matches:
# - "Training deep learning models with backpropagation"
# - "Gradient descent optimization in AI"
# - "Supervised learning algorithms"
result = client.queries.execute(
    query=query,
    max_results=5
)

# Results are ranked by semantic similarity
for citation in result.citations:
    print(f"Relevance: {citation.relevance_score}")
    print(f"Content: {citation.chunk_text[:100]}...")

Memory Systems

Functor implements three types of memory inspired by cognitive science:

1. Episodic Memory

Stores experiences and events

Records what happened, when, and in what context. Used for conversation history, user interactions, and temporal reasoning.

from functor_sdk import FunctorClient
client = FunctorClient(base_url="http://localhost:8000")
client.connect()
# Store an episode
episode_id = client.add_episode(
    content="User asked about machine learning applications",
    user_id="alice",
    metadata={
        "timestamp": "2024-01-01T10:30:00Z",
        "session_id": "sess_123",
        "intent": "question"
    }
)

# Retrieve episodes
episodes = client.retrieve_episodes(
    user_id="alice",
    limit=10
)
for episode in episodes:
    print(f"[{episode.timestamp}] {episode.content}")

2. Semantic Memory

Stores facts and knowledge

General knowledge independent of personal experience. Used for factual information, definitions, and relationships.

# Store semantic facts
client.add_fact(
    fact="Machine learning is a subset of artificial intelligence",
    category="definitions",
    metadata={"domain": "AI", "confidence": 0.95}
)

# Retrieve related facts
facts = client.get_semantic_facts(
    query="What is machine learning?",
    category="definitions",
    limit=5
)
for fact in facts:
    print(f"Fact: {fact.content}")
    print(f"Confidence: {fact.confidence}")

3. Procedural Memory

Stores how-to knowledge

Skills and procedures for performing tasks. Used for workflows, strategies, and action sequences.

# Store a procedure
client.add_procedure(
    name="document_analysis_workflow",
    steps=[
        "Extract text from document",
        "Identify key entities",
        "Build knowledge graph",
        "Generate summary"
    ],
    metadata={"category": "data_processing"}
)

# Retrieve and execute
procedure = client.get_procedure("document_analysis_workflow")
for step in procedure.steps:
    print(f"Step: {step}")

Data Ingestion Pipeline

Understanding the ingestion pipeline helps you optimize document processing:

Processing Stages

1. Document Upload: File or URL submitted to the ingestion API
2. Text Extraction: Content extracted from PDF, HTML, DOCX, etc.
3. Chunking: Text split into semantic chunks (typically 500-1000 tokens)
4. Entity Extraction: NER identifies entities (people, places, concepts)
5. Relation Extraction: Relationships between entities identified
6. Embedding Generation: Vector embeddings created for semantic search
7. Storage: Data stored in Neo4j (graph), Qdrant (vectors), SQLite (metadata)
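
Stage 3 is often the one you tune most. As a rough illustration of fixed-size chunking with overlap (a simplified stand-in for Functor's semantic chunker; it counts whitespace-separated words rather than real tokenizer tokens):

# Sketch: fixed-size chunking with overlap (simplified, not the SDK's chunker)
def chunk_text(text, chunk_size=750, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size "tokens".

    Whitespace splitting is a crude proxy for a real tokenizer; semantic
    chunkers also respect sentence and section boundaries.
    """
    tokens = text.split()
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        start += chunk_size - overlap  # step back so adjacent chunks share context
    return chunks

document = "Machine learning is a subset of artificial intelligence. " * 200
for i, chunk in enumerate(chunk_text(document)):
    print(f"Chunk {i}: {len(chunk.split())} tokens")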

Query Processing

When you execute a query, multiple systems work together:

Query Flow

1. Query Understanding: Intent classification and entity recognition
2. Pipeline Selection: Choose the appropriate retrieval strategy (knowledge graph, vector search, hybrid)
3. Knowledge Retrieval: Search the knowledge graph and vector database
4. Context Assembly: Combine retrieved information with user context
5. LLM Generation: Generate a natural language response
6. Citation & Validation: Add source citations and validate answer quality

# Query with full context
result = client.queries.execute(
    query="What are the applications of machine learning?",
    user_id="alice",                    # User context
    max_results=10,                     # Retrieval depth
    validate_answer=True,               # Answer validation
    include_citations=True,             # Source tracking
    kg_names=["KG_Universal", "KG_AI"]  # Target KGs
)

# Understand the response
print(f"Pipeline: {result.pipeline_used}")  # Which strategy was used
print(f"Confidence: {result.confidence}")   # Answer quality score
print(f"Processing: {result.processing_time_ms}ms")
print(f"\nAnswer: {result.answer}")

# Examine sources
print(f"\nCitations ({len(result.citations)}):")
for citation in result.citations:
    print(f"- {citation.source} (relevance: {citation.relevance_score})")

Agent Integration

Functor can be integrated into agent frameworks through the SDK:

LangChain Integration

from functor_sdk import FunctorClient
from functor_sdk import LangChainAdapter  # import path may differ in your SDK version

# Create Functor client
client = FunctorClient(base_url="http://localhost:8000")

# Create LangChain tools
langchain_tools = LangChainAdapter.create_tools(client)

# Use in LangChain agent
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
agent = initialize_agent(
    tools=langchain_tools,
    llm=llm,
    agent="zero-shot-react-description"
)

# Agent can now use Functor for memory and retrieval
response = agent.run("Remember that Alice likes machine learning")
response = agent.run("What does Alice like?")

Multi-Agent Systems

# Multiple agents can share memory through Functor
agent1 = FunctorClient(base_url="http://localhost:8000")
agent2 = FunctorClient(base_url="http://localhost:8000")

# Agent 1 stores knowledge
agent1.connect()
agent1.add_fact(
    fact="Project deadline is January 15",
    category="project_info"
)

# Agent 2 retrieves it
agent2.connect()
facts = agent2.get_semantic_facts(
    query="When is the deadline?",
    category="project_info"
)
print(f"Retrieved: {facts[0].content}")

Best Practices

Knowledge Graph Design

  • Organize by domain: Create separate KGs for different domains (medical, technical, etc.)
  • Use consistent naming: Standardize entity and relation types (see the sketch after this list)
  • Add rich metadata: Include confidence scores, sources, timestamps
  • Monitor growth: Track entity and relation counts
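
One lightweight way to keep naming consistent is to define the allowed entity and relation types in a single place and validate against them before ingestion. This is a convention sketch in plain Python, not an SDK feature:

# Sketch: a shared vocabulary for entity and relation types (convention only)
from enum import Enum

class EntityType(str, Enum):
    CONCEPT = "Concept"
    PERSON = "Person"
    PLACE = "Place"

class RelationType(str, Enum):
    IS_SUBSET_OF = "IS_SUBSET_OF"
    LOCATED_IN = "LOCATED_IN"

def validate_relation(rel):
    allowed = {r.value for r in RelationType}
    if rel["type"] not in allowed:
        raise ValueError(f"Unknown relation type: {rel['type']}")

validate_relation({"from": "ML_001", "to": "AI_001", "type": "IS_SUBSET_OF"})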

Memory Management

  • Use appropriate memory types: Episodic for events, semantic for facts, procedural for workflows
  • Set retention policies: Configure TTLs for short-term memory
  • Consolidate regularly: Archive old episodes to long-term memory (a sketch follows this list)
  • Add context: Include user_id, session_id, timestamps
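
For consolidation, one pattern is to periodically distill old episodes into semantic facts using the retrieve_episodes and add_fact calls shown earlier; the summarization step below is a placeholder for a real LLM call:

# Sketch: consolidate old episodes into long-term semantic memory.
# retrieve_episodes and add_fact are the SDK calls shown earlier;
# the one-line "summary" stands in for a real LLM summarization step.
episodes = client.retrieve_episodes(user_id="alice", limit=100)
for episode in episodes:
    summary = episode.content  # placeholder: summarize with an LLM here
    client.add_fact(
        fact=summary,
        category="consolidated_episodes",
        metadata={"archived_from": episode.timestamp, "user": "alice"}
    )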

Query Optimization

  • Be specific: More specific queries yield better results
  • Use appropriate max_results: Don't retrieve more than needed
  • Target specific KGs: Specify kg_names when possible
  • Monitor performance: Track processing_time_ms (a sketch follows this list)
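
All of these points can be checked from fields that queries.execute already returns (see the Query Processing section). A small wrapper, with an arbitrary example threshold:

# Sketch: flag slow queries using fields returned by queries.execute.
# The 2000 ms threshold is an arbitrary example value.
SLOW_QUERY_MS = 2000

def run_and_monitor(query, **kwargs):
    result = client.queries.execute(query=query, **kwargs)
    if result.processing_time_ms > SLOW_QUERY_MS:
        print(f"SLOW ({result.processing_time_ms}ms, "
              f"pipeline={result.pipeline_used}): {query}")
    return result

result = run_and_monitor(
    "What are the applications of machine learning?",
    max_results=5,           # retrieve only what you need
    kg_names=["KG_AI"]       # target a specific KG to keep retrieval focused
)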

Next Steps