KLBase¶

KLBase is the central orchestration layer in AgentHeaven’s knowledge management system. It provides a unified interface for managing knowledge items (UKF objects) across multiple storage backends (KLStore) and search engines (KLEngine). By combining different storage and engine implementations, KLBase enables flexible, efficient knowledge retrieval and manipulation tailored to specific application needs.

1. Core Concepts¶

KLBase coordinates three main components:

UKF (Unified Knowledge Format): The data model representing knowledge items (see UKF Guide)
KLStore: Storage backends that persist UKF objects (e.g., database, cache, file system)
KLEngine: Search and retrieval engines that index and query UKF objects (e.g., faceted search, autocomplete, vector search)

KLBase acts as the orchestration layer that:

Routes CRUD operations to appropriate storage backends
Maintains consistency across multiple storages and engines
Provides unified search interfaces across different engine types
Exposes tool specifications for agent integration via the @reg_toolspec decorator

2. Basic Architecture¶

KLBase maintains two primary collections that work together to provide comprehensive knowledge management capabilities:

Storage Layer (storages: Dict[str, BaseKLStore]):

A dictionary of named storage backends that persist UKF objects
Each storage provides CRUD operations (create, read, update, delete) for knowledge items
Available storage implementations:
- CacheKLStore: In-memory or file-based cache for fast temporary storage
- DatabaseKLStore: Relational database backend (PostgreSQL, MySQL, SQLite) for persistent storage
- VectorDBKLStore: Specialized vector database storage for embedding-based operations
- CascadeKLStore: Hierarchical storage chain that queries multiple backends in sequence

Engine Layer (engines: Dict[str, BaseKLEngine]):

A dictionary of named search and retrieval engines that index and query UKF objects
Each engine provides specialized search capabilities optimized for different query patterns
Available engine implementations:
- FacetKLEngine: Multi-dimensional faceted search supporting complex filtering and aggregation
- DAACKLEngine: Dynamic autocomplete engine for prefix-based and fuzzy matching
- VectorKLEngine: Semantic similarity search using vector embeddings

Orchestration Model: KLBase coordinates between these layers by routing data operations to the appropriate backends. When a UKF object is inserted or updated, KLBase propagates the changes to all specified storages and engines to maintain consistency. When searching, KLBase delegates queries to the appropriate engine and retrieves full objects from the associated storage backend.
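To make the propagation and delegation concrete, here is a minimal, self-contained sketch of the same pattern. ToyStore, ToyEngine, and ToyBase are hypothetical stand-ins written for illustration; they are not AgentHeaven classes and make no claim about how KLBase is implemented internally.

```python
from typing import Any, Dict, List, Optional, Set


class ToyStore:
    """Hypothetical storage stand-in: keeps items in a dict keyed by id."""

    def __init__(self, name: str):
        self.name = name
        self.items: Dict[int, Dict[str, Any]] = {}

    def upsert(self, item: Dict[str, Any]) -> None:
        self.items[item["id"]] = item

    def get(self, id: int) -> Optional[Dict[str, Any]]:
        return self.items.get(id)


class ToyEngine:
    """Hypothetical engine stand-in: maps lowercase words to item ids."""

    def __init__(self, name: str, storage: ToyStore):
        self.name = name
        self.storage = storage
        self.index: Dict[str, Set[int]] = {}

    def upsert(self, item: Dict[str, Any]) -> None:
        # Drop stale postings first so an update does not leave old words behind.
        for ids in self.index.values():
            ids.discard(item["id"])
        for word in item["content"].lower().split():
            self.index.setdefault(word, set()).add(item["id"])

    def search(self, query: str) -> List[int]:
        return sorted(self.index.get(query.lower(), set()))


class ToyBase:
    """Hypothetical orchestrator: fans writes out, delegates reads."""

    def __init__(self, storages: List[ToyStore], engines: List[ToyEngine]):
        self.storages = {s.name: s for s in storages}
        self.engines = {e.name: e for e in engines}

    def upsert(self, item: Dict[str, Any]) -> None:
        # A write reaches every storage and every engine.
        for store in self.storages.values():
            store.upsert(item)
        for engine in self.engines.values():
            engine.upsert(item)

    def search(self, engine: str, query: str) -> List[Dict[str, Any]]:
        # The engine finds ids; its associated storage supplies full objects.
        eng = self.engines[engine]
        return [{"id": i, "kl": eng.storage.get(i)} for i in eng.search(query)]


store = ToyStore("db_store")
base = ToyBase([store], [ToyEngine("word_engine", store)])
base.upsert({"id": 1, "content": "Fibonacci sequence"})
hits = base.search("word_engine", "fibonacci")
```

The point of the sketch is the division of labor: the engine only stores ids, so search results stay consistent with whatever the storage currently holds.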

3. Class Definition¶

3.1. Constructor¶

class KLBase(ToolRegistry):
    def __init__(
        self,
        storages: Optional[Union[List[BaseKLStore], Dict[str, BaseKLStore]]] = None,
        engines: Optional[Union[List[BaseKLEngine], Dict[str, BaseKLEngine]]] = None,
        name: Optional[str] = None,
        *args,
        **kwargs,
    ):

Parameters:

storages (Optional[Union[List[BaseKLStore], Dict[str, BaseKLStore]]]):
- Storage backends for persisting UKF objects
- Can be provided as a list (each storage keyed by its storage.name) or a dict (custom keys)
- Defaults to empty dict if None
engines (Optional[Union[List[BaseKLEngine], Dict[str, BaseKLEngine]]]):
- Search/retrieval engines for querying UKF objects
- Can be provided as a list (each engine keyed by its engine.name) or a dict (custom keys)
- Defaults to empty dict if None
name (Optional[str]):
- Identifier for this KLBase instance
- Defaults to “default” if None
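The list-or-dict convention can be pictured as a small normalization step. The helper below is a hypothetical sketch (normalize_backends and Named are illustrative names, not part of the library) of how a list keyed by each object's name attribute and a dict with custom keys could both collapse to one dictionary.

```python
from typing import Any, Dict, List, Optional, Union


class Named:
    """Hypothetical stand-in for any backend exposing a `name` attribute."""

    def __init__(self, name: str):
        self.name = name


def normalize_backends(
    backends: Optional[Union[List[Any], Dict[str, Any]]],
) -> Dict[str, Any]:
    """Return a name -> backend dict for None, list, or dict input."""
    if backends is None:
        return {}
    if isinstance(backends, dict):
        return dict(backends)  # copy so the caller's dict is not shared
    return {backend.name: backend for backend in backends}


cache, db = Named("cache_store"), Named("db_store")
from_list = normalize_backends([cache, db])       # keys come from .name
from_dict = normalize_backends({"primary": db})   # keys chosen by the caller
```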

3.2. Storage Management¶

3.2.1. Adding Storage¶

def add_storage(self, storage: BaseKLStore, name: Optional[str] = None):

Registers a new storage backend.

Parameters:

storage (BaseKLStore): The storage instance to add
name (Optional[str]): Registration key (defaults to storage.name)

3.2.2. Removing Storage¶

def del_storage(self, name: str):

Unregisters and removes a storage backend.

Parameters:

name (str): The registration key of the storage to remove

3.3. Engine Management¶

3.3.1. Adding Engine¶

def add_engine(self, engine: BaseKLEngine, name: Optional[str] = None):

Registers a new search/retrieval engine.

Parameters:

engine (BaseKLEngine): The engine instance to add
name (Optional[str]): Registration key (defaults to engine.name)

3.3.2. Removing Engine¶

def del_engine(self, name: str):

Unregisters and removes an engine.

Parameters:

name (str): The registration key of the engine to remove

4. Data Operations¶

All data operations support selective storage/engine targeting via the storages and engines parameters. If a parameter is omitted, the operation applies to all registered storages or engines; passing an empty list skips that layer.

4.1. Insert/Upsert Operations¶

4.1.1. Single Upsert¶

def upsert(self, kl: BaseUKF, storages: List[str] = None, engines: List[str] = None):

Insert or update a single UKF object across specified storages and engines.

Parameters:

kl (BaseUKF): The knowledge item to upsert
storages (List[str], optional): Target storage names (defaults to all)
engines (List[str], optional): Target engine names (defaults to all)

4.1.2. Single Insert¶

def insert(self, kl: BaseUKF, storages: List[str] = None, engines: List[str] = None):

Insert a new UKF object. Depending on the backend, this may raise an error if the object already exists.

Parameters: Same as upsert

4.1.3. Batch Upsert¶

def batch_upsert(self, kls: List[BaseUKF], storages: List[str] = None, engines: List[str] = None):

Bulk upsert multiple UKF objects (more efficient than individual upserts).

Parameters:

kls (List[BaseUKF]): List of knowledge items to upsert
storages/engines: Same as single upsert

4.1.4. Batch Insert¶

def batch_insert(self, kls: List[BaseUKF], storages: List[str] = None, engines: List[str] = None):

Bulk insert multiple UKF objects.

Parameters: Same as batch_upsert

4.1.5. Selective Synchronization¶

Control which storages/engines receive updates:

# Only update database storage
klbase.upsert(kl, storages=["db_store"], engines=[])

# Only update search engines
klbase.upsert(kl, storages=[], engines=["facet_engine", "vec_engine"])

# Update everything (default)
klbase.upsert(kl)
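One way to read the three calls above: leaving a parameter unset targets every registered backend, an empty list targets none, and an explicit list targets only the named ones. The function below is a hypothetical sketch of that rule (resolve_targets is not a library function, and raising KeyError for unknown names is an assumption of this sketch).

```python
from typing import Any, Dict, List, Optional


def resolve_targets(
    registered: Dict[str, Any],
    names: Optional[List[str]] = None,
) -> List[Any]:
    """Pick the backends an operation should touch."""
    if names is None:
        # Parameter omitted: apply to everything that is registered.
        return list(registered.values())
    missing = [name for name in names if name not in registered]
    if missing:
        # Assumption of this sketch: unknown names are an error, not ignored.
        raise KeyError(f"unknown backend(s): {missing}")
    # An empty list falls through here and yields no targets at all.
    return [registered[name] for name in names]


registered = {"db_store": "DB", "cache_store": "CACHE"}
everything = resolve_targets(registered)
nothing = resolve_targets(registered, [])
only_cache = resolve_targets(registered, ["cache_store"])
```

Distinguishing None from an empty list is what lets a single signature express both "all" and "none".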

4.2. Remove Operations¶

4.2.1. Single Remove¶

def remove(self, key: Union[int, str, BaseUKF], storages: List[str] = None, engines: List[str] = None):

Remove a single UKF object by key.

Parameters:

key (Union[int, str, BaseUKF]): ID, name, or UKF instance to remove
storages/engines: Target backends (defaults to all)

4.2.2. Batch Remove¶

def batch_remove(self, keys: List[Union[int, str, BaseUKF]], storages: List[str] = None, engines: List[str] = None):

Bulk remove multiple UKF objects.

Parameters:

keys (List[Union[int, str, BaseUKF]]): List of identifiers to remove
storages/engines: Target backends (defaults to all)

4.2.3. Clear All¶

def clear(self, storages: List[str] = None, engines: List[str] = None):

Remove all UKF objects from specified storages and engines.

Parameters:

storages/engines: Target backends to clear (defaults to all)

4.3. Search Operations¶

def search(self, engine: str, *args, **kwargs) -> Iterable[Dict[str, Any]]:

Execute a search query using a specified engine.

Parameters:

engine (str): Name of the engine to use
*args, **kwargs: Engine-specific search parameters

Returns:

Iterable[Dict[str, Any]]: An iterable of search results, each a dictionary

Standard Result Keys (engine-dependent):

"id" (int): UKF object identifier
"kl" (BaseUKF): The actual UKF instance (if engine is recoverable)
Additional keys vary by engine type (e.g., "score" for vector search, "rank" for autocomplete)

Example:

# Faceted search
results = klbase.search(engine='facet_engine', mode='facet', tags=KLOp.NF(slot="TOPIC", value="math"))

# Vector search
results = klbase.search(engine='vec_engine', query="fibonacci sequence", topk=5)

# Autocomplete
results = klbase.search(engine='ac_engine', query="Fibo", topk=10)
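Because the available keys depend on the engine, calling code may want to read results defensively. The sketch below works on plain dictionaries shaped like the results described above; rank_results is a hypothetical helper, and treating a missing "score" as 0.0 is an assumption made for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple


def rank_results(
    results: Iterable[Dict[str, Any]],
    topk: int = 5,
) -> List[Tuple[int, float]]:
    """Return up to `topk` (id, score) pairs, highest score first."""
    pairs = [
        (result["id"], float(result.get("score", 0.0)))  # score is optional
        for result in results
        if "id" in result
    ]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:topk]


sample = [
    {"id": 3, "score": 0.42},
    {"id": 7},                 # e.g. an engine that reports no score
    {"id": 5, "score": 0.91},
]
ranked = rank_results(sample, topk=2)
```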

4.4. Listing Search Methods¶

def list_search(self) -> List[Tuple[str, Optional[str]]]:

Enumerate all available search methods across registered engines.

Returns:

List[Tuple[str, Optional[str]]]: List of (engine_name, search_mode) pairs
- engine_name: Name of the engine
- search_mode: Engine’s search mode (optional, None if engine has only one mode)

Example:

search_methods = klbase.list_search()
# [('facet_engine', 'facet'), ('ac_engine1', None), ('vec_engine', 'vector')]
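The (engine_name, search_mode) pairs can be turned into a lookup table, for example to build a menu of search options. The helper below operates on a plain list shaped like the output shown above; group_search_modes is a hypothetical name, not a library function.

```python
from typing import Dict, List, Optional, Tuple


def group_search_modes(
    methods: List[Tuple[str, Optional[str]]],
) -> Dict[str, List[Optional[str]]]:
    """Group search modes by engine name, preserving input order."""
    grouped: Dict[str, List[Optional[str]]] = {}
    for engine_name, search_mode in methods:
        # A mode of None is kept as-is: it marks a single-mode engine.
        grouped.setdefault(engine_name, []).append(search_mode)
    return grouped


methods = [("facet_engine", "facet"), ("ac_engine1", None), ("vec_engine", "vector")]
grouped = group_search_modes(methods)
```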

5. Creating Custom KLBase Applications¶

The typical workflow for creating a KLBase-powered application:

5.1. Inherit KLBase¶

from typing import Optional

from ahvn.cache import JsonCache
from ahvn.klbase import KLBase
from ahvn.klstore import DatabaseKLStore, CacheKLStore
from ahvn.klengine import FacetKLEngine, DAACKLEngine, VectorKLEngine
from ahvn.tool.mixin import reg_toolspec

class MyKLBase(KLBase):
    def __init__(self, name: str, path: Optional[str] = None):
        super().__init__(name=name)
        self.path = path or f"./.ahvn/{self.name}/"
        
        # Configure storages and engines (see below)

5.2. Configure Storages and Engines¶

In the __init__ method, set up your storage and engine combinations:

# Example: Multiple storage backends
self.add_storage(
    CacheKLStore(name="cache_store", cache=JsonCache(path=f"{self.path}/cache"))
)
self.add_storage(
    DatabaseKLStore(name="db_store", provider="pg", database="my_db")
)

# Example: Faceted search engine
self.add_engine(
    FacetKLEngine(
        name="facet_engine",
        storage=self.storages["db_store"],
        inplace=True,  # Use storage's native query capabilities
    )
)

# Example: Autocomplete engine for knowledge items
self.add_engine(
    DAACKLEngine(
        name="ac_by_name",
        storage=self.storages["db_store"],
        path=f"{self.path}/ac_index",
        encoder=lambda kl: [kl.name or "", kl.content or ""],
        condition=lambda kl: kl.type == "knowledge",
    )
)

# Example: Vector search engine for experience items
self.add_engine(
    VectorKLEngine(
        name="vec_engine",
        provider="pgvector",
        collection="vec_collection",
        storage=self.storages["db_store"],
        encoder=(
            lambda kl: f"{kl.name} | {kl.content}",
            lambda query: query
        ),
        embedder="embedder",  # References configured embedder
        condition=lambda kl: kl.type == "experience",
    )
)

Key Considerations:

Storage Selection: Choose based on persistence needs (cache for temporary, database for long-term)
Engine-Storage Pairing: Engines typically reference a storage backend for data retrieval
Inplace vs. External Indexing:
- inplace=True: Use storage’s native query (e.g., SQL WHERE clauses)
- inplace=False: Maintain separate index structures
Conditional Indexing: Use condition lambda to selectively index UKF objects (e.g., by type)
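The encoder and condition ideas can be illustrated without any backend. The sketch below is a toy prefix index, assuming a simplified item type; Item and ConditionalIndex are hypothetical names and do not mirror how DAACKLEngine or VectorKLEngine build their indexes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Item:
    """Hypothetical stand-in for a UKF object with only the fields used here."""
    id: int
    name: str
    type: str


class ConditionalIndex:
    """Toy prefix index that only admits items passing `condition`."""

    def __init__(
        self,
        encoder: Callable[[Item], List[str]],
        condition: Optional[Callable[[Item], bool]] = None,
    ):
        self.encoder = encoder
        self.condition = condition
        self.entries: Dict[int, List[str]] = {}

    def upsert(self, item: Item) -> bool:
        """Index the item if it passes the condition; report whether it did."""
        if self.condition is not None and not self.condition(item):
            self.entries.pop(item.id, None)  # drop any stale entry
            return False
        self.entries[item.id] = [text.lower() for text in self.encoder(item)]
        return True

    def search(self, prefix: str) -> List[int]:
        """Return ids whose encoded texts start with `prefix` (case-insensitive)."""
        prefix = prefix.lower()
        return sorted(
            item_id
            for item_id, texts in self.entries.items()
            if any(text.startswith(prefix) for text in texts)
        )


index = ConditionalIndex(
    encoder=lambda item: [item.name],
    condition=lambda item: item.type == "knowledge",
)
index.upsert(Item(id=1, name="Fibonacci Sequence", type="knowledge"))
index.upsert(Item(id=2, name="fibonacci(10)", type="experience"))  # filtered out
matches = index.search("fibo")
```

Only the item of type "knowledge" is searchable here, even though both names share the prefix, which is the effect a condition lambda is meant to have.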

5.3. Define Tool Specifications¶

Use the @reg_toolspec() decorator to expose methods as agent-callable tools:

from typing import Any, Dict, List
from ahvn.utils.klop import KLOp

class MyKLBase(KLBase):
    # ... __init__ configuration ...
    
    @reg_toolspec()
    def facet_search(
        self, 
        facets: Dict[str, Any], 
        topk: int = 20, 
        offset: int = 0
    ) -> Dict[str, Any]:
        """Perform a facet search on the KLBase.
        
        Args:
            facets (Dict[str, Any]): A dictionary of facet criteria.
                Each facet is a key-value pair where the key is the facet name
                and the value is the desired value or condition.
                For complex conditions, use KLOp.NF, KLOp.LIKE, KLOp.OR, etc.
                Example:
                    {
                        "tags": KLOp.NF(slot="TOPIC", value=KLOp.LIKE("math%")),
                        "priority": KLOp.OR([42, 7]),
                        "type": "knowledge"
                    }
                Supported fields: id, name, type, content, tags, synonyms, priority
            topk (int, optional): Maximum number of results. Defaults to 20.
            offset (int, optional): Number of initial results to skip. Defaults to 0.
            
        Returns:
            Dict[str, Any]: Dictionary with keys:
                - "cnt" (int): Total count of matching items
                - "kls" (List[BaseUKF]): Top-k matching knowledge items
        """
        ids = self.engines['facet_engine'].search(
            mode='facet', include=['id'], **facets
        )
        return {
            "cnt": len(ids),
            "kls": [
                self.engines['facet_engine'].storage.get(id=id_) 
                for id_ in ids[offset:offset+topk]
            ]
        }
    
    @reg_toolspec()
    def vector_search(
        self, 
        query: str, 
        topk: int = 5
    ) -> List[Dict[str, Any]]:
        """Perform semantic vector search.
        
        Args:
            query (str): Natural language search query
            topk (int, optional): Number of results to return. Defaults to 5.
            
        Returns:
            List[Dict[str, Any]]: Search results with keys:
                - "id" (int): UKF object ID
                - "kl" (BaseUKF): The knowledge item
                - "score" (float): Similarity score
        """
        return self.search(
            engine='vec_engine',
            query=query,
            include=["id", "kl", "score"],
            topk=topk
        )

Best Practices for Tool Definitions:

Comprehensive Docstrings: Include detailed descriptions, parameter types, and examples (automatically parsed by @reg_toolspec)
Sensible Defaults: Provide default values for optional parameters
Structured Returns: Return dictionaries or structured objects for easy agent parsing
Error Handling: Consider wrapping engine calls in try-except for graceful failure
Parameter Validation: Validate inputs before passing to underlying engines
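The last two practices can be combined in a thin wrapper around any search callable. The sketch below is one possible shape, assuming the callable accepts query and topk keyword arguments; safe_search and the two demo functions are hypothetical, and the {"ok", "error", "results"} return layout is a choice made for this example.

```python
from typing import Any, Callable, Dict


def safe_search(
    search_fn: Callable[..., Any],
    query: str,
    topk: int = 5,
) -> Dict[str, Any]:
    """Validate inputs, run the search, and report failures as data."""
    if not isinstance(query, str) or not query.strip():
        return {"ok": False, "error": "query must be a non-empty string", "results": []}
    if isinstance(topk, bool) or not isinstance(topk, int) or topk <= 0:
        return {"ok": False, "error": "topk must be a positive integer", "results": []}
    try:
        results = list(search_fn(query=query.strip(), topk=topk))
    except Exception as exc:  # surface backend failures instead of crashing the agent
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}", "results": []}
    return {"ok": True, "error": None, "results": results}


def echo_engine(query: str, topk: int):
    """Demo callable that always succeeds."""
    return [{"id": i, "query": query} for i in range(topk)]


def flaky_engine(query: str, topk: int):
    """Demo callable that always fails."""
    raise TimeoutError("backend unavailable")


good = safe_search(echo_engine, " fib ", topk=2)
bad = safe_search(flaky_engine, "fib")
```

Returning a structured error rather than raising gives an agent something it can read and react to.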

5.4. Export Tools for Agent Use¶

# Instantiate your KLBase
klbase = MyKLBase(name="my_app_kb")

# Export all @reg_toolspec tools as ToolSpec objects
tools = klbase.to_toolspecs()

# List available tools
tool_names = klbase.list_toolspecs()
print(tool_names)  # ['facet_search', 'vector_search']

# Use tools directly
result = tools['facet_search'](facets={"type": "knowledge"}, topk=10)

# Or integrate with agent frameworks
# (e.g., pass tools to LLM function calling, MCP servers, etc.)
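For intuition about what such an export step involves, the sketch below marks methods with a decorator and then collects their names, docstrings, and parameter lists through introspection. mark_tool and collect_tools are hypothetical stand-ins built on the standard inspect module; they do not reproduce @reg_toolspec or to_toolspecs, whose internals are not described here.

```python
import inspect
from typing import Any, Callable, Dict


def mark_tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Flag a function so it can be discovered later."""
    fn._is_tool = True
    return fn


def collect_tools(obj: Any) -> Dict[str, Dict[str, Any]]:
    """Gather flagged bound methods with their docstring and parameter names."""
    tools: Dict[str, Dict[str, Any]] = {}
    for name, method in inspect.getmembers(obj, predicate=inspect.ismethod):
        if getattr(method, "_is_tool", False):
            tools[name] = {
                "call": method,
                "description": inspect.getdoc(method) or "",
                # The signature of a bound method already omits `self`.
                "parameters": list(inspect.signature(method).parameters),
            }
    return tools


class Demo:
    @mark_tool
    def echo(self, text: str, times: int = 1) -> str:
        """Repeat text a number of times."""
        return text * times

    def hidden(self) -> None:
        """Not marked, so it is not exported."""


tools = collect_tools(Demo())
repeated = tools["echo"]["call"]("ab", times=2)
```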

6. Complete Example¶

Below is a complete example demonstrating KLBase usage (from db_demo.py):

from typing import Any, Dict, Optional

from ahvn.ukf import ptags
from ahvn.ukf.templates.basic import KnowledgeUKFT, ExperienceUKFT
from ahvn.cache import InMemCache, JsonCache
from ahvn.klstore import DatabaseKLStore, CacheKLStore
from ahvn.klengine import FacetKLEngine, DAACKLEngine, VectorKLEngine
from ahvn.klbase import KLBase
from ahvn.tool.mixin import reg_toolspec
from ahvn.utils.klop import KLOp

class MyKLBase(KLBase):
    def __init__(self, name: str, path: Optional[str] = None):
        super().__init__(name=name)
        self.path = path or f"./.ahvn/{self.name}/"
        
        # Storage 1: JSON cache for fast local access
        self.add_storage(
            CacheKLStore(
                name="store1",
                cache=JsonCache(path=f"{self.path}/store1")
            )
        )
        
        # Storage 2: PostgreSQL database for persistent storage
        self.add_storage(
            DatabaseKLStore(
                name="store2",
                provider="pg",
                database="store2"
            )
        )
        
        # Engine 1: Faceted search on database
        self.add_engine(
            FacetKLEngine(
                name="facet_engine",
                storage=self.storages["store2"],
                inplace=True,
            )
        )
        
        # Engine 2: Autocomplete by name and content
        self.add_engine(
            DAACKLEngine(
                name="ac_engine1",
                storage=self.storages["store2"],
                path=f"{self.path}/ac_index_by_name_content",
                encoder=lambda kl: [kl.name or "", kl.content or ""],
                condition=lambda kl: kl.type == "knowledge",
            )
        )
        
        # Engine 3: Autocomplete by synonyms
        self.add_engine(
            DAACKLEngine(
                name="ac_engine2",
                storage=self.storages["store2"],
                path=f"{self.path}/ac_index_by_synonyms",
                encoder=lambda kl: [syn or "" for syn in kl.synonyms],
                condition=lambda kl: kl.type == "knowledge",
            )
        )
        
        # Engine 4: Vector search for experiences
        self.add_engine(
            VectorKLEngine(
                name="vec_engine",
                provider="pgvector",
                collection="vec_store2",
                storage=self.storages["store2"],
                inplace=False,
                include=["id"],
                encoder=(
                    lambda kl: f"{kl.name or ''} | {kl.content or ''}",
                    lambda query: query
                ),
                embedder="embedder",
                condition=lambda kl: kl.type == "experience",
            )
        )
    
    @reg_toolspec()
    def facet_search(
        self,
        facets: Dict[str, Any],
        topk: int = 20,
        offset: int = 0
    ) -> Dict[str, Any]:
        """Perform a facet search on the KLBase."""
        ids = self.engines['facet_engine'].search(
            mode='facet', include=['id'], **facets
        )
        return {
            "cnt": len(ids),
            "kls": [
                self.engines['facet_engine'].storage.get(id=id_)
                for id_ in ids[offset:offset+topk]
            ]
        }

# Usage example
if __name__ == "__main__":
    klbase = MyKLBase(name="my_klbase")
    klbase.clear()
    
    # 1. Create and insert knowledge items
    kl1 = KnowledgeUKFT(
        name="Fibonacci Sequence",
        type="knowledge",
        content="The Fibonacci sequence is a series of numbers...",
        tags=ptags(TOPIC="math", ENTITY="object"),
        synonyms=["Fibonacci numbers", "Fibonacci series"],
    )
    klbase.batch_upsert([kl1])
    
    # 2. Generate experiences from function cache
    cache = InMemCache()
    
    @cache.memoize()
    def fibonacci(n: int) -> int:
        if n <= 1:
            return n
        return fibonacci(n - 1) + fibonacci(n - 2)
    
    fibonacci(100)
    
    exps = [
        ExperienceUKFT.from_cache_entry(
            entry,
            tags=ptags(
                TOPIC="math",
                NUMBER=[str(entry.inputs['n']), str(entry.output)]
            ),
            synonyms=[f"fibonacci({entry.inputs['n']})", f"{entry.output}"]
        )
        for entry in cache
    ]
    klbase.batch_upsert(exps)
    
    # 3. Search using different engines
    # Faceted search
    for kl in klbase.search(
        engine='facet_engine',
        mode='facet',
        tags=KLOp.NF(slot="TOPIC", value=KLOp.LIKE("math%"))
    ):
        print(kl)
    
    # Autocomplete search
    for kl in klbase.search(engine='ac_engine1', query="Fibonacci"):
        print(kl)
    
    # Vector search
    for kl in klbase.search(
        engine='vec_engine',
        query="fibonacci 63",
        include=["id", "kl", "score"],
        topk=3
    ):
        print(kl)

Further Exploration¶

Tip: For a deeper understanding of KLBase components, see:

UKF - Data model for knowledge items

KLStore - Storage layer for knowledge objects

KLEngine - Search engine implementations built on top of KLStores

Tool Specifications - Agent integration details