Security & Data Architecture RAG March 26, 2026 • 6 min read

Solving AI Context Bloat: Building the Organization Librarian

Sotirios Tsartsaris

Digital Infrastructure Architect

The fastest way to fail a security audit in 2026 is to deploy a naive RAG (Retrieval-Augmented Generation) pipeline.

The standard industry approach to AI over the last two years has been reckless: take thousands of company PDFs, chunk them, embed them into a massive, global vector database, and let an LLM query them via semantic similarity.

But what happens when a junior analyst asks the AI a generic question about "company payroll," and the vector search mistakenly retrieves a founder's unredacted employment contract because it was the closest semantic match?

RAG without strict governance is not a feature; it is a data breach waiting to happen. At ByteTect, we had to architect a system that guarantees data privacy and compliance (essential for SFDR/ESG frameworks) before the LLM ever sees a single token.

Here is how we built the Librarian Node and our Role-Based Vector infrastructure.

1. Hard-Isolated Multi-Tenancy

You cannot rely on application-layer filtering to separate client data. If you put multiple companies into one index, a bug in your query logic will leak data across business boundaries.

In our vector_store.py infrastructure, we enforce physical isolation at the Elasticsearch level. Every tenant receives a dynamically provisioned, sanitized index: corp_know_{company_id}. When the Librarian agent initiates a search, the routing logic physically cannot cross-pollinate data between different business clients.
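A minimal sketch of that routing discipline, assuming a `company_id` validated upstream (the `corp_know_{company_id}` naming comes from the article; the helper names and validation pattern here are illustrative):

```python
import re

INDEX_TEMPLATE = "corp_know_{company_id}"

def tenant_index(company_id: str) -> str:
    """Return the hard-isolated index name for one tenant.

    Rejects anything that is not a plain identifier, so a malformed
    or malicious ID can never address another tenant's index.
    """
    if not re.fullmatch(r"[a-z0-9_-]+", company_id):
        raise ValueError(f"invalid company_id: {company_id!r}")
    return INDEX_TEMPLATE.format(company_id=company_id)

def search_tenant(es_client, company_id: str, query: dict) -> dict:
    # Every query is pinned to exactly one tenant index; there is no
    # code path that searches a shared or wildcard index.
    return es_client.search(index=tenant_index(company_id), body=query)
```

Because the index name is derived, validated, and pinned in one place, a bug in downstream query logic can mis-rank results but can never read another tenant's documents.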

2. Role-Based Access Control (RBAC) at the Embedding Layer

Even within a single organization, data must be siloed. We enforce RBAC directly inside the Elasticsearch query, not as an afterthought.

When a document is ingested into the OMAS platform, it is assigned a min_role (e.g., "member" or "admin"). Our indexing logic expands this into an allowed_roles array. When a user interacts with the system, their active session role is injected directly into the Elasticsearch must filters:

app/infrastructure/vector_store.py
must_filters = [
    {"term": {"type": "project_chunk"}},
    {"term": {"allowed_roles": user_role}},  # RBAC enforced at the database level
]

If a "member" triggers a search, the vector database will return zero hits for "admin" level chunks, regardless of how semantically perfect the match is. The LLM cannot hallucinate data it is mathematically prevented from retrieving.
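The min_role expansion at ingestion time can be sketched as follows. Only "member" and "admin" are named in the article, so the role ordering below is an assumption:

```python
ROLE_ORDER = ["member", "admin"]  # lowest privilege first (assumed hierarchy)

def expand_allowed_roles(min_role: str) -> list[str]:
    """Every role at or above min_role may read the chunk."""
    idx = ROLE_ORDER.index(min_role)
    return ROLE_ORDER[idx:]

def must_filters_for(user_role: str) -> list[dict]:
    # Mirrors the filter from vector_store.py: a user only matches
    # chunks whose allowed_roles array contains their session role.
    return [
        {"term": {"type": "project_chunk"}},
        {"term": {"allowed_roles": user_role}},
    ]
```

A chunk ingested with min_role "member" carries allowed_roles ["member", "admin"], so both roles match it; an "admin"-only chunk carries ["admin"] and is invisible to a "member" session at the term-filter level.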

3. Taming the Chaos: The TaggingService and Business Folksonomy

Unstructured data is chaotic. If an employee uploads a contract and tags it "Client X", another tags it "client-x", and a third tags it "Client_X", standard keyword filtering breaks down.

To solve this, we built an automated ingestion pipeline that brings order to the chaos. When a document is uploaded, our background TaggingService forces the AI to classify it against a rigid taxonomy of High-Level Categories (e.g., Legal & Compliance, Financial Report).
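Constraining the classifier to a rigid taxonomy can be enforced with a simple validation gate on the model's output. The article names only "Legal & Compliance" and "Financial Report"; the other category entries here are placeholders:

```python
HIGH_LEVEL_CATEGORIES = {
    "Legal & Compliance",
    "Financial Report",
    "Operations",    # placeholder entry for illustration
    "HR & People",   # placeholder entry for illustration
}

def validate_category(model_output: str) -> str:
    """Reject any free-form label the model invents; the classifier
    may only answer with a member of the rigid taxonomy."""
    category = model_output.strip()
    if category not in HIGH_LEVEL_CATEGORIES:
        raise ValueError(f"category outside taxonomy: {category!r}")
    return category
```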

Next, the raw user tags are passed through our label_sanitizer.py.

app/utils/label_sanitizer.py
from slugify import slugify
from sqlmodel import Session, select

def get_or_create_label(db: Session, label_name: str) -> Label:
    slug = slugify(label_name)
    # Soft Match check: if slug exists, return the existing label
    statement = select(Label).where(Label.slug == slug)
    existing_label = db.exec(statement).first()

    if existing_label:
        return existing_label

    # No match: persist a new canonical label for this slug
    label = Label(name=label_name.strip(), slug=slug)
    db.add(label)
    db.commit()
    return label
This ensures a clean, deduplicated business folksonomy. The Librarian node can now execute hybrid searches—combining the broad recall of dense vector similarity with the razor-sharp precision of sanitized keyword filters.
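A hedged sketch of what such a hybrid query might look like, following Elasticsearch's kNN-with-filter query shape; the field names (`embedding`, `label_slugs`) are assumptions, not confirmed parts of the OMAS schema:

```python
def build_hybrid_query(query_vector: list[float], user_role: str,
                       tag_slug: str, k: int = 5) -> dict:
    """Dense kNN recall narrowed by RBAC and sanitized-tag term filters."""
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": 10 * k,
            # Filters run inside the kNN search itself, so RBAC and
            # tag precision constrain recall rather than post-filtering
            # an already-retrieved (and possibly leaky) result set.
            "filter": [
                {"term": {"allowed_roles": user_role}},
                {"term": {"label_slugs": tag_slug}},
            ],
        },
    }
```

Filtering inside the kNN clause matters: post-filtering the top-k after retrieval would mean forbidden chunks are fetched and then discarded, while pre-filtered kNN never scores them at all.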

4. The Organization Librarian: Entity-Aware Retrieval

Finally, we don't let the LLM search the database directly. We route the request to a dedicated Librarian Node.

Before executing a search, the Librarian runs a fast Entity Extraction pass. If a user asks, "What are the compliance requirements for AdddZero Ltd?", the Librarian extracts the entity "AdddZero Ltd", queries our PostgreSQL CRM to find the exact client_id, and then injects that UUID into the vector search filter.

It does not search the entire company archive; it surgically retrieves data bound to that specific client’s context.

Infrastructure, Not Wrappers

Business AI requires Business-Grade Architecture. If you are relying on generic vector databases without strict RBAC, automated taxonomies, and entity-aware retrieval, you are building on quicksand.

The Nexus Multi-Agent System isn't just smart—it is secure by design.

Deploy Nexus in Your Business

We are currently onboarding early-adopter partners for the Nexus Multi-Agent System. Stop wrestling with disjointed data pipelines and black-box wrappers.

Request an Architecture Briefing