LegalFab AI & LLM

Version: 1.1 Last Updated: January 2026


Component Overview

The AI & LLM Architecture provides the intelligence layer for the LegalFab platform. Built on LightLLM, the platform is LLM-provider-agnostic, enabling flexible model selection while maintaining consistent security controls. The architecture ensures reproducible, auditable AI operations through schema validation, ontology-based execution, and comprehensive provenance tracking.

Key Security Characteristics:

  • Prompt injection defense
  • Data isolation prevents cross-tenant information leakage
  • Secure API integration with LLM providers
  • Comprehensive content filtering and safety guardrails
  • Privacy-preserving retrieval
  • Human-in-the-loop controls for high-risk operations
  • Output consistency through schema validation and deterministic workflows
  • Complete provenance tracking for audit and reproducibility

LightLLM Router Architecture

LegalFab uses LightLLM as a unified routing layer for all LLM provider access:

┌─────────────────────────────────────────────────────────────────────┐
│                    LIGHTLLM ROUTER                                  │
├─────────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐          │
│  │  Unified    │  │  Load       │  │  Fallback           │          │
│  │  API        │  │  Balancing  │  │  Routing            │          │
│  └─────────────┘  └─────────────┘  └─────────────────────┘          │
│                              │                                      │
│         ┌────────────────────┼────────────────────┐                 │
│         ▼                    ▼                    ▼                 │
│  ┌─────────────┐      ┌─────────────┐      ┌─────────────┐          │
│  │  Provider A │      │  Provider B │      │  Self-      │          │
│  │             │      │             │      │  Hosted     │          │
│  └─────────────┘      └─────────────┘      └─────────────┘          │
└─────────────────────────────────────────────────────────────────────┘

LightLLM Security Benefits

Benefit Description
Unified Security Controls Single point for authentication, logging, and filtering
Provider Abstraction Security controls applied consistently across all providers
Automatic Failover Seamless switching to backup providers
Centralized Credential Management All API keys managed in one secure location
Consistent Audit Logging Uniform logging regardless of backend provider
Cost Controls Budget limits enforced across all providers

LLM Output Consistency

Traditional LLM applications can produce different outputs for identical inputs. LegalFab addresses this through a multi-layered architectural approach that constrains and validates LLM outputs at every step.

Schema-Validated Extraction

Control Implementation
Template Validation All LLM extractions validated against user-defined schemas
Field Type Enforcement Extracted values must match schema-defined data types
Constraint Checking Values validated against defined ranges and formats
Hallucination Prevention System cannot invent fields or values outside schema definitions
Flagging Non-Conformance Extractions that don’t match schema are flagged for review

Ontology-Based Execution

The ontology defines relationships between entities, execution sequences, and dependencies among analytical steps. This ensures consistent reasoning pathways regardless of when or how often an analysis is performed.

Control Implementation
Execution Order Ontology dictates the sequence of data science flows
Dependency Enforcement Steps execute only when dependencies are satisfied
Workflow Consistency Same dataset follows same logical pathways every time
Computational Graph Reasoning chains follow deterministic graph structure

Reasoning Chain Transparency

Each analytical step documents its reasoning for complete traceability:

Captured Element Description
Input Data Source data and documents consulted
Reasoning Process Analytical methods applied
Intermediate Conclusions Stepwise findings with justification
Confidence Scores Certainty levels at each step
Supporting Facts Specific evidence with document references

Deterministic vs. Probabilistic Components

Component Type Behavior Examples
Deterministic Always produces identical results Database queries, schema validation, rule-based logic
Probabilistic (Constrained) LLM operations with controls Temperature minimization, structured output formats, ensemble validation

LLM Variability Controls:

Control Implementation
Temperature Settings Minimized to reduce output variability
Structured Output Format enforced through schema layer
Validation Checks Outputs deviating from expected patterns rejected
Ensemble Approaches Multiple LLM calls must agree for critical decisions

LLM Monitoring and Observability

Phoenix Monitoring Integration

LegalFab integrates with LLM observability tools for comprehensive monitoring:

Metric Description Purpose
Prompt/Response Pairs Complete input/output capture Audit, debugging
Token Usage Tokens consumed per request Cost tracking
Latency Response time metrics Performance monitoring
Embedding Quality Vector similarity and retrieval accuracy RAG optimization
Drift Detection Statistical measures of output distribution changes Model stability

Quality Metrics

Metric Description Tracking
Factual Grounding % Proportion of conclusions with clear evidence attribution Per-model, over time
Hallucination Rate Outputs lacking factual basis Alert threshold
Schema Validation Pass Rate Outputs conforming to expected structures Per-task type
Confidence Score Distribution Statistical profile of model certainty Baseline comparison
Q&A Correctness Verified accuracy against ground truth Benchmark testing

Model Version Management

Model Update Process

When LLM models are updated, organizations need assurance that investigative conclusions remain credible and traceable.

Pre-Deployment Testing:

Stage Activities
Historical Case Testing Run new model on cases with known outcomes
Output Comparison Compare against established conclusions
Metric Analysis Measure changes across golden test set
Regression Detection Identify any degradation in performance

Controlled Rollout:

Phase Description
Canary Deployment Deploy to subset of workloads
Real-Time Monitoring Compare metrics between model versions
Gradual Expansion Increase traffic only after validation
Full Deployment Complete rollout after confidence threshold met

Rollback Capability:

Control Implementation
Version Retention Previous model versions maintained
Instant Rollback One-click reversion if issues detected
Traffic Shifting Gradual migration between versions
State Preservation No data loss during model transitions

LLM Provenance

Complete Invocation Logging

Every LLM invocation captures:

Data Point Description Retention
Prompt Template Exact prompt sent to model 1 year
Model Version Specific model and parameters used 1 year
Retrieved Context Documents/data provided to model 1 year
Complete Response Full model output 1 year
Reasoning Chain Logic from context to conclusion 2 years
Supporting Facts Document references for conclusions 2 years

Provenance Security

Control Implementation
Immutability Provenance records cannot be modified
Tamper Evidence Cryptographic hash chain prevents alteration
Cross-Reference Links between related invocations preserved
Version Comparison Side-by-side analysis when model changes affect conclusions

Provider Security

Provider Assessment Requirements

Assessment Area Requirements
Data Handling No training on customer data
API Security TLS 1.3, secure API key management
Compliance SOC 2, GDPR compliance
Data Residency Regional processing options
Incident Response Breach notification commitment

Provider Isolation

Control Implementation
Credential Separation Each provider has separate, rotated credentials
Request Isolation No cross-provider request correlation
Response Isolation Provider responses not shared across tenants
Failure Isolation Provider failures don’t cascade

Prompt Security

Prompt Injection Defense

┌─────────────────────────────────────────────────────────────────────┐
│                    PROMPT INJECTION DEFENSE                          │
├─────────────────────────────────────────────────────────────────────┤
│  LAYER 1: INPUT PREPROCESSING                                       │
│  • Character filtering    • Length limits    • Encoding validation  │
├─────────────────────────────────────────────────────────────────────┤
│  LAYER 2: INJECTION DETECTION                                       │
│  • ML-based classifier    • Pattern matching    • Semantic analysis │
├─────────────────────────────────────────────────────────────────────┤
│  LAYER 3: PROMPT ARCHITECTURE                                       │
│  • Instruction hierarchy    • Delimiter separation    • Anchoring   │
├─────────────────────────────────────────────────────────────────────┤
│  LAYER 4: OUTPUT VALIDATION                                         │
│  • Format validation    • Action authorization    • Anomaly check   │
└─────────────────────────────────────────────────────────────────────┘

Input Sanitization

Control Implementation
Character Filtering Whitelist allowed characters
Length Limits 10,000 character limit per input
Encoding Validation Reject malformed encoding
Pattern Detection Regex + ML classifier

Instruction Hierarchy

Level Source Authority
System Platform code Highest (cannot be overridden)
Application Component configuration High
Context Retrieved documents Medium (clearly delimited)
User User input Lowest (sandboxed)

Data Protection in AI Pipelines

PII Handling

Pre-Processing (Before LLM):

Stage Control
Input Reception PII detection scan
Context Retrieval Access control enforcement
Prompt Assembly PII redaction or tokenization

Post-Processing (After LLM):

Stage Control
Response Reception PII detection scan
Content Filtering Unexpected PII flagged
Response Delivery Final PII scrubbing

Tenant Data Isolation

Control Implementation
Namespace Isolation Each tenant’s data in separate namespace
Embedding Isolation Tenant-specific vector collections
Query Filtering Automatic tenant filter on all queries
Cache Isolation Per-tenant response caching

Content Safety

Input Content Filtering

Filter Type Detection Method Action
Toxicity ML classifier Block + log
Hate Speech ML classifier + keywords Block + report
Illegal Content Pattern matching + ML Block + report

Output Content Filtering

Filter Type Detection Method Action
Harmful Instructions Pattern matching + ML Redact or block
Personal Information PII detection Redact
Confidential Data Classification check Redact

Hallucination Mitigation

Control Implementation
Grounding Responses grounded in retrieved context
Citation Required citations for factual claims
Confidence Low-confidence responses flagged
Verification Critical facts verified against knowledge base

Vector and Embedding Security

Embedding Security

Control Implementation
Encryption at Rest AES-256-GCM for vector data
Encryption in Transit TLS 1.3 for all connections
Access Control Role-based access, no direct user access
Tenant Isolation Separate collections per tenant

Retrieval Security

Stage Control
Query Reception User authentication verified
Similarity Search Tenant filter applied automatically
Result Filtering ACL check on each result
Context Assembly Only accessible documents included

Audit Logging

AI-Specific Events

Event Logged Data Retention
AI Request User ID, request type, timestamp 1 year
Input Processing Input hash, sanitization actions 1 year
LLM Invocation Provider, model, token count, latency 1 year
Output Processing Filtering actions, content flags 1 year

Security Events

Event Logged Data Retention
Injection Attempt Input pattern, detection method, user ID 2 years
Content Violation Violation type, content hash, action taken 2 years
Authorization Failure Requested action, denial reason 2 years

Responsible AI

Ethical Principles

Principle Implementation
Fairness Bias detection and monitoring
Transparency Clear AI involvement disclosure
Accountability Clear ownership, audit trails
Privacy Data minimization
Human-Centered AI augments human judgment

Explainability

Feature Description
Source Attribution Responses cite sources from knowledge base
Confidence Indication AI indicates confidence level
Reasoning Transparency Key reasoning steps exposed
Decision Logging Automated decisions logged with rationale

Quality Assurance and Feedback

Output Tracking Mechanisms

Model outputs are tracked through three complementary mechanisms:

Mechanism Description Feedback Loop
Ontology Review Users validate auto-generated legal concepts Refinement of extraction patterns
Dialogue Feedback Response quality ratings on conversational outputs Prompt optimization
Search Provenance Trace results back to sources, verify grounding Metadata and entity corrections

Quality Adjustment Mechanisms

When quality falls short, adjustments occur without traditional ML retraining:

Adjustment Type When Used Implementation
Agent Logic Refinement Specific extraction failures Modify confidence thresholds, extraction patterns
Ontology Regeneration Systemic quality issues Ingest additional/updated domain sources
Prompt Engineering Dialogue quality issues Refine prompt templates and examples
Validation Rule Updates Schema violations Adjust validation constraints

Continuous Improvement

Process Description
Feedback Collection Structured capture of user corrections
Root Cause Analysis Identify patterns in quality failures
Targeted Intervention Apply fixes to specific components
Validation Testing Verify improvements on representative samples

Model Lifecycle Management

Model Selection and Configuration

Consideration Options
Provider Selection Cloud APIs, self-hosted, enterprise offerings
Model Size Balance accuracy vs. latency/cost
Specialization General-purpose vs. domain-tuned models
Regional Compliance Data residency requirements

Model-Agnostic Architecture

The Knowledge Fabric and agent infrastructure consume structured metadata outputs regardless of which LLM generated them:

Benefit Description
Provider Flexibility Swap models without workflow changes
Best-of-Breed Different models for different tasks
Cost Optimization Route to appropriate model tier
Compliance Routing Direct data to compliant providers

Model Deprecation

Stage Activities
Deprecation Notice Advance warning before model retirement
Migration Path Clear upgrade path to replacement model
Parallel Running Both models available during transition
Validation Period Extended testing before old model removal