LegalFab Schema Management & Business Domain Discovery
Version: 1.0
Last Updated: January 2026
Component Overview
The Schema Management system provides a unified approach to defining, discovering, and managing business domain schemas across the LegalFab platform. It enables users to describe their business domain through document analysis and creates structured schemas that control data extraction, agent behavior, and integration patterns.
Core Capabilities:
| Capability | Description | Usage Context |
|---|---|---|
| Business Domain Discovery | Extract domain concepts from user documents | Studio onboarding |
| Schema Definition | Create and manage structured data schemas | Platform-wide |
| Document-to-Schema | AI-powered schema extraction from documents | Studio, Knowledge Fabric |
| Schema Versioning | Track schema evolution with full history | Governance |
| Schema Binding | Link schemas to agents, extractors, and MCPs | Execution |
| Schema Validation | Ensure data conformance to defined schemas | Data quality |
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────┐
│ SCHEMA MANAGEMENT ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SCHEMA DISCOVERY LAYER │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │ │
│ │ │ Document │ │ Domain Concept │ │ Schema │ │ │
│ │ │ Analyzer │ │ Extractor │ │ Generator │ │ │
│ │ └──────────────────┘ └──────────────────┘ └─────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SCHEMA REGISTRY │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │ │
│ │ │ Schema Store │ │ Version │ │ Dependency │ │ │
│ │ │ │ │ Control │ │ Manager │ │ │
│ │ └──────────────────┘ └──────────────────┘ └─────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ SCHEMA CONSUMERS │ │
│ │ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌─────────────────┐ │ │
│ │ │ Studio │ │ OSINT │ │ Agent │ │ MCP │ │ │
│ │ │ Agents │ │ Extractors│ │ Logic │ │ Connectors │ │ │
│ │ └───────────┘ └───────────┘ └───────────┘ └─────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ VALIDATION & GOVERNANCE │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │ │
│ │ │ Schema │ │ Compliance │ │ Audit │ │ │
│ │ │ Validator │ │ Checker │ │ Logger │ │ │
│ │ └──────────────────┘ └──────────────────┘ └─────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
Business Domain Discovery
The Business Domain Discovery feature enables users to describe their domain by providing documents, from which the system extracts relevant concepts and generates structured schemas.
Discovery Flow
┌─────────────────────────────────────────────────────────────────────────┐
│ BUSINESS DOMAIN DISCOVERY │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ │
│ │ User Documents │ Policies, procedures, contracts, forms, │
│ │ (Input) │ data dictionaries, specifications │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Document │ Extract text, structure, and metadata: │
│ │ Analysis │ • Document type classification │
│ │ │ • Section identification │
│ │ │ • Table and list extraction │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Concept │ Identify domain concepts: │
│ │ Extraction │ • Entities (Person, Organization, Matter) │
│ │ │ • Attributes (name, date, amount) │
│ │ │ • Relationships (owns, represents, related_to) │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Schema │ Generate structured schemas: │
│ │ Generation │ • Entity definitions │
│ │ │ • Attribute types and constraints │
│ │ │ • Relationship mappings │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ User Review │ Interactive refinement: │
│ │ & Refinement │ • Approve/reject concepts │
│ │ │ • Modify attributes and types │
│ │ │ • Add missing elements │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
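The last two stages of the flow can be illustrated with a small data model: extraction proposes concepts, the user approves or rejects each one, and only approved concepts reach the draft schema. This is a minimal sketch under stated assumptions; `Concept`, `SECTION`, and `draft_schema` are hypothetical names, not part of the LegalFab API.

```python
from dataclasses import dataclass


# Hypothetical shape for one concept proposed by the extraction step.
@dataclass
class Concept:
    kind: str               # "entity", "attribute", or "relationship"
    name: str
    approved: bool = False  # set during user review


# Maps a concept kind to the section of the draft schema it belongs to.
SECTION = {"entity": "entities", "attribute": "attributes", "relationship": "relationships"}


def draft_schema(concepts: list[Concept]) -> dict[str, list[str]]:
    """Keep only the concepts the user approved, grouped by kind."""
    schema: dict[str, list[str]] = {section: [] for section in SECTION.values()}
    for concept in concepts:
        if concept.approved:
            schema[SECTION[concept.kind]].append(concept.name)
    return schema


proposed = [
    Concept("entity", "Matter", approved=True),
    Concept("attribute", "due_date", approved=True),
    Concept("relationship", "represents", approved=True),
    Concept("entity", "Footer", approved=False),  # rejected in review
]
```

Calling `draft_schema(proposed)` drops the rejected `Footer` concept and groups the remaining three by kind.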
Supported Document Types
| Document Type | Extraction Focus | Example Concepts |
|---|---|---|
| Policies & Procedures | Workflows, rules, roles | Process steps, approval levels, responsibilities |
| Contracts & Agreements | Parties, terms, obligations | Party types, clause types, obligation categories |
| Data Dictionaries | Field definitions, types | Attribute names, data types, validation rules |
| Forms & Templates | Input fields, validations | Field names, required vs optional, formats |
| Organizational Charts | Hierarchy, departments | Role types, reporting relationships |
| Regulatory Documents | Requirements, controls | Compliance entities, control types |
The system uses AI-powered extraction to identify domain concepts from documents.
Extraction Capabilities:
| Capability | Description |
|---|---|
| Entity Recognition | Identify business entities (Client, Matter, Document, etc.) |
| Attribute Discovery | Extract attributes and their data types |
| Relationship Mapping | Identify relationships between entities |
| Constraint Detection | Discover validation rules and constraints |
| Enumeration Extraction | Identify fixed value sets (status codes, categories) |
| Hierarchy Detection | Recognize parent-child and classification structures |
Extraction Security:
| Control | Implementation |
|---|---|
| Document Isolation | Each user’s documents are processed in isolation |
| PII Detection | Sensitive data identified and flagged |
| Content Filtering | Inappropriate content filtered before processing |
| Audit Trail | All extraction activities logged |
Schema Definition
Schemas provide structured definitions for business domain concepts used across the platform.
Schema Structure
| Component | Description | Purpose |
|---|---|---|
| Schema Metadata | Name, version, description, owner | Identification and governance |
| Entities | Business object definitions | Core domain concepts |
| Attributes | Entity properties with types | Data structure |
| Relationships | Connections between entities | Domain model |
| Constraints | Validation rules and restrictions | Data quality |
| Enumerations | Fixed value sets | Controlled vocabularies |
Entity Definition
| Element | Description |
|---|---|
| Entity ID | Unique identifier |
| Name | Human-readable name |
| Description | Purpose and usage |
| Attributes | List of entity properties |
| Primary Key | Unique identifier attribute(s) |
| Inheritance | Parent entity (if applicable) |
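As an illustration, the entity elements listed above could be modeled as plain data classes. This is a minimal sketch rather than the platform's actual data model; the class names, field names, and the `check` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    name: str
    type: str               # e.g. "String", "Date", "Reference"
    required: bool = False


@dataclass
class Entity:
    entity_id: str
    name: str
    description: str
    attributes: list[Attribute] = field(default_factory=list)
    primary_key: tuple[str, ...] = ()
    parent: Optional[str] = None  # entity_id of the parent entity, if any

    def check(self) -> list[str]:
        """Return structural problems; an empty list means the entity is well formed."""
        names = [attribute.name for attribute in self.attributes]
        problems = []
        if len(names) != len(set(names)):
            problems.append("duplicate attribute names")
        for key in self.primary_key:
            if key not in names:
                problems.append(f"primary key '{key}' is not an attribute")
        return problems


client = Entity(
    entity_id="ent-client",
    name="Client",
    description="A person or organization the firm represents",
    attributes=[Attribute("client_id", "String", True), Attribute("name", "String", True)],
    primary_key=("client_id",),
)
```

Here `client.check()` returns an empty list, while an entity whose primary key names a missing attribute would get a problem entry.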
Attribute Types
| Type | Description | Examples |
|---|---|---|
| String | Text values | Name, description, notes |
| Number | Numeric values | Amount, quantity, score |
| Boolean | True/false values | Active, verified, approved |
| Date | Date values | Created date, due date |
| DateTime | Date and time values | Timestamp, scheduled time |
| Enum | Fixed value set | Status, category, type |
| Reference | Link to another entity | Client ID, matter reference |
| Array | List of values | Tags, categories |
| Object | Nested structure | Address, contact details |
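One way to read the type table is as a mapping from schema type names to runtime checks. The sketch below assumes values arrive as native Python objects; `matches_type` is a hypothetical helper. Enum and Reference are left out because checking them needs the enumeration values or the referenced entity store, which this example does not model.

```python
from datetime import date, datetime


def matches_type(value: object, type_name: str) -> bool:
    """Check a Python value against one of the schema type names in the table."""
    if type_name == "String":
        return isinstance(value, str)
    if type_name == "Number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "Boolean":
        return isinstance(value, bool)
    if type_name == "Date":
        # datetime is a subclass of date, so a timestamp is not a plain Date.
        return isinstance(value, date) and not isinstance(value, datetime)
    if type_name == "DateTime":
        return isinstance(value, datetime)
    if type_name == "Array":
        return isinstance(value, list)
    if type_name == "Object":
        return isinstance(value, dict)
    raise ValueError(f"type '{type_name}' is not handled in this sketch")
```

The two subclass exclusions are the point of the example: without them, `True` would pass as a Number and a timestamp would pass as a Date.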
Constraint Types
| Constraint | Description | Example |
|---|---|---|
| Required | Field must have a value | Client name is required |
| Unique | Value must be unique | Email must be unique |
| Format | Value must match pattern | Phone format validation |
| Range | Numeric bounds | Amount between 0 and 1M |
| Length | String length limits | Description max 500 chars |
| Enumeration | Value from fixed set | Status in [Active, Closed] |
| Reference | Valid entity reference | Client must exist |
| Custom | User-defined validation | Business rule validation |
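The constraints that depend only on the value itself (Required, Format, Range, Length, Enumeration) can be checked with a small function. This is a sketch under assumptions: the rule keys (`required`, `format`, `range`, `max_length`, `enum`) are hypothetical, and Unique, Reference, and Custom are omitted because they need access to stored data or user code.

```python
import re


def check_constraints(value, rules: dict) -> list[str]:
    """Return the names of the constraints that `value` violates."""
    if value is None:
        # A missing value can only violate the Required constraint.
        return ["required"] if rules.get("required") else []
    errors = []
    if "format" in rules and not re.fullmatch(rules["format"], str(value)):
        errors.append("format")
    if "range" in rules:
        low, high = rules["range"]
        if not (low <= value <= high):
            errors.append("range")
    if "max_length" in rules and len(value) > rules["max_length"]:
        errors.append("length")
    if "enum" in rules and value not in rules["enum"]:
        errors.append("enumeration")
    return errors
```

For example, `check_constraints(2_000_000, {"range": (0, 1_000_000)})` reports a range violation, matching the "Amount between 0 and 1M" row.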
Schema Registry
The Schema Registry provides centralized storage, versioning, and access control for all schemas.
Registry Capabilities
| Capability | Description |
|---|---|
| Schema Storage | Persistent storage of schema definitions |
| Version Control | Full version history with diff tracking |
| Dependency Tracking | Track schema dependencies and consumers |
| Access Control | Role-based schema access |
| Search & Discovery | Find schemas by name, attribute, or content |
| Export/Import | Schema portability across environments |
Version Management
| Operation | Description | Use Case |
|---|---|---|
| Create | Initial schema version | New domain concept |
| Update | Create new version with changes | Schema evolution |
| Deprecate | Mark version as deprecated | Migration preparation |
| Archive | Remove from active use | End of life |
| Rollback | Restore previous version | Error recovery |
Version Compatibility:
| Compatibility | Description | Impact |
|---|---|---|
| Backward Compatible | New version accepts old data | Safe upgrade |
| Forward Compatible | Old version accepts new data | Flexible consumers |
| Breaking Change | Incompatible changes | Migration required |
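The compatibility categories can be made concrete by asking whether every record valid under one version is also valid under the other. The sketch below is one possible reading, not the registry's actual algorithm: it assumes a schema is a flat map of attribute name to `(type, required)` and that unknown attributes are rejected. `accepts` and `classify` are hypothetical names.

```python
Schema = dict[str, tuple[str, bool]]  # attribute name -> (type, required)


def accepts(reader: Schema, writer: Schema) -> bool:
    """True if every record valid under `writer` is also valid under `reader`."""
    for name, (type_name, _) in writer.items():
        # The reader must know every attribute the writer may emit, with the same type.
        if name not in reader or reader[name][0] != type_name:
            return False
    for name, (_, required) in reader.items():
        # Anything the reader requires must be guaranteed by the writer.
        if required and not writer.get(name, (None, False))[1]:
            return False
    return True


def classify(old: Schema, new: Schema) -> str:
    backward = accepts(new, old)  # new version accepts old data
    forward = accepts(old, new)   # old version accepts new data
    if backward and forward:
        return "backward and forward compatible"
    if backward:
        return "backward compatible"
    if forward:
        return "forward compatible"
    return "breaking change"


v1 = {"name": ("String", True)}
v2 = {"name": ("String", True), "email": ("String", False)}  # adds an optional attribute
v3 = {"name": ("String", True), "email": ("String", True)}   # adds a required attribute
```

Under these assumptions, adding an optional attribute (`v1` to `v2`) is backward compatible, while adding a required one (`v1` to `v3`) is a breaking change because old data cannot satisfy it.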
Schema Dependencies
| Dependency Type | Description |
|---|---|
| Entity Reference | Schema references entity from another schema |
| Inheritance | Schema extends another schema |
| Composition | Schema includes elements from another schema |
| Consumer | Agent, extractor, or MCP using the schema |
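Dependency tracking amounts to a graph walk: given a changed schema, find everything that depends on it directly or transitively. The following is a minimal sketch using breadth-first search; the `dependents` map and all schema and consumer names in it are invented for illustration.

```python
from collections import deque


def affected_by(changed: str, dependents: dict[str, list[str]]) -> list[str]:
    """Return every schema or consumer reachable from `changed`, nearest first."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


# Hypothetical graph: key -> things that reference, extend, include, or consume it.
dependents = {
    "party": ["client", "contract"],
    "client": ["intake-agent"],
    "contract": ["intake-agent", "contract-extractor"],
}
```

A change to `party` reaches both derived schemas and, through them, the agent and the extractor; the `seen` set keeps `intake-agent` from being reported twice.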
Schema Usage
Studio Integration
Schemas control data processing in Studio agents and pipelines.
Agent Schema Binding:
| Binding Type | Description |
|---|---|
| Input Schema | Defines expected input structure |
| Output Schema | Defines produced output structure |
| Internal Schema | Defines intermediate data structures |
| Validation Schema | Defines validation rules for agent data |
Pipeline Schema Flow:
┌─────────────────────────────────────────────────────────────────────────┐
│ SCHEMA-DRIVEN PIPELINE │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Input │ │ Agent A │ │ Agent B │ │ Output │ │
│ │ Data │────▶│ │────▶│ │────▶│ Data │ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Input │ │ Schema A │ │ Schema B │ │ Output │ │
│ │ Schema │ │ (I/O) │ │ (I/O) │ │ Schema │ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ Schema validation occurs at each step boundary │
└─────────────────────────────────────────────────────────────────────────┘
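Validation at each step boundary can be sketched as a loop that checks a record against a step's input schema before the agent runs and against its output schema afterwards. In this simplified example a "schema" is just a set of required field names, and the two agents are stand-in lambdas; none of these names come from the platform.

```python
from typing import Callable

# (step name, required input fields, agent function, required output fields)
Step = tuple[str, set[str], Callable[[dict], dict], set[str]]


def run_pipeline(record: dict, steps: list[Step]) -> dict:
    """Run each agent in order, validating its input and output at the boundary."""
    for name, needs, agent, produces in steps:
        missing = needs - set(record)
        if missing:
            raise ValueError(f"{name}: input is missing {sorted(missing)}")
        record = agent(record)
        missing = produces - set(record)
        if missing:
            raise ValueError(f"{name}: output is missing {sorted(missing)}")
    return record


steps: list[Step] = [
    ("Agent A", {"text"}, lambda r: {**r, "parties": r["text"].split(" v. ")}, {"parties"}),
    ("Agent B", {"parties"}, lambda r: {"party_count": len(r["parties"])}, {"party_count"}),
]
```

Because the check runs before and after every agent, a malformed record is stopped at the first boundary it fails rather than surfacing several steps later.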
OSINT extractors pull data from external sources and use schemas to structure what they extract.
Extractor Schema Binding:
| Component | Description |
|---|---|
| Source Mapping | Map external data fields to schema attributes |
| Transformation Rules | Convert external formats to schema types |
| Validation Rules | Validate extracted data against schema constraints |
| Default Values | Apply defaults for missing fields |
OSINT Schema Flow:
| Stage | Description |
|---|---|
| Raw Data | Unstructured data from external source |
| Field Mapping | Map external fields to schema attributes |
| Type Conversion | Convert to schema-defined types |
| Validation | Validate against schema constraints |
| Enrichment | Add computed or derived attributes |
| Output | Schema-conformant data for Knowledge Graph |
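The stages above can be illustrated end to end for a single record. This sketch assumes a hypothetical company-registry source; the field names in `FIELD_MAP`, the converters, the default `status`, and the derived `incorporation_year` are all invented for the example.

```python
from datetime import date

# Hypothetical binding for one external source.
FIELD_MAP = {"company_name": "name", "inc_date": "incorporated_on", "employees": "employee_count"}
CONVERTERS = {"name": str.strip, "incorporated_on": date.fromisoformat, "employee_count": int}
DEFAULTS = {"status": "Unverified"}


def to_schema_record(raw: dict) -> dict:
    # Field mapping: keep only known source fields, renamed to schema attributes.
    mapped = {FIELD_MAP[key]: value for key, value in raw.items() if key in FIELD_MAP}
    # Type conversion: turn source strings into schema-defined types.
    typed = {key: CONVERTERS[key](value) for key, value in mapped.items()}
    # Default values: fill attributes the source did not provide.
    record = {**DEFAULTS, **typed}
    # Validation: enforce a Required constraint on the name.
    if not record.get("name"):
        raise ValueError("name is required")
    # Enrichment: add a derived attribute.
    if "incorporated_on" in record:
        record["incorporation_year"] = record["incorporated_on"].year
    return record


raw = {"company_name": " Acme Ltd ", "inc_date": "2019-04-01", "employees": "42", "junk": 1}
```

The unmapped `junk` field is dropped at the mapping stage, so only schema attributes reach the output.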
MCP Connector Integration
MCP connectors use schemas to ensure data consistency across tool integrations.
MCP Schema Capabilities:
| Capability | Description |
|---|---|
| Schema Provision | Provide schema to MCP tool for data extraction |
| Response Mapping | Map tool response to schema structure |
| Request Validation | Validate tool inputs against schema |
| Contract Enforcement | Ensure tool outputs match expected schema |
MCP Schema Binding:
┌─────────────────────────────────────────────────────────────────────────┐
│ MCP SCHEMA INTEGRATION │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Agent Request │ │ External Tool │ │
│ │ (Schema-bound) │────────────────────▶│ (MCP Server) │ │
│ └─────────────────┘ └────────┬────────┘ │
│ │ │ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Request │ │ Response │ │
│ │ Schema │ │ Data │ │
│ │ Validation │ │ │ │
│ └─────────────────┘ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Response │ │
│ │ Schema │ │
│ │ Mapping │ │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Schema- │ │
│ │ Conformant │ │
│ │ Output │ │
│ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
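The request and response sides of the diagram can be sketched as a wrapper around a tool call. The example does not use any MCP SDK: the tool is a plain Python callable standing in for an MCP server, and `call_tool`, `fake_registry_lookup`, and the field names are hypothetical.

```python
from typing import Callable


def call_tool(tool: Callable[[dict], dict], request: dict,
              request_fields: set[str], response_map: dict[str, str]) -> dict:
    """Validate the request, call the tool, then map its response onto the schema."""
    # Request validation: every field the request schema requires must be present.
    missing = request_fields - set(request)
    if missing:
        raise ValueError(f"request is missing {sorted(missing)}")
    response = tool(request)
    # Contract enforcement: the tool must return every field the mapping expects.
    absent = set(response_map) - set(response)
    if absent:
        raise ValueError(f"tool response is missing {sorted(absent)}")
    # Response mapping: rename tool fields to schema attributes, dropping the rest.
    return {target: response[source] for source, target in response_map.items()}


def fake_registry_lookup(request: dict) -> dict:
    """Stand-in for an external tool behind an MCP server."""
    return {"regNo": "12345678", "legalName": request["company"].upper(), "extra": "ignored"}
```

A call such as `call_tool(fake_registry_lookup, {"company": "acme ltd"}, {"company"}, {"regNo": "registration_number", "legalName": "name"})` returns only the two mapped attributes.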
Schema Validation
All data processed through the platform is validated against its bound schema.
Validation Modes
| Mode | Description | Use Case |
|---|---|---|
| Strict | All constraints enforced, reject on failure | Production data |
| Lenient | Log warnings, accept partial compliance | Data migration |
| Report | Generate validation report without rejection | Data quality assessment |
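The three modes differ in what happens once issues are found, not in how issues are detected. The sketch below shows that split using a single Required check as the only constraint; `Mode` and `validate` are hypothetical names, and the exact behavior of each mode in the platform may differ.

```python
import logging
from enum import Enum


class Mode(Enum):
    STRICT = "strict"
    LENIENT = "lenient"
    REPORT = "report"


def validate(record: dict, required: list[str], mode: Mode) -> tuple[bool, list[str]]:
    """Return (accepted, issues); only strict mode rejects a record with issues."""
    issues = [f"missing required field '{name}'" for name in required if record.get(name) is None]
    if mode is Mode.LENIENT:
        # Lenient mode logs each issue as a warning and keeps the record.
        for issue in issues:
            logging.warning(issue)
    accepted = not issues or mode is not Mode.STRICT
    return accepted, issues
```

In report mode the caller receives the same issue list as in strict mode, but the record is never rejected, which suits a data quality assessment.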
Validation Results
| Result | Description | Action |
|---|---|---|
| Valid | Data conforms to schema | Continue processing |
| Warning | Minor issues, data acceptable | Log and continue |
| Error | Schema violation, data rejected | Reject with details |
| Critical | Security or integrity concern | Reject and alert |
Validation Security
| Control | Implementation |
|---|---|
| Injection Prevention | Schema definitions sanitized before use |
| Type Safety | Strong typing prevents type confusion attacks |
| Constraint Enforcement | All constraints validated server-side |
| Audit Logging | All validation results logged |
Security Controls
Schema Access Control
| Permission | Description | Roles |
|---|---|---|
| schema.read | View schema definitions | All authenticated users |
| schema.create | Create new schemas | Schema designers, admins |
| schema.update | Modify existing schemas | Schema owners, admins |
| schema.delete | Remove schemas | Admins only |
| schema.publish | Publish schema versions | Schema owners, admins |
| schema.bind | Bind schemas to consumers | Agent developers, admins |
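The permission table mixes two kinds of grant: role-based grants and ownership of a particular schema. A minimal sketch of a check that handles both follows; the role identifiers and the `is_allowed` signature are assumptions, and only the permission names come from the table.

```python
from typing import Optional

ALL_PERMISSIONS = {
    "schema.read", "schema.create", "schema.update",
    "schema.delete", "schema.publish", "schema.bind",
}

# Role-based grants from the table (role identifiers are hypothetical).
ROLE_PERMISSIONS = {
    "authenticated": {"schema.read"},
    "schema_designer": {"schema.read", "schema.create"},
    "agent_developer": {"schema.read", "schema.bind"},
    "admin": ALL_PERMISSIONS,
}

# Grants that follow from owning the specific schema being acted on.
OWNER_PERMISSIONS = {"schema.update", "schema.publish"}


def is_allowed(roles: set[str], permission: str,
               user: Optional[str] = None, owner: Optional[str] = None) -> bool:
    """Allow if any role grants the permission, or the user owns the schema."""
    if any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles):
        return True
    return permission in OWNER_PERMISSIONS and user is not None and user == owner
```

Ownership is checked per schema rather than as a global role, so a designer can update their own schemas without gaining update rights over anyone else's.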
Schema Governance
| Control | Implementation |
|---|---|
| Ownership | Each schema has a designated owner |
| Approval Workflow | Schema changes require approval |
| Impact Analysis | Changes analyzed for downstream impact |
| Change Notification | Consumers notified of schema changes |
| Deprecation Policy | Minimum notice period before removal |
Schema Security
| Control | Implementation |
|---|---|
| Integrity | Schema definitions cryptographically signed |
| Immutability | Published versions cannot be modified |
| Isolation | Tenant schemas isolated from each other |
| Encryption | Schemas encrypted at rest |
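The integrity and immutability controls can be sketched together. The document does not say which signature scheme is used, so this example substitutes an HMAC-SHA256 tag from the Python standard library for a real digital signature; a production system would more likely use asymmetric keys. `PublishedSchemas` and its methods are hypothetical.

```python
import copy
import hashlib
import hmac
import json


def sign(schema: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of the schema."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify(schema: dict, key: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(schema, key), signature)


class PublishedSchemas:
    """In-memory store where a (name, version) pair can be written only once."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._versions: dict[tuple[str, str], tuple[dict, str]] = {}

    def publish(self, name: str, version: str, schema: dict) -> None:
        if (name, version) in self._versions:
            raise ValueError(f"{name} {version} is already published and cannot be modified")
        self._versions[(name, version)] = (copy.deepcopy(schema), sign(schema, self._key))

    def load(self, name: str, version: str) -> dict:
        schema, signature = self._versions[(name, version)]
        if not verify(schema, self._key, signature):
            raise ValueError("integrity check failed")
        return copy.deepcopy(schema)
```

Sorting keys before hashing makes the tag independent of dictionary ordering, and the deep copies keep callers from mutating a stored version through a shared reference.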
Audit Logging
Logged Events
| Event Category | Events | Retention |
|---|---|---|
| Schema Lifecycle | Create, update, delete, publish, deprecate | 2 years |
| Schema Access | Read, export, clone | 1 year |
| Validation Events | Validation success, failure, warnings | 1 year |
| Binding Events | Bind, unbind, consumer registration | 2 years |
| Discovery Events | Document analysis, concept extraction | 1 year |
Audit Trail Security
| Control | Implementation |
|---|---|
| Integrity | Tamper-evident logging |
| Completeness | All schema operations logged |
| Confidentiality | Logs encrypted at rest |
| Access Control | Auditor role required for log access |
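One common way to make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The document does not state which mechanism LegalFab uses, so the following is an illustrative sketch only; `append_entry`, `verify_chain`, and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, {"action": "schema.publish", "schema": "client", "version": "1.0"})
append_entry(audit_log, {"action": "schema.bind", "schema": "client", "consumer": "intake-agent"})
```

Editing any earlier event changes its hash, which no longer matches the `prev` value stored in the next entry, so `verify_chain` reports the tampering.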