PATHLIGHTER — A RedStar Foundry Platform

// The Problem This Solves

Controls Testing Is Broken

Every GSIB runs the same playbook. Massive control inventories. Tons of manual sample analysis and data work. Human reviewers drowning in evidence. And at the end, findings that could have been caught in minutes.

✕ 01

Manual Everything

Control testing at major banks runs on spreadsheets, email chains, and copy-paste workflows. A single test of design takes days. Operating effectiveness testing takes weeks. Repeat for hundreds of controls across dozens of domains.

✕ 02

Inconsistent Quality

Different reviewers apply different standards. Workpaper quality varies wildly. Senior reviewers spend more time fixing documentation than analyzing controls. Institutional knowledge walks out the door with every departure.

✕ 03

Reactive, Not Continuous

Controls are tested on a cycle — quarterly, annually. Between tests, exceptions go undetected. By the time findings surface, the damage is done. Continuous monitoring exists in theory but rarely in practice.

// The Platform

Inside PathLighter

Real screens from the production platform. No mockups. No Figma. Working software.

Issue Management

AI-generated issue identification with root cause quality scoring, remediation tracking, validation evidence assessment, and theme analysis.

Monitoring Domains

10 surveillance domains, from Wire Transfers through Trade Surveillance, each with a health score gauge, rule count, exception tally, and pass rate.

Control Testing

AI-powered evaluation with composite scoring gauge, seven testability dimension bars, and structured analysis with strengths and gaps.

Exception Queue

Full exception population browser with severity-coded rows, OFAC matches, structuring patterns, and review status tracking.

// The Platform

Four Modules

Four integrated modules: Monitoring for continuous surveillance; Control Testing for design analysis, operating effectiveness, test of one, and PRC inventory management; Issues for AI-driven root cause analysis and remediation; and Reports for regulatory-ready output.

Monitoring

Continuous Surveillance

Full-population and sample-based monitoring with AI-powered exception triage. Populations ingest automatically, rules fire against live data, and exceptions route to reviewers with AI-generated severity scores.

Exception queue with severity scoring and auto-triage
Health score gauges with configurable threshold alerting
Sample-based deep-dive reviews with evidence collection
Results aggregation and domain-level dashboards
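
As a rough illustration of how rules firing against a population could feed a health gauge with threshold alerting, here is a minimal Python sketch. The `Rule` class, the `WIRE-LIMIT` rule, the 10,000 limit, and the 95% alert threshold are all hypothetical stand-ins, not PathLighter's actual engine or configuration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named check; `passes` returns True when a record satisfies the rule."""
    rule_id: str
    passes: Callable[[dict], bool]


def run_rules(population: list[dict], rules: list[Rule]) -> list[tuple[str, dict]]:
    """Return one (rule_id, record) pair for every record that fails a rule."""
    return [(rule.rule_id, rec) for rec in population for rule in rules
            if not rule.passes(rec)]


def health_score(population_size: int, rule_count: int, exception_count: int) -> float:
    """Pass rate as a percentage of all rule evaluations performed."""
    checks = population_size * rule_count
    if checks == 0:
        return 100.0  # nothing evaluated, so nothing failed
    return 100.0 * (checks - exception_count) / checks


def needs_alert(score: float, threshold: float = 95.0) -> bool:
    """True when the health score drops below the configured threshold."""
    return score < threshold


# Hypothetical wire-transfer population and rule, for illustration only.
wires = [{"id": 1, "amount": 2_500}, {"id": 2, "amount": 9_900},
         {"id": 3, "amount": 48_000}, {"id": 4, "amount": 120}]
rules = [Rule("WIRE-LIMIT", lambda rec: rec["amount"] <= 10_000)]

exceptions = run_rules(wires, rules)
score = health_score(len(wires), len(rules), len(exceptions))
print(exceptions, score, needs_alert(score))
```

In this sketch the exception list doubles as the input to the reviewer queue, and the health score is just the pass rate; a production gauge would likely weight rules by severity.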

Control Testing

AI-Assisted Evaluation

Automated control evaluation across the full testing lifecycle. The LLM evaluation engine scores controls across seven testability dimensions, generates Test of Design analyses, and produces structured workpapers.

Control description analysis with gap identification
Interactive walkthrough assistant with context gathering
Test of Design with seven-dimension scoring
Operating effectiveness testing with evidence assessment

Issue Intelligence

Root Cause & Remediation

AI-generated issue identification from test results, with root cause analysis quality scoring, remediation plan tracking, and validation evidence assessment. Full lifecycle management from discovery to closure.

Auto-generated issues from control testing results
Root cause analysis with AI quality scoring
Remediation plan generation and tracking
Theme analysis across issue populations

Reports & Analytics

Executive Reporting

Control health dashboards, issue aging analysis, and automated report generation. Export to PDF, Excel, and Word. Regulatory-ready output formatted for executive and committee presentations.

Control health scorecards with trend analysis
Issue aging and remediation velocity metrics
Automated regulatory report generation
Export to PDF, Excel, and Word formats

// What Powers It

The AI Engine

PathLighter doesn't use AI as a wrapper. The LLM evaluation engine is built on a normative inventory of 950+ controls organized across 10 novel archetype classifications, each with domain-calibrated rubrics that took hundreds of hours to tune. Calibrating an LLM to evaluate compliance language with precision is one of the hardest problems in AI governance, and it's the core of everything here.

LLM-as-a-Judge Framework

Controls are evaluated across seven testability dimensions — specificity, measurability, frequency, evidence quality, ownership clarity, exception handling, and automation level. Scores aggregate into a composite testability rating with weighted importance.
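
To make the weighted aggregation concrete, here is a minimal Python sketch of a composite score as a weighted mean. The seven dimension names come from the description above; the weights and the 0-100 scale are illustrative assumptions, not PathLighter's calibrated values.

```python
from __future__ import annotations

# Dimension names are from the text; the weights are placeholders only.
DIMENSION_WEIGHTS = {
    "specificity": 0.20,
    "measurability": 0.20,
    "frequency": 0.10,
    "evidence_quality": 0.20,
    "ownership_clarity": 0.10,
    "exception_handling": 0.10,
    "automation_level": 0.10,
}


def composite_testability(scores: dict[str, float],
                          weights: dict[str, float] = DIMENSION_WEIGHTS) -> float:
    """Weighted mean of per-dimension scores (each assumed to be on a 0-100 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[dim] * w for dim, w in weights.items()) / total_weight


# Hypothetical per-dimension scores for one control.
example_scores = {
    "specificity": 90, "measurability": 70, "frequency": 100,
    "evidence_quality": 60, "ownership_clarity": 80,
    "exception_handling": 40, "automation_level": 20,
}
print(round(composite_testability(example_scores), 1))
```

Dividing by the total weight keeps the result on the same scale as the inputs even if the weights are later retuned and no longer sum to one.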

Domain Calibration

The hardest part of the entire system. The evaluation engine is calibrated against financial compliance language, regulatory frameworks, and 10 distinct control archetype patterns — each requiring different rubrics, scoring weights, and evidence expectations. Getting an LLM to reliably distinguish an APPR control from a RECON control, and evaluate each against the right standard, required extensive iteration and domain expertise that can't be shortcut.

Bias Quantification

Systematic bias detection across LLM outputs: position bias (a preference for whichever option is listed first), verbosity inflation (longer responses scoring higher regardless of quality), and self-enhancement (the AI favoring its own prior outputs). Each bias vector is measured and mitigated.
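
One common way to probe position bias in a pairwise judge is to show each pair in both orders and count how often the first-listed option wins both times. The sketch below assumes a hypothetical `judge(first, second)` callable that returns 0 when the first argument wins and 1 when the second does; it is not PathLighter's measurement code, and the two toy judges stand in for real LLM calls.

```python
from __future__ import annotations

from typing import Callable, Sequence

# Hypothetical judge signature: 0 means the first argument wins, 1 the second.
Judge = Callable[[str, str], int]


def first_position_preference(judge: Judge, pairs: Sequence[tuple[str, str]]) -> float:
    """Fraction of pairs where the first-listed option wins in both orderings.

    A position-neutral judge picks the same underlying answer regardless of
    order, so it can never pick "first" in both the forward and swapped runs.
    """
    if not pairs:
        return 0.0
    biased = 0
    for a, b in pairs:
        forward = judge(a, b)   # a is listed first
        swapped = judge(b, a)   # b is listed first
        if forward == 0 and swapped == 0:
            biased += 1
    return biased / len(pairs)


def always_first(first: str, second: str) -> int:
    """Toy judge that is maximally position-biased."""
    return 0


def prefers_longer(first: str, second: str) -> int:
    """Toy judge that ignores position (but shows verbosity bias instead)."""
    return 0 if len(first) >= len(second) else 1


pairs = [("short", "a much longer answer"), ("detailed response", "ok")]
print(first_position_preference(always_first, pairs))
print(first_position_preference(prefers_longer, pairs))
```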

Workpaper Generation

Structured workpapers generated automatically from test results. Formatted with evidence citations, finding classifications, and recommendation language, all consistent with PCAOB and other regulatory standards.

Context-Aware Chat

Interactive walkthrough assistant that gathers context about control environments through structured conversation. Identifies gaps in control descriptions, asks targeted questions to fill them, and tracks which testability dimensions have been addressed.
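
Tracking which testability dimensions a walkthrough has covered can be pictured as simple state over the seven dimensions. The `WalkthroughState` class and its method names below are hypothetical, and how the assistant decides that an answer addresses a dimension is outside the scope of this sketch.

```python
from __future__ import annotations

from typing import Optional

# Dimension names as listed for the LLM-as-a-Judge framework.
DIMENSIONS = (
    "specificity", "measurability", "frequency", "evidence_quality",
    "ownership_clarity", "exception_handling", "automation_level",
)


class WalkthroughState:
    """Hypothetical tracker for which dimensions a conversation has covered."""

    def __init__(self) -> None:
        self.addressed: dict[str, str] = {}

    def record(self, dimension: str, answer: str) -> None:
        """Store a non-empty answer against a known dimension."""
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        if answer.strip():
            self.addressed[dimension] = answer.strip()

    def open_dimensions(self) -> list[str]:
        """Dimensions with no recorded answer yet, in canonical order."""
        return [d for d in DIMENSIONS if d not in self.addressed]

    def next_dimension(self) -> Optional[str]:
        """The next gap to ask about, or None when everything is covered."""
        remaining = self.open_dimensions()
        return remaining[0] if remaining else None


state = WalkthroughState()
state.record("frequency", "Performed monthly by the reconciliation team.")
print(state.next_dimension(), len(state.open_dimensions()))
```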

Exception Triage

AI-powered severity scoring for monitoring exceptions. Each exception is evaluated against historical patterns, control context, and risk impact — then routed to the appropriate reviewer with a pre-generated analysis brief.
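
The scoring-then-routing flow can be sketched as follows. In the platform the severity score is AI-generated; here a fixed weighted formula stands in for it, and the signal fields, weights, score cut-offs, and queue names are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionSignal:
    """Hypothetical inputs, each normalized to the range 0-1."""
    historical_anomaly: float   # deviation from historical patterns
    control_criticality: float  # importance of the control in context
    risk_impact: float          # estimated impact if the exception is real


def severity_score(signal: ExceptionSignal,
                   weights: tuple[float, float, float] = (0.3, 0.3, 0.4)) -> float:
    """Weighted blend of the three signals, scaled to 0-100."""
    parts = (signal.historical_anomaly, signal.control_criticality, signal.risk_impact)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("signals must be between 0 and 1")
    blended = sum(p * w for p, w in zip(parts, weights)) / sum(weights)
    return round(100 * blended, 1)


def route(score: float) -> str:
    """Map a severity score to a (hypothetical) reviewer queue."""
    if score >= 75:
        return "senior_reviewer"
    if score >= 40:
        return "analyst"
    return "low_priority_queue"


signal = ExceptionSignal(historical_anomaly=0.9, control_criticality=0.8, risk_impact=0.7)
print(severity_score(signal), route(severity_score(signal)))
```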

// Technical Capabilities

Deep Dive

The architecture under PathLighter — purpose-built systems for compliance automation, continuous monitoring, and LLM governance at institutional scale. We've loaded 10 million+ transactions to demonstrate real-time performance at production volumes.

// 01

Continuous Monitoring

Full-population surveillance engines with rule-based and AI-powered anomaly detection — exception queues, health score gauges, threshold alerting, and sample-based reviews at scale.

In Practice

950+ controls across 10 regulatory archetypes
Exception triage with AI-generated severity scoring
Health score gauges with configurable threshold alerting
Population ingestion pipelines with automated rule execution

// 02

LLM Evaluation Framework

Structured LLM-as-a-Judge scoring that replaces subjective review with quantifiable assessment — seven testability dimensions, weighted aggregation, and human-in-the-loop governance.

In Practice

Seven-dimension testability scoring with weighted aggregation
Position bias, verbosity inflation, and self-enhancement detection
Domain-calibrated rubrics for compliance language
Human-in-the-loop review and approval workflows

// 03

Control Testing Automation

End-to-end automated testing from control description analysis through workpaper generation — Test of Design, Operating Effectiveness, and walkthrough analysis with AI-assisted evidence assessment.

In Practice

Automated Test of Design with structured scoring output
Operating effectiveness evaluation with sample assessment
Interactive walkthrough with dimension-aware context gathering
Structured workpaper generation with evidence citations

// 04

Control Archetype Intelligence

A normative inventory of 950+ controls organized across 10 regulatory archetypes — each with domain-specific evaluation criteria, population definitions, and expected evidence standards.

In Practice

10 archetypes: APPR, RECON, REV, SOD, ACCESS, ITGC, EXCMON, VAL, PERIOD, VENDOR
Archetype-specific evaluation rubrics and scoring weights
Normative control descriptions with enrichment pipelines
Cross-archetype pattern detection and gap analysis
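
One way to picture archetype-specific rubrics is a lookup keyed by archetype code that falls back to a default. The ten codes below are the ones listed above; the rubric contents and weight values are invented for illustration and do not reflect the platform's actual rubrics.

```python
from __future__ import annotations

from enum import Enum


class Archetype(str, Enum):
    """The ten archetype codes named in the text."""
    APPR = "APPR"
    RECON = "RECON"
    REV = "REV"
    SOD = "SOD"
    ACCESS = "ACCESS"
    ITGC = "ITGC"
    EXCMON = "EXCMON"
    VAL = "VAL"
    PERIOD = "PERIOD"
    VENDOR = "VENDOR"


# Hypothetical default weights and per-archetype overrides.
DEFAULT_RUBRIC = {"evidence_quality": 1.0, "frequency": 1.0}
RUBRIC_OVERRIDES = {
    Archetype.APPR: {"evidence_quality": 1.5},
    Archetype.RECON: {"frequency": 1.5},
}


def rubric_for(archetype: Archetype) -> dict[str, float]:
    """Default rubric with any archetype-specific overrides applied on top."""
    return {**DEFAULT_RUBRIC, **RUBRIC_OVERRIDES.get(archetype, {})}


print(rubric_for(Archetype.RECON))
print(rubric_for(Archetype("VENDOR")))
```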

// 05

Issue Lifecycle Management

AI-generated issue identification triggered from test results — with root cause quality scoring, remediation tracking, validation evidence assessment, and theme analysis across populations.

In Practice

Auto-generated issues with AI-prefilled root cause analysis
Remediation plan generation with milestone tracking
Validation evidence assessment and closure workflows
Cross-issue theme analysis and trend identification

// 06

Regulatory Reporting

Automated generation of committee-ready reports — control health scorecards, issue aging analytics, remediation velocity metrics, and export to PDF, Excel, and Word.

In Practice

Control health dashboards with trend and velocity analysis
Issue aging with SLA tracking and escalation triggers
Multi-format export (PDF, Excel, Word) with branded templates
Regulatory-aligned formatting (PCAOB, COSO, SR 11-7)
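
The issue aging and SLA tracking item above comes down to date arithmetic against a per-severity deadline. A minimal sketch, assuming hypothetical SLA windows of 30, 60, and 90 days and severity labels that are not taken from the platform:

```python
from __future__ import annotations

from datetime import date

# Assumed SLA windows in days per severity; not PathLighter's real settings.
SLA_DAYS = {"high": 30, "medium": 60, "low": 90}


def issue_age_days(opened: date, as_of: date) -> int:
    """Whole days an issue has been open as of the given date."""
    return (as_of - opened).days


def sla_breached(severity: str, opened: date, as_of: date) -> bool:
    """True when the issue has been open longer than its SLA window."""
    return issue_age_days(opened, as_of) > SLA_DAYS[severity]


opened = date(2024, 1, 1)
as_of = date(2024, 2, 15)
print(issue_age_days(opened, as_of), sla_breached("high", opened, as_of))
```

A breach flag like this is the kind of signal an escalation trigger could key on.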

Stack