REDSTAR
Foundry

We build intelligent systems — from AI-driven game engines and autonomous character networks to LLM-powered compliance platforms for global banks.

View Our Work · Get In Touch

Portfolio

Two production systems. One shared obsession: building AI that works at the technical layer — not as a wrapper around an API, but as deeply integrated architecture where models, evaluation frameworks, and domain logic are inseparable.

Disendya
Multi-Agent AI Dungeon Master

AI-Driven RPG Engine

Cyberpunk × High Fantasy

We don't build NPCs that wait for players to show up. We build characters that live lives when you're not watching.

A multi-agent AI Dungeon Master that runs a procedurally generated cyberpunk-fantasy RPG for multiple human players. Nine specialized AI agents coordinate in real time — from narrative generation and tactical combat to NPC relationships and moral consequence tracking.

Features a canon story mode with intelligent divergence tracking, where the system gracefully adapts when players go off-script, alongside a fully procedural freeplay mode. AI companions have real personalities, opinions, and the ability to disagree.

LangGraph · Claude API · React / Next.js · WebSockets · FastAPI · Redis · PostgreSQL · LangSmith · React Native
9 AI Agents
1K+ Agentic Characters
5 Personality Axes
4 Memory Layers
Explore Project · Story & World
PATHLIGHTER
AI-Powered Risk Intelligence

Compliance Monitoring Platform

Financial Services × Generative AI

Monitoring, testing, issues, and reporting should be instant with AI-enabled help — not buried in six weeks of spreadsheets.

A full-stack AI-powered compliance monitoring and control testing platform. Global Systemically Important Banks run massive control inventories across dozens of regulatory domains — PathLighter automates the decision-making layer, from data ingestion through AI-assisted evaluation and workpaper generation.

The platform flags compliance exceptions, identifies design and operating effectiveness deficiencies, and surfaces control gaps that manual review would take weeks to find. An LLM-as-a-Judge evaluation engine scores controls across seven testability dimensions, while the monitoring module runs continuous surveillance with exception queues, health score gauges, and threshold alerting across 950+ controls spanning 10 regulatory archetypes.

Monitoring · Control Testing · Issue Intelligence · Reports & Analytics
React / Next.js · TypeScript · Python / FastAPI · PostgreSQL · Claude API · Tailwind CSS · LLM-as-a-Judge
950+ Controls Mapped
10 Archetypes
7 Test Dimensions
4 Modules
Explore Project

Capabilities

01

Multi-Agent AI Systems

Specialized agent architectures where each AI handles a distinct responsibility — orchestrated through graph-based workflows with full observability, tracing, and tiered processing that scales cost to complexity.

In Practice
  • LangGraph workflow orchestration with conditional routing
  • Tiered processing: frontier models for critical decisions, lightweight for routine
  • LLM-as-a-Judge evaluation across structured scoring dimensions
  • Full LangSmith observability across agent decision chains
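The orchestration pattern above can be sketched in plain Python, without the LangGraph dependency: nodes update shared state, and a conditional router decides which model tier handles the task. The node names, the length-based complexity heuristic, and the 0.5 threshold are illustrative stand-ins, not the production system's logic.

```python
# Toy state graph with conditional routing: each node is a function that
# mutates shared state; a router picks the next node based on that state.

def classify(state):
    # Cheap pre-pass: crude complexity estimate in [0.0, 1.0].
    state["complexity"] = min(len(state["task"]) / 200, 1.0)
    return state

def frontier_model(state):
    state["model"] = "frontier"      # top-tier model for critical decisions
    return state

def lightweight_model(state):
    state["model"] = "lightweight"   # cheaper model for routine work
    return state

def route(state):
    # Conditional edge: scale cost to complexity.
    return "frontier" if state["complexity"] > 0.5 else "lightweight"

NODES = {
    "classify": classify,
    "frontier": frontier_model,
    "lightweight": lightweight_model,
}

def run(task):
    state = {"task": task}
    state = NODES["classify"](state)
    state = NODES[route(state)](state)
    return state
```

In the real graph each node would call a model and emit traces; the shape — typed state, pure-ish node functions, routing as data — is what makes the workflow observable and testable.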
02

LLM Evaluation & Governance

Production-grade LLM evaluation frameworks that go beyond vibes — structured scoring rubrics, bias quantification, domain calibration, and human-in-the-loop governance for high-stakes AI outputs.

In Practice
  • Multi-dimension scoring with weighted aggregation
  • Bias detection: position bias, verbosity inflation, self-enhancement
  • Domain-calibrated rubrics for specialized language and context
  • Automated output generation with regulatory-ready formatting
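A minimal sketch of the scoring side, under assumed dimension names and weights: per-dimension judge scores are combined by weighted mean, and position bias is detected by judging both orderings of a pairwise comparison and checking whether the verdict flips.

```python
# LLM-as-a-Judge aggregation sketch. Dimensions, weights, and the 1-5
# scale are illustrative, not the platform's actual rubric.

WEIGHTS = {"clarity": 0.2, "testability": 0.5, "completeness": 0.3}

def aggregate(scores):
    # Weighted mean across structured scoring dimensions.
    return sum(WEIGHTS[dim] * score for dim, score in scores.items())

def position_bias(winner_ab, winner_ba):
    # Judge the pair in both presentation orders; a stable judge prefers
    # the same answer either way. A flipped verdict signals position bias.
    return winner_ab != winner_ba
```

Verbosity inflation and self-enhancement checks follow the same template: perturb the input (pad an answer, swap in the judge's own output) and measure how much the score moves when the substance has not changed.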
03

Persistent Memory Systems

AI that remembers — episodic, emotional, and semantic memory layers with compression strategies that keep context windows manageable while preserving meaningful history across extended runtime.

In Practice
  • Episodic memory with timestamps, participants, and emotional tags
  • Tiered decay: emotional memory persists longer than factual by design
  • Compression pipelines that distill old memories into semantic knowledge
  • Embedding-based retrieval surfaces relevant context on demand
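The tiered-decay idea can be shown in a few lines: every memory carries a timestamp, participants, and a tier tag, and salience decays exponentially with a per-tier half-life, so emotional memories outlive factual ones. The half-life values here are illustrative assumptions.

```python
# Tiered memory decay sketch: emotionally tagged memories lose salience
# more slowly than neutral factual ones. Half-lives are made up.
import time

HALF_LIFE_SECONDS = {"emotional": 30 * 86400, "factual": 3 * 86400}

def make_memory(text, participants, emotional=False, now=None):
    return {
        "text": text,
        "participants": participants,
        "kind": "emotional" if emotional else "factual",
        "created": time.time() if now is None else now,
    }

def salience(memory, now):
    # Exponential decay with a per-tier half-life.
    age = now - memory["created"]
    return 0.5 ** (age / HALF_LIFE_SECONDS[memory["kind"]])
```

A week after the event, a betrayal tagged emotional still scores high while the price of ammo has mostly faded — which is the behavior you want from a character, and it falls out of one decay constant per tier.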
04

LLM Production Architecture

Production-grade LLM applications with context window management, tool use, structured outputs, rolling memory, and cost optimization strategies that make complex AI systems economically viable.

In Practice
  • Claude API with structured outputs and tool-use patterns
  • Context window management with token-aware memory compression
  • Cost tiering: frontier models for critical decisions, lightweight for routine
  • Prompt engineering for personality and domain consistency across sessions
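One of those techniques, token-aware rolling memory, can be sketched as: keep the newest turns verbatim, and fold everything older into a running summary once a token budget is exceeded. The 4-characters-per-token estimate and the join-based summarizer are placeholders for a real tokenizer and a summarization model call.

```python
# Token-aware rolling memory sketch. The token estimate and summarizer
# are illustrative stand-ins, not production components.

def estimate_tokens(text):
    return max(1, len(text) // 4)  # crude proxy for a real tokenizer

def compress(history, budget,
             summarize=lambda msgs: "Earlier: " + "; ".join(msgs)):
    kept, total = [], 0
    # Walk backwards so the newest turns survive verbatim.
    for msg in reversed(history):
        cost = estimate_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    kept.reverse()
    older = history[: len(history) - len(kept)]
    # Older turns collapse into one summary slot at the front.
    return ([summarize(older)] if older else []) + kept
```

The same budget logic drives cost tiering: the compressed window is what gets sent on every call, so context stays bounded no matter how long the session runs.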
05

Full-Stack Delivery

End-to-end system architecture from database schema to production UI — React frontends, Python backends, PostgreSQL data layers, and deployment pipelines built for real users, not demos.

In Practice
  • React + Next.js frontends with TypeScript and Tailwind
  • Python FastAPI backends with PostgreSQL + Redis
  • WebSocket real-time sync and REST API design
  • Complete SDLC: architecture, build, test, deploy, iterate

About

RedStar Foundry is led by Taiin — a terrible gamer, obsessive data nerd, movie lover, voracious reader, and chaotic energy all wrapped in one.

By day, AI governance and model risk for Tier 1 financial institutions. By night, building worlds where AI characters live their own lives. The through-line is the same: we build systems where AI isn't a feature — it's the architecture.

Our philosophy: build things that work at the technical layer at scale. Not wrappers. Not demos. Not pitch decks. Working software that solves hard problems — whether that's a thousand NPCs with genuine personalities or a compliance platform that turns six weeks of manual data work into one autonomous agent executing instantly inside a governed framework.