Engineering at Darwin

We build systems where correctness matters

Darwin is a traceability and compliance platform for regulated supply chains. Here we share how we build it — architecture, technical decisions, and production learnings in AI, blockchain, and data engineering.

Principles

How we build

Opinionated engineering for regulated environments with real stakes.

Correctness over speed

When a decision affects regulatory compliance, being wrong quickly isn't an option. Guardrails first, performance later.

Hybrid retrieval

Vector-only retrieval doesn't scale for domains that mix structured and unstructured data. We use query planners that decide the retrieval strategy per question.
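
As a rough illustration (not our actual planner; the hints and strategy names are hypothetical), the routing decision can start as a lightweight classification of the question before any retrieval runs:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Plan:
    strategy: Literal["vector", "sql", "hybrid"]
    reason: str

# Keywords that hint at structured filters or aggregations (illustrative only).
STRUCTURED_HINTS = ("how many", "count", "between", "lot number", "before", "after")

def plan_query(question: str) -> Plan:
    q = question.lower()
    if any(hint in q for hint in STRUCTURED_HINTS):
        # Quantitative or filtered questions go to the relational store first.
        return Plan("sql", "structured filter or aggregation detected")
    if q.startswith(("why", "explain")):
        # Open-ended questions lean on semantic retrieval over documents.
        return Plan("vector", "open-ended question over unstructured docs")
    # No strong signal: run both and let a reranker merge the results.
    return Plan("hybrid", "default strategy")

print(plan_query("How many lots shipped between March and May?"))
```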

Observability from day one

Every query, every LLM call, every on-chain event has end-to-end tracing with OpenTelemetry. No observability, no production.
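
A minimal sketch of what that looks like for an LLM call, using the standard OpenTelemetry Python API; the span and attribute names here are illustrative, not our internal conventions:

```python
from opentelemetry import trace

tracer = trace.get_tracer("darwin.llm")  # tracer name is illustrative

def call_llm(prompt: str, model: str) -> str:
    return "stub answer"  # placeholder for the actual provider call

def traced_llm_call(prompt: str, model: str) -> str:
    # Wrap every LLM call in a span so latency, model, and payload sizes
    # land in the same trace as the query that triggered the call.
    with tracer.start_as_current_span("llm.generate") as span:
        span.set_attribute("llm.model", model)
        span.set_attribute("llm.prompt_chars", len(prompt))
        answer = call_llm(prompt, model)
        span.set_attribute("llm.answer_chars", len(answer))
        return answer
```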

AI-augmented, human-in-the-loop

LLMs to reason over context; deterministic rules to validate results. The combo beats either approach alone.
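
A hedged sketch of the pattern (the rules and field names are invented for illustration): the model drafts a structured answer, and plain deterministic checks decide whether it can proceed.

```python
def validate_answer(answer: dict, known_lots: set[str]) -> list[str]:
    """Deterministic checks applied to an LLM-drafted answer before it is used."""
    errors: list[str] = []
    # Every lot the model mentions must exist in the source-of-truth records.
    for lot in answer.get("lots", []):
        if lot not in known_lots:
            errors.append(f"unknown lot referenced: {lot}")
    # Numeric fields must fall inside physically plausible bounds.
    qty = answer.get("quantity_kg")
    if qty is not None and not (0 < qty < 100_000):
        errors.append(f"quantity out of range: {qty}")
    return errors

draft = {"lots": ["LOT-001", "LOT-999"], "quantity_kg": 1200}
issues = validate_answer(draft, known_lots={"LOT-001", "LOT-002"})
if issues:
    # Reject, or route to a human reviewer, instead of trusting the model output.
    print("rejected:", issues)
```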

On-chain for integrity, off-chain for performance

Blockchain only for critical attestations (identity, audit events). The rest lives in systems optimized for fast access.
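
The shape of that split, as a simplified sketch; the helper functions stand in for the real storage and contract calls:

```python
import hashlib
import json

def store_off_chain(event: dict, digest: str) -> None:
    ...  # write the full payload to the fast store (e.g. PostgreSQL)

def submit_attestation(digest: str) -> None:
    ...  # anchor only the hash on-chain via the attestation contract

def record_audit_event(event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same digest.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode()).hexdigest()
    store_off_chain(event, digest)   # full record: off-chain, optimized for reads
    submit_attestation(digest)       # integrity proof: on-chain, tiny and immutable
    return digest
```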

Build for portability

Provider abstraction via YAML config lets us swap LLMs without a refactor. We avoid vendor lock-in in the AI and data layers.
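
A minimal sketch of the shape, assuming PyYAML; the class and field names are illustrative rather than our actual schema:

```python
import yaml

CONFIG_YAML = """
llm:
  provider: anthropic
  model: claude-sonnet
  temperature: 0.0
"""

class LLMClient:
    def generate(self, prompt: str) -> str:
        raise NotImplementedError

class AnthropicClient(LLMClient):
    def __init__(self, model: str, temperature: float):
        self.model, self.temperature = model, temperature

class OpenAIClient(LLMClient):
    def __init__(self, model: str, temperature: float):
        self.model, self.temperature = model, temperature

PROVIDERS = {"anthropic": AnthropicClient, "openai": OpenAIClient}

def client_from_config(raw: str) -> LLMClient:
    cfg = yaml.safe_load(raw)["llm"]
    # Swapping providers means editing the YAML, not the call sites.
    cls = PROVIDERS[cfg["provider"]]
    return cls(model=cfg["model"], temperature=cfg.get("temperature", 0.0))
```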

Stack

Tech we use

Informed choices — not trendy, but battle-tested in the domains that matter to us.

AI / Orchestration

LangGraph for agentic workflows, Python/FastAPI backend, Cursor + Claude Code as daily drivers.

Data / Retrieval

Qdrant for vector search, PostgreSQL for relational data, Firebase for state, Cloud Storage for blobs.

Blockchain

Polygon PoS + OP Stack L2, 12 smart contracts in Solidity (identity, governance, DID registry, NFT inventory).

Cloud / Infra

GKE on GCP, event-driven messaging with Pub/Sub, Terraform for IaC, CI/CD with GitHub Actions.

Frontend

React (Next.js) for web, React Native for offline-first mobile (Captia).

Observability

Full-stack OpenTelemetry, structured logging, LLM call tracing, metrics + dashboards.

Articles

Technical deep-dives

Production learnings — architecture, tradeoffs, what worked and what didn't.

RAG over 10+ databases: what production taught us
Engineering

Why vector-only RAG doesn't scale in compliance, how we designed hybrid retrieval across multiple stores, and the architectural decisions that worked in production.

By Hernán Pérez Rodal

Why we put traceability on-chain: FSMA 204 compliance at the protocol level
Engineering

Most traceability platforms use blockchain as marketing. We share how and why we use it architecturally at Darwin — and when it does NOT make sense.

By Hernán Pérez Rodal

Agentic Compliance System with LangGraph: patterns that work in production
Engineering

Not every multi-agent pattern survives regulated domains. We share the agent architecture we use at Darwin, why, and which anti-patterns we avoid.

By Hernán Pérez Rodal

AI anomaly detection on traceability events: from detection to yield optimization
Engineering

Detecting anomalies is 20% of the problem. The other 80% is turning alerts into real production savings. We share how we solve it at Darwin — with simple models that move the needle.

By Hernán Pérez Rodal

Offline-first architecture: data capture in rural areas with intermittent connectivity
Engineering

In Latin America, the first production link typically has 3G signal or worse. We share how we designed Captia to work without connection — and what tradeoffs we accepted with eventual consistency.

By Hernán Pérez Rodal

LLM evaluation in regulated domains: beyond accuracy
Engineering

When a wrong answer from your LLM affects an FDA audit, accuracy isn't enough. We share how we evaluate LLMs and agents at Darwin — golden sets, LLM-as-judge, regression detection and numeric guardrails.

By Hernán Pérez Rodal

Building something similar?

We're interested in sharing our architecture and learning from other teams' cases. If you have a technical challenge that overlaps with ours, let's talk.