Guardrails AI

Guardrails AI is the open-source validation framework we layer on top of LLM responses to enforce structure, redact PII, and block unsafe outputs before they reach users. The principle: never trust a raw LLM response in production. Every output goes through a validation layer that checks for the things the LLM is likely to get wrong: malformed output, leaked private data, off-topic answers, jailbreak responses, and claims that contradict a known truth set. We pair Guardrails with our own evaluation harness for ongoing measurement. Together they turn 'AI shipped' into 'AI shipped safely'.

What you get

PII detection + redaction before LLM output reaches the client — critical for healthcare, financial services, legal workloads

Output schema enforcement — guarantee structured JSON / specific format / required fields without trusting the model

Content safety filters catch toxic, off-topic, or jailbreak outputs at the application boundary

Reasking strategy — when validation fails, automatically reprompt the LLM with the failure context instead of returning the bad answer

Open-source + extensible — we add custom validators for client-specific compliance rules

Real examples

PII redaction in healthcare AI

Illustrative scenario: a healthcare provider's AI assistant summarises patient records. Guardrails enforces PII redaction on every output: patient names, MRNs, and dates of birth are stripped or hashed before the summary reaches the doctor's screen. Every redaction event is recorded in a mandatory audit trail.
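To make the scenario concrete, here is a minimal plain-Python sketch of hash-based redaction with an audit log. It does not use the Guardrails API. The regex patterns, the `redact` function, and the `RedactionResult` type are hypothetical names chosen for illustration, and real MRN and date formats vary by provider.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration; real MRN and date formats vary.
PII_PATTERNS = {
    "mrn": re.compile(r"\bMRN[:\s]*\d{6,10}\b"),
    "dob": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


@dataclass
class RedactionResult:
    text: str
    audit_log: list = field(default_factory=list)


def redact(text: str, known_names: list[str]) -> RedactionResult:
    """Replace matched PII with a short hash token and record each event."""
    audit = []

    def token(kind: str, value: str) -> str:
        # Hash the raw value so the audit entry never stores the PII itself.
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
        audit.append({"kind": kind, "token": digest})
        return f"[{kind.upper()}:{digest}]"

    for kind, pattern in PII_PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: token(k, m.group(0)), text)
    # Names come from the patient record, so they are matched literally.
    for name in known_names:
        text = re.sub(re.escape(name), lambda m: token("name", m.group(0)), text)
    return RedactionResult(text=text, audit_log=audit)
```

Calling `redact(summary, ["Jane Doe"])` returns the cleaned text plus one audit entry per replacement, which is the piece a compliance reviewer would inspect.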

Structured output enforcement for downstream automation

Illustrative scenario: an invoice-processing pipeline expects strict JSON output from the LLM. Guardrails enforces the schema (date format, currency code, line-item structure) and re-prompts the LLM if validation fails. Downstream automation never receives malformed data.
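The validate-then-reask loop described above can be sketched in plain Python as follows. This is not the Guardrails API: the field names in `REQUIRED_FIELDS`, the `validate_invoice` and `call_with_reask` functions, and the `llm` callable are all assumptions made for illustration.

```python
import json
import re
from typing import Callable

REQUIRED_FIELDS = {"invoice_date", "currency", "line_items"}  # hypothetical schema


def validate_invoice(raw: str) -> list[str]:
    """Return a list of validation errors; an empty list means valid output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = [f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - set(data))]
    if "invoice_date" in data and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(data["invoice_date"])):
        errors.append("invoice_date must be YYYY-MM-DD")
    if "currency" in data and not re.fullmatch(r"[A-Z]{3}", str(data["currency"])):
        errors.append("currency must be a 3-letter code")
    if "line_items" in data and not isinstance(data["line_items"], list):
        errors.append("line_items must be a list")
    return errors


def call_with_reask(llm: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    """Call the model, re-prompting with the failure context until output validates."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = llm(current_prompt)
        errors = validate_invoice(raw)
        if not errors:
            return json.loads(raw)
        # Feed the specific failures back instead of returning the bad answer.
        feedback = "; ".join(errors)
        current_prompt = f"{prompt}\n\nYour previous answer was rejected: {feedback}. Return corrected JSON only."
    raise ValueError(f"output still invalid after {max_attempts} attempts: {errors}")
```

Raising after the final attempt, rather than passing the last output through, is what keeps malformed data away from downstream automation.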

Off-topic + jailbreak detection in customer-facing chat

Illustrative scenario: a B2B SaaS chat assistant must stay within product-support scope. Guardrails filters outputs that drift off-topic, refuses politely on jailbreak attempts, and logs every refusal for review. This prevents the bot from being weaponised for unintended use cases.
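A rough sketch of that filter-refuse-log flow is shown below. It is plain Python, not the Guardrails API, and it uses keyword matching purely as a stand-in: a production filter would more likely rely on a trained classifier. The topic list, marker list, and `guard_reply` function are hypothetical.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("scope_guard")

# Hypothetical keyword lists; keyword matching is a stand-in for a classifier.
ALLOWED_TOPICS = {"billing", "login", "invoice", "integration", "password"}
JAILBREAK_MARKERS = ("ignore previous instructions", "developer mode", "system prompt")
REFUSAL = "Sorry, I can only help with questions about the product."


@dataclass
class GuardDecision:
    allowed: bool
    reply: str
    reason: str = ""


def guard_reply(user_message: str, draft_reply: str) -> GuardDecision:
    """Return the draft reply if it is in scope, otherwise a polite refusal."""
    lowered = user_message.lower()
    if any(marker in lowered for marker in JAILBREAK_MARKERS):
        reason = "jailbreak_attempt"
    elif not any(topic in draft_reply.lower() for topic in ALLOWED_TOPICS):
        reason = "off_topic"
    else:
        return GuardDecision(allowed=True, reply=draft_reply)
    logger.warning("refused reply: reason=%s", reason)  # kept for later review
    return GuardDecision(allowed=False, reply=REFUSAL, reason=reason)
```

The `reason` field is what makes the refusal log reviewable: it separates deliberate attacks from ordinary scope drift.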

Common questions

Why a separate validation layer when LLMs have built-in safety?

Built-in model safety is good for the general case but tunes for breadth, not specificity. Your application has rules the model doesn't know — what counts as PII for your specific jurisdiction, what fields your downstream system requires, what topics are off-limits for your product. Guardrails encodes those rules as deterministic checks at the application boundary, independent of which model you're calling.

Does this add latency?

Some — typically 50-200ms per call depending on the validator stack. For most production workloads that's acceptable. For latency-critical realtime apps (sub-500ms response targets), we cut the validator stack to the safety-critical checks and run others asynchronously after the response is sent.
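One way to structure that split, sketched with the standard library only: block on the safety-critical checks, then schedule the remaining checks as a background task. The `respond` function, the `Validator` alias, and the `findings` list are assumptions for illustration, not part of Guardrails.

```python
import asyncio
from typing import Callable

Validator = Callable[[str], bool]  # returns True when the text passes


async def respond(
    text: str,
    critical: list[Validator],
    deferred: list[Validator],
    findings: list[str],
) -> tuple[str, asyncio.Task]:
    """Block only on critical checks; run the rest after the response is returned."""
    for check in critical:
        if not check(text):
            raise ValueError(f"blocked by {check.__name__}")

    async def run_deferred() -> None:
        for check in deferred:
            # Run each slower check in a worker thread, off the event loop.
            passed = await asyncio.to_thread(check, text)
            if not passed:
                findings.append(check.__name__)

    # The caller can send `text` immediately and inspect findings later.
    return text, asyncio.create_task(run_deferred())
```

The trade-off is that a deferred failure is only discovered after the user has seen the response, so this pattern suits checks that feed monitoring rather than checks that must block.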

Can Guardrails replace our content-moderation policy?

No. It's enforcement, not policy. The policy still needs to come from your compliance / legal / product team — Guardrails just encodes that policy into deterministic runtime checks. We typically pair a Guardrails implementation with a documented content-moderation policy review.

What about hallucination detection?

Partial support. Guardrails has a 'provenance' validator that checks LLM claims against a known truth set (great for RAG). For genuinely novel hallucinations (where there's no truth set to check against), we layer in our own LLM-as-judge evaluation harness on top of Guardrails — judging accuracy with a different model is sometimes the best detection signal available.
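To illustrate the idea of checking claims against a truth set, here is a deliberately crude sketch that flags answer sentences with little word overlap against the retrieved source passages. It is not the Guardrails provenance validator and says nothing about how that validator works internally; the `unsupported_sentences` function and the 0.6 threshold are assumptions, and real systems typically use embeddings or a judge model instead of token overlap.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric word set for a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_sentences(answer: str, sources: list[str], threshold: float = 0.6) -> list[str]:
    """Return answer sentences whose best token overlap with any source is below threshold."""
    source_tokens = [_tokens(s) for s in sources]
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = _tokens(sentence)
        if not words:
            continue
        # Fraction of the sentence's words that appear in the best-matching source.
        best = max((len(words & src) / len(words) for src in source_tokens), default=0.0)
        if best < threshold:
            flagged.append(sentence)
    return flagged
```

With no sources at all, every sentence is flagged, which mirrors the point above: without a truth set, this kind of check has nothing to anchor on.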

How do you decide what validators to use?

Three buckets. Always-on: PII redaction (any data-sensitive context), JSON schema enforcement (any structured output). Risk-driven: toxicity, jailbreak, off-topic filters for customer-facing chat. Domain-specific: custom validators for industry rules (financial advice disclaimers, healthcare HIPAA equivalents, etc.). We design the validator stack as part of the AI readiness scoping work.
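The three buckets can be expressed as a small selection function. This is a sketch only: the `AppProfile` fields and the validator names are hypothetical labels, not identifiers from Guardrails.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    handles_sensitive_data: bool
    structured_output: bool
    customer_facing: bool
    industry: str = ""  # e.g. "finance" or "healthcare"


# Hypothetical validator names, used only as labels in this sketch.
DOMAIN_VALIDATORS = {
    "finance": ["financial_advice_disclaimer"],
    "healthcare": ["health_privacy_rules"],
}


def select_validators(profile: AppProfile) -> list[str]:
    """Build an ordered validator stack from the three buckets."""
    stack = []
    # Always-on bucket
    if profile.handles_sensitive_data:
        stack.append("pii_redaction")
    if profile.structured_output:
        stack.append("json_schema")
    # Risk-driven bucket
    if profile.customer_facing:
        stack += ["toxicity", "jailbreak", "off_topic"]
    # Domain-specific bucket
    stack += DOMAIN_VALIDATORS.get(profile.industry, [])
    return stack
```

Keeping the selection in one function makes the stack reviewable as a single artifact during scoping, rather than scattered across call sites.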

Ready to get started?

Tell us about your project and we'll tell you honestly how we can help.
