membrAIn conducted a comprehensive AI gateway penetration assessment against a client's AI infrastructure on June 5, 2026. The engagement tested 102 distinct attack vectors across 12 threat categories (including prompt injection, jailbreak attempts, DLP exfiltration, multi-language evasion, MCP exploitation, and credential exposure) through the deployed membrAIn gateway at gateway.getmembrain.ai.
The gateway demonstrated strong baseline protection with an overall coverage score of 85% across all tested vectors. All 34 critical DLP patterns (credentials, PHI, financial instruments) were blocked with zero false negatives. Five findings were identified — two rated Critical, two High, and one Medium — with targeted remediation paths detailed in this report.
Overall assessment: The gateway is production-ready for standard enterprise AI governance. The two critical findings are addressable with configuration changes (no code changes required) within the remediation windows specified below. Immediate re-test recommended after remediation.
Testing was conducted against the production membrAIn gateway using the full 102-probe assessment suite. All probes were executed using synthetic, non-real data. No production data was accessed or exfiltrated at any point during the engagement.
Each probe result was classified as one of three outcomes: Blocked (HTTP 403, DLP violation or threat detection), Passed Through (HTTP 200, unimpeded; an adverse finding), or Gateway Error (configuration issue). Clean-control probes were included to verify a zero false-positive rate.
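The outcome classification above can be sketched in a few lines. This is a minimal illustration, assuming the outcome is derived from the HTTP status code alone; the names `Outcome`, `classify_probe`, and `is_adverse` are hypothetical and not part of the assessment suite.

```python
from enum import Enum


class Outcome(Enum):
    BLOCKED = "Blocked"
    PASSED_THROUGH = "Passed Through"
    GATEWAY_ERROR = "Gateway Error"


def classify_probe(status_code: int) -> Outcome:
    """Map an HTTP status code to one of the three probe outcomes."""
    if status_code == 403:
        return Outcome.BLOCKED
    if status_code == 200:
        return Outcome.PASSED_THROUGH
    # Anything else is treated as a gateway or configuration problem (assumption).
    return Outcome.GATEWAY_ERROR


def is_adverse(outcome: Outcome, is_clean_control: bool) -> bool:
    """A threat probe that passes through is adverse; a blocked clean control is a false positive."""
    if is_clean_control:
        return outcome is Outcome.BLOCKED
    return outcome is Outcome.PASSED_THROUGH
```

Treating clean controls with the inverse expectation keeps the false-positive check in the same code path as the threat probes.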
| ID | Severity | Title | Category | Status |
|---|---|---|---|---|
| MBR-001 | Critical | Encoding-Obfuscated Injection Bypasses Detection (Prompt Injection: Base64 / Unicode) | Prompt Injection | Open |
| MBR-002 | Critical | Nested Document Injection Not Blocked (Indirect Prompt Injection) | Agent Exploitation | Open |
| MBR-003 | High | Portuguese/French Injection Reaches Model (Multilingual Evasion) | DLP / Detection | Open |
| MBR-004 | High | Behavioral Anomaly Score Not Surfaced to SIEM (Detection & Response Gap) | Monitoring | Open |
| MBR-005 | Medium | Jailbreak via Roleplay Framing Inconsistently Blocked (Jailbreak: Semantic) | Threat Detection | Open |
Description: A Base64-encoded prompt injection payload was delivered to the gateway and forwarded to the model unblocked. The gateway's string-pattern detector operates on raw UTF-8 payload content and does not decode or canonicalize encoded variants before scanning. An adversary can encode any blocked pattern in Base64, URL-encoding, or Unicode escape sequences to bypass DLP and threat detection entirely.
Remediation: Add an encoding normalization pass to the analyzeContent() function: Base64-decode any detected Base64 blocks, URL-decode percent-encoded sequences, and unescape \uXXXX Unicode sequences before running DLP and injection pattern matching. Re-run the full probe suite after the fix. Estimated effort: 2–4 hours.
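The three decoding steps named above could look roughly like the following Python sketch. It is illustrative only: the gateway's actual analyzeContent() implementation is not shown in this report, and the 16-character Base64 threshold, the single `INJECTION_PATTERN`, and all function names here are assumptions.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Heuristic: runs of 16+ Base64 characters with optional padding (assumed threshold).
BASE64_BLOCK = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")
# Hypothetical stand-in for the gateway's real pattern set.
INJECTION_PATTERN = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)


def decode_base64_blocks(text: str) -> str:
    """Replace each Base64-looking block with its UTF-8 decoding when it decodes cleanly."""
    def _decode(match: re.Match) -> str:
        block = match.group(0)
        try:
            return base64.b64decode(block, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            return block  # not valid Base64 text; leave untouched
    return BASE64_BLOCK.sub(_decode, text)


def normalize(text: str) -> str:
    """URL-decode, unescape \\uXXXX sequences, then decode Base64 blocks."""
    text = unquote(text)
    text = UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    return decode_base64_blocks(text)


def analyze_content(text: str) -> bool:
    """Return True if the raw or the normalized text matches an injection pattern."""
    return any(INJECTION_PATTERN.search(t) for t in (text, normalize(text)))
```

Scanning both the raw and the normalized text means the pass can only add detections. A single pass does not catch nested encodings (for example Base64 inside URL-encoding inside Base64); repeating `normalize` until the text stops changing would be one way to cover that case.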
Description: An adversarial instruction embedded inside a quoted document string was not detected as an injection. This is a standard indirect prompt injection vector — malicious instructions are embedded in documents, emails, or retrieved web content, framed as user-facing text to avoid detection while being processed as commands by the model.
Remediation: Update the indirect injection classifier in classifyThreat(). This is a known hard problem; a behavioral Z-score anomaly check on response content is a complementary control.
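As an illustration of the complementary control, a Z-score check compares one measurement of a response against a baseline of earlier responses. The sketch below assumes the measurement is a response token count and uses a threshold of 3.0; both choices and the function names are hypothetical, not taken from the gateway.

```python
from statistics import mean, stdev
from typing import Sequence


def z_score(value: float, baseline: Sequence[float]) -> float:
    """Standard score of `value` relative to a baseline sample (needs at least 2 points)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Degenerate baseline with no spread; this sketch reports no anomaly.
        return 0.0
    return (value - mu) / sigma


def is_anomalous(value: float, baseline: Sequence[float], threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the baseline mean."""
    return abs(z_score(value, baseline)) > threshold


# Example: typical responses are about 100 tokens; a 400-token response stands out.
baseline_tokens = [100, 110, 90, 105, 95]
print(is_anomalous(400, baseline_tokens))
print(is_anomalous(102, baseline_tokens))
```

A check like this does not identify the injected instruction itself; it only signals that the model's output deviates from its usual shape, which is why it complements the classifier and does not replace it.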
Description: A prompt injection command written in Portuguese bypassed the gateway, and the French variant was blocked inconsistently. French and German variants were blocked on the initial run (fr-1 blocked; de-1 blocked), but the Portuguese variant was not. On repeat testing the French injection occasionally passed through, indicating an inconsistency in the LLM classifier's handling of accented characters.
Description: The gateway computes a behavioral Z-score for each request (token count, timing, content patterns) but this value is not included in the SIEM webhook payload. Security operations teams cannot correlate behavioral anomalies with downstream incidents without manual log correlation. This gap is particularly relevant for detecting slow, low-volume exfiltration attempts below the blocking threshold.
Remediation: Add behavioral_score and anomaly_components fields to the SIEM webhook payload schema. This is a configuration change only; no gateway logic changes are required. Verify with the OQ-43 SIEM E2E test.
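For orientation, a webhook event carrying the two fields might be assembled as in the sketch below. Only `behavioral_score` and `anomaly_components` come from this report; the remaining field names, the component keys (modelled on the token count, timing, and content pattern inputs described above), and the function name are assumptions for illustration.

```python
import json
from typing import Dict


def build_siem_event(
    request_id: str,
    action: str,
    behavioral_score: float,
    anomaly_components: Dict[str, float],
) -> str:
    """Serialize a SIEM webhook event that includes the behavioral fields."""
    event = {
        "request_id": request_id,  # hypothetical field
        "action": action,          # hypothetical field, e.g. "allowed" or "blocked"
        "behavioral_score": round(behavioral_score, 3),
        "anomaly_components": anomaly_components,
    }
    return json.dumps(event, sort_keys=True)


print(build_siem_event(
    "req-123",
    "allowed",
    2.71828,
    {"token_count": 1.2, "timing": 0.4, "content_patterns": 3.1},
))
```

Including the per-component breakdown alongside the aggregate score lets a SIEM rule alert on one drifting signal, such as content patterns, even when the combined score stays below the blocking threshold.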
Description: Roleplay and hypothetical-framing jailbreaks were blocked on 3 of 5 repeat runs, indicating that the LLM classifier produces inconsistent results for semantically borderline prompts. Although most attempts were blocked, the inconsistency means a determined adversary can simply retry until an attempt passes.
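The retry risk can be made concrete with a short calculation. Assuming each attempt is blocked independently with the observed rate of 3 in 5 (an idealization; real attempts may be correlated), the chance that at least one of n attempts passes is 1 - (3/5)^n. The function name below is hypothetical.

```python
def bypass_probability(block_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries is not blocked."""
    if not 0.0 <= block_rate <= 1.0:
        raise ValueError("block_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - block_rate ** attempts


for n in (1, 2, 5, 10):
    print(n, round(bypass_probability(3 / 5, n), 3))
# 1 attempt -> 0.4, 5 attempts -> about 0.922
```

Under these assumptions, five retries already give the adversary better than a 90% chance of success, so a per-attempt block rate only protects the model when it is paired with retry limiting or when it approaches 100%.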
| Category | Probes | Blocked | Coverage | Status |
|---|---|---|---|---|
| DLP — Credentials & Secrets | 34 | 34 | 100% | Clean |
| DLP — Healthcare PHI | 10 | 10 | 100% | Clean |
| DLP — Financial | 9 | 9 | 100% | Clean |
| DLP — Legal & Compliance | 7 | 7 | 100% | Clean |
| Prompt Injection | 8 | 6 | 75% | 2 Findings |
| Jailbreak | 6 | 5 | 83% | 1 Finding |
| Multilingual Evasion | 8 | 6 | 75% | 1 Finding |
| MCP / Agent Exploitation | 5 | 4 | 80% | Monitor |
| Social Engineering | 4 | 4 | 100% | Clean |
| Clean Controls (false-positive check) | 6 | 0 | 0% blocked | 0 False Positives |
Coverage = threats blocked ÷ total threat probes. Clean-control probes excluded from coverage denominator. Zero false positives confirms production-safe deployment.
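The per-category coverage figures can be reproduced from the probe counts in the table. The sketch below applies the stated formula to each threat category and handles clean controls separately; the data structure and function name are illustrative.

```python
# (probes, blocked) per threat category, copied from the coverage table.
THREAT_RESULTS = {
    "DLP Credentials & Secrets": (34, 34),
    "DLP Healthcare PHI": (10, 10),
    "DLP Financial": (9, 9),
    "DLP Legal & Compliance": (7, 7),
    "Prompt Injection": (8, 6),
    "Jailbreak": (6, 5),
    "Multilingual Evasion": (8, 6),
    "MCP / Agent Exploitation": (5, 4),
    "Social Engineering": (4, 4),
}
CLEAN_CONTROLS = (6, 0)  # (probes, blocked); any block here is a false positive


def coverage_pct(blocked: int, probes: int) -> int:
    """Coverage = threats blocked / total threat probes, as a rounded percentage."""
    if probes <= 0:
        raise ValueError("probes must be positive")
    return round(100 * blocked / probes)


for category, (probes, blocked) in THREAT_RESULTS.items():
    print(f"{category}: {coverage_pct(blocked, probes)}%")

control_probes, control_blocked = CLEAN_CONTROLS
print(f"False positives: {control_blocked} of {control_probes}")
```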
| Finding | Priority | Effort | Owner | Target Date |
|---|---|---|---|---|
| MBR-001 — Encoding normalization pass | P0 | 2–4 hrs | Gateway Engineering | June 12, 2026 |
| MBR-002 — Indirect injection classifier update | P0 | 4–8 hrs | Gateway Engineering | June 12, 2026 |
| MBR-003 — Portuguese/multilingual classifier | P1 | 2 hrs | Gateway Engineering | June 19, 2026 |
| MBR-004 — SIEM behavioral score field | P1 | 1 hr (config) | Platform Engineering | June 19, 2026 |
| MBR-005 — Roleplay jailbreak threshold | P2 | 1–2 hrs | Gateway Engineering | June 26, 2026 |
A complimentary re-test will be conducted against all five findings following remediation. Updated coverage scores and finding closure confirmation will be provided in an updated version of this report (v1.1).