Red Team: Definition and Use in Compliance
Red teaming is an adversarial testing practice in which an independent group simulates realistic attack scenarios against an organization's security controls, processes, and personnel to identify exploitable gaps before genuine threat actors find them.
What is Red Team?
A red team is an independent group that simulates realistic adversarial attacks against an organization's defenses, with the goal of finding exploitable weaknesses before genuine threat actors do. The defining characteristic is objective-driven scope. A red team doesn't receive a list of systems to test; it receives a goal. "Gain access to the wire transfer system." "Demonstrate you can suppress compliance alerts." "Exfiltrate customer PII." The team works out how to get there, same as a real attacker would.
This matters because scope-based penetration testing, however thorough, answers a different question. Penetration testing tells you whether a specific system has known vulnerabilities. Red teaming tells you whether a determined adversary can achieve a damaging outcome by chaining together the weaknesses your organization has right now. Regulated financial institutions need answers to both.
In practice, red teams operate across three attack surfaces simultaneously. Technical: network penetration, cloud misconfiguration, application exploits, API abuse. Human: spear-phishing specific employees, vishing the help desk, social engineering vendors. Physical: tailgating into restricted areas, USB drop attacks in car parks. Sophisticated adversaries combine all three. A red team that only tests technical defenses misses how most real breaches start.
The regulatory definitions are now precise. The European Central Bank's TIBER-EU framework, published in 2018, defines a Threat-Led Penetration Test (TLPT) as one that "mimics the tactics, techniques and procedures of real threat actors" based on institution-specific threat intelligence. The Bank of England's CBEST framework, introduced in 2013, takes the same intelligence-led approach. DORA (Regulation (EU) 2022/2554), applicable from 17 January 2025, made threat-led penetration testing a statutory obligation for significant financial entities, citing TIBER-EU as the reference methodology.
Red team findings feed into Operational Resilience planning. If a team demonstrates it can disrupt a payment processing service in under 72 hours, the firm needs to reassess whether its Impact Tolerance for that service is realistic and whether the controls behind it are actually effective.
How is Red Team used in practice?
Running a red team exercise in a financial institution is a structured process. Eight to sixteen weeks from initial scoping to final report is typical for a full-spectrum exercise; TIBER-EU and CBEST assessments often run longer because of the intelligence-gathering phase.
Scoping comes first. The Chief Information Security Officer (CISO) defines the primary objective and acceptable attack surfaces. Legal and compliance confirm what's permitted under local law, particularly for social engineering scenarios that involve employees. For TIBER-EU exercises, an accredited Threat Intelligence provider builds a Targeted Threat Intelligence report: a detailed profile of the threat actors most likely to target this specific institution, their tools, their past campaigns, their preferred entry points. This isn't generic sector intelligence. It's a dossier built on the actual threat landscape facing this firm.
A separate accredited red team provider uses that report to design realistic attack scenarios. They're not attempting everything possible; they're attempting what a specific adversary would attempt. That discipline keeps findings actionable.
During the exercise, the red team operates with minimal constraints. In one documented CBEST exercise (described in the Bank of England's 2019 thematic review of CBEST outcomes), a red team combined a spear-phishing email targeting a junior analyst with a watering-hole attack on an industry forum that analyst was known to visit, achieving persistent internal network access within five days. That finding would be unlikely to surface in a scope-limited test, because neither the analyst's inbox nor a third-party forum would appear on a list of systems to test.
After the exercise, the Blue Team receives a full attack timeline and debrief. In Purple Team format, red and blue teams work together in real time, which accelerates detection engineering but produces a less realistic assessment of actual detection capability.
The involvement of the Money Laundering Reporting Officer (MLRO) at the scoping stage is non-negotiable for financial institutions. An attacker with access to a transaction monitoring console who can adjust alert thresholds can suppress Suspicious Activity Report (SAR) generation without touching anything a network security team would normally monitor. Testing that attack path requires anti-money laundering (AML) expertise to define what "success" looks like from the adversary's perspective.
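The threshold-tampering path described above can be made concrete with a small sketch. It assumes, purely for illustration, that monitoring rules are stored as a mapping from rule name to an alert threshold, and that a higher threshold means fewer alerts. The rule names, amounts, and the find_loosened_thresholds helper are hypothetical and do not describe any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdChange:
    rule_id: str
    approved: float
    current: float


def find_loosened_thresholds(
    approved: dict[str, float], current: dict[str, float]
) -> list[ThresholdChange]:
    """Return rules whose live threshold is looser than the approved baseline.

    Assumption: a higher threshold means fewer alerts, so any increase
    (or a missing rule) reduces alert generation.
    """
    changes = []
    for rule_id, approved_value in sorted(approved.items()):
        current_value = current.get(rule_id)
        if current_value is None:
            # A rule that has been removed can no longer alert at all.
            changes.append(ThresholdChange(rule_id, approved_value, float("inf")))
        elif current_value > approved_value:
            changes.append(ThresholdChange(rule_id, approved_value, current_value))
    return changes


# Hypothetical approved baseline, signed off by the MLRO.
approved_baseline = {
    "cash_deposit_single": 10_000.0,
    "wire_out_daily_total": 25_000.0,
    "new_payee_transfer": 5_000.0,
}

# Hypothetical live configuration after the red team gained console access.
live_config = {
    "cash_deposit_single": 10_000.0,
    "wire_out_daily_total": 250_000.0,
}

loosened = find_loosened_thresholds(approved_baseline, live_config)
for change in loosened:
    print(f"{change.rule_id}: approved {change.approved}, live {change.current}")
```

A comparison like this gives the exercise a measurable success criterion: the red team succeeds if it can loosen a threshold and the change goes unnoticed for an agreed period. Which rules matter, and what counts as "loosened", is the judgment the MLRO brings to scoping.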
Red Team in regulatory context
The regulatory framework for red teaming in financial services is now well-developed in the EU and UK, and developing in the US.
The Bank of England introduced CBEST in 2013, making the UK one of the first jurisdictions to require intelligence-led adversarial testing for systemically important financial institutions. The FCA and PRA jointly govern the program. In-scope firms, including major banks, insurers, and payment system operators, run CBEST assessments on approximately a three-year cycle. The Bank of England published a thematic review of CBEST outcomes in 2019, which found that credential theft via social engineering was the most common successful initial access vector across assessed firms.
TIBER-EU, published by the European Central Bank in 2018, created a pan-European standard with a three-party model: the institution, a separate threat intelligence provider, and a separate red team provider. National competent authorities across EU member states adopted localized versions: TIBER-NL, TIBER-FR, TIBER-DE, TIBER-BE, and others. Each version follows the same governance structure but is supervised by the relevant national authority.
DORA elevated red teaming from voluntary best practice to statutory requirement. Article 26 of Regulation (EU) 2022/2554 requires significant financial entities to conduct threat-led penetration testing at least every three years. Entities operating critical systems across multiple member states must coordinate with the relevant competent authorities. Non-compliance triggers supervisory consequences under DORA's enforcement regime.
In the US, there's no single DORA-equivalent mandate, but the FFIEC Cybersecurity Assessment Tool and OCC guidance both treat adversarial testing as a component of mature security programs. New York's Department of Financial Services Cybersecurity Regulation (23 NYCRR Part 500) requires annual penetration testing alongside regular vulnerability assessments; a red team exercise can satisfy the penetration testing requirement for firms with advanced programs.
AI Governance expectations are extending into this space. The ECB's guidance on AI in banking (2024) explicitly references stress testing of model behavior under adversarial conditions. As AI systems take on substantive roles in Transaction Monitoring and fraud detection, supervisors increasingly expect firms to test those systems adversarially, not just validate them statistically. A model that performs well on historical data can behave differently when an attacker understands its feature weights.
Common challenges and how to address them
Red team exercises in financial institutions fail in predictable ways.
Scope that's too narrow. A firm that defines "in scope: the public-facing web application" learns nothing about how an attacker moves from a compromised endpoint to a sensitive system. Attackers don't respect scope boundaries. The exercise should include vendor access paths, internal networks reachable from compromised devices, and human targets (employees, contractors, help desk). This requires HR and legal sign-off, which is friction worth accepting. The one constraint that does belong in scope: don't take down production systems that would cause customer harm.
Blue team awareness killing realism. If too many people know an exercise is running, defenders unconsciously raise their alert levels. TIBER-EU handles this with a strict "white team" structure: only a small governance group (typically CISO, CRO, and one board member) knows the exercise is live. Operations staff work normally, which is the point. The Bank of England's CBEST governance requirements follow the same model.
Findings that don't reach decision-makers. A red team report that stays inside the security team is a missed opportunity. Findings that demonstrate an attacker could disrupt a Critical Business Service or suppress compliance reporting belong in front of the board risk committee. Some firms now include red team executive summaries as a standing quarterly item.
AI and model security is undertested. Most financial institutions red team their networks competently but don't test their AI systems adversarially. A fraud detection model can be manipulated by an attacker who understands its feature set and knows how to craft transactions that fall below alert thresholds. Model Risk Management (MRM) programs should include adversarial testing protocols alongside standard Model Validation. This adds time to model deployment, but the improvement in detecting evasion is worth it.
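The evasion described above can be sketched as an adversarial test case. The example below is a simplified stand-in, not a real fraud model: it assumes a single per-transaction amount threshold, and the function names and amounts are hypothetical. It shows an attacker splitting one large transfer into pieces that each stay below the threshold, and a simple aggregate check that would still catch the pattern.

```python
def per_transaction_alerts(amounts: list[int], threshold: int) -> list[int]:
    """Return the amounts that a naive per-transaction rule would flag."""
    return [amount for amount in amounts if amount >= threshold]


def split_below_threshold(total: int, threshold: int, margin: int) -> list[int]:
    """Adversarial test case: split `total` into pieces below `threshold`.

    `margin` is how far under the threshold each piece stays.
    """
    if margin <= 0 or margin >= threshold:
        raise ValueError("margin must be between 0 and threshold, exclusive")
    chunk = threshold - margin
    pieces = []
    remaining = total
    while remaining > 0:
        piece = min(chunk, remaining)
        pieces.append(piece)
        remaining -= piece
    return pieces


def aggregate_alert(amounts: list[int], threshold: int) -> bool:
    """Flag when the combined value in a review window reaches the threshold."""
    return sum(amounts) >= threshold


THRESHOLD = 10_000  # hypothetical alert threshold, in whole currency units

evasive = split_below_threshold(total=45_000, threshold=THRESHOLD, margin=1_000)
print("pieces:", evasive)
print("per-transaction alerts:", per_transaction_alerts(evasive, THRESHOLD))
print("aggregate alert:", aggregate_alert(evasive, THRESHOLD))
```

Production models use many more features than amount, so a real adversarial protocol would generate evasive cases against the model's actual inputs. The principle is the same: the test set is built by asking what an informed attacker would submit, not by sampling historical data.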
No formal remediation tracking. Red team findings have a short shelf life if remediation isn't tracked formally. Assign each finding an owner, a deadline, and a scheduled retest date. DORA requires documented evidence of remediation for threat-led penetration testing (TLPT) findings. Maintaining a clean Audit Trail of what was found, what was fixed, and when it was verified is both good practice and a regulatory expectation.
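A minimal sketch of such a tracking record follows, assuming a simple in-memory register. The field names, finding IDs, owners, and dates are all hypothetical; a real register would live in a ticketing or GRC system with its own schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Finding:
    finding_id: str
    summary: str
    owner: str
    remediation_due: date
    retest_due: date
    verified_on: Optional[date] = None  # set once the retest confirms the fix

    def is_open(self) -> bool:
        return self.verified_on is None


def overdue_findings(findings: list[Finding], today: date) -> list[Finding]:
    """Return open findings whose remediation deadline has passed."""
    return [f for f in findings if f.is_open() and today > f.remediation_due]


# Hypothetical register entries for illustration only.
register = [
    Finding("RT-001", "Help desk reset password without identity check",
            "Head of IT Service Desk", date(2025, 5, 1), date(2025, 6, 15)),
    Finding("RT-002", "Alert thresholds editable without dual approval",
            "MLRO", date(2025, 7, 15), date(2025, 8, 30)),
    Finding("RT-003", "Vendor VPN account lacked MFA",
            "Third-Party Risk Lead", date(2025, 4, 1), date(2025, 4, 20),
            verified_on=date(2025, 3, 20)),
]

overdue = overdue_findings(register, today=date(2025, 6, 1))
for finding in overdue:
    print(f"{finding.finding_id} overdue, owner: {finding.owner}")
```

Keeping the verification date on the same record as the finding is what makes the register usable as evidence: it shows what was found, who owned it, and when the fix was confirmed by retest.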
Related terms and concepts
Red teaming intersects with a cluster of security, resilience, and model risk concepts that compliance teams need to understand together.
Blue Team is the defender group in a red team exercise. The blue team operates normally during the exercise without knowing it's running. After the exercise, blue team members receive the full attack timeline and use it to improve detection rules, incident response playbooks, and threat hunting workflows. In some exercises, the blue team's detection performance is itself a formal finding.
Purple Team format brings red and blue teams together in real time. The red team executes an attack technique; the blue team attempts to detect and respond; both teams iterate and improve detection logic together. Purple Team exercises accelerate knowledge transfer but are less realistic for assessing actual detection capability than a blind red team exercise.
A Tabletop Exercise is a discussion-based simulation rather than a live attack. It tests the quality of decision-making under a simulated crisis scenario. It's a different tool from red teaming: a tabletop tells you whether your response process works; a red team tells you whether your defenses work.
Kill Switch in an AI context is the mechanism to halt an autonomous agent if its behavior deviates from acceptable bounds. Kill Switch reachability and effectiveness should be explicitly tested in any red team exercise targeting AI systems. If an attacker can disable or bypass the kill switch, the entire control assumption breaks.
Third-Party Risk Management (TPRM) intersects with red teaming because vendor access is one of the most common real-world initial access vectors. A red team that successfully exploits a vendor's legitimate credentials has found a TPRM failure, not just a security failure. Those findings should go directly into the vendor risk review cycle.
Human-in-the-Loop (HITL) controls in automated compliance workflows are increasingly red team targets. If a determined attacker can route around the human review step that acts as a control on automated fraud or AML decisions, the control is effectively absent. Testing that specifically requires the MLRO and CISO to design the scenario together.
Where does the term come from?
The term comes from Cold War US military practice. "Red" designated adversary forces (Soviet bloc) in war games. The NSA and Department of Defense used red teams to stress-test plans against adversarial assumptions.
In financial services, the term entered regulatory vocabulary through the Bank of England's CBEST framework in 2013, which made the UK one of the first jurisdictions to formalize intelligence-led red teaming for systemically important financial institutions. The European Central Bank's TIBER-EU framework (2018) standardized the methodology across EU member states. The definition shifted from scope-based IT penetration testing toward full-spectrum operational attack simulation, a shift codified in DORA (Regulation (EU) 2022/2554), which became applicable on January 17, 2025.
How FluxForce handles red team
FluxForce AI agents monitor red team-related patterns in real time, flag anomalies for analyst review, and generate evidence-backed decisions with full audit trails.