Technology · Dec 5, 2025 · 13 min read

Identity Recovery Must Live Outside the Blast Radius of a Cyber Attack

 

Conditional Access policies flipped.

MFA bypassed.

Logs silently truncated.

Admin roles reassigned.

That's what the incident response team found when they opened the tenant. The attacker had come through on-prem Active Directory, pivoted via Entra Connect, and was now sitting at the top of the Entra ID hierarchy. 

Within minutes the identity control plane belonged to someone else.

The team reached for their recovery plan: scripts, exports, backup tools, break-glass accounts. That's when they discovered something uncomfortable.

Every safety net they'd built lived in the same compromised tenant. 

Everything had failed in perfect symmetry.

Here's the thing: If your recovery lives inside the thing you're trying to recover, you're screwed. You don't have cyber resilience. You just have another copy of your corrupted data.

 

 

Symmetry vs. Resilience: Your Architecture Choice Determines Your Recovery

Security folks know this principle from cryptography: if your symmetric keys sit next to your encrypted data, you don't really have encryption. You have risk. Once an attacker compromises the system, they get both the ciphertext and the key.

The same logic applies to Entra ID. Most "backup" or "recovery" strategies (like scripts, exports, monitoring apps, and SaaS tools) get deployed inside the very tenant they're supposed to protect. When that tenant gets breached, the attacker inherits your production identity fabric AND your recovery plan too.

This is what I'd call symmetric recovery. You're depending on the compromised system to help you recover from its own compromise. Which sounds as crazy as it actually is.

But the failure isn't just technical or operational. It's evidentiary. Symmetric recovery also destroys the audit trail you need for compliance and investigation. When your recovery mechanisms live in the compromised tenant, you lose the ability to prove what happened. Audit logs get truncated or wiped. Configuration states become unreliable. The integrity of your forensic timeline disappears. You can't demonstrate to regulators, auditors, or your board what the attacker changed, when they changed it, or what the environment looked like before the breach.

Modern threat actors like Storm-0558 and Lapsus$ don't just steal data anymore. They target the control plane itself. They disable audit logs, weaken MFA enforcement, manipulate Conditional Access, and surgically eliminate privileged roles. If your recovery mechanisms live in that same environment, they're going to be seen, tampered with, or deleted. That's not theoretical. It's happened.

Real resilience requires asymmetry and tamper-evident evidence stored outside the blast radius.
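To make "tamper-evident" concrete, here is a minimal Python sketch of a hash-chained snapshot log. It's an illustration under stated assumptions, not any vendor's implementation: the function names (append_entry, verify_chain) and the payload fields are hypothetical. Each entry's hash covers the previous entry's hash, so editing or deleting an earlier snapshot breaks every hash that follows.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always produces the same hash.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> None:
    """Append a snapshot whose hash is bound to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev": prev_hash, "hash": _digest(prev_hash, payload)}
    )


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or missing entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        expected = _digest(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical snapshots of a single policy over time.
chain = []
append_entry(chain, {"policy": "require-mfa", "state": "enabled"})
append_entry(chain, {"policy": "require-mfa", "state": "disabled"})
print(verify_chain(chain))  # True

chain[0]["payload"]["state"] = "disabled"  # simulate rewriting history
print(verify_chain(chain))  # False
```

The chain alone isn't enough. Anyone with write access to the whole log can rebuild it from scratch, so the latest hash also has to be recorded somewhere the attacker can't reach. That is the asymmetry this section is arguing for.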

 

 

The Tenant-Wide Kill Switch Problem

To understand why this matters, you need to understand what identity has become in a hybrid enterprise.

For most organizations, Microsoft Entra ID stopped being "just the directory" years ago. Now it governs access to:

  • Microsoft 365, Teams, and Exchange

  • Azure subscriptions and resources

  • Third-party SaaS via SAML or OIDC

  • Conditional Access, MFA, authentication methods

  • Device trust, hybrid join, compliance policies

  • Security roles, app consents, privileged identities
     

Entra ID has become the business control plane. And the Global Admin role? That's the kill switch.

A compromised Global Admin can:

  • Disable all Conditional Access policies

  • Reassign or remove privileged roles

  • Consent malicious apps with broad API permissions

  • Modify federation and hybrid configurations

  • Wipe or disable audit logs

  • Lock out other admins or "break glass" accounts


Your backups don't matter if you can't reach them. Your scripts don't matter if the automation account gets disabled. Your logs don't matter if they're truncated or wiped before you can export them. Once an attacker has the tenant, they can cut every cord.

This isn't theory. In one simulation, red teamers compromised on-prem AD, synced a rogue user, hijacked an active Global Admin session, and granted themselves GA rights in Entra—all without detection. They accessed sensitive HR, payroll, and IP data within hours. Their endpoint showed as healthy. EDR never saw them.

That's what tenant-wide compromise looks like. Fast, quiet, and total.

 

Three Patterns of False Comfort

Faced with this risk, security teams tend to lean on "backup plans" that feel safe. But they collapse under actual compromise.

1. The Snapshot Mirage

"We export configs and scripts regularly."

These exports—PowerShell scripts, JSON configs, Graph API calls—lack any real integrity guarantees. They drift over time, miss critical relationships, and often get stored in the same cloud environment. A Global Admin attacker can modify them. Or worse, falsify them. There's no way to prove what's real anymore.
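For contrast, here is a rough Python sketch of what an integrity guarantee on an export could look like: the export is signed with an HMAC whose key never enters the production tenant. The function names and the sample export are hypothetical, and an asymmetric signature would be the stronger choice in practice, since the verifier then wouldn't need to hold the secret.

```python
import hashlib
import hmac
import json


def sign_export(export: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON form of the export."""
    canonical = json.dumps(export, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_export(export: dict, signature: str, key: bytes) -> bool:
    """Constant-time check that the export still matches its signature."""
    return hmac.compare_digest(sign_export(export, key), signature)


# Assumption: this key is stored outside the production tenant.
key = b"example-key-held-outside-the-tenant"
export = {"policy": "block-legacy-auth", "state": "enabled"}
signature = sign_export(export, key)

print(verify_export(export, signature, key))  # True

export["state"] = "disabled"  # attacker edits the stored export
print(verify_export(export, signature, key))  # False
```

If the key sits in the same tenant as the export, a Global Admin attacker can simply re-sign the falsified file, and you are back to symmetric recovery.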

2. The In-Tenant Trap

"We have a monitoring or backup app in the tenant."

That app registration, service principal, or automation account lives under the same Conditional Access and RBAC model as everything else. A Global Admin can disable it, consent new scopes, or just shut it down entirely.

3. The Memory Palace Fallacy

"We'll rebuild from runbooks or institutional knowledge."

You cannot reconstruct the precise state of Conditional Access policy ordering, authentication method enforcement, app consents, and role assignments from memory. Not at 3 a.m. Not during a board-facing incident. Not ever.

Each of these approaches gives you the illusion of preparedness—right up until the moment you actually need it.

 

The Hidden Tenant: Resilience by Design

More mature security teams are adopting a different architecture: a hidden, isolated tenant dedicated to identity recovery.

It's an architectural design principle that insists on a separate trust boundary, separate control plane, and separate blast radius.

Key characteristics of this architecture include:

  • Logically separate Entra tenant: The instance is managed independently of production
     

  • Minimal, delegated, read-only roles: These must live in the production tenant—never Global Admin
     

  • No long-lived secrets: Access only via app registrations with strict scopes and modern auth
     

  • Immutable, tamper-evident snapshots: Must be stored per-directory and outside the production trust domain
     

  • Isolated Roles: No Conditional Access policies, roles, or storage shared with production
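One of these characteristics, read-only delegation, is easy to spot-check. The Python sketch below flags granted permissions whose names suggest write access. It's a naming heuristic only: the marker list is an assumption and not exhaustive, and the sample permission names follow Microsoft Graph's naming pattern but should be checked against the current permissions reference.

```python
# Assumed markers: name segments that usually indicate more than read access.
RISKY_SEGMENTS = {"ReadWrite", "Write", "FullControl", "Manage", "AccessAsUser"}


def write_capable(permissions):
    """Return the permissions that contain a write-like name segment."""
    flagged = []
    for permission in permissions:
        # Compare whole dot-separated segments so that names such as
        # "RoleManagement.Read.Directory" are not flagged by accident.
        if RISKY_SEGMENTS & set(permission.split(".")):
            flagged.append(permission)
    return sorted(flagged)


# Hypothetical grant list for a recovery app registration.
granted = [
    "Policy.Read.All",
    "AuditLog.Read.All",
    "RoleManagement.Read.Directory",
    "Directory.ReadWrite.All",
]
print(write_capable(granted))  # ['Directory.ReadWrite.All']
```

An empty result doesn't prove the grant is minimal. It only catches the obvious cases before a human reviews the full list.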
     

This architecture lets you do something critical during a breach: compare and restore at the configuration level.

This way, you're not rebuilding the entire house. You're not performing a full-tenant restore that takes weeks and disrupts every user.

Instead, you're identifying exactly what the attacker changed: which Conditional Access policies were flipped, which roles were reassigned, which app consents were added, which authentication methods were weakened.

Now, you can surgically roll back just those changes.

You can see what your tenant looked like 24 hours ago, before the breach. You can diff that baseline against the current compromised state. And then you can restore individual policies, roles, and configurations without touching anything else. 
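The compare-then-restore idea can be sketched in a few lines of Python. This is a simplified illustration, not a real recovery tool: snapshots are assumed to be plain dictionaries keyed by object ID, the policy names and settings are made up, and the rollback plan is just a list of intended actions rather than calls to any API.

```python
def diff_config(baseline: dict, current: dict) -> dict:
    """Compare two {object_id: settings} snapshots."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    changed = {
        k: {"before": baseline[k], "after": current[k]}
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def rollback_plan(diff: dict) -> list:
    """Turn a diff into (action, object_id, settings) steps."""
    plan = [("delete", k, None) for k in sorted(diff["added"])]
    plan += [("recreate", k, diff["removed"][k]) for k in sorted(diff["removed"])]
    plan += [("revert", k, diff["changed"][k]["before"]) for k in sorted(diff["changed"])]
    return plan


# Hypothetical snapshots: 24 hours ago vs. the compromised state.
baseline = {
    "ca-require-mfa": {"state": "enabled"},
    "ca-block-legacy-auth": {"state": "enabled"},
}
current = {
    "ca-require-mfa": {"state": "disabled"},   # flipped by the attacker
    "ca-block-legacy-auth": {"state": "enabled"},
    "ca-attacker-bypass": {"state": "enabled"},  # added by the attacker
}

for step in rollback_plan(diff_config(baseline, current)):
    print(step)
# ('delete', 'ca-attacker-bypass', None)
# ('revert', 'ca-require-mfa', {'state': 'enabled'})
```

The untouched policy never appears in the plan, which is the point: only what the attacker changed gets rolled back. The sketch is only as trustworthy as the baseline, so it assumes that snapshot was stored outside the compromised tenant.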

You're replacing the locks the intruder tampered with, not tearing down the whole building. That's what reduces recovery time from weeks to hours.

 

 

Operational Isolation, Not Operational Burden

CISOs usually ask: "Wait, are we now managing two tenants?"

No. The vendor should operate and secure the recovery tenant. Your team shouldn't be managing a second security environment.

The right implementation uses:

  • Scoped app permissions: You delegate limited access from production
     

  • Per-directory resource isolation: Your recovery data isn't co-mingled with anyone else's
     

  • Built-in automation: You don't write or maintain scripts
     

This isn't added complexity. It's containment. And it actually reduces operational burden during recovery.

 

Breach Timeline: Before vs. After Hidden Tenant

Let's walk through the same incident with and without a hidden tenant.

 

| Event | Without Hidden Tenant | With Hidden Tenant |
| --- | --- | --- |
| GA account compromised | Attacker flips CA, disables monitoring | Attacker flips CA, but can't access hidden tenant |
| Logs | Audit logs wiped or truncated | External log snapshots preserved |
| Scripts & backups | SPN disabled, backups deleted | Snapshots exist in immutable, out-of-band storage |
| Admin response | Break-glass account also compromised | Recovery console operates from separate plane |
| Time to Recovery | 3–6 weeks manual rebuild | Hours to verify, diff, and restore |
| Compliance exposure | No verifiable state | Tamper-evident deltas and audit trail |
| Executive impact | Board-level crisis | Contained identity incident |

 

The difference isn't subtle. It's existential.

 

The Four Non-Negotiables of Identity Recovery Architecture

When you're evaluating your own approach, start with these four tests:

  1. Does your recovery plane live in a separate control plane? If it lives in your tenant, it's part of your risk—not your resilience.

  2. Are recovery permissions scoped and minimal? Global Admin access isn't a safety net. It's a liability. Use delegated roles only.

  3. Is recovery data immutable and segregated? Tamper-evident snapshots should be stored outside the tenant, segmented per directory.

  4. Can you surgically compare and restore configuration state? Not just users or objects, but Conditional Access policies, role assignments, consents, authentication methods, and policy ordering.

If the answer to any of these is "No," your recovery plan might not survive the breach it's meant to fix.

 

Resilience Is a Control Plane Decision

The most important choice you make about Entra ID recovery isn't what you back up. It's where your backup lives.

Organizations that recover in hours do so because they've separated trust boundaries, built asymmetric control paths, and practiced surgical rollback. Without that architecture, you're left rebuilding manually, under the scrutiny of regulators, customers, and boards.

As hybrid identity becomes the backbone of cloud and SaaS access, we can't afford recovery strategies that assume the control plane will stay safe. We have to assume breach.

When you're evaluating vendors, don't ask them if they "back up" Entra ID. Ask them to demonstrate the asymmetric architecture of their recovery plane. If their control plane lives in your tenant, they're part of your risk, not your resilience.

Next Step: See What Identity Recovery in Hours Actually Looks Like

Most recovery plans assume the control plane survives the breach. But as you’ve seen, compromise at the tenant level collapses both your protection and your recovery unless you’ve separated the two.

Want to see what true isolation looks like in action?

Download How to Get Business Back on Track in Hours, Not Weeks

Learn how security-forward organizations reduce hybrid identity RTO by up to 86% with logically air-gapped recovery, immutable snapshots, and automated rebuild of AD + Entra ID.
