AZ-305 Learning Portal
Objective 3.1 | 45 min | High priority
Tags: azure-backup, site-recovery, recovery-services-vault, rto, rpo, soft-delete, immutable-vault, recovery-plans

3.1 — Design Solutions for Backup and Disaster Recovery

Design backup and disaster recovery solutions using Azure Backup, Azure Site Recovery, and Recovery Services Vault to meet specific RTO and RPO requirements for VMs, databases, and multi-tier applications.

Concept — What & Why

Backup and Disaster Recovery Fundamentals

Recovery Time Objective (RTO): The maximum acceptable downtime, that is, how long the business can tolerate service unavailability. Lower RTO requires faster failover mechanisms (Site Recovery, active-active) and costs more. Achievable via replication (minutes) or restore-from-backup (hours to days).

Recovery Point Objective (RPO): The maximum acceptable data loss measured in time, that is, how much data the business can afford to lose. Lower RPO requires more frequent snapshots or continuous replication. Site Recovery creates crash-consistent recovery points every 5 minutes and application-consistent points every 1–24 hours.

Azure Backup: A managed backup service for VMs, SQL Server databases, on-premises servers, file shares, and blobs. Provides incremental backups, instant restore snapshots (1–5 day retention), soft delete (14–180 days), immutable vault, and cross-region restore (CRR) for multi-region compliance.

Azure Site Recovery: Orchestrates replication and failover for VMs (Azure-to-Azure, on-premises to Azure, VMware, Hyper-V, physical). Provides crash-consistent recovery points every 5 minutes and application-consistent points every 1–24 hours. Non-disruptive test failover validates DR without affecting production.

Recovery Services Vault: The central management resource for both Azure Backup and Site Recovery. Stores backup data and replication metadata, manages retention policies, provides soft delete protection, and supports immutable vault configuration for compliance.

Backup vs. Replication: When to Use Which

Approach | RTO | RPO | Cost | Best For
Backup only | Hours–Days | Hours (daily) | Low | Non-critical, compliance archiving
Site Recovery replication | Minutes (less than 1 hour SLA) | 5 minutes (crash-consistent) | Medium | Mission-critical workloads
Active-active + replication | Seconds | Zero (sync) | High | Highest-criticality, zero tolerance
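
The table above can be read as a simple selection rule: pick the cheapest approach whose RTO and RPO both fit the requirement. The sketch below is only an illustration of that reasoning. The numeric thresholds (240, 1440, 60, 5, 1, 0 minutes) are assumed readings of the table's qualitative values, and the names `Approach`, `APPROACHES`, and `cheapest_approach` are hypothetical, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approach:
    name: str
    rto_minutes: float   # assumed RTO this approach can meet
    rpo_minutes: float   # assumed RPO this approach can meet
    relative_cost: int   # 1 = low, 2 = medium, 3 = high


# Illustrative numbers only; the table gives qualitative ranges.
APPROACHES = [
    Approach("Backup only", rto_minutes=240, rpo_minutes=1440, relative_cost=1),
    Approach("Site Recovery replication", rto_minutes=60, rpo_minutes=5, relative_cost=2),
    Approach("Active-active + replication", rto_minutes=1, rpo_minutes=0, relative_cost=3),
]


def cheapest_approach(required_rto_minutes: float, required_rpo_minutes: float) -> Approach:
    """Return the lowest-cost approach that satisfies both objectives."""
    candidates = [
        a for a in APPROACHES
        if a.rto_minutes <= required_rto_minutes
        and a.rpo_minutes <= required_rpo_minutes
    ]
    if not candidates:
        raise ValueError("No listed approach meets the stated RTO/RPO")
    return min(candidates, key=lambda a: a.relative_cost)


# A 1-hour RTO with a 15-minute RPO rules out backup-only under these assumptions.
print(cheapest_approach(60, 15).name)
```

The point for exam scenarios is the direction of the trade-off: tightening either objective removes the cheaper options first.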

Recovery Point Types

Type | Interval | Best For
Crash-consistent | Every 5 minutes (automatic) | Stateless apps, filesystems
Application-consistent | Every 1–24 hours | Databases (SQL, SAP HANA)
Instant restore snapshots | On backup schedule | Fast local VM restore
Deep Dive — How It Works

DR Architecture Patterns

Recovery Plan Design for Multi-Tier Applications

Recovery plans support up to 7 groups that fail over sequentially. The database tier must be online before the app tier connects, and the app tier must be healthy before the web tier receives traffic. Pre- and post-actions between groups can run Azure Automation runbooks (for example, validation scripts) or pause for manual actions.
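
The ordering logic can be sketched in a few lines of Python. This is a conceptual model only, not the Site Recovery API: `run_recovery_plan`, `EXAMPLE_PLAN`, and the `fail_over` and `is_healthy` callbacks are hypothetical names standing in for the real failover and for whatever health check a runbook would perform between groups.

```python
from typing import Callable, List, Tuple

MAX_GROUPS = 7  # recovery plans support up to 7 groups

# Hypothetical three-tier plan: groups are listed in the order they must start.
EXAMPLE_PLAN: List[Tuple[str, List[str]]] = [
    ("Database", ["sql-01"]),
    ("Application", ["app-01", "app-02"]),
    ("Web", ["web-01"]),
]


def run_recovery_plan(
    groups: List[Tuple[str, List[str]]],
    fail_over: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Fail over groups in order, validating each group before the next starts."""
    if len(groups) > MAX_GROUPS:
        raise ValueError(f"A recovery plan supports at most {MAX_GROUPS} groups")
    completed = []
    for name, vms in groups:
        for vm in vms:
            fail_over(vm)
        # Stand-in for a post-action runbook: stop if the tier is not healthy.
        if not all(is_healthy(vm) for vm in vms):
            raise RuntimeError(f"Group {name!r} failed validation; later groups not started")
        completed.append(name)
    return completed


started: List[str] = []
print(run_recovery_plan(EXAMPLE_PLAN, fail_over=started.append, is_healthy=lambda vm: True))
print(started)
```

If the application group fails validation, the web group is never started, which is the behavior the group ordering is meant to guarantee.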

Azure Backup Key Features

Feature | Description | Design Implication
Instant Restore | VM snapshots (1–5 days) retained locally | Fastest restore; separate from vault backup
Soft Delete | 14–180 day recovery window for deleted backups | Protect against accidental deletion
Immutable Vault | Lock vault to prevent modification or deletion | Compliance (ransomware protection)
Cross-Region Restore (CRR) | Restore to secondary region | Regional DR; must be explicitly enabled; increases cost
Multi-Tier Retention | Daily/weekly/monthly/yearly policies | Compliance archiving (7-year, 10-year)

CRR Important Note: Cross-Region Restore must be explicitly enabled on the vault — it is NOT enabled by default. Enabling CRR increases vault storage cost because backup data is replicated to the secondary region.

Site Recovery: Extensions Are NOT Replicated

Site Recovery replicates VM disk data but does NOT replicate VM extensions (SQL IaaS Extension, monitoring agents, antivirus, custom script extensions). These must be installed post-failover via:

  • Runbooks in the recovery plan (automated)
  • Manual installation scripts
  • Azure Policy DeployIfNotExists on the target resource group

Cost-Optimized Compliance Backup Strategy

For 7-year retention compliance with minimal cost:

  • Daily retention: 30 days (short-term, interactive)
  • Weekly retention: 52 weeks (1 year)
  • Monthly retention: 12 months
  • Yearly retention (archive): 7 years at archive tier pricing (significantly cheaper)

This tier-based approach avoids paying standard (interactive) tier prices for recovery points that are read only rarely, for example once a year for audits.
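
A quick count shows why the tiered policy is cheaper to hold than keeping every daily backup for seven years. The arithmetic below counts recovery points only. It is an upper bound (one recovery point can satisfy several tiers at once) and says nothing about actual storage, since Azure Backup stores incrementals. The function name `max_retained_points` is hypothetical.

```python
def max_retained_points(daily_days: int, weekly_weeks: int,
                        monthly_months: int, yearly_years: int) -> int:
    """Upper bound on recovery points held at once under a tiered policy."""
    return daily_days + weekly_weeks + monthly_months + yearly_years


# Policy from the list above: 30 daily, 52 weekly, 12 monthly, 7 yearly.
tiered = max_retained_points(30, 52, 12, 7)

# Naive alternative: keep every daily recovery point for 7 years (leap days ignored).
daily_only = 7 * 365

print(tiered, daily_only)  # 101 2555
```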

Hands-On Lab

Hands-On: Configure Azure Backup and Site Recovery

Step 1: Create Recovery Services Vault

  1. Navigate to Recovery Services vaults > Create
  2. Configure name, subscription, resource group, region
  3. Review + create

Step 2: Configure Vault Settings

  1. Open vault > Settings > Backup Configuration:
    • Storage redundancy: Geo-Redundant (GRS) for DR capability
    • Enable Soft Delete (configure 14–180 days)
    • Enable Immutable vault for ransomware protection
  2. Enable Cross-Region Restore if multi-region restore is required

Step 3: Enable VM Backup

  1. Open vault > Backup > Azure > Virtual machine
  2. Select VMs to protect
  3. Create backup policy:
    • Frequency: Daily
    • Retention: 30 days daily, 52 weeks weekly, 12 months monthly, 7 years yearly
    • Instant restore: 3 days
  4. Click Enable Backup. The initial backup runs at the next scheduled time; use Backup now to trigger it immediately

Step 4: Configure Site Recovery for Azure VM

  1. Open vault > Site Recovery > Prepare infrastructure
  2. Source region: e.g., East US; Target region: e.g., West US
  3. Configure replication policy:
    • Recovery point retention: 24 hours
    • App-consistent snapshots: Every 4 hours (requires VSS enabled on Windows VMs)
  4. Select VMs and enable replication
  5. Create Recovery Plan:
    • Group 1: Database VMs
    • Group 2: Application VMs
    • Group 3: Web VMs
    • Add runbooks between groups for health validation

Step 5: Run Test Failover (Non-Disruptive)

  1. Open Recovery Plan > Test failover
  2. Choose: Latest processed recovery point (fastest failover)
  3. Select target virtual network
  4. Click OK — Azure creates test VMs in target region without interrupting production
  5. Validate: Test connectivity, application functionality
  6. Click Cleanup test failover to remove test VMs
Exam Angle — What AZ-305 Tests

AZ-305 Exam Focus

AZ-305 tests your ability to design backup + DR solutions that meet stated RTO and RPO requirements. The exam frequently tests understanding of the difference between backup and replication, soft delete vs. immutable vault, and recovery point selection decisions.

Exam Trap

Backups Equal Disaster Recovery: Backups protect against data loss but do NOT provide fast failover. Restoring a VM from backup takes hours. Replication-based approaches (Site Recovery) achieve RTO in under 1 hour. If a scenario has tight RTO requirements (30 minutes, 1 hour), backup alone is insufficient.

Exam Trap

Site Recovery Replicates All VM Configuration: Site Recovery does NOT replicate VM extensions (SQL IaaS Extension, monitoring agents, antivirus). These must be installed post-failover via runbooks in the recovery plan. Forgetting this causes services to fail to start after failover.

Exam Trap

CRR Is Default: Cross-Region Restore is NOT enabled by default on Recovery Services vaults. It must be explicitly enabled and increases storage cost. Enable CRR only for mission-critical workloads requiring regional failover capability from backup data.

Exam Trap

Soft Delete Prevents All Loss: Soft delete retains deleted backups for 14–180 days but does NOT prevent permanent deletion after the window expires. Immutable vault adds a time-based lock that prevents modification or deletion during a configured retention period — required for ransomware protection scenarios.
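
The soft delete window is just date arithmetic, which makes the limitation easy to see. The sketch below is a simplified model, not vault behavior taken from the Azure API: the helper names are hypothetical, and treating the last day of the window as still recoverable is an assumption.

```python
from datetime import date, timedelta


def recoverable_until(deleted_on: date, soft_delete_days: int) -> date:
    """Last day a soft-deleted backup can still be recovered (assumed inclusive)."""
    if not 14 <= soft_delete_days <= 180:
        raise ValueError("Soft delete retention must be between 14 and 180 days")
    return deleted_on + timedelta(days=soft_delete_days)


def is_recoverable(deleted_on: date, today: date, soft_delete_days: int) -> bool:
    """True while the deleted backup is still inside the soft delete window."""
    return today <= recoverable_until(deleted_on, soft_delete_days)


deleted = date(2024, 1, 1)
print(recoverable_until(deleted, 14))               # 2024-01-15
print(is_recoverable(deleted, date(2024, 1, 16), 14))  # False: window has expired
```

Once the window has passed the data is gone, which is why scenarios that require guaranteed retention point to immutable vault rather than soft delete alone.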

Exam Tip

Recovery Point Selection for Failover: "Latest" recovery point has lowest RPO but higher RTO (must process pending replication data). "Latest processed" has slightly higher RPO but significantly lower RTO (uses already-processed data). In unplanned outages where minimizing downtime is critical, "Latest processed" is the better choice.
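
The same trade-off written as a small decision helper. This is a study aid, not a Site Recovery API call; `RecoveryPoint` and `choose_recovery_point` are hypothetical names, and only the two options discussed above are modeled.

```python
from enum import Enum


class RecoveryPoint(Enum):
    LATEST = "Latest"                      # lowest RPO, higher RTO
    LATEST_PROCESSED = "Latest processed"  # slightly higher RPO, lower RTO


def choose_recovery_point(priority: str) -> RecoveryPoint:
    """Pick a recovery point type from the objective that matters most."""
    if priority == "rpo":
        # Minimize data loss: pending replication data is processed first.
        return RecoveryPoint.LATEST
    if priority == "rto":
        # Minimize downtime: use data that has already been processed.
        return RecoveryPoint.LATEST_PROCESSED
    raise ValueError("priority must be 'rpo' or 'rto'")


print(choose_recovery_point("rto").value)  # Latest processed
```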

Must Memorize

App-Consistent vs. Crash-Consistent: Use app-consistent snapshots for databases (SQL Server, SAP HANA) to ensure a consistent database state. Use crash-consistent recovery points for stateless VMs and filesystems. App-consistent snapshots use VSS on Windows (VSS must be enabled) and can be taken at most once per hour. Crash-consistent recovery points are created automatically every 5 minutes.

Q: What is the difference between RTO and RPO?

Q: When is Azure Backup alone insufficient for disaster recovery?

Q: What is the difference between soft delete and immutable vault in Recovery Services?

Q: In a Site Recovery recovery plan, why is the database group placed first?

Q: What is Site Recovery test failover and why is it important?

Q: Why does Site Recovery require runbooks or manual steps to install VM extensions after failover?

Sources & Further Reading