Snapshot Management in VMware vCenter Site Recovery Manager

24Mar14

One of the differences between 7-Mode and clustered Data ONTAP (cDOT), as it pertains to VMware vCenter Site Recovery Manager (SRM), is the array-based replication (ABR) using NetApp SnapMirror®. Before I dive into the different SRM failover scenarios and try to make sense of which SnapMirror® snapshots are present, I think a good place to start is the naming scheme for NetApp Snapshot® names.

What’s in a Name?

Snapshot names are broken into the following 4 parts.

snapmirror.dst_vsuuid_dstmsid.timestamp

  1. All SnapMirror® snapshots start with the word snapmirror
  2. Universally Unique IDentifier (UUID) of the destination Storage Virtual Machine (SVM) or Vserver (if you run ‘vserver show -fields uuid,name’ on the destination side you will see the UUID)
  3. Master data Set ID (MSID) of the destination volume (if you run ‘volume show -fields msid,name’ on the destination side you will see the MSID)
  4. GMT timestamp: matches the time at which the schedule was triggered on the destination. The snapshot is created a few seconds after this timestamp
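
To make the four parts concrete, here is a minimal Python sketch that splits a snapshot name into its fields. The example name, its UUID and its MSID are made up for illustration, and the timestamp layout (YYYY-MM-DD_HHMMSS) is an assumption on my part, so check it against the names on your own system before relying on it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SnapMirrorSnapshot:
    dst_vserver_uuid: str   # part 2: UUID of the destination SVM/Vserver
    dst_msid: int           # part 3: MSID of the destination volume
    created_gmt: datetime   # part 4: GMT timestamp


def parse_snapmirror_name(name: str) -> SnapMirrorSnapshot:
    """Split snapmirror.<uuid>_<msid>.<timestamp> into its fields."""
    prefix, ids, stamp = name.split(".")
    if prefix != "snapmirror":          # part 1: fixed prefix
        raise ValueError(f"not a SnapMirror snapshot: {name!r}")
    uuid, msid = ids.rsplit("_", 1)
    # ASSUMPTION: the timestamp looks like 2014-03-17_194310.
    created = datetime.strptime(stamp, "%Y-%m-%d_%H%M%S")
    return SnapMirrorSnapshot(uuid, int(msid), created.replace(tzinfo=timezone.utc))


# Hypothetical name; the UUID and MSID are placeholders, not from my lab.
example = ("snapmirror.01234567-89ab-cdef-0123-456789abcdef"
           "_2147484688.2014-03-17_194310")
snap = parse_snapmirror_name(example)
print(snap.dst_vserver_uuid, snap.dst_msid, snap.created_gmt.isoformat())
```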


How many Snapshots?

SRM will always maintain 1 Snapshot copy at the source and 2 at the destination. Older snapshots are automatically deleted. The following is an analysis of the snapshots on my SRM Protected and Recovery sites as I perform the following operations:

  • Recovery (Planned Migration)
  • Reprotect
  • A manual SnapMirror® update
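
The "keep the newest, delete the rest" behaviour can be pictured with a small Python sketch. This only models what I observed; the storage system does the actual purging, and the function name, the example names and the assumed timestamp layout (YYYY-MM-DD_HHMMSS) are illustrative rather than anything from NetApp or VMware.

```python
def split_by_retention(names, keep):
    """Return (kept, purged): the `keep` newest names, then the rest.

    ASSUMPTION: every name ends in a YYYY-MM-DD_HHMMSS timestamp, which
    sorts chronologically when compared as plain text.
    """
    ordered = sorted(names, key=lambda n: n.rsplit(".", 1)[1], reverse=True)
    return ordered[:keep], ordered[keep:]


# Hypothetical destination-side names (placeholder UUID and MSID).
base = "snapmirror.01234567-89ab-cdef-0123-456789abcdef_2147484688."
destination = [base + "2014-03-17_194310",
               base + "2014-03-17_194535",
               base + "2014-03-17_194553"]

kept, purged = split_by_retention(destination, keep=2)   # 2 at the destination
print("kept:  ", kept)
print("purged:", purged)
```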

Scenario 1: Protected

In this scenario, Site 1 is the Protected site and the SRM Recovery Site is Site 2. Take note of the following:

  • Site 1 has 1 snapshot and Site 2 has 2 snapshots
  • Site 1 is the Protected site
  • The orange text shows the time-stamp of the most current SnapMirror® (Monday, 17 March, 19:43:10)
  • The green arrow indicates the SnapMirrored volume is active and replicated

Scenario 1 – SRM protected site (site 1) replicating changes to Recovery Site (Site 2)

Scenario 2: Recovery Operation

In this scenario, an SRM Recovery Operation (Planned Migration) was performed. SRM performs the following tasks:

  1. Replicate VMs from Site 1 –> Site 2
  2. Gracefully shut down VMs in Site 1
  3. Replicate VMs from Site 1 –> Site 2 a second time to get a snapshot of the VMs in a powered off state
  4. Break the replication from Site 1 –> Site 2 making the storage in Site 2 writable
  5. Power on protected VMs in Site 2
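
The five tasks above can be sketched as a toy Python model. It is purely illustrative: the class, the function and the state strings are names I made up to mirror the list, and nothing here talks to SRM or to the storage.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    source: str
    destination: str
    state: str = "snapmirrored"          # illustrative state label
    log: list = field(default_factory=list)


def planned_migration(rel: Relationship) -> Relationship:
    """Walk through the five Recovery (Planned Migration) tasks in order."""
    rel.log.append(f"replicate {rel.source} -> {rel.destination}")
    rel.log.append(f"shut down VMs at {rel.source}")
    rel.log.append(f"replicate {rel.source} -> {rel.destination} (VMs powered off)")
    rel.state = "broken-off"             # destination storage becomes writable
    rel.log.append(f"break replication; {rel.destination} storage is writable")
    rel.log.append(f"power on VMs at {rel.destination}")
    return rel


rel = planned_migration(Relationship(source="Site 1", destination="Site 2"))
for step, entry in enumerate(rel.log, start=1):
    print(step, entry)
print("state:", rel.state)
```

Note that the direction of the relationship does not change here; that only happens with a Reprotect.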

Take note of the following after the Recovery Operation:

  • Site 1 has 1 snapshot and Site 2 has 2 snapshots
  • After the Recovery Operation, Site 1 is still the protected site
  • The orange text shows the time-stamp of the most current SnapMirror® (Monday, 17 March, 19:45:35)
  • The red arrow indicates, as expected, the Recovery Operation has broken the replication between Site 1 –> Site 2 and made Site 2 the active site. (Until a Reprotect Operation is run, changes at Site 2 will not be replicated to Site 1)

Scenario 2 – A Recovery Operation breaks the SnapMirror and makes Site 2 storage writable.

Scenario 3: Reprotect

In this scenario, users have been failed over from Site 1 –> Site 2 and are making changes in their current production environment (Site 2). To protect these changes, a Reprotect Operation is initiated. This changes the direction of the SnapMirror and replicates from Site 2 –> Site 1. Take note of the following:

  • After a Reprotect Operation, Site 2 is now the Protected Site. This remains the case until an additional Recovery Operation is run
  • The orange text shows the time-stamp of the most current SnapMirror® (Monday, 17 March, 19:45:53)
  • The green arrow indicates that changes are being replicated from Site 2 –> Site 1
  • The Recovery site (Site 1) now has 3 snapshots. (Wait, I thought you said SRM only keeps 2 snapshots in the Recovery Site. Why are there 3?) I’m glad you asked.
  • Remember, I said earlier that the first set of characters in the snapshot name (in purple) identifies the UUID of the destination Vserver. This snapshot is left over from when Site 1 was the protected site, which means that 82164144-5c5a-11e3-b603 is the UUID of Site 2. The very next time a manual or scheduled SnapMirror® update is run, this snapshot will automatically be purged and only two will remain
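
One way to spot such a leftover is to compare the UUID embedded in each snapshot name with the UUID of the current destination Vserver. A minimal Python sketch, assuming the name layout described earlier; the UUIDs, MSIDs and timestamp layout below are placeholders, not the values from my lab.

```python
def stale_snapshots(names, current_dst_uuid):
    """Return names whose embedded Vserver UUID is not the current destination's."""
    stale = []
    for name in names:
        ids = name.split(".")[1]            # "<uuid>_<msid>"
        uuid = ids.rsplit("_", 1)[0]
        if uuid != current_dst_uuid:
            stale.append(name)
    return stale


# Hypothetical UUIDs for the two Vservers.
SITE1_UUID = "aaaaaaaa-0000-0000-0000-000000000001"
SITE2_UUID = "bbbbbbbb-0000-0000-0000-000000000002"

# After Reprotect, Site 1 is the destination, so new names carry its UUID.
site1_snapshots = [
    f"snapmirror.{SITE2_UUID}_2147484688.2014-03-17_194535",  # from the old direction
    f"snapmirror.{SITE1_UUID}_2147484700.2014-03-17_194553",
    f"snapmirror.{SITE1_UUID}_2147484700.2014-03-17_194600",
]

print(stale_snapshots(site1_snapshots, current_dst_uuid=SITE1_UUID))
```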


Scenario 4: SnapMirror® Update

In this last scenario, I ran a SnapMirror® update from my destination (Site 1).

Command: cluster1::snapmirror> update -destination-path VS1:nyds0

Take note of the following:

  • SRM cleaned up my snapshots and only kept the two most recent
  • The green arrow still shows I am replicating from Site 1 <– Site 2.


If and when I decide I want to return my VMs to their original site (Site 1), I can run the same Recovery Operation as in Scenario 2.
