3q03-hou
7/31/2019 3q03-Hou
Storage area networks (SANs) offer an effective means
of storing and sharing data. As the amount of data
stored on a SAN increases, however, backup windows
lengthen and disaster recovery requires more time. EMC
SnapView 2.1 and MirrorView 1.7 storage management
software can help facilitate efficient backups and disaster
recovery in Dell | EMC SAN environments.
To demonstrate how IT administrators can improve SAN
management in a typical data center environment using
EMC storage management software, this article presents a
scenario using a fictional company called Acme. In this
scenario, Acme has a local data center that uses a Dell | EMC
SAN for consolidated, redundant, high-availability storage.
In addition, the company has heterogeneous servers shar-
ing the storage and tape library.
To protect the company's valuable business data, the
Acme IT department implemented a disaster recovery
plan. The plan involved deploying a fully redundant
Dell | EMC SAN, which allows Acme to achieve high
availability at the hardware level. To prevent failures at the
operating system (OS) and application levels, the IT department
deployed the company's main application in a Microsoft
Cluster Service (MSCS) environment. Every application
server that is connected to the SAN has two host bus
adapter (HBA) cards to provide redundancy and increase
bandwidth. Acme connected a tape library to the SAN for
backup and restore; after the backup tapes have been verified
at the production site, they are stored at a remote site.
The remote site also acts as a disaster recovery site for
the primary (production) site, and the IT department
established a secondary SAN for the applications running
at the remote location.
Currently, Acme faces three business challenges:
• Increasing backup window: As the database grows, the
  backup window will soon exceed the time available for
  the daily backups.
• Lengthy time to recovery: The length of time required
  for disaster recovery, referred to as mean time to
  recovery (MTTR), is growing because larger databases
  require longer restore times. The company's main
  application is inaccessible during restoration.
• Overhead on the production environment: Acme uses its
  production database to perform application development
  work; however, developer access to the production
  database creates overhead on the database engine.
  Performing online backups also incurs overhead that
  affects the performance of the production servers.
POWER SOLUTIONS August 2003
STORAGE ENVIRONMENT
Data Replication and Recovery Using
EMC SnapView and MirrorView
BY RICHARD HOU, STEVE FEIBUS, AND PATTY YOUNG
EMC SnapView 2.1 and MirrorView 1.7 software help administrators protect and
recover their data in Dell | EMC storage area networks (SANs). This article explains the
differences between these applications, demonstrating how they can be used to achieve
high availability in a disaster recovery environment, reduce backup windows, and
remove the processing overhead for backups from production servers.
To resolve these issues, Acme decided to update its disaster
recovery plan. The Acme IT department connected the primary-site
SAN with the secondary-site SAN, as shown in Figure 1. The
company plans to use SnapView snapshots and clones to create
replicas for online backups and for development use. Acme will use
MirrorView software across the Dell | EMC SANs to create remote
copies for disaster recovery. Using SnapView and MirrorView will
also enable Acme to create a plan for recovery at the file, logical
unit number (LUN), and array levels, as well as to complete online
backups without affecting the production environment.
Creating snapshots and clones for backups using SnapView
SnapView 2.1 creates either a virtual point-in-time copy (snapshot)
of the original data or a full, physical point-in-time copy (clone) of
the original data. Currently, SnapView is supported on Dell | EMC
FC4700-2, CX400, and CX600 storage arrays as a nondisruptive
upgrade, meaning that the software can be added at any time with-
out disturbing the production environment.
Snapshots depend on source LUN
At Acme, a long backup window consumes resources from the data-
base engine and creates overhead on the production environment.
The necessity for Acme development engineers to access the production
database for development work contributes to these two problems.
SnapView can help resolve both of these business issues.
Using SnapView, administrators can create up to eight point-
in-time snapshots of a LUN, which can subsequently be made
accessible to as many as eight hosts. For example, the Acme SAN
administrator can make a snapshot accessible to a backup server,
allowing the production server to continue processing without the
downtime traditionally associated with backup processes. An admin-
istrator also can create additional snapshot sessions for use by
the development engineers without affecting the data on the pro-
duction, or source, LUN.
The snapshot feature uses a cache-and-pointer design, where a
chunk map table keeps track of data chunks (groups of blocks)
based on their state at a given time. As the first write request to a
block is made to the source LUN, the chunk to be modified is copied
to a snapshot cache on private LUNs, a process known as copy on
first write (COFW). The source LUN, the snapshot cache, and the
chunk map table work together to create the virtual snapshot LUN.
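The copy-on-first-write behavior described above can be illustrated with a minimal Python sketch. It is a toy model rather than SnapView's implementation: the class names, the four-block chunk size, and the in-memory lists are assumptions made for illustration, and writes to the snapshot itself are not modeled.

```python
CHUNK_SIZE = 4  # blocks per chunk; illustrative value, not SnapView's actual chunk size


class SourceLun:
    """A toy LUN: a flat list of block values."""

    def __init__(self, blocks):
        self.blocks = list(blocks)


class SnapshotSession:
    """Point-in-time view built from the source LUN, a cache, and a chunk map."""

    def __init__(self, source):
        self.source = source
        self.chunk_map = {}  # chunk index -> saved copy of the original chunk

    def before_source_write(self, block_index):
        """Copy on first write: save the chunk once, before it is modified."""
        chunk = block_index // CHUNK_SIZE
        if chunk not in self.chunk_map:
            start = chunk * CHUNK_SIZE
            self.chunk_map[chunk] = self.source.blocks[start:start + CHUNK_SIZE]

    def read(self, block_index):
        """Serve saved chunks from the cache and untouched chunks from the source."""
        chunk, offset = divmod(block_index, CHUNK_SIZE)
        if chunk in self.chunk_map:
            return self.chunk_map[chunk][offset]
        return self.source.blocks[block_index]


def write_source(source, sessions, block_index, value):
    """Host write to the source LUN, with the COFW hook for active sessions."""
    for session in sessions:
        session.before_source_write(block_index)
    source.blocks[block_index] = value


source = SourceLun(range(8))        # blocks 0..7 hold the values 0..7
session = SnapshotSession(source)   # the snapshot session starts here
write_source(source, [session], 5, 99)

print(source.blocks[5])  # 99: production sees the new data
print(session.read(5))   # 5: the snapshot still sees the point-in-time data
print(session.read(1))   # 1: an unmodified chunk is read straight from the source
```

Only the first write to a chunk pays the copy cost; later writes to the same chunk find it already saved. The sketch also shows why the snapshot depends on the source LUN: any chunk that has not been modified is still read from the source.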
The snapshot LUN is an exact copy of the production LUN, and
thus the snapshot must be accessed by a different host, such as a
development or backup server. The backup server can read from
and write to a snapshot LUN, but any changes made to the
snapshot LUN do not replicate back to the source LUN. When
the snapshot session is deactivated, the virtual snapshot LUN will
be invisible to the server.
As Figure 2 indicates, every source LUN can have as many as
eight sessions and eight snapshots. Snapshots have a one-to-one
relationship with a server. Each snapshot must be assigned to a
different server, whereas sessions can be related to any server,
depending on which session is activated and when it is activated.
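The relationships among a source LUN, its sessions, and its snapshots can be captured in a small data model. The following Python sketch is a hypothetical bookkeeping class, not a Navisphere or SnapView interface; the class, method, session, and server names are all assumptions. Only the limits of eight sessions and eight snapshots, and the one-server-per-snapshot rule, come from the article.

```python
MAX_SESSIONS = 8    # per source LUN, as stated in the article
MAX_SNAPSHOTS = 8   # per source LUN, as stated in the article


class SourceLunReplicas:
    """Tracks sessions and snapshots for one source LUN (hypothetical model)."""

    def __init__(self):
        self.sessions = set()   # point-in-time session names
        self.snapshots = {}     # snapshot name -> [server, active session or None]

    def start_session(self, session):
        if session not in self.sessions and len(self.sessions) >= MAX_SESSIONS:
            raise ValueError("source LUN already has eight sessions")
        self.sessions.add(session)

    def create_snapshot(self, snapshot, server):
        if snapshot in self.snapshots:
            raise ValueError("snapshot name already in use")
        if len(self.snapshots) >= MAX_SNAPSHOTS:
            raise ValueError("source LUN already has eight snapshots")
        if any(owner == server for owner, _ in self.snapshots.values()):
            raise ValueError("each snapshot must be assigned to a different server")
        self.snapshots[snapshot] = [server, None]

    def activate(self, snapshot, session):
        """Point a snapshot (and therefore its server) at one session."""
        if session not in self.sessions:
            raise KeyError(session)
        self.snapshots[snapshot][1] = session

    def view_for(self, server):
        """Return the session a server currently sees, or None."""
        for owner, session in self.snapshots.values():
            if owner == server:
                return session
        return None


lun = SourceLunReplicas()
lun.start_session("monday-1800")
lun.start_session("friday-1800")
lun.create_snapshot("snap-backup", "backup-server")
lun.activate("snap-backup", "friday-1800")
print(lun.view_for("backup-server"))  # friday-1800
```

Keeping the server assignment on the snapshot and the point in time on the session mirrors the text: a server always reaches the data through its own snapshot, and the activated session determines which point in time that server sees.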
The most common use of a snapshot is to produce a backup
copy of a large database. Performing an online backup of a
database can help to shorten the backup window without interrupting
production access to the database. However, online backups create
overhead on the production database server, sometimes even
requiring that the database be stopped during the backup window. A
SnapView snapshot allows the database to be replicated
instantaneously. The replica can then be used for online backups, as well
as for development work, without putting additional overhead on
the application server.
[Figure 1 diagram labels: NAS server; stand-alone application server; clustered application server group; primary storage array; primary site; backup/restore server; remote application server; remote storage array; secondary site; remote backup/restore tape library]
Figure 1. Primary and secondary SANs connected for disaster recovery
[Figure 2 diagram labels: one source LUN; multiple snapshot cache; up to eight sessions; up to eight snapshots; up to eight servers; possible relationship between snapshot LUN and the session; storage groups 1, 2, 3, and 8]
Figure 2. Source LUN, session, and snapshot LUN relationship
www.dell.com/powersolutions POWER SOLUTIONS
SnapView snapshots also improve and simplify file-level recov-
ery. Administrators can maintain a repository of snapshot sessions
across multiple days on the network attached storage (NAS) server
connected to the SAN, as shown in Figure 3. If, for example, a user
wants to access files from the Friday snapshot session, the SAN
administrator can simply activate the Friday session and share that
snapshot LUN with the user. The user can then retrieve the needed
files by copying files from the snapshot LUN to the source LUN.
Clones produce full, independent copies
Although SnapView reduces the backup window and removes
backup overhead from the production server, the snapshot
feature's cache-and-pointer design means that snapshot LUNs depend
on the existence of the source LUN. If the source LUN is damaged
or destroyed, administrators would need to rebuild the source LUN
and recover the data from tape or another backup medium (assum-
ing that the local Dell | EMC storage array is still up and running).
The MTTR after such an event might take hours depending on the
size of the LUN and the speed of the tape technology. For a com-
pany requiring fast disaster recovery, as in the Acme scenario,
snapshot LUNs (that is, virtual LUNs) are not an ideal solution.
To decrease MTTR, administrators can use the SnapView clone
function to create LUN copies that are independent of the source
LUN. Unlike snapshots, which are point-in-time views of a
source LUN, clones are synchronous copies of the source LUN.
Each clone LUN consumes exactly the same amount of physical
space as the source LUN. Essentially a local mirror of the source
LUN, a clone offers high availability and can withstand storage
processor failures or source LUN failures, as well as path failures,
provided that EMC PowerPath or Application Transparent Failover
(ATF) software is installed and properly configured. Clones, there-
fore, are business continuance volumes (BCVs).
To create a clone, the initial data is copied, or synchronized,
to the clone (see Figure 4). During synchronization, any host write
requests made to the source LUN are copied to the clone. Once the
clone is 100 percent synchronized, it is fractured manually at a point
in time to create a stand-alone BCV that is independent of
the source LUN. Servers cannot access the clone LUN until it is
fractured, though application I/O can still access the source
LUN during synchronization.
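The synchronize-then-fracture life cycle can be sketched as a small state model. This Python example is a simplified illustration, not SnapView code: the class and method names are assumptions, and a LUN is modeled as a plain list of blocks.

```python
class Clone:
    """Toy model of a clone that is synchronized and then fractured."""

    def __init__(self, source):
        self.source = source                # list of blocks on the source LUN
        self.blocks = [None] * len(source)  # the clone's own physical copy
        self.synced_upto = 0                # blocks [0, synced_upto) are copied
        self.fractured = False

    def sync_step(self, count=1):
        """Copy the next `count` blocks of initial data to the clone."""
        end = min(self.synced_upto + count, len(self.source))
        self.blocks[self.synced_upto:end] = self.source[self.synced_upto:end]
        self.synced_upto = end

    def host_write_to_source(self, index, value):
        """Application I/O to the source continues during synchronization."""
        self.source[index] = value
        if not self.fractured:
            self.blocks[index] = value  # writes are also copied to the clone

    def fracture(self):
        """Split the clone at a point in time; allowed only when fully synced."""
        if self.synced_upto < len(self.source):
            raise RuntimeError("clone is not 100 percent synchronized")
        self.fractured = True

    def read(self, index):
        """Servers can access the clone only after it has been fractured."""
        if not self.fractured:
            raise PermissionError("clone is not accessible until fractured")
        return self.blocks[index]


source = [10, 20, 30, 40]
clone = Clone(source)
clone.sync_step(2)
clone.host_write_to_source(0, 11)  # arrives mid-synchronization
clone.sync_step(2)
clone.fracture()
clone.host_write_to_source(3, 99)  # arrives after the fracture

print(clone.read(0))  # 11: the pre-fracture write reached the clone
print(clone.read(3))  # 40: the post-fracture write did not
print(source[3])      # 99
```

After the fracture, the clone holds its own complete copy of every block, so it remains readable even if the source list is discarded. That independence is what distinguishes a clone from a pointer-based snapshot.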
Resynchronization can occur in either direction. To recover data
from the clone to the source LUN, administrators can use the
reverse synchronization feature while I/O continues to the source
LUN. A clone becomes available for read and write access once it
is fractured. Administrators also can access a clone by creating
a snapshot and then assigning the snapshot to a second server
storage group as long as the snapshot is in a different storage
group than the source LUN. This manner of implementation not
only removes the overhead on the server, but it also enables the
source LUN to access snapshots without I/O overhead.
After synchronization and fracturing, a clone becomes a fully
populated, physical copy of its source LUN. Because clones are not
pointer-based replicas, they are not affected by the COFW perfor-
mance penalty; the data is replicated to the clone instead of being
copied to nonvolatile memory along with the modified chunks.
This process results in lower performance overhead for clones
than snapshots.
A clone is commonly used in environments that require quick
MTTR or online backups based on the point-in-time copies that have
zero impact on the production data. A server can read from and
write to a fractured clone without affecting the source LUN. Also,
resynchronizing the clone is fast because clones use a space in
memory called the clone private log (CPL) to keep track of the
changes that occur after they have been fractured. For efficiency,
100 percent resynchronization is avoided; only post-fracture changes
are resynchronized.
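Incremental resynchronization can be illustrated with a change log that records which blocks were written after the fracture. In this minimal Python sketch, a set of block indexes stands in for the clone private log; the real CPL format is not described in this article, and the class and method names are assumptions.

```python
class FracturedClonePair:
    """Source LUN and fractured clone with a post-fracture change log."""

    def __init__(self, source, clone):
        self.source = source
        self.clone = clone
        self.changed = set()  # stand-in for the clone private log (CPL)

    def write_source(self, index, value):
        self.source[index] = value
        self.changed.add(index)

    def write_clone(self, index, value):
        self.clone[index] = value
        self.changed.add(index)

    def resync(self, reverse=False):
        """Copy only logged blocks; reverse=True restores the source from the clone."""
        src, dst = (self.clone, self.source) if reverse else (self.source, self.clone)
        copied = 0
        for index in sorted(self.changed):
            dst[index] = src[index]
            copied += 1
        self.changed.clear()
        return copied


source = list(range(1000))
clone = list(source)               # state at the moment of the fracture
pair = FracturedClonePair(source, clone)
pair.write_source(3, -1)           # production keeps changing
pair.write_clone(700, -2)          # a backup or test server writes to the clone

copied = pair.resync()
print(copied)                      # 2: two blocks copied instead of 1,000
print(source == clone)             # True
```

The work done by a resynchronization is proportional to the number of blocks changed since the fracture rather than to the size of the LUN, which is why a clone can be brought back in step with its source quickly.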
[Figure 3 diagram labels: source LUN; Monday, Tuesday, Wednesday, Thursday, and Friday 6:00 P.M. sessions; NAS server; snapshot LUN; backup/restore server; local area network]
Figure 3. File recovery from a snapshot LUN to the source LUN
[Figure 4 diagram labels: production storage group; snapshot LUN; clone group (up to eight clones); fracture after synchronization]
Figure 4. Clone creation and access
Enabling array-level disaster recovery through MirrorView
The Acme disaster recovery plan protects critical business data
by outlining a procedure for recovery when the primary site is
down. The plan also addresses the replication of data from the
primary location to the secondary location so that applications run-
ning at the secondary site can access the same business data. To
implement these processes, the Acme scenario uses the EMC
MirrorView add-on software option. MirrorView is similar to the
SnapView clone option, but works between Dell | EMC arrays
instead of within a single array. Because MirrorView is array-
based software, it does not use server I/O or CPU resources, and
it supports all of the operating systems used on the array.
Provision for disaster recovery is the major benefit of
MirrorView mirroring. As shown in Figure 5, multiple arrays in dif-
ferent locations can mirror to a common disaster recovery site,
which makes it the central mirroring site for disaster recovery. If a
disaster cripples the primary site, a MirrorView secondary image can
be used to recover data and operations at the disaster recovery site.
MirrorView runs redundantly across arrays. If one storage
processor fails, MirrorView, running on the other storage
processor, will take ownership of the mirrored LUNs. If the host
can fail over I/O to the remaining storage processor (using
PowerPath software), then mirroring will continue as normal.
After the primary-site array has been recovered, the data at the
secondary site can be synchronized back to the primary site.
Although the mirrored target cannot be directly assigned to a
server while it is acting as a mirrored target, SnapView software
can be used to take a snapshot of the secondary mirrored LUN and
then assign the snapshot to the servers on the secondary
site for immediate access, even if the two sites are mirroring.
MirrorView mirroring is synchronous; thus, the longer the distance,
the longer the delay, because the application must wait for a
commitment to be returned from the remote array. For disaster
recovery, primary and secondary storage systems should be
relatively far apart, yet within 10 km of each other, and connected
through dedicated redundant pairs of fiber-optic cabling for Fibre
Channel-based mirroring. For longer distances, other solutions
exist (see Figure 7).
MirrorView can ensure that data from the primary storage
system replicates to the secondary array (see Figure 6). The host
(if any) connected to the secondary array might normally sit idle
until the primary site fails. With SnapView at the secondary site,
the host at the secondary site can take snapshot copies of the
mirror images (that is, secondary LUNs) and back them up to other
media. This technique provides point-in-time snapshots of pro-
duction data with little impact to production server performance.
MirrorView provides a synchronous mirroring solution, which
can help ensure that any write to the primary array also is com-
mitted on the secondary array before the production server gets an
acknowledgment. Although this technique is commonly imple-
mented on most mirroring technologies, it also requires that latency
between two storage arrays be calculated and considered to pre-
vent any performance degradation. Currently, MirrorView runs
through either Fibre Channel (using dedicated fiber-optic cables)
or Fibre Channel over IP (using routers and sufficient dedicated
bandwidth on an IP wide area network, or WAN).
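A rough latency estimate shows why distance matters for synchronous mirroring. The Python sketch below assumes a propagation delay of about 5 microseconds per kilometer in optical fiber, which corresponds to light traveling at roughly 200,000 km/s in glass, and it ignores switch, router, and protocol overhead. The function names, the `round_trips` parameter, and the 500-microsecond local service time are assumptions for illustration; the sample distances are taken from the bands in Figure 7.

```python
FIBER_US_PER_KM = 5.0  # approximate one-way delay in optical fiber


def added_delay_us(distance_km, round_trips=1):
    """Propagation delay added to each synchronous write, in microseconds."""
    return 2 * distance_km * FIBER_US_PER_KM * round_trips


def serial_write_ceiling(distance_km, local_service_us, round_trips=1):
    """Upper bound on writes per second for one stream of back-to-back writes."""
    per_write_us = local_service_us + added_delay_us(distance_km, round_trips)
    return 1_000_000 / per_write_us


# The 500 us local service time is a hypothetical figure, not a measured value.
for km in (0.3, 10, 60, 500):
    delay = added_delay_us(km)
    ceiling = serial_write_ceiling(km, local_service_us=500)
    print(f"{km:>5} km: +{delay:7.1f} us per write, "
          f"about {ceiling:6.0f} writes/s per stream")
```

Under these assumptions, a 10 km link adds roughly 100 microseconds to every acknowledged write, while a 500 km link adds roughly 5 milliseconds. A protocol that needs more than one round trip per write multiplies the figure accordingly, which the `round_trips` parameter models.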
Selecting the appropriate data-protection strategy
SnapView snapshots, SnapView clones, and MirrorView mirrors
provide different levels of data protection. Snapshots are most likely
to be used in a parallel processing environment to provide online
backups or file-level recovery, whereas clones and mirrors are more
often used in disaster recovery situations.
Clones may be used for fast recovery of local corrupt LUNs;
clones support read and write access to both source LUN and clone
once the clone has been fractured. Mirrors usually enable recov-
ery of arrays or sites. Mirrors also can be used to replicate data to
multiple sites, and then used with snapshots for remote access.
Mirroring provides read and write capability only to the source
LUN, but read and write access to the remote copy of the data can
be accomplished by using SnapView on the target array to take a
snapshot of the mirror.
To support either MirrorView or SnapView, administrators must
install the EMC Access Logix tool. This software masks source and
target LUNs to different servers to prevent LUN corruption.
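The masking idea can be sketched as a table that maps hosts to storage groups and storage groups to LUNs. This Python example is a conceptual model only and does not reflect the Access Logix interface; the class, method, group, host, and LUN names are all hypothetical.

```python
from collections import defaultdict


class MaskingTable:
    """Hosts see only the LUNs in the storage group they are connected to."""

    def __init__(self):
        self._luns = defaultdict(set)  # storage group -> LUN names
        self._hosts = {}               # host -> storage group

    def add_lun(self, group, lun):
        self._luns[group].add(lun)

    def connect_host(self, host, group):
        self._hosts[host] = group

    def visible_luns(self, host):
        group = self._hosts.get(host)
        return sorted(self._luns[group]) if group is not None else []

    def hosts_seeing_both(self, lun_a, lun_b):
        """Hosts that could see a LUN and its replica at the same time."""
        return sorted(h for h in self._hosts
                      if {lun_a, lun_b} <= set(self.visible_luns(h)))


table = MaskingTable()
table.add_lun("production", "source_lun")
table.add_lun("backup", "snapshot_lun")
table.connect_host("prod-server", "production")
table.connect_host("backup-server", "backup")

print(table.visible_luns("backup-server"))                    # ['snapshot_lun']
print(table.hosts_seeing_both("source_lun", "snapshot_lun"))  # []
```

An empty result from `hosts_seeing_both` is the condition the text describes: no server is presented with both a source LUN and its replica, so neither copy can be mistaken for the other.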
[Figure 5 diagram labels: primary locations A, B, C, and D; snapshot storage group; disaster recovery site]
Figure 5. Central mirroring for disaster recovery
[Figure 6 diagram labels: primary location A; production storage group; secondary locations A and B; snapshot storage groups, each with a snapshot of the secondary image]
Figure 6. Using MirrorView for data replication
Combined solutions reduce backup window
and production server overhead
In the Acme scenario, administrators were able to use both SnapView
and MirrorView to solve the three business problems that the com-
pany faced. The company now uses its NAS server, to which any
user can map, for storing snapshots. This server enables adminis-
trators to recover data from a specific point in time without a large
backup window. The company created a local clone as a develop-
ment server for its main clustered application, removing overhead
from the production environment. Acme also mirrored its data to
the remote site and created a snapshot of the mirror to enable online
backups that will not affect the production environment.
Mirroring the company's main application to the remote site
provides quick MTTR and allows for remote backups in case of dis-
aster at the primary site. Through snapshots, data can be assigned
to servers at the remote location for other applications. Figure 7
provides a decision tree to help administrators choose the right
replication and recovery tools for their own company's specific
implementations.
Enabling comprehensive data-recovery plans
using EMC software
Dell | EMC SANs provide a reliable environment for data
consolidation. The optional SnapView and MirrorView software add-ons
enable administrators to create a comprehensive data-recovery plan
for different disaster scenarios. When administrators use the fea-
tures provided in SnapView and MirrorView, they enable online
development work or data mining to be performed without
affecting the production environment. These features also provide
a way to replicate data to multiple locations as well as maintain
data consistency.
Richard Hou is a systems engineer and consultant for the Dell
Enterprise Technology and Education Center (ETEC), part of the Dell Enterprise Services
and Support Group, where he specializes in SAN and Microsoft solutions. Richard has an
M.S. in Electrical and Computer Engineering from The University of Texas at Austin and a
B.S. in Mechanical Engineering from Zhejiang University, Hangzhou, China.
Steve Feibus has been a storage enterprise technologist in
the Advanced Systems Group at Dell for the past two years and was recently promoted to
manager of the Client Technologist team at Dell. Steve has a B.S. in Electrical Engineering
from the University of Florida and has spent many years solving customer storage issues
using the latest technologies and products.
Patty Young is a storage enterprise technologist in the Advanced
Systems Group at Dell. She has been working with storage solutions for many years,
supporting field system consultants in architecting storage solutions for their customers and
providing feedback from customers to Dell regarding storage challenges and requirements.
Patty has a B.A. from North Carolina State University.
[Figure 7 diagram labels. Decision nodes: CX400, CX600, or FC4700-2? (yes or no); single or multiple array?; what is the purpose of the data copy?; customer operating systems (Microsoft Windows 2000 Server, IBM AIX, Linux, Sun Solaris, Novell NetWare, HP-UX; or mainframe); arrays to be utilized (CX400, CX600, FC4700-2; or CX200, Dell PowerVault 660F, Dell PowerVault 650F); distance between mirrored locations (up to 300 m, up to 500 m, up to 10 km, 10 km to 60 km, 60 km to 500 km, over 500 km). Purposes: online backup, decision support, and testing for instantaneous copy; BCV, data replication, online backup, and data recovery within array; data replication across arrays and BCV on remote site. Link options: Fibre Channel-1; Fibre Channel-2; Fibre Channel LW-GBIC; dense wavelength division multiplexing (DWDM) extender. Outcomes: snapshot; clone; MirrorView Fibre Channel or MirrorView IP; MirrorView IP or third-party solution; tape or third-party solutions for data replication; STOP, MirrorView not a solution.]
Figure 7. Decision tree for selecting snapshot, cloning, or mirroring
FOR MORE INFORMATION
EMC: http://www.emc.com
Dell|EMC: http://www.dell.com/emc