HP StorageWorks 12000 - Virtual Library System EVA Gateway Manual

HP StorageWorks VLS and D2D Solutions Guide (AG306-96028, March 2010)

HP StorageWorks VLS and D2D Solutions
Guide
Design Guidelines for Virtual Tape Libraries with
Deduplication and Replication
This document describes the HP StorageWorks VLS and D2D systems and their concepts, including automigration,
deduplication, and replication, to help you define and implement your virtual tape library system. It includes
best practices for working with specific backup applications. This document is intended for use by system
administrators who are experienced with setting up and managing system backups over a SAN.
Part number: AG306-96028
Seventh edition: March 2010

Summary of Contents for HP StorageWorks 12000 - Virtual Library System EVA Gateway

  • Page 2 Legal and notice information © Copyright 2005, 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty.
  • Page 3: Table Of Contents

    Contents: 1 Introduction (p. 13); 2 Concepts (p. 15); Disk-based Backup and Virtual Tape Libraries (p. 15); Problems Addressed by Virtual Tape Libraries (p. 15); Integration of Disk in Data Protection Processes (p. 15); Where Virtual Tape Fits in the Big Picture (p. 16); HP VLS and D2D Portfolio ...
  • Page 4 LAN-free Backups (p. 37); Retention Planning (p. 37); Future Data Growth (p. 38); Considerations for Copies (p. 38); Copy to Physical Tape through the Backup Application (p. 39); Media Server Considerations (p. 40); Benefits of Copying to Physical Tape through the Backup Application (p. 40); Considerations for Copying to Physical Tape through the Backup Application ...
  • Page 5 Dynamic Deduplication Implementation (p. 70); Restoring Data (p. 71); Housekeeping (p. 72); D2D Replication (p. 73); How it Works (p. 73); D2D Replication Implementation (p. 73); Licensing (p. 74); Implementing the Initialization Process (p. 74); Replication Setup (p. 76); Reporting (p. 82); Design Considerations ...
  • Page 6 Echo Copy Concepts (p. 141); Implementation (p. 141); Echo Copy Pools (p. 141); Automigration Policy (p. 142); Automigration Setup (p. 143); Design Considerations (p. 145); Automigration Use Models (p. 145); Sizing the Tape Library (p. 147); Restoring from Automigration Media (p. 148); VLS Accelerated Deduplication ...
  • Page 7 NetBackup Deduplication Guidelines (p. 198); NetBackup Import Example Script (p. 198); IBM TSM (p. 199); TSM General Guidelines (p. 199); TSM Deduplication Guidelines (p. 200); TSM Useful Queries (p. 203); EMC NetWorker (p. 204); Networker General Guidelines (p. 204); Networker Deduplication Guidelines (p. 206); 6 Support and Other Resources ...
  • Page 8 Figures: Common Backup Technologies (p. 16); HP Virtual Tape Library Product Range (p. 17); Basic Write-to-disk Setup (p. 19); Unique Backup Data (p. 20); Enterprise Deployment with Small and Large Remote and Branch Offices (p. 24); Remote Site Data Protection Before Replication (p. 25); Remote Site Data Protection Using Replication ...
  • Page 9 Example Schematic (p. 78); Example 1: 100 GB Virtual Cartridge Replication over a 2 Mb/sec Link with 100% Bandwidth Utilization (p. 85); Example 2: 100 GB Virtual Cartridge Replication over a 2 Mb/sec Link with 25% Bandwidth Utilization (p. 86); Example 3: 500 GB Virtual Cartridge Replication over a 2 Mbit/sec Link ...
  • Page 10 Backup Analysis (p. 150); Physical versus Logical Data (p. 150); Backup Sessions with Duplicate and Unique Data (p. 152); VLS9000 and VLS12000 node oversubscription (p. 158); How Replication Fits into the Deduplication Architecture (p. 162); The Replication Sequence (p. 163); Replication Scaling ...
  • Page 11 Tables: VLS Compared to Application-based Write-to-disk (p. 19); HP Deduplication Solutions (p. 21); 1 TB File Server Backup (p. 22); Deduplication Ratio Impact (p. 22); Estimated Time to Replicate Data for a 1 TB Backup Environment at 2:1 (p. 26); D2D Technical Specifications ...
  • Page 12 Recommended Settings for each Backup-archive Client's dsm.sys File (dsm.opt File for Windows) (p. 200); Document conventions (p. 210) ...
  • Page 13: Introduction

    1 Introduction Welcome to virtual tape libraries. This guide describes the HP StorageWorks VLS and D2D systems and their concepts, including automigration, deduplication, and replication, to help you define and implement your virtual tape library system. It includes best practices for working with specific backup applications.
  • Page 14 Introduction...
  • Page 15: Concepts

    2 Concepts Disk-based Backup and Virtual Tape Libraries Problems Addressed by Virtual Tape Libraries You can optimize your backup environment with VLS and D2D if you are: • Not meeting backup windows due to slow servers. • Not consistently streaming your tape drives. •...
  • Page 16: Where Virtual Tape Fits In The Big Picture

    Where Virtual Tape Fits in the Big Picture Virtual libraries are not necessarily the only piece of your backup plans, but they can be an integral piece of a successful solution. Figure 1 illustrates the common backup technologies and their relative benefits and costs.
  • Page 17: Typical VLS Environments

    Figure 2 HP Virtual Tape Library Product Range Typical VLS Environments In a typical enterprise backup environment, there are multiple application servers backing up data to a shared tape library on the SAN. Each application server contains a remote backup agent that sends the data from the application server over the SAN fabric to a tape drive in the tape library.
  • Page 18: What Are The Alternatives

    restore performance. The addition of a D2D device to these environments allows de-multiplexing of the backups so that restore performance is improved, the deduplication allows for a longer retention time on disk without needing significantly higher disk capacities, and the deduplication-enabled replication allows cost-effective off-site copying of the backups for disaster protection.
  • Page 19: Business Copy

    Figure 3 Basic Write-to-disk Setup
    Table 1 VLS Compared to Application-based Write-to-disk
    Setup and management complexity
      Virtual tape devices: Sets up just like a physical tape library.
      Write-to-disk: Requires configuration of RAID groups, LUNs, volumes, and file systems.
    Data compression
      Virtual tape devices: Software or hardware enabled (software compression generally decreases perform...
      Write-to-disk: No device-side data compression avail...
  • Page 20: Deduplication

    Deduplication Introduction In recent years, the amount of data that companies produce has been steadily increasing. To comply with government regulations, or simply for disaster recovery and archival purposes, companies must retain more and more data. Consequently, the costs associated with data storage – labor, power, cooling, floor space, transportation of physical media –...
  • Page 21: Deduplication Ratios

    • Completely transparent to host. • No data is lost – backup streams can be fully restored. • Block or chunk level deduplication, providing greater reduction of data. • Even greater reduction of data when combined with traditional data compression. HP Accelerated deduplication and HP Dynamic deduplication are designed to meet different needs, as shown in Table...
  • Page 22: Target-Based Deduplication

    Table 3 1 TB File Server Backup
                                    Data stored normally   Data stored with deduplication
    1st daily full backup           500 GB                 500 GB
    1st daily incremental backup    50 GB                  5 GB
    2nd daily incremental backup    50 GB                  5 GB
    3rd daily incremental backup    50 GB                  5 GB
    4th daily incremental backup...
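
The effect of deduplication on stored capacity can be checked with a short calculation. The following is a minimal sketch in Python that uses only the Table 3 rows visible above (one full backup and three incrementals). The function name cumulative_ratio is illustrative and is not part of any HP tool.

```python
# Rows taken from Table 3: (label, GB stored normally, GB stored with deduplication).
BACKUPS = [
    ("1st daily full backup", 500, 500),
    ("1st daily incremental backup", 50, 5),
    ("2nd daily incremental backup", 50, 5),
    ("3rd daily incremental backup", 50, 5),
]


def cumulative_ratio(backups):
    """Return (total GB stored normally, total GB deduplicated, ratio)."""
    total_normal = sum(normal for _, normal, _ in backups)
    total_dedup = sum(dedup for _, _, dedup in backups)
    return total_normal, total_dedup, total_normal / total_dedup


normal, dedup, ratio = cumulative_ratio(BACKUPS)
print(f"{normal} GB normally, {dedup} GB deduplicated, ratio {ratio:.2f}:1")
```

With only these four backups the ratio is still low, because the first full backup contains mostly unique data; the ratio grows as more backup generations are retained.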
  • Page 23: Tape Oversubscription

    Tape Oversubscription Deduplication requires more virtual tape capacity than physical disk; this is sometimes called tape oversubscription. The purpose of deduplication is to reduce the amount of disk required to store multiple generations of backups. Be sure to create enough virtual tape capacity to contain your entire retention policy; the physical disk capacity required will be much smaller because of deduplication.
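
The relationship between virtual tape capacity and physical disk can be sketched as follows. This is a rough illustration only: the backup sizes, retention counts, and the 10:1 deduplication ratio are hypothetical inputs, and the function size_capacity is not an HP sizing tool. Actual ratios depend on the data and the retention policy.

```python
def size_capacity(full_gb, incr_gb, fulls_retained, incrs_retained, dedup_ratio):
    """Estimate virtual tape capacity and physical disk for a retention policy.

    Virtual capacity must hold every retained backup at its full logical size;
    physical disk is that amount divided by the expected deduplication ratio.
    """
    virtual_gb = full_gb * fulls_retained + incr_gb * incrs_retained
    physical_gb = virtual_gb / dedup_ratio
    return virtual_gb, physical_gb


# Hypothetical policy: 4 retained fulls of 500 GB and 20 retained incrementals of 50 GB.
virtual_gb, physical_gb = size_capacity(
    full_gb=500, incr_gb=50, fulls_retained=4, incrs_retained=20, dedup_ratio=10.0
)
print(f"virtual tape: {virtual_gb} GB, physical disk: {physical_gb:.0f} GB")
```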
  • Page 24: HP StorageWorks Replication Solutions

    Figure 5 Enterprise Deployment with Small and Large Remote and Branch Offices Replication provides end-to-end management of backup data from the small remote office to the regional site and finally into the primary data center, all controlled from the primary data center, while providing local access to backup data as well.
  • Page 25: Remote Site Data Protection Before Replication

    Figure 6 Remote Site Data Protection Before Replication Figure 7 Remote Site Data Protection Using Replication Deduplication is the key technology enabler for replication on HP VLS and D2D systems. (VLS systems use HP Accelerated deduplication, and D2D systems use Dynamic deduplication.) The same technology
  • Page 26: Replication Deployment Options

    that allows duplicate data to be detected and stored only once on the HP VLS or D2D system also allows only the unique data to replicate between sites. Because the volume of data being replicated between sites is much less than if the full data set were replicated, you can use lower-bandwidth links at correspondingly lower price points.
  • Page 27: Replication Configuration Options

    For both HP VLS and D2D systems, the unit of replication is a virtual cartridge and the replication link is TCP/IP (one GbE connection per node on the VLS system, and one to two GbE connections on the D2D system). Figure 8 shows how you can configure the system to replicate all of the cartridges or just a subset of cartridges from the source virtual library to the target virtual library.
  • Page 28: Backup Application Interaction With Replication

    • Active-Active: Each site requires a 4-node/4-array VLS9000 (with deduplication) shared between backups and replication target, one rack, and four replication LTUs. • 2x Active-Passive: Each site requires a 2-node/2-array VLS9000 (with deduplication) for backup and a separate 2-node/2-array VLS9000 (with deduplication) for replication target, two racks, and two replication LTUs.
  • Page 29: Replication Limitations

    On a VLS (by default) or a D2D (if you enable the read-only mode on the target library), you can still present the replication target to a different backup application instance (i.e., a separate backup application master/cell server on the target site with its own media database), which you can use to “import”...
  • Page 30 Concepts...
  • Page 31: Backup Solution Design Considerations

    3 Backup Solution Design Considerations This section uses use models to explore many of the concepts you must consider when designing your system. Use models are organizational schemes that give you a basic framework for delineating your environment and visualizing how to implement the VLS or D2D for best results.
  • Page 32: Backup to the VLS in a LAN/SAN Hybrid Environment

    Figure 10 Backup to the VLS in a LAN/SAN Hybrid Environment This includes servers that contain lots of small files that are likely to slow down backup performance on the host (e.g., those found on Windows file servers and web servers), blade servers, etc.
  • Page 33: Consider How You Want To Copy The Backup Data To An Off-Site Location

    servers could perform LAN-free backup (see LAN-free Backups) and thus move their backup traffic from the LAN to the SAN. • List your backup applications. In a heterogeneous environment, you can have multiple applications writing to different virtual libraries on the same VLS or D2D (which is highly flexible with regard to the libraries that can be configured).
  • Page 34: Benefits Of Single Library Systems

    Figure 11 Backup to VLS in a Simple Deployment (VLS9000 series with One Shared Library Shown) Benefits of Single Library Systems • This use model is easy to manage. • It is easy to copy through the backup application because you already have shared devices and they all see and are seen by a copy engine or server.
  • Page 35: Multiple Library

    • Multiple hosts are mixed together in one media set. Although you may be using less physical media, this may not meet your media tracking and pooling needs. Multiple Library In this model, multiple hosts map to discrete libraries in a one-to-one or some-to-one configuration. Figure 12 Backup to VLS in a Simple Deployment (VLS9000 series with Four Dedicated Libraries Shown) Benefits of Multiple Library Systems...
  • Page 36: Multiplexing, Multistreaming, And Multipathing

    do LUN mapping and masking 2) All hosts write to the library as a shared device. This involves copies going on during prime backup or operational time on the hosts. • Dedicated copy library for each virtual library -- This may be expensive with multiple virtual libraries. Multiplexing, Multistreaming, and Multipathing This section explains the concepts of Multiplexing, Multistreaming...
  • Page 37: Blocksize And Transfer Size

    NetBackup, etc.) support dual paths to the library change device and will automatically switch over to the alternate path if the primary path fails. Multipathing virtual tape drives is not recommended. Many enterprise backup environments (operating systems or backup application) do not support dual path to tape drives because they see this as two separate drives.
  • Page 38: Future Data Growth

    to optimize retention times of the VLS or D2D? Retention policies help you recycle virtual media. Bear the following considerations in mind as you plan retention policies: • If you are not using deduplication, you can retain the data on the disk backup device for a shorter period such as 1-2 weeks (because more than 90% of restores generally occur within the first week’s retention of backups) and then use tape copies to retain data for longer periods.
  • Page 39: Copy To Physical Tape Through The Backup Application

    to another VLS, D2D replicates to another D2D) and removes the need to move tapes between sites. See Replication. Copy to Physical Tape through the Backup Application You can migrate data from virtual media in the VLS or D2D device to physical tape media using the various tape copy/clone mechanisms that exist in enterprise backup applications.
  • Page 40: Media Server Considerations

    Figure 13 Writing to Tape in a LAN/SAN Hybrid Environment Media Server Considerations To support the background copy of backup data from the VLS or D2D to a tape library, use one or more of your existing backup media servers (running the backup application media agents) to copy the data directly from the device media onto the tape media with data passing over the SAN from the VLS or D2D to the device server and then back again to the tape library.
  • Page 41: Considerations For Copying To Physical Tape Through The Backup Application

    • Use the functionality of a backup application. • Copy only the specific files you need onto physical tape. • Not waste tape storage as physical tapes can be fully filled. • Monitor and track copy jobs from the backup application. •...
  • Page 42: Benefits Of Echo Copy

    Figure 14 Echo Copy is Managed through Automigration Benefits of Echo Copy • The destination library is not visible to the backup application, so it does not need licensing. • There is no need to license/configure copy jobs in the backup application. •...
  • Page 43: Copy To Tape Using D2D Tape Offload

    virtual tapes will not be created. Subsequent backups will fail because the virtual tapes present are protected. NOTE: Automigration echo copy is not suitable for use with deduplication because: • You cannot use echo copy to create archive tapes from the replication target device because these must have a different barcode, retention time, cartridge size, and contents.
  • Page 44: Benefits Of Replication

    data instead of the complete data set. This saves time and replication bandwidth, and is one of the most attractive features that deduplication offers. Replication enables better disaster tolerance without the operational costs associated with transporting data off-site on physical tape. See Introduction to Replication.
  • Page 45: Restoring From Disk Backup Device

    Restoring from Disk Backup Device Restore from the VLS or D2D is performed in the same way that restore from tape is done. Simply direct your backup application to do the restore. NOTE: A restore from one tape can be performed at the same time as a backup to a different tape. Restoring from Backup Application-created Tape Copy The format of physical tapes used in the VLS and D2D environments is the same as the format in environments not using a virtual tape library.
  • Page 46: Backup SAN Design Guidelines

    The HP Library and Tape Tools: • Dev Perf – This provides a simple test which will write data directly from system memory to a cartridge in a library on the VLS or D2D system. If this tool is run on a server being used as the backup media server, it can provide the maximum data throughput rate for a single backup or restore process.
  • Page 47: Operating System Tape Configuration

    • Small fabric (16 ports or fewer) may not need zoning. If no zoning is used, make sure that the virtual tape external Fibre Channel connections reside in the lowest ports of the switch. • Small to medium fabric (16 - 128 ports) can use host-centric zoning. Host-centric zoning is implemented by creating a specific zone for each server or host, and adding only those storage elements to be utilized by that host.
  • Page 48 For an HP-UX 11.31 server using persistent DSFs, the maximum number of LUNs per bus on the server can be increased by entering # scsimgr set_attr -a max_lunid=32. • Disable unnecessary polling/diagnostic applications that can interfere with backups or impact performance.
  • Page 49: LUN Masking and Mapping

    • HP-UX: prevent utilities like mt from rewinding a tape that is in use by the backup application by turning on the HP-UX kernel tunable parameter st_san_safe (which disables tape device special files that are rewind-on-close). LUN Masking and Mapping D2D devices with the Fibre Channel option do not require LUN masking/mapping because every virtual device LUN (library or drive) has its own unique Fibre Channel WWPN (using what is called N-port virtualization).
  • Page 50: Backup Application Basic Guidelines

    Figure 15 Virtual Tape Environment In the VLS firmware version 1.x/2.x, the LUN mapping mode is specific to each host. It lets you manually assign new LUN numbers to the virtual devices visible to the host, and hosts that are not LUN masked continue to use the default (shared device) LUN numbering.
  • Page 51: HP Data Protector Application Overview

    • Symantec NetBackup HP Data Protector Application Overview The Data Protector cell is a network environment that includes a Cell Manager and client systems that run agents. Client systems are imported into a cell and belong to a single Cell Manager. Multiple cells may exist, each with their own Cell Manager.
  • Page 52: IBM TSM Application Overview

    If a storage unit contains two drives and one is busy, NetBackup can use the other drive without administrator intervention. To send backups to a storage device, the administrator must define storage units using the Device Configuration Wizard. For virtual library devices such as VLS and D2D, the type of storage unit used is the "Media Manager storage unit."...
  • Page 53: EMC NetWorker Application Overview

    Figure 17 TSM LAN-free backups Data on large application servers (D-G) is backed up by the TSM server via the SAN directly to the tape storage pool. Data is copied from the primary tape storage pool to a copy storage pool, making a set of tapes to send off-site for disaster protection.
  • Page 54 Backup Solution Design Considerations...
  • Page 55: D2D Systems

    4 D2D Systems D2D Devices D2D Defined The entry level D2D100 series Backup System meets the needs of small businesses as a low-cost solution that does not incorporate deduplication technology. The D2D2500 is well suited for remote and branch offices and small IT environments, while the more powerful D2D4000 and D2D4100 are designed for medium-sized companies and small data centers.
  • Page 56: D2D Technical Specifications

    D2D Technical Specifications
    Table 6 D2D Technical Specifications
    D2D2500 Backup Systems: For remote or branch offices and smaller data centers; 1U rack-mount
    D2D4000 Backup Systems: For remote or branch offices and mid-size data centers; 2U rack-mount
    D2D4112 Backup Systems: For mid-size data centers and distributed environments; ...
  • Page 57: Multiple Backup Streams

    • Servers to back up are split across two physical networks which need independent access to the D2D. You must assign the virtual libraries to whichever port is on the network of the device to back up. • Separate data and management LANs are used, i.e., each server has a port for business network traffic and another for data backup.
  • Page 58: Disable Backup Application Verify Pass

    servers. In this case, also ensure that bonded/trunk network ports are configured for both the media server and the D2D device. With either configuration, you must configure the backup application to target the devices correctly; this usually involves creating multiple backup jobs to back up specific data to selected virtual libraries.
  • Page 59: Tape Library Emulation

    In addition, consider other operations running on the D2D. For example, if multiple backups finish at different times, each may result in some housekeeping work. Add all of the data backup amounts together to get an idea of the time that housekeeping will take. Tape Library Emulation Emulation Types The HP D2D Backup Systems can emulate several types of HP tape library devices, and the maximum...
  • Page 60: D2D Blueprints

    • Overwrite vs. append: Overwriting and appending to cartridges is also where virtual tape has an advantage. With physical media you may want to append multiple backup jobs to a single cartridge in order to reduce media costs; the downside is that cartridges cannot be overwritten until the retention policy for the last backup on that cartridge has expired.
  • Page 61: Large SME Site Consolidation Requiring Fibre Channel Shared Devices

    Figure 18 Single Site Backup Consolidation using iSCSI Device configuration highlights: • Configure iSCSI ports in high availability mode to give added redundancy. • Configure a single library for best deduplication ratio or multiple libraries for best throughput. • Consider configuring multiple virtual libraries for different data types (office files, MSExchange databases) across multiple host servers to get the best possible performance and deduplication ratio.
  • Page 62: Large SME Site Consolidation Requiring FC Shared Devices

    Solution: HP D2D 4112 “expandable” storage, good performance, backup application Object Copy to small physical tape library for archiving and disaster recovery. Caveats: Preferred offload to physical tape is via backup application “object copy” functionality for maximum physical tape usage efficiency. ISVs: Typically HP Data Protector, Symantec NetBackup, Tivoli Storage Manager, EMC NetWorker, CommVault, Backup Exec, etc.
  • Page 63: Multi-Site Small Business Disaster Recovery Solution

    Multi-site Small Business Disaster Recovery Solution Company requirements: remove physical tape and any human intervention for backups at remote sites – completely automate backup and disaster recovery. Fewer than 1 TB at each remote site and up to 24 remote sites. Solution: Multi-remote site D2D iSCSI devices replicating into a central consolidated D2D at a disaster recovery site.
  • Page 64: Multi-Site Small Business Automated Disaster Recovery Solution With No Physical Tape

    Figure 21 Multi-site Small Business Automated Disaster Recovery Solution with no Physical Tape For a smaller business where there may be limited budgets or no compliance requirements dictating data to be archived or stored on tape for several years, the solution shown without physical tape above is perfectly adequate.
  • Page 65: Introduction to VMware Terminology

    co-location of units and then splitting of units, or by using physical tape to transfer data from remote office to seed unit in data centre. This process only has to be done once. Other information: WAN link sizing and data centre D2D sizing is critical. Size for existing link or new link using the HP StorageWorks Backup Sizer.
  • Page 66: Using D2D in Simple VMware Environments with ESXi

    Using D2D in Simple VMware Environments with ESXi This configuration is currently being certified with VMware. See future supported configurations at the end of this document. Using D2D in VMware Environments with ESX4 and VMware Consolidated Backup Company requirements: Mid-size VMware environment that wants VMware-based snapshot backup with individual Windows file-based recovery using VMware Consolidated Backup (VCB) in ESX4.
  • Page 67: Using D2D in Larger VMware Environments with HP Data Protector Zero Downtime Backup with Instant Recovery (ZDB + IR)

    Recovery options: Recovery is a two-stage process: 1) the image is restored to the VCB proxy or the ESX server, and 2) the image is restored as a VM using VMware Converter if restored to the VCB proxy, or vcbRestore if restored to the ESX server. Restart the VM and you are back in business as of the last snapshot.
  • Page 68: Using D2D in a VMware Environment with ESX4 and HP Data Protector ZDB IR

    ISVs: HP Data Protector with Zero Downtime Backup Option and Symantec NetBackup with Snapshot Client have both been tested in depth by HP in this mode of operation. Device configuration highlights: Consider configuring multiple virtual libraries for different image types. Backup application configuration highlights: Configuring Zero Downtime Backup within Data Protector and Symantec NetBackup requires some additional scripting work.
  • Page 69: D2D Dynamic Deduplication

    • Limited to EVA & XP & LHN (future) • Snapshots could take up large amount of EVA and XP space depending on frequency of snapshots at extra cost. • Note that to do the ZDB backup of VM volumes it is a requirement that the VM volumes use the VMware raw device mappings (RDM) feature.
  • Page 70: Dynamic Deduplication Implementation

    Hash values are stored in an index that is referenced when subsequent backups are performed. When data generates a hash value that already exists in the index, the data is not stored a second time. Rather, an entry with the hash value is simply added to the “recipe file” for that backup session. Over the course of many backups, numerous instances generating the same hash values occur.
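
The index and recipe file mechanism described above can be sketched in a few lines of Python. This is a simplified illustration, not the D2D implementation: the fixed 4 KB chunk size, the use of SHA-1 from hashlib, and the class name DedupStore are all assumptions made for the example, and an in-memory dictionary stands in for the on-disk index.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for this example


class DedupStore:
    """Toy deduplicating store: one index of unique chunks, one recipe per backup."""

    def __init__(self):
        # Maps hash value -> chunk data (stands in for the chunk's location on disk).
        self.index = {}

    def backup(self, data: bytes) -> list:
        """Store a backup stream and return its recipe (ordered list of hash values)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha1(chunk).hexdigest()
            if digest not in self.index:
                # New data: store the chunk once.
                self.index[digest] = chunk
            # Known or new, the recipe only records the hash value.
            recipe.append(digest)
        return recipe


store = DedupStore()
first = store.backup(b"A" * 8192 + b"B" * 4096)   # chunks: A, A, B
second = store.backup(b"A" * 8192 + b"C" * 4096)  # chunks: A, A, C
print(len(first), len(second), len(store.index))  # prints "3 3 3"
```

Six chunks were backed up across the two sessions, but only three unique chunks are held in the index; each session keeps its own three-entry recipe.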
  • Page 71: Restoring Data

    Figure 26 D2D Configuration The current deduplication ratio for the D2D device is calculated and displayed in the GUI. This figure updates automatically. See Figure Figure 27 Dynamic Deduplication Ratio The D2D device fully emulates a tape library, including its drive-type and supported compression. Dynamic deduplication can be used in combination with data compression for even greater storage savings.
  • Page 72: Housekeeping

    Figure 28 Restoring from a Recipe File The first hash in the recipe file is located in the index, which provides the location on disk of the original chunk of data. The original chunk is found and returned to the restore stream. The D2D moves on to the second hash, and repeats this process for all subsequent hash values in the recipe file until the entire file is returned to the restore stream.
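
The restore walk through the recipe file can be illustrated in the same way. In this sketch the index and recipe are built by hand, the helper names hash_of and restore are hypothetical, and SHA-1 is again an assumed hash function; an in-memory dictionary stands in for the on-disk chunk store.

```python
import hashlib


def hash_of(chunk: bytes) -> str:
    """Hash a chunk (SHA-1 is an assumption for this example)."""
    return hashlib.sha1(chunk).hexdigest()


# Index: hash value -> original chunk of data.
chunks = [b"header--", b"payload-", b"trailer-"]
index = {hash_of(c): c for c in chunks}

# Recipe for one backup session; the middle chunk occurs twice but is stored once.
recipe = [hash_of(chunks[0]), hash_of(chunks[1]), hash_of(chunks[1]), hash_of(chunks[2])]


def restore(recipe, index) -> bytes:
    """Look up each hash in order and append its chunk to the restore stream."""
    return b"".join(index[digest] for digest in recipe)


restored = restore(recipe, index)
print(restored)
```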
  • Page 73: D2D Replication

    Configuring backup rotation schemes correctly is very important to ensure the maximum efficiency of the product by reducing the amount of housekeeping required. See Optimizing Rotation Scheme to Reduce Housekeeping. D2D Replication The D2D replication technology (which leverages Dynamic deduplication) offers: •...
  • Page 74: Licensing

    Licensing On HP D2D 2500/4000/4100 Backup Systems, Dynamic deduplication is included as standard. Replication is a licensable feature at extra cost. Only one license is required for the replication target. Implementing the Initialization Process Before D2D replication can take place the reference data (the first full backup cycle performed with deduplication enabled) needs to be sent to the target.
  • Page 75 • Option 2: Use physical tape initialization. Physical tape library access is required at both sites. Export tapes from the source, import tapes into the target. During the import, the deduplication store is automatically created. Data is exported in backup software format. Initialization Co-location and WAN Replication Configuring Replication...
  • Page 76: Replication Setup

    Replication Setup Before you set up the D2D for replication it is important to understand the concept of slot mapping, because this is how replication is administered. It is also important to understand your cartridge rotation scheme because the replication is all predicated on knowing that weekly backups go to Tape X in Slot Y, that daily incrementals go to Tape A in Slot B and so on.
  • Page 77: Slot Mappings

    Figure 30 Slot Mappings • A “slot mapping collection” is any number of slots within a source library that are configured to replicate to a target library. • There is a maximum of one slot mapping collection per source library (you can add slots to mapping after initial creation).
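
The slot mapping rules above can be modelled as a small data structure. This is a sketch for illustration only: the class and method names are hypothetical and do not correspond to the D2D GUI or to any HP API. It enforces the one rule stated above, that a source library has at most one slot mapping collection, and allows slots to be added after creation.

```python
from dataclasses import dataclass, field


@dataclass
class SlotMappingCollection:
    """Slots in one source library mapped to slots in one target library."""
    name: str
    source_library: str
    target_library: str
    mappings: dict = field(default_factory=dict)  # source slot -> target slot

    def add_slot(self, source_slot: int, target_slot: int) -> None:
        # Slots can be added to the mapping after initial creation.
        self.mappings[source_slot] = target_slot


class ReplicationConfig:
    """Holds at most one slot mapping collection per source library."""

    def __init__(self):
        self.collections = {}  # source library name -> SlotMappingCollection

    def create_collection(self, name, source_library, target_library):
        if source_library in self.collections:
            raise ValueError(f"{source_library} already has a slot mapping collection")
        collection = SlotMappingCollection(name, source_library, target_library)
        self.collections[source_library] = collection
        return collection


config = ReplicationConfig()
weekly = config.create_collection("weekly", "source-lib1", "target-lib2")
for slot in range(1, 5):
    weekly.add_slot(slot, slot)  # 1:1 mapping of four slots
print(weekly.mappings)
```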
  • Page 78: Example Schematic

    Configuring Replication The following steps will configure D2D replication. If using tape initialization, the initialization process has to happen before the replication is configured. Prior to configuring replication, you must enter the replication license on the target device. Use the diagram below when reviewing the steps for configuring replication. This example uses a D2D2500 source library at IP 192.168.0.44 and a D2D4000 Fibre Channel target library at 192.168.0.207.
  • Page 79 Start the replication wizard. Select the Replication tab in D2D GUI and then the Mapping Configuration tab. The wizard can then be initiated. All the configuration takes place on the source library. The only action necessary on the target library is to configure it with an IP address. Note also that this is the starting point for the recovery wizard.
  • Page 80 Assign target slots. Select Next. Use part of an existing library on the target or create a new library on the target directly from the source for the purpose of replication. In this case map four slots from the local (source) library onto target Lib 2 on the D2D4000 which has already been created.
  • Page 81 Configure slot mappings. Set up the slot mappings from the source library slots to the target-lib2 slots and give the slot mappings a name. Provide a name for the slot mapping. Select Apply. Notice how the virtual bar codes at both the source and the target are displayed. In this case there is a 1:1 mapping from the source onto the target replication library using the drop down lists.
  • Page 82: Reporting

    Configure throttling. Configure the necessary Blackout windows and Bandwidth throttling settings unique to your requirements. This example does not allow replication between 08:00 and 18:00 during the week. Select Finish. Note in this example the bandwidth is not limited so there is 100% utilization of the available link.
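The blackout window in this example can be modelled as a simple time check. The following is a minimal sketch, not the D2D implementation: the function name is hypothetical, and "during the week" is assumed to mean Monday to Friday.

```python
from datetime import datetime, time

# Hypothetical model of the example blackout window: no replication
# between 08:00 and 18:00 on weekdays (assumed to be Monday-Friday).
BLACKOUT_START = time(8, 0)
BLACKOUT_END = time(18, 0)


def replication_allowed(moment: datetime) -> bool:
    """Return True if replication may run at the given moment."""
    is_weekday = moment.weekday() < 5  # Monday=0 ... Friday=4
    in_window = BLACKOUT_START <= moment.time() < BLACKOUT_END
    return not (is_weekday and in_window)
```

A check like this is useful when estimating how many hours per day are actually available for replication before sizing the link.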
  • Page 83 Source Device Reporting From a source perspective, it is important to know if your cartridges have replicated successfully. From each source device on the replication tab there is an “Event History” tab which displays all the replication activity concerning that source in a time stamped Event History log. For the source library configured in the example above, you can see the stages and times as each of the four mapped cartridges is replicated.
  • Page 84: Design Considerations

    Design Considerations Because every enterprise backup environment is different, there is no single answer for optimizing replication. This section discusses replication design considerations to help you optimize replication in your environment. Link Sizing Knowing the rate of change of data on your systems and hence being able to size the replication link efficiently is the key to a successful implementation.
  • Page 85: Example 1: 100 Gb Virtual Cartridge Replication Over A 2 Mb/Sec Link With

    Figure 32 Example 1: 100 GB Virtual Cartridge Replication over a 2 Mb/sec Link with 100% Bandwidth Utilization Calculating data to replicate: • 100 GB daily backup to replicate within a 12–hour fixed window (between 20:00 and 08:00). • 1% daily change rate = 0.66 GB compressed •...
  • Page 86: Example 2: 100 Gb Virtual Cartridge Replication Over A 2 Mb/Sec Link With

    Figure 33 Example 2: 100 GB Virtual Cartridge Replication over a 2 Mb/sec Link with 25% Bandwidth Utilization Calculating data to replicate: • The same data calculations as the example in Figure • The total data to transmit is 2.16 GB. Results: •...
  • Page 87: Example 4: 500 Gb Virtual Cartridge Replication Over A 10 Mbit/Sec Link

    Calculating data to replicate: • 500 GB daily backup to replicate within a 12–hour fixed window (between 20:00 and 08:00). • 1% daily change rate = 3.3 GB compressed • Overhead (manifest, imperfect match, etc.) is 1.5% = 7.5 GB •...
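The arithmetic behind these link sizing examples can be sketched as follows. This is an illustrative estimate only: the 1.5:1 compression ratio is inferred from the quoted "1% daily change rate = 3.3 GB compressed", decimal units are assumed (1 GB = 8000 megabits), and protocol overhead on the link is ignored. The function names are hypothetical.

```python
def payload_gb(backup_gb, change_rate=0.01, compression_ratio=1.5,
               overhead_rate=0.015):
    """Estimate the data to replicate for one daily backup, in GB."""
    # Changed data is compressed before it is sent (ratio is an assumption).
    changed = backup_gb * change_rate / compression_ratio
    # Overhead (manifest, imperfect match, etc.) as a fraction of the backup.
    overhead = backup_gb * overhead_rate
    return changed + overhead


def transfer_hours(data_gb, link_mbit_per_s, utilization=1.0):
    """Estimate the transfer time in hours over a link of the given speed."""
    megabits = data_gb * 1000 * 8
    seconds = megabits / (link_mbit_per_s * utilization)
    return seconds / 3600


# 500 GB daily backup over a 10 Mbit/sec link at full utilization.
data = payload_gb(500)
print(f"{data:.2f} GB to replicate, {transfer_hours(data, 10):.1f} hours")
```

Lowering `utilization` (for example to 0.25) shows how bandwidth throttling stretches the replication time, which can then be compared against the 12-hour window.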
  • Page 88: Many-To-One Example Using Mixed Wan Link Speeds

    Figure 36 Many-to-One Example using Mixed WAN Link Speeds Here are four remote sites replicating 100 GB and 500 GB backups daily into a central D2D4000. Note also that the central D2D4000 has 2 x 1 GbE connections from the central router for link aggregation.
  • Page 89: Sample Initialization And Replication Times

    Table 7 shows typical ongoing replication times for various link speeds, and how long the initialization process takes depending on whether initialization is done via co-location or over the WAN link. Use this table as a reference guide. Table 7 Sample Initialization and Replication Times Initial Link Backup...
  • Page 90: Telco Provisioning Overview Of Inter-Site Links

    Initial Link Backup Initial. Initial. Ongoing Limiting Ongoing Limiting backup speed in time at time via time via replica- factor in replica- factor in size in Mb/sec WAN in co-loca- tion time replica- tion time replica- MB/sec hours tion in in hours tion in hours...
  • Page 91 • Resilience dependent on business criticality of replication. Even with the most basic package (single line unprotected) most Telcos will guarantee 99.9% availability. Above this you can opt for master and standby paths (only one active at a time) that link to a single NTE (Network Termination Equipment) at your premises.
  • Page 92: Some Idea Of Telco Costs

    Figure 37 A Basic WAN using MPLS Some Idea of Telco Costs It is very difficult to generalize on typical link costs because of the variables listed above, but as a practical example a major UK Telco operator quoted costs for links at various speeds for HP Sites in Bristol, UK and Warrington, UK, for a distance of around 250 km.
  • Page 93: Telco Terminology And Branding For Wan Services

    Table 9 Prices of a Major US Telco for HP Facilities in the Houston, Texas Area Installation Cost (US $) Rental per year (US $) Average per year (over Bandwidth Mbit/sec 3 years) (US $) 2000 11400 12066 2000 18000 18666 3000 36000...
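The "Average per year (over 3 years)" column appears to be the annual rental plus the installation cost spread over three years. This relationship is inferred from the first two visible rows (2000, 11400, 12066 and 2000, 18000, 18666), not stated in the source; a minimal sketch with a hypothetical function name:

```python
def average_annual_cost(installation, rental_per_year, years=3):
    """Annual rental plus the one-off installation cost spread over the term."""
    return rental_per_year + installation / years


# First two rows of Table 9 (installation cost, rental per year), in US $.
for install, rental in [(2000, 11400), (2000, 18000)]:
    print(int(average_annual_cost(install, rental)))
```

The same calculation can be used to compare quotes with different contract lengths by changing `years`.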
  • Page 94: Sample British Telcom Services And Speeds

    This is a flexible bandwidth solution with speeds between 64 Kbps and 1 Gbps. The router is provided by BT and is designed primarily for internet access. The connection is typically to a BT point of presence (POP), and optional features such as failover and load balancing are available. The connection to the BT POP would be copper wire for the lower speeds (SDSL type connection) (64 Kbps, 2 Mbps) and optical fibre at the higher speeds.
  • Page 95: D2D Replication Data Recovery Options

    D2D Replication Data Recovery Options Replication on D2D enables easier recovery of data in the event of a major site disaster. The data is not instantly available but has to be recovered through a standard restore process using a backup application.
  • Page 96 of reverse replication is the same as if initializing the device at the remote site and so depends on the link speed and amount of critical data to be reverse replicated. Once the server is recovered and able to perform backups again, configure replication back to the target device in the data center all over again.
  • Page 97 The recovery wizard will then provide slot mappings. Select the required tapes which were last replicated prior to the disaster at remote site C. Select Adopt Slot Mapping and then Next. Choose the target cartridges you wish to be reverse replicated and select Apply. The system displays a message indicating that reverse replication has been established.
  • Page 98: Reverse Tape Initialization On The D2D

    With reverse replication the deduplication store at the source has to be repopulated again from the beginning so reverse replication will take longer than “normal replication” after initialization. After reverse replication, the normal source to target relationship is reestablished automatically. Further replication will take place from source to target in the normal manner.
  • Page 99: Creating Archive Tapes From The Target

    Figure 39 Recovery Options — Physical Tape Creating Archive Tapes from the Target Even with replication removing the need for physical tape offsite, there are still many users who wish to use physical tape for archive or test recoveries, etc. An archive tape will have the following differences from the original backup: •...
  • Page 100: Creating Archive Tapes From The Target

    Figure 40 Creating Archive Tapes from the Target In order to have the target backup application copy from the replicated cartridges to a physical tape library, it must be “taught” what is on the replicated cartridges. This can be done automatically as shown in Figure 40 by setting the target virtual library to “read-only”...
  • Page 101: Virtual Library Systems

    5 Virtual Library Systems VLS Devices VLS Defined The HP StorageWorks VLS12000 Gateway, the VLS6000–series, and the VLS9000–series are RAID disk-based SAN backup devices. The Gateway platform and the VLS6000 and VLS9000–series devices emulate physical tape libraries, allowing you to perform disk-to-virtual tape (disk-to-disk) backups using your existing backup applications.
  • Page 102: Prime Environments

    libraries as physical libraries on the SAN. Because you can create many more virtual libraries and drives than you have physical tape drives, many more SAN-based backups can run concurrently from the application servers, reducing the aggregate backup window. After the backups are complete, the automigration feature or the data protection software can migrate backup data from the virtual media to physical tape, for off-site disaster protection or long-term archival.
  • Page 103: Vls9000-Series Configurations

    • With VLS12000 Gateway, VLS6600, and VLS9000, data compression is provided by a hardware compression card that does not impede performance. • The VLS12000 Gateway and the VLS9000 provide high reliability, dual paths between the node and the storage pools, and redundant power supplies and cooling. •...
  • Page 104: Vls Technical Specifications

    Figure 42 Full VLS9000 System VLS Technical Specifications Table 13 VLS9000 series Multi-node Maximum Capacity by Configuration with 20 port Connectivity Kit and 40 TB Arrays (2:1 Compression, Non-deduplication) 1 node 2 nodes 3 nodes 4 nodes 5 nodes 1 array 80 TB 2 arrays 160 TB...
  • Page 105: Kit And 40 Tb Arrays (2:1 Compression, Non-Deduplication)

    1 node 2 nodes 3 nodes 4 nodes 5 nodes 8 arrays Table 14 VLS9000 series Multi-node Maximum Capacity by Configuration with 32 port Connectivity Kit and 40 TB Arrays (2:1 Compression, Non-deduplication) 1 node 2 nodes 3 nodes 4 nodes 5 nodes 6 nodes 7 nodes...
  • Page 106: Vls9000–Series Backup Performance By Configuration

    1 node 2 nodes 3 nodes 4 nodes 5 nodes 6 nodes 7 nodes 8 nodes 8 arrays 640 TB 640 TB 640 TB 640 TB 640 TB 9 arrays 720 TB 720 TB 720 TB 720 TB 10 arrays 800 TB 800 TB 800 TB...
  • Page 107: Vls6000–Series Performance By Configuration, 4 Gb Models

    VLS6218 VLS6227 VLS6636 VLS6653 Maximum usable capacity (2:1 compression) 56.8 TB 61.2 TB 113.6 TB 122.4 TB Table 18 VLS6000 series Performance by Configuration, 4 GB Models VLS6200 VLS6600 with compression disabled with uncompressible data with 2:1 compressible data 1 disk enclosure 0.4 TB/hr (100 MB/s) 2 disk enclosures 0.7 TB/hr (200 MB/s)
  • Page 108: Vls12000 Eva Gateway Technical Specifications

    Table 19 VLS12000 EVA Gateway Technical Specifications VLS12000 EVA Gateway Usable capacity for base (2–node) configuration 50 TB Maximum usable capacity 512 TB and 1 PB (with hardware compression) Number of virtual library nodes 2 to 8 Number of virtual tape libraries per node 1 to 16 Number of virtual tape drives per node 1 to 128...
  • Page 109: Vls12000 Eva Gateway Backup Throughput Without Deduplication

    Number of VLS12000 EVA Gateway Nodes with uncompressible data or deduplication enabled MB/s MB/s MB/s MB/s MB/s MB/s MB/s Number of EVA6000 or MB/s MB/s MB/s MB/s MB/s MB/s MB/s EVA6100 or EVA4400 Arrays 1080 1200 1200 1200 1200 1200 1200 (assumes 112 MB/s...
  • Page 110 Number of VLS12000 EVA Gateway Nodes with 2:1 compressible data or deduplication disabled 1200 1200 1200 1200 1200 MB/s MB/s MB/s MB/s MB/s 1200 1400 1400 1400 1400 1400 MB/s MB/s MB/s MB/s MB/s MB/s 1200 1600 1600 1600 1600 1600 1600 MB/s...
  • Page 111: How It Works

    Number of VLS12000 EVA Gateway Nodes with 2:1 compressible data or deduplication disabled 1200 1800 2400 3000 3600 4000 4000 MB/s MB/s MB/s MB/s MB/s MB/s MB/s 1200 1800 2400 3000 3600 4200 4800 MB/s MB/s MB/s MB/s MB/s MB/s MB/s How it Works VLS Scalability...
  • Page 112: Internal Architecture Of The Vls (Vls9000 Shown)

    Figure 43 Internal Architecture of the VLS (VLS9000 Shown) This allows any cartridge to be loaded into any virtual tape drive on any node, so the virtual drives in a library can be on different nodes from the virtual library robot. Therefore, the virtual drives for a library can be configured across multiple nodes as shown in Figure 44, and as the device is scaled...
  • Page 113: Vls Automatic Performance Load Balancing

    Figure 44 Virtual Drives across Multiple Nodes VLS Automatic Performance Load Balancing The VLS always “thin provisions” its virtual cartridges and dynamically load balances all the incoming new backup data across the available back-end arrays to ensure that there are never any performance hotspots.
  • Page 114: Vls Warm Failover

    Figure 45 Storage Pool Load Balancing on the VLS As Figure 45 shows, the VLS storage pool: • Provides a clustered virtual cartridge file system across one or more arrays. • Automatically load-balances incoming backup data across all available array LUNs (in 32 MB extents).
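To illustrate what spreading data across LUNs in fixed-size extents means, here is a minimal sketch. The round-robin placement is an assumption made for illustration; the source states only that data is balanced across all available array LUNs in 32 MB extents, not how the VLS chooses a LUN. All names are hypothetical.

```python
from itertools import cycle

EXTENT_MB = 32  # extent size quoted for the VLS storage pool


def place_extents(backup_mb, luns):
    """Count how many extents land on each LUN under round-robin placement."""
    counts = {lun: 0 for lun in luns}
    n_extents = -(-backup_mb // EXTENT_MB)  # ceiling division
    # Assumed policy: hand out extents to the LUNs in turn.
    for _, lun in zip(range(n_extents), cycle(luns)):
        counts[lun] += 1
    return counts


print(place_extents(320, ["lun0", "lun1", "lun2", "lun3"]))
```

Because each extent is small relative to a backup stream, even a simple placement policy keeps the per-LUN load within one extent of even, which is why no single array becomes a hotspot.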
  • Page 115: Implementation

    • The VLS will automatically save, within 1 hour, the current configuration and licenses to a hidden virtual cartridge stored on the back-end disk arrays. • In the event of a node0 disaster (node failed and was replaced with a new one, loss of node0 operating system, Quick Restore of new operating system, etc.) the node0 recovery process is: •...
  • Page 116 • Select a storage pooling strategy for the VLS9000 and VLS12000. Identify which pools are used for which virtual libraries. This could be based on separating cartridges across virtual libraries, or decreasing the impact of array failure. See Storage Pooling. If you are planning to use deduplication, note that this currently supports only one storage pool per VLS.
  • Page 117: Capacity Licensing

    • Set the tape block size to 256 KB for all VLS tape drives. • Disable multiplexing/interleaving on all backup jobs to the VLS. • Identify clients on the SAN that are currently performing LAN-based backup but could be converted to LAN-free backup to improve performance.
  • Page 118: Virtual Libraries/Drives/Cartridge Configuration

    VLS model Capacity upgrade license VLS9000–series Enables an additional 30 or 40 TB VLS9000 array per license. VLS12000–series Enables an additional 2 TB (1 LUN) per license. Virtual Libraries/drives/cartridge Configuration The VLS provides a very flexible virtual library configuration, allowing you to create many more virtual tape drives than a physical library could contain (and therefore run many more concurrent backup jobs), and to create virtual cartridges of any required size.
  • Page 119: Eva Preparation

    If you use LUN mapping and masking to assign specific virtual drives to specific backup media servers, make sure that the LUN numbering stays in sequence with no gaps. Instead of deleting tape drives, you can disable them. (You can present them to a new host in the future if required.) When creating virtual cartridges inside the virtual library, it is good practice to keep the virtual cartridge size smaller than the physical tape.
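A quick way to verify the "in sequence with no gaps" rule is to check the LUN numbers presented to each host. This is a minimal sketch with a hypothetical helper; it assumes numbering starts at 0, which the source does not state.

```python
def lun_gaps(lun_numbers):
    """Return the LUN numbers missing from the sequence 0..max."""
    present = set(lun_numbers)
    if not present:
        return []
    # Assumption: the sequence for a host starts at LUN 0.
    return [n for n in range(max(present) + 1) if n not in present]


print(lun_gaps([0, 1, 2, 3]))  # no gaps
print(lun_gaps([0, 1, 3]))     # LUN 2 is missing
```

Disabling a drive rather than deleting it keeps its LUN number in place, so a check like this continues to report no gaps.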
  • Page 120: Vls12000 Eva Gateway Connected To An Eva

    Figure 46 VLS12000 EVA Gateway Connected to an EVA • Ensure that the zoning configuration is complete and that storage ports 2 and 3 on each VLS node connect to different switches/fabrics or zones. The EVA controllers must also be connected to both switches/fabrics or zones.
  • Page 121: Storage Pooling

    • The EVA LUNs must be fully initialized and ready for use before attaching the VLS to the EVA. It can take several hours to initialize an entire EVA; therefore, HP advises you to create the LUNs in advance of the installation. •...
  • Page 122: Single Vs. Multiple Storage Pool Considerations

    Figure 47 Single vs. Multiple Storage Pool Considerations VLS9000 Storage Pooling With VLS9000, the user establishes a storage pool policy which defines how the arrays are pooled. VLS9000 then automatically creates storage pools based on the policy. Storage pools are defined in terms of whole arrays, which consist of one base disk array and three expansion disk arrays.
  • Page 123: Vls Design Considerations

    across one array. See Figure 48. If a fourth array is added, the VLS automatically adds it to the storage pool with only one array. Figure 48 VLS9000 Partially Populated Storage Pools The flexibility of the VLS9000 storage pooling effectively allows the user to partition the disk arrays to fit the user's environment.
  • Page 124 • If you use ‘x’% of the VLS to better fit into your backup window, what will that gain you in your backup window and is that the best use of your VLS? • You should do some backup tests to understand exactly how much the backups are sped up through the introduction of the VLS into the environment before you can make this estimate reasonably accurately.
  • Page 125 • Use RAID 5 configuration on the EVA in order to get faster sequential access and better capacity efficiency. Sizing/implementation Examples There are no standard, generic rules for sizing your environment but there are general parameters that you can use to help you understand your capacity needs. Example 1 VLS Sizing for Performance and Capacity Assume that you have 12 servers being backed up across a SAN.
  • Page 126: Testing Backups To Tape Vs. Backups To A Vls

    Figure 49 Testing Backups to Tape vs. Backups to a VLS If you run two tests, in both cases backing up approximately 13 GB of data from the five servers, the results are as follows: • Backing up to physical tape takes 6 minutes 41 seconds •...
  • Page 127: Vls Blueprints

    VLS Blueprints The following blueprints of VLS virtual tape libraries with deduplication and replication start from specifying company requirements, then defining the HP blueprint for the solution, and finally defining any solution caveats or ISV dependencies associated with the solution. This will help you make informed decisions and allow you to quickly assess areas of concern and possible implementations.
  • Page 128: Large Scale Backup Consolidation (No Deduplication)

    • Cost effective • Easy disaster recovery solution if physical tape copy is regularly off-sited Solution trade-offs: Take care to size the solution correctly. VLS9000 entry level scales to 40 TB before you need to purchase additional connectivity kits. Other information: HP strongly recommends using the backup application to perform the copy from the VLS to physical tape.
  • Page 129: Managing Data Retention And Reducing Backup Storage Costs

    Solution advantages: • Massive scalability in terms of capacity and performance, “hot add” capability for adding capacity and nodes • High availability features in nodes and back-end disk arrays • Flexible emulations in terms of number of libraries, drives, slots, and media sizes •...
  • Page 130: Full Enterprise Backup And Disaster Recovery Capability

    Figure 52 Using VLS to Manage Data Retention and Reduce Backup Storage Costs Recovery options: All recovery from VLS with deduplication is fast, especially the last full backup (typically 90% of restores are from the last backup). Copying from the VLS to physical tape is done by expanding deduplicated data so full content is copied to physical tape.
  • Page 131: Using Vls With Full Enterprise Backup And Disaster Recovery Capability

    Device configuration highlights: Setting up replication is done by creating echo copy pools. Identical cartridges are kept on source and target devices and report “up to date” when they are fully synchronized. Backup application configuration highlights: Two separate master servers are required with one at each site.
  • Page 132: Multi-Data Center Replication To A Common Disaster Recovery Site

    • A one-time initialization process is required the first time replication runs so the source and target can synchronize. This can be done by co-location, replication over low bandwidth link, or by using physical tape transport – export and import. Other information: HP strongly recommends using the backup application to perform the copy from VLS to physical tape.
  • Page 133: Vmware Backup Environment With Esxi

    • A replacement VLS and servers can be installed at the primary site and data can be reverse rep- licated from the disaster recovery site over the low bandwidth link, although this is not recommended for high volumes of data (TBs) because of the time to replicate data wholesale over a relatively small link speed.
  • Page 134: Vmware Backup Environment With Esx4 And Vcb-Based Backups

    Figure 55 Using VLS with VMWare using Backup Agents Recovery options: Using the separate server the restore can take place directly from the VLS device over the network to the VMWare machines. If the data no longer exists on the VLS but has been archived to tape then it can be restored directly from tape to VMWare machines.
  • Page 135: Using Vls With Vmware Using Vcb

    server has to have sufficient storage to store the snapshots if image-based backups are being used because the images are copied from the primary storage to the proxy server for these types of backup. ISVs: HP Data Protector and Netbackup only at this time. (Tested by HP enterprise backup solutions lab in depth.) Device configuration highlights: Separate virtual library created for each VMWare machine.
  • Page 136: Larger Vmware Environment With Esx4 And Hp Data Protector Zdb Ir Integration With Vmware Consolidated Backup

    • Disaster recovery of ESX environments • Easy transfer to physical tape if required, if backup application used • With HP Data Protector and most other backup software the following approaches are possible: • Online and offline snapshots of VMs. File level backup and recovery of data in the VMs. •...
  • Page 137: Using Vls In Larger Vmware Environments With Hp Data Protector Zdb Ir Backup

    Figure 57 Using VLS in Larger VMWare Environments with HP Data Protector ZDB IR Backup Recovery options: • Two step restore process that provides full recovery of VM and data • Recover the VM image from VCB backup • Required only in case the whole VM is lost or restore to a new ESX server •...
  • Page 138: Large Enterprise Cross-Site Backup With Deduplication And Inbuilt Disaster Recovery

    • Data Protector Integration with VMware VCB provides the first layer of protection • Provides snapshots that can be used for restoring VM • Needed only when VM configuration changes Application backup • Provides the second layer that enables protection of application data •...
  • Page 139: Combining D2D And Vls In An End-To-End Solution For Robo/Regional Dcs And Main Data Center Using Hp Data Protector

    Recovery options: Depending on retention time on the VLS, data can be recovered directly from VLS on the remote site or if that retention has expired then a copy should be archived on physical tape which is again directly available from the remote site over the extended SAN. Solution advantages: •...
  • Page 140: Vls Automigration

    Figure 59 Using D2D and VLS in ROBO/RDC/MDC Environments Recovery options: As with all replication solutions using D2D and VLS, under normal conditions data is recoverable from the local D2D or VLS. If the retention period has expired, the data may be on physical tape at that site.
  • Page 141: Echo Copy Concepts

    Echo Copy Concepts Echo copy essentially acts as a transparent disk cache to the physical library, so that echo copy jobs can be performed at times other than the peak backup window. Once automigration is set up, echo copy operations are automatic and seamless to the user. During the normal backup window, the backup application writes to virtual cartridges.
  • Page 142: Automigration Policy

    Figure 60 Linked Media When virtual tapes are ejected by a backup application, or when tapes are ejected in the destination library, the matching virtual tapes are moved to the device's firesafe, a temporary holding place. This ensures the device does not fill up its disk space with older cartridges. When virtual media is ejected, the matching destination tapes are also automatically ejected.
  • Page 143: Automigration Setup

    are no longer present. Therefore, to restore data from virtual cartridges in the firesafe, the user must move the virtual cartridges out of the firesafe first. See Restoring from Automigration Media. • Start time – the time at which the echo copy job begins. HP recommends that copies are scheduled within a different time window from other backup activities.
  • Page 144: Automigration Media Life Cycle

    Figure 62 Automigration Media Life Cycle This section presents an example, in which a user who works for Company X did everything described in the previous section to set up automigration. See Automigration Setup. The numbered list that follows coincides with Figure 62, which illustrates the stages in the life cycle of automigration virtual and physical media.
  • Page 145: Design Considerations

    user in this example set the retention time for seven days. Therefore, if there is a need to do a restore within the 7–day retention period, the user can simply move a cartridge from the firesafe, and use the backup application to do the restore. When the seven-day retention period passes, the virtual cartridges are deleted, at which point a physical tape is needed for restore operations.
  • Page 146: Multiple Virtual Libraries

    Figure 63 Multiple Virtual Libraries In this use case, the physical tape library slots are split into separate slot ranges, with each slot range being assigned to a different virtual library. This allows separate backup servers to back up to their own dedicated virtual libraries.
  • Page 147: Sizing The Tape Library

    Figure 64 Shared Virtual Library In this scenario, there are one or more physical tape libraries that need to be mapped to a single, shared virtual library. The physical library or libraries can be bigger or smaller than the available disk space on the virtual library.
  • Page 148: Restoring From Automigration Media

    nodes in the VLS. If the copy window does not need maximum copy performance, you can use even fewer tape drives per node. For example, a 4-node VLS9000 device that needs to automigrate 64 TB every night within an eight hour copy window (that runs after the backups are complete) requires an aggregate copy performance of 2200 MB/sec.
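The aggregate copy performance quoted in this example follows from dividing the nightly data volume by the copy window. A minimal sketch, assuming decimal units (1 TB = 1,000,000 MB) and a hypothetical function name:

```python
def required_copy_rate_mb_s(data_tb, window_hours):
    """Aggregate copy rate in MB/sec needed to move data_tb within the window."""
    return data_tb * 1_000_000 / (window_hours * 3600)


rate = required_copy_rate_mb_s(64, 8)
print(f"aggregate: {rate:.0f} MB/sec, per node (4 nodes): {rate / 4:.0f} MB/sec")
```

For 64 TB in eight hours this gives about 2222 MB/sec, in line with the roughly 2200 MB/sec stated above; dividing by the node count gives the rate each node must sustain.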
  • Page 149: How It Works

    • Delivers fastest restore because it maintains a complete copy of the most recent backup data. • Older data can be restored faster because it is still on the VLS – there is no need to locate physical media, load it into a drive, and perform the restore in the traditional manner. •...
  • Page 150: Backup Analysis

    Item Description The VLS analyzes the backup as it goes through memory. The backup application metadata is stripped and used to understand the format of the tape. The metadata database creates a map of the locations of the logical backup data. Figure 66 Backup Analysis In Figure 67, two instances of File A are shown, one from Session 1, and one from Session 2.
  • Page 151 the names of the backup jobs the server being backed up the type of backup – full or incremental the type of data in the backup – files, database, etc. The deduplication software then queries the metadata database to find an equivalent older version of the same backup job to compare it against the new backup.
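The lookup of an equivalent older version can be pictured as a query over the metadata fields listed above. The following is an illustrative sketch only, not the VLS deduplication software: the record layout, the matching criteria (same job name, server, and data type), and the use of a session number for ordering are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupRecord:
    """Hypothetical metadata row for one backup session."""
    job_name: str
    server: str
    backup_type: str  # "full" or "incremental"
    data_type: str    # e.g. "files", "database"
    session: int      # larger means more recent


def find_previous_version(new: BackupRecord,
                          catalog: List[BackupRecord]) -> Optional[BackupRecord]:
    """Return the most recent older backup of the same job, if any."""
    candidates = [
        r for r in catalog
        if r.job_name == new.job_name
        and r.server == new.server
        and r.data_type == new.data_type
        and r.session < new.session
    ]
    return max(candidates, key=lambda r: r.session, default=None)
```

When no earlier version exists (for example, the first backup of a job), the lookup returns `None`, which corresponds to a backup that has nothing to be compared against yet.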
  • Page 152: Accelerated Deduplication Implementation

    Item Description Backup session 1 consists of files A and B. Duplicate data from A is replaced with a pointer to A'. The unique data from file A, as well as file B in session 1, is retained. The pointer from A to A' is readjusted to point to A” in session 3, the most recent instance of that data.
  • Page 153: Supported Backup Applications And Data Types

    • Install deduplication licenses • Configure the available deduplication options using Command View VLS A complete backup is necessary to initialize Accelerated deduplication, but data in subsequent sessions can then be deduplicated. See the HP StorageWorks Virtual Library System users guide for the complete setup procedures.
  • Page 154: Licensing

    space by deleting cartridges. In the 3.2 firmware this is disabled in the GUI, so use the VLS CLI to perform this. NOTE: Use only the CLI cartridge delete option to delete non-deduplicated cartridges. After deduplication is successfully enabled, ensure virtual cartridges are created correctly for optimum deduplication performance and for correct backup capacity: •...
  • Page 155: Migrating Your Existing Backup Data

    There are three licenses available, one for each VLS platform – VLS6000–series, VLS9000–series, and VLS12000 Gateway. See Table Table 24 Required Deduplication Licenses by Platform Platform Deduplication licenses required One license per MSA20 for 250 or 500 GB drives Two licenses per MSA20 for 750 GB drives VLS6000–series Two licenses per MSA20 for 1 TB drives One license per base unit...
  • Page 156: Handling The Device Out Of Capacity Condition

    Delta-diff in Process: the backup has identified another version of itself to difference against and is now running differencing to identify the duplicate data between the two versions. With multi-stream backups, this process may take multiple tries (going back to "Waiting for Next Backup" state each time) until the differencing locates the correct stream.
  • Page 157: Design Considerations

    Use the backup application to reformat/erase the cartridges identified by the Cartridge Utilization Report as good candidates to free up disk space. For example, in Netbackup you can expire and relabel the cartridges, in Data Protector you can recycle and format the cartridges, and in TSM and NetWorker you can relabel the cartridges.
  • Page 158: Capacity Sizing

    Capacity Sizing Several factors affect how much disk space is required for the desired retention scheme: • Data compression rate. • Daily change rate. • Retention policy, or the number of full and incremental backups retained. • “Scratch” space required for post-processing. Download the HP StorageWorks Sizer tool to help determine your capacity needs.
  • Page 159: Optimum Record Sizing

    X = node for backup and restore, Y = node for deduplication processing only. Note: the maximum deduplication configuration currently supports four nodes. Optimum Record Sizing You must configure your backup application to use a record size (i.e., tape block size) of 256 KB when using deduplication.
  • Page 160: Media Management

    Media Management To make the most of storage savings with Accelerated deduplication, consider the following: • Reclamation does not occur until a tape is full (and everything on the tape is either delta-differenced, disabled, or unsupported). • The deduplication software cannot deduplicate versions of a backup that are on the same cartridge; the versions are not deduplicated until a new version is written to a different virtual cartridge.
  • Page 161: Disabling Deduplication On Specific Backup Policies

    • Do not divide full backups and incremental backups across different job names. They must operate under a common backup job name. • If a client is removed from the backup application configuration (e.g., the client is decommissioned or renamed), you must set that client's backup policy/name to Deduplication Disabled in the VLS GUI.
  • Page 162: How Replication Fits Into The Deduplication Architecture

    Figure 70 How Replication Fits into the Deduplication Architecture As the backup takes place, a metadata database is created. Using the metadata compiled in step 1, the deduplication software running on the VLS nodes identifies which backups to compare against which similar backups in order to remove duplicate data.
  • Page 163: Replication Of Incremental Backups

    Figure 71 The Replication Sequence A backup is performed (and its metadata is stored in the database). The new backup is compared to the previous versions and the unique data in the new backup is identified. The source device performs space reclamation on the previous backups to eliminate the duplicate data, and concurrently the replication data (containing just the unique data and metadata) is transferred to the target device.
  • Page 164: Multi-Node Replication Scaling

    Multi-node Replication Scaling The HP VLS9000 and VLS12000 architectures allow formidable scaling in terms of capacity and performance. Supporting multiple nodes, the backup throughput, deduplication comparison, and the replication process can be load balanced across all the available nodes to ensure no bottlenecks to performance.
  • Page 165: Replication Preparation

    are required for more than 10 sources). In the active-active deployment, you would install one license per node in both the source and target devices (for all nodes in those devices). Replication Preparation Prior to configuring replication, ensure that deduplication is licensed and enabled on both the source and target devices and enter the replication licenses on the target device.
  • Page 166: Replication Setup

    replication traffic. However, in multi-node device configurations you must also assign TCP/IP addresses to every node (matching the static or DHCP mode on node0) in both the source and target VLS devices in order to balance the replication load across the entire device. (See Multi-node Replication Scaling for details.) The VLS will automatically balance the active replication jobs across all the nodes that have an external TCP/IP address configured.
  • Page 167: Vls Replication By Numbers

    Figure 74 VLS Replication by Numbers
  • Page 168 Create a replication target on the target device. On the target device, set up “Global LAN/WAN Target Settings” and then create a “LAN/WAN Target” with: • Name • Start and end slots • Maximum simultaneous transfers • Password • Target replication start window •...
  • Page 169 Configure the replication target’s window limits start time and window duration.
  • Page 170 Add “LAN/WAN Destination” to the source device (to link the source device to target device). On the source device, use “Manage LAN/WAN Library” and the relevant password to link the source and target devices together in a replication pair. You can see the target VLS, slots, and copy pools on the target from the source.
  • Page 171 Create an echo copy pool (including selecting whether to use tape initialization). On the source device, select the destination library and “Create echo copy pools.” In this context, the echo copy pool acts as an asynchronous mirror between the selected source virtual library and the target library.
  • Page 172: Implementing The Initialization Process

    Run the first full backup (to replicating media in the source echo copy pool). Only backups performed after the echo copy pool is created will be replicated. Any previously existing backups will not be replicated until their cartridges expire in the backup application and are overwritten by new backups.
  • Page 173: Vls Initialization Using Physical Tape

    Figure 75 VLS Initialization using Physical Tape This shows how using the export/import pools function within the VLS replication GUI creates a stack of tapes that can be used to populate the deduplication store at the target (destination library). To implement this way of tape transfer, you must select “Initialize via Tape Transport”...
  • Page 174 The tape export will stack the virtual cartridges to replicate (i.e., the cartridges that contain the full backup) onto the available physical tapes in the selected SAN destination library. The tape export GUI will show the export status to the tape handler to determine which physical tapes are ready for removal, whether new tapes need to be loaded, etc.
  • Page 175: Vls Initialization Using The Wan

    Figure 76 VLS Initialization using the WAN Co-location Initialization In circumstances where data centers are within a short distance of each other, it may be practical to co-locate the two VLS units at the same location and directly connect the GbE links from the VLS nodes by plugging both the source and target device external LAN ports into the same external LAN switch.
  • Page 176: Reporting

    Figure 77 VLS Initialization by Co-location Co-location is practical if replication is being installed along with VLS technology from day 1. If replication is being added to VLS already installed at different sites, WAN transfer is best for small quantities of data and tape transfer for larger quantities of data. Again, for many-to-one or active-active implementations, co-location is impractical and WAN transfer or tape transfer is best.
  • Page 177 Status of LAN/WAN Replication Connection You can check that the replication LAN/WAN connection between the source and target devices is online by viewing “LAN/WAN Replication Libraries” in the source device GUI. If the replication connection is offline this can be because: •...
  • Page 178 cartridges. For example, if you create 1000 cartridges in the replication target, it can take 40 minutes to create the matching 1000 cartridges on the source device and complete the “Adding Cartridge” operations for all 1000 cartridges. • Up to Date. The source virtual cartridge is fully synchronized with the matching target cartridge (i.e., the mirror is in sync).
  • Page 179 Target Device Replication Status On the target device GUI you can check the replication status of each target virtual cartridge by viewing “Slots” in a LAN/WAN replication target. The “Last Successful Echo Copy” field shows the date/time of the last successful replication operation. If the cartridge has never been replicated or is in the middle of replicating, the date/time is shown as “Unknown.”...
  • Page 180: Design Considerations

    On both the source and target devices there is also an option to view the “Job Summary” which provides a summary of either all outgoing jobs (source) or all incoming (target) jobs. The summary includes how many replication jobs were successful, how many jobs were rescheduled (e.g., due to cartridges being appended while they were actively replicating), and how many jobs failed.
  • Page 181: Dividing The Backup Jobs By Priority Level

    Figure 78 Dividing the Backup Jobs by Priority Level On the target device you also have the ability to limit replication traffic beyond limiting the replication time windows. This can be done by controlling how many concurrent replication jobs are allowed per replication target, and optionally controlling the maximum Mbytes/sec throughput of a single replication job.
  • Page 182: Link Sizing

    Network Latency | Job Concurrency Required to Saturate Link | Aggregate Mbytes/sec
    50ms | | 80 MB/sec
    100ms | | 80 MB/sec
    200ms | | 80 MB/sec
    500ms | | 52 MB/sec
    When you set the global per job Mbytes/sec limit on the target device, you can calculate the Mbytes/sec limit on each replication target by multiplying the per job limit by the maximum number of concurrent jobs set for that replication target.
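    The multiplication described above can be sketched as a one-line helper. The function name and the sample values (10 Mbytes/sec per job, 4 concurrent jobs) are illustrative assumptions, not VLS defaults.

```shell
#!/bin/bash
# Per-target throughput ceiling = per-job limit x maximum concurrent jobs.
# Both arguments are example values chosen for illustration.
target_limit_mbytes() {
    local per_job_limit=$1   # global per-job limit, in Mbytes/sec
    local max_jobs=$2        # maximum concurrent jobs set for this target
    echo $(( per_job_limit * max_jobs ))
}

# A 10 Mbytes/sec per-job limit with 4 concurrent jobs on one target
target_limit_mbytes 10 4
```

    Comparing this ceiling with the capacity of the inter-site link shows whether the concurrency setting alone could saturate it.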
  • Page 183: Vls Sizing Example 1

    NOTE: You should always factor in additional bandwidth to the calculated link size to take possible link downtime or replication initialization delays into account. If the replication falls behind for any reason (e.g., the link is down for 24 hours), you need more than the normal daily bandwidth to allow replication to “catch up.”...
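    One way to reason about the extra bandwidth is sketched below. It assumes a simple model that is not taken from the guide: the backlog from the missed days is spread evenly over a chosen number of catch-up days, decimal units are used (1 GB = 8000 Mbit), and protocol overhead is ignored. The function name and sample values are hypothetical.

```shell
#!/bin/bash
# Link speed needed to catch up after an outage (illustrative model only).
# Arguments: daily_gb  window_hours  missed_days  catchup_days
catchup_mbit() {
    awk -v gb="$1" -v win="$2" -v missed="$3" -v days="$4" 'BEGIN {
        # Normal requirement: one day of data inside one replication window
        normal = gb * 8000 / (win * 3600)
        # During catch-up the link carries the backlog plus the new daily data
        printf "%.1f\n", normal * (missed + days) / days
    }'
}

# 100 GB per day, 8-hour window, 1 missed day, caught up over 2 days
catchup_mbit 100 8 1 2
```

    With zero missed days the function returns the normal daily requirement, so the difference between the two figures is the headroom to budget for.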
  • Page 184: Vls Sizing Example 2

    • Required Link speed 40% of existing 200Mbit/sec link This example shows a 15 TB full database backup and 10% incremental backup of file system data to replicate daily between sites. There is already a 200Mbit/sec link between sites. The actual amount of data to replicate is 164 GB (representing 15 TB full database backup and 10% incremental of 5 TB File system backup).
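    As a rough cross-check of figures like these, the transfer time for a given amount of data over a share of a link can be estimated as follows. This is a sketch under stated assumptions: decimal units (1 GB = 8000 Mbit), a constant transfer rate, and no protocol overhead. The function name is hypothetical, and the replication window used in the guide's own example is not reproduced here.

```shell
#!/bin/bash
# Hours needed to move data over a fraction of a link (illustrative only).
# Arguments: data_gb  link_mbit_per_sec  share_pct
transfer_hours() {
    awk -v gb="$1" -v link="$2" -v pct="$3" 'BEGIN {
        usable = link * pct / 100        # usable bandwidth in Mbit/sec
        seconds = gb * 8000 / usable     # 1 GB taken as 8000 Mbit
        printf "%.1f\n", seconds / 3600
    }'
}

# 164 GB over 40% of a 200 Mbit/sec link
transfer_hours 164 200 40
```

    Under these assumptions the 164 GB in the example would take roughly 4.6 hours at 80 Mbit/sec.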
  • Page 185: Vls Sizing Example 3

    Figure 81 VLS Sizing Example 3 • Database Full Backup size 200TB – 1% change rate • File System Full backup Size 66.6 TB – This example replicates one of the incremental backups, size of 10% • Initial Replication Window 24 hours •...
  • Page 186: Vls Sizing Example 4

    Data types (3 DB:1 FS) & backup type | Full size in GB | Replicated data (compressed) in | Time available to replication/time to replicate | Link speed available or required in Mbit/sec | Combined link speed
    Database (Full) Assume fixed Assume fixed 50000 9.5 hours 1% change...
  • Page 187: Device Sizing

    • Each remote site could have different start time and replication window to ensure target device is not overloaded and best use is made of inter-site link speeds. Device Sizing When sizing a replication target device, the replicated virtual cartridges on the target device take up the same amount of disk space as the original cartridges on the source device, and they also require the same amount of working disk space for deduplication.
  • Page 188: Vls Recovery Options For A Data Center Rebuild

    Figure 83 VLS Recovery Options for a Data Center Rebuild This target virtual library can be used to restore directly onto new servers that can then be shipped to the replacement site, or to copy the data to physical tape which can be shipped to the replacement site and used to rebuild the new servers there.
  • Page 189: Restore The Vls Over The Lan/Wan

    Replication). One option would be to use the LUN Mapping feature in the VLS to hide the target virtual library from the backup application. But in the event of a disaster recovery, you can perform the sequence shown in Figure 84.
  • Page 190: Rebuilding Source Device And Re-Establishing Replication

    cartridges containing the last full backup and any incremental backups after this point. These can then be used by the backup application to import/restore from the source device’s virtual library to replacement servers. Rebuilding Source Device and Re-establishing Replication If the entire source VLS device was destroyed (i.e., you had a site disaster) and was replaced with a brand new blank device, then you must recreate the original virtual library configuration and replication configuration.
  • Page 191: Creating Archive Tapes From The Vls

    Figure 85 Creating Archive Tapes from the VLS In order to have the target backup application copy from the replicated cartridges to a physical tape library it needs to be “taught” what is on the replicated cartridges. This can be done automatically as shown in the diagram above, by using a new “ISV Import”...
  • Page 192: Vls Non-Deduplicated Replication

    Email Processing Example Script The following section details an example script to process the “ISV Import” email report from the VLS and convert it into a list of cartridges to be imported. The following steps show how this might be done in the target site (this example uses a Linux server client): •...
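    The filtering step can be sketched as follows. Only the “ISV~” line tag is taken from the guide (it appears in the NetBackup example); the field layout assumed here, “ISV~barcode~slot”, and the function name are hypothetical, so adjust the parsing to the actual report format before use.

```shell
#!/bin/bash
# Extract cartridge entries from an "ISV Import" email report on stdin.
# ASSUMPTION: each entry looks like "ISV~<barcode>~<slot>"; only the
# ISV~ tag itself comes from the guide.
list_cartridges() {
    grep '^ISV~' | while IFS='~' read -r _tag barcode slot; do
        # Print one "barcode slot" pair per cartridge for later import steps
        printf '%s %s\n' "$barcode" "$slot"
    done
}

# Sample report with one header line and two cartridge entries
printf 'Subject: report\nISV~AB0001~12\nISV~AB0002~13\n' | list_cartridges
```

    Keeping the parsing separate from the backup application's import command makes it possible to test the list before any cartridge is touched.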
  • Page 193: Replicating A Subset Of Virtual Cartridges - A Use Case

    NOTE: You cannot use device-to-device replication with the VLS9000 because its back-end storage ports are wired into its private SAN and cannot be connected to a public SAN. Figure 86 Site-to-site Replication • Using TCP/IP-based non-deduplicated replication You can configure an echo copy pool to a LAN/WAN destination library (which connects to a replication target on the target device).
  • Page 194: Detailed Backup Application Guidelines For Vls

    Create an echo copy pool for the VLS2 replication library, with the library on VLS1 as the source. Automigration will automatically create the replicating virtual cartridges on VLS1 in the spare slots configured in step 1. During the automigration window you created, backups already copied to replicating virtual cartridges on VLS1 will automatically be copied over to VLS2.
  • Page 195: Data Protector Deduplication Guidelines

    Data Protector Deduplication Guidelines In addition to the Data Protector general guidelines, the following guidelines apply to a VLS with deduplication enabled: • Patch all Windows clients: HP Data Protector on Windows clients requires a software fix to eliminate 56 KB file fragments that diminish deduplication performance on VLS and D2D products. The fix depends on your version of Data Protector: •...
  • Page 196: Data Protector Import Example Script

    ferent pools with different retention times for fulls vs. incrementals), but this means that the incremental backups will not be fully deduplicated because their backup job names will not match the full backup job names. To address this, the VLS firmware version 3.3.0 includes a new GUI option called “Edit Data Protector Configuration”...
  • Page 197: Symantec Netbackup

    for i in `seq 0 3`;
    do
      if [ ${DRVBSY[$i]} -eq 0 ]
      then
        /usr/omni/bin/omnimm -recycle $BARCODE
        /usr/omni/bin/omnimm -export $BARCODE
        /usr/omni/bin/omnimm -import ${DRVNAME[$i]} -slot $BCSLOT &
        DRVBSY[$i]=1
        break
      fi
    done
    if [ ${DRVBSY[0]} -eq 1 ] && [ ${DRVBSY[1]} -eq 1 ] && [ ${DRVBSY[2]} -eq 1 ] && [ ${DRVBSY[3]} -eq 1 ]
    then
      wait
      DRVBSY=( 0 0 0 0 )
    fi
  • Page 198: Netbackup Deduplication Guidelines

    NetBackup Deduplication Guidelines In addition to the NetBackup general guidelines, the following guidelines apply to a VLS with deduplication enabled: • NetBackup raw/image file backups: NetBackup raw file backups will only deduplicate if the differencing algorithm for the corresponding policy is set to Backup-level. If the policy for these files is set to File-level, these files will not be deduplicated.
  • Page 199: Ibm Tsm

    The actual script would process the cartridge list from stdin (identified by the “ISV” tag at the beginning of the line) and use the NetBackup CLI to trigger a tape import on the specified library. For example:
    #!/bin/bash
    #set -x
    #read from stdin
    CARTLIST=`grep ISV~ /dev/stdin`
    for CART in $CARTLIST...
  • Page 200: Tsm Deduplication Guidelines

    Command | Value | Notes
    TCPWINDOWSIZE | | Default
    BUFPOOLSIZE | 32768 | Default
    TXNGROUPMAX | | Default
    EXPINTERVAL | | Default = 24. When setting this to 0, use an administrative schedule to execute expiration at an appropriate time each day. A value of 0 means the expiration must be started with the EXPIre Inventory command.
  • Page 201 • Keep mount point: The Keep Mount Point setting on each node must be set to Yes using the command keepmp=yes. This command only works if there are no client sessions running when enabled. Update node * keepmp=yes Replace the asterisk (*) with a node name to update an individual node. •...
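    When only some nodes should be updated, the per-node commands can be generated from a list rather than typed by hand. The sketch below only prints the command text quoted above; the function name and node names are hypothetical, and how the commands are submitted to the TSM server is left to your administrative tooling.

```shell
#!/bin/bash
# Print one "update node <name> keepmp=yes" command per node name given.
# Node names are placeholders; nothing is sent to a TSM server here.
keepmp_commands() {
    local node
    for node in "$@"; do
        printf 'update node %s keepmp=yes\n' "$node"
    done
}

keepmp_commands NODE_A NODE_B
```

    Reviewing the generated list first helps confirm that no client sessions are running on the affected nodes before the setting is changed.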
  • Page 202 all objects in the primary storage pools. Data is stored in a copy storage pool when you back up active and inactive data in a primary storage pool. A copy storage pool uses only sequential-access storage (e.g., a tape or FILE device class). The following TSM copy methods will deduplicate successfully in the VLS: •...
  • Page 203: Tsm Useful Queries

    • Disable multiplexing in Oracle RMAN backups: In the VLS firmware version 3.3.0 (or higher) the scripting requirements for Oracle data files have been relaxed, so it is no longer mandatory to assign prefix naming conventions to Oracle data files to make sure the data is properly deduplicated.
  • Page 204: Emc Networker

    select cast(float(sum(bytes))/1024/1024/1024/1024 as dec(8,4)) as "Backed up data in TB" from summary where activity='BACKUP' and start_time>timestamp(current_date)-(7)days • Display backup volume in Bytes by node for the past 24 hours: select start_time, end_time, entity, bytes from summary where activity='BACKUP' and start_time>timestamp(current_date)-(1)days order by end_time •...
  • Page 205 • Windows device configuration: The following guidelines apply to NetWorker on Windows: • Disable Plug and Play Test Unit Ready (TUR) for any tape drivers in use. A SCSI TUR command may rewind a mounted NetWorker tape unexpectedly, causing data loss. •...
  • Page 206: Networker Deduplication Guidelines

    continuation, No space left on device, at file 2 record 0 -------------------------------------------------------------------------------- Media is marked as full and the application selects another tape and progresses through the same until all tapes are listed as “full.” The recommended action is to obtain and install HPUX 11.31 patch PHKL_36312.
  • Page 207 mands sets it to the default of 1 stream per data file). The default setting for this is disabled, and most sites do not use Striped Backups with SQL Server.
  • Page 209: Support And Other Resources

    6 Support and Other Resources Related Information Documents • HP StorageWorks 9000 series virtual library system user guide • HP StorageWorks 300 and 12000 Gateway virtual library system user guide • HP StorageWorks 6000 series virtual library system user guide •...
  • Page 210: Hp Contact Information

    • Product serial numbers • Product model names and numbers • Applicable error messages • Operating system type and revision level • Detailed, specific questions HP contact information For the name of the nearest HP authorized reseller: • See the Contact HP worldwide (in English) webpage (http://welcome.hp.com/country/us/en/ wwcontact_us.html).
  • Page 211 Convention | Element
    Italic text | Text emphasis
    Monospace text | File and directory names; System output; Code; Commands, their arguments, and argument values
    Monospace, italic text | Code variables; Command variables
    Monospace, bold text | Emphasized monospace text
    WARNING! Indicates that failure to follow directions could result in bodily harm or death.
  • Page 212 WARNING! These symbols, which mark a surface or area of the equipment, indicate the presence of a hot surface or hot component. Contact with this surface could result in injury. WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.
  • Page 213: Glossary

    Glossary
    automigration: The feature in which the virtual tape library acts as a tape copy engine that transfers data from virtual cartridges on disk to a physical tape library connected to the virtual tape device.
    D2D: The HP StorageWorks D2D Backup Systems product line.
    deduplication: The feature in which only a single copy of a data block is stored on a device.
  • Page 215: Index

    Index backup solutions, Accelerated deduplication analyzing environment, backup job naming, application guidelines, capacity sizing, backup considerations, client naming, blocksize, defined, copy considerations, design considerations, future growth, disabling on specific backup policies, identifying performance bottleneck, how it works, LAN-free, implementing, multiple library licensing, See multiple library backup media management,...
  • Page 216 deduplication See Accelerated deduplication (VLS) or Dynamic deduplication (D2D) cartridge sizing, Accelerated, defined, backup application guidelines, design considerations, defined, Dynamic deduplication, Dynamic, housekeeping, HP solutions, implementing replication, ratios, multiple backup streams, tape oversubscription, offload performance, target-based, port optimization, design considerations portfolio, D2D, reducing housekeeping,...
  • Page 217 LAN-free backups, performance bottlenecks, licensing physical tape Accelerated deduplication, alternative to virtual tape, capacity on the VLS, port optimization, D2D, D2D replication, Dynamic deduplication, VLS replication, replication link sizing active-active, D2D replication, active-passive, LUN mapping, benefits, LUN masking, considerations, creating archive tapes, D2D, many-to-one replication, configuring,...
  • Page 218 restoring data from disk backup, See IBM TSM from replication target, deduplication guidelines, from tape copy, general guidelines, retention planning, useful queries, virtual tape libraries, backup guidelines, alternatives to, design considerations, integration, zoning, problems addressed, script examples HP Data Protector, Accelerated deduplication, NetBackup, automigration,...
