As a backup product, Microsoft’s System Center Data Protection Manager (DPM) does a wonderful job of protecting Microsoft workloads – from file system to SQL, Exchange, SharePoint, Hyper-V, Active Directory, and beyond. To accomplish this, though, DPM makes heavy use of a Windows feature called the Volume Shadow Copy Service (VSS), also referred to as the Volume Snapshot Service. Introduced in Windows XP and Windows Server 2003, this was very much the Microsoft answer to a range of open file backup agents for the various backup products on the market, amongst other considerations.
These agents, normally an extra cost on top of the standard backup agent, typically introduced a file system filter driver to work around the file locking that often prevented successful backups. Unfortunately, as is so often the case, these third party file system filters tended to introduce their own problems – from performance issues to incompatibility with antivirus, file corruption, and system instability. I’ve hit some particularly ugly examples over the years, from backup and replication products as well as from antivirus products.
Similarly, other Microsoft workloads, like Exchange and SQL, had different ways and means of being backed up. While SQL has generally been fairly simple, Exchange backups were long notorious for being excessively painful. Between database backup and restoration – a veritable minefield in itself – and the slow and painful “brick level” mailbox backups needed to make it feasible to recover individual mailboxes and emails, this was an area screaming out for improvement and standardisation.
Little wonder, then, that Microsoft decided to address backups with an operating system-level feature – VSS. VSS introduced a consistent way to back up different workloads using public APIs and fully supported methodologies. Better still, by backing up with VSS, you could attain a level of assurance previously unattainable – Microsoft will support your VSS backups, and work with you to recover data that you have problems restoring. It also paved the way for best practice approaches to backups, such as the Recovery Storage Group feature in Exchange.
VSS wasn’t rapidly adopted by vendors – even today, you get non-standard, entirely unsupported implementations of Exchange backup and restoration, for example – but its use has certainly grown substantially. There’s good reason for that – every time I’ve used a third party vendor’s approach to backup and replication that doesn’t leverage VSS, it’s been flawed, full of bugs, and led to data loss or system instability. Every time, seriously.
The Volume Snapshot Service operates at the block level of the file system. This low level approach makes it trivial for VSS to access files, regardless of whether they have locks to indicate they’re open or not. When the operating system is signalled for backup of a file, it will initiate a “freeze” on the file; no further changes can be made to the actual file during this time. This allows VSS to make a copy, or snapshot, of the file – normally a quick and seamless process. During this freeze period, changes to the file aren’t lost – there’s handling to ensure that changes made during this time are retained and applied once the file is “thawed”, or released by VSS. This is important for large files, like Exchange databases!
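To make the freeze and thaw idea a little more concrete, here is a toy Python model of the behaviour described above. It is only a sketch of the concept, not how VSS is actually implemented: the FreezableFile class and its method names are made up for illustration, and it works on an in-memory byte buffer rather than real disk blocks.

```python
class FreezableFile:
    """Toy model of a file that can be frozen, snapshotted, and thawed."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.frozen = False
        self.pending = []  # writes received while frozen: (offset, payload)

    def write(self, offset: int, payload: bytes) -> None:
        if self.frozen:
            # Hold the change back so the snapshot stays consistent.
            self.pending.append((offset, payload))
        else:
            self.data[offset:offset + len(payload)] = payload

    def freeze(self) -> None:
        self.frozen = True

    def snapshot(self) -> bytes:
        if not self.frozen:
            raise RuntimeError("freeze the file before taking a snapshot")
        # An immutable copy of the contents as they were at freeze time.
        return bytes(self.data)

    def thaw(self) -> None:
        # Release the file and replay the writes that arrived while frozen.
        self.frozen = False
        for offset, payload in self.pending:
            self.data[offset:offset + len(payload)] = payload
        self.pending.clear()


f = FreezableFile(b"mailbox-v1")
f.freeze()
snap = f.snapshot()   # consistent copy taken during the freeze
f.write(9, b"2")      # arrives mid-freeze, so it is queued rather than lost
f.thaw()              # queued write is applied once the file is released
```

The snapshot keeps the contents from the moment of the freeze, while the live file picks up the queued change after the thaw, which is the guarantee that matters for big, busy files.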
This snapshot process makes it easy for VSS to back up internally consistent copies of files. The operating system can communicate with the application that “owns” the file through VSS writers – hooks provided by the application that allow it to communicate with VSS. Each Microsoft workload typically has its own VSS writer for this purpose, so that VSS can signal freeze and thaw events to each workload, and they will respond appropriately. You can see a list of these writers on any Windows machine by running vssadmin list writers from an elevated command prompt.
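If you want to check writer health across a number of servers, the vssadmin list writers output is easy enough to script against. The Python sketch below assumes the usual layout of that output (a "Writer name:" line followed by indented "State:" and "Last error:" lines); the function names are my own, and the SAMPLE text is an abridged, illustrative example rather than output captured from a real machine.

```python
import re
import subprocess


def parse_writers(output: str) -> list[dict]:
    """Pull the name, state, and last error of each writer out of the text."""
    writers = []
    for line in output.splitlines():
        line = line.strip()
        name = re.match(r"Writer name: '(.*)'$", line)
        if name:
            writers.append({"name": name.group(1)})
        elif writers and line.startswith("State:"):
            writers[-1]["state"] = line.split(":", 1)[1].strip()
        elif writers and line.startswith("Last error:"):
            writers[-1]["last_error"] = line.split(":", 1)[1].strip()
    return writers


def list_writers() -> list[dict]:
    """Run vssadmin on the local machine (Windows, elevated prompt only)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return parse_writers(result.stdout)


# Abridged, illustrative sample: real output also includes writer IDs.
SAMPLE = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   State: [1] Stable
   Last error: No error
"""

writers = parse_writers(SAMPLE)
```

On a real server you would call list_writers() instead of parsing the sample, and then look for any writer whose last error is something other than "No error".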
Initially, VSS made only temporary snapshot copies. It wasn’t possible to keep a permanent copy of the snapshotted file, because the snapshot was only intended to be used by backup applications. Over time, though – and introduced primarily through successive service packs – the VSS features were expanded to include Shadow Copies. Nowadays, both servers and clients can generate periodic snapshots of folders that can be used as short term backup copies – like an undelete or limited version control system – and allow users and administrators to recover files using the Previous Versions feature. With this feature, VSS really started to come into its own.
It’s in this context that System Center Data Protection Manager was created. Microsoft had the mechanisms needed to provide short-term disk-based backup of its various workloads. What it didn’t necessarily have, though, was a backup product for the enterprise that leveraged only supported methodologies in its approach to backup. The first version of DPM only did backup to disk, and was intended to complement existing backup solutions. It did a nice job, but it was also too limited to be of use to the majority of enterprises. It supported just 6 terabytes of data and 30 file servers on a single DPM server!
DPM 2006 did, though, introduce a number of the fundamental concepts that remain in DPM to this day. DPM leverages VSS, but further improves on it by using replicas and recovery points. This allows DPM to back up just the parts of files that have actually changed, and to offer end user recovery of data. The key concept was near-continuous data protection, leveraging disk storage as the first line of backup, before tape archival – by no means a new idea, but it included some innovative ideas to streamline the approach. The ability to periodically synchronise data throughout the day was a differentiator that made it worth a look, but it was with DPM 2007 that the product started to become a truly viable competitor to existing backup products.
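The idea of only sending the parts of a file that have changed can be sketched in a few lines of Python. To be clear, this is a toy illustration of block-level synchronisation in general and not DPM’s actual mechanism: the tiny BLOCK_SIZE, the hash comparison, and all of the function names are assumptions made purely for the example.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, purely for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(replica: bytes, source: bytes) -> dict[int, bytes]:
    """Return only the source blocks that differ from the replica."""
    old = split_blocks(replica)
    new = split_blocks(source)
    changes = {}
    for index, block in enumerate(new):
        old_digest = hashlib.sha256(old[index]).digest() if index < len(old) else None
        if hashlib.sha256(block).digest() != old_digest:
            changes[index] = block
    return changes


def apply_changes(replica: bytes, changes: dict[int, bytes], new_length: int) -> bytes:
    """Bring the replica up to date using just the changed blocks."""
    blocks = split_blocks(replica)
    for index, block in sorted(changes.items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    # Truncate in case the source file shrank since the last sync.
    return b"".join(blocks)[:new_length]


replica = b"AAAABBBBCCCC"
source = b"AAAAXXXXCCCCDD"
changes = changed_blocks(replica, source)
synced = apply_changes(replica, changes, len(source))
```

Here only two of the four blocks need to cross the wire, and the replica still ends up identical to the source. Keeping the state of the replica at each synchronisation is, loosely speaking, what gives you a recovery point to go back to.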
DPM 2007 introduced tape backups, increased protection capabilities, vastly enhanced scalability, and true disaster recovery with the secondary DPM server capabilities. It’s here that the idea of disk-to-disk-to-disk-to-tape entered DPM, and of course it wasn’t long before cloud was introduced as another option. DPM 2007 pushed the boundaries of what was possible with VSS as it existed in Windows 2003 and Windows 2008, and it became necessary to make a major change to VSS in Windows 2008 R2. Prior to Win2k8 R2, VSS could support an absolute hard maximum of 10,000 volume snapshots. This was a hard limit, because the registry hive storing the data only allowed for 4 byte key names – 0000 to 9999. VSS also didn’t clean up nicely after itself on hitting this limit.
I believe that, with Win2k8 R2, this limit is now around 30,000 volume snapshots, and that this is just a soft limit. Even if it’s a hard limit, it’s evident that VSS now cleans up properly after itself. You can use either DPM 2007 or DPM 2010 with Windows 2008 R2, although the improvements in DPM 2010 would lead me to say that DPM 2010 is more or less a must. DPM 2010 introduced auto-growing of replica and recovery point volumes, shrinking of volumes, auto-healing of volumes, SQL self-service recovery, client computer protection, and is overall leagues ahead of DPM 2007 (which was, nonetheless, a great product)!
This dependence on VSS does, of course, make DPM susceptible to operating system bugs and limitations – the VSS limitation above being an example. With DPM 2007, I had quite a long list of hotfixes for Windows 2003 x64 to address VSS and storage bugs, performance, and enhancements. More recently, with Windows 2008 R2 SP1, I’ve found it necessary to develop lists of hotfixes for stability and performance. Such is the price of such deep OS integration, perhaps, but I remember the “bad old days” of third party filter drivers, and I’d easily pick addressing Windows bugs over tussling with third party vendors over their bugs, any day.
In many ways, DPM has helped to drive VSS forward substantially. While you may not use DPM in your own environment, if you use VSS, you can perhaps thank DPM for driving it forward from its initial, limited capabilities. You might prefer to curse VSS, too! However, by understanding how it has evolved over time, you might better appreciate the journey it’s taken, and its overall utility to Windows environments. I, for one, am looking forward to the next version of DPM, and its generic data source protection in particular as the next logical evolution. The more DPM evolves, the more VSS benefits too!
I love DPM and VSS almost as much as I love tweeting: @OhCrap