Thursday, March 10, 2011

Windows 7 Backup does not work

Windows 7 backup often fails for a variety of reasons. You can usually troubleshoot these by the error messages that it gives you. It is torture, but it can be done. However, doing both a successful backup and then a subsequent restore if your system fails is so unlikely as to be effectively impossible.

Here is the simple rule for Windows backups: they only work when you don't need them.

If you look on the web for some assistance for backup and restore you will find precious little information. What you do find is all but unusable by anyone who is not an expert.

If everything is well and all goes smoothly, you can use instructions at sites like dummies.com. Here is a page on a vanilla restore from an existing backup:


I am making this post because I simply *had* to use Microsoft backup on Windows 7 and I *had* to restore that backup. It did not work.

There are so many things that can fail with Windows backups that it is impossible to itemize them here. I will just offer up the solution that allowed me to restore my files.

The files had been backed up to an external USB drive. This seems, from what I can tell, to be the only really viable method of backup for most people. It should be dead simple. The most common scenarios would have been tested if anyone seriously expected backups to be both made and then restored. Since Microsoft probably has 100,000 or more desktops of its own and anywhere from tens of millions to billions of client desktops, you would think that a scenario this simple would have been tested.

Here is my scenario: backup to a 2 TB USB hard drive and then restore from that drive shared on the network.

The backup was hampered by the usual errors. One that comes to mind gives the error "the operation timed-out before the shadow copy was created". It helpfully suggests trying again at another time. After trying off and on over a period of days I was finally able to get it to run successfully. There is nothing wrong with the drive. I routinely move gigabytes of data to and from that drive and this is the only operation that dies on that disk. Another error was the system being unable to find the backup location when attempting an incremental backup to the last set. The fix for this is to re-run the backup utility, choose the drive and get it to try finding the backup a couple of times. Sometimes you also have to change the permissions of the backup directory and files. This is well beyond anything that it is reasonable to expect from an ordinary PC user.

When it came time to restore my hard-won backup (it took literally days), it came up with an error:

Windows was not able to find any backup sets.

This is because it only expects to be restoring to the same machine from a prior backup and if you are restoring to a new machine you don't have any. It does not bother looking and it does not bother telling you that you should look either. So ... unless you have the backup on a network share so you can browse for it, you have to start looking for the files manually and inspect them manually to see what (if anything) can be done to repair them.

If the files are on a network share, you are not out of the woods. What I got when I went to the share with the backup on it was the message:

"Windows was not able to find any backup sets"

and the chipper "Please select a different location"

There is no meaningful help offered anywhere that I can find to solve this issue. I definitely had a backup that the originating machine could see. I even went back to it to do a test restore of a few files to make sure it was real.

The only thing that I could think of was that it was not visible somehow due to permission problems. I went back to the originating machine and carefully took ownership of the files as 'Everyone' with full permission and then removed all the other permissions. This was after a few false starts with permissions that were more sensible. Exasperated, I decided to give up and just give the world all permissions to make sure that was not the problem. I know from other forays with this stuff that permissions *can* be the problem. In this case, though, it was not. That is, it *was* permissions, just not on the backup itself. It was somewhere else.

The last thing that finally got the backup and its files visible was to change the permissions and ownership of a file in the root of the disk used for the backup:

MediaID.bin

As with the files themselves, I changed this to world readable, world writable (permissions assigned to the user/group 'Everyone').
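For anyone who would rather make this change from an elevated command prompt than through the Security tab, here is a minimal sketch using the stock `takeown` and `icacls` tools. It is only an illustration of the same change, not a record of the exact steps above: the drive letter `E:\` is an assumption, and the helper name `mediaid_permission_commands` is made up. The script only builds and prints the commands; nothing is executed.

```python
from pathlib import PureWindowsPath


def mediaid_permission_commands(drive_root):
    """Build the commands that take ownership of MediaID.bin and open it to Everyone.

    drive_root is the root of the backup disk, e.g. "E:\\" (an assumed drive letter).
    """
    target = str(PureWindowsPath(drive_root) / "MediaID.bin")
    return [
        # takeown /F <file> makes the current user the owner of the file.
        ["takeown", "/F", target],
        # icacls <file> /grant Everyone:F gives the Everyone group full control.
        ["icacls", target, "/grant", "Everyone:F"],
    ]


# Print the commands so they can be reviewed before being pasted into
# an elevated command prompt on the machine that holds the backup disk.
for command in mediaid_permission_commands("E:\\"):
    print(" ".join(command))
```

With the assumed drive letter this prints `takeown /F E:\MediaID.bin` followed by `icacls E:\MediaID.bin /grant Everyone:F`. Full control for 'Everyone' is as blunt as it sounds; it mirrors the give-the-world-everything approach described above rather than a tidy permission scheme.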

That last bit finally got the backup visible and I was, after days of misery, able to restore the backup.

This helped me, but it is not a fix. I am at a loss as to why nobody will 'fess up' that you cannot effectively back up and restore Windows systems. It is theoretically possible, if the stars align themselves, but for any practical purpose it is not possible. It has obviously not been very extensively tested by Microsoft and is not in widespread use. This is one of the shameful secrets of the computer industry. Everywhere you look, you are glibly told to do a backup before trying anything to correct problems. Nobody mentions how exceedingly difficult it can be.

A backup that you cannot restore is useless as a backup.

When you are in the position of attempting to restore a backup you usually do not have access to the original system. You are attempting to recover from a failure of that system.

If you are in the most dire position of attempting to recover from a catastrophic failure, chances are good that other things have gone wrong. This is not the time for an elaborate, fussy, fragile system that breaks the instant something is not exactly as anticipated by the designer. Certainly it is not the time for an elaborate system that has not even been fully tested. This is the time for a dead simple system that allows you to find and open a single file that has whatever it needs to restore your files. Requiring that the system already know about the backup, that the backup have a special file in the root of a hard disk, that it be stored in a certain place, etc. is just downright crazy. It is a sure-fire recipe for failure.

I have been working with systems for decades. It is exceedingly rare for me to lose anything and I never lose anything important. How is this accomplished? I make multiple copies of files. The number of copies is proportional to their importance. Mission critical client data is generally backed up across a number of geographically distributed systems.
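The "multiple copies" habit can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the tooling used for the client data mentioned above: the function names `sha256_of` and `copy_to_many` and the destination folders are made up for illustration. The idea is simply to copy a file to several places and check each copy against the original before trusting it.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_to_many(source, destination_dirs):
    """Copy source into every destination directory and verify each copy.

    Returns the list of verified copies. Raises IOError if any copy
    does not hash to the same SHA-256 value as the original.
    """
    source = Path(source)
    expected = sha256_of(source)
    verified = []
    for directory in destination_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)  # copy2 also tries to preserve timestamps
        if sha256_of(copy) != expected:
            raise IOError(f"copy at {copy} does not match {source}")
        verified.append(copy)
    return verified
```

A call such as `copy_to_many("ledger.xlsx", ["E:/copies", "//nas/copies"])` (hypothetical paths) would give two verified copies; the more important the file, the longer the list of destinations.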

The disaster recovery cycle is torturous in the extreme. I expect that most people end up losing most of their stuff from time to time because there is no reliable way for them to prevent it.

I am surrounded by enough systems that some of them *do* fail and we are required to bring them back up. Typically this involves a fairly time-consuming cycle of re-building or replacing equipment, re-installing software and then restoring files and settings from copies.

Windows backup *is* used successfully, but it has to be very carefully and conservatively set up, has to be dead vanilla and has to be periodically tested (you have to do a test recovery). Even then, any critical data needs to be backed up separately and the backup has to account for a wide variety of settings that depend upon the version of Windows being used, the particular software being used and the configuration of the source system (the thing you backed up) and the target system (the thing to which you are restoring).
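A test recovery is only worth something if you actually compare what came back with what went in. Here is a minimal sketch of that check, assuming you have restored the backup into a separate folder alongside the original files. The function names `file_digest` and `compare_trees` are hypothetical, and the script knows nothing about Windows Backup itself; it just compares two directory trees file by file.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, restored_root):
    """Compare every file under source_root with its restored counterpart.

    Returns (missing, different): two lists of paths relative to
    source_root. 'missing' files were never restored; 'different'
    files were restored but their contents do not match.
    """
    source_root = Path(source_root)
    restored_root = Path(restored_root)
    missing, different = [], []
    for original in sorted(source_root.rglob("*")):
        if not original.is_file():
            continue  # directories are only walked, not compared
        relative = original.relative_to(source_root)
        restored = restored_root / relative
        if not restored.is_file():
            missing.append(relative)
        elif file_digest(original) != file_digest(restored):
            different.append(relative)
    return missing, different
```

If both lists come back empty, the restore reproduced every file that is in the source tree. Anything else is a problem you want to find during a drill, not after a disk has died.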

If you are very careful, you will note one of the signs that the backup/restore facilities have not been extensively tested (by anyone anywhere). When you are choosing the network location to *restore* from, the dialog has the following embarrassing caption:

"Select the folder where you want to save the backup"

Imagine if, every time that you went to load a file, the system asked you to choose the file to save. Each time you saw that, you would have to convince yourself that selecting the file would not end up overwriting your only copy. It would be irritating in the extreme. You don't see that in major products because they are being used all the time. People would quickly catch the problem. The fact that the above caption is messed up like that means that even though this dialog should have been used hundreds of millions of times, it has likely been used only a few thousand times or fewer.

The bottom line is that you should test however you are going to back up and subsequently recover. It is possible to back up and restore using Windows' built-in facilities, but it is not very likely to work. You should not depend upon it for anything important.

Microsoft should do penance for all the misery it has caused. Here is a way that I think they could make amends and cure this problem once and for all: fund an open-source project to develop a proper backup facility that would be able to back up and restore files from/to the majority of commodity systems. It should be able to save, for instance, a backup from an old Windows 95 box and restore that to a modern Linux box. This is not simple and for some things, there is no apples-to-apples equivalent of source/target. Linux, for instance, does not have a registry.

I do not expect that a complex installation of a Microsoft IIS web server and development environment can be backed up and restored to a Linux system. However, I *do* expect that a Windows 7 system can be backed up and restored to another Windows 7 system.

I am aware of challenges such as the registry, DLL hell, disk geometry, drivers, etc. It is a story for another day, but all that stuff is horribly broken. The problems with these systems go back a long way. A backup program cannot fix everything, but it should be able to fix many of the things that currently make it so difficult to maintain systems.


1 comment:

Dylan Stinks said...

I love your article. Lots of genuine laughs. I cannot find this MediaID.bin file you speak of in the root of my backup drive, even with all protected files showing. Thanks again for the laughs. You are a fine writer.
