Friday, March 6, 2009

Manually committing snapshot delta files to the vmdk flat file

I had a tough time this week dealing with a snapshot issue on one of my VMs. The VM had an important snapshot that had been taken earlier for system restoration. When I opened the snapshot manager in vCenter, it showed the VM running without any snapshots. That was the start of my problem, and of an eventful journey that ended when I managed to recover it this morning.

I connected to the ESX host over SSH and browsed to the datastore, where I found the snapshot files with the .vmsn extension in the correct location. No matter how many times I tried, even after rebooting my VirtualCenter server, the snapshots still did not appear in the snapshot manager.

I read some articles and forum threads that suggested cloning the snapshot with the vmkfstools -i option, but that did not succeed in my case. I continued my research and found a useful blog post by Oliver O'Boyle, who had experienced a similar issue.

His article explains how the CID and parent CID values chain the disk files together, and it helped me resolve my issue. I found that the root cause in my VM was a combination of the snapshot problem and a corrupted vmdk config file. For the snapshot problem, we can create a new snapshot and then select delete all snapshots, which should force the delta files to be committed to the vmdk flat files. One of the virtual hard disks was harder to deal with, because the ESX server kept detaching it from the VM. The cause was that its parent file, which should be VMxxxx.vmdk, was missing.
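To make the CID chain concrete, here is a minimal Python sketch that reads two vmdk config (descriptor) files as text and checks whether the child's parentCID matches the parent's CID. The function names and the sample values are my own, for illustration only. It assumes the usual plain-text descriptor layout with `CID=` and `parentCID=` lines, and it is not a replacement for the procedure in Oliver O'Boyle's post.

```python
import re

def parse_descriptor(text):
    """Collect simple key=value lines (e.g. CID, parentCID) from descriptor text."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.match(r'^(\w+)\s*=\s*"?([^"]*)"?$', line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields

def chain_is_consistent(parent_text, child_text):
    """True when the child's parentCID equals the parent's CID."""
    parent = parse_descriptor(parent_text)
    child = parse_descriptor(child_text)
    parent_cid = parent.get("CID", "").lower()
    return parent_cid != "" and child.get("parentCID", "").lower() == parent_cid

# Hypothetical sample contents, not taken from a real VM.
base = 'CID=a1b2c3d4\nparentCID=ffffffff\ncreateType="vmfs"\n'
snap = 'CID=0f0f0f0f\nparentCID=a1b2c3d4\nparentFileNameHint="VMxxxx.vmdk"\n'
print(chain_is_consistent(base, snap))
```

If the check fails, the child config file is pointing at a parent CID that no longer matches, which is the kind of broken link I had to repair by hand.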

During this troubleshooting, make sure the delta files and flat files are always retained and never overwritten. Each snapshot has two files: VMxxxxx-000001.vmdk (the small config file) and VMxxxxx-000001-delta.vmdk (the delta data). The flat file is named VMxxxxx-flat.vmdk. The first thing I did was make sure the virtual disk could be re-attached to the VM. I manually created a new vmdk config file following Oliver O'Boyle's guide, copying in the parent CID and the virtual disk values required. In this way I manually configured the link between the .vmdk and the flat file. After that, I was able to attach the virtual disk back to the VM from vCenter. Note that vCenter will not offer the flat file itself as an attachable virtual disk, because it recognizes a virtual disk by the location of its .vmdk. I recommend keeping the .vmdk and the flat file in the same datastore, although you can relocate the vmdk files to a different datastore if you wish.
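Since it is easy to mix these files up, here is a small Python sketch that sorts file names into the kinds described above, based only on the naming pattern. The labels and function names are my own, and it assumes the six-digit snapshot numbering seen in names like VMxxxxx-000001.vmdk.

```python
import re

# Order matters: the most specific patterns are checked first.
_PATTERNS = [
    (re.compile(r"-\d{6}-delta\.vmdk$"), "snapshot delta"),
    (re.compile(r"-flat\.vmdk$"), "flat file"),
    (re.compile(r"-\d{6}\.vmdk$"), "snapshot config"),
    (re.compile(r"\.vmdk$"), "base config"),
]

def classify_vmdk(name):
    """Return a label for a datastore file name, or 'other' if it is not a vmdk."""
    for pattern, label in _PATTERNS:
        if pattern.search(name):
            return label
    return "other"

def must_keep(names):
    """The data-bearing files that must never be overwritten."""
    keep = ("snapshot delta", "flat file")
    return sorted(n for n in names if classify_vmdk(n) in keep)

# Hypothetical directory listing for illustration.
listing = ["VM01.vmdk", "VM01-flat.vmdk", "VM01-000001.vmdk",
           "VM01-000001-delta.vmdk", "VM01-Snapshot1.vmsn"]
for name in listing:
    print(name, "->", classify_vmdk(name))
```

The config files are small text files and can be recreated, but the delta and flat files hold the actual data, so those are the ones to copy somewhere safe before changing anything.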

Once the virtual disk has been attached to the VM, boot the VM immediately. Log in and make sure everything is functioning normally. The data I had at this point was not the latest data I needed, because the missing snapshot had not been committed. Next, I took a new snapshot of the entire VM. Once that was done, the datastore listing over SSH showed plenty of delta files and newly created VMDK files with names such as VMxxxxxxx-000003.vmdk and so on.

Here are the steps I took to commit the snapshots manually:

  1. Power off the VM.
  2. In vCenter, right-click the VM, select edit settings, and select the virtual disk that you are trying to recover. The dialog shows which vmdk file this virtual disk is pointing to.
  3. Note down the file names and go back to your SSH session.
  4. Put the VMDK and delta files that you retained from the original snapshot you are recovering in place of the files named in step 2, giving them the file names you noted down.
  5. Open the snapshot manager for the VM and select the delete all snapshots option. This takes time, depending on the size of the delta files that have to be committed.
  6. The task may appear stuck at 95% or time out, but the system still continues to commit the delta files back to the flat files. In my case, it took more than 2 hours to delete the snapshot.
  7. While it ran, I noticed in the performance chart that the ESX server load and disk activity had increased.
  8. Once it completes, all the delta files are deleted and everything should be back to normal.
  9. Power on the VM and double-check all the data and mount points. In my case, the system was back to normal.
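As a quick check for step 8, a short Python sketch like the one below can list any snapshot files still left in the VM's folder once the commit has finished. The function name and the folder path are hypothetical, and it assumes the six-digit snapshot naming described earlier and a machine where the datastore folder is readable as a normal directory.

```python
from pathlib import Path

def leftover_snapshot_files(vm_dir):
    """Return names like VM-000001.vmdk or VM-000001-delta.vmdk found in vm_dir."""
    # Six digits after a dash, optionally followed by "-delta", then ".vmdk".
    pattern = "*-[0-9][0-9][0-9][0-9][0-9][0-9]*.vmdk"
    return sorted(p.name for p in Path(vm_dir).glob(pattern))

if __name__ == "__main__":
    # Hypothetical path; replace with the folder of the VM being recovered.
    remaining = leftover_snapshot_files("/vmfs/volumes/datastore1/VMxxxx")
    if remaining:
        print("Still present:", ", ".join(remaining))
    else:
        print("No snapshot files left; the commit appears to have completed.")
```

An empty result only tells you that the files are gone. It is still worth powering on the VM and checking the data, as in step 9.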

1 comment:

bb said...

Interesting article... but there much that I have to learn about it... thanks!

 