Tuesday, December 02, 2008

Storage Analysis of VMware View Composer

Can I turn 16TB of storage for 1000 VDI users into 619GB? Let me show you how it’s actually done. Today's release of VMware View Manager 3 brings to market the long-anticipated thin provisioning of storage for virtual desktops. Previewed in 2007 as SVI (Scalable Virtual Images), what does the now-released View Composer linked clone technology look like under the hood? How much storage will it actually use?

Here is the diagram presented on page 94 of the View Manager Administration Guide (http://www.vmware.com/pdf/viewmanager3_admin_guide.pdf).



This diagram as presented is a conceptual view of the storage. The important logical elements to note here are:

  • Parent VM. This is the standard virtual machine you use to create and maintain your various versions of your image. It can have various versions as different snapshots.

  • Replica. The replica is a copy of a specific version of the Parent VM, that is, one of the snapshot states of the Parent. The key thing here is that the disk in the replica is a thin provisioned copy of the parent disk and snapshot.

  • Clones. The diagram shows two clones. Each clone is an instance of a replica for a particular VM. For its disk, the clone uses the thin provisioned disk in the replica plus its own snapshot. Changes the clone makes to its disk are isolated in the clone's snapshot, so the replica disk remains untouched and shared by all the clones.
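
To make the clone behaviour concrete, here is a minimal Python sketch of the copy-on-write idea described in the list above. It is only an illustration under my own assumptions: the class names `ReplicaDisk` and `LinkedClone` and the block-number model are hypothetical and are not how VMware implements or names anything. Reads fall through to the shared replica unless the clone has already written that block, and writes only ever land in the clone's own delta.

```python
class ReplicaDisk:
    """Read-only base disk shared by every clone (hypothetical stand-in
    for the thin provisioned disk in the replica directory)."""

    def __init__(self, blocks):
        # Map of block number -> data; only blocks actually in use are stored.
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks.get(block_no, b"")


class LinkedClone:
    """A clone's view of its disk: the shared replica plus a private delta
    (hypothetical stand-in for the clone's snapshot file)."""

    def __init__(self, replica):
        self.replica = replica
        self.delta = {}  # grows as the clone writes

    def write(self, block_no, data):
        # Writes never touch the replica; they are isolated in the delta.
        self.delta[block_no] = data

    def read(self, block_no):
        # Prefer the clone's own copy of a block, else fall back to the replica.
        if block_no in self.delta:
            return self.delta[block_no]
        return self.replica.read(block_no)

    def delta_blocks(self):
        return len(self.delta)


replica = ReplicaDisk({0: b"boot", 1: b"windows"})
clone_a = LinkedClone(replica)
clone_b = LinkedClone(replica)

clone_a.write(1, b"patched")  # only clone_a sees this change
```

After the write, `clone_a.read(1)` returns the patched block while `clone_b.read(1)` and the replica itself still return the original, which is the isolation property the diagram is trying to convey.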


The diagram in the manual is not a great representation, so here is my own, which adds some needed detail. Of course there is more complexity, but we can handle that.



What are we looking at here?


  • Each box is a directory in your Datastore. You have one directory for your Parent VM (what I have labeled base image), one for each replica, a special one called a source and one for each clone (which I have labeled user).

  • The Parent VM (base image) in blue is your standard VM. Notice that the C: disk is thick provisioned as is normal in ESX. A 15G disk will consume 15G in your base image. Then you have your snapshots. Notice that the C: and snapshot 0002 have been combined logically, as this is our view of the disk from the VM.

  • Using the Add Desktop wizard in the View Administrator you can create a pool of desktops based on a snapshot from a Parent VM. As part of the process you have to choose a VM and one of its snapshots. When this is done a unique replica is created. This process is marked as (1) on the diagram. Here the machine is copied into a new directory, but the disk is thin provisioned. If our original disk was 15G yet only 2G was consumed, the disk in the replica will only be 2G. This can take a short while as the data copies, but it is a one-off process. This thin provisioned disk is the master disk that all of the clone VMs will use as their base. You can make changes to the parent VM and the replica cannot be harmed.

  • What is not shown in the documentation is that a source directory is also created. This source directory is unique to the replica and contains all of the files required to make a clone. These files are essentially your standard VM files with an empty snapshot of the disk in the replica. My thinking is that clones are created by making copies of the files in this directory. This would explain why the cloning process is so fast: most of the background work is already done. My testing shows under 60 seconds to deploy a new clone. Again, the creation of the source directory is a one-off process.

  • The clone (labeled user) directories are created once for each VM in the pool as required/directed by your pool configuration. The directory name is based on the naming convention given at pool creation. Here we have the files required to run the VM instance. Two files are important. The first is the snapshot file, which is based off the thin provisioned disk in the replica directory. This is where all of the writes for the VM will be stored, so this file will grow over time (until you Recompose, Refresh or Rebalance the VM). The diagram tries to show how the C: drive is made up of the combination of the thin disk from the replica directory and the local snapshot file in the clone machine. The second is a separate thin provisioned disk created for the user D: drive. This is where the quickprep and user data is stored. The user D: drive will grow over time as data is put there, and it can’t be shrunk.


There you have it, the storage layout of View Composer. What does it look like in reality? Here are some screen snippets.


Datastore Directories.

Here are the directories in the Datastore. You can see there is one Parent VM (XP_VMMAST), one replica directory with its matching source directory, and three clone directories, XPROD 1 through 3.


Parent VM directory files

These are the files in the parent VM: all the usual suspects, and the disk is 15G, thick provisioned.


Replica directory files

These are the files in the replica directory. Notice the disk has shrunk to only 2G as it's thin provisioned, and there is our snapshot, which does not really get used.


Source directory files

These are the files in the source directory. They are pristine and clean, ready for use as a clone. Notice the vmdk file: it's based on the replica name and is a special kind of snapshot.



Clone VM directory files

These are the files in the actual VM clone directory, one directory for each provisioned desktop. Notice that the vmdk file has grown. This is the growth after booting Windows the first time and letting it settle: 50M. Notice there are two more files here. One is the user D: disk, which is persistent but thin provisioned; it has grown to 23M in size. The other is a vswp file, present because the machine is booted; if the machine were suspended it would be a vmss file instead.

There you have it. For this test of a 15G machine with just over 2.1G used, what would the storage look like for 1000 users? We will leave the user space aside, as we need to cater for that either way. We just want to compare the old method with the new View Composer.

Parent VM is 16G.

Replica and source is 2.1G

1000 machines including swap is 600G

Grand storage space for 1000 users is 619GB.

Compare that to one week ago when it was 16TB; that’s some saving. Of course these figures are a little extreme. We now have 1000 users running off a single 2.1G thin provisioned disk, and it's going to need a lot of spindles to deliver the IOPS required.
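
For anyone wanting to plug in their own numbers, the comparison above can be sketched in a few lines of Python. This is a rough back-of-envelope model, not a sizing tool: the function names are my own, the 600M per clone is simply the 600G figure divided across 1000 machines, and everything is in decimal megabytes (1G = 1000M) to keep the arithmetic exact.

```python
def full_clone_storage_mb(users, machine_mb):
    """Old method: every user gets a full copy of the machine."""
    return users * machine_mb


def linked_clone_storage_mb(users, parent_mb, replica_and_source_mb, per_clone_mb):
    """View Composer: one parent, one replica plus source, and a small
    per-clone footprint (delta disk, user disk and swap)."""
    return parent_mb + replica_and_source_mb + users * per_clone_mb


users = 1000
parent_mb = 16_000              # Parent VM, 16G
replica_and_source_mb = 2_100   # Replica and source, 2.1G
per_clone_mb = 600              # assumption: 600G / 1000 machines, swap included

full_mb = full_clone_storage_mb(users, parent_mb)
linked_mb = linked_clone_storage_mb(
    users, parent_mb, replica_and_source_mb, per_clone_mb
)

print(f"Full clones:   {full_mb / 1_000_000:.1f} TB")
print(f"Linked clones: {linked_mb / 1_000:.1f} GB")
print(f"Ratio:         {full_mb / linked_mb:.1f}x")
```

With the figures from this test the linked clone total comes out at 618,100M, or about 618GB, which is in line with the 619GB quoted above once you allow for rounding in the individual numbers. Change `per_clone_mb` to see how quickly the saving erodes as the clone delta disks grow between refreshes.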

Exciting times. We are all going to see how View plays out over the next six months. There is some great architectural work to be done in designing for implementation.

Rodos (Rod Haywood)
Senior Consulting Architect - Virtualisation
Alphawest Services, Sydney Australia

3 comments:

  1. Anonymous 3:46 am

    Do you know where the Replica copy is created? Does it always get created in the same datastore as the parent, or does it get randomly created in one of the datastores available to the desktop pool?

    Since all of the desktops based upon a specific replica will reference back to the data in the replica, I'm thinking that there could be a large amount of disk I/O to that replica (depending on number of clones based on it). Is there a possible performance issue here, or is there some sort of memory caching mechanism that would prevent an overload of disk I/O to the replica?

    Thanks
    juice13

  2. Anonymous 10:35 am

    Hi Rodos,

    I came across your postings after reading numerous recommendations of your site. Your articles are really helpful, and I have run into some issues that I hope you can help with.

    1. How are you able to get thin-provisioned replica vmdks? The replica that gets created from my parent VM are all thick-provisioned.
    2. Is the replicaxxxx.vmdk file in the source directory a snapshot? It is a config file in my source directory.
    3. What's the relationship between replica and source directory? How do they affect each other? Please explain more.

    I'll try to answer Juice, and please correct me if I'm wrong:
    - The replica is created in the designated datastore that you provided and each datastore can only have one replica.
    - Since non-persistent VMs are performing I/O on the same source, there shouldn't be a large amount of disk I/O, but I think that there could be a potential I/O issue with persistent VMs.

    Thanks in advance.

  3. Anonymous 8:23 pm

    For load balancing IO, you can choose multiple datastores for your clones; one replica will be created per datastore. The more IO / users you need, the more datastores you should choose.
