Thin or Thick Provisioning in VMware?



Let's begin by looking at the various types of Virtual Machine disks (VMDKs) that are available to us.

VMDK Overview

  • Thin - These virtual disks do not reserve space on the VMFS filesystem, nor do they reserve space on the back-end storage. They only consume blocks when data is written to disk from within the VM/Guest OS. The amount of actual space consumed by the VMDK starts out small, but grows in size as the Guest OS commits more I/O to disk, up to a maximum size set at VMDK creation time. The Guest OS believes that it has the maximum disk size available to it as storage space from the start.
  • Thick (aka LazyZeroedThick) - These disks reserve space on the VMFS filesystem, but there is an interesting caveat. Although they are called thick disks, they behave similarly to thinly provisioned disks. Disk blocks are only used on the back-end (array) when they get written to from inside the VM/Guest OS. Again, the Guest OS inside this VM thinks it has the maximum size from the start.
  • EagerZeroedThick - These virtual disks reserve space on the VMFS filesystem and zero out the disk blocks at creation time. This disk type may take a little longer to create as it zeroes out the blocks, but its performance should be optimal from deployment time (no overhead in zeroing out disk blocks on-demand, meaning no latency incurred from the zeroing operation). However, if the array supports the VAAI Zero primitive which offloads the zero operation to the array, then the additional time to create the zeroed out VMDK should be minimal.

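For reference, each of these disk types can be created from the ESXi command line with vmkfstools; the -d option selects the format. The 20GB size and datastore paths below are just placeholders for illustration:

# vmkfstools -c 20G -d thin /vmfs/volumes/datastore1/testvm/testvm-thin.vmdk
# vmkfstools -c 20G -d zeroedthick /vmfs/volumes/datastore1/testvm/testvm-lzt.vmdk
# vmkfstools -c 20G -d eagerzeroedthick /vmfs/volumes/datastore1/testvm/testvm-ezt.vmdk
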
Option 1 – Thin Provision at the Array Side

If your storage array supports it, devices/LUNs can be thinly provisioned at the back-end/array. The advantage is physical disk space savings; there is no need to calculate provisioned storage based on the total size of the VMDKs. Storage pools of 'thin' disks (which can grow over time) can now be used to present datastores to ESXi hosts. VMs using thin or lazyzeroed VMDKs will now consume what they need rather than what they are allocated, which results in a capex saving (no need to purchase additional disk space up front).
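
If the array reports this to the host, a quick way to confirm that a particular LUN is seen as thin provisioned is to check the 'Thin Provisioning Status' field in the esxcli device listing (the naa identifier below is just a placeholder):

# esxcli storage core device list -d naa.60a98000486e2f65576f6c3454736962 | grep -i thin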


Most arrays which allow thin provisioning will generate events/alarms when the thin provisioned devices/pools start to get full. In most cases, it is simply a matter of dropping more storage into the pool to address this, but of course the assumption here is that you have a SAN admin who is monitoring for these events.

Advantages of Thin Provisioning at the back-end:

  • Address situations where a Guest OS or applications require lots of disk space before they can be installed, but might end up using only a portion of that disk space.
  • Address situations where your customers state they need lots of disk space for their VMs, but might end up using only a portion of that disk space.
  • In larger environments which employ SAN admins, the monitoring of over-committed storage falls on the SAN admin, not the vSphere admin (in situations where the SAN admin is also the vSphere admin, this isn't such an advantage).

Option 2 – Thin Provision at the Hypervisor Side

There are a number of distinct advantages to using Thin Provisioned VMDKs. In no specific order:

  1. As above, address situations where a Guest OS or applications require lots of disk space before they can be installed, but might end up using only a portion of that disk space.
  2. Again as above, address situations where your customers state they need lots of disk space for their VM, but might end up using only a portion of that disk space.
  3. Over-commit in a situation where you need to deploy more VMDKs than the currently available disk space at the back-end, perhaps because additional storage is on order, but not yet in place.
  4. Over-commit, but on storage that does not support Thin Provisioning on the back-end (e.g. local storage).
  5. No space reclamation/dead space accumulation issues. More on this shortly.
  6. Storage DRS space usage balancing features can be used when one datastore in a datastore cluster starts to run out of space, possibly as a result of thinly provisioned VMs growing in size.

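One other point worth noting: if you have existing thick VMDKs that you would like to move to the thin format, a Storage vMotion with a change of disk format will do this online, or an offline clone can be done with vmkfstools (the source and destination paths below are just placeholders):

# vmkfstools -i /vmfs/volumes/datastore1/testvm/testvm.vmdk /vmfs/volumes/datastore1/testvm/testvm-thin.vmdk -d thin
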
Thin Provisioning Concerns

There are a few concerns with Thin Provisioning.

Possibly the biggest issue that we have with Thin Provisioning is running out of space on a device that is Thin Provisioned at the back-end. Prior to vSphere 5.0, we didn't have any notifications about this in the vSphere layer, and when the thinly provisioned datastore filled up, all of the VMs on that datastore were affected. In vSphere 5.0 a number of enhancements were made through VAAI:

  • VAAI will now automatically raise an alarm in vSphere if a Thin Provisioned datastore becomes 75% full.
  • VMs residing on a Thin Provisioned datastore that runs out of space now behave differently than before; only VMs which require additional disk space are paused. VMs which do not require additional disk space continue to run quite happily even though there is no space left on the datastore.
  • If the 75% alarm triggers, Storage DRS will no longer consider this datastore as a destination.

The second issue is dead space reclamation and the inability to reuse space on a Thin Provisioned datastore. Prior to vSphere 5.0, if a VM's files were deleted or if a VM was Storage vMotioned, we had no way of informing the array that we were no longer using this disk space. In 5.0, we introduced a new VAAI primitive called UNMAP which informs the array about blocks that are no longer used. Unfortunately there were some teething issues with the initial implementation, but we expect to have an update on this very shortly.
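
You can check whether a device advertises the UNMAP (Delete) primitive from the ESXi shell, and the reclaim can also be driven manually. The device identifier, datastore name, and reclaim percentage below are just placeholders, and the manual method depends on the vSphere version (vmkfstools -y, run from the datastore's root directory, in 5.0 U1/5.1; later releases replaced it with an esxcli storage vmfs unmap command):

# esxcli storage core device vaai status get -d naa.60a98000486e2f65576f6c3454736962
# cd /vmfs/volumes/datastore1
# vmkfstools -y 60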

If the VMDK is provisioned as thin, then each time the VMDK grows (new blocks are added), the VMFS datastore has to be locked so that its metadata can be updated with the new size information. Historically this was done with SCSI reservations, and it could cause some performance-related issues if a lot of thinly provisioned VMs were growing at the same time. With the arrival of VAAI and the Atomic Test & Set (ATS) primitive, which replaces SCSI reservations for metadata updates, this is less of a concern these days.
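
A quick way to see whether a VMFS volume is using ATS for its locking (and to check its capacity and free space at the same time) is vmkfstools; newly created VMFS-5 datastores on ATS-capable arrays report an 'ATS-only' mode in this output (the datastore name below is just a placeholder):

# vmkfstools -Ph /vmfs/volumes/datastore1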

Thick on Thin

First, Eagerzeroedthick VMDKs do not lend themselves to thin provisioning at the back-end, since all of the allocated space is zeroed out at creation time. This leaves us with the option of lazyzeroedthick, and this works just fine on thin provisioned devices.

In fact, many storage array vendors recommend this approach, as the management of the over-committed devices falls to the SAN admin, and as mentioned earlier, most of these arrays have alarms/events to raise awareness around space consumption, and the storage pool providing the thin provisioned device is easily expanded. One caveat though is the dead space accumulation caused by the deletion of files and the use of Storage vMotion & Storage DRS.

Thin on Thick

One of the very nice things about this approach is that, through the use of Storage DRS, when one datastore in a datastore cluster starts to run out of space, possibly as a result of thinly provisioned VMs growing in size, SDRS can use Storage vMotion to move VMs around the remaining datastores in the datastore cluster and avoid a datastore filling up completely. The other advantage is that there are no dead-space accumulation/reclamation concerns, as the storage on the back-end is thickly provisioned. One factor to keep in mind though is that thin provisioned VMDKs have slightly lower performance than thick VMDKs, as new blocks allocated to the VMDK must be zeroed out before I/O from the Guest OS is committed to disk. The metadata updates may also involve SCSI reservations instead of VAAI ATS if the array does not support VAAI. However, once these VMDKs have grown to their optimum size (little further growth), this overhead is no longer an issue/concern.
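
If the first-write zeroing overhead is a concern for a particular VM, a thin VMDK can also be inflated to its full, pre-zeroed size while the VM is powered off (the path below is just a placeholder):

# vmkfstools --inflatedisk /vmfs/volumes/datastore1/testvm/testvm.vmdk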

Thin on Thin

This is the option I get the most queries about. Wouldn't this give you the best of both worlds? While there is nothing inherently wrong with doing thin-on-thin, there is an additional management overhead incurred with this approach.

While VAAI has introduced a number of features to handle over-commitment, as discussed earlier, thin provisioning will still have to be managed at the host (hypervisor) level as well as at the storage array level. Keep in mind that this level of over-commitment could lead to out-of-space conditions occurring sooner rather than later. At the VMDK level, you once again have the additional latency of zeroing out blocks, and at the array level you have the space reclamation concern.

With all this in mind, you will have to trade off each of these options against each other to see which is the most suitable for your environment.

How much space does a thin disk consume?

On classic ESX, you can use the du command against a VMDK to determine this. On ESXi, you can use the stat command against a VMDK to get the same info.

# stat zzz-flat.vmdk
  File: "zzz-flat.vmdk"
  Size: 4294967296      Blocks: 3502080    IO Block: 131072 regular file
Device: c2b73def5e83e851h/14030751262388250705d Inode: 163613572   Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2012-02-29 13:04:53.000000000
Modify: 2012-03-01 15:29:11.000000000
Change: 2012-02-22 00:12:40.000000000

This is a thinly provisioned VMDK with a provisioned size of 4 GB (Size: 4294967296 bytes), but only ~1.8 GB (3502080 * 512-byte blocks) of space is actually consumed.
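
The classic ESX equivalent, using du against the same flat file, would look something like:

# du -h zzz-flat.vmdk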

But what's VAAI (vStorage APIs for Array Integration)?

vStorage APIs for Array Integration (VAAI) is an application program interface (API) framework from VMware that enables certain storage tasks, such as thin provisioning, to be offloaded from the ESXi host to the storage array.

Offloading these tasks lessens the processing workload on the hosts. For a storage administrator to make use of VAAI, the manufacturer of their storage system must have built support for VAAI into the storage system.

Introduced in vSphere 4.1 with support for block-based (Fibre Channel or iSCSI) storage systems, VAAI consisted of a number of primitives, or parts.

"Copy offload" enables the storage system to make full copies of data within the array, offloading that chore from the ESX server.

"Write same offload" enables the storage system to zero out a large number of data blocks to speed the provisioning of virtual machines (VMs) and reduce I/O.

"Hardware-assisted locking" (the ATS primitive mentioned earlier) allows the ESX host to offload VMFS locking operations to the storage system, so the array can lock at the block level during metadata updates rather than reserving the entire LUN.

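If you want to check which of these primitives a particular device supports, the per-device VAAI status maps onto them fairly directly: ATS Status corresponds to hardware-assisted locking, Clone Status to copy offload, Zero Status to write same, and Delete Status to the UNMAP primitive mentioned earlier. Any vendor VAAI plugins installed on the host can also be listed (the device identifier below is just a placeholder):

# esxcli storage core device vaai status get -d naa.60a98000486e2f65576f6c3454736962
# esxcli storage core plugin list --plugin-class=VAAI
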
In vSphere 5, vStorage APIs for Array Integration were enhanced. The most notable new functionality addresses thin provisioning of storage systems and expands support to network-attached storage (NAS) devices.