Virtualization is great: it can simplify deployments and hardware maintenance, and it provides greater resiliency. Lots of upside, but occasionally it introduces a new item that creates work for admin teams.
When VMware introduced vSphere-based thin provisioning for virtual machine disks, it was a huge help for many IT teams around the world. At the time (2009), thin provisioning in the storage cabinet was often a very expensive option. I had the privilege of doing the onsite training and implementation of vSphere 4.0 for a customer facing the purchase of another, bigger SAN because of the pre-allocated, monolithic virtual disk model that existed prior to vSphere 4.0. Once we had their environment upgraded to 4.0, they could perform a Storage vMotion and convert to thin-provisioned virtual machine disks. A $1MM SAN purchase was no longer necessary.
However, we all know there are trade-offs in every scenario. The trade-off here is that over time, the virtual machine disks grew through normal use by the guest OS. As the guest OSes created and then deleted temporary files, additional blocks were allocated on the underlying storage array but were no longer in use by the virtual machine, creating the allocated-but-unused block dilemma. The solution was to periodically run the unmap command manually:

esxcli storage vmfs unmap --volume-label=volume_label|--volume-uuid=volume_uuid --reclaim-unit=number
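For example, to reclaim space on a datastore labeled “Datastore01” (the label and reclaim unit below are placeholders; substitute values for your own environment), you could run:

esxcli storage vmfs unmap --volume-label=Datastore01 --reclaim-unit=200

The --reclaim-unit value sets how many VMFS blocks are reclaimed per iteration and can be omitted to accept the host default.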
Not ideal, but probably acceptable given the space savings thin provisioning could deliver in many deployments.
Seven years later, vSphere 6.5 entered the market, including version 6 of the VMFS file system. Yay! Unmap is now an automatic behavior . . . but it’s not immediate. Some legacy storage doesn’t handle large unmap jobs well, so in VMFS 6 it can take up to a day to see space reclaimed after delete operations.
If you have an older storage platform, it is recommended that you leave auto unmap as a background task. If your environment has high storage turnover, or if you are just a fan of instant gratification and you have a newer array, you can tune VMFS 6 to process the unmap faster when the guest OS deletes the blocks.
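Before changing anything, it helps to see how a given VMFS 6 datastore is currently configured. A minimal sketch, assuming a datastore labeled “Datastore01” (the label is a placeholder):

esxcli storage vmfs reclaim config get --volume-label=Datastore01

The output reports the reclaim granularity and priority currently in effect for that datastore.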
If you want to increase the rate of the unmap operations, the quickest way is to raise the space reclamation priority. To set this to anything other than “none” or “low”, you’ll need to drop out to esxcli to set the reclaim configuration. Changing the priority from low (the default) to medium doubles the rate at which the host sends unmap commands, from 25-50 MB/sec to 50-100 MB/sec. Hopefully you are asking yourself, “How will my array respond to that?” Before you make any changes to the automatic unmap configuration, check the specs and recommendations for your specific array.
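As a sketch, assuming the same placeholder datastore label and an array your vendor has confirmed can absorb the higher unmap rate, the priority can be raised through the same reclaim config namespace (verify the exact parameters against the esxcli reference for your ESXi build):

esxcli storage vmfs reclaim config set --volume-label=Datastore01 --reclaim-priority=medium

Setting the priority back to low returns the host to the default 25-50 MB/sec behavior.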
One last note: if your array uses anything larger than 1MB for its unmap granularity, VMFS 6 won’t perform the auto unmap; it only works with 1MB granularity. If your storage array uses a larger unmap granularity, you’ll still need to periodically run the manual unmap command.
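One way to sanity-check whether a backing device advertises the Delete (unmap) primitive at all is the VAAI status query in esxcli. The device identifier below is a placeholder, and note that this output does not report the array’s unmap granularity, which you’ll still need to confirm in your storage vendor’s documentation:

esxcli storage core device vaai status get -d naa.xxxxxxxxxxxxxxxx

Look for “Delete Status: supported” in the output for the device backing your VMFS 6 datastore.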
Take a look at our GTR schedule of vSphere training