Know Your Storage Constraints: IOPS and Throughput
Application performance often hinges on how well your storage can serve data to end clients. For this reason you must correctly design or choose your storage tier in terms of both IOPS and throughput, which measure the operation rate and the data transfer rate of the storage, respectively.
Introduction to IOPS
IOPS stands for input/output operations per second. It’s a performance measurement for drives (HDDs or SSDs) and storage area networks, representing how many read and write operations a given storage device or medium can complete each second.
The “back end” IOPS relates to the physical constraints of the media itself, and equals 1000 milliseconds / (Average Seek Time + Average Latency), with the latter two also measured in milliseconds.
Back end IOPS depends on the rotational speed of the drive, if applicable (solid state drives do not rotate, while traditional hard disk drives do). The Average Latency in the above formula is the time it takes the disk platter to spin halfway around. It is calculated by dividing 60 seconds by the Rotational Speed of the disk, then dividing that result by 2 and multiplying by 1,000 to convert to milliseconds: ((60 / RPM) / 2) * 1,000.
Of course, for solid state drives, the average latency drops significantly, as there is no rotating disk inside. Therefore you can just plug in 0.1 ms as the average latency to account for network traffic between the processor in your server/virtual machine and the storage array or device. More on network issues shortly.
Average Seek Time is the time it takes for the head (the piece that reads data) to reach the area on the disk where the data is stored. The head needs to move around the storage area in order to locate the targeted data. You must average both read and write seek times in order to find the average seek time.
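The formulas above are easy to sketch in code. This is a back-of-the-envelope estimate only; the RPM and seek-time figures below are illustrative, not vendor specifications.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Time for half a platter revolution: ((60 / RPM) / 2) * 1,000 ms."""
    return ((60.0 / rpm) / 2.0) * 1000.0

def backend_iops(avg_seek_ms: float, avg_latency_ms: float) -> float:
    """Back-end IOPS = 1,000 ms / (Average Seek Time + Average Latency)."""
    return 1000.0 / (avg_seek_ms + avg_latency_ms)

# Illustrative 15,000 RPM HDD with a 3.4 ms average seek time
hdd_latency = avg_rotational_latency_ms(15000)     # 2.0 ms
print(round(backend_iops(3.4, hdd_latency)))       # ≈ 185 IOPS

# SSD: no rotation, so plug in the 0.1 ms stand-in latency noted above
print(round(backend_iops(0.0, 0.1)))               # 10,000 IOPS
```

Note how the rotational latency alone puts a hard ceiling on HDD IOPS, which is why the HDD and SSD ranges quoted below differ by orders of magnitude.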
Most of these ratings are given to you by the manufacturers. Generally an HDD will have an IOPS range of 55–180, while an SSD will have an IOPS range of 3,000–40,000.
Different applications require different IOPS and block sizes to function properly. A single application may even have different components that function at different size ranges for blocks. It is vital to check software providers’ recommendations around block size and performance.
Block loosely translates to any piece of data. File systems write entire blocks of data rather than individual bits and bytes. A file system block can stretch over multiple sectors, which are the physical disk sections. Blocks are abstractions over the hardware that may or may not be a multiple of the physical sector size. Every file consumes at least one full block no matter how small it is, so choosing the correct block size to efficiently consume storage can make a big difference when it comes to performance.
For example, SQL Server logs in 64 KB blocks, while Windows Server might use 4 KB blocks, and the underlying vSphere hypervisor uses 1 MB blocks. As you’ll see, the block size and IOPS have a snowballing effect on network throughput.
Small block sizes are preferred for a large volume of smaller files, because you are more efficiently using the storage (remember that even small files must consume an entire block — so a file that is only a couple of bytes will use an entire 4 KB block).
You’ll want to architect storage to properly align according to multiples of the base sectors. For example, four sectors of 4 KB fit neatly in a block of 16 KB, and four of those blocks fit in a stripe of 64 KB. Otherwise you run into additional latency and will have to defragment frequently, as your blocks will straddle sector and stripe boundaries (and perhaps even disks in a storage array) as you read and write. vSphere 5.0 and up will automatically align with a 1 MB block, as mentioned above.
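A minimal sketch of that alignment rule: each layer’s block size should be a whole multiple of the layer beneath it. The sizes below mirror the 4 KB / 16 KB / 64 KB example above and are illustrative.

```python
def is_aligned(upper_bytes: int, lower_bytes: int) -> bool:
    """True if the upper layer's block size is an exact multiple of the lower's."""
    return upper_bytes % lower_bytes == 0

SECTOR = 4 * 1024        # 4 KB physical sector
FS_BLOCK = 16 * 1024     # 16 KB file system block
STRIPE = 64 * 1024       # 64 KB array stripe

print(is_aligned(FS_BLOCK, SECTOR))    # True: 4 sectors per block
print(is_aligned(STRIPE, FS_BLOCK))    # True: 4 blocks per stripe
print(is_aligned(10 * 1024, SECTOR))   # False: a 10 KB block straddles sectors
```

Any layer that fails this check forces reads and writes to touch extra sectors or stripes, which is exactly the latency penalty described above.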
OK, So What About Throughput?
Throughput measures the data transfer rate to and from the storage media in megabytes per second. While your bandwidth is the measurement of the total possible speed of data movement along the network, throughput can be affected by IOPS and packet size. The network protocol can also change the overall throughput. Throughput is a measurement of the amount of data that actually moves along the network path, while bandwidth is the potential capacity of the path without any extenuating factors.
IOPS can therefore become a bottleneck for throughput. It is likely that your CPU and memory will not be a limiting factor, as they can handle more information per second than most storage devices and networks.
A serially attached storage device might only have a maximum throughput of up to 2,812 MBps, while Ethernet reaches up to 12,500 MBps and Fibre Channel networks reach 4,000 MBps.
The size and number of blocks traversing the network can make a big difference in throughput. A 1 MB block will take longer to pass through the network, but you will move fewer of them. A 4 KB block will move very quickly, but you’ll likely face a high volume of them.
With all of this in mind, it is vital to plan according to manufacturer and developer recommendations as well as real-world benchmarks to maximize your storage (and subsequently application) performance. Take a look at peak IOPS and throughput ratings, read/write ratios, RAID penalties, and physical latency. One of our cloud engineers will also be happy to help you select the proper cloud storage tier for each of your applications.