Hard disk speed is increasingly turning out to be a bottleneck in cloud performance. Processors keep getting faster, and bandwidth, RAM and disk space have been dropping in price. Hard disk speed (disk IO), however, has not structurally improved over the last few years. Faster disks, such as Solid State Drives (SSDs), remain relatively expensive for the time being. The marginal improvement in disk IO has become ever more obvious as other resources, such as RAM and CPU, became more abundantly available to cloud users. This effect was especially pronounced on virtual servers running Linux, because most heavy applications run on Linux.
Some providers achieve a (modest) improvement by storing data on hard disks inside the servers that host the VPS. Because we offer our customers High Availability virtual servers, we store our customers’ data not on the host server itself but on shared storage machines. This allows us to build a redundant infrastructure, but it can also introduce a little extra latency between the virtual servers and their storage.
We believe the future eventually belongs to SSDs, but using these disks for all your storage results in a very expensive virtual server for the time being. We believe the best price/quality balance is currently achieved by applying storage differentiation: data that is read or written frequently is stored on fast storage media, while data that is not often requested is stored on a slower type of storage.
Increasing Hard Disk IO Throughput
At the beginning of this year we started taking structural measures to speed up hard disk IO significantly. We began using a more efficient way to control the disk arrays. Another measure we employed is caching.
Caching means that recent read or write requests are temporarily stored on a fast storage medium. With read caching, data that is read again during the caching period (a re-read) can be served from the fast disks.
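To make the read-caching idea concrete, here is a minimal sketch in Python of an LRU (least-recently-used) read cache. All names are illustrative and this is not the dm-cache implementation, only the principle: a re-read is served from the fast medium, and the least recently used block is evicted when the fast medium is full.

```python
from collections import OrderedDict

class ReadCache:
    """Illustrative LRU read cache: recently read blocks stay on a fast
    medium so a re-read avoids the slow backing store. (A simplified
    sketch of the principle, not the actual dm-cache code.)"""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity      # blocks the fast medium can hold
        self.backing = backing_store  # dict standing in for the slow disk array
        self.cache = OrderedDict()    # block id -> data, kept in LRU order
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.cache:            # a re-read: cache hit
            self.hits += 1
            self.cache.move_to_end(block)  # mark as most recently used
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]         # slow read from the disk array
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # evict least recently used block
        return data
```

A short usage example: with a two-block cache, reading block 1, then 2, then 1 again means the second read of block 1 is a fast cache hit rather than a trip to the slow disks.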
With write caching, write operations are first stored on an SSD or even in RAM so the virtual server can quickly continue with its next operations. Once a certain amount of write data has gathered, it is written to a cheaper but slower medium in one go. The first step in our caching project was increasing the available write cache within our clusters.
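The write-caching behaviour described above can be sketched the same way. This is again an illustrative toy, not our production code: writes land in a fast staging area immediately, and only once a threshold of gathered data is reached are they flushed to the slow medium in a single batch.

```python
class WriteCache:
    """Illustrative write-back cache: writes hit a fast medium first and
    are flushed to the slow backing store in one batch. (A simplified
    sketch of the principle, not the actual dm-cache code.)"""

    def __init__(self, flush_threshold, backing_store):
        self.threshold = flush_threshold  # flush once this many dirty blocks gather
        self.backing = backing_store      # dict standing in for the slow disk
        self.dirty = {}                   # blocks written to the fast medium only

    def write(self, block, data):
        self.dirty[block] = data          # fast write; the VPS continues at once
        if len(self.dirty) >= self.threshold:
            self.flush()

    def flush(self):
        self.backing.update(self.dirty)   # one batched write to the slow medium
        self.dirty.clear()
```

Note the trade-off this models: until `flush()` runs, the gathered data exists only on the fast medium, which is why a production write cache also needs redundancy or battery backing.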
However, we also wanted to improve read speed for our Linux virtual servers. The largest improvement can be achieved with SSDs located inside the servers hosting the virtual servers. Requested data is kept on these SSDs for a certain time, and if it is read again it is available at blistering speed. This is ideal for images or pieces of PHP code that are requested repeatedly.
Developing dm-cache with Florida International University
It quickly transpired that no software was available that could manage such a caching layer inside a server backed by an external storage system. We discovered that the Laboratory for Virtualized Infrastructure, Systems, and Applications (VISA) at Florida International University had developed an open source project named dm-cache that could be adapted to do the job. Dm-cache is a completely new open source caching methodology that operates from within the Linux kernel.
We subsequently got in touch with Dr. Ming Zhao, the assistant professor heading up the VISA group. He was very interested in expanding the project for cloud deployments, and we offered to support him in doing so.
To put our money where our mouth is, we have sponsored PhD student Dulcardo Arteaga since September 2011 so he could work on this project. We also provided continuous feedback and real-life performance data to the development process. On the 23rd of July we started a public beta with a couple of hundred customers and other interested parties. Dulcardo has been our guest for the entire summer to help with this project.
SSD Caching for Linux is Activated
We successfully finalised the beta test last week. Based on various tests, we expect the measures we took to yield a 100 to 150% increase in read speed, depending on the type of load. Write speed should rise by 50 to 100%, again depending on the type of load. As of Tuesday we are provisioning all new Linux virtual servers on clusters using dm-cache. This is an amazing improvement that will allow our customers to run ever faster applications in our cloud.
We will not increase the price for these virtual servers, so our customers effectively get a faster VPS for the same price. Over the next few months we will approach customers with higher IO demands for their Linux virtual servers in order to migrate them to our new clusters. Virtual servers that remain on the previous infrastructure for now will also reap some benefit from this. Let us know if your Linux VPS urgently needs an IO performance boost and we can give you some priority in the migration.