Performance Implications of SSDs in Virtualized Hadoop Clusters

Virtualizing a Hadoop cluster fragments its I/O pattern: requests incur more seeks and use smaller I/O block sizes. On HDD-based clusters this pattern creates an I/O bottleneck. SSDs, by contrast, degrade only slightly because they have no mechanical parts and therefore no seek overhead. Some degradation remains, however, because SSDs favor large I/O blocks that can exploit the parallelism of their internal structure. With respect to VM deployment, performance in the HDD-based environment worsens as the number of VMs increases. The SSD-based environment is almost unaffected by the number of VMs, because it has no I/O bottleneck and sustains high CPU utilization for parallel work.
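
The reasoning above can be illustrated with a toy cost model in which every request pays a fixed positioning cost before data is transferred. This is only a sketch under stated assumptions: the latency, bandwidth, and block-size values below are hypothetical round numbers chosen for illustration, not measurements from this work, and the model ignores SSD internal parallelism, caching, and queueing.

```python
def effective_throughput(block_bytes, access_latency_s, bandwidth_bytes_per_s):
    """Bytes per second when each request pays a fixed positioning cost."""
    transfer_s = block_bytes / bandwidth_bytes_per_s
    return block_bytes / (access_latency_s + transfer_s)


# Hypothetical device parameters (assumptions, not taken from the source).
HDD = {"access_latency_s": 8e-3, "bandwidth_bytes_per_s": 150e6}
SSD = {"access_latency_s": 1e-4, "bandwidth_bytes_per_s": 500e6}

# Hypothetical request sizes before and after fragmentation.
LARGE_BLOCK = 4 * 1024 * 1024
SMALL_BLOCK = 256 * 1024


def retained_fraction(device, large_block=LARGE_BLOCK, small_block=SMALL_BLOCK):
    """Fraction of large-block throughput that survives fragmentation."""
    return (effective_throughput(small_block, **device)
            / effective_throughput(large_block, **device))


if __name__ == "__main__":
    for name, device in (("HDD", HDD), ("SSD", SSD)):
        kept = retained_fraction(device)
        print(f"{name}: keeps {kept:.0%} of its large-block throughput")
```

Under these assumed parameters the HDD loses most of its throughput when requests shrink, because positioning time dominates each small request, while the SSD loses only a small share. The exact ratios depend entirely on the chosen numbers.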


