Thursday, October 20, 2016

HopsFS based on MySQL Cluster 7.5 delivers a scalable HDFS

The swedish research institute, SICS, have worked hard for a few years on
developing a scalable and a highly available Hadoop implementation using
MySQL Cluster to store the metadata. In particular they have focused on the
Hadoop file system (HDFS) and the YARN. Using features of MySQL
Cluster 7.5 they were able to achieve linear scaling in number of name
nodes as well as in number of NDB data nodes to the number of nodes
available for the experiment (72 machines). Read the press release from
SICS here

The existing metadata layer of HDFS is based on a single Java server
that acts as name node in HDFS. There are implementations to ensure
that this metadata layer have HA by using a backup name node and to
use ZooKeeper for heartbeats and a number of Journalling nodes to
ensure that logs of changes to metadata are safely changed.

With MySQL Cluster 7.5 all these nodes are replaced by a MySQL Cluster
installation with 2 data nodes (or more NDB  data nodes if needed to scale
higher) to achieve the same availability. This solution scales by using many
HDFS name nodes. Each 2 NDB data nodes scales to supporting around
10 name nodes. SICS made an experiment where they managed to
scale HDFS to using 12 NDB data nodes and 60 name nodes where they
achieved 1.2 millions file operations per second. The workload is based on
real-world data from a company delivering a cloud-based service
based on Hadoop. Most file operations are a combination of a number of
key lookups and a number of scan operations. We have not found any
limiting factor for scaling even more with even more machines
available.

This application uses ClusterJ, ClusterJ is a Java API that access
the MySQL Cluster data nodes directly using a very simple API.

The application uses a lot of scans to get the data, so the
application takes advantage of the improved scalability of scans
as present in 7.5. Given that it is a highly distributed application
a lot of the CPU time is spent in communicating, so the new adaptive
algorithms for sending is ensuring that performance is scaling nicely.

SICS have also developed a framework for installing MySQL Cluster in
the cloud (Amazon, Google, OpenStack) or on bare metal. I tried this
out and got the entire HopsFS installed on my laptop by doing a few
clicks on a web page on my desktop and pointing out the address of my
laptop. This uses a combination of Karamel, a number of Chef cookbooks
for MySQL Cluster, and a number of cookbooks for installing HopsFS.
Karamel uses JClouds to start up VMs in a number of different clouds.

No comments: