Parallel File Systems for Big Data at SNW

SNW, formerly known as Storage Network World, is a cooperative effort between SNIA and Computerworld. 

Interesting SNW observations:

Lots of flash storage: some for block architectures, some NAS, some cards, quite a variety of flash memory solutions.  The reason for all the interest is that the gap between server and storage performance has reached a critical level.  Combined with a whopping drop in the cost of NAND flash memory, this is changing the storage world.  This change will be felt in the Big Data world too.

Another observation: JBOD is now JBOSS.  There are no disk drives in the latest flash-based systems, so “just a bunch of disks” doesn’t make sense.  I propose “just a bunch of solid state,” or JBOSS.  It sounds cool too.

SNIA has a proposed next generation distributed file system, but so have a bunch of others.  A presentation by Philippe Nicolas of Scality caught my attention.

Things don’t scale well at the petabyte level, of course.  And things based on hierarchical models don’t parallelize well at all.  A potential solution for a Big Data approach is a distributed file system.  Think along the lines of how RAID parity is spread over multiple drives and used to reassemble the data in the event of a drive failure.  In a parallel file system, the file system is spread over a number of servers and assembled as needed.  This means the relational DB model is out, and a key-value model (recognize that, Hadoop lovers?) replaces it.  It also means that file systems become more of a peer-to-peer proposition.  In such architectures, the tradeoff can be performance, since some sort of global namespace is needed to hold the metadata.  Done right, with some replication of the data, performance can actually be better.  The latest versions are built on approaches from Gnutella, Napster, BitTorrent and others.  The new approaches can actually be used for legitimate purposes.
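To make the key-value, peer-to-peer idea a bit more concrete, here is a minimal sketch of one common way to place keys on peers: a consistent-hash ring with replication.  This is my own illustration, not something shown in the presentation, and it is not how any particular product works.  The names HashRing, KeyValueStore and the node names are all made up for the example, and the "cluster" is just a set of in-memory dictionaries.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the hash ring (stable across runs)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy peer-to-peer placement: every peer owns an arc of the ring."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        # Sorted (position, node) pairs; any peer can compute this locally,
        # so no central directory is needed to find a key.
        self.ring = sorted((ring_position(n), n) for n in nodes)
        self.positions = [p for p, _ in self.ring]

    def owners(self, key: str):
        """Return the distinct nodes responsible for `key`, primary first."""
        start = bisect.bisect_right(self.positions, ring_position(key))
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


class KeyValueStore:
    """In-memory stand-in for a cluster: one dict per node (an assumption)."""

    def __init__(self, nodes, replicas=2):
        self.ring = HashRing(nodes, replicas)
        self.storage = {n: {} for n in nodes}

    def put(self, key: str, value: bytes) -> None:
        # Write the value to every replica that owns the key.
        for node in self.ring.owners(key):
            self.storage[node][key] = value

    def get(self, key: str, failed=()):
        """Read from the first replica that is not marked as failed."""
        for node in self.ring.owners(key):
            if node not in failed:
                return self.storage[node][key]
        raise KeyError(key)


store = KeyValueStore(["node-a", "node-b", "node-c", "node-d"], replicas=2)
store.put("/data/report.csv", b"hello")
primary = store.ring.owners("/data/report.csv")[0]
# The value is still readable when the primary replica is down.
value = store.get("/data/report.csv", failed=(primary,))
```

The point of the replication is the one made above: with more than one copy, a read can go to whichever peer is available, and no single server has to hold the whole namespace.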

Philippe’s thesis was that traditional file systems have hit their limit, and things will have to move to parallel structures.  One example of a distributed file system is Apache’s Hadoop Distributed File System (HDFS).  It creates a federation of nodes that acts as the file system, with a high availability name node for the metadata.  It is built in Java and can scale to 120 PB.

Another example is the Google File System (GFS).  The reason they need a massive file system is obvious.  GFS is now at version 2; it takes the data in 1 MB chunks and distributes them across a multitude of nodes.  The metadata is kept in multiple distributed master nodes.  MooseFS and CloudStore take a similar approach.
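The chunk-and-distribute idea can be sketched in a few lines.  This is only an illustration under simplifying assumptions, not Google's implementation: ToyCluster and its methods are hypothetical names, chunks are placed round-robin, nothing is replicated, and the metadata lives in a single dictionary rather than in multiple distributed master nodes.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MB, the chunk size quoted above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Cut a byte string into fixed-size chunks (the last one may be short)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class ToyCluster:
    """Hypothetical stand-in for a set of storage nodes plus a metadata map."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}  # node -> {chunk_id: bytes}
        self.metadata = {}  # file name -> ordered list of (chunk_id, node)

    def write(self, name: str, data: bytes) -> None:
        node_names = list(self.nodes)
        placement = []
        for index, chunk in enumerate(split_into_chunks(data)):
            # Round-robin placement; a real system would also weigh load and
            # free space, and would keep several copies of every chunk.
            node = node_names[index % len(node_names)]
            chunk_id = f"{name}#{index}"
            self.nodes[node][chunk_id] = chunk
            placement.append((chunk_id, node))
        self.metadata[name] = placement

    def read(self, name: str) -> bytes:
        # The metadata says where each chunk lives; the file is reassembled
        # by fetching the chunks in order.
        return b"".join(
            self.nodes[node][chunk_id] for chunk_id, node in self.metadata[name]
        )


cluster = ToyCluster(["node-1", "node-2", "node-3"])
payload = bytes(range(256)) * (10 * 1024)  # 2.5 MB of sample data
cluster.write("sample.bin", payload)
restored = cluster.read("sample.bin")
```

Because each chunk sits on a different node, the chunks of one large file can be written and read by many machines at once, which is where the scalability comes from.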

Parallel NFS (pNFS) is an update of a long-time tool, NFS.  pNFS separates the metadata from the data, so that clients can read and write directly, and in parallel, across many storage nodes, which improves performance and availability.  It is being looked at closely in SNIA.
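As a rough illustration of the parallel access pattern pNFS aims for, the sketch below has a client ask a metadata service where a file's stripes live and then fetch them concurrently from several data servers.  It is a toy model, not the NFS protocol or any real client API: MetadataServer, DataServer, parallel_read and the tiny stripe size are all assumptions made for the example, and for brevity the metadata server also does the writing.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe; tiny so the example is easy to follow


class DataServer:
    """Hypothetical storage node that holds stripes keyed by (file, index)."""

    def __init__(self):
        self.stripes = {}

    def read(self, stripe_id):
        return self.stripes[stripe_id]


class MetadataServer:
    """Hypothetical metadata service: it knows the layout, not the data path."""

    def __init__(self, data_servers):
        self.data_servers = data_servers
        self.layouts = {}  # file name -> list of (server index, stripe id)

    def create(self, name: str, data: bytes) -> None:
        layout = []
        for n, offset in enumerate(range(0, len(data), STRIPE_SIZE)):
            server_index = n % len(self.data_servers)
            stripe_id = (name, n)
            stripe = data[offset:offset + STRIPE_SIZE]
            self.data_servers[server_index].stripes[stripe_id] = stripe
            layout.append((server_index, stripe_id))
        self.layouts[name] = layout

    def get_layout(self, name: str):
        return self.layouts[name]


def parallel_read(metadata_server: MetadataServer, name: str) -> bytes:
    layout = metadata_server.get_layout(name)  # one metadata lookup
    servers = metadata_server.data_servers
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # map() preserves input order, so stripes come back in file order.
        parts = pool.map(lambda entry: servers[entry[0]].read(entry[1]), layout)
        return b"".join(parts)


servers = [DataServer() for _ in range(3)]
mds = MetadataServer(servers)
mds.create("notes.txt", b"parallel file systems")
contents = parallel_read(mds, "notes.txt")
```

After the single layout lookup, the bulk of the traffic goes straight to the data servers, so the metadata service stays out of the data path.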

Lustre is an object-based system that has evolved from widespread academic roots, including Carnegie Mellon.  It is commonly used in academic HPC environments where massive scalability is required.

Check for more details on their view of file system evolution and current state of the art.

My opinion is that Big Data is the right approach when there is no one single answer.  If the question is “What were the company’s earnings this quarter?”, it is best answered with a relational DB and traditional data approaches with ACID rigor.  If you want to know why your sales are off, that is a great Big Data problem: it will require a lot of information and will probably generate a thought-provoking answer that leads to more questions.  Big Data is here to stay, but so are relational approaches, so to get the best results, use both.



About Big Data Perspectives

Erik Ottem has over 25 years of technology experience with IBM, Seagate, Gadzoox Networks and Agilent. Observations and comments about Big Data are presented for your review.