Scale-out Analytics

Problem

Analytics have already changed the worlds of business, science, education and government. Still, most of the data we have has not been used. An IDC research report, funded by Seagate Technology, states that 68% of enterprise data goes unused. [1] If we are to extract more value from data, we need to analyze more data.

Most of today’s enterprise data centers aren’t configured to handle the petabytes of data needed for large-scale analytics. Many enterprises move to the cloud to get the scale and ease of use that scale-out analytics requires. It can work, but it’s not cheap.

To give you an idea of how expensive cloud computing can be, consider Amazon Web Services. As reported in 2019, AWS was responsible for 13% of Amazon’s overall revenue but about 71% of Amazon’s overall operating profits. [2] However, you don’t need to go to the cloud to get the benefits of the cloud, and you might be able to save some money by staying on-premises. A data strategy that uses both cloud and on-prem resources appropriately gives you a framework for deciding which applications are best served where. This becomes more important over time because you pay to export data from the cloud. If you change your mind down the road, it might be prohibitively expensive to move a function back on-site.
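To make the export-cost point concrete, here is a minimal back-of-the-envelope sketch. The per-gigabyte rate is a placeholder assumption, not a quoted price from any provider, and the function name is hypothetical. Substitute the rate from your own contract.

```python
def egress_cost_usd(dataset_tb: float, price_per_gb_usd: float) -> float:
    """Estimate the one-time cost of exporting a dataset out of a cloud.

    dataset_tb       -- size of the data to move, in decimal terabytes
    price_per_gb_usd -- export (egress) rate per gigabyte; an assumption
                        supplied by the caller, not a published price
    """
    if dataset_tb < 0 or price_per_gb_usd < 0:
        raise ValueError("inputs must be non-negative")
    dataset_gb = dataset_tb * 1000  # 1 TB = 1000 GB (decimal units)
    return dataset_gb * price_per_gb_usd


# Placeholder rate for illustration only; replace with your provider's rate.
ASSUMED_PRICE_PER_GB_USD = 0.05

if __name__ == "__main__":
    for size_tb in (100, 1000, 5000):
        cost = egress_cost_usd(size_tb, ASSUMED_PRICE_PER_GB_USD)
        print(f"{size_tb:>5} TB -> ${cost:,.0f} to export")
```

Because the cost grows linearly with the size of the data set, the longer data accumulates in the cloud, the more it costs to bring it back.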

So back to the original problem: how should an enterprise build scale-out analytic infrastructure to take advantage of the massive amount of data available but still unused?

Solution

A significant reason that enterprises haven’t been executing scale-out analytics on premises is that the existing infrastructure is largely designed to accommodate the bread-and-butter databases that manage transactions, inventory and customer service. Moving to a scale-out analytics infrastructure is another matter.

One of the significant benefits that the cloud offers is a scale-out infrastructure for very large jobs. Cloud hyperscalers had to develop their own technology to provide it, and that technology was available to you only as a customer of the hyperscalers. One of the critical enabling technologies for scale-out infrastructure is just now coming to the general market: NVMe-oF.

The foundational technology, NVMe, has been around for a few years, and it has fundamentally changed the way hosts communicate with flash storage. Previous storage interfaces were built to accommodate hard drives, which are much slower, and they lack the queue count and command depth of NVMe. The resulting improvement in performance has been profound. The limitation of NVMe is that this performance is confined to the box: when the box is connected to the network, you’re plugging into an older, slower and less scalable storage network. NVMe-oF, or NVMe over Fabrics, extends NVMe across a fabric that is both scalable and very fast. NVMe-oF can run over Ethernet and IP, so there is no need to struggle with an all-new, unproven technology. You can upgrade and expand your existing IP network to support NVMe-oF.

Benefits

With the speed, scale and flexibility of NVMe-oF, you can create a cloud-like architecture on your premises and reap the benefits in performance, scale and flexibility as well as cost savings. Additionally, when you own your infrastructure on-prem, you are not dependent on the future generosity of a cloud vendor. Enterprise IT operations can benefit from the flexibility of a composable disaggregated architecture built with NVMe-oF to create configurations that suit the application. For example, you might compose lots of CPU power and ultra-fast storage for traditional database jobs, lots of GPU power and fast storage for scale-out analytics, or lots of CPU and less expensive storage for backup and archive workloads.
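The idea of composing configurations from shared pools can be sketched in a few lines. This is a toy model, not a real composability API: the class names, fields and resource quantities below are all hypothetical, chosen only to mirror the three example workloads above.

```python
from dataclasses import dataclass

RESOURCES = ("cpu_cores", "gpus", "fast_tb", "capacity_tb")


@dataclass
class Pool:
    """Disaggregated resources reachable over the fabric (hypothetical)."""
    cpu_cores: int
    gpus: int
    fast_tb: float      # NVMe flash
    capacity_tb: float  # less expensive bulk storage


@dataclass(frozen=True)
class Request:
    """Resources one workload asks for (hypothetical)."""
    name: str
    cpu_cores: int = 0
    gpus: int = 0
    fast_tb: float = 0
    capacity_tb: float = 0


def compose(pool: Pool, req: Request) -> bool:
    """Carve req out of pool if every resource is available.

    Returns True and deducts from the pool on success; returns False and
    leaves the pool untouched if any resource is short.
    """
    if any(getattr(req, r) > getattr(pool, r) for r in RESOURCES):
        return False
    for r in RESOURCES:
        setattr(pool, r, getattr(pool, r) - getattr(req, r))
    return True


if __name__ == "__main__":
    pool = Pool(cpu_cores=256, gpus=16, fast_tb=500, capacity_tb=4000)
    workloads = [
        Request("database", cpu_cores=128, fast_tb=100),
        Request("analytics", cpu_cores=32, gpus=16, fast_tb=300),
        Request("backup", cpu_cores=64, capacity_tb=3000),
    ]
    for w in workloads:
        print(w.name, "composed" if compose(pool, w) else "rejected")
    print("remaining:", pool)
```

The point of the sketch is that compute, accelerators and storage are drawn independently from shared pools, so each workload gets the mix it needs rather than whatever a fixed server happens to contain.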

Summary

Strong competitive forces are driving enterprises to make the most of their data. When scaling and flexibility are no longer limitations, you have an opportunity to get more out of your data by using more data. The opportunity to discover new insights, previously unseen with smaller data sets, can improve science, government, education and business. The abilities to understand your customers more profoundly, detail your operations more precisely, and manage your marketing more effectively are game changers that get better with more data. To get more out of your data you need a scale-out architecture, and NVMe-oF is the way to make it happen.

[1] https://apnews.com/press-release/business-wire/e7e1851ee8a74ca3acb1b089f6bd0fa8
[2] https://www.geekwire.com/2019/amazon-web-services-growth-slows-missing-analyst-expectations/#:~:text=AWS%20was%20responsible%20for%2013,of%20Amazon’s%20overall%20operating%20profits.