Intelligent Networks Transform Big Data into Information Capital
February 12, 2014

Over the past few years, public and private organizations have awakened to the knowledge hidden in Big Data. While a concrete definition of the term has yet to emerge, the explosion of dedicated tools, intelligent networks among them, strongly suggests that Big Data holds the key to limitless kingdoms of new understanding and discovery.

Bithika Khargharia, Ph.D. – Senior Engineer, Vertical Solutions and Architecture, Extreme Networks.

According to a widely quoted statistic from IBM, 90 percent of the world’s data has been created in the last two years alone. The world is now generating data at a rate of 2.5 quintillion bytes each day, the storage equivalent of roughly 78 million 32 GB iPads. According to UK analyst firm The Big Data Insight Group, every minute of every day YouTube users upload 48 hours of new video, 571 new websites are created, and Google receives over two million search queries. Nor is Big Data derived only from search, social, and web: other sources include sensors that monitor climate information, purchase transaction records, and cell phone GPS signals.

Compounding the challenge of sheer volume, much of what constitutes Big Data is unstructured. While structured data fits neatly into traditional database schemas, unstructured data is much harder to wrangle. Much of the value obtained from Big Data analytics now comes from the ability to search and query unstructured data, for example by using facial recognition algorithms to pick out an individual from a video clip containing thousands of faces.

Technology companies are working feverishly on technologies such as Hadoop MapReduce, Dryad, Spark, and HBase to turn all of this data quickly and efficiently into information capital for businesses and research organizations. At their heart, these technologies achieve speed and efficiency by parallelizing analytic computations across clusters of hundreds or thousands of servers connected via high-speed Ethernet networks. Mining intelligence from Big Data therefore fundamentally involves three steps: splitting the data across multiple server nodes, analyzing each data block in parallel, and merging the results. These three operations are repeated through successive stages until the entire dataset has been analyzed.
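The three steps can be illustrated with a minimal single-machine sketch. It is not how Hadoop or Spark is implemented: threads stand in for server nodes, word counting stands in for the analysis, and the function names (split, analyze, merge, word_count) are chosen here purely for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, num_nodes):
    """Split step: deal the records out into one block per simulated node."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def analyze(block):
    """Analyze step: count words in one block, independently of the others."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def merge(partials):
    """Merge step: combine the per-block counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(records, num_nodes=4):
    blocks = split(records, num_nodes)
    # Threads stand in for server nodes (an assumption of this sketch);
    # on a real cluster the blocks and partial results cross the network.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(analyze, blocks))
    return merge(partials)


if __name__ == "__main__":
    logs = ["big data big network", "network data", "big"]
    print(word_count(logs, num_nodes=2))
```

On a real cluster, the data movement hidden inside split and merge is exactly the traffic that lands on the network.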

Owing to the Split-Merge nature of these parallel computations, Big Data analytics can place a significant burden on the underlying network, regardless of its architecture. Even with the fastest servers in the world, data can be processed only as fast as the network can transfer it between servers in the Split and Merge phases. What we need is an intelligent network that, through each stage of the computation, adaptively scales to meet the high bandwidth requirements of the Split and Merge transfers, thereby improving not only job speed-up but also network utilization and the user experience.
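A back-of-the-envelope model makes the network bound concrete. The sketch below assumes the simplest possible case, in which transfer and computation do not overlap and a single bandwidth figure applies to both phases. The function name stage_time and every number are hypothetical, chosen only to show the shape of the trade-off.

```python
def stage_time(split_bytes, merge_bytes, compute_seconds, bandwidth_bytes_per_s):
    """Toy model of one stage: move data out, compute, move results back.

    Assumes transfers and computation do not overlap (a simplification).
    """
    transfer_seconds = (split_bytes + merge_bytes) / bandwidth_bytes_per_s
    return transfer_seconds + compute_seconds


if __name__ == "__main__":
    GB = 10**9
    ten_gbit = 10 * 10**9 // 8      # 10 Gb/s expressed in bytes per second
    forty_gbit = 40 * 10**9 // 8    # 40 Gb/s expressed in bytes per second

    # Hypothetical stage: 100 GB shipped out, 20 GB of results shipped back.
    baseline = stage_time(100 * GB, 20 * GB, 60.0, ten_gbit)
    faster_servers = stage_time(100 * GB, 20 * GB, 30.0, ten_gbit)
    faster_network = stage_time(100 * GB, 20 * GB, 60.0, forty_gbit)

    print(baseline, faster_servers, faster_network)
```

Under these assumed numbers, halving compute time takes the stage from 156 to 126 seconds, while quadrupling bandwidth takes it to 84 seconds. No amount of server speed gets the stage under the 96 seconds spent on the wire at 10 Gb/s.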

The Role of Software-Defined Networking in the Big Data Equation

The emergence of Software-Defined Networking (SDN) is a development with huge potential for building this intelligent, adaptive network for Big Data analytics. Because it separates the control plane from the data plane, SDN provides a well-defined programmatic interface through which software intelligence can program networks that are highly customizable, scalable and agile, meeting the requirements of Big Data on demand.
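The separation of control and data plane can be pictured with a toy model. The sketch below is not the OpenFlow protocol or any controller's API. The Switch and Controller classes and their methods are hypothetical, and matching on a destination address alone is a deliberate simplification.

```python
class Switch:
    """Data plane: forwards purely by looking up rules it has been given."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination address -> output port

    def install_rule(self, dst, out_port):
        self.flow_table[dst] = out_port

    def forward(self, dst):
        # Returns the output port, or None on a table miss.
        return self.flow_table.get(dst)


class Controller:
    """Control plane: holds the policy and programs every switch it manages."""

    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    def program_path(self, dst, hops):
        # hops is a list of (switch_name, out_port) pairs along the chosen path.
        for switch_name, out_port in hops:
            self.switches[switch_name].install_rule(dst, out_port)


if __name__ == "__main__":
    controller = Controller()
    edge, core = Switch("edge1"), Switch("core1")
    controller.register(edge)
    controller.register(core)

    # The switches know nothing until the controller programs them.
    print(edge.forward("10.0.0.7"))
    controller.program_path("10.0.0.7", [("edge1", 3), ("core1", 1)])
    print(edge.forward("10.0.0.7"), core.forward("10.0.0.7"))
```

The point of the split is that forwarding decisions live in one programmable place, so changing how traffic flows means changing software rather than reconfiguring each device by hand.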

This software intelligence, which is fundamentally an understanding of what the application needs from the network, can be derived with considerable precision and efficiency for Big Data applications. The reason is twofold: first, these applications follow well-defined computation and communication patterns, such as Hadoop’s Split-Merge (Map-Reduce) paradigm; second, they rely on centralized management structures, such as the Hadoop Scheduler or the HBase Master, that make it possible to leverage application-level information. With the aid of the SDN Controller, which has a global view of the underlying network (its state, its utilization, and so on), the software intelligence can accurately translate application needs into network behavior by programming the network on demand.
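One way such a translation could work is sketched below, as an illustration rather than a description of any shipping controller. It assumes the application's scheduler can supply a demand estimate for each upcoming transfer and that the controller knows the current load on every link. It then greedily places the largest transfers first on the least-loaded candidate path. All names (bottleneck, place_transfers) and all numbers are hypothetical.

```python
def bottleneck(path, link_load):
    """Load on the busiest link of a path (a path is a list of link names)."""
    return max(link_load[link] for link in path)


def place_transfers(transfers, candidate_paths, link_load):
    """Greedy placement: biggest transfer first, each on the least-loaded path.

    transfers: {(src, dst): demand}, e.g. hints from a job scheduler
    candidate_paths: {(src, dst): [path, ...]}
    link_load: {link: load}, the controller's global view (left unmodified)
    """
    load = dict(link_load)  # work on a copy of the global view
    placement = {}
    for pair, demand in sorted(transfers.items(), key=lambda kv: -kv[1]):
        best = min(candidate_paths[pair], key=lambda p: bottleneck(p, load))
        for link in best:
            load[link] += demand
        placement[pair] = best
    return placement, load


if __name__ == "__main__":
    # Hypothetical merge phase: three mappers send to one reducer over two spines.
    transfers = {("m1", "r1"): 4, ("m2", "r1"): 3, ("m3", "r1"): 2}
    paths = {pair: [["spine1"], ["spine2"]] for pair in transfers}
    view = {"spine1": 0, "spine2": 1}

    placement, load = place_transfers(transfers, paths, view)
    print(placement)
    print(load)
```

A production controller would weigh far more than link load, but the sketch shows why the two ingredients matter: predictable communication patterns make the demands knowable in advance, and the global view makes a sensible placement computable.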

SDN also offers other features that assist with the management, integration, and analysis of Big Data. SDN-oriented technologies, including the OpenFlow protocol and the OpenStack cloud platform, promise to make network management easier, more intelligent and highly automated. OpenStack enables the set-up and configuration of network elements with far less manual effort, and OpenFlow assists in network automation, giving the flexibility needed to support new pressures such as data center automation, BYOD, security and application acceleration.

From a scale standpoint, Software-Defined Networking also plays a critical role in developing network infrastructure for Big Data. It facilitates streamlined management of thousands of switches, as well as the interoperability between vendors that lays the groundwork for accelerated network build-out and application development. OpenFlow, a vendor-agnostic protocol that works with any vendor’s OpenFlow-enabled devices, enables this interoperability, unshackling organizations from the proprietary solutions that could hinder them as they work to transform Big Data into information capital.

As the implications and potential of Big Data become increasingly clear, ensuring that the network is prepared to scale to these emerging demands will be a critical step in securing long-term success. A successful solution will leverage two key elements: the existence of patterns in Big Data applications and the programmability of the network that SDN offers. From that vantage point, SDN is indeed poised to play an important role in enabling the network to adapt further and faster, driving the pace of knowledge and innovation.

© HPC Today 2024 - All rights reserved.
