
Getting Behind Hadoop And On Top Of The Blurring Lines Between Traditional BI And The Daunting World Of Big Data

by Anil Paul, Principal Consultant, Wipro Analytics, Wipro Limited, January 2019

Hadoop has been around for more than a decade. Yet, to most, it still looks like a recent phenomenon. That is because it is now being called upon to enable entirely new capabilities and applications, such as Business Intelligence and Big Data insights.

Hadoop, EDW or both?

EDWs continue to work well for standard BI dashboards, scorecards, and even for online analytical processing (OLAP) cubes for advanced analytics across millions of records. Hadoop currently cannot match the ease with which EDW supports a wide range of traditional BI needs. In many instances, Hadoop may help enhance EDW capabilities. However, there are areas where Hadoop has a head start over EDW, especially where analytics would benefit from maintaining unprocessed data, such as for emerging data discovery platforms.

This is why Hadoop has stimulated the interest of BI professionals. It can be called upon to deliver increasingly sophisticated results from the multi-structured data that has become the norm. We believe enterprises will benefit immensely from adopting the Hadoop ecosystem. Its processing speed, architectural flexibility and scalability are exactly what has been missing in an increasingly complex global business environment. This is where the lines between traditional BI and the world of Big Data will begin to blur, leading to exponential value.

Simpler Hadoop tools on their way

Once enterprises realize how extensively Hadoop can advance their BI capabilities, they will no longer be overwhelmed by Big Data. They will begin to use it to predict riots and coups, the spread of a virus and the cost of delivering healthcare, the demand for solar energy and its impact on oil production, the chances that a specific container will not reach its retail destination in time, the likelihood of flight delays, and levels of attrition and churn.

But the harsh truth is that an alarming number of IT executives remain unclear about what Hadoop is or what it does. If anything, this spells an opportunity for those considering Hadoop: early deployment will mean a competitive advantage. In the months to come, this should become possible with new Hadoop tools and platforms that make deployment “non-nerdy”, or at least not as challenging as it is today. Leading the way will be tools that have the highest ROI and the simplest use cases, such as MapReduce, HDFS, Java, HBase, Pig, Mahout, ZooKeeper and HCatalog.

Opening the world to new data insights

The pre-Big Data world was top-down. Source data was staged using pre-defined data structures. In this world, developers designed the data and semantic model so users could ask questions such as “How will my promotion budget of X dollars impact inventory?” In other words, users would unearth relationships between pre-defined objects and their metrics.

The post-Big Data world is bottom-up. This is a world in which data users say, “I know the question, just give me the data to explore and I’ll reach my own conclusions.” In other words, the queries are ad hoc in nature, calling for modern BI architecture and schema-free analytical ecosystems. This is the ideal world of Peak Insights that users will demand and which enterprises should aim for. Hence, Hadoop and the world of Big Data are inevitable.
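To make the contrast between the two worlds concrete, here is a minimal Python sketch. It is not a Hadoop implementation, only an illustration of the idea under stated assumptions: the raw event records, the field names and the functions `load_with_fixed_schema`, `explore` and `field_counts` are all hypothetical and invented for this example. The first function mimics the top-down approach, where a schema is fixed before loading and anything outside it is dropped. The other two mimic the bottom-up approach, where raw data is kept as it arrived and structure is applied only when a question is asked.

```python
import json
from collections import Counter

# Hypothetical raw events, kept exactly as they arrived (illustrative data only).
RAW_EVENTS = [
    '{"type": "sale", "sku": "A1", "amount": 20.0, "channel": "web"}',
    '{"type": "sale", "sku": "B2", "amount": 35.5}',
    '{"type": "tweet", "text": "love the A1", "user": "u42"}',
]

# Top-down: the schema is decided before any data is loaded.
SALES_SCHEMA = ("sku", "amount")


def load_with_fixed_schema(lines):
    """Keep only sale records, and only the fields named in SALES_SCHEMA."""
    rows = []
    for line in lines:
        record = json.loads(line)
        if record.get("type") != "sale":
            continue  # records that do not fit the model are never staged
        rows.append(tuple(record[field] for field in SALES_SCHEMA))
    return rows


def explore(lines, predicate):
    """Bottom-up: parse the raw records at query time and filter ad hoc."""
    return [record for record in map(json.loads, lines) if predicate(record)]


def field_counts(lines):
    """Discover which fields exist in the raw data, without a schema."""
    counts = Counter()
    for line in lines:
        counts.update(json.loads(line).keys())
    return counts


if __name__ == "__main__":
    print(load_with_fixed_schema(RAW_EVENTS))
    print(explore(RAW_EVENTS, lambda r: "A1" in r.get("text", "")))
    print(field_counts(RAW_EVENTS))
```

In this sketch the fixed-schema load silently loses the `channel` field and the whole tweet record, while the exploratory functions can still reach them because the unprocessed data was retained. That is the trade-off the two paragraphs above describe, reduced to a toy scale.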

The use case for Hadoop and Big Data is being driven by a number of factors:

• The growth in data – sooner or later, enterprises will have to learn to deal with the reality of petabytes of structured and unstructured data

• Today’s businesses demand real-time insights from new and exotic data types (social media, streaming video and audio, IoT), and speed and scale are of paramount importance

• Enterprises can opt for the comfort of a hybrid architecture, making Hadoop and Big Data adoption almost risk-free

• There is an enterprise-wide digital transformation trend sweeping the world – and your data warehouse cannot afford to lag

• Hadoop scales at low cost, making it affordable

In effect, the only remaining barriers to Hadoop adoption are the newness of its applications and the lack of available investment.

But should these affect your BI and Big Data initiatives? We don’t think so. More (and cheaper) tools will become available through the growing number of vendors. Adoption will be simplified further by the increasing code automation that is on its way. And finally, with the efforts of a fast and inventive open source community, skill availability will not remain a challenge for long.