Harnessing the Value of Data in Motion

Data is vital. It can drive an organisation forward, enabling informed decision making and hyper-specific customisation, among other benefits. It is so important that organisations need to work out how best to uncover insights from troves of data of various sizes and formats, scattered across on-premises, cloud and edge environments.

Dealing with data at rest (data already stored in a specific place) is challenging in itself. Handling data in motion is harder still.

Data in motion, on its own, is just like any other data. But when it is analysed while still in transit, it can be extremely valuable to a business: the organisation gets insights in real time and can act on them straight away. Consider:

  • Sensor data from supply chains can be analysed while still in transit to optimise the supply chain process, detect anomalies and predict possible problems for quick remediation.
  • Health information from wearables can be used for real-time monitoring of a patient’s condition.
  • Organisers of sports or entertainment events, and venue operators, can analyse data in motion to show real-time ticket availability and to offer personalised experiences and navigation recommendations.
  • Couriers can use traffic data to optimise routes in real time and avoid congested roads.
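
To make the first example above concrete, here is a minimal sketch of analysing readings as they arrive rather than after they are stored. It is an illustration only, not part of any Cloudera product: the function name `detect_anomalies`, the window size, the threshold and the sample temperatures are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling window.

    Hypothetical helper: each reading is checked as it arrives, using only
    the last `window` readings, so nothing has to be stored first.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the reading if it is more than `threshold` standard
            # deviations away from the mean of the recent window.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)


# Made-up cold-chain temperatures (degrees Celsius) with one spike.
temperatures = [4.0, 4.1, 3.9, 4.0, 4.2, 4.1, 9.5, 4.0, 4.1]
anomalies = list(detect_anomalies(temperatures))
print(anomalies)
```

Because the function is a generator, it can be fed from any iterable source, including one that never ends, and it reports a spike as soon as the offending reading is seen.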

Put simply, analysing data while it is still in transit helps an organisation be proactive and make informed, data-driven decisions quickly. The problem is that maintaining control of and protecting data in motion is extremely difficult, because business networks have expanded tremendously over the years. More nodes are connected to these networks, and more data moves across those nodes, often simultaneously.

This is why organisations need a streaming data architecture focused primarily on processing data in motion effectively. But not just any streaming data architecture will do. What organisations need is Cloudera’s unified end-to-end streaming architecture, which is anchored on three critical, complementary tenets, as outlined in the Cloudera Data-in-Motion Philosophy whitepaper:

  1. Flow Management. This is the process of collecting, distributing and transforming data across multiple points of producers and consumers.
  2. Streams Messaging. This is the provisioning and distribution of messages between producers and consumers.
  3. Stream Processing and Analytics. This is the process of generating real-time analytical insights from the data being streamed between producers and consumers.
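
As a rough illustration of how these three tenets fit together, the sketch below wires them up with in-memory stand-ins. Nothing here is a Cloudera API: `InMemoryBroker`, `flow_management`, `stream_processing`, the topic names and the sample events are all hypothetical, and a real deployment would use dedicated services for each layer.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Hypothetical stand-in for the streams-messaging layer."""

    def __init__(self):
        self._topics = defaultdict(deque)

    def publish(self, topic, message):
        self._topics[topic].append(message)

    def consume(self, topic):
        # Hand out messages in arrival order until the topic is drained.
        queue = self._topics[topic]
        while queue:
            yield queue.popleft()


def flow_management(raw_events, broker):
    """Collect raw events, normalise them and route each one to a topic."""
    for event in raw_events:
        record = {
            "source": event["source"].lower(),
            "value": float(event["value"]),
        }
        broker.publish(record["source"], record)


def stream_processing(broker, topic):
    """Consume one topic and yield a running average after every message."""
    total, count = 0.0, 0
    for record in broker.consume(topic):
        total += record["value"]
        count += 1
        yield total / count


# Made-up events from two producers, with inconsistent formatting.
raw_events = [
    {"source": "Truck", "value": "10"},
    {"source": "WAREHOUSE", "value": "3"},
    {"source": "truck", "value": "20"},
]
broker = InMemoryBroker()
flow_management(raw_events, broker)
truck_averages = list(stream_processing(broker, "truck"))
print(truck_averages)
```

The point of the separation is that each layer can change independently: producers only need to know how to hand events to flow management, and consumers only need to know which topic to read.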

That architecture is delivered by Cloudera DataFlow (CDF), a scalable, real-time streaming data platform that ingests, curates and analyses data in motion. In particular, DataFlow helps organisations:

  • Process real-time streaming data at high volume and scale.
  • Track data provenance, as well as the lineage of streaming data.
  • Manage and monitor edge applications and streaming sources.
  • Glean insights and actionable intelligence from streaming data in real time.

Cloudera DataFlow, which is available for Data Hub or for the public cloud, provides the complete toolset an organisation needs to manage, secure and govern its data from the edge to the cloud. That’s because it features the three tenets described above (Flow Management, Streams Messaging, and Stream Processing and Analytics) and integrates with the Shared Data Experience (SDX) of the Cloudera Data Platform. The result is an end-to-end streaming architecture that not only unifies data security and governance but also enables real-time analysis of data in motion.

No less than NASA, through its Reanalysis Ensemble Service (RES), is using CDF for backend analytics of the voluminous and ever-growing climate research data that RES processes daily. Put simply, CDF lets researchers analyse large data sets without first downloading them, so they spend more time on analysis and less on moving data around. This is just one real-world use case of CDF, but a workload as demanding as NASA’s suggests the platform can serve many other organisations too.

To find out more about CDF and how it can help your organisation make the most out of data in motion, click here.

DSA Editorial

The region’s leading specialist IT news publication focused on Data Lifecycle, Storage Infrastructure and Data-Driven Transformation. DSA has nearly 17,000 e-news subscribers, over 6,500 unique visitors per day, over 20,000 social media followers and a reputation for deep domain knowledge.
