
Real Time Data Analytics: Mitigating the Risks and the Challenges

A DSF Whitepaper
09 September 2020
Michael Baron

In the early days of Data Analytics (DA), companies would sort data, assemble the data sets and establish the respective requirements before carrying on with the analysis. No wonder that by the time the analytics activities had been completed and the reports generated, the findings would often turn out to be outdated and could no longer be used for anything other than historic analysis. As the main purpose of employing DA in a commercial environment is performance improvement and optimization rather than making sense of historic studies, once Real-Time Data (RTD) analysis tools and patterns emerged, they were immediately embraced by DA practitioners. While the practical benefits of Real-Time DA over historic data analytics are obvious, it is also essential to understand and cater for its risks, challenges and limitations.

Real-time (aka live) data is processed and analysed immediately upon collection, so the information provided is not delayed. For example, if we are to follow the current state of the COVID-19 spread in Victoria, Australia (where I happen to reside), only the latest daily figures can provide a real-time snapshot of the situation. Live processing systems enable companies to adjust their activities and processes based on the latest data if required. So what could possibly go "wrong" if we rely on the "very latest facts"? Can Real-Time Data let us down?

The author believes that while the importance and effectiveness of Real-Time DA are beyond doubt, we should acknowledge and consider the following four key challenges:

  • Data Validation
  • DA Process Optimization
  • Trend Representation
  • "Seasonal" Data

 

Data Validation

Data Validation involves confirming the accuracy and quality of both the data sources and the data sets collected and assembled from those sources. It is essential for ensuring the fitness and consistency of the data prior to carrying out further analytical studies. Needless to say, it is the initial task that every DA process has to go through, and should it fail, the entire process is going to be corrupted.

Validation of historic data can be carried out at one's convenience. The validation planning process can be started by establishing the validation requirements and the time frame that fulfilling those requirements is going to take. However, in the case of Real-Time Data, it is essential that the validation process is completed promptly; otherwise, by the time the data is validated, it will no longer be current and fit for analysis. This forces analytics teams to adopt very tight timelines (as opposed to some of the projects that involve working with historic data) for getting the validation processes completed.

So how can Real-Time Data be processed fast, given that speedy processing itself opens the door to further risks that need mitigating? On the one hand, sticking to strict timelines will result in a greater percentage of discrepancies in data source management, formatting, error identification and correction, etc. On the other hand, extending the processing timelines increases the risk of the data losing a significant share of its value and relevance altogether. With Real-Time data, the principles of diminishing data value (e.g. 90/90) are particularly telling.

In cases where the currency of the data is short-lived, validation processes have to be fully automated, as there is no time left to apply "additional tests and check-ups", even on a random basis. This significantly increases the risk of data validation failures.
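A fully automated validation step of the kind described above could be sketched as follows. This is only an illustration under stated assumptions: the field names, the five-minute freshness budget and the sample records are hypothetical and do not come from any particular system. The point it shows is that a record can fail either on quality (missing or malformed fields) or on currency (it is too old to count as real-time).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rules: the required fields and the freshness budget are assumptions.
REQUIRED_FIELDS = ("source", "value", "collected_at")
MAX_AGE = timedelta(minutes=5)


def validate_record(record, now):
    """Return a list of validation problems; an empty list means the record passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing field: {field}")
    value = record.get("value")
    if value is not None and not isinstance(value, (int, float)):
        problems.append("value is not numeric")
    collected_at = record.get("collected_at")
    if collected_at is not None and now - collected_at > MAX_AGE:
        problems.append("record is stale")
    return problems


def split_valid(records, now):
    """Partition records into accepted records and (record, problems) rejections."""
    accepted, rejected = [], []
    for record in records:
        problems = validate_record(record, now)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected


# Made-up sample data, for illustration only.
now = datetime(2020, 9, 9, 12, 0, tzinfo=timezone.utc)
records = [
    {"source": "sensor-a", "value": 41.5, "collected_at": now - timedelta(minutes=1)},
    {"source": "sensor-b", "value": "n/a", "collected_at": now - timedelta(minutes=2)},
    {"source": "sensor-c", "value": 39.0, "collected_at": now - timedelta(minutes=30)},
]
accepted, rejected = split_valid(records, now)
for record, problems in rejected:
    print(record["source"], problems)
```

Keeping the rejection reasons alongside each rejected record is a deliberate choice here: with no time for manual spot checks, the reasons are the only audit trail the team has for reviewing the automated rules later.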

 

DA Process Optimization

DA works best when it is a dynamic process that is subjected to ongoing reviews and updates rather than a static process that is pre-set at the commencement of a DA project or operation. The analysis patterns and dimensions need to be fine-tuned throughout. For example, the initial approaches to identifying, processing and analysing the COVID-19 data (e.g. daily cases, sources of the cases, clusters etc.) have been updated as we have learned more about the virus patterns as well as the analysis patterns that are to be used. Likewise, investment companies keep upgrading the DA applications they need to make sense of trading data. These updates and upgrades take place in response to ongoing process reviews.

While DA Process Optimization appears to be standard for "grand-scale" projects, from a smaller firm's perspective it is a tedious task. Historic DA enables smaller firms to use a "lessons learned" approach in which they keep reviewing completed projects. This approach is significantly harder to implement for Real-Time DA process optimization. As discussed in the Data Validation section, the timelines are very tight and the process optimization has to be as real-time as the data.

Another aspect of DA Process Optimization (at times, even more critical than fixing errors) is the incorporation of additional tools and technologies. Any such update requires preliminary pilot testing. In a dynamic environment, testing also often has to be done in live mode, with the changes accepted or rejected "on the move". This is particularly demanding with multi-level analysis. At each of the levels, findings are consolidated and passed on to the next stage of the analysis. The true impact of the process optimization may not be apparent without assessing its full impact on the later (more advanced) stages.

Last but not least, as DA may involve the creation of Data Marts, further optimization challenges can be anticipated. Data Marts are subsets of data warehouses that focus on specific aspects of the data. In other words, they are more condensed and more focused than the complete data warehouses. The DA process may therefore involve the creation of multiple Data Marts, each requiring its own range of optimization processes. I can recall several instances where, instead of a single team overseeing the entire DA process from start to finish, the Data Marts were managed by various teams throughout the enterprise. That required further consolidation of the findings and greater collaboration throughout the optimization processes. If not for Agile (aka collaborative) approaches, DA process optimization would be mission impossible.

 

Trend Representation

With Real-Time DA, we are aiming not only to identify the current situation but also to look into future developments. When drilling down into the data, we need to establish sufficient trends to model the possible scenarios as well as the relative probability of those scenarios. In a dynamic environment, multiple trends (at times, even conflicting ones) can be anticipated. The role of DA is to establish those trends accurately.

Real-Time data is "hot out of the stove", but it is not always sufficient for pinpointing trends, as many of these trends can only be identified correctly through studies of longitudinal data. This would entail combining the Real-Time data with the historic data and balancing the two data types accordingly. If the historic data is to be brought into the DA for the purpose of trend identification, the project will have to incorporate a number of additional dimensions, such as historic data "cut-off" dates, additional formatting and source verification.
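To make the idea of combining Real-Time data with historic data up to a "cut-off" date more concrete, here is a minimal sketch. The function names, the choice of a least-squares slope as the trend measure, and all the figures are assumptions made purely for illustration; a real project would pick its own trend model and would also need the formatting and source verification steps mentioned above.

```python
from datetime import date
from statistics import mean


def combine(historic, live, cutoff):
    """Merge historic points dated on or after `cutoff` with live points.

    Live values take priority when both sources report the same date.
    """
    merged = {day: value for day, value in historic if day >= cutoff}
    merged.update(dict(live))
    return sorted(merged.items())


def slope_per_day(series):
    """Ordinary least-squares slope of value against day offset."""
    xs = [(day - series[0][0]).days for day, _ in series]
    ys = [value for _, value in series]
    x_bar, y_bar = mean(xs), mean(ys)
    denominator = sum((x - x_bar) ** 2 for x in xs)
    if denominator == 0:
        raise ValueError("need at least two distinct dates")
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denominator


# Made-up figures, for illustration only.
historic = [
    (date(2020, 8, 1), 100.0),  # older than the cut-off, so it is ignored
    (date(2020, 9, 1), 10.0),
    (date(2020, 9, 2), 12.0),
    (date(2020, 9, 3), 14.0),
]
live = [(date(2020, 9, 4), 16.0), (date(2020, 9, 5), 15.0)]

combined = combine(historic, live, cutoff=date(2020, 9, 1))
print("live only:", slope_per_day(live))
print("live + historic:", slope_per_day(combined))
```

In this made-up example the two live readings on their own suggest a decline, while the same readings placed after the retained history still sit on an upward trend. Moving the cut-off date changes the answer, which is why the cut-off has to be treated as an explicit dimension of the project.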

 

"Seasonal" Data

Seasonality of data refers to the data experiencing regular and predictable changes. Any predictable fluctuations that occur over a distinct time period (e.g. winter) could be referred to as seasonal, but with Real-Time data it is not always easy to establish "seasons". Furthermore, the Real-Time data sets may cover time frames that do not match (e.g. are shorter than) entire seasons.

Longitudinal studies incorporate seasonality as one of the dimensions, but with stand-alone sets of Real-Time data it is sometimes difficult to establish whether a particular phenomenon is within the data character, out of character, or simply a seasonal factor. The shorter the data life cycle is, the harder it is for the DA teams to address the data seasonality!
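The three-way question above (in character, out of character, or seasonal) can be illustrated with a small sketch. It assumes that a usual baseline and a same-season baseline are available from historic data and that a simple z-score threshold is an acceptable test; the function names, the threshold of 2.0 and the sample figures are all hypothetical.

```python
from statistics import mean, stdev


def z_score(value, sample):
    """Distance of `value` from the sample mean, in sample standard deviations."""
    spread = stdev(sample)
    if spread == 0:
        return 0.0 if value == mean(sample) else float("inf")
    return abs(value - mean(sample)) / spread


def classify(reading, usual_history, same_season_history, threshold=2.0):
    """Label a live reading against a usual baseline and a same-season baseline."""
    if z_score(reading, usual_history) <= threshold:
        return "in character"
    if len(same_season_history) < 2:
        # Without a seasonal baseline the question cannot be settled.
        return "undetermined"
    if z_score(reading, same_season_history) <= threshold:
        return "seasonal"
    return "out of character"


# Made-up figures, for illustration only.
usual = [100, 98, 103, 101, 99]      # readings outside the season
winter = [198, 205, 201, 196, 200]   # readings from previous winters

for reading in (101, 203, 260):
    print(reading, classify(reading, usual, winter))
print(203, classify(203, usual, []))
```

The last call is the stand-alone Real-Time case: with no seasonal history to compare against, a reading of 203 cannot be told apart from a genuine anomaly, so the sketch returns "undetermined" rather than guessing.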

 

Conclusion

To sum up, it should be emphasized once again that this paper has not been put together with the aim of discouraging Real-Time DA. Overall, the Real-Time approach is definitely far more dynamic and responsive to the fast-paced environment we are currently operating within. However, the more complex DA gets, the more diligent Data Analysts need to be when executing the analytics processes. It shows once again that DA is NOT about the tools used but about the PEOPLE who are using the tools!


Comments:

Mayank Tripathi

20 Sep 2020 11:15:35 PM

Agreed. Over time, researchers have found various tools and techniques, so that today one can do real-time data analysis, but again it depends on the person and how creatively (s)he uses those tools and techniques. There can be multiple ways to achieve a result; using the right weights and parameters is the most effective part of it.


Thanks for sharing this valuable information.
