
Big Data Analytics: Idea, Data Types and Reference Architecture

A DSF Whitepaper
01 January 2020
Balakrishnan Subramanian

  1. INTRODUCTION

    Analytics has arguably been around since 1663, when John Graunt dealt with overwhelming amounts of information, using statistics to study the bubonic plague. In 2017, 2,800 experienced professionals who worked with Business Intelligence were surveyed, and they predicted that Data Discovery and Data Visualization would become an important trend. Data Visualization is a form of visual communication (think infographics). It presents data that has been translated into a schematic form, including changes, variables, and fluctuations. The human brain processes visual patterns very efficiently.

    Visualization models are steadily becoming more popular as an important method for gaining insights from Big Data. (Graphics are already common, and animation will become common. At present, data visualization models are somewhat clumsy and could use some improvement.)

    Data Analytics is the science of analyzing data to convert it into useful knowledge. This knowledge can help us understand our world better and, in many contexts, enable us to make better decisions.

    A schematic view of AI, ML, and Big Data Analytics

     

    Difference between Traditional Analytics and Big Data Analytics

    Type | Traditional Analytics or Business Intelligence (BI) | Big Data Analytics
    Focus on | Descriptive analytics; diagnostic analytics | Predictive analytics; data science
    Data sets | Limited data sets; cleansed data; simple models | Large-scale data sets; more types of data; raw data; complex data models
    Supports | Causation: what happened, and why? | Correlation: new insight; more accurate answers
  2. IDEA OF BIG DATA
    1. Methods of obtaining knowledge

      The scientific method consists of the following phases: question (model), hypothesis, prediction, testing and analysis.

      • Explorative: theory starts from empirical observation of phenomena and experimentation
      • Constructivism: theory starts from axioms and reasons out their implications (other theoretical approaches)
    2. Types of Data Analytics and Value of Data
      • Descriptive analytics
        • “What happened?”
      • Diagnostic analytics
        • “Why did this happen, and what went wrong?”
      • Predictive analytics
        • “What will happen?”
      • Prescriptive analytics
        • “What should we do, and why?”
    3. The Fourth Paradigm

      In short:

      (Big) Data + Analytics ⇒ Insight (prediction of the future)

      For example, for industry: insight = business advantage and money.

      Types of analytics used: follow an explorative approach and study the data.

      To infer knowledge, use statistics or a machine learning algorithm, then construct a theory (model) and validate it with the data.
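The explorative workflow described above (study the data, infer a model with statistics, then validate the model against data) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the observations are invented, the model is an ordinary least-squares line, and the names `fit_line` and `mean_abs_error` are hypothetical helpers, not part of any method named in this whitepaper.

```python
from statistics import mean

# Hypothetical observations (advertising spend, units sold); invented for illustration.
observations = [(1, 12), (2, 19), (3, 29), (4, 37), (5, 45), (6, 52), (7, 61), (8, 70)]


def fit_line(points):
    """Infer a model y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_abs_error(model, points):
    """Validate a fitted model: average absolute gap between prediction and data."""
    slope, intercept = model
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


# Construct the theory (model) from the first six observations only...
train, holdout = observations[:6], observations[6:]
model = fit_line(train)

# ...and validate it against observations the model has not seen.
error = mean_abs_error(model, holdout)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, holdout error={error:.2f}")
```

Holding back part of the data is one simple way to check that the inferred model predicts rather than merely describes; a real analysis would likely use more data and a more careful validation scheme.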

  3. DATA TYPES USED IN ANALYTICS

    Many data types are involved in Big Data analytics: structured, unstructured, geographic, real-time media, natural language, time series, event, network and linked. It is important here to distinguish human-generated data from device-generated data, since human data is often less reliable, noisy and unclean.

    A short description of each type is given below.

     
    • Structured data

      Structured data is stored in rows and columns and is mostly numerical, with the meaning of each data item defined. This type of data constitutes about 10% of today's total data and is accessible through database management systems. Example sources of structured (or traditional) data include official registers created by governmental institutions to store data on individuals, enterprises and real estate, and sensors in industry that collect data about processes.

    • Unstructured Data

      Unstructured data takes many forms, e.g. text, image, video, document, etc. It can also appear as customer complaints, contracts, or internal emails. This type of data accounts for about 90% of the data created in this century. In fact, the explosive growth of social media (e.g. Facebook and Twitter) since the middle of the last decade is responsible for the major part of the unstructured data that we have today. Unstructured data cannot be stored using traditional relational databases.

    • Geographic data

      Geographic data relates to roads, buildings, lakes, addresses, people, workplaces, and transportation routes, and is generated from geographic information systems. These data link place, time, and attributes (i.e. descriptive information). Digital geographic data have huge advantages over traditional sources such as paper maps, written reports from explorers, and spoken accounts, in that digital data are easy to copy, store, and transmit.

    • Real-time media

      Real-time media is the real-time streaming of live or stored media data. A distinguishing characteristic of real-time media is the amount of data being produced, which will become even more challenging in the future in terms of storage and processing. One of the main sources of media data is services such as YouTube, Flickr, and Vimeo, which produce a huge amount of video, pictures, and audio. Another important source of real-time media is video conferencing (or visual collaboration), which allows two or more locations to communicate simultaneously through two-way video and audio transmission.

    • Natural language Data

      Natural language data is human-generated data, particularly in verbal form. Such data differ in their level of abstraction and level of editorial quality. Sources of natural language data include speech capture devices, landline phones, mobile phones, and the Internet of Things, which generates large volumes of text-like communication between devices.

    • Time series

      A time series is a sequence of data points (or observations), typically consisting of successive measurements made over a time interval. The goal is to detect trends and anomalies, identify context and external influences, and compare an individual against the group or the same individual at different times. There are two kinds of time series data: (i) continuous, where we have an observation at every instant of time, and (ii) discrete, where we have an observation at (usually regularly) spaced intervals. Examples of such data include ocean tides, counts of sunspots, the daily closing value of the Dow Jones Industrial Average, and the level of unemployment measured each month of the year.

    • Event data

      Event data is generated by matching external events with time series. This requires separating the important events from the unimportant ones. For example, data related to vehicle crashes or accidents can be collected and analyzed to help understand what the vehicles were doing before, during and after the event. The data in this example is generated by sensors fixed at different places on the vehicle body. Event data consists of three main pieces of information: (i) action, which is the event itself, (ii) timestamp, the time when the event happened, and (iii) state, which describes all other information relevant to the event. Event data is usually described as rich, denormalized, nested and schemaless.

    • Network data

      Network data concerns very large networks, such as social networks (e.g. Facebook and Twitter), information networks (e.g. the World Wide Web), biological networks (e.g. biochemical, ecological and neural networks), and technological networks (e.g. the Internet, telephone and transportation networks). Network data is represented as nodes connected through one or more types of relationship.

    • Linked data

      Linked data is built upon standard Web technologies such as HTTP, RDF, SPARQL and URIs to share information that can be semantically queried by computers (rather than serving human readers). This allows data from different sources to be connected and read. The term was coined by Tim Berners-Lee, director of the World Wide Web Consortium, in a design note about the Semantic Web project.
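Three of the data types described above have a shape that is easy to show in code: a time series (regularly spaced observations checked for anomalies), an event record (action, timestamp, state), and network data (nodes connected by typed relationships). The following Python sketch is illustrative only. All values are invented, and the names `find_anomalies`, `Event`, and `build_graph` are hypothetical; the two-standard-deviation rule is just one simple way to flag an anomaly, not a method prescribed by this whitepaper.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime
from statistics import mean, stdev

# Time series: regularly spaced observations (hypothetical daily closing values).
closing_values = [100.0, 101.0, 99.5, 100.5, 100.0, 112.0, 100.5, 99.0]


def find_anomalies(series, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(series), stdev(series)
    return [i for i, value in enumerate(series) if abs(value - mu) > threshold * sigma]


# Event data: action, timestamp, and a free-form (nested, schemaless) state.
@dataclass
class Event:
    action: str                                 # the event itself
    timestamp: datetime                         # when the event happened
    state: dict = field(default_factory=dict)   # all other relevant information


crash = Event(
    action="airbag_deployed",
    timestamp=datetime(2020, 1, 1, 8, 30),
    state={"speed_kmh": 62, "sensors": {"front": "impact", "rear": "none"}},
)


# Network data: nodes connected through one or more types of relationship.
def build_graph(edges):
    """Build an adjacency list mapping each node to its (relationship, neighbour) pairs."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph


graph = build_graph([
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("alice", "friend_of", "carol"),
])

print(find_anomalies(closing_values))
print(crash.action, crash.state["sensors"]["front"])
print(graph["alice"])
```

Keeping `state` as a plain dictionary mirrors the "nested and schemaless" description of event data, and storing the relationship type on each edge allows several kinds of link between the same pair of nodes.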

  4. BIG DATA ANALYTICS REFERENCE ARCHITECTURES

    Big Data is becoming a new technology focus in both science and industry, and it motivates a technology shift towards data-centric architecture and operational models. There is a vital need to define the basic information/semantic models, architecture components and operational models that together make up a so-called Big Data Ecosystem.

    Extended Relational Reference Architecture:

    This is largely a Relational Reference Architecture; however, the parts marked with yellow squares cannot handle big data challenges.

  5. CONCLUSION

    Big data is a broad, rapidly evolving topic. While it is not well suited to all types of computing, many organizations are turning to big data for certain types of workloads and using it to supplement their existing analysis and business tools. Big data systems are uniquely suited to surfacing difficult-to-detect patterns and providing insight into behaviors that are hard to find through conventional means. By correctly implementing systems that deal with big data, organizations can gain incredible value from data that is already available.
