This article was inspired by a recent "happy and proud" achievement reported by the Victorian State Government (Victoria being my home state). According to the government's reports, Victoria could show the fastest recovery of any state or territory after suffering through the worst downturn in the country, with new forecasts showing the national economy will grow 4.4 per cent in 2021. The Victorian economy is forecast to grow 5.3 per cent this year, outstripping an expected 4.6 per cent growth in Queensland and 4.4 per cent in NSW. Upon learning about such an "impressive turnaround", many Victorians started rejoicing. For months, we had been the worst-performing Australian state, and suddenly we have become the best-performing one!
Unfortunately, a more careful examination of the data sources and features makes the prematurity of the celebration obvious. Ever since the start of the pandemic-fuelled lockdowns and the consequent contraction of business activity, the Victorian economy suffered by far the greatest losses of all the states. The reported economic growth is therefore a direct result of the dramatic decline that took place right before the recovery commenced!
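The arithmetic of a rebound from a deep trough can be sketched in a few lines. The figures below are hypothetical placeholders, not actual Victorian GDP data; only the 5.3 per cent growth rate comes from the report quoted above.

```python
# Hypothetical figures for illustration only -- not actual Victorian GDP data.
baseline = 100.0   # pre-downturn economic output (indexed to 100)
downturn = -0.09   # assumed 9% contraction during the lockdowns
recovery = 0.053   # the reported 5.3% growth for the following year

after_downturn = baseline * (1 + downturn)       # falls to 91.0
after_recovery = after_downturn * (1 + recovery) # "impressive" growth...

print(f"After downturn: {after_downturn:.1f}")
print(f"After recovery: {after_recovery:.1f}")   # ...yet still below 100.0
```

Even the fastest growth rate in the country can leave the economy below where it started, because the percentage is applied to a shrunken base.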
The real-life scenario shared above is a fantastic illustration of MISUSING INTERPRETIVE ANALYTICS FOR DATA TORTURING! This article discusses the challenges and dangers of interpretive analytical studies, along with strategies for minimizing the risks of data misrepresentation and error.
Interpretive Analytics focuses not only (and not so much) on number crunching (quantitative analysis of the available data) but also on qualitative interpretation of the data sets. Furthermore, interpretivism is sometimes extended beyond the data sets and applied to the contextual settings (such as the assumption in the case discussed above about whether the preceding economic decline should be considered). Qualitative data analysis methods and tools are obviously essential for a balanced analysis, but they have to be applied with great care.
More specifically, for interpretive studies, Data Analytics teams need to address risks related to:
- Analytics Context Definition & Evolution
- Identification of the Greatest Variables
- Analytics Validation Processes
Analytics Context Definition & Evolution
As is evident from the economic recovery assessment example above, quantitative values cannot be taken out of their qualitative context, and vice versa. Naked figures are often representative of the data sets considered but fail to examine the origins of those data sets. The challenge of context definition is becoming increasingly critical as socio-economic environments become faster paced than ever! Furthermore, contextual environments are now subject to many newly emerging trends and influences.
One example of such emerging influences is the field of investment. Traditionally, when analysing data on consumers' investment patterns, data scholars had to consider not only alternative banking arrangements available from other banks but also alternative investment opportunities, such as stocks or real estate. This alone complicated their analysis, yet it involved far fewer dimensions than the analytics studies that must be undertaken now. Nowadays, the attractiveness of banking investment products is also affected by the profits, risks and, most importantly, public perceptions of crypto investments. This makes the context significantly more volatile and therefore more difficult to define. A bank may have great investment propositions, but if returns on cryptocurrencies are asymmetrical (e.g. higher but less consistent), the very context of investment banking will be affected severely.
Another challenge with defining the context for analytics projects is the difficulty of establishing the time periods and scopes relevant to the environment. It is sometimes hard to pinpoint the exact moment when data becomes outdated, as data depreciation is an ongoing process and does not happen overnight. For instance, how relevant is a 2-year-old data set to a study? If it is still relevant, should it be assigned a different (e.g. lower) weight than more recent data sets?
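One pragmatic way to treat data depreciation as gradual rather than overnight is to down-weight older data sets by age. The sketch below uses an exponential decay with an assumed half-life; the function name and the half-life value are illustrative choices, not a standard.

```python
def recency_weight(age_years: float, half_life_years: float = 1.5) -> float:
    """Down-weight a data set by age using exponential decay.

    The half-life is an assumed tuning parameter: data that old
    counts half as much as fresh data.
    """
    return 0.5 ** (age_years / half_life_years)

# A 2-year-old data set is not discarded outright -- it is down-weighted.
for age in (0.0, 1.0, 2.0, 3.0):
    print(f"{age:.0f}-year-old data -> weight {recency_weight(age):.2f}")
```

The analytics team still has to make a qualitative judgement (choosing the half-life), but the judgement becomes explicit, documented and consistent across data sets.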
The pandemic increased the pace of context evolution significantly. In virtually every industry, analytics activities now need continuous adjustment, as it is getting harder and harder to ensure ongoing data currency. With the ongoing changes to the business environment, context evolution is clearly a continuous process, so analytics tools and patterns have to be adjusted accordingly. On the other hand, not all data sources have been impacted to the same extent, so traditional data balancing arrangements also have to be reviewed!
Identification of the Greatest Variables
As discussed above, interpretive analytical projects inevitably involve some qualitative value judgements. To increase the accuracy and consistency of these judgements, variational scopes should be established. If it is impossible to assign exact values to the various dimensions of a multidimensional analytics project, then, at the very least, the greatest variables and their respective variational scopes need to be identified. This enables the analytics team to pinpoint the areas of greatest accuracy risk: the greater the possible output variations, the higher the interpretive analytics risks.
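Identifying the greatest variables can be approached by sweeping each variable across its variational scope while holding the others at baseline, then ranking the variables by how far the output swings. The model, variable names and ranges below are all hypothetical.

```python
# Hypothetical revenue model: which input carries the most interpretive risk?
def model(price, demand, churn):
    return price * demand * (1 - churn)

baseline = {"price": 10.0, "demand": 1000.0, "churn": 0.1}
# Assumed "variational scopes" -- the plausible range for each variable.
ranges = {"price": (8.0, 12.0), "demand": (700.0, 1300.0), "churn": (0.05, 0.3)}

swings = {}
for name, (low, high) in ranges.items():
    outputs = [model(**dict(baseline, **{name: value})) for value in (low, high)]
    swings[name] = max(outputs) - min(outputs)

# The variable with the widest swing is the "greatest variable".
for name, swing in sorted(swings.items(), key=lambda kv: -kv[1]):
    print(name, round(swing, 1))
```

Even when the exact values are judgement calls, the ranking tells the team where tighter scoping would pay off most.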
Interdependence of variables is arguably the most complex factor to account for in interpretive analytics. Variation in a single variable may have a profound impact on the analysis, as it may (directly or indirectly) influence many of the other variables. For example, in event management analytics studies, a lack of consistency in event attendance data will undermine the accuracy of the analysis for a large range of factors, such as catering requirements, security requirements, public health risks and quality assurance issues.
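The event-attendance example can be sketched as a set of planning outputs that all hang off one uncertain input, so any inconsistency in the attendance data ripples through every dependent figure. The ratios and costs below are assumptions for illustration only.

```python
# Hypothetical event-management model: several planning outputs all depend
# on one uncertain variable -- expected attendance.
def plan(attendance: int) -> dict:
    return {
        "catering_meals": attendance,                # one meal per attendee
        "security_staff": max(2, attendance // 50),  # assumed 1 guard per 50
        "budget": attendance * 35.0,                 # assumed $35 per head
    }

# Vary attendance within its variational scope and watch every
# dependent figure move with it.
for attendance in (400, 500, 600):
    print(attendance, plan(attendance))
```

A 20 per cent error in one input propagates into every downstream estimate, which is why interdependent variables deserve the tightest scoping.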
Analytics Validation Processes
While all analytics processes need validation, interpretive analytics processes require additional validation activities due to their qualitative nature. For instance, reproducibility is one of the critical aspects of any validation process. Reproducibility refers to the ability to repeat the analytics work as closely as possible, using the very same experimental procedures, data and tools, to confirm the initial results. In an interpretive environment, reproducibility is often problematic, as significantly greater deviations from a "mirror reproduction" can be expected than in purely quantitative data studies. This also opens possibilities for an analytics team to "play" with the data in order to bake in the "desired" analysis outcomes.
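A minimal reproducibility check for an analysis step that involves randomness is to run the same procedure twice with the same data, tools and seed, and confirm the results match. The `analyse` function here is a hypothetical stand-in for a real pipeline step.

```python
import random

def analyse(data, seed):
    """A stand-in analysis step with a random element (sampling)."""
    rng = random.Random(seed)       # isolated, explicitly seeded RNG
    sample = rng.sample(data, k=5)  # the "analysis": a random sample mean
    return sum(sample) / len(sample)

data = list(range(100))
first = analyse(data, seed=7)
second = analyse(data, seed=7)
print(first == second)  # same seed, same procedure, same result
```

Any divergence between the two runs flags a non-reproducible pipeline before interpretation even begins; interpretive judgements then need to be layered on top of a procedure that is at least mechanically repeatable.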
In light of the validation challenges discussed above, analytics teams should increase the number of testing rounds, and should attempt to come up NOT with a single set of data interpretations but with a variety of interpretive scenarios, to see whether the assumptions made are justifiable or not!
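Testing a variety of interpretive scenarios can be as simple as re-running the same comparison under alternative assumptions. Below, the growth figures are the ones reported in the opening example, but the prior-decline figures are invented for illustration; the point is that Scenario A (growth only) and Scenario B (net change including the decline) can crown different "winners".

```python
# Reported 2021 growth forecasts from the opening example (%).
growth = {"VIC": 5.3, "QLD": 4.6, "NSW": 4.4}
# Hypothetical prior contractions (%) -- invented for illustration.
prior_decline = {"VIC": -9.0, "QLD": -3.0, "NSW": -4.0}

def net_change(state):
    """Scenario B: account for the preceding decline, not just the rebound."""
    down = 1 + prior_decline[state] / 100
    up = 1 + growth[state] / 100
    return (down * up - 1) * 100

best_by_growth = max(growth, key=growth.get)  # Scenario A: growth only
best_by_net = max(growth, key=net_change)     # Scenario B: net of the decline
print("Scenario A winner:", best_by_growth)
print("Scenario B winner:", best_by_net)
```

When two defensible interpretations produce opposite conclusions, the assumptions themselves, not the arithmetic, are what need to be justified.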
Interpretive analytical methods are doomed to be doubted for their lack of accuracy (compared with quantitative data analysis methods) and "inferior" validity. Furthermore, they are prone to errors and discrepancies. However, there are many data analysis projects where numbers alone cannot tell the full story, and this is when interpretive methods, tools and even value judgements are required. While we cannot eliminate errors altogether, by addressing the three factors outlined in this article we can, at the very least, optimize the accuracy and effectiveness of our analytics projects!
If you found this article interesting, why not review the other articles in our archive?