When the ruthless pandemic hit the State of Victoria (Australia), the State Government placed its hopes for containing the spread on what it referred to as a “highly sophisticated” contact tracing program. Unfortunately, the initial contact tracing efforts appear to have failed miserably: the authorities struggled to trace virus carriers’ contacts efficiently and to identify the people and groups that could be classified as “at risk”.
The purpose of this paper is to explain the reasons behind this failure and, more specifically, the data analytics mistakes committed. The paper is based on the author’s contribution to the “Inquiry into the Victorian Government’s COVID-19 Contact Tracing System and Testing Regime” that was carried out by the Victorian State Government. It aims not only to highlight the mistakes, misjudgements and discrepancies but also to outline strategies and patterns for running COVID-19 contact tracing analytics programs.
Setting the Scene
Now that the so-called second wave of the pandemic is over, and Victorians are reflecting on their journey out of it while keeping their fingers crossed that the pandemic does not return to our garden state, a post-mortem of the Government’s analytical programs appears timely. The complete data for the contact tracing activities is now available, and we can see how the tracing evolved over time.
Initially, when the wave commenced, both the people and the State Government felt optimistic about the State’s ability to trace the virus and contain its spread. To improve data collection and dissemination, the COVIDSafe app was introduced and was expected to become a panacea for tracing virus carriers and their contacts. Furthermore, every single case of COVID-19 was to be recorded in a state-wide database, with the contacts of the virus carriers identified and monitored.
At first sight, the measures appeared sufficient for combating the spread of the infection. However, the approach selected by the Victorian health authorities failed to withstand the test of time. While some of the other Australian States remained virus free (thanks to our unique “island status” and ability to isolate ourselves from the rest of the world), Victoria was clocking over 700 cases a day. The quarantine failure for travellers returning from overseas let the genie out of the bottle, and the State authorities failed to put it back because the large number of cases proved “impossible to trace”. By now, the failure of the contact tracing analytics systems has been confirmed. Consequently, the big question is: what went wrong and, more specifically, at what point and how exactly did the Government fail in its COVID-19 data analytics activities?
COVIDSafe: A Case of a “Faulty App” or a Case of Poor Data Processing and Analytics?
COVIDSafe is the app that the Australian Government adopted for monitoring and slowing the spread of the coronavirus. To quote the Victorian State and Human Services website:
“When you are near someone who also has the app installed and running, your device will take note of this contact by securely logging the other person’s reference code, in a process called a ‘digital handshake’.
Then if you or someone you have been in contact with tests positive for coronavirus (COVID-19), the information securely stored by the app in your phone can be uploaded and used – with your consent – by state or territory health officials. This will help to quickly trace people who have been exposed to the virus.”
The app is powered by Bluetooth technology and, at the time, was promoted as an easy-to-download-and-use tool. That was certainly true of the downloading part, which was very easy. From an analytics perspective, the app also appeared fairly straightforward to use (at least the Victorian health authorities thought so).
However, the author believes that several discrepancies occurred in the handling of the COVIDSafe analytics, namely:
- Unreliable Data Collection and Processing Methods
- Poor COVIDSafe Adoption Rate
- Lack of Segment-Defining Variables
The Technology Implementation Challenge Explained
As the author is no health analytics expert, this paper does not analyse in detail the reasons behind the abnormally high numbers of “false positives” and “false negatives” identified during the testing stages. These are clear signs of analytics failures rather than failures of the health professionals, and they made the data processing, and consequently the performance of the COVIDSafe app in identifying virus carriers, extremely unreliable. However, it should have been apparent from the very start of the app’s implementation that it was not going to perform consistently in high-density buildings such as large shopping venues, exhibition centres and cinema complexes, which are traditionally known for poor mobile reception. Such connectivity disruptions were likely not only to affect the workability of the app at peak times inside those buildings but also to create additional obstacles to data collection.
Some of the suburbs that have been experiencing mobile network connectivity issues are populous, which dramatically reduces the overall accuracy of the analytics. Furthermore, Open Signal coverage maps suggest that in some parts of Victoria there are also problems with signal reception and transmission for some of the major Australian mobile network providers. Translated into “data science language”: the analytics was carried out on incomplete data, and findings from such data are inaccurate by default!
Proponents of the COVIDSafe implementation have suggested that the data collection and analysis challenges outlined above were “hard to anticipate at the time of implementation”, as this was the very first major pandemic of modern times (at least in Australia) and there was insufficient time for the analytics team to complete pre-release testing in full.
While there is little disagreement that this is indeed our very first pandemic of modern times (and consequently the very first challenge of handling data collection on such a grand scale), that was certainly NOT a valid justification for the data scientists to put an untested app to work and “keep finetuning the processes as required while the App was in use already”. On the contrary, greater emphasis should have been placed on both the “known unknowns” and the “unknown unknowns”, and this was not done to a sufficient extent! Furthermore, creators and users of the BlueTrace protocol (the open-source application protocol that the analytics team used to create the COVIDSafe app) had already discovered many of the challenges and the adjustments required to complete the finetuning, and one can assume that their feedback on the protocol was known to the Australian team prior to the COVIDSafe deployment. Without going into technical details, there appears to be little evidence that those shortcomings were taken into account prior to the app’s release, as the discrepancies outlined persisted throughout the app’s deployment in the State of Victoria!
The lower-than-expected COVIDSafe adoption rate was yet another factor that hindered data collection and analytics. Notably, adoption was particularly low among certain segments of the Victorian community, namely older Victorians and Victorians from less-privileged backgrounds (low-income earners, the unemployed, refugees etc.). No evident efforts were made to increase adoption of the app among those groups. As a result, even if it could be assumed that the COVIDSafe app managed data collection and analytics effectively, data on many COVID-19 carriers was inevitably going to be missing from the analytics. Given that those segments and communities are the most vulnerable of all, this once again proved one of the “unspoken rules of data science”: the data that is hardest to collect and analyse is often particularly critical for handling the analytics successfully!
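A back-of-the-envelope calculation illustrates why uneven adoption matters so much. A digital handshake can only be logged when both people in an encounter are running the app, so under the simplifying assumption that people adopt the app independently of each other, the share of encounters captured between two segments is roughly the product of their adoption rates. The segment names and rates in this sketch are hypothetical placeholders, not measured Victorian figures.

```python
from itertools import combinations_with_replacement

# Hypothetical adoption rates per segment (illustrative only, not real data).
ADOPTION = {
    "general": 0.40,
    "older": 0.20,
    "low_income": 0.15,
}


def capture_probability(rate_a: float, rate_b: float) -> float:
    """Chance that an encounter is logged, assuming both people must run
    the app and that they adopt it independently of each other."""
    return rate_a * rate_b


def capture_table(adoption: dict) -> dict:
    """Capture probability for every unordered pair of segments."""
    return {
        (a, b): capture_probability(adoption[a], adoption[b])
        for a, b in combinations_with_replacement(adoption, 2)
    }


if __name__ == "__main__":
    for (a, b), p in capture_table(ADOPTION).items():
        print(f"{a:>10} x {b:<10} {p:.1%} of encounters logged")
```

Under these assumed rates, even the best-covered pairing logs well under a fifth of encounters, and encounters within the least-covered segments are almost invisible to the analytics.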
The Technology Implementation Challenge in Victoria: Has the Analytics Deployment Done “More Good… Than Damage”?
Inner-city Melbourne (19,500 people per sq km) is currently the most densely populated SA2 (Statistical Area Level 2) in all of Australia. Overall, Victoria has the second-highest population density of all the Australian States and Territories, behind only the ACT (Australian Capital Territory). Therefore, given the doubtful analytics management performance, it was obvious from the very start that in the inner suburbs of Melbourne deployment of the app could do more damage than good, not only by providing misleading information to the State authorities but also by giving people armed with the app a false sense of security!
As pointed out by SAS, macro forecasting of the COVID-affected environment is hard, but micro forecasting is even harder. Reports that discuss contact tracing dynamics are also usually hard to adapt for use by “ordinary people”: if in-depth and detailed, they rely on fairly complex techniques (e.g. social network analysis), and it is not clear whether such material can be adapted for the general population as easily as classic works of literature can be turned into “easy reading” books for language learners.
More specific shortcomings can be identified by comparing how data analytics was handled in the State of Victoria with how it was handled in other States and countries. A lot has been written on the obvious differences between the contact tracing systems of New South Wales (the neighboring State) and Victoria. One distinct feature of the “Victorian saga” has been a significantly greater share of mystery cases (cases with no identified source of infection) in the total number of cases than in NSW. Once the rise in mystery cases became apparent, one would have expected swift action to address the problem by reviewing the analytics processes and protocols, yet it took the State health authorities far too long to do so. It also became clear that, with many mystery cases emerging, the COVIDSafe app was becoming far less reliable for all stakeholders, yet it remained in use and was promoted for download rather proactively. In other words, from a data science perspective, it was nothing short of so-called “data torturing”.
As discussed above, the COVIDSafe adoption rate appears to have been particularly poor in the locations and clusters that became virus hotspots. The very idea behind employing data analytics is not just to collect quantitative data but to analyse it efficiently and use it to improve the virus response. Once it became evident that certain communities and socio-economic and age groups were being left behind in the app’s adoption and usage, action should have been taken to address the discrepancy! The Victorian health authorities needed to look more closely at preliminary analytics tools and patterns rather than wait for months for the inevitable confirmation that their analytical activities were not being handled in an optimal manner. Preliminary analytics is all about making speedy adjustments to the analytics procedures based on the data collected.
To sum up the Victorian health authorities’ data analytics struggles: in light of the failure of the COVIDSafe app to provide reliable and accurate information, the Victorian State Government should have considered (in consultation with the Federal authorities, obviously) advising the people of Victoria NOT to rely on the app too much! In some of the other States and Territories of Australia, deployment of the app may (just may!) have had its merits, but in Victoria, as far as contact tracing and keeping the public safe were concerned, that was not the case, and the sense of “false security” could only do harm!
In a way, the app’s data collection and analysis was a collaborative effort between the State health authorities and so-called “citizen science”. While the more “technical” discrepancies in the data handling are discussed below (see the section “The Data Analytics Challenge: The Failure Explained”), the poor performance of the app shows that one of the core principles of data analytics was broken: the data analysis and the consequent findings were not tailored to the needs of the clients (that is, the end users).
The Data Analytics Challenge: The Failure Explained
It is clear by now that in the State of Victoria, data analytics for contact tracing was not carried out in the best possible manner. The fairly recent overhaul of the initial contact tracing system suggests that the contact tracing management team is already fully aware of this (irrespective of whether they choose to admit it). Even without comparing the sheer effectiveness of the Victorian contact tracing efforts with efforts elsewhere in Australia and internationally, there are some obvious shortcomings to be noted. So what was it that went wrong?
In data science, when sorting and processing collected data, it is essential that the data features are identified and labeled accurately, so that databases can be assembled in which all of the data sets are brought to a common denominator. Needless to say, the data should be processed and analysed without any data torturing or data dredging along the way.
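One way to read “common denominator” is a single agreed schema onto which every incoming source is mapped before any analysis takes place. The sketch below assumes two hypothetical feeds, a laboratory export and an interview form, that use different field names and date formats. All record fields, function names and sample values are illustrative assumptions and do not describe any actual Victorian system.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class CaseRecord:
    """Common schema every source is normalised into (hypothetical fields)."""
    case_id: str
    postcode: str
    test_date: date
    source_known: bool


def from_lab_export(row: dict) -> CaseRecord:
    """Map a row of the hypothetical lab feed: ISO dates, 'Y'/'N' flag."""
    return CaseRecord(
        case_id=row["id"].strip(),
        postcode=row["postcode"].strip().zfill(4),
        test_date=date.fromisoformat(row["collected"]),
        source_known=row["linked"].strip().upper() == "Y",
    )


def from_interview_form(row: dict) -> CaseRecord:
    """Map a row of the hypothetical interview feed: day/month/year dates,
    free-text likely source (blank means the source is unknown)."""
    return CaseRecord(
        case_id=row["CaseRef"].strip(),
        postcode=str(row["PostCode"]).zfill(4),
        test_date=datetime.strptime(row["TestDate"], "%d/%m/%Y").date(),
        source_known=bool(row["LikelySource"].strip()),
    )


if __name__ == "__main__":
    records = [
        from_lab_export({"id": "A1", "postcode": "3000",
                         "collected": "2020-07-30", "linked": "N"}),
        from_interview_form({"CaseRef": "B2", "PostCode": 3021,
                             "TestDate": "30/07/2020",
                             "LikelySource": "household"}),
    ]
    for record in records:
        print(record)
```

Once every feed passes through a mapper like this, downstream analysis can rely on one set of field names, one date type and one definition of a known source, instead of simply pooling raw rows.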
Once the contact tracing data analytics commenced, not only the data scientists but also the epidemiologists were immediately able to identify a number of discrepancies. To quote Catherine Bennett, Chair of Epidemiology, Deakin University:
“Fundamentally, NSW’s system of decentralized local area health districts meant when the second wave hit, that state was able to draw on teams embedded in their local communities to manage contact tracing. These teams worked independently but also in concert under the mothership of NSW Health.”
Furthermore, despite being a health professional rather than a data scientist, Dr Bennett was “spot on” with her assessment of the data analytics patterns and routine:
“What’s crucial is a nuanced understanding of local, social, and cultural factors that may facilitate spread or affect how people understand self-isolation and what’s being asked of them. It can also make a critical difference in encouraging people to come forward for testing.”
In light of the dramatic socio-economic differences between the cohorts outlined above, data collection and the consequent contact tracing activities should have been carried out NOT through a single standard method or analysis pattern but through a range of methods and patterns tailored to the analytics needs of contact tracing within those communities! Yet this was not done, and without such pattern recognition all the analytics activities had a questionable degree of accuracy. One may wonder how it was possible for an experienced team of data analytics experts to allow such obvious data torturing!
The likely reason for the poor structuring of the analytical activities was insufficient inclusion of data science professionals in the contact tracing investigation team. Health analytics should be carried out in PARTNERSHIP between health professionals and medical data scientists, and the collaboration has to be ONGOING throughout the project, all the way to completion!
The Data Analytics Challenge: So How Could the Victorian State Authorities Do a Better Job?
The author believes that there was a far better way for the authorities to manage contact tracing. The Victorian health authorities should have considered tailoring the contact tracing activities and the consequent data analysis to each and every local area, postcode and community from the VERY START of the virus tracing project! The author is NOT a medical or sociology expert but trusts that the relevant differences between communities and local areas across Victoria had been well established long before the pandemic commenced. Furthermore, the wave in question was the so-called “second wave”, and since there had already been a first wave, there were lessons to be learned. Likewise, similar COVID-19 data analytics challenges have arisen around the globe, so there has certainly been no shortage of case studies and scenarios to learn from.
More specifically, the point was NOT whether separate teams of people analysed the data but that each and every community or local area should be treated as a separate entity (data set) for the data analysis! Having such dedicated data marts would almost certainly have increased the accuracy and relevance of the contact tracing!
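To make the idea of per-community data sets concrete, the following sketch partitions case records by local area and computes the share of mystery cases (cases with no identified source) within each partition, alongside the state-wide figure. The area labels and records are invented for illustration, and the function names are hypothetical. This is only one simple way such a split could be expressed, not a description of how any health authority actually structured its data.

```python
from collections import defaultdict

# Hypothetical case records: (local_area, source_known). Invented data.
CASES = [
    ("area_A", True), ("area_A", False), ("area_A", False),
    ("area_B", True), ("area_B", True),
    ("area_C", False),
]


def split_by_area(cases):
    """Partition records into one list per local area (a simple 'data mart')."""
    marts = defaultdict(list)
    for area, source_known in cases:
        marts[area].append(source_known)
    return dict(marts)


def mystery_share(flags):
    """Fraction of cases in one partition with no identified source."""
    return sum(1 for known in flags if not known) / len(flags)


def area_report(cases):
    """Mystery-case share for each local area, analysed separately."""
    return {
        area: mystery_share(flags)
        for area, flags in split_by_area(cases).items()
    }


if __name__ == "__main__":
    statewide = mystery_share([known for _, known in CASES])
    print(f"state-wide mystery share: {statewide:.0%}")
    for area, share in area_report(CASES).items():
        print(f"{area}: {share:.0%}")
```

With this invented data the state-wide share is one half, while the individual areas range from none to all of their cases being untraced, which is exactly the kind of difference a single pooled analysis hides.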
Unfortunately, it is hard to carry out a review of government projects without “political factors” coming up. The author’s submission to the Inquiry into the COVID-19 Contact Tracing System was met not so much with feedback on the data analytics discrepancies outlined as with political slogans! Proponents of the Victorian Government have pointed out that “even if things were not done in the best possible way”, it was mainly due to a lack of sufficient funding and workers to attend to the tasks of contact tracing. This could well be true of the manual tasks (contacting people around the communities, raising awareness etc.). However, as far as the use of data analytics tools and technologies was concerned, the analytics processes were all fully automated and so could have been improved at relatively low cost, subject to better data management and data analytics implementation practices.
To sum up, from the data analytics perspective (the one and only perspective from which the author has examined the contact tracing failure!), the Victorian health authorities failed to:
- Establish the data patterns accurately
- Manage Big Data collection and formatting in a way that would bring data sets to a common denominator rather than simply “bring all the data together”
- Tailor the data analysis methods used to the requirements of the segments considered (instead of analyzing data for the entire state in the very same way, using the very same methods and patterns, without taking into account differences between the data sets across Victoria)
In conclusion, the author would like to emphasize once again that, as is evident from the discussion above, the criticism of the Victorian health authorities focuses exclusively on the way the data analytics part of the contact tracing was handled and reflects on the analytics processes alone rather than on the individuals and teams involved! Politics aside, the Victorian Government failed in the processing and analysis of the COVID-19 data, and no efforts or means could compensate for that failure!
I would be more than happy to respond to any further questions on the Victorian Government’s contact tracing analytics and can be contacted at email@example.com