Understanding Population Sampling Approaches from a Coronavirus (COVID-19) Testing Perspective

Note: I am not an epidemiologist or healthcare professional. This article is written for awareness purposes, based on my understanding (as a data professional) of population sampling, without mathematical jargon, as it applies to current Coronavirus (COVID-19) testing.

Testing a sample of a population for Coronavirus (COVID-19) is the talk of the hour because of the massive spread of the disease. In many jurisdictions, population size and resource limitations make it seriously difficult for the authorities to control the crisis. This makes the sampling strategy especially important for controlling the spread and managing infected patients with the available resources. Indiscriminate testing is not feasible in any scenario, since it can become a counter-productive exercise. Here, I would like to discuss common sampling techniques and their application in the current pandemic crisis.

Probabilistic Sampling Methods

Simple Random Sampling (SRS)

Simple random sampling selects a subset of a chosen size from a population purely at random. A lottery-like selection of population members is applied to form the sample. This is one of the easiest sampling processes for a large population. At the same time, there is always a possibility that a given sample turns out skewed in a particular direction or area, which means that at times this method may fail to represent the entire population. In a pandemic crisis like COVID-19, tests cannot be allocated by random sampling alone, since that would not lead to conclusive results for the larger population.
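The lottery-like selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resident IDs, the sample size of 50, and the function name simple_random_sample are all hypothetical and do not come from any real testing programme.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # seeded only so the draw can be repeated
    return rng.sample(population, sample_size)

# Hypothetical population: resident IDs 1..1000
residents = list(range(1, 1001))
srs_sample = simple_random_sample(residents, 50, seed=42)
print(len(srs_sample), "residents selected for testing")
```

Every resident has the same chance of selection, which is also why a single draw can, by chance, miss a small high-risk group entirely.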

Systematic Sampling

Systematic sampling is a method where members are picked at a fixed interval from an ordered population. For example, if every 10th member of the population is selected, the final sample would be {10th, 20th, 30th, ..., (n*10)th}. This method can be biased by the chosen sequence, so a careful selection strategy is required to target the intended subset of the population.
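As a rough sketch of the "every 10th member" example, the following Python snippet picks members at a fixed interval from an ordered list. The list of 100 resident IDs and the function name systematic_sample are hypothetical; a random starting point is used by default, which is a common precaution rather than something stated above.

```python
import random

def systematic_sample(population, interval, start=None, seed=None):
    """Pick one member every `interval` positions from an ordered population.

    `start` is a zero-based offset; if omitted, a random offset inside the
    first interval is used.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if start is None:
        start = random.Random(seed).randrange(interval)
    return population[start::interval]

# Hypothetical ordered population: resident IDs 1..100
ordered_residents = list(range(1, 101))

# Offset 9 is the 10th member, so this yields the 10th, 20th, ..., 100th
every_tenth = systematic_sample(ordered_residents, 10, start=9)
print(every_tenth)  # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```

If the ordering of the list follows a repeating pattern that lines up with the interval (for example, every 10th household being a corner house), the sample inherits that pattern, which is the sequence bias mentioned above.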

Stratified Random Sampling

Stratified sampling involves breaking the main population into several smaller groups, or strata, and then applying simple random sampling (SRS) within every subgroup. This is a relatively effective method for many applications, since it offers a more representative distribution across the entire population than SRS alone. The disadvantage is that it requires some subject matter expertise to define the strata in a given population.
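A minimal Python sketch of this idea follows. It assumes a made-up population of 500 people split into two hypothetical age groups and samples the same fraction from each stratum; the field names, the 10% fraction, and the function name stratified_sample are illustrative assumptions only.

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, fraction, seed=None):
    """Group people into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        # Take at least one person per stratum, never more than it contains
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: every 5th person is in the "65+" age group
people = [
    {"id": i, "age_group": "65+" if i % 5 == 0 else "under 65"}
    for i in range(1, 501)
]
strat_sample = stratified_sample(people, lambda p: p["age_group"], 0.1, seed=1)
older = [p for p in strat_sample if p["age_group"] == "65+"]
print(len(strat_sample), "sampled, of whom", len(older), "are 65+")
```

Because each stratum is sampled separately, the smaller "65+" group is guaranteed its share of the sample, which plain SRS cannot promise.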

Clustered Sampling

Clustered sampling involves dividing a population into smaller clusters, typically based on geographic parameters. For example, a country can be divided into high and low population density clusters. Some of those clusters are then randomly selected to form the sample. This differs from stratified sampling in that not every cluster contributes to the sample. As a result, this method can be quite biased and may not represent the full population. With proper subject knowledge, however, it can be a relatively cheap solution.
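The contrast with stratified sampling may be easier to see in code. The sketch below assumes the simplest variant, where whole clusters are chosen at random and everyone in a chosen cluster is included; the suburb names, member labels, and the function name cluster_sample are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical geographic clusters and their residents
suburbs = {
    "Suburb A": ["a1", "a2", "a3"],
    "Suburb B": ["b1", "b2"],
    "Suburb C": ["c1", "c2", "c3", "c4"],
    "Suburb D": ["d1"],
}
selected = cluster_sample(suburbs, 2, seed=7)
to_test = [person for members in selected.values() for person in members]
print(sorted(selected), "->", len(to_test), "people to test")
```

Only two of the four suburbs are visited, which keeps the effort low but leaves the unselected suburbs completely unobserved.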

Non-Probabilistic Sampling

Convenience Sampling

This method chooses whatever samples are conveniently available. In many cases, the sample is formed from readily available data for pilot studies. It is therefore a highly biased method and cannot be relied on to represent the entire population.

Judgement Sampling

This is a sampling approach in which the sample is chosen on the advice of experts in the subject matter or the purpose of the sampling.

Quota Sampling

The core of this method lies in creating quotas based on characteristics, traits, or any other attribute of the population. Quota sampling thus creates a known frame for sampling and allows most categories in the population to be represented. In some cases it can be cumbersome, since it requires plenty of groundwork to create those quotas.
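To make the idea of filling quotas concrete, here is a small Python sketch. It assumes candidates are taken in the order they arrive (not at random) until each category's quota is full; the categories, quota sizes, and the function name quota_sample are hypothetical.

```python
def quota_sample(candidates, category_of, quotas):
    """Accept candidates in arrival order until every quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    sample = []
    for person in candidates:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):
            break  # all quotas are full
    return sample

# Hypothetical candidates as (id, category) pairs, in arrival order
candidates = [
    (1, "asymptomatic"),
    (2, "asymptomatic"),
    (3, "symptomatic"),
    (4, "symptomatic"),
    (5, "symptomatic"),
]
quotas = {"symptomatic": 2, "asymptomatic": 1}
quota_result = quota_sample(candidates, lambda c: c[1], quotas)
print(quota_result)  # [(1, 'asymptomatic'), (3, 'symptomatic'), (4, 'symptomatic')]
```

The selection inside each category is not random, which is what separates quota sampling from stratified sampling even though both start by dividing the population into groups.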

Snowball Sampling

This approach asks the participants already in a sample to invite or refer new participants. The process thus works as a chain reaction, creating a snowball effect that grows the sample size. Its advantage is that it identifies potential participants who would otherwise be unknown. The growth of a snowball sample is very similar to the exponential growth seen in the spread of a pandemic.
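The chain reaction can be sketched as a walk over a referral network. In the Python snippet below, the referral map, the participant labels, and the function name snowball_sample are hypothetical; each participant is simply asked, in turn, whom they would refer.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Grow a sample by following referrals from the initial participants."""
    sample = []
    seen = set()
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person in seen:
            continue  # already recruited through another referral
        seen.add(person)
        sample.append(person)
        queue.extend(referrals.get(person, []))
    return sample

# Hypothetical referral map: who each participant would refer
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["A"],
}
print(snowball_sample(referrals, ["A"], max_size=4))   # ['A', 'B', 'C', 'D']
print(snowball_sample(referrals, ["A"], max_size=10))  # ['A', 'B', 'C', 'D', 'E']
```

Starting from a single participant, the sample reaches people who were not known in advance, and it stops growing once no new referrals remain or the size limit is reached.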

Managing Pandemic Test Response and Population Sampling 

deCODE: A Case Study from Iceland

Iceland is one of the success stories in ‘flattening the curve’, that is, managing the pandemic crisis successfully. Iceland used a combination of random and targeted sampling methods, based on the following criteria:

·        Population Screening: open invitation and a random invitation to participants.

·        Targeted testing: Candidates in this category were those who had recently traveled to high-risk countries, were symptomatic, or had been in contact with an infected person.

The test results are shown in the table below:

[Table of test results: image not available]

As expected, this study indicates that the targeted approach worked very well compared to random sampling. This is due to the foreign origin of the infected patients. The high positive percentage in the targeted tests was also due to the immediate response in the early days of the pandemic's spread. Being a low-population country and very proactive with its testing strategy, Iceland could contain the disease successfully.

Australia’s Testing Response to COVID-19

Stratification

Population at Risk: Based on the available pandemic data and health history, the Australian federal and state health departments identified the following groups at risk:

·        Aboriginal and Torres Strait Islander people

·        Elderly People

·        Aged care residents

·        Travelers

Symptomatic Profiling: Any of the following symptoms was considered a potential cue for testing:

·   Cough     

·   Fever       

·   Sore throat

·   Shortness of breath 

Clustering to Find Hotspots

Most of the states, including NSW, formed clusters based on where cases were being found and their potential to spread. A few of the hotspots declared by the local health authorities included:

  • Residential care facilities,
  • Boarding schools,
  • Prisons,
  • Airports (airline staff only) etc.

Sampling for Testing  

Based on the stratification and clustering during the initial stages, only selective testing was applied to samples of the population belonging to the identified groups. Since imported cases have now stopped and we are in the phase of community transmission, wider testing will now be conducted by inviting people from targeted groups. Thus, in Australia it has mostly been about targeted testing, and now population screening through invitation. Later stages may follow a sentinel surveillance system to employ a proactive testing approach. So far (20th April), more than 430,000 tests have been conducted, with 6,606 (1.53%) testing positive.
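The positive percentage quoted above is simply positive results divided by total tests. The short Python sketch below repeats that arithmetic using the rounded figure of 430,000 tests; because the text says "more than" 430,000, this lower bound gives a rate slightly above the quoted 1.53%. The function name positivity_rate is hypothetical.

```python
def positivity_rate(positive, total_tests):
    """Return the share of tests that came back positive, as a percentage."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return 100.0 * positive / total_tests

# Figures quoted for 20th April; 430,000 is a lower bound on the test count
rate = positivity_rate(6606, 430_000)
print(f"About {rate:.1f}% of tests were positive")
```

A larger actual test count pulls the result down toward the quoted figure, since the number of positives stays fixed.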

[Charts of daily tests and cases in Australia: image not available]

The charts above show the numerical journey of day-to-day testing conducted in Australia. As we can see, the ratio of the number of tests to cases during the mid-March period was remarkably high. This was a great strategy, considering that the pandemic crisis was on the rise during that period. Also, although the plot of daily cases has now started to flatten, total tests are still increasing linearly, which means there is no complacency. By now, almost 17,000 tests per million people have been conducted, with 257.30 positive cases per million. It has been a great journey so far, from 11% positive cases to just 1.53%, thanks to various measures including proper test sample selection.

Conclusion

Though a wide range of sampling methods is available, the early stages of a pandemic crisis mostly require a targeted and systematic approach. In the later stage, foreign-originated cases usually decline and local cases increase. This means that community-level stratification and clustering methods are required to start sampling potential candidates from the wider community. This sampling may be either random or targeted, depending on symptoms and preconditions. During the ‘stand-down’ phase or the post-pandemic stage, a much broader testing regime is required to reduce the impact of randomness or inaccuracy due to invitation sampling. It should also be noted that by the final stages the curve is expected to have flattened, and medical systems should have the capacity to accommodate the available candidates. Finally, this is a discussion of one parameter, sampling, for common understanding, and not a suggestion to stop other precautionary measures at the individual or government level.
