Geographic factors associated with SARS-CoV-2 prevalence during the first wave − 6 districts in Zambia, July 2020

Abstract

Background

Geographical factors can affect infectious disease transmission, including SARS-CoV-2, a virus that is spread through respiratory secretions. Prioritization of surveillance and response activities during a pandemic can be informed by a pathogen’s geographical transmission patterns. We assessed the relationship between geographical factors and SARS-CoV-2 prevalence in Zambia.

Methods

We conducted a cross-sectional study of SARS-CoV-2 prevalence in six districts in July 2020, during the upslope of the first wave in Zambia. In each district, 16 standard enumeration areas (SEAs) were randomly selected and 20 households from each SEA were sampled. The SEA PCR prevalence was calculated as the number of persons testing PCR positive for SARS-CoV-2 in the SEA times the individual sampling weight for the SEA divided by the SEA population. We analysed SEA geographical data for population density, socioeconomic status (SES) (with lower scores indicating reduced vulnerability), literacy, and access to water, sanitation, and hygiene (WASH) factors. Gaussian conditional autoregressive (CAR) models and generalised estimating equations (GEE) were used in R to estimate adjusted prevalence ratios (aPRs) and 95% confidence intervals (CIs) for associations between SARS-CoV-2 prevalence and geographical factors, after adjusting for clustering by district.

Results

Overall, the median SARS-CoV-2 prevalence in the 96 SEAs was 41.7 (interquartile range (IQR), 0.0–96.2) infections per 1000 persons. In the multivariable CAR analysis, increasing SES vulnerability was associated with lower SARS-CoV-2 prevalence (aPR = 0.85, 95% CI: 0.78–0.94). Conversely, urban SEAs (aPR = 1.73, 95% CI: 1.46–2.03) and poor access to WASH (no soap: aPR = 1.47, 95% CI: 1.05–2.05; households without piped water: aPR = 1.32, 95% CI: 1.05–1.65; needing > 30 min to fetch water: aPR = 23.39, 95% CI: 8.89–61.52) were associated with higher SARS-CoV-2 prevalence. Findings were similar in the multivariable GEE analysis.

Conclusions

SARS-CoV-2 prevalence was higher in wealthier, urban SEAs with poor access to WASH. Because this study was conducted early in the first wave, its timing could have affected our findings. Additional analyses from subsequent waves could confirm whether these findings persist. During the beginning of a COVID-19 wave in Zambia, surveillance and response activities should be focused on urban population centres and on improving access to WASH.

Background

The first wave of the COVID-19 pandemic occurred in Zambia from July to September 2020, peaking in late July [1]. Increased numbers of confirmed cases were first reported in the capital, Lusaka, which was followed by cases in cities in other provinces; rural areas reported few cases during the first wave [2], although testing was limited in Zambia in 2020. A population-representative SARS-CoV-2 prevalence study conducted in six districts in July 2020 showed that SARS-CoV-2 prevalence was significantly greater among persons residing in urban areas than in rural areas [1].

As SARS-CoV-2 transmission occurs through exposure to droplets, aerosols, and fomites, geographic factors such as urbanicity, socioeconomic status, population density, and access to water, sanitation, and hygiene (WASH) could contribute to the geographical spreading patterns of SARS-CoV-2 in Zambia [3,4,5]. In Africa, between-country spread of SARS-CoV-2 has been shown to vary by geographic location, with evidence of spread from neighbouring countries [3]. Increasing population density has been associated with increasing SARS-CoV-2 prevalence in Kenya and Sudan [4, 5]. In Namibia, the COVID-19 pandemic began in urban regions and progressively spread to more rural regions; over all four waves of the pandemic, urban regions were associated with higher transmission rates than rural regions [6]. However, associations between geographic factors and SARS-CoV-2 prevalence in Zambia have not yet been explored. In this study, we aimed to determine associations between SARS-CoV-2 prevalence and geographic factors in six districts of Zambia using SARS-CoV-2 prevalence data collected during the first wave in July 2020.

Methods

Study area

We conducted a retrospective analysis of SARS-CoV-2 prevalence data from a cross-sectional study carried out in six districts of Zambia (i.e., Kabwe, Lusaka, Livingstone, Ndola, Nakonde, and Solwezi; Fig. 1) between July 4 and 27, 2020, which coincided with the first wave in the country [1]. These districts were purposively selected because of their reported high SARS-CoV-2 incidence. Sixteen standard enumeration areas (SEAs) were randomly selected from each of the six districts, resulting in a total of 96 SEAs [1]. From each SEA, 20 households were randomly selected for participation [1].

Fig. 1 Map of Zambia showing the distribution of the six districts in Zambia and the distribution of the selected standard enumeration areas within these districts

Outcome

The SEA PCR prevalence was calculated as the number of persons testing PCR positive for SARS-CoV-2 in the SEA times the individual sampling weight for the SEA divided by the SEA population. The SEA-specific prevalence rates by serology were not included in this analysis as the response rate for this variable was low.
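The outcome definition above can be sketched in a few lines. This is a minimal illustration in Python (the study itself used R), and the function name, the per-1000 scaling step, and the input values are assumptions chosen for the example rather than study data.

```python
def sea_pcr_prevalence_per_1000(n_positive, sampling_weight, sea_population):
    """Weighted PCR prevalence for one SEA, expressed per 1000 persons.

    Follows the definition in the text: PCR-positive persons in the SEA,
    times the sampling weight for the SEA, divided by the SEA population.
    The scaling to "per 1000 persons" is an assumption that matches the
    units in which the results are reported.
    """
    if sea_population <= 0:
        raise ValueError("sea_population must be positive")
    return 1000.0 * n_positive * sampling_weight / sea_population


# Hypothetical inputs, for illustration only (not study data).
example = sea_pcr_prevalence_per_1000(
    n_positive=3, sampling_weight=20.0, sea_population=1500
)
print(f"{example:.1f} infections per 1000 persons")
```

An SEA in which no sampled person tests positive yields a prevalence of 0.0, which is consistent with the lower quartile of 0.0 reported in the Results.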

Independent variables

Digital maps of Zambia showing the geographical distribution of SEAs and geographical factors (e.g., population density, socioeconomic status, accessibility to WASH) were developed using QGIS version 3.10 A Coruña. The geographical data were retrieved from an open-access data source generated by Zambia Data Hub [7]. Geographical factors analysed in this study included urbanicity, population density, socioeconomic risk score (with higher scores indicating greater socioeconomic risk), proportion of literate individuals aggregated by sex, proportion of radio listeners, and WASH factors (i.e., individuals without soap/detergent at home, individuals without piped-in drinking water at home, individuals without water for hand washing at home, individuals who need > 30 min to get water, and individuals who share toilets with others or do not have toilets at home) (Supplementary Table 1) [7]. Apart from urbanicity, which was binary, all geographical factors were continuous. For each SEA, the proportion of the population of that SEA with the geographic factor was calculated using QGIS.

Statistical analysis

Statistical analysis was carried out using R ver. 3.5.0 [8] (R Foundation for Statistical Computing, Vienna, Austria), RStudio [9], QGIS, and the R packages geepack [10, 11] and spatialreg [12]. We used the Wilcoxon rank sum test to compare continuous variables between urban and rural SEAs. Spearman’s rank correlation coefficient (rho) was calculated to assess correlations between continuous variables (e.g., SARS-CoV-2 prevalence and the proportion of individuals with geographic factors).
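As an illustration of the rank correlation described above, the following is a minimal sketch of Spearman's rho written in plain Python rather than R. Tied values receive their average rank, and rho is the Pearson correlation of the two rank vectors. The function names and the input values are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


# Hypothetical SEA-level values, for illustration only (not study data).
prevalence_per_1000 = [0.0, 12.5, 40.0, 80.0, 110.0]
risk_score = [0.9, 0.7, 0.5, 0.4, 0.2]
rho = spearman_rho(prevalence_per_1000, risk_score)
```

Because many SEAs in the study had a prevalence of zero, tie handling matters for this outcome, which is why the sketch uses average ranks rather than the shortcut formula that assumes no ties.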

We calculated prevalence ratios (PRs) in univariable analyses of all geographical factors using Gaussian conditional autoregressive (CAR) models. For the CAR models, we used the nearest neighbourhood structure, with neighbours defined as SEAs with centroids within 80.47 km (approximately 50 miles) of each other. SARS-CoV-2 prevalence was the dependent variable, and each geographic factor was the independent variable in the univariable analysis. We also conducted a univariable analysis of all geographic factors using generalized estimating equations (GEE), adjusting for clustering within districts. We used GEE because the data were clustered and observations within the same district may be correlated. GEE accounts for this correlation structure and provides valid inference for regression coefficients, standard errors, and related statistics in the presence of correlated data.
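The neighbourhood definition used for the CAR models can be sketched as a binary adjacency matrix. This is a minimal Python illustration, not the spatialreg workflow used in the study. It assumes great-circle (haversine) distance between centroids, which the text does not specify, and the centroid coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
THRESHOLD_KM = 80.47         # approximately 50 miles, as stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def neighbour_matrix(centroids, threshold_km=THRESHOLD_KM):
    """Symmetric 0/1 matrix: 1 where two centroids lie within the threshold."""
    n = len(centroids)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if haversine_km(*centroids[i], *centroids[j]) <= threshold_km:
                w[i][j] = w[j][i] = 1
    return w


# Hypothetical (latitude, longitude) centroids, for illustration only.
centroids = [(-15.40, 28.30), (-15.45, 28.35), (-12.97, 28.63)]
w = neighbour_matrix(centroids)
```

With a threshold of this size, SEAs in the same district will usually be neighbours of one another, while SEAs in districts far apart will not, so the matrix tends to be close to block-diagonal by district.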

In both models, geographic factors with a p-value less than 0.3 in the univariable analysis were selected for inclusion in the multivariable analysis to estimate adjusted prevalence ratios (aPRs) between SARS-CoV-2 prevalence and the same geographic factors. Socioeconomic Risk Score, Population Density, and Urban were included in both multivariable models regardless of the p-value from the univariable analysis. For the GEE model, the variable for needing > 30 min to get water was excluded from the multivariable model because its estimate appeared unstable. A p-value < 0.05 was considered statistically significant.
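The screening rule above can be written as a short selection step. This is a minimal Python sketch. The threshold and the three forced-in variables come from the text, while the function name, the other variable labels, and all p-values are hypothetical and are not the study's results.

```python
# Threshold and forced-in variables as described in the text.
P_THRESHOLD = 0.3
FORCED_IN = {"Socioeconomic Risk Score", "Population Density", "Urban"}


def select_for_multivariable(univariable_p, excluded=frozenset()):
    """Return variables that enter the multivariable model.

    A variable is kept if its univariable p-value is below the threshold
    or if it is forced in, unless it has been explicitly excluded.
    """
    selected = []
    for name, p_value in univariable_p.items():
        if name in excluded:
            continue
        if p_value < P_THRESHOLD or name in FORCED_IN:
            selected.append(name)
    return selected


# Hypothetical p-values, for illustration only (not the study's results).
hypothetical_p = {
    "Urban": 0.45,
    "Population Density": 0.60,
    "Socioeconomic Risk Score": 0.02,
    "No soap at home": 0.21,
    "Regular radio listening": 0.72,
    "Need > 30 min to get water": 0.01,
}

car_vars = select_for_multivariable(hypothetical_p)
# Mirrors the GEE-specific exclusion of the unstable water-access variable.
gee_vars = select_for_multivariable(
    hypothetical_p, excluded={"Need > 30 min to get water"}
)
```

Forcing in variables chosen a priori keeps them in the model even when their univariable p-value is large, as for the hypothetical "Urban" and "Population Density" entries here.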

IRB and informed consent

This study utilized SARS-CoV-2 prevalence data from a research project that had already received approval from the University of Zambia Biomedical Research Ethics Committee (“Novel Coronavirus (SARS-CoV-2) Prevalence Survey, Zambia,” Federal Assurance No. FWA00000338, IRB00001131 of IORG0000774, REF. No. 1030–2020). Additionally, the study incorporated publicly accessible geographic data obtained from the GRID 3 Zambia data hub. Given that the SARS-CoV-2 data was collected under an existing IRB-approved protocol and the geographical data was open-source, no further IRB approval or informed consent was required for this study. The ethical considerations of the original data collection and the public nature of the geographical data sufficiently covered the scope of this research.

Results

The median SARS-CoV-2 prevalence across the 96 SEAs was 41.7 infections per 1000 persons (IQR: 0.0–96.2). Urban SEAs (n = 74) exhibited a significantly higher median prevalence of 52.1 infections per 1000 persons (IQR: 0.0–111.1) compared to rural SEAs (n = 22), where the median was 21.3 infections per 1000 persons (IQR: 0.0–69.0) (p < 0.001). Negative correlations were observed between SARS-CoV-2 prevalence and both socioeconomic risk scores (rho = -0.32, p = 0.001) and the proportion of households without piped drinking water (rho = -0.24, p = 0.018), while no significant correlations were found with population density, literacy rates, regular radio listening, or other WASH factors.

In the univariable CAR model, the strongest association was observed for SEAs where it took > 30 min to fetch water (PR = 3.72, 95% CI: 1.94–7.14). Lack of soap at home (PR = 1.24, 95% CI: 0.89–1.74) and urban residency (PR = 1.12, 95% CI: 0.94–1.34) had positive point estimates but were not statistically significant. In contrast, the GEE model identified urban residency as a significant factor (PR = 4.80, 95% CI: 2.01–11.44), and shared toilet facilities showed a substantial association, though with a wide confidence interval (PR = 8.89, 95% CI: 1.11–70.83). Needing > 30 min to fetch water had a large point estimate in the GEE model (PR = 35.38, 95% CI: 0.15–8352.03), but the estimate was extremely imprecise and its confidence interval included 1 (Table 1).

Table 1 Univariable analysis using GEE and CAR for geographical factors associated with SARS-CoV-2 prevalence in 6 districts of Zambia, July 4–27, 2020

In the multivariable CAR model, higher socioeconomic risk scores were significantly associated with lower SARS-CoV-2 prevalence (aPR = 0.85, 95% CI: 0.78–0.94), while urban SEAs were linked to higher prevalence (aPR = 1.73, 95% CI: 1.46–2.03). SEAs without adequate WASH facilities, such as those lacking soap (aPR = 1.47, 95% CI: 1.05–2.05) or piped water (aPR = 1.32, 95% CI: 1.05–1.65), or requiring > 30 min to fetch water (aPR = 23.39, 95% CI: 8.89–61.52), were also significantly associated with higher SARS-CoV-2 prevalence (Table 2).

Table 2 Multivariable analysis using GEE and CAR for geographical factors associated with SARS-CoV-2 prevalence in 6 districts of Zambia, July 4–27, 2020

The GEE model yielded directionally similar results, particularly for urban SEAs (aPR = 11.00, 95% CI: 3.51–34.47) and the absence of soap at home (aPR = 3.34, 95% CI: 1.19–9.35). However, the GEE model produced larger point estimates with wider confidence intervals, reflecting greater uncertainty, and only urban SEAs and lack of soap remained significant, highlighting the more conservative and spatially sensitive nature of the CAR model (Table 2; Fig. 2).
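The difference in precision between the two models can be checked directly from the reported intervals. The following Python sketch compares confidence interval widths on the log scale, which is the natural scale for ratio measures, using only the aPRs and CIs reported above for urban SEAs and lack of soap. The helper names and dictionary labels are illustrative.

```python
from math import log

# (aPR, lower 95% CI, upper 95% CI) as reported in the text.
reported = {
    "CAR": {"Urban": (1.73, 1.46, 2.03), "No soap": (1.47, 1.05, 2.05)},
    "GEE": {"Urban": (11.00, 3.51, 34.47), "No soap": (3.34, 1.19, 9.35)},
}


def log_ci_width(lower, upper):
    """Width of a ratio-scale confidence interval on the log scale."""
    return log(upper) - log(lower)


def excludes_one(lower, upper):
    """True when the interval does not contain the null value of 1."""
    return lower > 1 or upper < 1


for model, factors in reported.items():
    for factor, (apr, lower, upper) in factors.items():
        width = log_ci_width(lower, upper)
        significant = excludes_one(lower, upper)
        print(f"{model:3s} {factor:8s} aPR={apr:5.2f} "
              f"log-width={width:.2f} excludes 1: {significant}")
```

On this scale the GEE intervals are several times wider than the CAR intervals for the same factors, even though all four intervals exclude 1.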

Fig. 2 Forest plot of geographical factors associated with SARS-CoV-2 prevalence in Zambia, 2020

Discussion

High SARS-CoV-2 prevalence was significantly associated with urbanicity, poor access to WASH and higher socioeconomic status during the first COVID-19 wave in Zambia. This finding is consistent with findings from a study done in Namibia that showed that more urban areas had earlier exposure to SARS-CoV-2 and increased transmission rates during waves [6].

In Zambia, such distribution patterns of cases could have arisen because the introduction of SARS-CoV-2 was related to international travellers, who generally reside in affluent urban communities, where the first spread of infections occurred [2]. We postulate that there was an interval between the surge of COVID-19 cases in urban areas and the introduction of the virus into adjacent rural communities, which was the potential mechanism for the lower SARS-CoV-2 prevalence in rural SEAs in our study. A similar observation was made in the United States, where COVID-19 incidence was much higher in urban counties than in rural counties in March–May 2020; however, the difference was greatly reduced by mid-June 2020 because of the spread of infections into rural counties [13]. The exact timing of the introduction of SARS-CoV-2 into rural communities of Zambia was not elucidated in this study. However, considering the increased number of cases in provincial areas between August and September 2020, such geographical spread of SARS-CoV-2 from urban to rural areas is assumed to have occurred during the latter phase of the first wave, after the peak month [2].

Our finding that reduced access to WASH was significantly associated with higher SARS-CoV-2 prevalence is consistent with established COVID-19 transmission models and current guidance on COVID-19 prevention [14, 15].

Improved access to WASH has been documented to reduce the spread of COVID-19 [14]. The timely provision of WASH services, along with other public health interventions, during a COVID-19 outbreak can be vital to controlling the pandemic.

The CAR model provided more conservative estimates with narrower confidence intervals, suggesting moderate associations between geographical factors and SARS-CoV-2 prevalence, particularly highlighting the protective effect of higher socioeconomic risk scores. Conversely, the GEE model revealed stronger associations, especially for urban SEAs and inadequate hygiene practices, though with wider confidence intervals, reflecting greater uncertainty. These differences underscore the importance of considering multiple modelling approaches in spatial epidemiology, as they can yield varying interpretations and implications for public health interventions.

Limitations of our study include its ecological design and the small number of SEAs available for analysis. Although these SEAs were randomly selected, the districts were purposively selected. Additionally, SARS-CoV-2 prevalence data were collected over a few weeks and thus reflect prevalence only during that period. The district-specific geographical factors collected from GRID 3 were based on the 2018 Demographic Health Survey, and the distribution of these factors may have changed since then. We were limited to the geographical factors available and could not examine others of interest, such as household size or health care utilization.

Conclusion

SARS-CoV-2 prevalence was associated with urbanicity, high socioeconomic status and poor access to WASH during the peak month of the first wave in July 2020. Zambia might focus surveillance and response activities on urban population centres and ensure adequate provision of WASH earlier in a wave to mitigate COVID-19 spread.

Data availability

Deidentified participant data used for this analysis can be requested from the Zambian Ministry of Health after December 31, 2023. Interested researchers must submit a research proposal for consideration by the study investigators. If approved, the requestor must sign a data use agreement. Additionally, the study protocol is available for request. All data requests should be directed to the corresponding author.

References

  1. Mulenga LB, Hines JZ, Fwoloshi S et al. Prevalence of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Infection in Six Districts in Zambia—July 2020: a cross-sectional multi-stage cluster-sampled household survey. Lancet Glob Health. 2021 Publication pending. 2021.

  2. Chipimo PJ, Barradas DT, Kayeyi N, Zulu PM, Muzala K, Mazaba ML, et al. First 100 persons with COVID-19 — Zambia, March 18–April 28, 2020. MMWR Morb Mortal Wkly Rep. 2020;69(42):1547–8.

  3. Gayawan E, Awe OO, Oseni BM, Uzochukwu IC, Adekunle A, Samuel G, et al. The spatio-temporal epidemic dynamics of COVID-19 outbreak in Africa. Epidemiol Infect. 2020;148:e212.

  4. Ngere I, Dawa J, Hunsperger E, Otieno N, Masika M, Amoth P, et al. High seroprevalence of SARS-CoV-2 but low infection fatality ratio eight months after introduction in Nairobi, Kenya. Int J Infect Dis. 2021;112:25–34.

  5. Abd El-Raheem GOH, Elamin HES, Ahmad ZMO, Noma M. Spatial–temporal trends of COVID-19 infection and mortality in Sudan. Scientific Reports 2022 12:1. 2022;12(1):1–11.

  6. Okano JT, Valdano E, Mitonga HK, Blower S. Predicting the transmission of SARS-CoV-2 in Africa: the case of Namibia. J Travel Med. 2022;29(3).

  7. Search for ‘Zambia’ | GRID3 Data Hub. Accessed 3 October 2022.

  8. Download R-4.3.3 for Windows. The R-project for statistical computing. Accessed 12 April 2024.

  9. RStudio Desktop - Posit. Accessed 12 April 2024.

  10. Halekoh U, Højsgaard S, Yan J. Generalized estimating equation Package [R package geepack version 1.3.10]. J Stat Softw. 2024;15(2):1–11.

  11. CRAN - Package geepack. Accessed 12 April 2024.

  12. CRAN - Package spatialreg. Accessed 12 April 2024.

  13. COVID-19 Stats. COVID-19 incidence, by Urban-Rural classification — United States, January 22–October 31, 2020. MMWR Morb Mortal Wkly Rep. 2022;69(46):1753.

  14. Khatib MN, Sinha A, Mishra G, Quazi SZ, Gaidhane S, Saxena D, et al. WASH to control COVID-19: a rapid review. Front Public Health. 2022;10.

  15. Handwashing an effective tool to prevent COVID-19, other diseases. Accessed 14 August 2024.

Acknowledgements

We thank the Ministry of Health, Zambia, for providing SARS-CoV-2 prevalence data in Zambia. We thank Zambia Data Hub for providing geographical data. This work was funded by the CDC Emergency Response to the COVID-19 pandemic and the US President’s Emergency Plan for AIDS Relief through the US Centers for Disease Control and Prevention (CDC). The findings and conclusions are those of the authors and do not necessarily represent the official position of the CDC.

Funding

This study was funded by the Centers for Disease Control and Prevention, Lusaka, Zambia.

Author information

Contributions

SC, TI, and JH were the principal investigators. SC, NS, and JH designed the study. SC, WM, NS, and JH collected the data. CM generated geospatial data for analysis. SC, TI, RB, and JH analysed the data. SC, TI, and JH wrote the manuscript.

Corresponding author

Correspondence to Stephen Longa Chanda.

Ethics declarations

Ethics approval and consent to participate

Ethical approval to conduct the primary study was obtained from the University of Zambia Biomedical Research Ethics Committee (IRB00001131 of IORG0000774) study reference (REF. NO. 1030–2020).

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .

About this article

Cite this article

Chanda, S.L., Imamura, T., Malambo, W. et al. Geographic factors associated with SARS-CoV-2 prevalence during the first wave − 6 districts in Zambia, July 2020. BMC Public Health 25, 123 (2025). https://doi.org/10.1186/s12889-025-21347-w

  • DOI: https://doi.org/10.1186/s12889-025-21347-w