Cholera Response DRC

Village along the river, Bor, South Sudan.

Fighting Cholera in Democratic Republic of Congo

This project began with a call from a friend deployed in the Democratic Republic of Congo (DRC). She reported ongoing cholera outbreaks in many villages along the Kasai and Sankuru rivers. Her organization planned to launch a comprehensive Water, Sanitation and Hygiene (WASH) campaign targeting all settlements within a 20 km radius of these rivers and monitoring those between 30 and 50 km away. They lacked a reliable list of villages with population estimates, and she asked whether I could help by examining the available datasets.

“I’ll see what I can do,” I said. After several days of intensive work, I provided a list of over 5000 villages located within 50 km of the two rivers, complete with population estimates from two different datasets and an interactive R Shiny web application to navigate the results.

Area of interest

The area of interest is extensive, covering more than 3000 km of river length and an area of approximately 100,000 km² within the 20 km buffer and 250,000 km² within the 50 km buffer.

Figure 1 - Rivers Kasai and Sankuru (Left). Buffer around the rivers (Right).
Table 1 - Length of the rivers Kasai, Sankuru and total length of both combined.

| River | Length (km) |
|---|---|
| River Kasai | 1926 |
| River Sankuru | 1280 |
| Total length | 3206 |

Village dataset

After evaluating datasets from various sources and validating them against satellite images, I selected the dataset compiled by the National Geospatial-Intelligence Agency (NGA) for its completeness and accuracy. It contains 5277 villages within the 50 km buffer zone.
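
Selecting the villages that fall inside each buffer comes down to a distance test against the river lines. The sketch below shows the idea in base R under simplifying assumptions: coordinates are already projected and expressed in kilometres, and the `river` polyline, the `villages` points and the helper names `dist_to_segment` and `dist_to_line` are hypothetical toy examples, not the NGA data or the code used in the project. In a real workflow, `sf::st_buffer()` or `sf::st_distance()` would typically do this work.

```r
# Distance from a point (px, py) to the segment (x1, y1)-(x2, y2).
# All coordinates are assumed to be projected and in kilometres.
dist_to_segment <- function(px, py, x1, y1, x2, y2) {
  dx <- x2 - x1
  dy <- y2 - y1
  len2 <- dx^2 + dy^2
  # Relative position of the closest point on the segment, clamped to [0, 1].
  t <- if (len2 == 0) 0 else ((px - x1) * dx + (py - y1) * dy) / len2
  t <- max(0, min(1, t))
  sqrt((px - (x1 + t * dx))^2 + (py - (y1 + t * dy))^2)
}

# Shortest distance from a point to a polyline (two-column matrix of vertices).
dist_to_line <- function(px, py, line) {
  n <- nrow(line)
  min(mapply(dist_to_segment, px, py,
             line[-n, 1], line[-n, 2], line[-1, 1], line[-1, 2]))
}

# Hypothetical river made of two straight reaches.
river <- rbind(c(0, 0), c(100, 0), c(100, 80))

# Hypothetical village points.
villages <- data.frame(
  name = c("A", "B", "C", "D"),
  x    = c(50, 50, 130, 10),
  y    = c(10, 35, 40, 70)
)

villages$dist_km <- mapply(dist_to_line, villages$x, villages$y,
                           MoreArgs = list(line = river))

# Assign each village to the 20 km zone, the 20 to 50 km zone, or neither.
villages$zone <- cut(villages$dist_km,
                     breaks = c(0, 20, 50, Inf),
                     labels = c("within 20 km", "20 to 50 km", "outside"),
                     include.lowest = TRUE)

print(villages)
```

Working in a projected coordinate system matters here: distances computed directly on longitude and latitude degrees would not be in kilometres.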

Download the data

Population dataset

To estimate the population, I used both a well-established dataset and a newer product available at the time of analysis: the Global Human Settlement Layer (GHSL), published by the European Commission's Joint Research Centre, and the Facebook High Resolution Population Density Maps, available on the Humanitarian Data Exchange platform.

Facebook - Download the data
GHSL - Download the data

Methods and results

The data processing and analysis were carried out using RStudio and QGIS. Below is a list of the R libraries used:

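library(sf)      # vector data: points, lines, buffers
library(dplyr)   # data frame manipulation
library(tidyr)   # reshaping tables
library(tmap)    # thematic maps
library(raster)  # raster data: resampling and value extraction
library(rgdal)   # GDAL bindings (retired from CRAN in 2023)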

From a technical standpoint, allocating population figures to settlement point geometries required resampling the Facebook population raster to match the GHSL raster, which has a resolution of 2x2 km. The allocation of population figures to individual settlements carries a small degree of approximation at the micro-level; for example, if two or more points fall within a 2 km x 2 km square, the population count is evenly distributed among them.
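
The even split described above can be sketched in a few lines of base R. This is a minimal illustration under assumptions, not the code used in the project: the `villages` data frame, its `cell_id` column (the identifier of the grid cell containing each point) and the `cell_pop` values are hypothetical toy data. With the raster package, functions such as `cellFromXY()` and `extract()` could supply these two columns from a real raster.

```r
# Toy data (hypothetical): five villages, each tagged with the id of the
# 2 km x 2 km grid cell it falls in and that cell's population count.
villages <- data.frame(
  name     = c("A", "B", "C", "D", "E"),
  cell_id  = c(101, 101, 102, 103, 103),
  cell_pop = c(900, 900, 450, 1200, 1200)
)

# Number of village points sharing each grid cell.
villages$n_in_cell <- ave(rep(1, nrow(villages)), villages$cell_id, FUN = sum)

# Evenly distribute each cell's population among the points inside it.
villages$pop_est <- villages$cell_pop / villages$n_in_cell

print(villages)
```

Dividing by the number of points per cell keeps the totals consistent: summing `pop_est` over all villages gives the same figure as summing the population of the occupied cells once each.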

Comparing the two population datasets shows that the GHSL dataset generally estimates higher populations around the Kasai River and lower populations around the Sankuru River. The discrepancy increases further from the rivers into the mainland. However, the overall population figures from both datasets are relatively consistent, with a divergence of only 5.4%.

Total Villages Analysis
Table 2 - Population figures for all villages.

| Description | Village Count | Population GHSL | Population FB | Diff % GHSL-FB |
|---|---|---|---|---|
| Total villages | 5,277 | 4,895,775 | 4,631,871 | +5.4% |

Villages within 20 km
Table 3 - Population figures for villages within the 20 km buffer.

| Description | Village Count | Population GHSL | Population FB | Diff % GHSL-FB |
|---|---|---|---|---|
| General | 2,475 | 2,668,299 | 2,549,717 | +4.4% |
| Near Sankuru | 806 | 1,110,766 | 1,165,728 | -4.9% |
| Near Kasai | 1,669 | 1,557,533 | 1,383,989 | +11.1% |

Villages between 20 and 50 km
Table 4 - Population figures for villages between the 20 km and 50 km buffers. Together with Table 3, these add up to the totals in Table 2.

| Description | Village Count | Population GHSL | Population FB | Diff % GHSL-FB |
|---|---|---|---|---|
| General | 2,802 | 2,227,476 | 2,082,154 | +6.5% |
| Near Sankuru | 1,044 | 715,225 | 839,952 | -17.4% |
| Near Kasai | 1,758 | 1,512,251 | 1,242,202 | +17.9% |

Additional Notes

  • GHSL: Represents data from the Global Human Settlement Layer, which provides detailed global population data.
  • FB: Stands for data derived from Facebook’s high-resolution population datasets, used for humanitarian and research purposes.
  • Percent differences are expressed relative to the GHSL figure, i.e. (GHSL - FB) / GHSL x 100, so a positive value means GHSL estimates a larger population than FB. They highlight potential variances in data collection methods or actual changes in population.
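
The percent differences in Tables 2 to 4 can be reproduced from the population columns. The formula below, (GHSL - FB) / GHSL x 100, is inferred from the published figures rather than taken from the project code, and the data frame and column names are chosen here for illustration only.

```r
# Population totals copied from Tables 2 to 4.
pop <- data.frame(
  subset = c("All villages",
             "0-20 km: General", "0-20 km: Near Sankuru", "0-20 km: Near Kasai",
             "20-50 km: General", "20-50 km: Near Sankuru", "20-50 km: Near Kasai"),
  ghsl = c(4895775, 2668299, 1110766, 1557533, 2227476, 715225, 1512251),
  fb   = c(4631871, 2549717, 1165728, 1383989, 2082154, 839952, 1242202)
)

# Difference relative to the GHSL estimate, rounded to one decimal place.
pop$diff_pct <- round((pop$ghsl - pop$fb) / pop$ghsl * 100, 1)

print(pop)
```

The rounded values match the "Diff % GHSL-FB" column in all seven rows, which supports reading GHSL as the reference in the comparison.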

R Shiny Web Application

To facilitate visual exploration of the dataset, I created and deployed a fully responsive and dynamic R Shiny web application named Shiny Afrique. The application lets users retrieve information for each village on click and filter villages by distance from the river or by name. It also dynamically calculates and displays statistics based on the villages visible on the map: a histogram of the number of villages by distance from the rivers, and a table showing the total population and the average population per village for the current selection.
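
The statistics panel can be approximated with a small helper that summarises whatever subset of villages is currently selected. The sketch below is an assumption-laden illustration, not the Shiny Afrique source: the `villages` toy data, the `selection_stats()` helper, the 10 km histogram bins and the slider-based filter are all hypothetical (the real application works from the villages visible on the map). The Shiny part only runs in an interactive session with the shiny package installed.

```r
# Toy data (hypothetical): distance from the river and population per village.
villages <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F"),
  dist_km = c(3, 12, 18, 27, 41, 48),
  pop     = c(500, 1200, 800, 300, 950, 650)
)

# Summarise the villages within max_dist_km of the river.
selection_stats <- function(df, max_dist_km) {
  sel <- df[df$dist_km <= max_dist_km, ]
  bins <- cut(sel$dist_km, breaks = seq(0, 50, by = 10), include.lowest = TRUE)
  list(
    n_villages  = nrow(sel),
    total_pop   = sum(sel$pop),
    mean_pop    = if (nrow(sel) > 0) mean(sel$pop) else NA_real_,
    hist_counts = as.vector(table(bins)),  # villages per 10 km distance bin
    hist_labels = levels(bins)
  )
}

stats_20 <- selection_stats(villages, 20)
print(stats_20)

# Optional wiring into a minimal Shiny app.
if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  library(shiny)
  ui <- fluidPage(
    sliderInput("max_dist", "Maximum distance from river (km)",
                min = 0, max = 50, value = 20),
    plotOutput("dist_hist"),
    tableOutput("pop_table")
  )
  server <- function(input, output) {
    # Recomputed automatically whenever the slider changes.
    stats <- reactive(selection_stats(villages, input$max_dist))
    output$dist_hist <- renderPlot(
      barplot(stats()$hist_counts, names.arg = stats()$hist_labels,
              xlab = "Distance from river (km)", ylab = "Number of villages")
    )
    output$pop_table <- renderTable(
      data.frame(villages   = stats()$n_villages,
                 total_pop  = stats()$total_pop,
                 mean_pop   = stats()$mean_pop)
    )
  }
  runApp(shinyApp(ui, server))
}
```

Keeping the summary logic in a plain function, separate from the reactive code, makes it possible to check the numbers without launching the application.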

Open the Web App

Figure 2 - View of the Web Application landing page.


Written By

Marco Pizzolato

Data Engineer and Geospatial Expert