Coverage gap analysis (without facility lists) for Baseline Facility Inventory

This is an attempt to figure out whether any lessons on baseline facility inventory coverage gaps can be identified without the use of facility lists, which will ultimately be needed for a proper analysis of coverage gaps.

However, we can do a very simple analysis to identify LGAs which have very few facilities collected, either in absolute terms or relative to the population of the LGA. We will do this analysis for the 147 LGAs for which the CGS planning/granting process will take place in 2013.

The analysis follows. First we load and prepare our datasets.

library(plyr)
library(stringr)
library(ggplot2)
one47 <- read.csv("147_final_list.csv")
one47 <- merge(one47, read.csv("population_774.csv"), by = "X_lga_id")

e <- read.csv("in_process_data/merged/Education_661_Merged.csv")
h <- read.csv("in_process_data/merged/Health_661_Merged.csv")

e <- subset(e, select = c("mylga", "mylga_state", "mylga_zone", "unique_lga", 
    "X_lga_id", "level_of_education", "school_managed", "school_name"))
h <- subset(h, select = c("mylga", "mylga_state", "mylga_zone", "unique_lga", 
    "X_lga_id", "facility_type", "facility_owner_manager.private_forprofit"))

# Keep only the facilities that fall in one of the 147 LGAs.
e147 <- merge(e, one47, by = "X_lga_id", all.x = FALSE)
h147 <- merge(h, one47, by = "X_lga_id", all.x = FALSE)

# Count facilities per LGA. all.y = TRUE keeps LGAs with no facilities (Total is NA).
etots <- merge(ddply(e147, .(X_lga_id), nrow), one47, by = "X_lga_id", 
    all.y = TRUE)
names(etots) <- replace(names(etots), names(etots) == "V1", "Total")
etots$TotalByPopulation <- etots$Total/etots$Population

htots <- merge(ddply(h147, .(X_lga_id), nrow), one47, by = "X_lga_id", 
    all.y = TRUE)
names(htots) <- replace(names(htots), names(htots) == "V1", "Total")
htots$TotalByPopulation <- htots$Total/htots$Population
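
The right outer join (`all.y`) in the merges above matters: an LGA with no collected facilities is absent from the per-LGA counts, and the join keeps it with `Total` set to `NA` instead of dropping it. A minimal sketch on made-up data shows the effect; the IDs and populations are hypothetical, and base `table()` stands in for `ddply`.

```r
# Hypothetical toy data: three planned LGAs, facilities collected in only two.
lgas <- data.frame(X_lga_id = c(1, 2, 3),
                   Population = c(100000, 50000, 80000))
facilities <- data.frame(X_lga_id = c(1, 1, 1, 2))

# Count facilities per LGA (base R stand-in for ddply(..., nrow)).
counts <- as.data.frame(table(X_lga_id = facilities$X_lga_id),
                        responseName = "Total", stringsAsFactors = FALSE)
counts$X_lga_id <- as.numeric(counts$X_lga_id)

# all.y = TRUE keeps LGA 3, which has no facilities, with Total = NA.
tots <- merge(counts, lgas, by = "X_lga_id", all.y = TRUE)
tots$TotalByPopulation <- tots$Total / tots$Population
print(tots)
```

Rows with `NA` in `Total` are the LGAs that the plots further down report as removed.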

Some LGAs missing data entirely

We find, right away, that 4 of the 147 LGAs in the initial granting process have no health data, and 3 have no education data.

print(str_c("Education has data for ", length(levels(as.factor(e147$X_lga_id))), 
    " / 147 lgas"))  #TODO: print exactly which LGAs are missing
## [1] "Education has data for 144 / 147 lgas"
print(str_c("Health has data for ", length(levels(as.factor(h147$X_lga_id))), 
    " / 147 lgas"))  #TODO: print exactly which LGAs are missing
## [1] "Health has data for 143 / 147 lgas"
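
The TODO in the code above asks which LGAs are missing. One possible way to list them is a set difference between the planned LGA IDs and the IDs that appear in each sector's data. The sketch below runs on small made-up stand-ins for `one47`, `e147` and `h147`; the `missing_lgas` helper is hypothetical.

```r
# Hypothetical stand-ins for one47 and the merged facility tables.
one47_toy <- data.frame(X_lga_id = c(10, 20, 30, 40),
                        lga = c("A", "B", "C", "D"),
                        stringsAsFactors = FALSE)
e147_toy <- data.frame(X_lga_id = c(10, 10, 20, 40))
h147_toy <- data.frame(X_lga_id = c(10, 30, 30))

# Planned LGAs that have no facility rows at all in a given sector.
missing_lgas <- function(planned, facilities) {
  ids <- setdiff(planned$X_lga_id, unique(facilities$X_lga_id))
  planned[planned$X_lga_id %in% ids, , drop = FALSE]
}

missing_education <- missing_lgas(one47_toy, e147_toy)
missing_health <- missing_lgas(one47_toy, h147_toy)
print(missing_education$lga)
print(missing_health$lga)
```

On the real data, the same helper would be called as `missing_lgas(one47, e147)` and `missing_lgas(one47, h147)`, assuming `X_lga_id` is coded consistently across the files.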

Education – low facility numbers, and low facility count per population

Let's start with education, and plot the number of facilities in an LGA, separated by zone:

qplot(data = etots, x = zone, y = Total)  # qplot draws a point layer by default
## Warning message: Removed 3 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-3

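This allows us to start noting LGAs with a very low number of facilities compared to the rest. By simply eyeballing, we can establish the following cutoffs for the facility count per zone:
* 80: north_central
* 55: northeast
* 50: northwest and southeast
* 40: south_south
* 60: southwest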

After that, we need to check the facility count per population. Here is a slightly zoomed-in view of the data:

qplot(data = etots, x = zone, y = TotalByPopulation) + 
    coord_cartesian(ylim = c(0, 0.001))
## Warning message: Removed 3 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-4

Again, eyeballing, we come up with the cutoffs:
* 0.0004: north_central
* 0.0002: northeast, northwest, south_south, and southeast
* 0.0003: southwest

Programming the cutoffs in, we see that the following LGAs might need to be re-surveyed:

education_to_review <- subset(etots, 
    (zone == "north_central" & (is.na(Total) | Total <= 80 | TotalByPopulation <= 4e-04)) | 
    (zone == "northeast" & (is.na(Total) | Total <= 55 | TotalByPopulation <= 2e-04)) | 
    (zone == "northwest" & (is.na(Total) | Total <= 50 | TotalByPopulation <= 2e-04)) | 
    (zone == "south_south" & (is.na(Total) | Total <= 40 | TotalByPopulation <= 2e-04)) | 
    (zone == "southeast" & (is.na(Total) | Total <= 50 | TotalByPopulation <= 2e-04)) | 
    (zone == "southwest" & (is.na(Total) | Total <= 60 | TotalByPopulation <= 3e-04)))
subset(education_to_review[order(-education_to_review$Total), ], 
    select = c("zone", "state", "lga", "Total", "TotalByPopulation"))
##              zone     state              lga Total TotalByPopulation
## 96  north_central  Nasarawa         Nasarawa   173         2.899e-04
## 5       southwest     Lagos Ajeromi-Ifelodun   154         2.251e-04
## 3       southwest      Ogun      Ado Odo/Ota   142         2.697e-04
## 97  north_central  Nasarawa   Nasarawa Eggon    78         5.230e-04
## 116 north_central     Kwara          Oke-Ero    70         1.215e-03
## 4   north_central     Niger           Agwara    64         1.115e-03
## 41  north_central     Kwara            Ekiti    59         1.076e-03
## 118     southwest      Osun        Ola-Oluwa    58         7.572e-04
## 131     northeast     Gombe          Shongom    52         3.432e-04
## 80      northwest    Jigawa           Kiyawa    49         2.834e-04
## 84      northwest    Kaduna            Kudan    49         3.525e-04
## 13      southeast      Abia        Arochukwu    45         2.644e-04
## 103     southeast     Enugu       Nkanu East    44         2.958e-04
## 39      southwest     Ekiti             Efon    40         4.601e-04
## 76      northwest     Kebbi            Kalgo    39         4.567e-04
## 73      southeast       Imo              Isu    38         2.310e-04
## 22  north_central      Kogi            Bassa    37         2.643e-04
## 145     northeast    Taraba            Yorro    37         4.138e-04
## 130     northeast    Bauchi            Shira    36         1.538e-04
## 134     northeast      Yobe          Tarmuwa    36         4.663e-04
## 35      northeast     Borno            Dikwa    31         2.927e-04
## 8       southeast   Anambra     Anambra East    28         1.826e-04
## 56      northwest    Kaduna             Giwa    27         9.426e-05
## 58      northwest    Sokoto             Gudu    26         2.721e-04
## 47    south_south Akwa Ibom        Esit Eket    24         3.768e-04
## 54      northeast      Yobe           Geidam    22         1.399e-04
## 140     southeast      Abia        Ukwa East    22         3.737e-04
## 57      northeast     Borno            Gubio    21         1.375e-04
## 53      northeast    Taraba           Gassol    20         8.172e-05
## 63      southwest      Osun          Ifedayo    17         4.587e-04
## 92      northeast     Borno           Mobbar    16         1.372e-04
## 61    south_south Akwa Ibom      Ibiono Ibom    14         7.382e-05
## 139   south_south Akwa Ibom        Udung Uko    14         2.628e-04
## 110   south_south    Rivers         Ogu/Bolo    13         1.741e-04
## 146     northeast      Yobe         Yusufari    13         1.170e-04
## 30      northeast     Borno           Chibok     2         3.025e-05
## 31      northeast     Borno           Damboa    NA                NA
## 121     southeast       Edo       Ovia South    NA                NA
## 128     northeast    Taraba         Sardauna    NA                NA
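
The long `subset()` condition above repeats the same test once per zone. One way to make the cutoffs easier to audit is to hold them in a small lookup table and join it on `zone`. The sketch below copies the education cutoffs from the code above; the `cutoffs` table, the `flag_low_coverage` helper and the toy rows are illustrative assumptions, not part of the original analysis.

```r
# Education cutoffs, copied from the subset() call above.
cutoffs <- data.frame(
  zone = c("north_central", "northeast", "northwest",
           "south_south", "southeast", "southwest"),
  total_cutoff = c(80, 55, 50, 40, 50, 60),
  per_capita_cutoff = c(4e-04, 2e-04, 2e-04, 2e-04, 2e-04, 3e-04),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag LGAs with no data, or at or below either cutoff.
flag_low_coverage <- function(tots, cutoffs) {
  x <- merge(tots, cutoffs, by = "zone")
  low <- is.na(x$Total) |
    x$Total <= x$total_cutoff |
    x$TotalByPopulation <= x$per_capita_cutoff
  x[low, c("zone", "lga", "Total", "TotalByPopulation")]
}

# Toy rows (made up) to exercise the helper.
toy <- data.frame(
  zone = c("northeast", "northeast", "southwest", "southwest"),
  lga = c("P", "Q", "R", "S"),
  Total = c(120, NA, 61, 200),
  TotalByPopulation = c(5e-04, NA, 2.5e-04, 9e-04),
  stringsAsFactors = FALSE
)
flagged <- flag_low_coverage(toy, cutoffs)
print(flagged)
```

With this layout, a second table holding the health cutoffs could be passed to the same helper, so the two sectors would differ only in their numbers.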

Health – low facility numbers, and low facility count per population

Similarly for health, we start by plotting the facility counts and facility count per population:

qplot(data = htots, x = zone, y = Total)  # qplot draws a point layer by default
## Warning message: Removed 4 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-6

qplot(data = htots, x = zone, y = TotalByPopulation) + 
    coord_cartesian(ylim = c(0, 5e-04))
## Warning message: Removed 4 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-6

And we come up with the cutoffs:
* north_central: Total <= 35 | TotalByPopulation <= 0.00025
* northeast: Total <= 35 | TotalByPopulation <= 0.00015
* northwest: Total <= 20 | TotalByPopulation <= 0.0001
* south_south: Total <= 10 | TotalByPopulation <= 0.0001
* southeast: Total <= 10 | TotalByPopulation <= 0.00011
* southwest: Total <= 20 | TotalByPopulation <= 0.0001

Programming the cutoffs in, we see that the following LGAs might need to be re-surveyed:

health_to_review <- subset(htots, 
    (zone == "north_central" & (is.na(Total) | Total <= 35 | TotalByPopulation <= 0.00025)) | 
    (zone == "northeast" & (is.na(Total) | Total <= 35 | TotalByPopulation <= 0.00015)) | 
    (zone == "northwest" & (is.na(Total) | Total <= 20 | TotalByPopulation <= 1e-04)) | 
    (zone == "south_south" & (is.na(Total) | Total <= 10 | TotalByPopulation <= 1e-04)) | 
    (zone == "southeast" & (is.na(Total) | Total <= 10 | TotalByPopulation <= 0.00011)) | 
    (zone == "southwest" & (is.na(Total) | Total <= 20 | TotalByPopulation <= 1e-04)))
subset(health_to_review[order(-health_to_review$Total), ], select = c("zone", 
    "state", "lga", "Total", "TotalByPopulation"))
##              zone     state              lga Total TotalByPopulation
## 96  north_central  Nasarawa         Nasarawa    71         1.190e-04
## 119 north_central     Benue          Okpokwu    42         2.378e-04
## 5       southwest     Lagos Ajeromi-Ifelodun    40         5.847e-05
## 67  north_central     Kwara      Ilorin East    38         1.860e-04
## 4   north_central     Niger           Agwara    32         5.574e-04
## 94      northeast     Gombe           Nafada    25         1.809e-04
## 134     northeast      Yobe          Tarmuwa    23         2.979e-04
## 22  north_central      Kogi            Bassa    22         1.572e-04
## 91      northwest      Kano         Minjibir    21         9.823e-05
## 39      southwest     Ekiti             Efon    20         2.300e-04
## 41  north_central     Kwara            Ekiti    20         3.646e-04
## 89      northwest    Jigawa        Maigatari    20         1.113e-04
## 116 north_central     Kwara          Oke-Ero    20         3.471e-04
## 28      northwest      Kano          Bunkure    19         1.112e-04
## 40    south_south   Bayelsa         Ekeremor    19         7.030e-05
## 60      northeast    Taraba              Ibi    19         2.260e-04
## 137     northwest    Sokoto           Tureta    19         2.779e-04
## 143     northwest      Kano           Warawa    19         1.475e-04
## 54      northeast      Yobe           Geidam    18         1.144e-04
## 62      southwest       Oyo              Ido    18         1.743e-04
## 76      northwest     Kebbi            Kalgo    18         2.108e-04
## 93      northeast   Adamawa       Mubi North    18         1.191e-04
## 35      northeast     Borno            Dikwa    17         1.605e-04
## 130     northeast    Bauchi            Shira    17         7.265e-05
## 145     northeast    Taraba            Yorro    17         1.901e-04
## 109   south_south   Bayelsa           Obalga    16         8.893e-05
## 117   south_south    Rivers           Okrika    16         7.206e-05
## 19      northwest      Kano           Bagwai    15         9.211e-05
## 103     southeast     Enugu       Nkanu East    15         1.008e-04
## 14    south_south    Rivers       Asari-Toru    14         6.361e-05
## 74      southwest       Oyo          Iwajowa    12         1.165e-04
## 115 north_central      Kogi            Okehi    12         6.000e-05
## 58      northwest    Sokoto             Gudu    11         1.151e-04
## 63      southwest      Osun          Ifedayo    11         2.968e-04
## 80      northwest    Jigawa           Kiyawa    10         5.783e-05
## 34    south_south    Rivers           Degema     9         3.603e-05
## 57      northeast     Borno            Gubio     9         5.891e-05
## 92      northeast     Borno           Mobbar     9         7.715e-05
## 47    south_south Akwa Ibom        Esit Eket     8         1.256e-04
## 8       southeast   Anambra     Anambra East     7         4.565e-05
## 139   south_south Akwa Ibom        Udung Uko     7         1.314e-04
## 61    south_south Akwa Ibom      Ibiono Ibom     6         3.164e-05
## 110   south_south    Rivers         Ogu/Bolo     6         8.034e-05
## 53      northeast    Taraba           Gassol     5         2.043e-05
## 56      northwest    Kaduna             Giwa     5         1.746e-05
## 146     northeast      Yobe         Yusufari     5         4.501e-05
## 30      northeast     Borno           Chibok    NA                NA
## 31      northeast     Borno           Damboa    NA                NA
## 121     southeast       Edo       Ovia South    NA                NA
## 128     northeast    Taraba         Sardauna    NA                NA
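
The cutoffs in both sections were chosen by eye. As a rough cross-check, one could instead flag LGAs whose facility count falls in the bottom share of their own zone. This is a different technique from the eyeballed thresholds used above; the 10% share and the toy data below are arbitrary assumptions.

```r
# Made-up totals for two zones; NA marks an LGA with no data.
toy <- data.frame(
  zone = rep(c("northeast", "southwest"), each = 5),
  Total = c(2, 40, 50, 60, NA, 17, 90, 100, 110, 120),
  stringsAsFactors = FALSE
)

# Per-zone 10th percentile of Total, ignoring LGAs with no data.
zone_cut <- ave(toy$Total, toy$zone,
                FUN = function(v) quantile(v, probs = 0.10, na.rm = TRUE))

# Flag LGAs with no data, or at or below their zone's percentile cutoff.
toy$review <- is.na(toy$Total) | toy$Total <= zone_cut
print(toy)
```

A percentile rule always flags some LGAs in every zone, even a well-covered one, so it would work better as a sanity check on the eyeballed lists than as a replacement for them.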