This is an attempt to figure out whether any lessons on baseline facility inventory coverage gaps can be identified without the use of facility lists, which will ultimately be needed for a proper analysis of coverage gaps.
However, we can do a very simple analysis to identify LGAs which have very few facilities collected, either in absolute terms or relative to the population of the LGA. We will do this analysis for the 147 LGAs for which the CGS planning/granting process will take place in 2013.
The analysis follows. First we load and prepare our datasets.
library(plyr)
library(stringr)
library(ggplot2)
one47 <- read.csv("147_final_list.csv")
one47 <- merge(one47, read.csv("population_774.csv"), by = "X_lga_id")
e <- read.csv("in_process_data/merged/Education_661_Merged.csv")
h <- read.csv("in_process_data/merged/Health_661_Merged.csv")
e <- subset(e, select = c("mylga", "mylga_state", "mylga_zone", "unique_lga",
"X_lga_id", "level_of_education", "school_managed", "school_name"))
h <- subset(h, select = c("mylga", "mylga_state", "mylga_zone", "unique_lga",
"X_lga_id", "facility_type", "facility_owner_manager.private_forprofit"))
e147 <- merge(e, one47, by = "X_lga_id", all.x = F)
h147 <- merge(h, one47, by = "X_lga_id", all.x = F)
etots <- merge(ddply(e147, .(X_lga_id), nrow), one47, by = "X_lga_id",
all.y = T)
names(etots) <- replace(names(etots), names(etots) == "V1", "Total")
etots$TotalByPopulation <- etots$Total/etots$Population
htots <- merge(ddply(h147, .(X_lga_id), nrow), one47, by = "X_lga_id",
all.y = T)
names(htots) <- replace(names(htots), names(htots) == "V1", "Total")
htots$TotalByPopulation <- htots$Total/htots$Population
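To make the count-and-merge step concrete, here is a minimal sketch on made-up data. It uses base R's `table` rather than `ddply`, and the toy data frames `lgas` and `facilities` are assumptions for illustration, not the real inputs. The point is that an LGA with no collected facilities survives the `all.y = TRUE` merge with a `Total` of `NA`.

```r
# Toy stand-ins for the real data (all values are made up for illustration).
lgas <- data.frame(
  X_lga_id   = c(1, 2, 3),
  Population = c(100000, 250000, 50000)
)
facilities <- data.frame(
  X_lga_id      = c(1, 1, 1, 2),
  facility_name = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Count facilities per LGA; LGAs with no rows do not appear in this table.
counts <- as.data.frame(table(X_lga_id = facilities$X_lga_id),
                        responseName = "Total",
                        stringsAsFactors = FALSE)
counts$X_lga_id <- as.numeric(counts$X_lga_id)

# all.y = TRUE keeps every LGA in the reference list, so LGA 3 gets Total = NA.
tots <- merge(counts, lgas, by = "X_lga_id", all.y = TRUE)
tots$TotalByPopulation <- tots$Total / tots$Population
print(tots)
```

Those `NA` rows are the ones that later show up as "rows containing missing values" in the plot warnings.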
We find, right away, that 3 LGAs for education and 4 LGAs for health have no data among the 147 LGAs for the initial granting process.
print(str_c("Education has data for ", length(levels(as.factor(e147$X_lga_id))),
" / 147 lgas")) #TODO: print exactly what LGAs missing
## [1] "Education has data for 144 / 147 lgas"
print(str_c("Health has data for ", length(levels(as.factor(h147$X_lga_id))),
" / 147 lgas")) #TODO: print exactly what LGAs missing
## [1] "Health has data for 143 / 147 lgas"
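The TODO in the code above asks for the specific LGAs that are missing. One way to list them, sketched here on made-up data, is to take the set difference of the LGA ids in the reference list and in the collected data. The helper name `missing_lgas` and the toy data frames are hypothetical; in the real analysis the arguments would presumably be `one47` and `e147` or `h147`.

```r
# Hypothetical helper: rows of the reference list with no collected facilities.
missing_lgas <- function(reference, collected, id_col = "X_lga_id") {
  missing_ids <- setdiff(reference[[id_col]], collected[[id_col]])
  reference[reference[[id_col]] %in% missing_ids, , drop = FALSE]
}

# Toy data, made up for illustration only.
reference <- data.frame(X_lga_id = c(1, 2, 3, 4),
                        lga = c("Alpha", "Beta", "Gamma", "Delta"),
                        stringsAsFactors = FALSE)
collected <- data.frame(X_lga_id = c(1, 1, 3))

# LGAs 2 and 4 have no rows in `collected`.
missing <- missing_lgas(reference, collected)
print(missing$lga)
```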
Let's start with education, and plot the number of facilities in an LGA, separated by zone:
qplot(data = etots, x = zone, y = Total) + geom_point()
## Warning message: Removed 3 rows containing missing values (geom_point).
## Warning message: Removed 3 rows containing missing values (geom_point).
This allows us to start noting LGAs with a very low number of facilities compared to the rest. By simply eyeballing, we can establish the following cutoffs per zone:
* 80: north_central
* 55: northeast
* 50: northwest and southeast
* 40: south_south
* 60: southwest
After that, we need to check the facility count per population. Here, let me present a slightly zoomed-in view of the data:
qplot(data = etots, x = zone, y = TotalByPopulation) + geom_point() +
coord_cartesian(ylim = c(0, 0.001))
## Warning message: Removed 3 rows containing missing values (geom_point).
## Warning message: Removed 3 rows containing missing values (geom_point).
Again, eyeballing, we come up with the cutoffs:
* 0.0004: north_central
* 0.0002: northeast, northwest, south_south, and southeast
* 0.0003: southwest
Programming the cutoffs in, we see that the following LGAs might need to be re-surveyed:
education_to_review <- subset(etots, (zone == "north_central" & (is.na(Total) |
Total <= 80 | TotalByPopulation <= 4e-04)) | (zone == "northeast" & (is.na(Total) |
Total <= 55 | TotalByPopulation <= 2e-04)) | (zone == "northwest" & (is.na(Total) |
Total <= 50 | TotalByPopulation <= 2e-04)) | (zone == "south_south" & (is.na(Total) |
Total <= 40 | TotalByPopulation <= 2e-04)) | (zone == "southeast" & (is.na(Total) |
Total <= 50 | TotalByPopulation <= 2e-04)) | (zone == "southwest" & (is.na(Total) |
Total <= 60 | TotalByPopulation <= 3e-04)))
subset(education_to_review[order(-education_to_review$Total), ],
select = c("zone", "state", "lga", "Total", "TotalByPopulation"))
## zone state lga Total TotalByPopulation
## 96 north_central Nasarawa Nasarawa 173 2.899e-04
## 5 southwest Lagos Ajeromi-Ifelodun 154 2.251e-04
## 3 southwest Ogun Ado Odo/Ota 142 2.697e-04
## 97 north_central Nasarawa Nasarawa Eggon 78 5.230e-04
## 116 north_central Kwara Oke-Ero 70 1.215e-03
## 4 north_central Niger Agwara 64 1.115e-03
## 41 north_central Kwara Ekiti 59 1.076e-03
## 118 southwest Osun Ola-Oluwa 58 7.572e-04
## 131 northeast Gombe Shongom 52 3.432e-04
## 80 northwest Jigawa Kiyawa 49 2.834e-04
## 84 northwest Kaduna Kudan 49 3.525e-04
## 13 southeast Abia Arochukwu 45 2.644e-04
## 103 southeast Enugu Nkanu East 44 2.958e-04
## 39 southwest Ekiti Efon 40 4.601e-04
## 76 northwest Kebbi Kalgo 39 4.567e-04
## 73 southeast Imo Isu 38 2.310e-04
## 22 north_central Kogi Bassa 37 2.643e-04
## 145 northeast Taraba Yorro 37 4.138e-04
## 130 northeast Bauchi Shira 36 1.538e-04
## 134 northeast Yobe Tarmuwa 36 4.663e-04
## 35 northeast Borno Dikwa 31 2.927e-04
## 8 southeast Anambra Anambra East 28 1.826e-04
## 56 northwest Kaduna Giwa 27 9.426e-05
## 58 northwest Sokoto Gudu 26 2.721e-04
## 47 south_south Akwa Ibom Esit Eket 24 3.768e-04
## 54 northeast Yobe Geidam 22 1.399e-04
## 140 southeast Abia Ukwa East 22 3.737e-04
## 57 northeast Borno Gubio 21 1.375e-04
## 53 northeast Taraba Gassol 20 8.172e-05
## 63 southwest Osun Ifedayo 17 4.587e-04
## 92 northeast Borno Mobbar 16 1.372e-04
## 61 south_south Akwa Ibom Ibiono Ibom 14 7.382e-05
## 139 south_south Akwa Ibom Udung Uko 14 2.628e-04
## 110 south_south Rivers Ogu/Bolo 13 1.741e-04
## 146 northeast Yobe Yusufari 13 1.170e-04
## 30 northeast Borno Chibok 2 3.025e-05
## 31 northeast Borno Damboa NA NA
## 121 southeast Edo Ovia South NA NA
## 128 northeast Taraba Sardauna NA NA
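The long chain of per-zone conditions can also be written as a lookup table merged onto the totals, which keeps the cutoffs in one place. The sketch below is an alternative formulation, not the code used to produce the table above. The helper `flag_for_review`, the column names `min_total` and `min_by_pop`, and the toy data are hypothetical; the cutoff values are copied from the education `subset()` call.

```r
# Per-zone cutoffs as a lookup table (values copied from the education subset() call).
education_cutoffs <- data.frame(
  zone       = c("north_central", "northeast", "northwest",
                 "south_south", "southeast", "southwest"),
  min_total  = c(80, 55, 50, 40, 50, 60),
  min_by_pop = c(4e-04, 2e-04, 2e-04, 2e-04, 2e-04, 3e-04),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag LGAs with no data, or at/below either cutoff for their zone.
flag_for_review <- function(tots, cutoffs) {
  # Inner join: rows whose zone is not in the cutoff table are dropped.
  m <- merge(tots, cutoffs, by = "zone")
  flagged <- is.na(m$Total) |
    m$Total <= m$min_total |
    m$TotalByPopulation <= m$min_by_pop
  # which() drops any NA conditions, mirroring what subset() does.
  m[which(flagged), setdiff(names(m), c("min_total", "min_by_pop")), drop = FALSE]
}

# Toy totals, made up for illustration only.
toy <- data.frame(
  zone  = c("north_central", "north_central", "northeast", "southwest"),
  lga   = c("A", "B", "C", "D"),
  Total = c(173, 200, NA, 61),
  TotalByPopulation = c(2.9e-04, 6e-04, NA, 5e-04),
  stringsAsFactors = FALSE
)
flagged <- flag_for_review(toy, education_cutoffs)
print(flagged$lga)
```

In this toy example, A is flagged on the per-population cutoff alone and C because it has no data, while B and D pass both checks. The same helper would work for health with a second cutoff table.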
Similarly for health, we start by plotting the facility counts and facility count per population:
qplot(data = htots, x = zone, y = Total) + geom_point()
## Warning message: Removed 4 rows containing missing values (geom_point).
## Warning message: Removed 4 rows containing missing values (geom_point).
qplot(data = htots, x = zone, y = TotalByPopulation) + geom_point() +
coord_cartesian(ylim = c(0, 5e-04))
## Warning message: Removed 4 rows containing missing values (geom_point).
## Warning message: Removed 4 rows containing missing values (geom_point).
And we come up with the cutoffs:
* north_central: Total <= 35 | TotalByPopulation <= 0.00025
* northeast: Total <= 35 | TotalByPopulation <= 0.00015
* northwest: Total <= 20 | TotalByPopulation <= 0.0001
* south_south: Total <= 10 | TotalByPopulation <= 0.0001
* southeast: Total <= 10 | TotalByPopulation <= 0.00011
* southwest: Total <= 20 | TotalByPopulation <= 0.0001
Programming the cutoffs in, we see that the following LGAs might need to be re-surveyed:
health_to_review <- subset(htots, (zone == "north_central" & (is.na(Total) |
Total <= 35 | TotalByPopulation <= 0.00025)) | (zone == "northeast" & (is.na(Total) |
Total <= 35 | TotalByPopulation <= 0.00015)) | (zone == "northwest" & (is.na(Total) |
Total <= 20 | TotalByPopulation <= 1e-04)) | (zone == "south_south" & (is.na(Total) |
Total <= 10 | TotalByPopulation <= 1e-04)) | (zone == "southeast" & (is.na(Total) |
Total <= 10 | TotalByPopulation <= 0.00011)) | (zone == "southwest" & (is.na(Total) |
Total <= 20 | TotalByPopulation <= 1e-04)))
subset(health_to_review[order(-health_to_review$Total), ], select = c("zone",
"state", "lga", "Total", "TotalByPopulation"))
## zone state lga Total TotalByPopulation
## 96 north_central Nasarawa Nasarawa 71 1.190e-04
## 119 north_central Benue Okpokwu 42 2.378e-04
## 5 southwest Lagos Ajeromi-Ifelodun 40 5.847e-05
## 67 north_central Kwara Ilorin East 38 1.860e-04
## 4 north_central Niger Agwara 32 5.574e-04
## 94 northeast Gombe Nafada 25 1.809e-04
## 134 northeast Yobe Tarmuwa 23 2.979e-04
## 22 north_central Kogi Bassa 22 1.572e-04
## 91 northwest Kano Minjibir 21 9.823e-05
## 39 southwest Ekiti Efon 20 2.300e-04
## 41 north_central Kwara Ekiti 20 3.646e-04
## 89 northwest Jigawa Maigatari 20 1.113e-04
## 116 north_central Kwara Oke-Ero 20 3.471e-04
## 28 northwest Kano Bunkure 19 1.112e-04
## 40 south_south Bayelsa Ekeremor 19 7.030e-05
## 60 northeast Taraba Ibi 19 2.260e-04
## 137 northwest Sokoto Tureta 19 2.779e-04
## 143 northwest Kano Warawa 19 1.475e-04
## 54 northeast Yobe Geidam 18 1.144e-04
## 62 southwest Oyo Ido 18 1.743e-04
## 76 northwest Kebbi Kalgo 18 2.108e-04
## 93 northeast Adamawa Mubi North 18 1.191e-04
## 35 northeast Borno Dikwa 17 1.605e-04
## 130 northeast Bauchi Shira 17 7.265e-05
## 145 northeast Taraba Yorro 17 1.901e-04
## 109 south_south Bayelsa Obalga 16 8.893e-05
## 117 south_south Rivers Okrika 16 7.206e-05
## 19 northwest Kano Bagwai 15 9.211e-05
## 103 southeast Enugu Nkanu East 15 1.008e-04
## 14 south_south Rivers Asari-Toru 14 6.361e-05
## 74 southwest Oyo Iwajowa 12 1.165e-04
## 115 north_central Kogi Okehi 12 6.000e-05
## 58 northwest Sokoto Gudu 11 1.151e-04
## 63 southwest Osun Ifedayo 11 2.968e-04
## 80 northwest Jigawa Kiyawa 10 5.783e-05
## 34 south_south Rivers Degema 9 3.603e-05
## 57 northeast Borno Gubio 9 5.891e-05
## 92 northeast Borno Mobbar 9 7.715e-05
## 47 south_south Akwa Ibom Esit Eket 8 1.256e-04
## 8 southeast Anambra Anambra East 7 4.565e-05
## 139 south_south Akwa Ibom Udung Uko 7 1.314e-04
## 61 south_south Akwa Ibom Ibiono Ibom 6 3.164e-05
## 110 south_south Rivers Ogu/Bolo 6 8.034e-05
## 53 northeast Taraba Gassol 5 2.043e-05
## 56 northwest Kaduna Giwa 5 1.746e-05
## 146 northeast Yobe Yusufari 5 4.501e-05
## 30 northeast Borno Chibok NA NA
## 31 northeast Borno Damboa NA NA
## 121 southeast Edo Ovia South NA NA
## 128 northeast Taraba Sardauna NA NA