block by patcon df1859b30b3498fbe4d1987ffa3a8972

City of Toronto Youth Services dataset (Unofficial)

Derivation

  1. On this page, find the JS file called youthServices.min.js.

  2. Load it in a JS beautifier, then copy the resulting JS into a viewer where you can search properly (text fields like this one don't seem to be searchable off-screen).

  3. Find instances of the word “drive” in a Google Drive URL. This should turn up two URLs. Each response is JSON wrapped in a callback (JSONP), which you can strip to get proper JSON:

curl --location "https://drive.google.com/uc?id=0B-j2Y49nfiw2bjZqOGgtcmJZbGs" | sed 's/);$//' | sed 's/^srchCallBack(//' | python -m json.tool > YS_OrgSearchData.json

curl --location "https://drive.google.com/uc?id=0B-j2Y49nfiw2MnBjWHBqcDR4eW8" | sed 's/);$//' | sed 's/^TopxCallBack(//' | python -m json.tool > YS_TopicDescriptions.json
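
Both commands above do the same thing: strip a callback wrapper, then pretty-print the JSON. As a rough sketch, the stripping could be factored into one helper that works for any callback name. The name strip_jsonp is made up here and does not come from the site's JS. The helper assumes the response is exactly `name( ... );` with nothing before or after it.

```shell
# Hypothetical helper (not from the original page): reads a JSONP response on
# stdin and writes the bare JSON payload to stdout.
# Assumes the body looks like `someCallBack( ... );` and nothing else.
strip_jsonp() {
  # 1st expression: drop the leading `callbackName(` on the first line.
  # 2nd expression: drop the trailing `)` or `);` on the last line.
  sed -e '1s/^[A-Za-z_$][A-Za-z0-9_$]*(//' \
      -e '$s/);\{0,1\}[[:space:]]*$//'
}

# Example usage (needs network access, so it is left commented out):
# curl --location "https://drive.google.com/uc?id=0B-j2Y49nfiw2bjZqOGgtcmJZbGs" \
#   | strip_jsonp | python -m json.tool > YS_OrgSearchData.json
```

The callback name is matched generically, so the same helper should cover the srchCallBack, TopxCallBack and odetCallBack responses used in this write-up.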

  4. Grepping through the rest of the minified JS will eventually show that other https://drive.google.com/uc?id= URLs are built dynamically from a gid variable, which appears in each item in YS_OrgSearchData.json.

  5. For each item, we can download another JSON response, like so:

# See the keys of the first item in `YS_OrgSearchData.json` for where these variables come from
GID=0B-j2Y49nfiw2NnRwXzhpUUtZX1U
FID=149105
curl --location "https://drive.google.com/uc?id=$GID" | sed 's/);$//' | sed 's/^odetCallBack(//' | python -m json.tool > "$FID.json"
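
To repeat the per-item download for every entry, something like the following might work. It is only a sketch under two assumptions that should be checked against the real file: that YS_OrgSearchData.json is a top-level JSON array, and that each item stores its values under keys named "gid" and "fid". The function names list_ids and download_all are invented for illustration, and python3 is used where the commands above use python.

```shell
# ASSUMPTION: the file is a JSON array of objects with "gid" and "fid" keys.
# Inspect the first item and adjust the key names if they differ.
# Prints one "gid fid" pair per line.
list_ids() {
  python3 -c '
import json, sys
with open(sys.argv[1]) as f:
    for item in json.load(f):
        print(item["gid"], item["fid"])
' "$1"
}

# Downloads one detail file per item, named <fid>.json, in the current directory.
download_all() {
  list_ids "$1" | while read -r gid fid; do
    curl --silent --location "https://drive.google.com/uc?id=$gid" \
      | sed -e 's/^odetCallBack(//' -e 's/);$//' \
      | python3 -m json.tool > "$fid.json"
    sleep 1  # short pause between requests
  done
}
```

Running `download_all YS_OrgSearchData.json` would then write files such as 149105.json. Running `list_ids YS_OrgSearchData.json` on its own first is a cheap way to confirm the assumed keys exist before making any requests.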

149105.json