The little engine that could
Antoine Bello · December 8, 2021
No news from the Population Project doesn’t mean bad news. Quite the opposite, actually: it means that, like the little engine that could, we’re firing on all cylinders!
Our team of list sourcers now collects an average of 1 million names per day, weekends included. Lists mostly come from South America and Africa (we're limiting ourselves to the Roman alphabet for the moment).
Our biggest contributors: social programs in Mexico, vaccination reports in Brazil, electoral rolls in Ivory Coast, exam results in Mali and Cameroon. When we find a good list, we dig through previous years’ archives and make a mental note to check back next year. It’s too early to compute ratios, but it’s safe to say that a fair number of lists are fully renewed every year (think exam results or high school graduations).
Collecting all these names would be pointless without our expert staff of processors. Based in Madagascar, they evaluate the reliability of the source, clean up the files, remove unnecessary columns and duplicate records, and save the resulting spreadsheets in a giant Dropbox (see below).
We’re also moving on the IT front, building the 4D database that will host all this data and, ultimately, power our website. Now that we’ve settled on an HMVC model (see previous post), we’re working on three separate but interconnected databases:
- Background (where we host all the name knowledge we’ve amassed about every country);
- Import (where we’ll process the lists with a view to separating new humans from returning ones);
- Repository (where all records will be stored and available for query).
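
To give a feel for what the Import step has to do, here is a minimal Python sketch of separating new humans from returning ones. It is only an illustration, not how our 4D database works: the matching key (normalized name, birth year, country), the field names and the sample records are all assumptions made up for the example.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace (an assumed matching rule)."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.casefold().split())


def split_new_and_returning(incoming, known_keys):
    """Split incoming records into (new, returning) against already-known keys.

    The key (name, birth_year, country) is hypothetical; real matching would
    likely need more fields and fuzzier rules.
    """
    seen = set(known_keys)  # copy, so the caller's set is left untouched
    new, returning = [], []
    for record in incoming:
        key = (normalize(record["name"]), record["birth_year"], record["country"])
        if key in seen:
            returning.append(record)
        else:
            new.append(record)
            seen.add(key)  # a repeat later in the same list counts as returning
    return new, returning


# Made-up sample data, for illustration only.
known = {("maria jose silva", 1990, "BR")}
incoming = [
    {"name": "María José Silva", "birth_year": 1990, "country": "BR"},
    {"name": "Kofi Mensah", "birth_year": 1985, "country": "CI"},
]
new, returning = split_new_and_returning(incoming, known)
print(len(new), "new,", len(returning), "returning")
```

Accents and capitalization are ignored here, so "María José Silva" matches the stored "maria jose silva". Whether that is the right call depends on the country, which is exactly the kind of name knowledge the Background database is meant to hold.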
We work a lot but no one complains! Knowing we’re in it for the long haul helps. Adding 1 million names a day as we do now (and assuming all of them are new, which they’re not) means it would take us over 20 years to complete our mission. Let’s see if we can do better than that…
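
For the curious, the back-of-the-envelope math behind that estimate can be sketched in a few lines of Python. Only the daily rate comes from this post; the world population figure (roughly 7.9 billion in 2021) is an assumption plugged in for illustration.

```python
# Only NAMES_PER_DAY comes from the post; the population figure is an assumption.
WORLD_POPULATION = 7_900_000_000  # rough 2021 estimate (assumed)
NAMES_PER_DAY = 1_000_000         # current collection rate
DAYS_PER_YEAR = 365.25


def years_to_complete(population: int, names_per_day: int) -> float:
    """Years needed if every collected name belonged to a new person."""
    return population / names_per_day / DAYS_PER_YEAR


years = years_to_complete(WORLD_POPULATION, NAMES_PER_DAY)
print(f"{years:.1f} years")
```

Since some of the names we collect are returning ones, the real figure at today's pace would be higher still.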