Where we stand - July 2021

We owe it to ourselves and to our volunteers to honestly review the good and the not so good if we want to have a shot at executing our gigantic task.

Antoine Bello · July 12, 2021

Two months after its inception, it seems a good time to assess the first steps of the Population Project. We owe it to ourselves and to our volunteers to honestly review the good and the not so good if we want to have a shot at executing our gigantic task.

The good

There have been plenty of reasons to rejoice:

  • The warm response to the project is the most important of all. From every corner of the world, we have heard support for our initiative. Granted, the naysayers probably didn’t bother coming forward but we can safely say that our vision is resonating with many.
  • We are gradually honing our message. When we launched the Population Project, we faced the same question over and over again: why list the name and date of birth of all humans alive? Today we have better answers to offer: to acknowledge every life; to bear witness to who lived when; to give each earthling a chance to know their neighbors; and, last but not least, because it would be beautiful.
  • We’re tackling head-on the delicate issue of privacy. We have hired Privacy Interpol, a Berlin-based consulting firm that is currently crafting our privacy policy and making sure that we comply with all rules and regulations.
  • Our website works well. Its content management system makes it incredibly easy to update the Volunteer Handbook or edit blog posts. It’s still a pain to find us on Google but we will get there.
  • A few incredible individuals (you know who you are) have proved that one motivated and skillful volunteer can source several dozen lists a day. Some of them will contain 80 names, others 400,000. Some will be limited to first and last name, others will include date and place of birth. We take them all.
  • We have gathered about 300 million profiles, mostly in the US, France and the UK, but also in India, Zambia, Algeria, Cameroon, Canada, the Philippines and many other countries.
  • We’re constantly streamlining the core process that goes from sourcing a list to creating an Excel file ready for import. Volunteers now seamlessly upload the URL of the lists they’ve found. A ticket is automatically created in a workflow management tool, where anybody can follow the progress of the job (downloading the file, converting it from PDF to Excel, cleaning the Excel file, importing it into our database, etc.). As much as we’d love to rely on volunteers for all tasks, we’ve started hiring data-processing workers in developing countries.
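
For the technically minded, the ticket workflow in the last bullet can be pictured with a few lines of Python. This is only an illustrative sketch, not the tool we actually use: the `Stage` and `Ticket` names, the example URL and the strict one-step-at-a-time rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Stages taken from the process described above; the names are hypothetical.
    SUBMITTED = 1    # a volunteer has uploaded the URL of a list
    DOWNLOADED = 2   # the file has been downloaded
    CONVERTED = 3    # PDF converted to Excel
    CLEANED = 4      # Excel file cleaned
    IMPORTED = 5     # records imported into the database


@dataclass
class Ticket:
    url: str
    stage: Stage = Stage.SUBMITTED
    history: list = field(default_factory=list)

    def advance(self) -> Stage:
        """Move the ticket to the next stage, keeping a trail of past stages."""
        if self.stage is Stage.IMPORTED:
            raise ValueError("ticket is already imported")
        self.history.append(self.stage)
        self.stage = Stage(self.stage.value + 1)
        return self.stage


# Usage: one ticket per submitted list, advanced as each step completes.
ticket = Ticket("https://example.org/voters.pdf")  # placeholder URL
ticket.advance()  # the ticket is now at Stage.DOWNLOADED
```

Because every ticket carries its own history, anybody looking at it can see which steps are done and which one is pending.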

The not so good

There are several key areas that haven’t lived up to our expectations.

  • Our Postgres database is the core of the Population Project. Every time we feed it a name, Postgres determines whether it’s a new entry or another occurrence of a profile already in store. If the latter, Postgres compares the records field by field (middle name, date of birth, etc.) to determine whether they should be merged or not. The difficulty is compounded by the fact that there can be 30,000 John Smiths and that we plan on ingesting several million records a day at scale. We’ve pretty much written the matching logic, but it still runs too slowly. If you know Postgres and are looking for a challenge, by all means contact us!
  • As we source lists from more and more countries, it’s becoming clear that we need a reliable method to break down full names into first, middle and last names. What is relatively easy in French or English turns out to be a nightmare in Portuguese and Arabic. This looks like a typical use case for machine learning, but we don’t possess this skill yet.
  • We’re not sourcing enough lists, especially in developing countries. This is a shame because our ads generate a fair amount of interest. But many applicants seem to like the idea of volunteering more than the actual work it entails. They talk the talk where we’d like to see them walk the walk. Part of the responsibility undeniably lies with us (sourcing lists involved too much technical skill in the beginning), but we must instill an action-based culture among our recruits.
  • We don’t have specific strategies for India and China, which together account for more than a third of the world’s population.
  • Antoine can’t do it alone. We will soon announce crucial additions to the team.
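
To make the matching problem from the first bullet concrete, here is a minimal Python sketch of the "new entry or existing profile?" decision. It is not our Postgres code. The field names, the `can_merge` rule (same name, and no field on which both records hold different values) and the in-memory list standing in for the database are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Profile:
    first_name: str
    last_name: str
    middle_name: Optional[str] = None
    date_of_birth: Optional[str] = None  # ISO date string, e.g. "1970-01-31"


# Hypothetical list of the fields compared one by one.
COMPARED_FIELDS = ("middle_name", "date_of_birth")


def same_name(a: Profile, b: Profile) -> bool:
    """Case-insensitive comparison of first and last name."""
    return (a.first_name.casefold(), a.last_name.casefold()) == (
        b.first_name.casefold(), b.last_name.casefold())


def can_merge(a: Profile, b: Profile) -> bool:
    """Assumed rule: same name, and no field where both values exist and differ."""
    if not same_name(a, b):
        return False
    for name in COMPARED_FIELDS:
        va, vb = getattr(a, name), getattr(b, name)
        if va is not None and vb is not None and va != vb:
            return False
    return True


def merge(a: Profile, b: Profile) -> Profile:
    """Keep a's values and fill its empty fields from b."""
    gaps = {n: getattr(b, n) for n in COMPARED_FIELDS if getattr(a, n) is None}
    return replace(a, **gaps)


def ingest(store: list, incoming: Profile) -> list:
    """Merge into the first compatible profile, or append as a new entry."""
    for i, existing in enumerate(store):
        if can_merge(existing, incoming):
            store[i] = merge(existing, incoming)
            return store
    store.append(incoming)
    return store
```

Two things in this sketch show why the real problem is hard. A rule this lenient would merge two different John Smiths who have no other field filled in, and the loop compares each incoming record against every stored profile, which cannot keep up with several million records a day.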

Asks – How you can help

You can become a research volunteer by sourcing lists of names in your country. More details are in the Volunteer Handbook.

Not ready to commit? Anybody can submit one or several lists on our website. It takes less than a minute.

We’re looking for strong profiles in the following fields:

  • Back-end (preferably Postgres);
  • Front-end, to prepare the second version of our website;
  • Machine learning;
  • Data science;
  • Scraping;
  • Search engine optimization.

Whether you view your contribution as a one-time gig or you’re willing to contribute several hours a week, let’s talk!