Why Africa needs its own Data
Closing the Data Gap to Empower Innovation and Participate in the AI Revolution
"We are in a race between revolution and disaster. The revolution is the 4th Industrial Revolution based on Artificial Intelligence and big data. The disaster is that much of Africa has been left out of the early stages of this race." - Dr. Ibrahim Assane Mayaki
If you were to google the words “Beautiful Baby”, you’d mostly get images of white babies. Now, don’t get me wrong, there’s nothing wrong with white babies. But that little experiment shows the bias (however slight) inherent in the search results. It shows the effects of systems trained without diverse datasets.
I’ve spent a lot of time thinking about the current state of AI development in Africa, and this is one in a series of essays where I hope to detail my thoughts on the existing problems and possible solutions.
This essay discusses the urgent need for Africa to develop its data capabilities so it can participate fully in the AI revolution, and the looming risk of exclusion and increased inequality if it does not.
Let’s dive right in.
Why Africa needs its own Data Capabilities
Data is the lifeblood of AI, and AI systems are moulded by the pre-existing social and cultural biases in the data on which they are trained (remember our earlier “beautiful baby” example?). Africa needs its own AI systems trained on local data for two core reasons: localized context and the geographical nature of AI development. Let’s go deeper into these reasons.
Localized Context
Most current AI systems are built largely using data from the United States and other Western countries. When these systems are deployed across Africa in sectors like health, agriculture, and education, the lack of localized context leads to bias.
For example, an AI system built to detect breast cancer is often trained mostly on data obtained from Caucasian women. As a result, the system may not perform accurately when used to diagnose cancer in African women, because there can be physiological variations between the two groups, such as differences in glandular density, texture, and shape.
When these physiological variations aren’t accounted for, the model’s predictions are skewed. This results in inaccurate diagnosis and prognosis, and contributes to consistently low survival rates.
The solution would be to train the model on a dataset that better reflects the population in which the system will be deployed. However, no such dataset can exist if there is no data that accurately describes the African populace.
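To make the idea of a dataset "reflecting the population" a little more concrete, here is a minimal Python sketch that compares each group's share of a training set with its share of the population the system is meant to serve. The function name, group labels, and numbers are all hypothetical and purely for illustration; they are not real figures from any dataset.

```python
from collections import Counter


def representation_gap(sample_groups, population_shares):
    """Return, per group, (share in the dataset) minus (share in the target population).

    A large negative value means the group is under-represented in the data.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Hypothetical numbers, for illustration only.
training_labels = ["group_a"] * 90 + ["group_b"] * 10
deployment_population = {"group_a": 0.2, "group_b": 0.8}

gaps = representation_gap(training_labels, deployment_population)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

In this made-up case, the group that makes up most of the deployment population is badly under-represented in the training data, which is the kind of mismatch described above.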
Geographical Nature of AI Development
In the current age of AI, the problems being solved are becoming increasingly tied to culture and geography. AI is an accelerator for human progress, so what it gets applied to depends on the people building it, and they choose problems based on their own priorities and stage of development.
For example, developers in the US and other Western countries may be more focused on “solving death” and creating new drugs (think Calico and Isomorphic Labs), while education may be a less pressing issue for them. For African developers, on the other hand, better education and increased agricultural output are likely to be more pressing needs.
End Note
AI accelerates human progress, but progress depends on the current developmental stage, which varies geographically. Africa has different developmental needs than the West/US, and therefore needs its own AI systems tailored to those needs and to its current developmental stage.
AI systems, however, are shaped by the human context in their data. To build them, engineers and scientists need access to localized datasets that give the models the local context required to make them useful.
If quality data isn’t made available, we run the risk of missing out on the current AI revolution, and if we do, the developmental gap between Africa and the West/US will increase. But this time, because AI can exponentially increase human functional capability, the gap won’t grow arithmetically but geometrically. This means we could end up behind not by years but by orders of magnitude.
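For a rough feel of the difference between an arithmetic and a geometric gap, here is a small Python sketch. The starting gap, step, ratio, and number of periods are arbitrary assumptions chosen only to show the shape of the two curves; they are not estimates of any real-world gap.

```python
def arithmetic_gap(start, step, periods):
    """Gap that grows by a fixed amount each period."""
    return [start + step * t for t in range(periods + 1)]


def geometric_gap(start, ratio, periods):
    """Gap that is multiplied by a fixed ratio each period."""
    return [start * ratio ** t for t in range(periods + 1)]


# Arbitrary illustrative values: start at 1, add 1 per period vs. double per period.
linear = arithmetic_gap(1, 1, 10)
compounding = geometric_gap(1, 2, 10)

print("arithmetic gap after 10 periods:", linear[-1])
print("geometric gap after 10 periods:", compounding[-1])
```

Under these toy assumptions, the arithmetic gap ends at 11 while the geometric one ends at 1024, which is the "orders of magnitude" difference the paragraph above is pointing at.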
The continent could be locked out of future progress to an unrecoverable degree. This could have disastrous downstream effects: entire generations disadvantaged compared to their counterparts in other parts of the world, and economies wiped out of existence. Africa needs to begin collecting and using its own data to provide solutions to its own problems.
Aurum Finds
Here is a non-exhaustive list of articles I’ve recently read and highly recommend.
Renaissance 2.0: AI as the Catalyst: In this compelling and highly readable essay, the author talks about AI as a stimulator for a modern Renaissance. It’s an intuitive look at what the future of AI could look like.
The Business of Venture Capital: Here the author provides a foundational look at the world of venture capital. A real masterpiece.
Albuquerque Part 2: The Uncomfortable: The author continues his Albuquerque series with a look at a huge problem with AI systems: bias and how it affects the healthcare sector.
Cognitive Bias at the Coffee Shop: In this piece, the author talks about how we unconsciously employ cognitive biases in our day-to-day lives. He does this in a way I have come to know as Goatesque.
The Genius Myth: Is there any such thing as a genius? The author talks about how humans place too much emphasis on genius when it’s just a synonym for high skill level.
Let me ask you this, Edem: do you think an independently trained LLM in Africa (or other places) could then be used to improve existing LLMs like ChatGPT, Bard, etc where "beautiful" is too often a synonym for "white"? In other words, could a much better, more comprehensive system result from these efforts?
Also, LOL @ "Goatesque"
Very kind of you, Edem! Thanks for the mention! I hope this dramatic situation will be solved soon. Back in 2016 I was denied a grant because I was from Georgia. I know what you're talking about and I understand your POV.