Everyone is suddenly abuzz about AI. Since OpenAI released ChatGPT, the idea of artificial general intelligence seems like a plausible reality. Whether you believe AI will benefit the world or bring about disaster, it’s hard to deny how compelling ChatGPT/GPT-4 is when it comes to the quality of its answers and the range of tasks it can perform.
This has sparked an “AI arms race”, with the largest tech companies all desperately investing in their own AI capabilities after being beaten to the punch by OpenAI.
Permissionless data
ChatGPT was trained using a vast collection of written material found online: a corpus of some 7,000 unpublished books; WebText, built from millions of "high quality" outbound links from Reddit; Common Crawl, a web archive repository containing petabytes of web data; Wikipedia entries; and more.
ChatGPT also used human trainers, including “labellers” and feedback managers who constantly refined its language models. Once the software was further developed, OpenAI released it to the public for more feedback.
This model of monetising products built on a vast collection of free, public data is a Silicon Valley favourite. The big tech practice of using human data, labour and output without permission, compensation or recourse has meant that a small handful of companies have become the gatekeepers of new technology.
But even some Big Tech insiders are starting to get worried. Geoffrey Hinton, dubbed “the godfather of AI”, recently resigned from Google, admitting concerns about the speed and development of the technology, calling the current chatbots “quite scary”.
He follows other Google employees who have raised the alarm on AI, including Timnit Gebru and Margaret Mitchell, who both spearheaded Google’s AI ethics team. Despite these warnings, Google is ploughing ahead.
Human toll
OpenAI trained ChatGPT on publicly available datasets, benefiting from the collective work of millions. Google’s search engine indexes all publicly available websites to create a directory of information that it profits from. Its other applications — like Maps, Gmail and Google Docs — combine datasets collected from millions of users to create marketable user profiles to power its advertising business. Facebook harvests all the data around our habits, likes, dislikes and personal connections to micro-target advertising.
Then there are the considerations around copyright, intellectual property and moral rights. While copyright may not apply to the current datasets, when will this type of appropriation end?
AI models require ever larger and more diverse datasets to truly capture the nuances and subtleties of expression. One can imagine that copyrighted works of literature, art and music are next. Already Google is advocating for breaking open copyright law in Australia to better accommodate its AI systems.
AI might seem like magic, but in reality, it benefits from the work of many humans, often in an exploitative and extractive manner.
OpenAI outsourced the labelling work that helps make ChatGPT less toxic, hiring workers in Kenya who were paid less than $2 per hour. Facebook has been notorious for a similar practice for years, hiring poorly paid moderators to sift through harmful content, with significant mental health consequences including PTSD.
An ecosystem of labour
In her book Atlas of AI, scholar Kate Crawford exposes how extractive the AI industry is. In contrast to the images we associate with it (innovation, the virtual and non-physical, the machine-led and independent of humans), AI is in fact embodied and material, reliant on an ecosystem that consumes raw materials and exploits human labour.
Crawford describes how AI requires rare resources such as lithium for batteries, often mined in developing countries and conflict zones, and latex from South-East Asia, with concerning environmental impacts. AI also relies on outsourced, manual data labelling and classification, usually done for a pittance by workers in poor countries, as with Amazon's Mechanical Turk (and the aforementioned OpenAI labellers and Facebook moderators). Or it relies on indirect labour, whereby humans don't even realise they are contributing to training AI, as with Google's reCAPTCHA feature.
With the latest version of AI that’s now sweeping the globe, ChatGPT, we are in danger of once again reinforcing the myth of AI as something disembodied and free from human intervention, a unique and novel thing that genius tech bros have come up with in isolation.
Importantly, we are again in danger of allowing a small group of big tech companies to capitalise and profit from the latest technology built off the work of millions of people, without acknowledging the extractive nature of those initiatives.
So, if AI does take over our jobs, making our skills obsolete or severely disrupting our industries, we shouldn't be surprised. We only have ourselves to blame for allowing these harmful big tech practices to continue unchallenged, or worse, for actively contributing to them.

How does this differ from any prior example of capitalist exploitation? Indeed, isn't exploitation the essence of what capitalism is? The exploitation of any and all available resources for the benefit and profit of the exploiters?
Once upon a time it was the exploitation of the natural world and its resources. Then it was the exploitation of vulnerable human beings. Then it was human culture: myth and knowledge and beliefs.
AI is not the source of the evil that may arise from its use. Capitalism and its total lack of ethics and decency is where the evil comes from.
I can think of other extractive industries that get away with all manner of sins. Seriously, what is the government going to do, other than fight a losing battle to regulate and maybe take their pocket change in fines to prove we're "doing something about it"?
There are endless opportunities with this powerful tool. Perhaps using AI to help guide us to a fairer future is more productive than trying to beat big tech to the punch. Otherwise, Google government will arrive while we're busy tilting at windmills.
And let's face it, humans are taking way too long to evolve. Given the choice, I'd vote for an ethical AI algorithm to represent me in parliament over half the party robots currently there. To start with, it wouldn't take dodgy donations, hold out for a gold-plated pension, then pass the safe seat on to its next of kin.
Yes, I’m optimistic humans and AI are going to combine in surprisingly productive and beautiful ways, as a nicely balanced cognitive and emotive pairing that will end up encouraging each other to embrace their respective sentient ‘better angels’. It’s going to be an interesting and fun next decade, regardless.
To read AI is to see, writ plain (sic!), the peril of Keats' "unweaving the rainbow" – not so much through a glass, darkly, as through a prism, if not a kaleidoscope.
On the back of underpaid or even free labour, alright – how many times have we had to ostensibly "prove you're not a robot" – IOW, train robots to better imitate humans, for free?
It’s a good piece, Jordan, but you fall into the same trap all professional writers fall into, which is assuming that ‘free’ – non-monetised – information is automatically exploitative. You’ve got it a*se about, mate. AI is actually returning us to the heady days of the early net when it was all open source and unpaid and human beings threw their digitised contributions into the collective info-mix for no more kick than participating in the next great human communication adventure.
That AI will, due to business model imperatives, be overwhelmingly 'raised' and 'educated' on free internet content in fact represents a radically liberating, democratic and egalitarian step beyond the reach of cynical commercialisation. It's the oldest internet story there is: if you want your content free, then it'll be content you can't control. AI is simply accelerating the shift that the internet was always going to force upon us: rendering the very notion of 'commodifying information' – an abstraction – untenable and redundant. We're heading back to where we were before information recording technology – slates, papyrus, ink and paper, presses, computers – allowed us to commodify information by commodifying its delivery platform. The tech companies will still make coin by flogging hardware, software, networks…but the origin content will arbitrage inevitably towards democratically free again. It'll be you poor tenured info-luddites, dutifully writing and graphic-designing and coding and archiving your content exclusively for a paying audience, behind pay-walled e-citadels ruled by publishing moguls and corporations with agendas, who'll be shut out of inclusion in the next evolutionary iteration of human intelligence. And I am, frankly, flipping delighted that ChatGPT's busily harvesting algos can't nick (most) bits of Murdochian content to add to its store of collected wisdom!
The pros of the ‘legacy media’ have spent 20 years laughing at us amateur content-creators here in the unpaid sewers of Teh Interwebz…but it’s us, not them and their dinosaurian editors, publishers, moguls and political backers, who will parent AI into adulthood.
We really need to get out of the mindset that pretends we can 'do something about AI'. There is nothing we can do. AI will change everything from driving to bricklaying to medicine to lawyering. And much of it will happen in the next ten to fifteen years. Eight years ago Oxford University put out a research paper outlining over 700 occupations and the percentage risk of each disappearing due to computerisation. I wonder how many politicians are even aware of it, indeed how many journos? Conservatives complain endlessly about dole bludgers and get apoplectic if you mention UBI. I wonder what they'll say when all the jobs start disappearing and people won't have money to spend. Literally!
I think we'll end up as intelligent, useless, pampered creatures of leisure in a sophisticated galactic zoo. Pets, in essence. We've just got to ensure AI is raised on our better angels' values – in all their complex richness and fullness! (Current AI definitely needs injections of irony, eroticism and subversive humour…)