Photo credit: Getty Images

In an investigation for The Conversation, Clément Le Ludec and Maxime Cornet examine the working conditions of data workers and suggest ways to enrich the debate around the regulation of Artificial Intelligence (AI) systems.

With the intense media attention surrounding ChatGPT, the general public has become aware of the progress made in AI. Many articles have highlighted ChatGPT's impressive ability to understand and generate natural language. But behind these models lie precarious workers, often based in the Global South.

AI models need to be trained, that is, fed an extremely large amount of data that shapes how they behave. For example, an algorithm whose task is to recognise cats in photographs must, during the training phase, be fed photos known to show cats and photos known not to. In this way, the model can not only learn but also assess its own performance and thus optimise its detection. This training data needs to be collected, sorted, verified and translated into a form the AI can assimilate: time-consuming, low-value tasks that are typically outsourced via the platform economy, in particular through the crowd-sourcing giant Amazon Mechanical Turk. Recently, an investigation by Time revealed that the workers responsible for ensuring that ChatGPT's training data was free of discriminatory content were based in Kenya and paid less than three euros an hour.
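To make the cat example concrete, here is a minimal sketch of this kind of supervised training and self-assessment, written in Python with scikit-learn. The feature matrix, the labels and the choice of model are illustrative stand-ins rather than any company's actual pipeline; the point is simply that every label the model learns from, and evaluates itself against, comes from human annotation work.

```python
# Minimal sketch of supervised training on human-annotated data
# (illustrative only; assumes scikit-learn and pre-extracted image features).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Stand-in for image features: in practice each row would be derived
# from a photograph (pixels, embeddings, etc.).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))

# Labels produced by human annotators: 1 = "cat", 0 = "not a cat".
# This is the collected, sorted and verified data the article describes.
y = rng.integers(0, 2, size=1000)

# Hold out part of the labelled data so the model can assess its own performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)          # training phase: learn from labelled examples

predictions = model.predict(X_test)  # self-assessment: compare against held-out labels
print(f"Accuracy on held-out labelled data: {accuracy_score(y_test, predictions):.2f}")
```

Without the human-produced labels in `y`, neither the training step nor the self-assessment step would be possible.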

Conducted within the Digital Platform Labor research group, the study by Clément Le Ludec and Maxime Cornet of the Institut Mines-Télécom (IMT) aims to reveal the working conditions of the data workers employed by French AI companies. The survey is based on interviews with 147 workers, managers and executives at 10 Malagasy companies, as well as a questionnaire administered to 296 data workers located in Madagascar. French companies outsource most of this data work to service providers in Madagascar, mainly because of the low cost of skilled labour and the large number of organisations offering these services. These workers belong to a wider business-services sector that ranges from call centres to web content moderation and editorial services for search engine optimisation.

They are mostly men (68%), under 34 years of age (87%) and have completed higher education (75%). Most earn between 96 and 126 euros per month, with significant pay gaps: team-supervision positions, also held by Malagasy workers, can pay 8 to 10 times more. These workers sit at the end of a long outsourcing chain, which partly explains why such skilled workers earn wages that are low even by Malagasy standards. Indeed, the AI business involves many actors: the GAFAM companies, which provide data hosting and computing power; French companies, which sell AI models; and Malagasy companies, which provide data annotation services, with each intermediary capturing part of the value produced. Beyond low labour costs, the outsourcing industry benefits from a well-trained workforce: most workers have attended university and are fluent in French, learned at school, online and through the network of Alliances françaises.

Malagasy annotation companies are highly dependent on their French clients, who manage this outsourced workforce almost directly, through dedicated middle-management positions within the Parisian start-ups. The fact that these positions are filled by foreigners, whether employees of the client companies in France or expatriates on site, is a major obstacle to career advancement for data workers, who remain stuck at the lower levels of the value chain. Furthermore, the majority of Malagasy companies offering digital services are run by French nationals. Alongside these formal companies, the sector has developed around a mechanism of "cascading subcontracting": at the end of the chain, informal companies and individual entrepreneurs, who are treated less well than workers in formal companies, are mobilised whenever the sector's companies lack manpower.

Thus, the development of AI does not mean the end of work through automation, but rather its displacement to developing countries. "It seems to us necessary to take better account of the human work that is essential for training models," conclude the two researchers. "Making the involvement of these workers visible means questioning globalised production chains, which are well known in the manufacturing industry but which also exist in the digital sector. [...] It also means making visible the consequences of their work on the models. Some of the algorithmic biases lie in the work on data, which is still largely invisible to companies. A truly ethical AI must therefore involve an ethics of AI work".