Rooz + Beh Note ... یادداشت روز+ به

We are all AI's unpaid data workers.

6/14/2023


 
Lately, I've been contemplating the human effort behind advanced AI models. The key to making AI chatbots appear intelligent and produce less harmful content is reinforcement learning from human feedback. This approach involves incorporating input from individuals to enhance the model's responses.

The process relies heavily on human data annotators who assess the coherence, fluency, and naturalness of text strings. They determine whether a response should be kept in the data used to train the AI model or discarded.
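To make the annotators' keep-or-discard decision concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any particular company does it: the 1-to-5 rating scale, the `Rating` class, the `keep_response` function, and the 3.5 threshold are all hypothetical. Real reinforcement learning from human feedback goes further and uses such judgments to train the model itself, rather than simply filtering responses.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rating:
    """One annotator's scores for a single response (hypothetical 1-5 scale)."""
    coherence: int
    fluency: int
    naturalness: int


def keep_response(ratings: list[Rating], threshold: float = 3.5) -> bool:
    """Return True if the response should be kept.

    Each annotator's three scores are averaged, then those per-annotator
    averages are averaged again. The threshold is an assumed cutoff.
    """
    if not ratings:
        # No human judgment available, so the response is not kept.
        return False
    per_annotator = [mean((r.coherence, r.fluency, r.naturalness)) for r in ratings]
    return mean(per_annotator) >= threshold
```

For example, a response rated by two annotators as `Rating(5, 4, 4)` and `Rating(4, 4, 5)` would be kept, while one rated `Rating(2, 3, 2)` by a single annotator would be discarded. Every one of those numbers stands for a person reading the text and making a judgment.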

Even the most remarkable AI chatbots require thousands of hours of human work to behave as desired, and even then, their performance can be unreliable. The labor involved can be grueling and distressing, as will be discussed at the ACM Conference on Fairness, Accountability, and Transparency (FAccT). This conference brings together researchers who study topics such as how to make AI systems more accountable and ethical, which aligns with my interests.

One panel I am particularly looking forward to features Timnit Gebru, an AI ethics pioneer who co-led Google's AI ethics department before her termination. Gebru will address the exploitation of data workers in Ethiopia, Eritrea, and Kenya who are tasked with cleaning up online hate speech and misinformation. In Kenya, data annotators were paid less than $2 per hour to sift through distressing content related to violence and sexual abuse, all to reduce toxicity in ChatGPT. These workers are now organizing into unions to advocate for improved working conditions.

We are on the verge of AI establishing a new global order reminiscent of colonialism, with data workers bearing the brunt of its impact. Shedding light on the exploitative labor practices surrounding AI has become increasingly urgent, especially with the surge in popularity of AI chatbots like ChatGPT, Bing, and Bard, and image-generating AI models such as DALL-E 2 and Stable Diffusion.

Data annotators are involved at every stage of AI development, from model training to verifying outputs and providing feedback that aids in fine-tuning models post-launch. They are often compelled to work at an exceedingly fast pace to meet demanding targets and deadlines. The notion that large-scale systems can be built without human intervention is utterly false.

Data annotators offer AI models the crucial contextual information required to make informed decisions on a large scale and to appear sophisticated. For example, in India, a data annotator had to distinguish between images of soda bottles and identify ones resembling Dr. Pepper. However, Dr. Pepper is not sold in India, leaving the burden on the annotator to make the distinction.

Annotators are expected to discern the values that matter to the company. They aren't just learning about distant and irrelevant things but also figuring out the additional contexts and priorities of the system they are building.

Researchers from the University of California, Berkeley, the University of California, Davis, the University of Minnesota, and Northwestern University argue in a new paper presented at FAccT that we all are data laborers for major technology companies, whether we realize it or not.

Text and image AI models are trained on vast datasets scraped from the internet, which include our data and copyrighted works by artists. The data we generate is forever embedded within AI models designed to generate profits for these companies. Unwittingly, we contribute our labor for free by uploading photos to public platforms, upvoting comments on Reddit, labeling images on reCAPTCHA, or conducting online searches.

Currently, the power dynamics heavily favor the largest technology companies worldwide. To address this, a data revolution and regulatory measures are imperative. One way for individuals to reclaim control over their online existence is by advocating for transparency in data usage and finding mechanisms to provide feedback and share in the revenues generated from their data.

Despite data labor being the backbone of modern AI, it remains chronically undervalued and invisible globally, and low wages prevail for annotators. The contribution of data work needs to be recognized.

    Author

    Roozbeh, born in Tehran - Iran (March 1984)


© COPYRIGHT 2025. ALL RIGHTS RESERVED.