Innovation Hub
An initiative by Al Bayat Mitwahid in UAE

Course 2
Using and Training AI
Unit 2 – Generating images
Lesson 2: All about training images
We have seen that AI image generators are powerful and useful tools. It is important to put them in context so that we can understand the arguments and worries for and against using and developing them.
​
The practice of scraping image datasets for AI training has sparked debate and raised ethical questions. Scraping programs automatically locate and copy images from many sources on the internet so that they can be used in a training dataset. People who have developed scraped datasets argue that scraping is necessary. Their aim is to create very large and varied datasets, which they believe are essential for training good AI models. Supporters argue that scraping ensures that images covering a wide spectrum of demographics, objects, and scenarios are included. Not all image datasets are scraped: some are built from stock photography, which is created with licences allowing use by anyone who agrees to the terms.
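To make the "locate" step more concrete, here is a minimal Python sketch of how a program might find the images on a web page. It is an illustration only, not how any particular dataset was built. The page content, the example.com addresses, and the names ImageLinkCollector and find_image_urls are all made up for this lesson, and nothing is downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkCollector(HTMLParser):
    """Collects the address of every <img> tag found in a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Turn relative addresses such as "/photos/cat.jpg" into full ones.
            self.image_urls.append(urljoin(self.page_url, src))


def find_image_urls(page_url, html_text):
    """Return the full address of each image referenced in html_text."""
    collector = ImageLinkCollector(page_url)
    collector.feed(html_text)
    return collector.image_urls


# Hypothetical page content, written by hand so that nothing is downloaded.
SAMPLE_PAGE = """
<html><body>
  <img src="/photos/cat.jpg" alt="A cat">
  <img src="https://cdn.example.com/dog.png" alt="A dog">
  <p>No image here.</p>
</body></html>
"""

urls = find_image_urls("https://example.com/gallery", SAMPLE_PAGE)
print(urls)
```

Notice that the sketch never checks who owns the images or whether they agreed to be included. That gap is at the centre of the debate described in this lesson.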
​
Critics of image scraping have ethical concerns about images being collected without explicit consent. Without consent, private, personal, or copyrighted images could end up in the datasets used for AI training. Bias can also creep into AI programs if the AI uses the particular types of images that happen to be present in the dataset to represent ideas or concepts.
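One simple way to see how this kind of bias can arise is to count what a dataset contains. The Python sketch below uses a made-up list of ten labels for images tagged "shoe". The labels and the function name label_shares are invented for illustration and do not come from any real dataset.

```python
from collections import Counter

# Hypothetical labels for ten scraped images that were all tagged "shoe".
labels = [
    "trainer", "trainer", "trainer", "trainer", "sandal",
    "trainer", "trainer", "sandal", "trainer", "trainer",
]


def label_shares(items):
    """Return the fraction of the dataset that each label makes up."""
    counts = Counter(items)
    total = len(items)
    return {label: count / total for label, count in counts.items()}


shares = label_shares(labels)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

In this toy dataset, eight out of ten "shoe" images are trainers, so a model trained on it could plausibly come to treat "shoe" as meaning "trainer" most of the time.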
​
Many believe that this process needs ethical guidelines, transparency, and respect for privacy rights. The debate revolves around balancing the need for diverse data against the ethical implications of the scraping process.
​
Tools are currently in development that would programmatically prevent a scraping system from using an image in its dataset, meaning that creators could stop their images from being used. Some campaigners argue that this shouldn’t be necessary, and that international agreements could instead lead to legislation giving that choice to image copyright holders.
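The lesson does not name these tools, and they may work quite differently from what follows. As one illustration of the general idea, websites can already publish a robots.txt file that asks automated programs to stay away from certain pages. The Python sketch below shows how a well-behaved scraper could check such a file before collecting an image. The file contents, the bot name ExampleImageBot, the example.com addresses, and the function may_collect are all hypothetical. Note that robots.txt is only a request: a scraper can choose to ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a creator's website.
# It asks one named bot to stay out of the /portfolio/ folder.
ROBOTS_TXT = """\
User-agent: ExampleImageBot
Disallow: /portfolio/

User-agent: *
Allow: /
"""


def may_collect(robots_txt, bot_name, url):
    """Return True if the robots.txt rules allow bot_name to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot_name, url)


# The named bot is asked not to take images from the portfolio...
print(may_collect(ROBOTS_TXT, "ExampleImageBot",
                  "https://example.com/portfolio/sunset.jpg"))
# ...but it may still read other pages on the site.
print(may_collect(ROBOTS_TXT, "ExampleImageBot",
                  "https://example.com/blog/post.html"))
# A bot that is not named falls under the general "*" rule.
print(may_collect(ROBOTS_TXT, "OtherBot",
                  "https://example.com/portfolio/sunset.jpg"))
```

Because this approach relies on the scraper choosing to respect the request, it also shows why some campaigners would prefer the choice to be protected by law.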
