Generative artificial intelligence is, symbolically, about to turn one. That is, if we take November 30, 2022, the day ChatGPT was launched, as its birth date.
In the run-up to this anniversary, AI is hardly going through a relaxing period. Just think of the revolutions and counter-revolutions that hit OpenAI, the company behind ChatGPT, in the space of a few days: Sam Altman's dismissal, his apparent hiring at Microsoft, a fiery letter signed by a large part of the workforce, and Altman's sudden reinstatement at the helm of the company.
And now comes news that the Italian Privacy Guarantor has opened an investigation to clarify how data is collected to train AI algorithms. What is it about?
Privacy Commissioner: investigation into the collection of personal data to train AI
The news appeared in a press release published on Wednesday, November 22 on the website of the Italian Guarantor for the protection of personal data, better known as the Privacy Guarantor.
The press release announces the opening of an investigation into how personal data is collected for the purpose of training artificial intelligence algorithms. More precisely, “the initiative is aimed at verifying the adoption of security measures by public and private sites.”
Web scraping and security
In short, the Privacy Guarantor’s investigation aims to clarify what public and private websites are doing to prevent the massive collection of personal data that third parties use to train AI algorithms.
This is the phenomenon known as web scraping, an activity through which, as the Guarantor’s press release puts it, various AI platforms “collect, for different uses, enormous quantities of data, including personal ones, published for specific purposes (news, administrative transparency, etc.) within websites managed by public and private entities.”
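One concrete security measure a site can adopt against such scraping is a robots.txt policy that refuses known AI crawlers, which a compliant scraper checks before fetching any page. A minimal sketch in Python using the standard library (the robots.txt content and the crawler name `GPTBot` are illustrative examples, not taken from the Guarantor’s press release):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a news site might publish to refuse an AI crawler
# while still admitting ordinary visitors and search engines.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant crawler consults the policy before fetching each URL.
url = "https://example.com/news/article"
print(parser.can_fetch("GPTBot", url))       # → False: the AI crawler is refused
print(parser.can_fetch("SomeBrowser", url))  # → True: everyone else is allowed
```

This is only a declarative measure: it works against crawlers that choose to honor robots.txt, which is why the Guarantor is asking sites what additional, enforceable safeguards they have adopted.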
The Guarantor’s appeal
The last part of the text is an appeal to organizations and figures with a potential stake in the topic (trade associations, consumer associations, experts and representatives of the academic world) “to send their comments and contributions on the security measures adopted, and that can be adopted, against the massive collection of personal data for the purposes of algorithm training.”
The press release also indicates an email address to write to ([email protected]) and a deadline of sixty days from its publication.
Following the investigation, the Guarantor “reserves the right to take the necessary measures, even on an urgent basis.”
The Guarantor’s investigation and the DSA
The Privacy Guarantor’s investigation therefore aims to establish whether user data is being collected for AI training on a massive scale and without the consent of the data subjects.
That would violate the Digital Services Act (DSA), in force for a year but fully operational only since August, which among other things protects the privacy of online users.
The Privacy Guarantor and ChatGPT
A precedent that many readers will remember dates to late March and early April, when the Guarantor suspended ChatGPT in Italy (in reality a self-suspension following the Guarantor’s complaint) for two reasons: the lack of clear information on how user data was collected (as well as of a solid legal basis for the massive collection of that data), and the lack of a filter for verifying users’ age.
Consumer associations weigh in
Consumer associations applaud the Guarantor’s initiative.
Massimo Dona, president of the National Consumers Association, after expressing his satisfaction, hopes that “alongside this fact-finding investigation on the massive collection of personal data for the purposes of algorithm training, the Guarantor will also intervene on the use that is made of this data once collected.”
Codacons broadens the discussion, warning against “AI systems that could be used in an intrusive and discriminatory way, with unacceptable risks for the fundamental rights of citizens, for health, safety, the environment, democracy and the rule of law, such as, for example, the manipulation of the behavior of people or vulnerable groups, or social scoring, the classification of people based on their behavior or characteristics.”