LOCATION: No one knows, and very few can find out.
OpenAI is a commercial software developer - not open source, and definitely not free. The company’s primary product, ChatGPT, is a search engine with many ‘bots’ that ‘scrape’ the world wide web for material, which it stores and uses to build a database and an ‘analytical model.’ It then uses that model to respond to queries and generate (apparently) original text.
Some say the original (scraped) data is deleted after the model is built. This is a dubious claim - every skeptical cell in one’s being should scream, ‘That can’t be right!’ This is because the cost of reacquiring that data would be far too great - and the data would have to be reacquired if the next generation of the model is to have a full dataset as its foundation.
OpenAI recently made a written submission to the United Kingdom’s House of Lords Communications and Digital Select Committee that said the following: “Because copyright today covers virtually every sort of human expression - including blogposts, photographs, forum posts, scraps of software code, and government documents - it would be impossible to train today’s leading artificial intelligence [AI] models without using copyrighted materials.”
In other words, unless ChatGPT collects, stores, and processes the intellectual property generated by individuals and companies, it cannot produce derivative works. The ‘generative AI community’ does not like it, however, when this is spelled out so clearly. They claim that “it is not derivative.”
OpenAI had more to say on the subject, too: “Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.”
That seems to imply the needs of a nebulous and undefined group it describes as “citizens” are actually needs - which, according t
The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. Such content from The Yuan may be shared and reprinted but must clearly identify The Yuan as its original source. Content from a third-party copyright holder identified in the copyright notice contained in such third party’s content appearing in The Yuan must likewise be clearly labeled as such.