[This piece was sent to The Assam Tribune on 7th January 2024 with a request for publication. Apparently, they could not find it worth publishing]
A correction: It seems the article was not so bad after all. The Tribune published it on 7th March 2024. I am sharing the jpg file at the bottom of this write-up.
We have seen numerous discussions about the enormous possibilities that artificial intelligence (AI) has opened up in front of us – both desirable and not so desirable. There have been voices warning of the possible replacement of humans in various fields, which may cause large-scale job loss, voices calling for regulation of the use of AI, and praise for the possible revolutionary applications of AI in health, agriculture, industry, defense, education, entertainment and other sectors for the benefit of mankind. However, a lawsuit by The New York Times against Microsoft and OpenAI (the company that pioneered text-generative AI with ChatGPT), filed just after Christmas, has opened up a whole new front for discussion. The lawsuit is the culmination of the failure of more than seven months of negotiations over the terms and conditions for licensing the newspaper's content to the AI companies.
The two companies allegedly used millions of the newspaper's articles to train their Large Language Model (LLM) text-generating AI applications – ChatGPT and Copilot – which now produce similar content in the same style as the newspaper, thereby competing with the newspaper itself as a source of reliable information. The New York Times is one of the early integrators of print and online journalism after the proliferation of the internet and the decline of newspaper circulation in the West. It had invested heavily to build a subscription-based model of online journalism, which has been largely successful so far. The lawsuit alleges copyright violation through the unauthorized use of the unique content the newspaper generates, thereby causing it significant loss.
To fathom the issue, we need to understand how generative AI works with LLMs, and go back to the year 2012, when Google used 16,000 computer processors to analyze 10 million digital images found on YouTube and identify a cat! The neural network running on those 16,000 computers did identify the cat, with about 16 per cent accuracy. Work on creating machines that could act like the human brain had, in fact, started in the 1950s, with systems of connected computers that would spend days, weeks or even months identifying patterns in large amounts of digital data. For example, after analyzing names and addresses scribbled on hundreds of envelopes, such a system could read handwritten text.
By 2012, neural networks had been trained to recognize common objects like flowers and cars. The same basic technology of learning by analyzing patterns was used by Google to identify the cat and is now used by various AI applications to generate text, images or sound. A Large Language Model, or LLM, is essentially a neural network that learns by analyzing an enormous amount of digital text. Once trained on that text, it can produce text on its own. In 2015, Elon Musk helped fund a company called OpenAI, which in 2020 released GPT-3, the first AI application of its kind capable of generating fluent text. Today, so many companies, applications and search engines are using generative AI that it is hard to believe it has been only three years since this technology came out of the lab.
While the debate goes on over whether AI will elevate the world or destroy it, let us face another question – what will happen to journalism if AI, instead of human beings, continues to generate both online and print text? It is almost impossible to differentiate between actual truth and ‘truth’ generated by AI. Our older generation keeps saying that Google is not always right, and they are correct to some extent. The truth or fact generated by AI is really the truth or fact it was trained on. What will happen if the text used to train the LLM is biased or fake in the first place? We have already experienced the menace of fake news, fake videos and doctored photos on social media and have seen it spill over into both print and electronic media. Despite the onslaught, print media has been able to somewhat hold on to its credibility. If AI trained on ‘fake truth’ is used to generate textual or visual content for journalism, the implications are horrifying to even imagine.
AI is not so intelligent if we go by the true meaning of intelligence – the ability to differentiate between what is right and what is correct in a particular situation in time or space. For example, imagine a law enforcer who finds a ninety-year-old, visibly deranged person entering a restricted area where there are instructions to shoot any violator. AI would shoot the person, but an intelligent human being would perhaps use all their faculties to take a humane decision. Journalism, too, is a world where the intelligence to differentiate between what is right and what is correct is at a premium.
How does AI transform communication itself? Generative AI – whether for text or for images – helps people express themselves better. The use of AI for image generation would enable one to express oneself more vibrantly and imaginatively than one could previously. Tools like Midjourney can create an entire imagined landscape – Marine Drive in Mumbai, a Yeti in the Himalayas, an adorable portrait of a dog and a cat together, or a scene in an American town from the 1950s! Just imagine how easy and fruitful communication would be if one could create a picture of whatever one imagines. As for generating text for communication, we have already seen that AI tools can write better communicative text than most humans.
The legal battle between The New York Times and OpenAI may prove to be the proverbial tip of the iceberg, as several other publishers – including Gannett, the largest U.S. newspaper company; Rupert Murdoch’s News Corp; The Daily Beast; and the magazine publisher Dotdash Meredith – are also trying to negotiate with AI companies to make the latter pay for using the former’s content.