On ChatGPT and Free Content

A new article published this week examines how large language models (LLMs), such as the one behind ChatGPT, influence the creation of new free content on the web. To do so, the authors tracked activity on a well-known open website, Stack Overflow, over time. After the introduction of ChatGPT, the amount of new content dropped by some 16%, and after six months it was down by 26%.

This is not accidental. In case you do not know it, Stack Overflow is the place where people ask specific coding-related questions and get very focused answers from the crowd's hive mind. However, ChatGPT can do pretty much the same. After all, code is a language, and a fairly regular one, so it is a lot easier to learn than human speech.

One concern, however, is that we are going to see less free content and rely more on commercial companies. In the long run this might affect large language models too, since they rely on publicly available content for training.

Internet meme: mock O'Reilly book cover (Copying and Pasting from Stack Overflow)
Internet Meme: Will this joke be obsolete soon?

That said, I did try this week to use ChatGPT to write a simple piece of code. The good: It worked. ChatGPT produced clean, readable code. The bad: It took some twenty-odd rounds to generate the code I wanted and debug it. Wording is crucial here, and I suspect that it is a learned skill. The main benefit of the chat over human interaction was that it was very patient and polite. The chat doesn't scold you for asking the same question again, does not rebuke misunderstandings, and always provides a courteous response. In the long run, it would be nice to see such tools incorporated directly into the IDE.

For the original article see:

del Rio-Chanona, Maria, Nadzeya Laurentsyeva, and Johannes Wachs. "Are Large Language Models a Threat to Digital Public Goods? Evidence from Activity on Stack Overflow." arXiv preprint arXiv:2307.07367 (2023).

