The world’s people increasingly rely on large language model (LLM) chatbots such as ChatGPT or Copilot to receive and organize information. There is growing concern over how these chatbots often make mistakes or provide made-up or false information (which data scientists have taken to calling “hallucinations”).
An LLM algorithm functions by scanning enormous volumes of text to learn which words and sentences frequently appear near one another, and in what context. LLMs can be adapted to perform a wide range of tasks across different domains, and they produce forms of synthetic learning.
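To make the idea of learning which words appear near one another concrete, here is a deliberately tiny sketch in Python. It is an illustration only, not how ChatGPT or Copilot work: real LLMs are large neural networks trained on far more text, whereas this toy simply counts which word follows which. The function names and the sample sentence are invented for the example.

```python
from collections import Counter, defaultdict


def count_following_words(text):
    """Count, for each word, which words come directly after it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    # Pair every word with its immediate successor and tally the pair.
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical sample text, chosen only for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = count_following_words(corpus)
print(most_likely_next(model, "the"))
```

A counter like this returns whichever continuation was most frequent in the text it scanned, with no check on whether the result is true, which hints at why the quality of the underlying data matters so much.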
These technologies are said to hallucinate because they are built on problematic data sets or incorrect assumptions. Hence, the growing popularity of LLMs has raised concerns about their accuracy: the information chatbots provide may be tainted by errors or fabrications that stem from flawed data or from incorrect propositions made by the model.
The questionable results produced by chatbots have led to growing disquiet among users, developers, and policy makers. The author argues that policy makers need to develop a systemic approach to address these concerns.
The current piecemeal approach does not reflect the complexity of LLMs or the magnitude of the data upon which they are based. The author therefore recommends incentivizing greater transparency and accountability around data-set development.
A new research paper provides one of the first analyses of the many issues in data collection for LLM chatbots and how governments are dealing with these e-challenges. In the paper, the author argues that these chatbots are creating disquiet because of how they are designed. News and press coverage of AI focuses heavily on privacy and intellectual property rights (IPR) as they relate to chatbots.
The paper, titled “Data Disquiet: Concerns about the Governance of Data for Generative AI”, reviews the process and the specific problems in design. It was produced by the Centre for International Governance Innovation.
Looking at the paper, Libza Mannan, Communications Advisor at the Centre for International Governance Innovation (CIGI), a think tank based in Waterloo, Canada, that works on issues at the intersection of technology and international governance, tells Digital Journal that the paper is of significance to the business and academic community.
Mannan points out that the paper “is one of the first analyses of the many issues in data collection for LLM Chatbots and how governments are dealing with these e-challenges.”
Mannan says: “The author argues that these chatbots are creating disquiet because of how they are designed. In AI, there is a significant focus on privacy or IPR related to chatbots by the news and press. This paper reviews the process and the specific problems in design.”
The paper argues that policy makers need to develop a systemic approach to address these concerns.
Source: https://www.digitaljournal.com/tech-science/new-warning-about-the-false-data-produced-by-large-language-models/article#ixzz8XSCuQJjk