{"id":10939,"date":"2023-08-24T18:04:48","date_gmt":"2023-08-24T18:04:48","guid":{"rendered":"https:\/\/nft.runfyers.com\/index.php\/2023\/08\/24\/ai-language-models-are-nothing-without-humans-sociologist-explains\/"},"modified":"2023-08-24T18:04:48","modified_gmt":"2023-08-24T18:04:48","slug":"ai-language-models-are-nothing-without-humans-sociologist-explains","status":"publish","type":"post","link":"https:\/\/nft.runfyers.com\/index.php\/2023\/08\/24\/ai-language-models-are-nothing-without-humans-sociologist-explains\/","title":{"rendered":"AI Language Models Are Nothing Without Humans, Sociologist Explains"},"content":{"rendered":"<p><\/p>\n<div>\n<p>The media frenzy surrounding ChatGPT and other large language model artificial intelligence systems spans a range of themes, from the prosaic \u2013\u00a0<a href=\"https:\/\/blogs.microsoft.com\/blog\/2023\/02\/07\/reinventing-search-with-a-new-ai-powered-microsoft-bing-and-edge-your-copilot-for-the-web\/\" target=\"_blank\" rel=\"noopener\">large language models could replace conventional web search<\/a>\u00a0\u2013 to the concerning \u2013 AI will eliminate many jobs \u2013 and the overwrought \u2013 AI poses an extinction-level threat to humanity. All of these themes have a common denominator: large language models herald artificial intelligence that will supersede humanity.<\/p>\n<p>But large language models, for all their complexity, are actually really dumb. And despite the name \u201cartificial intelligence,\u201d they\u2019re completely dependent on human knowledge and labor. They can\u2019t reliably generate new knowledge, of course, but there\u2019s more to it than that.<\/p>\n<p>ChatGPT can\u2019t learn, improve or even stay up to date without humans giving it new content and telling it how to interpret that content, not to mention programming the model and building, maintaining and powering its hardware. 
To understand why, you first have to understand how ChatGPT and similar models work, and the role humans play in making them work.<\/p>\n<h2 class=\"wp-block-heading\" id=\"h-how-chatgpt-works\">How ChatGPT works<\/h2>\n<p>Large language models like ChatGPT work, broadly, by\u00a0<a href=\"https:\/\/writings.stephenwolfram.com\/2023\/02\/what-is-chatgpt-doing-and-why-does-it-work\/\" target=\"_blank\" rel=\"noopener\">predicting what characters, words and sentences<\/a>\u00a0should follow one another in sequence based on training data sets. In the case of ChatGPT, the training data set contains immense quantities of public text scraped from the internet. ChatGPT works by statistics, not by understanding words.<\/p>\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\">\n<p>\n<iframe loading=\"lazy\" title=\"How ChatGPT Works Technically | ChatGPT Architecture\" width=\"696\" height=\"392\" src=\"https:\/\/www.youtube.com\/embed\/bSvTVREwSNw?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen><\/iframe>\n<\/p>\n<\/figure>\n<p>Imagine I trained a language model on the following set of sentences:<\/p>\n<p>Bears are large, furry animals. Bears have claws. Bears are secretly robots. Bears have noses. Bears are secretly robots. Bears sometimes eat fish. Bears are secretly robots.<\/p>\n<p>The model would be more inclined to tell me that bears are secretly robots than anything else, because that sequence of words appears most frequently in its training data set. This is obviously a problem for models trained on fallible and inconsistent data sets \u2013 which is all of them, even academic literature.<\/p>\n<p>People write lots of different things about quantum physics, Joe Biden, healthy eating or the Jan. 6 insurrection, some more valid than others. 
How is the model supposed to know what to say about something, when people say lots of different things?<\/p>\n<h2 class=\"wp-block-heading\" id=\"h-the-need-for-feedback\">The need for feedback<\/h2>\n<p>This is where feedback comes in. If you use ChatGPT, you\u2019ll notice that you have the option to rate responses as good or bad. If you rate them as bad, you\u2019ll be asked to provide an example of what a good answer would contain. ChatGPT and other large language models learn what answers, what predicted sequences of text, are good and bad through feedback from users, the development team and contractors hired to label the output.<\/p>\n<p>ChatGPT cannot compare, analyze or evaluate arguments or information on its own. It can only generate sequences of text similar to those that other people have used when comparing, analyzing or evaluating, preferring ones similar to those it has been told are good answers in the past.<\/p>\n<p>Thus, when the model gives you a good answer, it\u2019s drawing on a large amount of human labor that\u2019s already gone into telling it what is and isn\u2019t a good answer. There are many, many human workers hidden behind the screen, and they will always be needed if the model is to continue improving or to expand its content coverage.<\/p>\n<p>A recent investigation published by journalists in Time magazine revealed that\u00a0<a href=\"https:\/\/time.com\/6247678\/openai-chatgpt-kenya-workers\/\" target=\"_blank\" rel=\"noopener\">hundreds of Kenyan workers spent thousands of hours<\/a>\u00a0reading and labeling racist, sexist and disturbing writing, including graphic descriptions of sexual violence, from the darkest depths of the internet to teach ChatGPT not to copy such content. 
They were paid no more than US$2 an hour, and many understandably reported experiencing psychological distress due to this work.<\/p>\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\">\n<p>\n<iframe loading=\"lazy\" title=\"Doing Grueling Work for an AI: Data Labeling\" width=\"696\" height=\"392\" src=\"https:\/\/www.youtube.com\/embed\/ug_p2wHhla0?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen><\/iframe>\n<\/p>\n<\/figure>\n<h2 class=\"wp-block-heading\" id=\"h-what-chatgpt-can-t-do\">What ChatGPT can\u2019t do<\/h2>\n<p>The importance of feedback can be seen directly in ChatGPT\u2019s tendency to \u201c<a href=\"https:\/\/doi.org\/10.1145\/3571730\" target=\"_blank\" rel=\"noopener\">hallucinate<\/a>\u201d; that is, confidently provide inaccurate answers. ChatGPT can\u2019t give good answers on a topic without training, even if good information about that topic is widely available on the internet. You can try this out yourself by asking ChatGPT about more and less obscure things. I\u2019ve found it particularly effective to ask ChatGPT to summarize the plots of different fictional works because, it seems, the model has been more rigorously trained on nonfiction than fiction.<\/p>\n<p>In my own testing, ChatGPT summarized the plot of J.R.R. Tolkien\u2019s \u201c<a href=\"https:\/\/www.harpercollins.com\/products\/the-lord-of-the-rings-jrr-tolkien?variant=39999349817378\" target=\"_blank\" rel=\"noopener\">The Lord of the Rings<\/a>,\u201d a very famous novel, with only a few mistakes. But its summaries of Gilbert and Sullivan\u2019s \u201c<a href=\"https:\/\/www.eno.org\/operas\/the-pirates-of-penzance\/\" target=\"_blank\" rel=\"noopener\">The Pirates of Penzance<\/a>\u201d and of Ursula K. 
Le Guin\u2019s \u201c<a href=\"https:\/\/www.penguinrandomhouse.ca\/books\/538943\/the-left-hand-of-darkness-by-ursula-k-le-guin\/9780143111597\" target=\"_blank\" rel=\"noopener\">The Left Hand of Darkness<\/a>\u201d \u2013 both slightly more niche but far from obscure \u2013 come close to playing\u00a0<a href=\"https:\/\/www.madlibs.com\/\" target=\"_blank\" rel=\"noopener\">Mad Libs<\/a>\u00a0with the character and place names. It doesn\u2019t matter how good these works\u2019 respective Wikipedia pages are. The model needs feedback, not just content.<\/p>\n<p>Because large language models don\u2019t actually understand or evaluate information, they depend on humans to do it for them. They are parasitic on human knowledge and labor. When new sources are added into their training data sets, they need new training on whether and how to build sentences based on those sources.<\/p>\n<p>They can\u2019t evaluate whether news reports are accurate or not. They can\u2019t assess arguments or weigh trade-offs. They can\u2019t even read an encyclopedia page and only make statements consistent with it, or accurately summarize the plot of a movie. They rely on human beings to do all these things for them.<\/p>\n<p>Then they paraphrase and remix what humans have said, and rely on yet more human beings to tell them whether they\u2019ve paraphrased and remixed well. 
If the common wisdom on some topic changes \u2013 for example,\u00a0<a href=\"https:\/\/doi.org\/10.1093\/eurheartj\/ehaa586\" target=\"_blank\" rel=\"noopener\">whether salt<\/a>\u00a0is\u00a0<a href=\"https:\/\/doi.org\/10.1007\/s13668-021-00383-z\" target=\"_blank\" rel=\"noopener\">bad for your heart<\/a>\u00a0or\u00a0<a href=\"https:\/\/doi.org\/10.1002\/ijc.32211\" target=\"_blank\" rel=\"noopener\">whether early breast cancer screenings are useful<\/a>\u00a0\u2013 they will need to be extensively retrained to incorporate the new consensus.<\/p>\n<h2 class=\"wp-block-heading\" id=\"h-many-people-behind-the-curtain\">Many people behind the curtain<\/h2>\n<p>In short, far from being the harbingers of totally independent AI, large language models illustrate the total dependence of many AI systems, not only on their designers and maintainers but on their users. So if ChatGPT gives you a good or useful answer about something, remember to thank the thousands or millions of hidden people who wrote the words it crunched and who taught it what were good and bad answers.<\/p>\n<p>Far from being an autonomous superintelligence, ChatGPT is, like all technologies, nothing without us.<\/p>\n<p><em>This article is republished from\u00a0<\/em><a href=\"https:\/\/theconversation.com\/chatgpt-and-other-language-ais-are-nothing-without-humans-a-sociologist-explains-how-countless-hidden-people-make-the-magic-211658\" target=\"_blank\" rel=\"noopener\">The Conversation<\/a><em>\u00a0under a Creative Commons license. 
Read the <a href=\"https:\/\/theconversation.com\/chatgpt-and-other-language-ais-are-nothing-without-humans-a-sociologist-explains-how-countless-hidden-people-make-the-magic-211658\" target=\"_blank\" rel=\"noopener\">original article<\/a> by <a href=\"https:\/\/theconversation.com\/profiles\/john-p-nelson-1458177\" target=\"_blank\" rel=\"noopener\">John P. Nelson<\/a>, <\/em>Postdoctoral Research Fellow in Ethics and Societal Implications of Artificial Intelligence, Georgia Institute of Technology<\/p>\n<\/p><\/div>\n<p><a href=\"https:\/\/nftnow.com\/ai\/ai-language-models-are-nothing-without-humans-sociologist-explains\/\" target=\"_blank\" rel=\"noopener\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The media frenzy surrounding ChatGPT and other large language model artificial intelligence systems spans a range of themes, from the prosaic \u2013\u00a0large language models could replace conventional web search\u00a0\u2013 to the concerning \u2013 AI will eliminate many jobs \u2013 and the overwrought \u2013 AI poses an extinction-level threat to humanity. 
All of these themes have [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":10942,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[10],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/nftnow.com\/wp-content\/uploads\/2023\/08\/jpg-12.jpg","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/10939"}],"collection":[{"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/comments?post=10939"}],"version-history":[{"count":0,"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/10939\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/media\/10942"}],"wp:attachment":[{"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/media?parent=10939"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/categories?post=10939"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nft.runfyers.com\/index.php\/wp-json\/wp\/v2\/tags?post=10939"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}