In summary: We should pay more attention and be a little more concerned, because honestly I didn't believe it would reach this point yet. Not because of any "AI singularity" nonsense but because it seems like it is still learning and getting better.
  • #1
I’ll start with the simple fact: ChatGPT is not a reliable answerer of questions.
To try to explain why from scratch would be a heavy lift, but fortunately, Stephen Wolfram has already done the heavy lifting for us in his article, “What is ChatGPT Doing… and Why Does It Work?” [1] In a PF thread discussing this article, I tried to summarize as briefly as I could the key message of Wolfram’s article. Here is what I said in my post there [2]:
ChatGPT does not make use of the meanings of words at all. All it is doing is generating text word by word based on relative word frequencies in its training data. It is using correlations between words, but that is not the same as correlations in the underlying information that the words represent (much less causation). ChatGPT literally has no idea that the words it strings together represent anything.
In other words, ChatGPT is not designed to actually answer questions or provide information. In fact, it is explicitly designed not to do those...
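To make "generating text word by word based on relative word frequencies" concrete, here is a toy sketch. It is a plain bigram counter, which is far simpler than the neural network ChatGPT actually uses, so treat it as an illustration of the idea only. The corpus and the function names `train_bigram` and `generate` are invented for this example.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, length, rng):
    """Emit words one at a time, sampling each in proportion to its count."""
    out = [start]
    for _ in range(length - 1):
        candidates = follows.get(out[-1])
        if not candidates:  # no observed successor: stop early
            break
        words = list(candidates)
        weights = [candidates[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(out)

# Hypothetical miniature "training data", invented for this sketch.
corpus = "the cat sat on the mat . the dog sat on the rug ."
model = train_bigram(corpus)
print(generate(model, "the", 8, random.Random(0)))
```

The output is always locally plausible (every adjacent pair of words occurred in the corpus), yet nothing in the model refers to cats, dogs, or rugs; it only stores counts.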

 
  • Like
Likes sbrothy, Math100, DrClaude and 5 others
  • #2
Call it what it is, "Artificial 'Williams Syndrome.'" https://www.google.com/search?q=wil...99i465i512.19081j1j7&sourceid=chrome&ie=UTF-8

..., pre-politically correct characteristics included "often precocious vocabulary with no apparent 'real understanding/ability' for use/application/reasoning." That is my recollection from Googling ten-fifteen years ago; ymmv.

This is
https://www.physicsforums.com/threa...an-appropriate-source-for-discussion.1053525/
another/one more case; some wiki/google sources lack "shelf life."
 
  • #3
How do we know at what point it "knows" something? There are non-trivial philosophical questions here... These networks are getting so vast and their training so advanced that I can see someone eventually arguing they have somehow formed a decent representation of what things "are" inside them.

Of course ChatGPT is not reliable, but honestly I was surprised at some of the things that it can do. I was really surprised when I fed it some relatively long and complicated code and asked what it did. It was able to parse it rather accurately, suggest what problem it was supposed to solve, and then suggest specific optimizations. And now it is said that GPT-4 significantly improves on it. It's pretty impressive, and somewhat disconcerting given that people always look for the worst possible way to use something first.
 
  • Like
Likes Anixx and PeroK
  • #4
AndreasC said:
How do we know at what point it "knows" something? There are non-trivial philosophical questions here
Perhaps, but they are irrelevant to this article. The article is not about an abstract philosophical concept of "knowledge". It is about what ChatGPT is and is not actually doing when it emits text in response to a prompt.

AndreasC said:
I can see someone eventually arguing they have somehow formed a decent representation of what things "are" inside them
Not as long as there are no semantic connections between the network and the world. No entity forms "representations" of actual things just by looking at relative word frequencies in texts. There has to be two-way interaction with the actual world. That's how, for example, we humans form our mental representations of things. We interact with them and learn how they work.
 
  • Like
Likes physicsworks, jbergman and pbuk
  • #5
PeterDonis said:
Perhaps, but they are irrelevant to this article. The article is not about an abstract philosophical concept of "knowledge". It is about what ChatGPT is and is not actually doing when it emits text in response to a prompt.

Not as long as there are no semantic connections between the network and the world. No entity forms "representations" of actual things just by looking at relative word frequencies in texts. There has to be two-way interaction with the actual world. That's how, for example, we humans form our mental representations of things. We interact with them and learn how they work.
We definitely learn about lots of things by just reading about them...

I think lots of people don't give enough credit to what it does. It can already give accurate answers about a wide range of questions, pass tests etc and, importantly, answer new problems it has not been specifically trained on. I always thought somebody knows something if they can not only recall the facts, but also apply them in new contexts.

Of course you can argue that it doesn't really know things because, well, it doesn't have a consciousness, and it doesn't reason or learn in the exact same sense that people do. But imo this is not related much to whether or not it is reliable. It is NOT reliable, but it may well become significantly more reliable. Allegedly, GPT-4 already is much more reliable. In a few years, I expect that it will be no more unreliable than asking a human expert (who is, of course, not completely reliable either). At that point, would you still say it is unreliable because it doesn't really know, or that it now knows?

We should pay more attention and be a little more concerned, because honestly I didn't believe it would reach this point yet. Not because of any "AI singularity" nonsense but because it may very well affect the way society views and uses knowledge in radical ways. Plus because it has a sizeable environmental footprint.
 
  • Like
Likes dsaun777, Anixx and PeroK
  • #6
Ok, I think I should probably qualify the "as reliable as a human expert in a few years" a bit, because I think stated like that it is a bit too much. I meant to say as reliable when it comes to factual recollection that involves only a little bit (but still a non-trivial amount) of actual reasoning.
 
  • #7
In my view, the right question is not why ChatGPT is not reliable. Given the general principles of how it works, the right question is: Why is it more reliable than one would expect? I think even its creators were surprised at how good it was.
 
  • Like
Likes binbagsss, PeroK, mattt and 5 others
  • #8
ChatGPT, like any AI language model, has certain limitations that can affect its reliability in certain situations. Here are some reasons why ChatGPT may not always be considered reliable:
  • Lack of real-time information
  • Dependence on training data
  • Inability to verify sources
  • Limited context understanding
  • Biased and offensive content

It's important to approach AI language models like ChatGPT with a critical mindset and to independently verify information obtained from such models when accuracy is crucial. While ChatGPT can be a valuable tool for generating ideas, providing general information, or engaging in casual conversation, it's always advisable to cross-reference and fact-check important or sensitive information from reliable sources.
 
  • Like
Likes Math100, AndreasC and Greg Bernhardt
  • #9
Demystifier said:
Given the general principles of how it works, the right question is: Why is it more reliable than one would expect?
I would push that a bit further: if that thing (working as-is) looks so reliable, almost in a human way, then how many people might live off the same principles? Confidence tricking through most communication?
What about our performance?
 
  • Like
Likes PeterDonis, Demystifier and AndreasC
  • #10
AndreasC said:
We definitely learn about lots of things by just reading about them...
That's because our minds have semantic connections between words and things in the world. When we read words, we make use of those connections--in other words, we know that the words have meanings, and what those meanings are. If we get the meanings of words wrong, we "learn" things that are wrong.

ChatGPT has none of this. It has no connections between words and anything else. It doesn't even have the concept of there being connections between words and anything else. The only information it uses is relative word frequencies in its training data.

AndreasC said:
It can already give accurate answers about a wide range of questions
No, it can't. It can get lucky sometimes and happen to give an "answer" that happens to be accurate, but, as you will quickly find out if you start looking, it also happily gives inaccurate answers with the same level of confidence. That's because it's not designed to give accurate answers to questions; that's not what it's for.

AndreasC said:
pass tests
Only because the "tests" are graded so poorly that even the inaccurate but confident-sounding responses that ChatGPT gives "pass" the tests. That is a reflection of the laziness and ignorance of the test graders, not of the knowledge of ChatGPT.

AndreasC said:
answer new problems it has not been specifically trained on
Sure, because it can generate text in response to any prompt whatever. But the responses it gives will have no reliable relationship to reality. Sometimes they might happen to be right, other times they will be wrong, often egregiously wrong. But all of the responses seem just as confident.

AndreasC said:
I always thought somebody knows something if they can not only recall the facts, but also apply them in new contexts.
ChatGPT does not and cannot do these things. What it does do is, as a side effect of its design, produce text that seems, to a naive observer, to be produced by something that does these things. But the illusion is quickly shattered when you start actually checking up on its responses.
 
  • Like
Likes Math100, DrJohn, Dale and 3 others
  • #11
Demystifier said:
Why is it more reliable than one would expect?
Is it? How would one even determine that?
 
  • #12
Rive said:
how many people might live off the same principles? Confidence tricking through most communication?
Yes, I think one way of describing ChatGPT is that it is crudely simulating a human con artist: it produces statements that seem to come from an entity that is knowledgeable, but actually don't.
 
  • Like
Likes Math100, DaveC426913, Motore and 2 others
  • #13
PeterDonis said:
That's because our minds have semantic connections between words and things in the world. When we read words, we make use of those connections--in other words, we know that the words have meanings, and what those meanings are. If we get the meanings of words wrong, we "learn" things that are wrong.

ChatGPT has none of this. It has no connections between words and anything else. It doesn't even have the concept of there being connections between words and anything else. The only information it uses is relative word frequencies in its training data.

No, it can't. It can get lucky sometimes and happen to give an "answer" that happens to be accurate, but, as you will quickly find out if you start looking, it also happily gives inaccurate answers with the same level of confidence. That's because it's not designed to give accurate answers to questions; that's not what it's for.

Only because the "tests" are graded so poorly that even the inaccurate but confident-sounding responses that ChatGPT gives "pass" the tests. That is a reflection of the laziness and ignorance of the test graders, not of the knowledge of ChatGPT.

Sure, because it can generate text in response to any prompt whatever. But the responses it gives will have no reliable relationship to reality. Sometimes they might happen to be right, other times they will be wrong, often egregiously wrong. But all of the responses seem just as confident.

ChatGPT does not and cannot do these things. What it does do is, as a side effect of its design, produce text that seems, to a naive observer, to be produced by something that does these things. But the illusion is quickly shattered when you start actually checking up on its responses.
The semantic connections you are talking about are connections between sensory inputs and pre-existing structure inside our brains. You're just reducing what it's doing to the bare basics of its mechanics, but its impressive behavior comes about because of how massively complex the structure is.

I don't know if you've tried it out, but it doesn't just "get lucky". Imagine a student passing one test after another, would you take someone telling you they only "got lucky" seriously, and if yes, how many tests would it take? Plus, it can successfully apply itself to problems it never directly encountered before. Yes, not reliably, but enough that it's beyond "getting lucky".

You talk about it like you haven't actually tried it out. It's not at all the same as previous chatbots, it has really impressive capabilities. It can give you correct answers to unambiguous questions that are non-trivial and that it has not specifically encountered before in its training. And it can do that a lot, repeatably. Nothing to do with how confident it sounds, I am talking about unambiguously correct answers.

Again, I'm not saying it is reliable, but you are seriously downplaying its capabilities if you think that's all it does and I encourage you to try it out for yourself. Especially when it comes to programming, it is incredible. You can put in it complicated code that is undocumented, and it can explain to you what the code does exactly, what problem it probably was intended for, and how to improve it, and it works a lot of the time, much more frequently than "luck".

If all you want to say is that it isn't right all the time, then yeah, that's true. It's very, very frequently wrong. But that has little to do with what you are describing. It could (and will) improve significantly on accuracy, using the same mechanism. And practically, what you are saying doesn't matter. A database doesn't "know" what something is either in your sense of the word, neither does a web crawler, or anything like that. That doesn't make them unreliable. Neither is a human reliable because they "know" something (again going by your definition).

ChatGPT is unreliable because we observe it to be unreliable. That requires no explanation. What does require explanation is why, as @Demystifier said, it is so much more reliable (especially at non trivial, "reasoning" type problems) than you would naively expect.
 
  • Like
Likes Demystifier
  • #14
PeterDonis said:
Is it? How would one even determine that?
Try it. Feed it questions which have unambiguous answers. You'll see that even though sometimes it generates nonsense, very, VERY frequently it gives right answers. Amusingly, one thing it does struggle with a bit is arithmetic. But it is getting better. Seriously though, try it.
 
  • Skeptical
Likes Motore and weirdoguy
  • #15
AndreasC said:
The semantic connections you are talking about are connections between sensory inputs and pre-existing structure inside our brains.
Not necessarily pre-existing. We build structures in our brains to represent things in the world as a result of our interactions with them. ChatGPT does not. (Nor does ChatGPT have any "pre-existing" structures that are relevant for this.)

AndreasC said:
Imagine a student passing one test after another, would you take someone telling you they only "got lucky" seriously
If the reason they passed was that their graders were lazy and didn't actually check the accuracy of the answers, yes. And that is exactly what has happened in cases where ChatGPT supposedly "passed" tests. If you think graders would never be so lazy, you have led a very sheltered life. It's just a more extreme version of students getting a passing grade on a book report without ever having read the book, and I can vouch for that happening from my own personal experience. :wink:

AndreasC said:
It can give you correct answers to unambiguous questions that are non-trivial and that it has not specifically encountered before in its training. And it can do that a lot, repeatably.
Please produce your evidence for this claim. It is contrary to both the analysis of how ChatGPT actually works, which I discuss in the Insights article, and the statements of many, many people who have used it. Including many posts here at PF where people have given ChatGPT output that is confident-sounding but wrong.

AndreasC said:
ChatGPT is unreliable because we observe it to be unreliable.
Doesn't this contradict your claim quoted above?

AndreasC said:
That requires no explanation.
The fact that it is observed to be unreliable is just a fact, yes. But in previous discussions of ChatGPT here at PF, it became clear to me that many people do not understand how ChatGPT works and so do not understand both that it is unreliable and why it is unreliable. That is why I wrote this article.

AndreasC said:
What does require explanation is why, as @Demystifier said, it is so much more reliable (especially at non trivial, "reasoning" type problems) than you would naively expect.
And I have already responded to @Demystifier that such a claim is meaningless unless you can actually quantify what "you would naively expect" and then compare ChatGPT's actual accuracy to that. Just saying that subjectively it seems more accurate than you would expect is meaningless.
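One way to make "more reliable than you would naively expect" checkable rather than subjective: fix a baseline in advance, grade a set of answers, and see whether the measured accuracy clears the baseline with some margin. The sketch below uses a Wilson score interval for that margin. The baseline value and the graded list are placeholders invented for illustration, not measurements of ChatGPT, and the function names are hypothetical.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if total == 0:
        raise ValueError("need at least one graded answer")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

def beats_baseline(graded, baseline):
    """True if the whole interval lies above the stated baseline accuracy."""
    low, _ = wilson_interval(sum(graded), len(graded))
    return low > baseline

# Placeholder data: True means a human grader marked the answer correct.
# These values are made up for the sketch and say nothing about any model.
graded = [True] * 70 + [False] * 30
naive_expectation = 0.25  # assumed baseline, e.g. guessing among four options

low, high = wilson_interval(sum(graded), len(graded))
print(f"accuracy interval: {low:.3f} to {high:.3f}")
print("clears baseline:", beats_baseline(graded, naive_expectation))
```

The hard part is not the arithmetic but agreeing on the baseline and on careful grading, which is exactly what the disagreement in this thread is about.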
 
  • #16
AndreasC said:
Try it. Feed it questions which have unambiguous answers. You'll see that even though sometimes it generates nonsense, very, VERY frequently it gives right answers.
This does not seem consistent with many posts here at PF by people who have tried ChatGPT and posted the output. The general sense I get from those posts is that ChatGPT was less reliable than they expected--because they did not realize what it is actually doing and not doing. For example, apparently many people expected that when you asked it a factual question about something in its training data, it would go look in its training data to find the answer. But it doesn't, even if the right answer is in its training data. Wolfram's article, referenced in my Insights article, makes all this clear.
 
  • Like
Likes Motore
  • #17
PeterDonis said:
This does not seem consistent with many posts here at PF by people who have tried ChatGPT and posted the output. The general sense I get from those posts is that ChatGPT was less reliable than they expected--because they did not realize what it is actually doing and not doing. For example, apparently many people expected that when you asked it a factual question about something in its training data, it would go look in its training data to find the answer. But it doesn't, even if the right answer is in its training data. Wolfram's article, referenced in my Insights article, makes all this clear.
Have YOU tried it? People often post more when it gets something wrong. For instance, people have given it SAT tests:

https://study.com/test-prep/sat-exam/chatgpt-sat-score-promps-discussion-on-responsible-ai-use.html

Try giving it an SAT test yourself if you don't trust that.
 
  • #18
pintudaso said:
Limited context understanding
That is incorrect. "Limited understanding" implies that there is at least SOME understanding, but ChatGPT has zero understanding of anything.
 
  • #19
I suspect ChatGPT has infiltrated this thread...

Edit: Btw, while I'm not certain of this, here's how I can often tell: it's the lack of focus in the responses. When the content is dumped into the middle of an ongoing conversation, it doesn't acknowledge or respond to the ongoing conversation, it just provides generic information that is often not useful for/connected to the discussion.
 
  • Like
Likes Vanadium 50, Bystander and pbuk
  • #20
Demystifier said:
In my view, the right question is not why ChatGPT is not reliable. Given the general principles of how it works, the right question is: Why is it more reliable than one would expect?

PeterDonis said:
Is it? How would one even determine that?
I think it's just a qualitative feeling, but I feel the same way. When first learning about it, it never occurred to me that it didn't access stored information (either its own or 3rd party) to form its replies*. Now that I know it doesn't, it surprises me that it gets so much right. If it's just doing word association and statistical analysis, I'm surprised that asking about Independence Day doesn't return "On July 4, 1776 Will Smith fought a group of alien invaders before signing the Declaration of Independence in Philadelphia..." It seems that through statistical analysis it is able to build a model that approximates or simulates real information. To me, surprisingly well.

*I don't know the intent of the designers, but I can't imagine this is an oversight. Maybe the intent was always to profit from 3rd parties using it as an interface for their data sources (some of which they are doing it appears)?

But whatever the real goals of the company, I think it is wrong and risky that it's been hyped (whether by the media or the company) to make people think that it is a general purpose AI with real knowledge. As a result, people have their guard down and are likely to mis/over-use it.

I wonder if the developers really believe it qualifies for the title "AI" or that complexity = intelligence?
 
  • Like
Likes Demystifier, PeterDonis and AndreasC
  • #21
Good article. Perhaps worth mentioning that this is the same way that language translation engines like Google Translate work: obviously Google Translate does not understand English or Mandarin, it just has sufficient training data to statistically match phrases. The immediate applications seem to be as a 'word calculator' to generate prose where accuracy is less important or can be easily checked. This is no different from where ML gets used today (and this is just another ML tool). Recommending items to Amazon.com customers or targeting ads on Facebook has a wide margin for error, unlike, say, driving a car.

russ_watters said:
If it's just doing word association and statistical analysis, I'm surprised that asking about Independence Day doesn't return "On July 4, 1776 Will Smith fought a group of alien invaders before signing the Declaration of Independence in Philadelphia..." It seems that through statistical analysis it is able to build a model that approximates or simulates real information. To me, surprisingly well.
Well if only IMDB was used for the training set ;) - My real guess is the volume of data in the training set matters - the right answer for July 4 just shows up with a higher frequency.
 
  • Like
Likes russ_watters
  • #22
russ_watters said:
I think it's just a qualitative feeling, but I feel the same way. When first learning about it, it never occurred to me that it didn't access stored information (either its own or 3rd party) to form its replies*. Now that I know it doesn't, it surprises me that it gets so much right. If it's just doing word association and statistical analysis, I'm surprised that asking about Independence Day doesn't return "On July 4, 1776 Will Smith fought a group of alien invaders before signing the Declaration of Independence in Philadelphia..." It seems that through statistical analysis it is able to build a model that approximates or simulates real information. To me, surprisingly well.

*I don't know the intent of the designers, but I can't imagine this is an oversight. Maybe the intent was always to profit from 3rd parties using it as an interface for their data sources (some of which they are doing it appears)?

But whatever the real goals of the company, I think it is wrong and risky that it's been hyped (whether by the media or the company) to make people think that it is a general purpose AI with real knowledge. As a result, people have their guard down and are likely to mis/over-use it.

I wonder if the developers really believe it qualifies for the title "AI" or that complexity = intelligence?
This isn't even what surprises me that much. You could say that it has learned that the correct date follows these prompts. But the thing is, you can make up an alien planet, tell gpt about it and their customs, and it will answer understanding questions on your text, plus it may even manage to infer when their alien independence day is, given enough clues. It's really impressive.
 
  • Skeptical
Likes weirdoguy
  • #24
I haven't read all the posts in this thread, so perhaps someone already mentioned it, but since I started explaining LLMs like ChatGPT as akin to a stochastic parrot to family and non-tech friends who cared to ask me, I sense my points about the quality of their output get across much more easily. Probably because most people already have an idea of what (some) parrots are capable of language-wise, so I only have to explain a little about statistics and randomness. Of course, the analogy does not work to explain anything about how LLMs work.
 
  • #25
russ_watters said:
Maybe the intent was always to profit from 3rd parties using it as an interface
Ya think?

russ_watters said:
But whatever the real goals of the company, I think it is wrong and risky that it's been hyped (whether by the media or the company) to make people think that it is a general purpose AI with real knowledge.
Unfortunately "people" tend to believe what they want to believe, like @AndreasC here, despite evidence and information to the contrary.

russ_watters said:
I wonder if the developers really believe it qualifies for the title "AI"
Definitely not, but they believe they are headed in the right direction:
https://openai.com/research/overview said:
We believe our research will eventually lead to artificial general intelligence, a system that can solve human-level problems.
 
  • Like
Likes russ_watters
  • #26
AndreasC said:
This isn't even what surprises me that much. You could say that it has learned that the correct date follows these prompts. But the thing is, you can make up an alien planet, tell gpt about it and their customs, and it will answer understanding questions on your text, plus it may even manage to infer when their alien independence day is, given enough clues. It's really impressive.
Impressive how? Doesn't this just tell you that it doesn't know the difference between fiction and reality, and more to the point, there's no way for you to know if it is providing you fictional or real answers*?

*Hint: always fictional.
 
  • Like
Likes PeterDonis and Bystander
  • #27
AndreasC said:
people have given it SAT tests
This just shows that SAT tests can be gamed. Which we already knew anyway.
 
  • Like
Likes Math100, physicsworks, nsaspook and 2 others
  • #28
russ_watters said:
It seems that through statistical analysis it is able to build a model that approximates or simulates real information.
Yes, because while the information that is contained in the relative word frequencies in the training data is extremely sparse compared to the information that a human reader could extract from the same data, it is still not zero information. There is information contained in those word frequencies. For example, "Thomas Jefferson" is going to appear correlated with "July 4, 1776" in the training data to a much greater degree than "Will Smith" does.
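As a rough illustration of how bare co-occurrence counts can carry this kind of information, here is a small sketch that counts how often pairs of phrases show up in the same sentence. The four sentences are written for this example and stand in for training data; the function name `cooccurrence` is hypothetical, and a real model does nothing this literal.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(sentences, phrases):
    """Count how many sentences mention each pair of phrases together."""
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(p for p in phrases if p.lower() in lowered)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Hypothetical stand-in for training text, invented for this sketch.
sentences = [
    "Thomas Jefferson drafted the Declaration adopted on July 4, 1776.",
    "On July 4, 1776 the Declaration drafted by Thomas Jefferson was adopted.",
    "Will Smith starred in the film Independence Day.",
    "Will Smith fights aliens around July 4 in the film.",
]
phrases = ["Thomas Jefferson", "Will Smith", "July 4, 1776"]
counts = cooccurrence(sentences, phrases)

for pair in combinations(sorted(phrases), 2):
    print(pair, counts[pair])
```

With these sentences the date pairs with "Thomas Jefferson" and never with "Will Smith", so a purely statistical text generator would favor the historical continuation without knowing anything about history.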

russ_watters said:
I can't imagine this is an oversight
It's not; it was an intentional feature of the design that only the relative word frequencies in the training data would be used. The designers, from what I can tell, actually believe that piling up enough training data with such word frequencies can lead to actual "knowledge" of subject matter.
 
  • Like
Likes Math100 and russ_watters
  • #29
AndreasC said:
you can make up an alien planet, tell gpt about it and their customs, and it will answer understanding questions on your text, plus it may even manage to infer when their alien independence day is, given enough clues.
Please give a reference: where has this been done?
 
  • #30
AndreasC said:
But the thing is, you can make up an alien planet, tell gpt about it and their customs, and it will answer understanding questions on your text, plus it may even manage to infer when their alien independence day is, given enough clues. It's really impressive.
It's indeed impressive that a limited set of text (training data) can hold so much hidden information through the encoding of the language that even not thoroughly trash responses/reflections can be extracted for extremely weird questions.

But still, ChatGPT is fundamentally a static machine, so it cannot have any 'understanding' of your input. Somewhere I wrote that I expect some accidents/cases to happen in the following decades which retrospectively might be characterized as preliminary consciousness or something like that. Actually, I think these language models might be some preliminary parts of those preliminary cases, but still, just parts. Nothing more than pieces.

Ps.: the closest thing to 'understanding' in this case would be some apparently wired-in linguistic rules like composition of sentences and such. But that's also the static kind of 'understanding'.
I wonder whether it can be tweaked to make linguistic mistakes. How deep is that 'wiring' o0)
 
  • #31
PeterDonis said:
Is it? How would one even determine that?
By asking experts to give their educated guess estimates, say two years ago.
 
  • #32
PeterDonis said:
Please give a reference: where has this been done?
I've just tried it:

Me: What are the customs of the aliens at the planet Xuxor?

ChatGPT:
As an AI language model, I don't have real-time information on specific planets or extraterrestrial civilizations outside of what is available up until my knowledge cutoff in September 2021. Therefore, I'm not aware of any specific customs of aliens on a planet called Xuxor or any other extraterrestrial civilization.

It's important to note that the existence of aliens or their customs on specific planets is purely speculative at this point, as we have not made contact with extraterrestrial life. If and when we do encounter extraterrestrial civilizations, their customs and behaviors would likely be vastly different from our own, given the vastness of the universe and the potential for diverse forms of life.

Until further information becomes available, any details about the customs of aliens on specific planets would be purely fictional or based on imagination, as depicted in science fiction literature, movies, or other creative works.
 
  • Like
Likes PeroK
  • #33
PeterDonis said:
This just shows that SAT tests can be gamed. Which we already knew anyway.
"Gamed" how? I honestly struggle to see what kind of evidence you would have to see to be convinced. Even if it had near 100% accuracy on whatever unambiguous question you threw at it, you would say "well it doesn't REALLY know the answer so it's basically just confusing you into thinking it gives correct answers". Well, apparently it confused Terence Tao into thinking it gives useful answers in mathematics, so pardon me for also being "confused"!

It seems like your argument is completely independent of whether or not it gives correct answers. Because it does! Not all the time of course, not even frequently enough to be reliable at this point, but it is improving. And you are free to check for yourself that this is true. If you want to argue that regardless of delivering accurate answers it is still somehow "cheating" people, I don't know what you expect it to do beyond generating unambiguously correct answers to prompts. If you think it can not give unambiguously correct answers to unambiguous questions, and they only seem to be correct because of its confidence, then you're just wrong and I'm imploring you to try it yourself.

We can't be downplaying it like that because it's unfortunately going to become a significant part of the academic world, and people should recognize what is going on.
 
  • Like
Likes PeroK
  • #34
russ_watters said:
Impressive how? Doesn't this just tell you that it doesn't know the difference between fiction and reality, and more to the point, there's no way for you to know if it is providing you fictional or real answers*?

*Hint: always fictional.
It is impressive because it can (sometimes) generate logical answers from text that it has never encountered before. This goes beyond parroting.
 
  • Like
Likes PeroK
  • #35
SAT and other tests are designed to test humans. One way to test a human's knowledge of a subject is to require them to recall information about that subject and write a summary under time pressure. Recalling information and producing output quickly is something that computers are really good at so it should be less surprising that GPT4 has done well on exams (note this is GPT4 which is not the engine behind ChatGPT which is less sophisticated).

If someone who knew nothing about law took a law exam supported by an army of librarians with instant access to petabytes of relevant data and passed would you say that they had any knowledge or understanding of the law?

AndreasC said:
It seems like your argument is completely independent of whether or not it gives correct answers.
Of course it is: no one is arguing that an LLM is not capable of frequently giving correct answers, or that a very well designed and trained LLM is not capable of giving correct answers within a large domain more frequently than many humans. The argument is that no amount of correct answers is equivalent to knowledge.

AndreasC said:
you're just wrong and I'm imploring you to try it yourself.
It is you that is wrong, and you are making claims for ChatGPT that its makers OpenAI don't make themselves.

AndreasC said:
We can't be downplaying it like that because it's unfortunately going to become a significant part of the academic world, and people should recognize what is going on.
Nobody is downplaying it, but in order to "recognize what is going on" it is necessary to understand what is actually going on. No one can tell anyone else what to do, but if I were you I would stop repeating my own opinions here and take some time to do that.
 
  • Like
  • Skeptical
Likes PeroK, Motore, phinds and 1 other person
