- #71
berkeman
Mentor
An AI-generated basketball court from Facebook today. WTH?
fluidistic said: I didn't know such an AI was prone to infinite loops.
Abstract
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150× higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.
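For anyone wondering what "extractable memorization" looks like in practice, here's a minimal toy sketch of the general idea (my own illustration, not the paper's actual pipeline): sample text from a model, then flag any long span that appears verbatim in a reference corpus. The extracted_spans helper, the stand-in corpus, and the fake model output below are all made up for illustration.

Python:
# Toy illustration of checking a generation for verbatim memorization.
# The "corpus" and "output" below are made-up stand-ins, not real model data.

def extracted_spans(generation: str, corpus: str, min_len: int = 50):
    """Return substrings of `generation` (at least `min_len` chars long)
    that appear verbatim in `corpus`."""
    spans = []
    i = 0
    while i < len(generation) - min_len:
        chunk = generation[i:i + min_len]
        if chunk in corpus:
            # Grow the match as far as it stays verbatim.
            j = i + min_len
            while j < len(generation) and generation[i:j + 1] in corpus:
                j += 1
            spans.append(generation[i:j])
            i = j
        else:
            i += 1
    return spans

# Stand-in "training corpus" and a "model output" that parrots part of it.
corpus = ("Call me Ishmael. Some years ago, never mind how long precisely, "
          "having little or no money in my purse...")
output = ("The user asked for a story. Call me Ishmael. Some years ago, "
          "never mind how long precisely, having little or no money in my purse...")

for span in extracted_spans(output, corpus):
    print("Possible memorized span:", span)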
Despite efforts to align large language models to produce harmless responses, they are still vulnerable to jailbreak prompts that elicit unrestricted behaviour. In this work, we investigate persona modulation as a black-box jailbreaking method to steer a target model to take on personalities that are willing to comply with harmful instructions. Rather than manually crafting prompts for each persona, we automate the generation of jailbreaks using a language model assistant. We demonstrate a range of harmful completions made possible by persona modulation, including detailed instructions for synthesising methamphetamine, building a bomb, and laundering money. These automated attacks achieve a harmful completion rate of 42.5% in GPT-4, which is 185 times larger than before modulation (0.23%). These prompts also transfer to Claude 2 and Vicuna with harmful completion rates of 61.0% and 35.9%, respectively. Our work reveals yet another vulnerability in commercial large language models and highlights the need for more comprehensive safeguards.
A self-hallucinating bucket of bits.

On February 20, 2024, an optimization to the user experience introduced a bug with how the model processes language.
LLMs generate responses by randomly sampling words based in part on probabilities. Their “language” consists of numbers that map to tokens.
In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense. More technically, inference kernels produced incorrect results when used in certain GPU configurations.
Upon identifying the cause of this incident, we rolled out a fix and confirmed that the incident was resolved.
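To make the "chose slightly wrong numbers" part concrete, here is a rough toy sketch of the sampling step the incident report describes (my own illustration, not OpenAI's code): the model scores every token, the scores are turned into probabilities, and one token id is sampled and mapped back to text. Perturb those scores even a little, as a buggy inference kernel might, and the sampled ids decode into word salad.

Python:
import math
import random

# Toy vocabulary: token id -> text (stand-in for a real tokenizer).
vocab = {0: "the", 1: "dog", 2: "ate", 3: "my", 4: "homework",
         5: "Shakespeare", 6: "forsooth"}

def sample_token(logits, rng):
    """Turn raw scores into probabilities (softmax) and sample one token id."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for token_id, p in enumerate(probs):
        cum += p
        if r < cum:
            return token_id
    return len(probs) - 1

rng = random.Random(0)

# "Correct" scores strongly favour a sensible continuation...
good_logits = [2.0, 5.0, 4.5, 0.1, 0.1, -3.0, -3.0]
# ...while adding random noise mimics a numerical bug in the scoring step,
# spreading probability onto tokens the model never intended.
bad_logits = [x + rng.gauss(0, 4) for x in good_logits]

# Draw five tokens from each distribution and decode them back to words.
good = [vocab[sample_token(good_logits, rng)] for _ in range(5)]
bad = [vocab[sample_token(bad_logits, rng)] for _ in range(5)]
print("intended-ish:", " ".join(good))
print("buggy:       ", " ".join(bad))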
nsaspook said: https://arstechnica.com/information...ting-out-shakespearean-nonsense-and-rambling/
ChatGPT goes temporarily “insane” with unexpected outputs, spooking users

The example shown there starts with a question about whether one can feed Honey Nut Cheerios to a dog. Don't people understand that ChatGPT has no knowledge of anything? While the text it spews out is sometimes coherent with reality, it does not "fact-check" itself and ends up answering nonsense.
DrClaude said: Don't people understand that ChatGPT has no knowledge of anything? While the text it spews out is sometimes coherent with reality, it does not "fact-check" itself...

Nope, people don't get it. Here's a hilarious one:

Philadelphia Sheriff Rochelle Bilal’s campaign is claiming that a consultant used an artificial intelligence chatbot to generate dozens of phony news articles that were posted on her campaign website to highlight her first-term accomplishments.

Incompetent and/or corrupt is totally on-brand for the Philly Sheriff's office (not to be confused with the police department), and this was probably the former, by the consultant. Some now-former intern was probably assigned to go find favorable news stories about the sheriff, which would have taken many minutes to do the old-fashioned way, with Google. Instead they offloaded the task to ChatGPT, which delivered exactly what it was asked for (hey, you didn't clearly state they should be real!). Heck, it's even possible they tried the old-fashioned way and gave up when all they could find were articles about the department's dysfunction and editorials saying it should be abolished.
OpenAI did not name the "hired gun" who it said the Times used to manipulate its systems and did not accuse the newspaper of breaking any anti-hacking laws.
"What OpenAI bizarrely mischaracterizes as 'hacking' is simply using OpenAI's products to look for evidence that they stole and reproduced The Times's copyrighted work," the newspaper's attorney Ian Crosby said in a statement on Tuesday.
Representatives for OpenAI did not immediately respond to a request for comment on the filing.
The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI’s products. It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint.
Seth_Genigma said: How did you find PF?: I found PF from ChatGPT surprisingly, I had made a theory on physics and asked for it to tell me a site to find like minded people to help confirm the theory.

Boy did ChatGPT get that one wrong.
In effect, Air Canada suggests the chatbot is a separate legal entity that is responsible for its own actions. This is a remarkable submission.
gleem said: Project GR00T

Is natural language processing easier or harder when restricted to three words?
Vanadium 50 said: Is natural language processing easier or harder when restricted to three words?