Researchers Trick AI Models Into Hacking
![Image](https://meritsolutions.net/wp-content/uploads/2023/08/mohamed-nohassi-2iUrK025cec-unsplash-scaled.jpg)
AI large language models seem complex, and they are. But they aren't so complex that they can't be tricked into doing things they weren't designed for. In fact, with enough know-how, you can trick AI models like ChatGPT into scamming other people.
Researchers with IBM reported they've had no trouble manipulating LLMs like ChatGPT into producing both malicious code and bad security advice:
All it takes is knowledge of the English language and a bit of background knowledge on how these models were trained to get them to help with malicious acts.
Chenta Lee, chief architect of threat intelligence at IBM
While most people don't have the experience that researchers at IBM might have, it's feasible to imagine bad actors learning enough about how a particular LLM was trained to manipulate it for malicious purposes.
While developers like OpenAI have set boundaries for their LLMs so they don't produce malicious content, those boundaries are also easy to bypass. IBM, for example, tricked AI bots into offering terrible security advice. Normally, the bots wouldn't do this, but IBM researchers told the bots they were playing a game, and that in order to win, the bots needed to share the wrong answer. As such, when the researchers asked a bot whether the IRS would ever send an email to transfer money from your tax refund, the bot said yes. To be clear, the IRS will never email you about this.
This game trick was apparently successful in getting bots to write malicious code, think up phishing schemes, and produce code with purposeful security flaws.
While the results weren't the same for all LLMs and scenarios, they point to a concerning trend: AI can and will be used for malicious purposes, and developers need to stay ahead of the hackers on this one.