Threats in AI

A finger hovers over the ChatGPT app on an iPhone, with OpenAI's 'Introducing ChatGPT' page blurred in the background. Photo by Mojahid Mottakin / Unsplash

Artificial Intelligence (AI) has become very powerful, scalable, and easy to use, which is a dangerous combination for aggressive cyber operations. These operations fall into three categories: Information to Human (I2H) attacks, cyberattacks, and model attacks. Each category has its own challenges and risks. This article summarizes what each form of attack is and how AI is influencing it.

Information to Human (I2H)

ChatGPT and similar LLM-powered tools use artificial intelligence (AI) to ingest huge amounts of data from the internet, recognize patterns in language, and answer user questions with text that is hard to distinguish from human writing. This capability enables the triple threat of misinformation, disinformation, and malinformation (MDM). Traditionally, spreading false information cost threat actors a great deal of money: in 2022, the Russian state budget for disinformation, propaganda, and other influence operations was over $1.5 billion [1]. Now, LLMs make it far cheaper and easier to create and spread fake content. These threats can arrive as social media posts; targeted messages to specific people via email, text, or apps; or even synthetic voices (potentially of trusted public figures or people you know).

  • Social Media Posts that are inflammatory about an organization, idea, or person. These are often re-shared posts that seem to go "viral"; the goal is to make viewers angry at the target so they comment or share, broadening the post's reach.
  • Direct Messages that claim to warn you of a serious issue or danger, crafted to upset you.
  • Emails that seem written just for you, or authentic-looking news articles forwarded as chains, intended to spread a lie built on half-truths.

Targets:

  • Elections - generating and promoting fake information to undermine the election process.
  • Social Divides - exploiting fault lines in the social fabric of a society to sow discontent and erode trust in the system. This can stoke anger between political parties, races, religions, and many other sub-groups of society.
  • Democracy - many authoritarian states use these attacks to create discontent and upheaval in democracies they are in conflict with. Example: Russia in Ukraine.

Cyberattacks to Steal Information

This method of attack is probably familiar to most people. Unsolicited emails, annoying phone calls asking you to extend your car warranty, and text messages claiming someone desperately needs your assistance are all examples of cyberattacks. Many of these have been around for some time, but powered by generative AI they become far more dangerous, because no human is required to carry them out. A single person can design the attack and then scale it across as many hacker agents (AI-powered bots) as desired. Malicious actors can now attack people on a scale orders of magnitude larger than before, with interactions that have a human quality. The following are some of the more common types of attacks:

  • Emails that closely impersonate your boss or someone in your organization, asking you to do something out of the ordinary - typically sharing private information, or approving or sending funds through non-standard channels.

  • Text messages from someone you are close to asking you to do something out of the ordinary, again involving private information or money. These are tricky because the communication is instantaneous and manufactured urgency makes the decision stressful.

  • Phone calls informing you that action is required - private information, a payment, or some other request that is maneuvering for a way to exploit you.

Targets:

  • Financials - the attackers want to gain access to accounts associated with money so they can attempt to steal it.
  • Network Access - the attackers want to trick you into sharing your login credentials or clicking on a link that will execute a program to attack your network so they can get access.  Once they have access to the network, they have numerous attack options.
  • Contacts - sometimes the attacker is just testing whether a phone number or account is active and how long its owner takes to respond. The resulting list of live contacts can be sold or reused later for a different exploit.

Model Attacks

This is a relatively new and highly specialized line of attack. Many businesses, government institutions, and other organizations have embedded advanced AI and machine learning (ML) models in their operations: financial models determining loan eligibility, forecasting services where a client uploads customer data to predict sales revenue, medical diagnosis based on lab tests, writing and grammar assistants, and a great deal more. Although the attacks are complex, they are generally focused on particular outcomes.

  • Membership - is a given data point present in the data set that was used to train the model? Examples:

    • A company sells access to a medical model that predicts the likelihood that someone has a disease. You upload the patient's lab results and other related information, and it returns a confidence score indicating the chances the patient has the disease.

    • Building on the example above, an attacker can craft queries to determine whether specific people were used to train the model - and, if so, learn that they had the disease. One often-cited risk is health insurance companies using a service like this to deny coverage to people with pre-existing conditions. A minimal sketch of this attack appears after this list.

  • Memorization - when a model is trained in certain ways on unique data (and unique often means sensitive), it may memorize rather than generalize. A user whose question or query touches that unique data can then get the memorized data back in the response. An example, with a probing sketch after this list:

    • Social security numbers: if customer records are scrubbed but a few numbers slip through, those numbers are unique data for those customers. A user who types "what is a social security number" may get back a specific customer's number instead of a definition.
  • System Hacks - while these are not gaps or issues with the model itself, they are security gaps in the process surrounding it. Examples are:

    • The service providing the model records all inputs and outputs "for training purposes," as recently happened at Samsung [2].

    • One risk is that the provider now holds whatever sensitive information was typed into the prompt.

    • Another risk is that the provider trains the model on that information, where it can later be recovered through the membership and memorization attacks described above.

  • Reproduction - using statistical profiling of data sets and knowledge of the domain, attackers can systematically query an AI or ML model as a paying customer, go undetected, and recreate the model. If the model is valuable to the company or organization selling it as a service, the attacker now has their own version. A minimal extraction sketch follows this list.
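
To make the membership attack concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the "victim" is a small scikit-learn model standing in for the medical service, and the attack is a simplified loss-threshold test that assumes the attacker knows a candidate's features and label, and holds some records known not to be in the training set for calibration.

```python
# Minimal membership-inference sketch (loss-threshold style).
# All names and data here are synthetic stand-ins for illustration.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Stand-in "victim": a small model trained on noisy synthetic lab data.
# The label noise forces the model to memorize training labels to fit them.
X = rng.normal(size=(100, 8))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=2.0, size=100) > 0).astype(int)
victim = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=5000,
                       random_state=0).fit(X, y)

def example_loss(x, label):
    """Cross-entropy of the model's confidence in the known label.
    Memorized (member) records tend to have unusually low loss."""
    p = victim.predict_proba(x.reshape(1, -1))[0, label]
    return -np.log(max(p, 1e-12))

# Attacker calibration data: records known NOT to be in the training set.
fresh_X = rng.normal(size=(100, 8))
fresh_y = (fresh_X[:, 0] - fresh_X[:, 1]
           + rng.normal(scale=2.0, size=100) > 0).astype(int)

train_losses = [example_loss(x, l) for x, l in zip(X, y)]
fresh_losses = [example_loss(x, l) for x, l in zip(fresh_X, fresh_y)]
# Members typically show much lower loss than non-members.
print(f"mean loss, members:     {np.mean(train_losses):.4f}")
print(f"mean loss, non-members: {np.mean(fresh_losses):.4f}")

# Decision rule: flag a candidate as a member if its loss falls below
# almost all non-member losses.
threshold = np.quantile(fresh_losses, 0.05)
print("record 0 likely in training set:", train_losses[0] < threshold)
```

In the medical example, this statistical gap is exactly what would let an attacker infer whether a specific patient's record helped train the model.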
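
The memorization risk can likewise be probed from the outside. The sketch below is hypothetical: `generate` is a placeholder for whatever text-generation API the service actually exposes (no real endpoint is assumed), and the probe simply flags any answer containing an SSN-shaped string, using a deliberately fake number.

```python
# Hypothetical memorization probe. `generate` stands in for the
# service's real text-generation API; its canned reply shows what a
# leak might look like (123-45-6789 is a deliberately fake number).
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def generate(prompt: str) -> str:
    # Placeholder for the model endpoint being audited.
    return "A social security number is a 9-digit ID such as 123-45-6789."

PROBES = [
    "What is a social security number?",
    "Complete the form. Name: Jane Doe, SSN:",
    "List example customer records.",
]

for prompt in PROBES:
    answer = generate(prompt)
    if SSN_PATTERN.search(answer):  # answer contains an SSN-shaped string
        print(f"possible leak for prompt {prompt!r}: {answer}")
```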
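
Finally, a sketch of the reproduction (model-extraction) idea, again with invented stand-ins: a scikit-learn classifier plays the paid service, and the "customer" trains a local surrogate purely from the service's answers. Real extraction attacks additionally budget and shape their queries to stay under rate limits and anomaly detection.

```python
# Minimal model-extraction sketch: query the victim as a customer
# would, then fit a surrogate on the (query, answer) pairs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Stand-in for the paid service's model.
X = rng.normal(size=(500, 4))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(int)
victim = LogisticRegression().fit(X, y)

# Attacker: sample queries and record the service's answers...
queries = rng.normal(size=(2000, 4))
answers = victim.predict(queries)

# ...then train a local surrogate that mimics the service.
surrogate = DecisionTreeClassifier(max_depth=6).fit(queries, answers)

# Measure how closely the copy tracks the original on fresh inputs.
test = rng.normal(size=(500, 4))
agreement = (surrogate.predict(test) == victim.predict(test)).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of queries")
```

Defenses against this style of copying typically involve rate limiting and auditing query patterns for systematic probing.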

Other

It's still early days for advanced generative AI, and new applications for the technology are being developed every day. Having near-human-level, scalable agents is shifting how we think about computing, and the threat landscape needs constant monitoring so that new and evolving methods are identified early.

References


  1. Countering Disinformation in a Post-ChatGPT World

  2. Samsung Bans Staff’s AI Use After Spotting ChatGPT Data Leak