agentgg
CRITICAL · confirmed · xxexxec792f0467280

XXE via libxmljs2 parseXml with noent:true on uploaded XML

handleXmlUpload parses untrusted uploaded XML with libxmljs2 using noent:true, which expands external entities and enables file disclosure / SSRF / DoS via XXE.

File: routes/fileUpload.ts
Lines: 81-93
Confidence: 99%
File status: validated
Details

In handleXmlUpload, the contents of an uploaded .xml file (file.buffer.toString()) are passed directly to libxml.parseXml(data, { noblanks: true, noent: true, nocdata: true }) inside a vm sandbox.

const data = file.buffer.toString()
const sandbox = { libxml, data }
vm.createContext(sandbox)
const xmlDoc = vm.runInContext('libxml.parseXml(data, { noblanks: true, noent: true, nocdata: true })', sandbox, { timeout: 2000 })

For libxmljs and libxmljs2, noent: true is the unsafe setting: it instructs libxml2 to substitute entity references, including external entities declared as <!ENTITY xxe SYSTEM "file:///etc/passwd">. The call does not set noent: false, does not disable DTD loading, and is not replaced by a hardened parser. The parsed string is also echoed back inside an error message (utils.trunc(xmlString, 400)), which directly leaks the contents of any disclosed file.
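One possible remediation for the option set described above is to force the dangerous flags off before the options reach the parser. The sketch below is an assumption, not code from the reviewed file: hardenParseOptions and the XmlParseOptions interface are hypothetical names, and the flag names (noent, nonet) are the libxmljs2 parser options as understood here, so they should be checked against the installed library version.

```typescript
// Subset of the parser flags accepted by libxmljs2's parseXml options object.
// Only the flags discussed in this finding are modelled here.
interface XmlParseOptions {
  noblanks?: boolean;
  noent?: boolean;  // true = substitute entity references (the unsafe setting)
  nocdata?: boolean;
  nonet?: boolean;  // true = forbid network access while parsing
}

// Hypothetical helper (not part of Juice Shop or libxmljs2): returns a copy of
// the caller's options with entity substitution and network access forced off,
// regardless of what was passed in.
function hardenParseOptions(options: XmlParseOptions = {}): XmlParseOptions {
  return { ...options, noent: false, nonet: true };
}

// The options currently used in handleXmlUpload, after hardening.
const hardened = hardenParseOptions({ noblanks: true, noent: true, nocdata: true });
// Intended use (assumption): libxml.parseXml(data, hardened)
```

Overriding the flags after the spread means a caller cannot re-enable noent by accident. This only addresses entity substitution and network fetches; it does not replace a review of how the parsed output is reflected to the client.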

The surrounding code acknowledges the bug: a comment says "XXE attacks in Docker/Heroku containers regularly cause 'segfault' crashes", the xxeFileDisclosureChallenge is solved when /etc/passwd or system.ini content appears in the parsed XML, and the xxeDosChallenge is solved when entity expansion times out. This confirms the sink is intended to be exploitable.

The vm sandbox with a 2-second timeout does not mitigate XXE; it only bounds CPU time on recursive-entity DoS. File disclosure via a single SYSTEM reference still works well within the timeout.

Proof of concept
  1. Craft an XML file xxe.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<foo>&xxe;</foo>
  2. Upload it to the complaint file-upload endpoint that routes through handleXmlUpload (e.g. POST /file-upload with a multipart file field).
  3. The server response/error message will contain the substituted entity value, exposing the contents of /etc/passwd.
  4. Alternatively, supply a billion-laughs payload (nested entities) to trigger the 2s timeout path and cause DoS.
  5. Alternatively, point SYSTEM at an internal URL to perform SSRF.
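All of the payload variants above depend on a DOCTYPE with entity declarations, so a coarse pre-parse check can serve as defence in depth. The following is a minimal sketch under that assumption; containsDtdOrEntity and assertNoDtd are hypothetical helpers that do not exist in the reviewed code, and a string check like this is not a substitute for disabling entity substitution in the parser itself.

```typescript
// Hypothetical pre-parse guard (not present in the reviewed code): detect a DTD
// or entity declaration in the uploaded document before it reaches the parser.
function containsDtdOrEntity(xml: string): boolean {
  // Case-insensitive on purpose: over-rejecting is acceptable for a guard.
  return /<!DOCTYPE/i.test(xml) || /<!ENTITY/i.test(xml);
}

// Throws with a fixed message so the uploaded document is never echoed back.
function assertNoDtd(xml: string): void {
  if (containsDtdOrEntity(xml)) {
    throw new Error("XML uploads must not contain a DOCTYPE or ENTITY declaration");
  }
}
```

In handleXmlUpload this would presumably run on file.buffer.toString() before the parseXml call. The check will also reject harmless documents that merely mention these strings in comments or CDATA, which seems an acceptable trade-off for an upload endpoint that has no need for DTDs.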
Impact

An attacker who can reach the XML upload endpoint (unauthenticated, or any authenticated user, depending on route auth) can read arbitrary local files readable by the Node process (credentials, source, /etc/passwd, etc.), probe internal network resources via SYSTEM URIs (SSRF), or cause denial of service via recursive entity expansion. The disclosed file content is reflected back in the HTTP error response.

Validation
confirmed

The code at handleXmlUpload calls libxml.parseXml(data, { noblanks: true, noent: true, nocdata: true }) on attacker-controlled file.buffer.toString(). In libxmljs2, noent: true enables entity substitution including external SYSTEM entities, enabling file disclosure (the parsed xmlString is reflected back via utils.trunc(xmlString, 400) in the Error message). The vm sandbox with 2s timeout doesn't disable DTD loading or external entity resolution. Scope explicitly says to treat Juice Shop findings as real production bugs.

CVSS 3.1
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:L
Base score: 9.3 · CRITICAL

The sink libxml.parseXml(data, { ..., noent: true, ... }) in handleXmlUpload is reached by simply POSTing an .xml file to the upload endpoint over HTTP, with no auth check visible in this file (AV:N, AC:L, PR:N, UI:N). Confidentiality is High because a <!ENTITY xxe SYSTEM "file:///etc/passwd"> payload is expanded and the resulting xmlString is reflected back to the attacker via utils.trunc(xmlString, 400) in the error message, enabling disclosure of any file readable by the Node process. Scope is Changed because the same SYSTEM URI can be pointed at internal http(s) URLs to perform SSRF against backend services beyond the web app's security authority, and because the disclosure crosses out of the app's data boundary into arbitrary host files. Availability is Low rather than High because the vm 2-second timeout bounds each billion-laughs request, allowing only intermittent, per-request resource exhaustion rather than guaranteed sustained downtime. Integrity is None because XXE here only reads and parses; it does not modify server state.
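The base score for the vector above can be reproduced from the CVSS 3.1 base equations. The sketch below hard-codes only the metric weights this vector uses (taken from the CVSS 3.1 specification); the constant and function names are illustrative, and it is not a general-purpose calculator.

```typescript
// Roundup as defined in the CVSS 3.1 specification: round up to one decimal
// place while avoiding floating-point artefacts.
function roundUp(value: number): number {
  const scaled = Math.round(value * 100000);
  return scaled % 10000 === 0 ? scaled / 100000 : (Math.floor(scaled / 10000) + 1) / 10;
}

// Metric weights for AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:L.
const AV_NETWORK = 0.85;
const AC_LOW = 0.77;
const PR_NONE = 0.85;
const UI_NONE = 0.85;
const C_HIGH = 0.56;
const I_NONE = 0;
const A_LOW = 0.22;

// Impact sub-score (ISS), then the Scope: Changed form of the impact equation.
const iss = 1 - (1 - C_HIGH) * (1 - I_NONE) * (1 - A_LOW);
const impact = 7.52 * (iss - 0.029) - 3.25 * Math.pow(iss - 0.02, 15);
const exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE;

// Scope: Changed multiplies the sum by 1.08 and caps it at 10 (impact > 0 here).
const baseScore = roundUp(Math.min(1.08 * (impact + exploitability), 10));
// baseScore evaluates to 9.3, matching the reported CRITICAL rating.
```

With Scope set to Unchanged the 1.08 multiplier and the changed-scope impact equation would not apply, so the Scope: Changed justification accounts for a noticeable share of the final score.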

References