CVE-2019-12415: XML processing vulnerability in Apache POI

by Artem Smotrakov


Apache POI is a popular Java library for working with Microsoft documents. For example, it lets you read and write Microsoft Excel files in Java. While looking into the library recently, I noticed a small vulnerability, which later became CVE-2019-12415. The issue has been fixed in POI 4.1.1. Below are the details.

The issue

Among many other formats, Apache POI can work with Microsoft Excel documents. In particular, the library contains the XSSFExportToXml class, which is used in processing Microsoft Excel Open XML Spreadsheet (XLSX) files. The class takes a Map element, which is defined in the Office Open XML specification, and converts it to XML.

The Map element contains the internals of an XLSX file. In particular, it contains an XSD schema. The XSSFExportToXml.exportToXml() method may be instructed to use this schema for XML validation:

private boolean isValid(Document xml) throws SAXException {
    try {
        String language = "http://www.w3.org/2001/XMLSchema";
        SchemaFactory factory = SchemaFactory.newInstance(language);
        Source source = new DOMSource(map.getSchema());
        Schema schema = factory.newSchema(source);
        Validator validator = schema.newValidator();
        validator.validate(new DOMSource(xml));
...

The problem here is that the SchemaFactory is not configured for secure XML processing, which results in an XXE vulnerability if an attacker can pass a malicious XSD schema to the isValid() method.

How does the schema reach the isValid() method? The answer is simple: it comes from the XLSX document. First of all, an XLSX document is just a ZIP archive. If you extract it, you'll find a bunch of XML documents and other files. The XSD schema comes from the xl/xmlMaps.xml file, which contains something like the following:

<MapInfo xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" SelectionNamespaces="">
    <Schema ID="Schema2">
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns="">
        ...
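To see that part for yourself, the entry can be pulled out of the archive with nothing but java.util.zip. The following is a minimal sketch, not POI code: the class name XmlMapsReader and both helper methods are hypothetical, and the tiny in-memory archive only stands in for a real XLSX file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class XmlMapsReader {

    // Hypothetical helper: returns the content of xl/xmlMaps.xml, or null if the entry is absent.
    static String readXmlMaps(byte[] xlsxBytes) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(xlsxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("xl/xmlMaps.xml".equals(entry.getName())) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int read;
                    // Reading from the stream yields the data of the current entry only.
                    while ((read = zip.read(buffer)) != -1) {
                        out.write(buffer, 0, read);
                    }
                    return new String(out.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    // Hypothetical helper: builds a one-entry ZIP archive that stands in for a real XLSX file.
    static byte[] fakeXlsx(String xmlMaps) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("xl/xmlMaps.xml"));
            zip.write(xmlMaps.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = fakeXlsx("<MapInfo/>");
        System.out.println(readXmlMaps(archive));
    }
}
```

With a real document, you would pass the bytes of the XLSX file instead of the stand-in archive.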

There may be various payloads. For example, an attacker can inject an <xs:redefine schemaLocation="https://internal.site/endpoint"> element into the schema. Here https://internal.site/endpoint is the URL of a resource on a private network that the attacker can't access directly:

<MapInfo xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" SelectionNamespaces="">
    <Schema ID="Schema2">
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns="">
            <xs:redefine schemaLocation="https://internal.site/endpoint">
            ...

Then the attacker packs everything back into a ZIP archive, and the malicious XLSX is ready. When SchemaFactory loads the schema, it tries to access https://internal.site/endpoint.

The issue has been fixed by setting the XMLConstants.FEATURE_SECURE_PROCESSING feature on SchemaFactory.
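Here is a rough sketch of what such a hardened SchemaFactory may look like in application code. It is not the actual POI patch: the class and method names are hypothetical, and the two ACCESS_EXTERNAL_* properties are an extra precaution I added on top of the feature mentioned above. The sketch assumes the JAXP implementation bundled with JDK 8 or newer; other implementations may not recognize those properties.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class HardenedSchemaFactoryDemo {

    // Hypothetical helper, not POI code: builds a SchemaFactory that refuses external resources.
    static SchemaFactory newHardenedFactory() throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: JAXP 1.5 properties are supported. An empty string denies every protocol.
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return factory;
    }

    // Returns true if the hardened factory refuses to load the given schema.
    static boolean rejectsSchema(String xsd) {
        try {
            newHardenedFactory().newSchema(new StreamSource(new StringReader(xsd)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String malicious =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:redefine schemaLocation=\"https://internal.site/endpoint\"/>"
            + "</xs:schema>";
        String benign =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"root\" type=\"xs:string\"/>"
            + "</xs:schema>";
        System.out.println("malicious schema rejected: " + rejectsSchema(malicious));
        System.out.println("benign schema rejected: " + rejectsSchema(benign));
    }
}
```

The access check is expected to happen before any connection is opened, so the schema with the external reference should be refused without a request ever leaving the machine.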

What an attacker can do

The attacker can use all features of the XSD format in payloads. Possible consequences include but may not be limited to:

  1. Server Side Request Forgery (SSRF)
  2. Sensitive information leak from local and remote resources

Pre-requisites for an exploit

Here is what makes an application vulnerable:

  1. The application uses Apache POI 4.1.0 or earlier.
  2. The application allows untrusted data to be processed by the XSSFExportToXml class.
  3. The third parameter of the XSSFExportToXml.exportToXml() method is set to true, which enables XML validation.

Conclusion

The Java standard library offers a lot of classes for XML processing. DocumentBuilder is one of the most popular classes for loading an XML document. However, there are many other classes, such as SchemaFactory, Transformer and so on, which may parse XML documents. All of these classes should be configured to do it in a safe way if they take data from untrusted sources.
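As an illustration, here is one way the factories behind DocumentBuilder and Transformer might be locked down. Treat it as a sketch under assumptions rather than a complete recipe: the class and method names are hypothetical, the disallow-doctype-decl feature is specific to Xerces-based parsers (including the one bundled with the JDK), and the ACCESS_EXTERNAL_* attributes assume a JAXP 1.5 implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

public class SafeXmlFactories {

    // Hypothetical helper: a DocumentBuilderFactory that rejects DOCTYPEs and external resources.
    static DocumentBuilderFactory newSafeDocumentBuilderFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Xerces-specific feature: any DOCTYPE declaration becomes a fatal error.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return factory;
    }

    // Hypothetical helper: a TransformerFactory that does not fetch external DTDs or stylesheets.
    static TransformerFactory newSafeTransformerFactory() throws TransformerConfigurationException {
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    // Returns true if the hardened parser accepts the document.
    static boolean parses(String xml) {
        try {
            newSafeDocumentBuilderFactory()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String plain = "<foo/>";
        String withDoctype = "<!DOCTYPE foo [<!ENTITY x \"y\">]><foo>&x;</foo>";
        System.out.println("plain document parsed: " + parses(plain));
        System.out.println("document with DOCTYPE parsed: " + parses(withDoctype));
        System.out.println("transformer factory ready: " + (newSafeTransformerFactory() != null));
    }
}
```

Rejecting DOCTYPE declarations outright is a blunt choice. It fits applications that never need DTDs; others may prefer to rely only on the access restrictions.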

The article has been originally published at: https://medium.com/bugbountywriteup/cve-2019-12415-xml-processing-vulnerability-579fdbfbaa18

November 12, 2019