TROOPERS14
i read 'boutique conference' somewhere recently, and that term actually nails it. occupying the print media complex in heidelberg for an entire week, TROOPERS manages to provide two days of workshops, two more of conference and a day of roundtable discussions, as well as various side events like an IPv6 security summit or an SAP security track or a telco sec day or.. did i miss anything? guess so.
there is a soldering station to fiddle with your.. badge, because it comes with an arduino attached. troopers provides food & coffee & mate nearly around the clock. AND, well not that i'm anything like picky about clothing at all, BUT they have conference shirts available for girls.
no one goes to cons just for collecting shirts.. but this industry is moaning about the lack of females, no? so when they finally show up it is really nice when this kind of event actually acknowledges beforehand that this could happen. in other words, giving out shirts only for men sort of implies that there will be no women.
summing it up, great event :]
and here comes what we did there.
DIFFRAY VULNERABILITY RESEARCH
WHAT IS IT
as i refused for ~5 months to come up with documentation, i guess now it finally is the right time. DiffRay in short is a tool to diff Windows 7 and Windows 8 executables to spot missing security functions in an automated way.
if one can fiddle around with input values for an application, without that application checking for their validity, chances are high one can actually perform creative abuse on memory structures. the inclined reader understands the potential impact of memory corruption. Microsoft does too, and thus came up with dedicated libraries that provide input checking functions to make it easy for developers to apply the right security check for a dedicated input value. namely these libraries are intsafe and strsafe. they provide APIs like ULongAdd or StringCchCopy, which do nothing more than check whether a given value stays within expected boundaries (more information on MSDN).
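to make the idea of such a check concrete, here is a minimal python sketch of what an ULongAdd-style function guards against. it is not the intsafe implementation; the function names and the (ok, result) return shape are made up for illustration, the real API reports failure through its return code.

```python
ULONG_MAX = 0xFFFFFFFF  # a Windows ULONG is a 32-bit unsigned integer


def unchecked_add(a, b):
    """What plain C arithmetic on ULONGs does: silently wrap around."""
    return (a + b) & ULONG_MAX


def checked_add(a, b):
    """Hypothetical stand-in for the ULongAdd idea.

    Returns (True, sum) if the sum fits into a ULONG,
    otherwise (False, None) so the caller can bail out.
    """
    total = a + b
    if total > ULONG_MAX:
        return False, None
    return True, total


if __name__ == "__main__":
    # a length computation that overflows and ends up tiny
    print(unchecked_add(0xFFFFFFF0, 0x20))
    # the checked variant refuses instead of wrapping
    print(checked_add(0xFFFFFFF0, 0x20))
```

the unchecked variant hands back a small number that may later be used as a buffer size, which is exactly the kind of bug the safe functions are meant to catch.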
in the following, these functions are called 'safe functions'. we assume that when in one version of a library such a safe function is applied while in the same piece of code of another version of that library no safe function is called - something is fishy.
we perform the diffing on Windows libraries and drivers (.dll, .sys) in a very simple way. our approach is to decompile each binary, scan for safe functions and put every hit per API into a database. this way we simply count the hits for a specific safe function in a library function and diff it with the corresponding library function of another Windows version.
finally, if the hit counts differ there is a good chance that some value in the library function at hand goes unchecked. we consider this a potential vulnerability.
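the counting and diffing idea can be sketched in a few lines of python. this is not DiffRay's code: it assumes the decompiled functions are already available as a dict of function name to pseudo-C body, the names count_hits and diff_hits are made up, and the symbol list is a tiny illustrative subset.

```python
import re
from collections import Counter

# tiny illustrative subset of safe function symbols
SAFE_FUNCTIONS = ["ULongAdd", "StringCchCopy", "StringCbCopy"]


def count_hits(functions):
    """Count calls to each safe function per library function.

    functions: dict mapping function name -> pseudo-C body (assumed input).
    Returns a Counter keyed by (function name, safe function).
    """
    hits = Counter()
    for name, body in functions.items():
        for safe in SAFE_FUNCTIONS:
            pattern = r"\b" + re.escape(safe) + r"\s*\("
            found = len(re.findall(pattern, body))
            if found:
                hits[(name, safe)] = found
    return hits


def diff_hits(win7_functions, win8_functions):
    """Return (function, safe function, win8 count, win7 count) where counts differ."""
    win7 = count_hits(win7_functions)
    win8 = count_hits(win8_functions)
    rows = []
    for key in sorted(set(win7) | set(win8)):
        if win7[key] != win8[key]:
            rows.append((key[0], key[1], win8[key], win7[key]))
    return rows


if __name__ == "__main__":
    win7 = {"Foo": "v = a + b; memcpy(d, s, v);"}
    win8 = {"Foo": "if (ULongAdd(a, b, &v) < 0) return; memcpy(d, s, v);"}
    for row in diff_hits(win7, win8):
        print(row)
```

a function that only exists in one of the two versions shows up here with a count of zero on the other side, which is one source of the false positives discussed further down.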
HOW DOES IT WORK
the bare necessities
Python 2.7 32bit
pymssql 1.0.2 32bit for Python 2.7
PyQt 4 32bit
DiffRay comes with two executables for decompilation of libs and drivers and a python application for parsing the spotted safe function hits into a database and for producing the final diffings. basically what we do is decompile the binaries to .c-files, so yeah we produce some sort of Windows OS source code :)
you will need IDA Pro and the hexrays decompiler for this step. the .c-files are then parsed to a database. you can choose between sqlite or mssql; i highly recommend mssql. or you implement a DB handler of your choice, this is python!!
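if you want to roll your own DB handler, it could look roughly like this. the class name, the table layout and the method names are assumptions for this sketch and not DiffRay's actual schema; only the standard sqlite3 module is used.

```python
import sqlite3


class SqliteHandler(object):
    """Hypothetical minimal handler storing safe function hits."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS hits ("
            "os TEXT, library TEXT, function TEXT, pattern TEXT, count INTEGER)"
        )

    def add_hit(self, os_name, library, function, pattern, count):
        """Store the hit count of one safe function in one library function."""
        self.conn.execute(
            "INSERT INTO hits VALUES (?, ?, ?, ?, ?)",
            (os_name, library, function, pattern, count),
        )
        self.conn.commit()

    def diff(self, library):
        """Return (function, pattern, win8, win7) rows whose counts differ."""
        cursor = self.conn.execute(
            "SELECT function, pattern, "
            "SUM(CASE WHEN os = 'Win8' THEN count ELSE 0 END), "
            "SUM(CASE WHEN os = 'Win7' THEN count ELSE 0 END) "
            "FROM hits WHERE library = ? "
            "GROUP BY function, pattern "
            "HAVING SUM(CASE WHEN os = 'Win8' THEN count ELSE 0 END) != "
            "SUM(CASE WHEN os = 'Win7' THEN count ELSE 0 END) "
            "ORDER BY function, pattern",
            (library,),
        )
        return cursor.fetchall()


if __name__ == "__main__":
    db = SqliteHandler()
    db.add_hit("Win8", "tcpip", "Ipv4SetEchoRequestCreate", "ULongAdd", 4)
    db.add_hit("Win7", "tcpip", "WfpAleAuditEvent", "StringCbCopy", 1)
    db.add_hit("Win8", "tcpip", "WfpAleAuditEvent", "StringCbCopy", 1)
    print(db.diff("tcpip"))
```

the library name "tcpip" in the usage part is only a placeholder. a handler for another database would need the same two operations: store a hit, and pull out the per-function counts of both Windows versions.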
next step is the parsing. DiffRay parses either files or whole directories for symbols of safe functions. right now there are 130 symbols, they can be extended by just editing the signatures.conf. also, there is signature mapping if some safe function turns out to be equivalent to another one. we saw this in the past, but didn't yet come up with the right mappings. configuration could be achieved by editing the signature_mapping.conf in the form sig1=sig2, line by line.
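a rough sketch of how the two config files could be read. assumptions: signatures.conf holds one symbol per line, empty lines and lines starting with # are skipped, and sig1=sig2 means that hits for sig1 are counted as sig2. only the sig1=sig2 line format comes from the tool itself, the rest is guesswork and the function names are made up.

```python
def parse_signatures(text):
    """Return the list of safe function symbols, one per line (assumed format)."""
    symbols = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            symbols.append(line)
    return symbols


def parse_mapping(text):
    """Parse lines of the form sig1=sig2 into a dict {sig1: sig2}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        left, right = line.split("=", 1)
        mapping[left.strip()] = right.strip()
    return mapping


def normalize(symbol, mapping):
    """Replace a symbol by its mapped equivalent, if there is one."""
    return mapping.get(symbol, symbol)


if __name__ == "__main__":
    signatures = parse_signatures("ULongAdd\nStringCchCopy\n")
    mapping = parse_mapping("StringCchCopyW=StringCchCopy\n")
    print(signatures)
    print(normalize("StringCchCopyW", mapping))
```

the StringCchCopyW=StringCchCopy pair is just an example of what a mapping line could look like, not one of the mappings we are still looking for.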
once the parsing is finished (depending on the DB backend that could take a while) DiffRay can start with the diffing. the commandline instructions you need are listed in the slides. basically, diffing can happen via library id or via library name. the lib id way is not very handy when diffing various libraries. thus for automation i recommend using the name option and creating a batch file that feeds DiffRay with the library names and dumps the output to a directory of choice.
attention, for the name option the name has to match both the Win7 and the Win8 version of the library, without file extension! so e.g. kernel32 is fine, when there is a Win7 version and a Win8 version present.
the output then should be a bunch of files, preferably .csv, that contain data like this:
Function_Name | Pattern | Win8 | Win7 |
EQoSDispatchIoctl | StringCbLength | 2 | 0 |
Ipv4SetEchoRequestCreate | ULongAdd | 4 | 0 |
Ipv6SetEchoRequestCreate | ULongAdd | 4 | 0 |
WfpAleAuditEvent | StringCbCopy | 2 | 1 |
WfpAleCaptureImageFileName | StringCbCopy | 1 | 2 |
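with a few hundred of these files it helps to sort the rows before opening IDA. a small sketch, assuming the pipe-separated layout shown above; the helper names are made up and the ranking by absolute difference is just one possible heuristic.

```python
from collections import namedtuple

Row = namedtuple("Row", "function pattern win8 win7 delta")


def parse_report(text):
    """Parse pipe-separated diff output; the header line is skipped."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        cells = [c for c in cells if c]
        # data rows have four cells with numeric counts in the last two
        if len(cells) != 4 or not (cells[2].isdigit() and cells[3].isdigit()):
            continue
        win8, win7 = int(cells[2]), int(cells[3])
        rows.append(Row(cells[0], cells[1], win8, win7, win8 - win7))
    return rows


def rank(rows):
    """Biggest count difference first, ties broken by function name."""
    return sorted(rows, key=lambda r: (-abs(r.delta), r.function))


if __name__ == "__main__":
    report = """Function_Name | Pattern | Win8 | Win7 |
EQoSDispatchIoctl | StringCbLength | 2 | 0 |
Ipv4SetEchoRequestCreate | ULongAdd | 4 | 0 |
WfpAleCaptureImageFileName | StringCbCopy | 1 | 2 |"""
    for row in rank(parse_report(report)):
        print(row)
```

a positive delta means Win8 calls the safe function more often than Win7, a negative delta means the opposite. either way a human still has to look at the code.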
from here on the researcher is on his own. now, get the libraries, open them in IDA and jump to the mentioned functions. good luck ;)
THE GUI
for ease of use we decided to wrap this whole process up in a GUI. it's designed in Qt, so very easy to build and modify - you can even change the colors if you don't like my chewing gum style. anyway, all the colors do make sense as all functions are integrated into one window:
yellow
configuration dialog. you HAVE to be connected to a database when you start parsing! you can configure credentials for mssql, there is nothing to configure for sqlite (always connects to the same sqlite db). the buttons CREATE DB and FLUSH DB at the moment actually do the same, dropping everything and creating a new db from scratch. via the configuration dialog you can edit signatures, mssql settings, mappings and logging.
blue
decompilation box. click around here to invoke the dll2idb.exe and idb2c.exe that should come with the python project. then watch the IDA Pro window pop up and down as it decompiles :)
pink
parsing box. you need python 2.7 for it to work. for anything to work.. you can either parse a file or an entire directory, just make sure it's all .c-files and that the right operating system is selected. parsing is done in separate processes, so you can start on multiple directories at a time.
green
diffing box. either on library ids or on library names, as mentioned a name has to hit one Win7 and one Win8 library. sadly, we don't yet have a way to do batch via GUI. will come in the future.
grey
search box. check if a libname actually finds libraries or which lib got which id. or get all diffing info of one particular library.
THE PROJECT itself
in total we decompiled and diffed more than 900 libraries of Win7 and Win8. the slides show some of the results, not all of them though. it is still a lot of work to actually check all the potential vulnerabilities and to evaluate if they are triggerable after all. a lot of false positives arise due to new code parts in Win8, manual checks on the Win7 side that have been replaced with a safe function or different naming of safe functions between both versions.
apart from that, great fun. if you're a hacker, bored, don't know what to do - start a joint project with someone else. you will learn what he knows, what you don't know and the other way round. both have expertise and great ideas, put them together and tadaaa you get twice as much of each.
WHAT'S NEXT
well there is Windows 8.1 right? besides, there is a lot more we can parse for than symbols. there is actually a lot more we could do with the pseudo source code of windows... and it would actually be a bright idea to switch to IDAPython instead of decompiling stuff.
there has to be some bug fixing, some more logging, some more automation. there could be some machine learning element or integration of symbolic execution, that could add completely different maaaagic..
but that would be a different blog post.