Thursday, November 20, 2014

Hunting Bunnies

A month ago I gave a talk titled TS/NOFORN at an exquisite boutique conference in Luxembourg. Some people even called it a keynote. I talked about a set of extravagant malware samples, which all seemed to be related yet different in the depths of their dark souls. One of these samples stuck out, namely the fourth sample of the analyzed species, thus entitled suspect #4. Suspect #4 is a huge evil beast, and sophisticated, if one does not take the typos in its string constants into account. The malware presents itself as 'bunny', invades the system, evades sandboxes and provides an execution platform for Lua scripts that a C&C server would inject into the malware.

As I promised various fellow researchers, I sat down to document this very same miscreant. Having typed my fingers sore throughout numerous long nights, I can now happily present you with the final version of the bunny report. Beloved bunny, this way you will live forever.

My deepest gratitude to all who helped with technical insights, binary input, mental support, coffee or rum. To protect the best interests of all contributors, the report is published under a Creative Commons license.

EvilBunny: Suspect #4 - Enjoy.

Sunday, September 28, 2014

Predictive Research: Malware, You're Doing It Wrong

I sat down this weekend to document the inspiring thoughts behind a talk I gave at Next Generation Threats last week in Stockholm. The initial idea was to outline how today's threat detection systems work and how they are bound to fail in specific situations. Slides are pretty and contain cat pictures:

Yet, they do not express all the ideas which popped up in my head, some of them only after the event.

As researchers are constantly mocking anti-virus solutions these days, let me point out what the malware authors themselves are actually doing wrong at the same time. 98% of malware (numbers not based on empirical research) we see today is not exceptionally good software. Even more, as long as we can see today’s malware it can’t quite be genius, no? 98% of malware (numbers not based on empirical research) has evil intentions, and it works, no more and no less. But all the “even more sophisticated” malware spread over security software vendor pages is a myth, along with the even more sophisticated detection technology.

And while malware and malware defense seem like a race of least effort, it is the work of offensive research to push defenders to new innovations. I’m not an offensive person, and by no means do I intend to enable crooks to do their job better. I, and I bet 98% of all others (numbers not based on empirical research), wish for intelligent defenses, which are predictive rather than knowledge-based. In other words, who wouldn’t want to be safe from future threats rather than only the currently known ones?

From a simplified and abstract viewpoint there are three modes of detection:

  • The binary is known
  • The binary is recognized
  • The behavior of the binary is recognized
All three of them doubtlessly work. There is an entire industry busy with spreading samples and hashes of known threats to enable protection. Pattern recognition catches up to 90% of these known samples (numbers based on empirical research), where patterns consist of all sorts of data: plain bytes like opcodes or file header characteristics, heuristic attributes like entropy, or combinations of these. If none of these work, the behavior of a binary can still be extracted in various ways. Sandboxes and emulators do a great job nowadays of imitating real-world systems.
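To make the three modes a bit more tangible, here is a minimal Python sketch. Everything in it is an assumption for illustration: the hash set, the byte pattern, the entropy threshold and the list of suspicious API names are made up, and real products work with far richer data.

```python
import hashlib
import math
from collections import Counter

# Hypothetical threat data, invented for this sketch.
KNOWN_HASHES = {hashlib.sha256(b"known-bad-sample").hexdigest()}
BYTE_PATTERNS = [b"\xde\xad\xbe\xef"]
SUSPICIOUS_CALLS = {"CreateRemoteThread", "WriteProcessMemory"}
ENTROPY_THRESHOLD = 7.5  # bits per byte, arbitrary cut-off


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte; packed or encrypted data scores close to 8."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in Counter(data).values())


def classify(sample: bytes, api_trace=()) -> str:
    # Mode 1: the binary is known (exact hash match).
    if hashlib.sha256(sample).hexdigest() in KNOWN_HASHES:
        return "known"
    # Mode 2: the binary is recognized (byte pattern or heuristic attribute).
    if any(p in sample for p in BYTE_PATTERNS):
        return "recognized"
    if shannon_entropy(sample) > ENTROPY_THRESHOLD:
        return "recognized"
    # Mode 3: the behavior is recognized (e.g. from a sandbox API trace).
    if any(call in SUSPICIOUS_CALLS for call in api_trace):
        return "behavior recognized"
    return "not detected"
```

Each stage only fires if the cheaper one before it stayed silent, which is roughly the order in which the sections below discuss how each mode can be tricked.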

However, all of these methods can be tricked. As detection technology is evolving, they will be tricked eventually. Let’s not call this write-up offensive research, but predictive. 

Let’s see what malware is doing wrong.


As recent incidents picked up by public media have proven, if malware is not intended to go large scale the need to evade static detection evaporates [1]. However, most successful attack vectors are not used only once. As soon as malware is out to infect more than a handful of systems, continued success can only be achieved by constant morphing.
Let me call what malware like Virut [2] is using ‘Weapons of Match Destruction’. A polymorphic engine, for example, creates a totally new binary every time the infection spreads, so new that less than 5-10 bytes of potential pattern remain equal. As far as I know, suitable crypters producing different representations of one payload are easy to acquire. Note though, their use is more obfuscation than protection, really.
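From the analyst's side, the "how many bytes of potential pattern remain" question can be approximated by looking for the longest byte run two variants still share. The following is only a sketch using Python's difflib; the function name is mine, and for real samples of several hundred kilobytes one would want something faster.

```python
from difflib import SequenceMatcher


def longest_shared_run(variant_a: bytes, variant_b: bytes) -> bytes:
    """Return the longest contiguous byte run present in both variants."""
    matcher = SequenceMatcher(None, variant_a, variant_b, autojunk=False)
    match = matcher.find_longest_match(0, len(variant_a), 0, len(variant_b))
    return variant_a[match.a:match.a + match.size]
```

If the result for two samples of the same family is only a handful of bytes long, there is not much left to build a static signature on.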


Malware that can’t be caught by pattern matching will be caught by file hash matching. The most ancient form of malware detection is still effective on highly polymorphic samples. A malicious binary once uncovered is shared among researchers and software vendors and will be detected as soon as threat definitions (aka signatures) are formulated and pushed out to security solutions. The trick to fly under the radar takes some effort but is pretty simple actually. Exchanging binaries every once in a while, before they get detected, allows malware to escape the detection cycle. Given the binaries are robust against byte and behavior pattern detection, exchanging them faster than the signature update cycle can catch them will keep them undetected. True story, some variants of Zeus do so.


Threat detection technology is all about patterns. Legacy threat detection after 20 years of practice is pretty sophisticated in detecting binary patterns, but so is the malware in distorting its binary patterns. What malware is not very good at yet is hiding its artifacts. An infected machine will always show artifacts, be that files, registry keys, domain lookups on the network or modifications of hard disk sectors. Repetitive use of these artifacts is what eventually identifies a malware infection on a system and helps map it to known families. In any case, the following artifacts are subject to special attention:

  • Filenames
  • Domain names
  • Registry key names / value names
  • Infiltration methods
  • Persistence methods
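
How reused artifacts give a family away can be sketched in a few lines of Python. The catalogue below is entirely hypothetical (family names, file names, domains and registry keys are invented), and the scoring is just the share of a family's known artifacts that was observed on the machine.

```python
# Hypothetical artifact catalogue; none of these indicators are real.
FAMILY_ARTIFACTS = {
    "family_a": {"filename:svch0st.exe", "domain:update.example.com"},
    "family_b": {"regkey:HKCU\\Software\\Example\\Run", "filename:drv32.sys"},
}


def rank_families(observed):
    """Rank families by the fraction of their known artifacts seen on a host."""
    observed = set(observed)
    scores = {}
    for family, artifacts in FAMILY_ARTIFACTS.items():
        hits = observed & artifacts
        if hits:
            scores[family] = len(hits) / len(artifacts)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Malware that never reuses a file name, domain or registry key between campaigns would leave this kind of lookup with nothing to match.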


The existence of sandboxes is not a secret any more. It might also have reached the public that most analysis-oriented systems, such as reverse engineers’ boxes, rely on virtualized Windows XP. At the time of writing I am not aware of the number of actual production systems running on VMware or Qemu, but considering the trade-off, malware might want to refuse execution on these systems.

There is a long list of evasion techniques focusing on virtual environments, from delaying execution to detection of hypervisors. While I would not say that checking hard disk identifier names is a very smart technique, it is definitely a start. More successful would be to use a technique which doesn’t leave readable string constants in the binary. Or, multiple techniques spread throughout the binary.


Persistence mechanisms will, to a certain extent, always be visible on an infected machine: a registry key, an executable binary, a new service, a modification of the boot sector. For disinfection essentially two things have to be done: neutralize the binaries and deactivate persistence / recovery techniques. The latter is fairly easy, given the analyst sees the respective system modification and manages to remove it. It would be anything but easy if the malware were able to apply a number of different techniques, to hide them well, and even more, to revive destroyed persistence mechanisms.

A pretty cool approach has been shown by recent Backoff variants [3], which manage two copies of the malicious binary on disk and use either the Windows services or a mutex to assure the system remains infected. However, the number of techniques available is large and innovative combinations are possible.


The idea that readable string constants in the binary are not advisable when trying to be stealthy has already reached most malware authors. Interestingly though, the main mitigation method is to pack the binary with some crypter that won’t leave a single byte where it has been before. Yet this implies that as soon as the crypter is bypassed, the plain binary remains easy to analyze. And, seriously, most crypters one can get on the internet are not all that hard to bypass. Suitable methods exist to distort strings and only reconstruct them at runtime, so the static payload does not read like an open book.
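
The simplest form of such string distortion, and what an analyst has to undo, is a single-byte XOR. The sketch below is deliberately trivial and assumes an arbitrary key of my choosing; real samples use per-string keys, stack strings or proper ciphers.

```python
KEY = 0x5A  # arbitrary single-byte key, chosen for illustration only


def encode(plain: str) -> bytes:
    """What ends up stored in the static binary instead of the readable string."""
    return bytes(b ^ KEY for b in plain.encode("ascii"))


def decode(blob: bytes) -> str:
    """What runs right before the string is needed (XOR is its own inverse)."""
    return bytes(b ^ KEY for b in blob).decode("ascii")


STORED = encode("cmd.exe")  # a strings dump would not show "cmd.exe" here
```

The point for detection is that a plain strings dump of the static file no longer shows anything readable, while the decoded value still exists in memory at runtime.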


Right after string constants on the ladder of analysts’ favorite helpers are API calls. It is kind of hard not to understand the significance of CreateFileA, WriteFile and CloseHandle in a row. Malware writers understand this just as well as the string issue: apply a runtime packer and all API magic is hidden. Again, to a certain extent this is bound to fail, as long as you don’t have a packer which is exceptionally hard to crack.

Not very complicated and tremendously effective to defend against static detection are jump tables. Even better, morphing jump tables, where API offsets change place every once in a while. However, there are a number of ways to hinder the disassembler / debugger in showing API names. Choose one. Or many.


The most common architecture of malware is the following: one or more protection layers on top of one or more packer layers on top of the plain payload. This, because usually the entity providing the plain binary and the entity providing packing and protection are not the same. Simple workload distribution ensures that the guy being specialist on packing does his job while the guy being specialist on exfiltrating data can focus on his desired target.

Again I am lacking the empirical data for such a statement, but my guess would be that the number of available crypters or crypter services out there is fairly limited. As most of these operate somewhat similarly, today’s analysts don’t have major problems in cracking literally all of them. Therefore, just like with the strings, keeping anti-analysis and packing / unpacking up throughout all layers raises the analysis pain many times over.


I guess it is tempting to not bother with the command and control server management issue and just use one with a single domain. However, looking at C&C domain names is by far the easiest way to classify malware. Domain generation algorithms weren’t invented yesterday; I remember reading that even dictionary-based DGAs are doable. While their main purpose remains hiding the C&C infrastructure on the web, they also pose a significant headache for analysts. Gibberish domain names make it somewhat hard to map a sample to a given family, which would otherwise allow detection even if the sample is unknown.
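
For readers who have never looked at one, this is roughly what a date-seeded DGA boils down to. It is a toy, not modeled on any particular family: the seed string, the use of MD5 and the reserved `.example` TLD are all assumptions made for the sketch.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 3, tld: str = ".example"):
    """Derive `count` pseudo-random domain names from a seed and a date."""
    domains = []
    for index in range(count):
        material = f"{seed}-{day.isoformat()}-{index}".encode("ascii")
        digest = hashlib.md5(material).hexdigest()
        # Map the first 12 hex digits onto the letters a..p.
        name = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(name + tld)
    return domains
```

The useful property for defenders is determinism: whoever recovers the seed and the algorithm from a sample can compute the same list for any date, e.g. `toy_dga("campaign", date(2014, 9, 28))`, and block or sinkhole the names in advance.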


Dynamic analysis is limited in several ways, such as time or virtual environments, or in the sense that it is not very dynamic in what it actually analyzes. Emulators and sandboxes usually only know one way: execute and see what happens. Simple examples of how this could go wrong are applications which need certain input parameters or DLLs which exhibit their maliciousness only when called on a certain export. And for what it’s worth, human analysts usually work in the same fashion. Parallelized execution is a bitch, and conditional execution also raises analysis efforts significantly. No better way to trick sandboxes than showing malicious behavior every Tuesday at 3 p.m. and acting benign the whole rest of the week. Just sayin’.


Last but not least, remaining in known spheres gives analysts the advantage of a level battlefield. There are hordes of engineers out there who know how to handle a sandbox, interpret API call traces or dig into x86/64 binaries. From my personal, borderline painful experience I can tell that the number of people familiar with virtual machine execution, kernel land code, bootkits or BIOS malware is rather limited.


Looking at the attacks we see today, again only those we can see, they reuse tools and techniques that have been there before. Still, a large part of malware goes undetected every day. Most of it shows one weakness or another that allows detection. The art in detection though is actually to see. Having visibility on the artifacts which indicate an attack or a successful breach is already half the way.

The outlined mistakes malware authors are making are fairly easy to fix. Suitable source code and descriptions of techniques are available on the internet. It is only a question of time until the crooks advance their miscreants to a level where current solutions are no longer able to find and analyze them. Looking into the future is nothing humans are especially good at; however, the points I outline here are not generally unknown or anything innovative. Meanwhile I am sure current security solutions cover some, but surely not all of the mentioned weaknesses. Only new innovations and combined approaches can extend visibility and empower new ways of threat detection.


[1] Home Depot Data Breach Could Be The Largest Yet [NY Times]

[2] Polymorphic Virut Malware [Avira Blog]

[3] New POS Malware Family – Backoff [Trustwave Blog]

Saturday, August 2, 2014

Yes, you can!

long time no write.. i have been busy with starting a lot of different things, finished only some and never sat down to put anything here. today is the day - find below the latest tech stuff i worked on and some thoughts that had to get out.


- Troopers video out

- talk at Area41 Zürich on VB6 runtime analysis together with Jurriaan Bremer, write up here

- talk at SSTIC Rennes about a fun collection of anti-analysis tricks of all sorts, write ups spread all over

- Sazoora anti-analysis close up

- VirusBulletin article on VB6 runtime packers

- a Havex.RAT risk assessment, cause that thing wasn't so sophisticated at all 


lately i had some inspiring discussions about women's mindsets, with women about their mindsets and how some of them think i'm so much of a different creature. while i think thats bogus.

the past years i had a few very frightening and painful experiences; and as much as i want to believe that they are piling up because life is evil, i much rather think its me running towards them. 

to leave that somewhere: conference talks scare the living jesus out of me. thats why i love to do them. my karate trainings used to freak me out, twice a week. thats why i kept going. the first malware i put into a debugger felt worse than sparring, cause i was totally helpless.

as nice as the nevergiveup mantra sounds though, its not necessarily easy to stay hungry for challenges. my driving factor to keep researching is mainly the fun on exploring new things, combined with the cheerful satisfaction when something worked out. so by figuring out, the harder and smarter i try the more things work out the more i have to cheer about, i developed my own mindset towards a thorough 'Yes, you can!'.

the past year i talked to a handful of women in that business we are in. all of them bright and talented professionals, yet many did not blurt their yesican all over the place. instead i heard a lot of 'i have to learn this', 'i'm not sure about this', 'some day i will', 'i am not good at X' or plainly 'I can't.'

oh well.. i've been there too. most work i did the past year, submitting to conferences, joining on projects and such, i did because someone else said 'you can' while i thought i couldn't. and they said i would be able to while i thought 'but...'

but then again any challenge in life is a bit like sparring :) you go in, kick as smart and as hard as you can, and you go out again - in pain and a bit more skilled than before. because there is a point about sparring that most people don't know: fighters do this to sharpen their skills more than anything else. there is a common misperception, that if you are not able to do something today you won't be able to learn it in the future. 

which, i learned, is so plain wrong. 
all you need is some sparring to get the right skills :)

Saturday, March 29, 2014

Yet Another Security Tool @Troopers14


i read 'boutique conference' somewhere recently, and that term actually nails it. occupying the print media complex in heidelberg for an entire week, TROOPERS manages to provide two days of workshops, two more of conference and a day of roundtable discussions as well as various side events like an IPv6 security summit or an SAP security track or a telco sec day or.. did i miss anything? guess so.

there is a soldering station to fiddle with your.. badge, because it comes with an arduino attached. troopers provides food & coffee & mate nearly around the clock. AND, well not that i'm anything like picky about clothing at all, BUT they have conference shirts available for girls.

no one goes to cons just for collecting shirts.. but this industry is moaning about the lack of females, no? so when they finally show up it is really nice when this kind of event actually acknowledges beforehand that this could happen. in other words, giving out shirts only for men sort of implies that there will be no women.

summing it up, great event :]
and here comes what we did there.



as i refused for ~5 months to come up with documentation, i guess now, finally, it is the right time. DiffRay in short is a tool to diff Windows 7 and Windows 8 executables to spot missing security functions in an automated way.

if one can fiddle around with input values for an application, without that application checking for their validity, chances are high one can actually perform creative abuse on memory structures. the inclined reader understands potential impact of memory corruption. Microsoft does too, and thus came up with dedicated libraries that provide input checking functions to make it easy for developers to apply the right security check for a dedicated input value. namely these libraries are intsafe and strsafe. they provide APIs like ULongAdd or StringCchCopy, which do nothing more than checking if a given value stays within expected boundaries (more information on MSDN). 
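
to illustrate what such a check buys you, here is a tiny python sketch of the idea behind a function like ULongAdd. it only mimics the semantics (add two unsigned 32 bit values, refuse to wrap around); it is not the intsafe implementation, and the function names are made up for this example.

```python
ULONG_MAX = 0xFFFFFFFF  # largest value an unsigned 32 bit integer can hold


def unchecked_add(a: int, b: int) -> int:
    """What plain 32 bit arithmetic does: silently wrap around."""
    return (a + b) & ULONG_MAX


def checked_add(a: int, b: int) -> int:
    """The 'safe function' idea: fail loudly instead of wrapping."""
    for value in (a, b):
        if not 0 <= value <= ULONG_MAX:
            raise ValueError("operand is not an unsigned 32 bit value")
    result = a + b
    if result > ULONG_MAX:
        raise OverflowError("sum does not fit into 32 bits")
    return result
```

`unchecked_add(0xFFFFFFFF, 1)` quietly gives 0, which is exactly the kind of value that later ends up as a far too small buffer size. the checked variant refuses instead.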

in the following, these functions are called 'safe functions'. we assume that when in one version of a library such a safe function is applied, while in the same piece of code of another version of that library no safe function is called - something is fishy.

we perform the diffing on Windows libraries and drivers (.dll, .sys) in a very simple way. our approach is to decompile each binary, scan for safe functions and put every hit per API into a database. this way we simply count the hits for a specific safe function in a library function and diff it with the corresponding library function of another Windows version.

finally, if the hit counts differ we have a good chance, that some value in that library function at hand goes unchecked. we consider this a potential vulnerability.
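
the counting and diffing idea fits into a few lines of python. this is a simplified sketch and not DiffRay's actual code: it works on strings instead of a database, the four signatures are just a sample, and the helper names are invented.

```python
import re
from collections import Counter

# a small sample of safe function names; the real list is much longer
SAFE_FUNCTIONS = ["ULongAdd", "StringCchCopy", "StringCbCopy", "StringCbLength"]
CALL_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in SAFE_FUNCTIONS) + r")\s*\("
)


def count_hits(function_body: str) -> Counter:
    """count calls to each safe function inside one decompiled function body."""
    return Counter(match.group(1) for match in CALL_PATTERN.finditer(function_body))


def diff_counts(win7: dict, win8: dict) -> list:
    """return (function, pattern, win8 hits, win7 hits) where the counts differ."""
    rows = []
    for function in sorted(win7.keys() & win8.keys()):
        for name in SAFE_FUNCTIONS:
            hits7, hits8 = win7[function][name], win8[function][name]
            if hits7 != hits8:
                rows.append((function, name, hits8, hits7))
    return rows
```

both arguments of `diff_counts` map a library function name to the Counter produced by `count_hits`. only functions that exist in both versions are compared, so brand new Win8 code does not show up as a difference in this sketch.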


the bare necessities
Python 2.7 32bit
pymssql 1.0.2 32bit for Python 2.7
PyQt 4 32bit 

DiffRay comes with two executables for decompilation of libs and drivers and a python application for parsing the spotted safe function hits into a database and for producing the final diffings. basically what we do is decompile the binaries to .c-files, so yeah we produce some sort of Windows OS source code :) 
you will need IDA Pro and the hexrays decompiler for this step. the .c-files are then parsed to a database. you can choose between sqlite or mssql; i highly recommend mssql. or you implement a DB handler of your choice, this is python!!

next step is the parsing. DiffRay parses either files or whole directories for symbols of safe functions. right now there are 130 symbols, they can be extended by just editing the signatures.conf. also, there is signature mapping if some safe function turns out to be equivalent to another one. we saw this in the past, but didn't yet come up with the right mappings. configuration could be achieved by editing the signature_mapping.conf in the form sig1=sig2, line by line.
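
reading such a mapping file is straightforward. the sketch below assumes nothing beyond the sig1=sig2 line format described above; the function names and the example mapping are hypothetical, not taken from DiffRay.

```python
def parse_mapping(text: str) -> dict:
    """turn 'sig1=sig2' lines into a dict, skipping blank or malformed lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        left, right = line.split("=", 1)
        mapping[left.strip()] = right.strip()
    return mapping


def canonical(signature: str, mapping: dict) -> str:
    """return the equivalent signature if one is configured, else the input."""
    return mapping.get(signature, signature)
```

with a line like `StringCchCopyW=StringCchCopy` in the file, hits for both names would be counted under the same signature before diffing.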

once the parsing is finished (depending on the DB backend that could take a while) DiffRay can start with the diffing. the commandline instructions you need are listed in the slides. basically, diffing can happen via library id or via library name. the lib id way is not very handy when diffing various libraries. thus for automation i recommend using the name option and creating a batch file that feeds DiffRay with the library names and dumps the output to a directory of choice.

attention, for the name option the name should identify the Win7/Win8 versions, without extension! so e.g. kernel32 is fine, when there is a Win7 version and a Win8 version present.

the output then should be a bunch of files, preferably .csv, that contain data like this:

Function_Name               Pattern         Win8  Win7
EQoSDispatchIoctl           StringCbLength  2     0
Ipv4SetEchoRequestCreate    ULongAdd        4     0
Ipv6SetEchoRequestCreate    ULongAdd        4     0
WfpAleAuditEvent            StringCbCopy    2     1
WfpAleCaptureImageFileName  StringCbCopy    1     2

from here on the researcher is on his own. now, get the libraries, open them in IDA and jump to the mentioned functions. good luck ;)


for ease of use we decided we wrap this whole process up in a GUI. its designed in Qt, so very easy to build and modify - you can even change the colors if you don't like my chewing gum style. anyway, all the colors do make sense as all functions are integrated into one window:


configuration dialog. you HAVE to be connected to a database when you start parsing! you can configure credentials for mssql, there is nothing to configure for sqlite (always connects to the same sqlite db). the buttons CREATE DB and FLUSH DB at the moment actually do the same, dropping everything and creating a new db from scratch. via the configuration dialog you can edit signatures, mssql settings, mappings and logging.


decompilation box. click around here to invoke the dll2idb.exe and idb2c.exe that should come with the python project. then watch the IDA Pro window pop up and down as it decompiles :)


parsing box. you need python 2.7 for it to work. for anything to work.. you can either parse a file or an entire directory, make sure its all .c-files and to have the right operating system selection. parsing is done in separate processes, you can start on multiple directories at a time.


diffing box. either on library ids or on library names, as mentioned a name has to hit one Win7 and one Win8 library. sadly, we don't yet have a way to do batch via GUI. will come in the future.


search box. check if a libname actually finds libraries or which lib got which id. or get all diffing info of one particular library.  


in total we decompiled and diffed more than 900 libraries of Win7 and Win8. the slides show some of the results, not all of them though. it is still a lot of work to actually check all the potential vulnerabilities and to evaluate if they are triggerable after all. a lot of false positives arise due to new code parts in Win8, manual checks on the Win7 side that have been replaced with a safe function or different naming of safe functions between both versions. 

apart from that, great fun. if you're a hacker, bored, don't know what to do - start a joint project with someone else. you will learn what they know, what you don't know and the other way round. both have expertise and great ideas, put them together and tadaaa you get twice as much of each.


well there is Windows 8.1 right? besides, there is a lot more we can parse for than symbols. there is actually a lot more we could do with the pseudo source code of windows... and it would actually be a bright idea to switch to IDAPython instead of decompiling stuff.

there has to be some bug fixing, some more logging, some more automation. there could be some machine learning element or integration of symbolic execution, that could add completely different maaaagic.. 

but that would be a different blog post.

Thursday, March 20, 2014

The Mystery of Anti-Debug by HeapAlloc

first of all, a big thank you to my friend moti who actually provided the final hints to solve that mystery and saved me A LOT of time googling heap structures. i wish everyone of you would have a moti when getting stuck in RE questions :)

to the story: meanwhile, in a zeus trojan. last week i peeked into a zeus just to really quickly stumble over an anti-debug trick. SURPRISE! kidding.

that anti-debug is really easy to pass by, but wasn’t that trivial to explain; or at least that is what it seems like because i couldn’t find any suitable documentation. and this while the interwebz is full of malware reverser’s write ups.. kidding again.

so here we go...

in a nutshell: the malware would allocate heap memory and use the header of the heap entry as an indicator for an attached debugger. simple, after all. is that frequently used? i don’t know.

but, lets start at the end. the debugger would crash with an access violation when executing invalid arguments. those invalid arguments were produced by an unpacking routine, and it took some runs for brute forcing with IDA Stealth to find a possible root cause: NtGlobalFlag. well known you might think now, and indeed i had some people smiling at me with a yes my dear, thats not tricky at all. but i went on to at least spot the check for this flag, which as for example Mr. Yason described very nicely indicates an attached debugger. guess what, didn’t find that check. 

in the first illustration you see the call that causes the exception: EnumWindows, that should call into a handler function, which actually is the unpacked code and turned out to be invalid when NtGlobalFlag is set.

Exception happening in EnumWindows handler function
so i held on to that exception and discovered that by turning IDA Stealth on and off the decryption key in the unpacking routine would change. gotcha, challenge accepted. so, investigating that key i eventually ended up in a piece of memory that preceded an allocated heap block. i provided my walkway up the unpacking routine in a screenshot doku, for people who like pictures just as much as me.

in the unpacking loop: bl being used to XOR the future code

up: ebx initialized with key_init

up: modifying key_init with esi

up: tweaking esi

up: grabbing esi initially from allocated_heap-8

root of all evil: HeapAlloc

so i ended up clueless inside of the RtlAllocateHeap function of ntdll.dll. in fact the value that the malware would grab from memory was initialized by RtlAllocateHeap and i admit it took some staring at the memory to accept the fact that it must be part of the header of the allocated heap block. the memory happened to always be allocated at 9A0688h; the value requested in the code therefore was at 9A0680h. i could wonderfully watch the value at this offset turn from 03 to 01 when turning IDA Stealth, specifically the NtGlobalFlag protection, on. and this very value was the cause of the crash.

Value in Question in the Memory Block Header

here is where moti jumped in and pointed me to the _HEAP_ENTRY structure, which assigns names to the values preceding my heap block:

The Memory Block Header

with data_offset-8 one hits exactly the size variable of that structure, which actually contains the size divided by 8 PLUS.. the size of management information - like the header for example, which would explain a +1. or additional debug information, which could explain a +3. overall, the allocated buffer size in that particular case is always 14000h (it is dedicated to contain unpacked code later), and 14000h / 8 = 2800h. the size value in _HEAP_ENTRY therefore is 2803h when a debugger is attached, or 2801h without a debugger or with IDA Stealth activated.

Magic happened - Value changed!
so finalizing, when NtGlobalFlag indicates an attached debugger the heap manager understands this as a higher need for debug information so the allocated heap blocks are slightly bigger than without that flag set. this fact is used by the malware, as it uses the lower byte of the size value for calculation of the unpacking XOR key. 
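
the arithmetic behind those two values can be replayed in python. the sketch assumes sizes are counted in 8 byte units, which is what the observed numbers imply for this 32 bit process, and takes the +1 / +3 overhead straight from the values seen in memory. how the malware mixes the low byte into its final XOR key is not reproduced here.

```python
HEAP_GRANULARITY = 8  # assumed unit of the size field, in bytes


def size_field(requested_bytes: int, overhead_units: int) -> int:
    """size value in the block header: payload plus management overhead, in units."""
    return requested_bytes // HEAP_GRANULARITY + overhead_units


def key_seed(size_value: int) -> int:
    """the malware only looks at the low byte of the size value."""
    return size_value & 0xFF


without_debugger = size_field(0x14000, 1)  # header only
with_debugger = size_field(0x14000, 3)     # header plus debug extras
```

`without_debugger` comes out as 2801h and `with_debugger` as 2803h, so the key seed is 01 on a clean run and 03 under a debugger. a wrong seed means a wrong XOR key, garbage instead of unpacked code, and finally the access violation in the EnumWindows handler.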

for more information on heap structures check out this article which i found very informative
OR ask your own moti!

Tuesday, March 11, 2014

Bright Future Ahead

some weeks ago i was invited to talk at an austrian highschool, to a class of 18 year olds, about.. me. odd right? thing is, it was mostly girls in there and their teacher thought it would be a good idea to present them a -kindof- different perspective on the future. and honestly, before that i didn't think i would ever serve as an example for others. i'm not usually stepping up and taking a stage unless i'm asked to do so. then having all these big eyes on me.. unsettling. but only after the talk did i understand what it was actually about. never was i like 'i am a woman and can do cool stuff'. it was 'i do cool stuff. you should too.' you can figure out later that this is unusual for a girl, if you need to.

it in fact is easy to talk passionately about something you seriously think is cool. so thinking back that was the most fun talk i've given so far. about how i decided to study computer science, or why i started as a malware analyst, opportunities and obstacles, future plans and on how to choose ones dreams carefully.


honestly, it sounded like a cool idea. after studying information security none of the other options looked appealing, thinking back i can't even remember what they were about. so i started on a malware analysis project for my thesis and later got a job at our local anti-virus company.

now, 3 years later, what to say - i'm happy. i'm free and independent, love what i'm doing and got so many possibilities...

that screenshot you can see on slide number three shows IDA Pro, by far the most powerful and also my favorite tool in my analysis lab. i know, at first sight it looks terrible, but wait, could you believe that i had such a good time with IDA Pro inside a number of binaries; way more than i ever had using.. MS Visio? or Adobe InDesign? all a question of perception.

reversing is like building puzzles. a binary is a big black box at first, but i promise as long as you don't give up you will reveal secret after secret and eventually end up with an 'UHH i understand this now'. once you have more practise you will experience more success in a day than most professional artists seem to have all their career long. true story. because every little 'uh i understand this now' feels awesome. reversing looks to me like an art on its own, but a determinable one. and one that, on average, pays better.

a binary can just only work a certain way. even the sophisticated advanced ones are never as complicated as dealing with humans. there is always a solution for any problem.


i think a lot of technological studies have a questionable reputation - because they are perceived the wrong way. economy, philosophy, politics, multi media design and the like are topics that a first world human being experiences every day. processor design, electrical engineering, structured programming or mathematical equations just don't appear, ever, outside of a classroom or a lab. ordinary humans tend to fear the unknown. so why would a youngster, especially a female one who's not even supposed to like tech, out of the blue decide he, or she, wants to understand machine level code?

a similar thought on talent. we are successful in things that we are good at. we are good at things that we practise. i believe in practise, more than in natural talent. but we all practise a lot what we like, and we tend to like things that we are good at. which, if you think about it, is a circle of like - practise - like more - be genius - practise more.

so concluding, what do you think you're good at - and are you sure there's nothing else? i did my own case study on that theory, unintentionally.  when i was 17 i did my driving license exam, and i was terrible in parking cars. i somehow made it through that exam and decided i would just never park anything again unless it was unavoidable. then i went on to university, public transport was sparse, but so were the parking spaces around the campus. and every year it seemed there were more and more cars and parking lots would  become smaller and smaller. so my situation was clear, park that small car of mine into ridiculous corners or walk a long distance to the campus. finally, i rather learned to park than to walk...

beautiful end, after 5 years of ridiculous daily parking i figured i could fit my renault into any space that was just an inch bigger than the renault itself.

so again, do you think there is something you are not good at, and are you sure you don't want to change that? loosely quoting einstein, saying something is too complicated just means you don't understand it well enough. plus, back to the ladies. driving cars is not a male talent; i'm SURE they just practised harder. i wonder how many females actually did get an electric toy car at age 3. like my older brother did.


now finally, let me get back to the binaries. why become a malware analyst, even if you still don't like binaries? the answer is actually easy. more jobs, more money, faster career, more freedom. and if you have money and freedom, you are actually more likely to get what you want after all. go figure.

apart from that, you will have fun, trust me. you will receive ridiculous appreciation for doing something that others, even men, are afraid of because they just don't understand it well enough. you will very often stand out and be better than others, because reverse engineering, like most engineering fields, just doesn't have as much competition going on as.. marketing. therefore, you will experience less discrimination than in competitive fields. you will meet a number of very bright and interesting personalities, which is among the most beautiful aspects of this job. you will face an incredible diversity of people and tasks and lots of never-ending challenges that reshape your own personality.


so now, if your fingers are already burning, i added some links in the slide set above where you will find homework. if you're still scared, contact me and i will help.

but if you're still not sure what to dream about - as long as you define your success by your own achievements you should be fine. when looking back and finding there is nothing to regret, you did something right.

closing this post i want to quote a card my brother (!) has lying around in his car (!).

Saturday, February 22, 2014

Dissection Is My Hobby: Upatre Insights

i found the biggest problem i face with actually all my projects is not necessarily that i lack the idea of what i want to do, but that i lack documentation on how to do it and then have to go and figure it out myself. wouldn't it be nice if someone had just put up a page like.. malware reverser's frequently asked questions? arrrr well, never happened.

now i'm not going to start a FAQ page, but in order to help that situation i produced detailed documentation of my latest reversing project.

if you read this because you want to know about malware you will probably be a bit disappointed. mainly because the purpose of that malware itself is not so exciting at all. it just downloads.. stuff. but also because i myself focused not directly on the final executable but on the stony path it takes to get there. it is just awkwardly fascinating to watch malware shift bytes around in memory while trying to escape. i would recommend everyone even slightly interested to try it out themselves; the corresponding sample hashes for identification are listed in the write-up doc.

for the record, the malware is detected as TrojanDownloader.Win32.Upatre. the full report is available for download; i will try to give a summary here. could get dirty though.


the analyzed sample is a malicious downloader with the sole purpose of connecting to a remote C&C when invoked and downloading and executing additional malware. it communicates via HTTPS with one of two hardcoded domains, which are believed to be legitimate websites on compromised web servers. malware execution can be divided into a protection layer, an unpacking layer with different stages and the final payload. for the initial infection the malware just copies its own image to the system's %TEMP% directory and executes that copy.


the malware possesses a neat collection of anti-analysis tricks, none of them highly-sophisticated but very nice for learning purposes. 

the first one is an anti-simulation trick targeting anti-virus simulation engines through the use of a multimedia API, as seen in the picture. acmMetrics is an API call present in the msacm32.dll library. usually it is used for retrieving metrics for ACM (Audio Compression Manager) objects. during the startup procedure of a malware sample it is highly unlikely that this was the intention when placing that call. acmMetrics has been part of the multimedia library since at least Microsoft Windows 2000 (according to Microsoft documentation) and in this special case is called to trick AV simulation engines.

in our case acmMetrics is expected to return an error code for an invalid handle, which is not surprising given that the handle parameter is not initialized beforehand. if the return value is not MMSYSERR_INVALHANDLE, code 5, execution continues with an access to the memory referenced by edx, which at this point always results in a memory access violation, because edx is not initialized and thus still zero.

the point of this check is that on a normal operating system, Windows 2000 or newer, this function returns 5 in any case. simulation engines usually don't support media APIs due to the overhead, and therefore either crash on the call or later on the access violation.
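
to make the logic of the check explicit, here is a minimal python sketch. it does not call the real API: `real_system_acm_metrics` and `naive_emulator_acm_metrics` are hypothetical stand-ins, the emulator's return value of 0 is an assumption, and the access through the zeroed edx is modelled as an exception.

```python
INVALID_HANDLE_ERROR = 5  # error code the sample expects for the bogus handle


class AccessViolation(Exception):
    """models the crash caused by reading through edx == 0."""


def real_system_acm_metrics(handle):
    # hypothetical stand-in: a real system rejects the uninitialized handle
    return INVALID_HANDLE_ERROR


def naive_emulator_acm_metrics(handle):
    # hypothetical stand-in: an emulator that stubs unknown APIs with 0
    return 0


def protection_check(acm_metrics):
    result = acm_metrics(None)  # the handle is never initialized
    if result != INVALID_HANDLE_ERROR:
        # the sample dereferences the zeroed edx here, which always faults
        raise AccessViolation("read from address 0")
    return "continue"


print(protection_check(real_system_acm_metrics))
```

an emulator that aborts on the unsupported call never even reaches the comparison, so it loses either way.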

implicit breakpoint detection

the protection layer decrypts a small part of its own code, which results in implicit breakpoint detection. the decryption consists of subtracting a key from every byte of a given section. the simple decryption routine iterates over the code starting at position 40100Fh, where execution continues later on. if a software breakpoint is placed in the section to be decrypted, the routine subtracts the key from the breakpoint byte as well, produces invalid opcodes and the malware crashes later on.
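
the effect is easy to reproduce on a toy buffer. the following is a minimal python sketch, assuming a single-byte key; the key value `0x21` and the five code bytes are made up for illustration and are not taken from the sample.

```python
KEY = 0x21   # hypothetical key, not the one used by the sample
INT3 = 0xCC  # byte a debugger writes for an x86 software breakpoint


def encrypt(code, key):
    # packer side: add the key to every byte
    return bytes((b + key) & 0xFF for b in code)


def decrypt(code, key):
    # malware side: subtract the key from every byte
    return bytes((b - key) & 0xFF for b in code)


# push ebp; mov ebp, esp; nop; ret
original = bytes([0x55, 0x8B, 0xEC, 0x90, 0xC3])
packed = encrypt(original, KEY)

# undisturbed run: decryption restores the original code
clean = decrypt(packed, KEY)

# analyst sets a breakpoint on the still-encrypted nop at offset 3
patched = bytearray(packed)
patched[3] = INT3
broken = decrypt(bytes(patched), KEY)

print(clean.hex())   # 558bec90c3
print(broken.hex())  # 558becabc3
```

the breakpoint byte gets "decrypted" along with everything else, so the nop at offset 3 turns into 0xAB and the code that runs later is no longer the code the author wrote.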

window confusion

At the end of what could be classified as stage one of the protection layer, the malware invokes CreateWindowExA with a provided WndClass structure. This structure defines the handler function of the dummy window, which will execute the second part of the protection layer. The created window has no graphical representation, so it can't be seen and only serves to execute said handler function. If the analyst does not recognize the switch of execution to the handler function and place breakpoints accordingly, control of the debugger will be lost.

broken timing defence

interesting in the next section of the protection layer is an rdtsc-triggered timing defense. malware can utilize the system time to verify whether a debugger, including a human analyst, is attached to the running process. there are various mechanisms to request the time; most commonly used are the rdtsc instruction or the GetTickCount API call. to detect an attached debugger/human, malware wants to know the time difference between two time stamps, namely whether the delta is bigger than it would be if the CPU executed without interruption.

the malware at hand issues two rdtsc instructions, wrapped around the decryption loop. the delta is calculated immediately afterwards, but never checked against any threshold. instead it is kept in eax until the next system call overwrites it with its return value. no other verification could be found, so this anti-debugging trick is either broken or the first timestamp serves a different purpose that could not be identified.
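
for comparison, this is roughly what a complete timing check would look like. a minimal python sketch, not the sample's code: python has no rdtsc, so `time.perf_counter_ns` stands in for it, the threshold is an arbitrary assumption, and `fake_clock` is a hypothetical helper that replays fixed timestamps so the outcome is reproducible.

```python
import time

THRESHOLD_NS = 50_000_000  # hypothetical limit, 50 ms


def measure(work, clock=time.perf_counter_ns):
    # two timestamps wrapped around the protected code
    start = clock()
    work()
    end = clock()
    return end - start


def debugger_suspected(delta, threshold=THRESHOLD_NS):
    # the comparison the sample never performs
    return delta > threshold


def fake_clock(values):
    # hypothetical helper: returns the given timestamps one by one
    it = iter(values)
    return lambda: next(it)


def decryption_loop():
    pass  # placeholder for the code between the two timestamps


free_run = measure(decryption_loop, clock=fake_clock([1_000, 2_000]))
single_stepped = measure(decryption_loop, clock=fake_clock([1_000, 5_000_000_000]))

print(debugger_suspected(free_run))        # False
print(debugger_suspected(single_stepped))  # True
```

the sample stops after the subtraction in `measure`; without something like `debugger_suspected` the delta has no effect on execution.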

more multimedia disturbance

the windows media library is used a second time as a means of protection from analysis. the malware issues a call to mciSendStringA with the command “set waveaudio door open”. it is not perfectly clear what purpose the command “set waveaudio door open” usually fulfills, but without doubt the aim of the malware at hand is not to interfere with multimedia devices. an effect of mciSendStringA is that it starts up two additional threads for interaction with devices, so the analyst could lose control of the debugger if it is inappropriately configured. a solution is to configure the debugger to stop on the start-up of a new thread, step back to the original code and continue execution until it returns to the malware code.


after bypassing all the protection mechanisms the unpacker executes without problems. unpacking can be divided into three steps:
  • one that compresses and decrypts the packed payload data
  • a second one that decompresses the same data using RtlDecompressBuffer
  • a third one which performs checks on the unpacked binary, patches function call offsets and reconstructs the import address table (IAT).
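
the overall shape of such a staged unpacker can be sketched in a few lines of python. this is a toy under loud assumptions, not the sample's routine: stage one is reduced to a plain single-byte decryption with a made-up key, `zlib` stands in for RtlDecompressBuffer (which python's standard library does not offer), the check is limited to the "MZ" magic, and offset patching and IAT reconstruction are left out entirely.

```python
import zlib

KEY = 0x5A  # hypothetical key, for illustration only


def decrypt(blob, key):
    # stage one (simplified): subtract the key from every byte
    return bytes((b - key) & 0xFF for b in blob)


def decompress(blob):
    # stage two: zlib is only a stand-in for RtlDecompressBuffer
    return zlib.decompress(blob)


def looks_like_pe(image):
    # stage three (simplified): windows executables start with "MZ"
    return image[:2] == b"MZ"


def unpack(packed, key):
    image = decompress(decrypt(packed, key))
    if not looks_like_pe(image):
        raise ValueError("unpacked data is not an executable image")
    return image


# build a toy packed payload: compress first, then encrypt
payload = b"MZ" + bytes(62)
packed = bytes((b + KEY) & 0xFF for b in zlib.compress(payload))

print(unpack(packed, KEY) == payload)  # True
```
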
details on the unpacking routine, including the IAT reconstruction, can be found in the report mentioned above. any criticism of that analysis is welcome :) ou.. how did that sample actually catch my attention? it was part of a mean malware spam wave that has been ongoing in austria since at least november. samples are all the same size and have very similar protection/unpacking mechanisms. so some future research would be to look at the other ~100 related malware samples that i have and correlate similarities on binary level. maybe..

future shines bright, you know?