Malware reuse - CPU emulation for malware analysis

In any RE project, defining a precise goal is one of the most important steps; without it the project will most likely fail. And I’m not even talking about how time-consuming it can be: I can often find the rabbit hole without knowing how deep it goes. The context for this whole post is RE of malicious software, and I’d like to talk a little about how to put some of the work on the shoulders of the malware itself while still concentrating on the main goals. I also want to point out that I’ll be using malicious code stripped of all 3rd party packers and crypters, that is, the code the malware author wrote in the first place.

What am I after?

In my opinion, malware authors are well aware of the techniques used to analyze their “fruits”, and they also understand that they can’t prevent the analysis. BUT they definitely can make the process harder. Of course, “harder” is a matter of experience and tools. That “harder” part is what I’d like to turn back on the malware and use against it. So here are some examples of techniques that hinder analysis:

  • String obfuscation

It mostly hurts static analysis. IMHO dynamic analysis suffers less: the real string is revealed pretty easily just by stepping through the de-obfuscation function.

  • API hiding

This complicates understanding of the flow and definitely makes both static and dynamic analysis harder. In the following figure, the imports say nothing meaningful about what the software is doing. Of course one can argue that everything will be imported and resolved at runtime, and that’s true, but what matters is which technique is used for that. I’ll explain this later on.

There are of course many other ways to make a malware analyst’s work challenging, like code packing and encryption or anti-debugging, but I’d like to concentrate on the two above: APIs and strings are important enough during analysis, and they make a good example for demonstrating the whole idea. In addition, the technique explained here can also be used to solve the other problems mentioned.
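To picture what string obfuscation means in practice, here is a minimal sketch. It assumes a single-byte XOR scheme with a made-up key and a made-up string; it is not the algorithm of the sample discussed in this post.

```python
# Hypothetical illustration of string obfuscation: single-byte XOR.
# KEY and the string are invented; real samples use their own schemes.

KEY = 0x5A  # assumed key, for illustration only


def obfuscate(plain, key=KEY):
    """Build-time step: only the scrambled bytes end up in the binary."""
    return bytes(b ^ key for b in plain)


def deobfuscate(blob, key=KEY):
    """Runtime step: the malware restores the string right before using it."""
    return bytes(b ^ key for b in blob)


stored = obfuscate(b"example.invalid/gate.php")
print(stored)               # roughly what static analysis gets to see
print(deobfuscate(stored))  # what a debugger sees after the call returns
```

Static analysis only ever sees `stored`, while anyone stepping through `deobfuscate` gets the plain text for free.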

Malware is my partner

Let’s say that in this particular example I was interested in the network part. The problem was that all APIs and strings were hidden, which makes analysis not so pleasant, so let’s pull back the curtain. I’m assuming that if the strings are obfuscated then there must be a routine that de-obfuscates them, meaning that it will probably be called relatively often. Using the following script I got the 10 most referenced functions, hoping to find what I need there:

from idautils import *
from idaapi import *
from idc import *

def NumOfRefs(address):
    # Count the code cross-references (calls and jumps) that target this address
    res = 0
    for ref_ea in CodeRefsTo(address, 0):
        res = res + 1
    return res

ea = BeginEA()
func_stats = dict()

# Only look at functions in the segment that holds the entry point
for funcea in Functions(SegStart(ea), SegEnd(ea)):
    functionName = GetFunctionName(funcea)
    # 'sub_' marks functions that still carry IDA's auto-generated name
    if 'sub_' in functionName:
        func_stats[functionName] = NumOfRefs(funcea)

sorted_func_stats = sorted(func_stats.items(), key=lambda x: x[1], reverse=True)

# Print at most 10 functions that are referenced more than once
printed = 0
for f in sorted_func_stats:
    if f[1] > 1:
        print("%s called %d times" % (f[0], f[1]))
        printed = printed + 1
        if printed >= 10:
            break
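
The ranking logic itself does not depend on IDA and can be tried on its own. The sketch below assumes the call edges have already been collected into a list of (caller, callee) pairs; the function names and edges are invented for illustration.

```python
from collections import Counter

# Hypothetical call edges (caller, callee). Inside IDA these would come
# from CodeRefsTo(); the names below are made up.
call_edges = [
    ("sub_401000", "sub_402F10"),
    ("sub_401200", "sub_402F10"),
    ("sub_401400", "sub_402F10"),
    ("sub_401000", "sub_401500"),
    ("sub_401200", "sub_401500"),
    ("sub_401400", "sub_403000"),
]


def most_referenced(edges, limit=10, min_refs=2):
    """Return up to `limit` (callee, count) pairs, most referenced first."""
    counts = Counter(callee for _caller, callee in edges)
    return [(name, n) for name, n in counts.most_common(limit) if n >= min_refs]


for name, n in most_referenced(call_edges):
    print("%s called %d times" % (name, n))
```

A function that is referenced only once is dropped, which mirrors the `f[1] > 1` filter in the IDA script.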

With a little help from dynamic analysis, I found that the following code was the actual de-obfuscator. It was called several hundred times throughout the code and it looked relatively complex.

Another interesting finding from the call statistics was the technique used to hide API calls. For every function name, a hash value was pre-calculated and hardcoded during development. At runtime, the export table of the specified DLL is parsed and each API name is hashed until the needed function is found.
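The lookup described above can be sketched in a few lines. Note the assumptions: the ROR-13 additive hash used here is a scheme commonly seen in shellcode and stands in for the sample's own algorithm, which is not shown in this post, and the export list is a made-up subset.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK32


def ror13_hash(name):
    """Stand-in hash (ROR-13 additive); NOT the sample's real algorithm."""
    h = 0
    for byte in bytearray(name):
        h = ror32(h, 13)
        h = (h + byte) & MASK32
    return h


def resolve_by_hash(export_names, wanted_hash):
    """Walk the export names and return the first one whose hash matches."""
    for name in export_names:
        if ror13_hash(name) == wanted_hash:
            return name
    return None


# Hypothetical subset of a DLL's export table.
EXPORTS = [b"CreateFileA", b"HeapAlloc", b"HeapSize", b"connect"]

# The constant the author would hardcode at development time.
WANTED = ror13_hash(b"HeapSize")

print(resolve_by_hash(EXPORTS, WANTED))
```

Only `WANTED` needs to be stored in the binary, so the API name never appears in it as a string.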

Now that the obfuscation techniques are known, I want to be able to reveal all the hidden information so I can continue with the network-related analysis (the initial goal), which I prefer to do statically most of the time. To sum things up, I needed some sort of automatic way to:

  • get all the strings and APIs
  • avoid reverse engineering the obfuscation algorithm at all, as it could change in the future
  • do everything in a clean environment and not in a sandbox

Can it be done? Yes, it can, by means of CPU emulation. Below I’ll show how to combine CPU emulation with IDA Pro to solve the above problems.

Some words about CPU Emulation

CPU emulation handles the task of translating CPU behavior into equivalent logical and memory computations. In other words, we get a “CPU” and its interactions with memory implemented completely in software! When x86 instructions are executed, nothing leaves the emulator to damage the host machine. Personally, I find it really fascinating and useful. I’ve found and tested several emulation frameworks:

  • PyEMU – a simple and easy-to-use x86 emulation framework written completely in Python by Cody Pierce. It uses libdasm for its disassembling needs and PEfile to manage PE files. The framework is pretty universal, as it can be used to write standalone scripts as well as scripts executed in the context of IDAPython.
  • IDA-x86emu – an x86 emulation plugin for IDA written in C by Chris Eagle.
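
To make the idea of a CPU implemented in software concrete, here is a toy sketch. It is not how PyEMU or IDA-x86emu work internally: there is no instruction decoding, the "program" is a list of already decoded tuples, and the class, mnemonics and addresses are invented for illustration.

```python
MASK32 = 0xFFFFFFFF


class ToyCPU(object):
    """Registers and memory are plain Python objects; nothing touches the host."""

    def __init__(self):
        self.regs = {"EAX": 0, "EBX": 0, "ECX": 0}
        self.memory = {}  # address -> byte value

    def _value(self, operand):
        # An operand is either a register name or an immediate integer.
        return self.regs[operand] if isinstance(operand, str) else operand

    def execute(self, program):
        for mnemonic, dst, src in program:
            if mnemonic == "mov":
                self.regs[dst] = self._value(src) & MASK32
            elif mnemonic == "xor":
                self.regs[dst] = (self.regs[dst] ^ self._value(src)) & MASK32
            elif mnemonic == "add":
                self.regs[dst] = (self.regs[dst] + self._value(src)) & MASK32
            elif mnemonic == "store8":
                # Write the low byte of src to the address held in register dst.
                self.memory[self.regs[dst]] = self._value(src) & 0xFF
            else:
                raise NotImplementedError(mnemonic)


cpu = ToyCPU()
cpu.execute([
    ("mov", "EAX", 0x3F),         # one obfuscated byte
    ("xor", "EAX", 0x5A),         # de-obfuscate it
    ("mov", "EBX", 0x17000000),   # address of an output buffer
    ("store8", "EBX", "EAX"),     # the write lands in a Python dict
])
print(hex(cpu.regs["EAX"]), cpu.memory)
```

The "memory write" ends up in a dictionary, which is the whole point: the emulated code can do whatever it wants and the host never notices.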

I decided to use PyEMU as it was relatively easy for me to implement all the logic I needed. It’s not complete in terms of the instructions it is able to emulate, and I needed to add a few myself, but that was very easy. For example, one of the already implemented instructions is XOR:

        #30 /r XOR op1value, op2value
        if instruction.opcode == 0x30:
    
            if op1.type == pydasm.OPERAND_TYPE_REGISTER:
                op1value = self.get_register(op1.reg, size)
                op2value = self.get_register(op2.reg, size)
    
                # Do logic
                result = op1value ^ op2value
                self.set_flags("LOGIC", op1value, op2value, result, size)
                self.set_register(op1.reg, result, size)
    
            elif op1.type == pydasm.OPERAND_TYPE_MEMORY:
                op1value = self.get_memory_address(instruction, 1, size)
                op2value = self.get_register(op2.reg, size)
    
                # Do logic
                op1valuederef = self.get_memory(op1value, size)
                result = op1valuederef ^ op2value
                self.set_flags("LOGIC", op1valuederef, op2value, result, size)
                self.set_memory(op1value, result, size)

Now, let’s get back.

Emulating malicious code in IDA Pro – Results

OK, I’ll start by defining 3 objectives that, from my experience, must be fulfilled to successfully emulate code. I arrived at them only after I’d finished the work, so if anyone has done something like this too and has something to add, be my guest. So, here they are:

  1. Identify the code to emulate – this can be accomplished with IDA’s help using the Xref feature.
  2. Identify the I/O of the emulated code – the interface of the code; this is essential.
  3. Identify the OS APIs that are used inside the emulated code – there is no reason to do the OS’s work.

Resolving APIs

As I’ve already mentioned, API resolving is based on comparing the results of a hashing function (calculate_hash) applied to OS API names against hardcoded hashes. So to get all the APIs used by the malware, I needed to calculate the hashes of a list of potential API names; then, by comparing the calculated results with the hardcoded values, I can label every hash with its name. Following the 3 step recipe:

  • calculate_hash, which calculates a hash from an API name, is the code that will be executed in the emulator.
  • the input is the exports of system libraries, so potential API names can be extracted using PEfile (see the Appendix section).
  • calculate_hash did not contain any references to OS APIs – good for me :)

Once all the hashes are calculated, all the functions that use hardcoded hash values can be tagged and renamed. The following figure demonstrates how every API is “implemented”.

I also used a small script to get all the hashes from the code and finally by comparing my calculations with extracted hashes, I renamed all the functions containing a hash value to an appropriate name.
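The matching step is simple enough to sketch without IDA. Everything in the example is hypothetical: the hash values, the function addresses and the "api_" prefix are placeholders, and the real hash table would be the one filled by the emulator.

```python
def plan_renames(hash_to_api, functions_with_hashes):
    """Match extracted hashes against the computed table.

    Returns (renames, unknown): renames maps a function address to its new
    name, unknown lists the (address, hash) pairs that did not match.
    """
    renames = {}
    unknown = []
    for func_ea, hash_value in functions_with_hashes:
        api_name = hash_to_api.get(hash_value)
        if api_name is None:
            unknown.append((func_ea, hash_value))
        else:
            # The "api_" prefix is an arbitrary naming choice for this sketch.
            renames[func_ea] = "api_" + api_name
    return renames, unknown


# Placeholder values standing in for the emulator's output ...
hash_to_api = {0x1111AAAA: "HeapSize", 0x2222BBBB: "connect"}
# ... and for the (function address, hardcoded hash) pairs found in the code.
extracted = [(0x401000, 0x2222BBBB), (0x401020, 0x1111AAAA), (0x401040, 0x3333CCCC)]

renames, unknown = plan_renames(hash_to_api, extracted)
for ea in sorted(renames):
    print("0x%X -> %s" % (ea, renames[ea]))
print("unmatched: %d" % len(unknown))
```

Inside IDA, each entry of `renames` could then be applied with a call such as `MakeName(ea, name)`. Unmatched hashes are worth keeping, as they usually mean a DLL is missing from the list of libraries.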

De-obfuscating strings

Let’s update the figure of the string de-obfuscation routine with the now-known APIs:

As you can see, there are OS APIs present which the emulation must take care of. PyEMU has a callback-based facility that can notify you when a certain API is called. So again, following the 3 step recipe:

  • DecryptString is the piece of code to feed to the emulator.
  • the input is an obfuscated, hardcoded struct, which was identified by analyzing the surroundings of the calls to DecryptString and also from dynamic analysis.
  • the OS APIs are memory management routines, for which I’ve implemented several callback routines in Python. On every call to an OS API in the emulated code, the corresponding callback is called. Here is an example of the implementation of the HeapSize callback and its registration.
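  ...

  #
  # callback for returning results to caller of HeapSize
  #
  def heapsize(library, address, dll):
      # stdcall frame on entry: [ESP] return address, [ESP+4] hHeap,
      # [ESP+8] dwFlags, [ESP+0xc] lpMem
      block = emu.get_memory(emu.get_register("ESP") + 0xc)
      # allocs comes from the elided code; it is used here to look up the block's size
      emu.set_register("EAX", allocs[block])
      return_address = emu.get_memory(emu.get_register("ESP"))
      # pop the return address and the 3 arguments (4 * 4 bytes), as a stdcall callee does
      emu.set_register("ESP", emu.get_register("ESP") + 16)
      emu.set_register("EIP", return_address)
      return True

  #
  # registering HeapSize callback
  #
  emu.set_library_handler("HeapSize", heapsize)
  emu.os.add_library("kernel32", "HeapSize")
  # point the malware's own HeapSize pointer at the emulator's stub address
  emu.set_memory(address_of_HeapSize_pointer, emu.os.get_library_address("HeapSize"))

  ...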
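
The stack arithmetic in such a callback is easy to get wrong, so it helps to check it in isolation. The sketch below does not use PyEMU at all: `FakeMachine`, `stdcall_return` and all addresses are invented, and the only thing it models is what a stdcall function leaves behind when it returns.

```python
class FakeMachine(object):
    """Stand-in for an emulator: a register dict and a dword-addressed stack."""

    def __init__(self, esp, stack_dwords):
        self.regs = {"ESP": esp, "EIP": 0, "EAX": 0}
        self.memory = dict((esp + 4 * i, v) for i, v in enumerate(stack_dwords))


def stdcall_return(machine, result, arg_count):
    """Emulate 'ret 4*arg_count' with the result placed in EAX."""
    esp = machine.regs["ESP"]
    return_address = machine.memory[esp]
    machine.regs["EAX"] = result
    # Pop the return address (4 bytes) plus every argument (4 bytes each).
    machine.regs["ESP"] = esp + 4 + 4 * arg_count
    machine.regs["EIP"] = return_address


# HeapSize(hHeap, dwFlags, lpMem): the return address sits at [ESP], lpMem at [ESP+0xC].
allocs = {0x17000040: 64}  # hypothetical bookkeeping: block address -> size
m = FakeMachine(0x0012FF00, [0x00401234, 0x00150000, 0, 0x17000040])

block = m.memory[m.regs["ESP"] + 0xC]
stdcall_return(m, allocs[block], arg_count=3)
print(hex(m.regs["EAX"]), hex(m.regs["ESP"]), hex(m.regs["EIP"]))
```

With three arguments the stack pointer moves by 16 bytes, which is where the `+ 16` in the HeapSize callback comes from.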

All this information is sufficient to get what I need. Every de-obfuscated string is added as a repeatable comment next to its obfuscated counterpart; this is how it looks in the end.

What about the network?

Before the conclusion, I just want to show that once all the strings were revealed, I found several parts connected to network operation, which is what I needed in the first place. This part of the code is responsible for downloading plugins from a remote CnC.

The above work did not solve the initial problem, but it certainly made the work more pleasant, saved time and helped to concentrate the effort in the right place.

Conclusion

Pros

  • Malicious code is not actually running on a real machine
  • Full control over the execution environment
  • The problem is reduced to a black box
    • understand the inputs
    • grab the results
  • Potentially saves time
  • Developed code can be easily re-used

Cons

  • Up-front development is needed to prepare the environment
    • write handlers for OS APIs
    • partial understanding of the emulated code is still needed to follow the 3 step recipe
  • Slow execution
  • Not suitable for all problems
  • Little or no support for the frameworks

Final

  • Those tools are just cool
  • It’s not an ultimate solution but a helping hand
  • There are a lot of tools that can be very powerful when combined
  • Some preliminary research must be done to use emulation
  • Could be of help when building automatic analysis tools
  • Fully controlled execution environment

Appendix

Extracting APIs

The following code goes over every exported API name in the listed DLLs and executes calculate_hash in the emulator according to the following interface:

hash_result = calculate_hash(char *API_name)

import pefile
from idc import *
from PyEmu import IDAPyEmu

# Address of the hashing routine; assumes it has been renamed calculate_hash in IDA
ref_ea = LocByName("calculate_hash")
HEAP = 0x17000000    # scratch address used to pass the API name to the routine
emu = IDAPyEmu()
apis = dict()        # hash value -> API name

print("[*] Loading PE image into memory")
seg_addr = FirstSeg()

# Copy every segment of the IDB, byte by byte, into the emulator's memory
while seg_addr != BADADDR:
    for x in range(GetSegmentAttr(seg_addr, SEGATTR_END) - seg_addr):
        emu.set_memory(seg_addr + x, GetOriginalByte(seg_addr + x), size=1)
    seg_addr = NextSeg(seg_addr)
print("[*] PE image loading is done.")

libs = ('c:\\windows\\system32\\ws2_32.dll',
        'c:\\windows\\system32\\kernel32.dll',
        'c:\\windows\\system32\\user32.dll',
        'c:\\windows\\system32\\wininet.dll',
        'c:\\windows\\system32\\advapi32.dll',
        'c:\\windows\\system32\\gdi32.dll',
        'c:\\windows\\system32\\avifil32.dll',
        'c:\\windows\\system32\\msvcrt.dll',
        'c:\\windows\\system32\\Imagehlp.dll',
        'c:\\windows\\system32\\psapi.dll',
        'c:\\windows\\system32\\ntdll.dll')

for lib in libs:
    pe = pefile.PE(lib)
    print("[!] Importing Library %s" % lib)
    for imp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
        # Skip exports that have only an ordinal and no name
        if imp.name is None:
            continue
        # Zero the buffer first so the name ends up NUL-terminated
        for x in range(len(imp.name) + 2):
            emu.set_memory(HEAP + x, 0, size=1)
        # The name is written reversed and its address is passed in EAX
        emu.set_memory(HEAP, imp.name[::-1])
        emu.set_register("EAX", HEAP)
        emu.execute(start=ref_ea, end=(ref_ea + 9))
        # The hash is read back from EAX
        res = emu.get_register("EAX")
        apis[res] = imp.name

Once the hashes are calculated, all that is left is to rename each function that encloses a hash with the appropriate API name.
