Feb 05

Lutz Roeder’s Reflector is an excellent free program that lets you take a .NET assembly and decompile it into completely human-readable source code. It’s so simple a chimp could do it. It’s so scarily good that you can almost take the decompiled code, stick it in Visual Studio, hit F5 and have it run as if you’d just built it yourself. Great if you’re a hobbyist wanting to learn how an awesome app was built, and also good if you’re an evil corporate hacker wishing to rip off the hard work of a smart development team.

We’re about to release a client–server app where the client is freely downloadable, but useless without authentication at the server. The client is relatively thin, but still contains a lot of hard, hand-crafted code which will sit on the client machine. We started to get a little paranoid about the possibility of dirty haxxors decompiling our work and using it in competing products, so we turned to the black art of obfuscation.

Obfuscation is the process of converting your nice, tidy, readable code into nonsense gibberish. For .NET, tools like Dotfuscator do this to the compiled assembly rather than to your source files. The program still runs as intended, but anyone decompiling your assemblies and executables won’t easily be able to figure out what the hell is going on in your code.

There are a few different techniques an obfuscator uses to do this, the most basic and common being to rename your methods, variables and parameters to short strings like a, a0, aa1. A guy I shared a flat with at university wrote code like this on purpose to try to ensure the university didn’t exercise its rights on IP to take his self-proclaimed “five million pound ideas” and turn a profit themselves. Unfortunately, after taking a couple of weeks off from coding he couldn’t follow the code himself and eventually ended up flunking his CS course with a 2:2.

I used Dotfuscator CE, which comes bundled with Visual Studio, to obfuscate a small sample project and compared the results in Reflector:

Figure: Clean, unobfuscated code (left) and code obfuscated using Dotfuscator (right).

Note how the class and method names are shortened to meaningless strings, and how the obfuscator has taken advantage of overloading to give methods with differing signatures the same name. Shorter names also mean smaller assemblies, with Dotfuscator reporting an average 30% reduction in size.
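To make the overloading trick concrete, here’s a toy sketch in Python. It is not Dotfuscator’s actual algorithm, just my guess at the general idea: hand out the shortest name that isn’t already taken by a method with the same parameter list. The method names and the `rename_methods` helper are made up for illustration.

```python
from itertools import count, product
from string import ascii_lowercase


def short_names():
    """Yield a, b, ..., z, aa, ab, ... indefinitely."""
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def rename_methods(methods):
    """Map (name, param_types) pairs to short names.

    Two methods may share a short name as long as their parameter
    lists differ, because overload resolution can still tell them apart.
    """
    used = {}      # short name -> set of parameter lists already using it
    mapping = {}
    for name, params in methods:
        for candidate in short_names():
            taken = used.setdefault(candidate, set())
            if params not in taken:
                taken.add(params)
                mapping[(name, params)] = candidate
                break
    return mapping


# Hypothetical methods from one class: (original name, parameter types).
methods = [
    ("LoadCustomer", ("int",)),
    ("SaveCustomer", ("Customer",)),
    ("DeleteCustomer", ("int",)),
    ("Validate", ()),
]
renamed = rename_methods(methods)
for (name, params), short in renamed.items():
    print(f"{name}({', '.join(params)}) -> {short}")
```

Three of the four methods end up called `a`, and only `DeleteCustomer` needs a second name because it clashes with `LoadCustomer` on its parameter list. That is why the decompiled output is so hard to follow.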

Nobody wants to have their hard work stolen, so why doesn’t everyone obfuscate their code as a matter of course? Why isn’t the output from your build automatically obfuscated?

The answer is that it’s bloody hard to get it to work effectively for real applications, in my admittedly limited experience at least. There are certain parts of your application that can’t be obfuscated without breaking your code. If any of your libraries expose APIs, those methods and parameters can’t be renamed or any external code calling them will fail. If you use reflection and look up types or members by names that have been changed through obfuscation, you’ll get errors. And any stack traces you receive from an obfuscated program will also be obfuscated. You can get around the first two issues by manually configuring the obfuscator to skip every type and method that would cause an error, but this takes a crazy amount of time. Imagine stepping through code that looks like the picture on the right above, as my friend from university had to, trying to find what’s causing the program to crash. It’s enough to make your eyes bleed.
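The reflection problem is easy to reproduce in miniature. The sketch below is plain Python standing in for .NET, and everything in it (`ReportPlugin`, `obfuscate`, `load_plugin`) is a hypothetical name of mine, not a real obfuscator API. It shows a lookup by string name failing once the type is renamed, and working again once that type is put on an exclusion list.

```python
class ReportPlugin:
    def run(self):
        return "report generated"


class InternalHelper:
    pass


def obfuscate(namespace, exclude=()):
    """Return a copy of the namespace with every type renamed to a0, a1, ...

    Names listed in `exclude` are left alone, which is roughly what
    you configure an obfuscator to do for types used via reflection.
    """
    renamed = {}
    counter = 0
    for name, obj in namespace.items():
        if name in exclude:
            renamed[name] = obj
        else:
            renamed[f"a{counter}"] = obj
            counter += 1
    return renamed


def load_plugin(namespace, type_name):
    """Reflection-style lookup: create an instance from a type's string name."""
    cls = namespace.get(type_name)
    if cls is None:
        raise LookupError(f"type {type_name!r} not found")
    return cls()


original = {"ReportPlugin": ReportPlugin, "InternalHelper": InternalHelper}

# Everything renamed: the string "ReportPlugin" no longer matches anything.
broken = obfuscate(original)
try:
    load_plugin(broken, "ReportPlugin")
except LookupError as err:
    print("broken build:", err)

# Same build with the reflected type excluded from renaming.
fixed = obfuscate(original, exclude={"ReportPlugin"})
print("fixed build:", load_plugin(fixed, "ReportPlugin").run())
```

The catch is that nothing tells you which names need excluding. You find out one runtime failure at a time, which is where all the hours go.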

And worst of all, for all the effort involved, obfuscation isn’t foolproof. If a team of hackers is determined enough to recover your code, then they can and will, given enough time. Obfuscation will stop the casual weekend types, but these aren’t likely to be the guys that are trying to steal IP from you. So at best, obfuscation will only slow decompilation down, and at a huge maintenance cost. Real obfuscation is hard, maybe even impossible.

After a lot of effort I managed to get most of our code obfuscated and deployed, but at a huge time cost for what I see as a relatively minor threat. From now on, any projects requiring obfuscation are going to be architected for it.

My advice is this: if you’re really developing top secret, proprietary code, then you should design for obfuscation from the outset. Isolate the sensitive parts of the code and either stick them on a server, or make sure they can be properly obfuscated without affecting the rest of the codebase, with checks in your build tests to prove it.
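As a rough idea of what such a build check might look like, here’s a minimal Python sketch. It assumes you have some way of dumping the type and member names from the obfuscated assembly into a list (that step isn’t shown), and the function and all the names in the example are hypothetical.

```python
def check_obfuscation(exported_names, must_keep, must_hide):
    """Return a list of problems found in an obfuscated build.

    exported_names: names found in the obfuscated output
    must_keep:      public API names that callers depend on
    must_hide:      sensitive names that should have been renamed
    """
    names = set(exported_names)
    problems = []
    for name in sorted(must_keep):
        if name not in names:
            problems.append(f"public API name was renamed: {name}")
    for name in sorted(must_hide):
        if name in names:
            problems.append(f"sensitive name survived obfuscation: {name}")
    return problems


# Hypothetical dump from an obfuscated build.
exported = ["Client.Connect", "a", "a0", "PricingEngine.Calculate"]

problems = check_obfuscation(
    exported,
    must_keep={"Client.Connect", "Client.Disconnect"},
    must_hide={"PricingEngine.Calculate", "LicenceCheck.Verify"},
)
for problem in problems:
    print(problem)
```

Run as part of the build, a non-empty list fails the build, so a broken API or an unprotected class gets caught before release rather than by a customer.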

If someone copies your program successfully enough to become a competitor then you’re hopefully going to notice that fact and get lawyers to sort it out. Hint: it’s usually when someone whacks your business partner around the head with a bat to steal the satellite uplink code.
