Internationalization of Malware

Wes Brown | July 1st, 2008 | Filed Under: Industry Punditry, Malware

Why am I qualified to write about this?

In a former life, I was a malware analyst.  It was my job to analyze incoming samples and determine whether they were false positives or false negatives.  I also worked on automating the process, and it was a very neat job.  Unfortunately for me, economic realities and the precarious nature of startup companies dictated other career paths.  I analyzed literally thousands of samples and took notes on their characteristics to help improve the anti-malware product.

After doing this for a while, I have some observations on where malware is going, and I’m going to share some of them in this post.

Growing Internationalization

In the past, an anti-malware company could focus on English-targeted samples.  But an increasing percentage of malware samples are international in origin and target international machines.  I saw numerous cases of Chinese malware targeting Chinese software or hosts.  Determining whether such a sample was malware or not was quite a challenge, for several reasons.

Cultural Impact

One of the most fascinating facets of the increasing internationalization of malware is the cultural assumptions around such software.  What is considered malware in the US may be commonly accepted in China or Japan, and this is largely due to the society that it exists in.

Anti-cheating rootkits are very common in games released in these countries.  What is considered invasive in North America or Europe is acceptable there.  These anti-cheating rootkits hook into kernel space in a very invasive way and share behavioral characteristics with malware, such as hooking into the keyboard driver.  This made the two very difficult to distinguish from a purely technical standpoint.  The kits attempt to protect the application from being tampered with while running, for example to reduce the incidence of bots or of modifications to the presentation layer that let people see through walls.  They watch for kernel debuggers and for running processes that exhibit specific characteristic behavior.  These very techniques would flag them as malware, because many malware samples behave the same way to evade antivirus products or to hinder reverse engineering.

If I applied US standards to these particular samples and declared them true positives, then we would have many angry international customers when their games no longer worked.  The same applied to extremely intrusive adware.  But these pieces of software could run on US machines as well, so it was a very tricky balance.

Linguistic Barriers

In the past, if I ran into a piece of malware that had foreign language strings in it, I could muddle through them if they were in a Latin-derived language.  Spanish or French, I did not have any issues with.  But when it came to languages that come from an entirely different root, such as Chinese or Japanese written in hanzi or kanji, I was losing vital clues.

By looking at the behavior of the sample alone, I would declare it malware.  But what if it was one of the aforementioned game rootkits?  How do I know that the game actually includes it, or if it was indeed a trojan’ed game?  With English language samples, I would simply look at the strings, or use Google.  But I had to muddle through pages in a writing system that I could not easily begin to comprehend.

So, if you want to be a malware analyst, it would be in your best interest to become conversant or fluent in one or more languages written in a non-Roman script.

Internationalization of Antimalware Tools

As we deal more and more with malware samples that are international in scope, it becomes important that the tools themselves are internationalized.  The automatic tools that I wrote dealt primarily with ASCII, and as the number of samples targeted at other languages grew, they were becoming inadequate.  String and keyword analysis did not work well.  Tools need to be Unicode-aware and multilingual.
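
To make the ASCII limitation concrete, here is a minimal Python sketch of a string extractor that also looks for UTF-16LE text, which is how Windows binaries commonly store wide strings.  It is not the tooling described in this post; the function name `extract_strings`, the four-character minimum, and the chosen code point ranges (printable ASCII, kana, and CJK Unified Ideographs) are all assumptions made for illustration.  It ignores legacy encodings such as GBK or Shift-JIS entirely, and it will miss some strings and misread some random bytes, so treat its output as hints rather than facts.

```python
import re

# Runs of 4+ printable ASCII bytes: what a classic ASCII-only tool finds.
_ASCII = re.compile(rb"[\x20-\x7e]{4,}")

# Runs of 4+ UTF-16LE code units (low byte first, then high byte) that are:
#   printable ASCII stored wide        U+0020..U+007E
#   hiragana / katakana                U+3040..U+30FF
#   CJK Unified Ideographs             U+4E00..U+9FFF
# Code units with a zero low byte are skipped in the CJK range (assumption:
# this loses a few rare characters but avoids misreading wide ASCII that is
# scanned one byte out of alignment).
_UTF16LE = re.compile(
    rb"(?:[\x20-\x7e]\x00|[\x40-\xff]\x30|[\x01-\xff][\x4e-\x9f]){4,}"
)


def extract_strings(data: bytes) -> list[str]:
    """Return ASCII strings followed by UTF-16LE strings found in data."""
    ascii_hits = [m.group().decode("ascii") for m in _ASCII.finditer(data)]
    # Blank out the ASCII runs first; otherwise pairs of plain ASCII bytes
    # would also look like valid CJK code units.
    masked = _ASCII.sub(lambda m: b"\x00" * len(m.group()), data)
    wide_hits = [m.group().decode("utf-16-le") for m in _UTF16LE.finditer(masked)]
    return ascii_hits + wide_hits


if __name__ == "__main__":
    sample = (
        b"\x00\x00"
        + "\u6076\u610f\u8f6f\u4ef6".encode("utf-16-le")  # four hanzi
        + b"\x00\x00kernel32.dll\x00"
        + "setup".encode("utf-16-le")
        + b"\x00\x00"
    )
    for s in extract_strings(sample):
        print(s)
```

An ASCII-only scan of the same sample would report only `kernel32.dll` and lose both wide strings, which is exactly the kind of vital clue that goes missing.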

Hints for International Malware Analysis

  • Pay close attention to whether samples are signed and, if so, who signed them.  Once you have verified a signed application, consider it the baseline.
  • Once you have multiple samples of what appears to be the same application but with different checksums, pay close attention to the file size and to the version and vendor strings.  Interestingly, many trojaned applications do not have the correct version and vendor strings.
  • Use entropy to your advantage.  Measure the entropy of the binary segments that you have.  If they have very similar entropy values, and have a minor increment in version, the probability of it being a trojan is much lower.
  • Pay close attention to the vendor and version strings of samples.  See if you can get an authoritative version of the application from the vendor’s site and compare it.  Once you have a sample that you can declare a false positive, all other similar samples are much easier to analyze.
  • Take note of what binary packers they are using.  Certain packers have a higher probability of being used by malware.  But there are legitimate uses of packers, and some antivirus products will trigger a false positive on a packed application no matter what.
  • Build a library of samples, and understand the cultural context of the country of origin and destination.  Categorizing the sample characteristics by these criteria will help you determine if it is a true or a false positive for that particular market. 
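
For the entropy hint above, a minimal Python sketch may help.  It computes plain Shannon entropy in bits per byte over a block of data and checks whether two blocks score close to each other.  This is a generic illustration and not necessarily the scoring used by any particular tool; the function names and the 0.2-bit tolerance are assumptions, and splitting a binary into its segments is left out.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte, from 0.0 to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def similar_entropy(a: bytes, b: bytes, tolerance: float = 0.2) -> bool:
    """True if the two blocks differ by at most `tolerance` bits per byte.

    The default tolerance is an arbitrary illustrative value, not a
    recommended threshold.
    """
    return abs(shannon_entropy(a) - shannon_entropy(b)) <= tolerance


if __name__ == "__main__":
    padding = b"\x00" * 4096              # a single repeated byte value
    uniform = bytes(range(256)) * 16      # every byte value equally often
    print(shannon_entropy(padding), shannon_entropy(uniform))
    print(similar_entropy(padding, uniform))
```

Packed or encrypted segments tend to score near the top of the scale, so a segment whose entropy jumps sharply between two supposedly adjacent versions of the same application deserves a closer look.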

Conclusion

It is becoming more and more important that the entire infrastructure of malware analysis, from the anti-malware client to the backend infrastructure to the analyst herself, become multilingual and multicultural.  It is a difficult challenge that is going to crop up more and more in the future.

Viewing 11 Comments

    Wes,

    Thanks for the interesting post. For entropy measurements, do you do them yourself or do you use a pre-packaged tool? If so, which tool?

    Wes Brown from ISS back in the days?

    @Gabe: I used PEAT's segment entropy score algorithm. The approach used is documented in detail by Robert Lyda and James Hamrock's 'Using Entropy Analysis to Find Encrypted and Packed Malware'. You can find the article and abstract here: http://ieeexplore.ieee.org/Xplore/login.jsp?url...

    @Mark Curphey: Yep, that's me. We last met in Kuala Lumpur.

    Wes,
    I am an expatriate working at a large financial institution in Riyadh, Saudi Arabia. The lack of awareness and concern about Infosec at both the individual and corporate levels is tremendously worrisome to me, especially when you consider the amount of cash that consumers and businesses throw around here.

    tayyib

    It might be worthwhile to build a small cabal of security types who are native speakers of relevant languages, or are at least familiar with them. It's only a matter of time before "an ra beh enghlisi che migooid?" starts getting uttered in the rooms that house certain western organizations' pen-teams.... Not just for malware related issues, but Infosec or even general IT related consulting.

    I guess I need to step up my Arabic lessons.
    As always, your entries are a pleasure to read,
    ma salaama, khoda hafiz, mahalo, slap mah fro, etc.
    jcw

    One last thing..

    "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one really knows what entropy really is, so in a debate you will always have the advantage."

    -John von Neumann to Claude Shannon

    @John Waters: Thank you for that quote. It made me laugh. Especially because when I was writing this post, I was thinking about how to explain entropy, but decided that it really didn't matter as long as it was a consistent factor. I know that I didn't think too hard about the mechanics of entropy, I just found it very useful for measuring binaries.

    Anti cheat software IS malware. It installs itself at an improper privilege level, interferes with other processes, and often will 'phone home' to report user activity and other private information.

    It has nothing to do with 'cultural' differences, that is simply using bigotry to shift blame.
    The developers of the games are a bunch of idiots who can't figure out how to program secure code.

    Since the majority of these games don't mention their anti-hack software, or if they do mention it gloss over the details (such as, 'this software will attempt to take over vital system processes and randomly terminate other applications') these programs ARE malware, by every definition of the term.
    An AV software SHOULD be reporting this type of program, even if it flags the item simply as 'suspicious'.

    @skeptikal: praise you, this is 100% correct. We all remember Sony planting rootkits on people's computers, what the game companies do is in no way different.

    It's not quite 100% correct; what's wrong is the claim that game producers don't know how to make their software secure. Many popular games currently aren't possible to make hack-proof. For example, say you want hardware-accelerated, real-time 3D graphics. If an enemy shows up on the screen, hiding behind a bush, just barely visible to skilled players due to his camouflage, the server has to send the enemy's position to the client. (Not necessarily send the information that it is in fact an enemy, but this doesn't matter here.) Now there's nothing to completely prevent one from patching the client so the enemy will glow bright red instead of being camouflaged. You can't tell players to get an Internet2 connection and pay for a supercomputer server so the graphics can be generated on the server and sent as a video stream. You can't tell players they should play NetHack instead. They want to play _their_ game on _their_ computer.

    What you can do is let your players voluntarily use something like PunkBuster and choose to only play with others who do the same. This still can be hacked, but now that's more difficult and script kiddies who just want to fool around a bit are more likely to do so on servers with PunkBuster disabled, leaving the others alone.

    I fully agree that such software shouldn't be forced on players, shouldn't be deployed covertly, shouldn't run with excessive permissions, shouldn't spy on you (reporting gathered data back over the net without your knowledge), and shouldn't take the form of a root kit. But the idea that games could even just in theory be made completely secure is completely wrong. Consider this: with an online banking app, say, you have to protect both the server and the client from attacks coming from outside. But with a game, you have to protect the client from an attacker who owns the machine on which the client runs. Major difference.

    informative article but i don't understand "Pay close attention to the signers of samples" please elaborate it....
