Wednesday, April 07, 2010

poking holes in trojans

it seems like only days ago that this blog's longest comment thread in recent memory drew to a close after a heated discussion on the definition of trojan, and not long after that both chet wisniewski and david harley posted blog entries featuring that very classification.

not only that but chet's usage of the word as an umbrella term for all non-replicative malware doesn't seem to match david's usage - nor does it agree with my definition (which i imagine probably doesn't exactly match david's either). all of which just goes to show that there really isn't a universally agreed upon definition of trojan. no matter which definition you use there are always some problems with it.

that being said, there are some ideas in chet's post that mirror topics brought up in the aforementioned comment thread and that i think should be brought out front and center.

starting with the low-hanging fruit, let's look at chet's example of a trojan that you don't have to execute in order to become a victim of - that being the drive-by download. now i realize that my computer science background has allowed me to develop some rather transcendent notions of what execution means (even going beyond this), but i really don't think one needs to go that far to get the concept that when you open a web page you're executing its contents. a web page is a container for data and a wide variety of executable content (from javascript to flash to activex to silverlight, etc) and, rather than bog down the user experience with endless prompts to execute this, that, and the other thing, browser developers decided that opening a webpage should result in the execution of whatever executable content it contains. this is something that doesn't get nearly as much attention as it probably should and as a result most people aren't aware of this rather significant detail (otherwise more people would be using noscript) and consequently make bad decisions about how to use the web. i expect that chet himself is all too aware of the executable potential of web content but he's not passing that knowledge on to the reader when he tells them that drive-by downloads don't require any user interaction. the user opened the page in the first place (or opened another page that led to the drive-by download page being opened) so they did interact in the sense of executing something. clicking a link in your web browser isn't that much different than clicking an *.exe in your file browser, and if more people understood that they might be more careful online.
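to make the "a web page is a container for executable content" point concrete, here is a minimal sketch that lists the spots in a page where a browser would load or run something without asking. the tag list, the helper name active_content and the sample page are all assumptions made up for illustration, and the check is nowhere near exhaustive.

```python
from html.parser import HTMLParser

# tags that commonly carry or load executable content
# (an illustrative, non-exhaustive list chosen for this sketch)
ACTIVE_TAGS = {"script", "object", "embed", "applet", "iframe"}


class ActiveContentFinder(HTMLParser):
    """collects tags and inline event handlers a browser would act on."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # the parser hands us tag and attribute names already lowercased
        if tag in ACTIVE_TAGS:
            self.found.append(tag)
        for name, _value in attrs:
            # inline handlers such as onload/onclick are script too
            if name.startswith("on"):
                self.found.append(f"{tag}[{name}]")


def active_content(html):
    """return the active-content markers found in an html string."""
    finder = ActiveContentFinder()
    finder.feed(html)
    finder.close()
    return finder.found


# a made-up page: the visitor only "opened" it, yet four things would run or load
PAGE = """<html><body onload="init()">
<p>just some text</p>
<script src="a.js"></script>
<object data="movie.swf"></object>
<a href="#" onclick="go()">click</a>
</body></html>"""

print(active_content(PAGE))
```

a page that is pure data comes back with an empty list; anything else is content that simply opening the page puts in motion.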

more generally, the notion that something can be a trojan without requiring the user to execute something seems very odd to me. it seems to me that a piece of malware that doesn't need the user to execute it, that doesn't need the victim to let it in past the defenses, simply doesn't bear much similarity to the legendary stratagem from which the name trojan horse program is derived.

i realize that analogies shouldn't be carried too far but to say that all malware that doesn't self-replicate (basically everything that's left over once you remove viruses and worms) is a trojan horse makes me wonder why on earth they chose the term trojan horse in the first place. the definition and the name seem to have no obvious connection. perhaps at the time they simply hadn't conceived of any malware that didn't conform to the paradigm used by the ancient greeks, but is that still true today or could i conjure up some malware examples that just don't seem like they should be called trojans at all? for example, when we think about malicious software we often think about that which runs on the victim's computer, but what about malicious software that runs on the attacker's computer? a participatory DoS tool, for example, would certainly seem to belong to the malware set since it's malicious software, and it certainly doesn't belong to the self-replicating set, but can you imagine something whose malicious functions are both known and advertised being called a trojan horse? should we call every implement of war a trojan horse instead of just the actual hollow wooden horses? catapults will henceforth be known as trojan horses, trenches also, swords and guns and chemical weapons, all of it. while we're being absurd let's call everyone bruce in order to avoid confusion. g'day bruce.

how about an example that does execute on the victim's machine? now remember i'm trying to steer clear of anything the victim user would let in past his/her defenses so the question you might be asking yourself (after taking into account just how liberal my concept of execution really is) is how on earth such malware would get onto the victim's system. the answer, of course, is that it's planted there by an attacker who has already gained access to the system. back in the early 90's (perhaps even earlier) there was this attack technique whereby an unsecured system would be used to sniff out enough information (generally login credentials) to compromise a more secure system (because users of the unsecured system might on occasion access resources on another system), which in turn would then be used to sniff out information to compromise an even more secure system and so on and so forth until the attacker reached his/her goal. the attacker would use a collection of tools, often including some sort of back door in the form of a modified system binary as well as a password sniffing program, that were at least in the beginning known as toolkits (eventually one of these toolkits got named "rootkit" and the rest, as they say, is history). now i'll admit that the modified system binary that provides a back door does seem to bear at least a passing similarity to what we'd think of as a trojan horse program even if the victim didn't let it in him/herself (it's something that the victim could have easily let through the gates if given the chance), but in this context, rather than bearing a similarity to the trojan horse of old, this bears more similarity to converting an existing agent into a double-agent. additionally a password sniffer in and of itself doesn't necessarily strike me as being particularly trojan-like. it's a packet sniffer that filters what it captures. it's not even clear that it's malicious until you take the context of its use into account.

that sort of context sensitivity is often cited as a property of the trojan set but i believe it is a property of the malware set (of which the trojan set is a proper subset). in fact i think the trojan set inherits that property from the malware set. i also think that at the end of the day it's that property which makes the question of how to define such a set pointless. the term "trojan horse program" should probably go down in infamy as one of the anti-malware community's great failures because in its unqualified form it has been one of the most singularly unhelpful classifications. back when there were only 3 major subclassifications for malware (virus, worm, and trojan, as chet mentioned in his post) we had quite a successful anti-virus industry that handily took care of viruses and worms, but not really trojans. both viruses and worms have functional definitions and trojans (whether you consider them the complement of the self-replicative set within the malware set or something more specific) do not. it wasn't until we started carving out functionally defined subsets of the trojan set (such as spyware and adware) that we started to actually get a handle on trojans. problems need to be well-defined before we can hope to address them.
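the set relationships in that paragraph can be sketched with plain python sets. this is only an illustration under stated assumptions: the labels are hypothetical stand-ins for whole categories, and which ones count as "carved out" is a choice made for this sketch rather than an agreed taxonomy.

```python
# hypothetical category labels, for illustration only
malware = {
    "virus",
    "worm",
    "spyware",
    "adware",
    "remote access trojan",
    "downloader trojan",
    "participatory dos tool",
    "password sniffer",
}

# the two classes with functional definitions (they replicate)
self_replicating = {"virus", "worm"}

# "everything that's left over once you remove viruses and worms"
non_replicating = malware - self_replicating

# functionally defined subsets carved out of that leftover set
# (assumed for this sketch)
carved_out = {
    "spyware",
    "adware",
    "remote access trojan",
    "downloader trojan",
}


def residue(leftover, *functional_subsets):
    """what remains unclassified after removing each functional subset."""
    remaining = set(leftover)
    for subset in functional_subsets:
        remaining -= subset
    return remaining


print(sorted(residue(non_replicating, carved_out)))
```

each functionally defined subset that gets carved out shrinks the residue, and it is only the residue that still has to lean on the unqualified term.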

so maybe we should all just forget about trojan as an unqualified term and only use it when we're talking about things like remote access trojans or downloader trojans. while we're at it, let's make sure we continue to carve out new functionally defined subsets of the malware set when fundamentally new behaviour comes along so that we don't sit around gazing at our navels and lamenting how difficult it is to classify things as belonging to an ill-defined set. we've already had an anti-spyware industry, an anti-trojan industry, an anti-rootkit industry, etc. due to slow response by anti-malware incumbents - we don't need to keep doing that.

8 comments:

Vess said...

The original term "Trojan" reflected quite well what the ancient Greeks did. The original Trojan horse wasn't just a weapon - it was something that looked benign but contained a hidden and nasty surprise. So, it was quite apt to apply it to programs that looked useful but did something bad. (Remember, there were no viruses back then.)

It is not possible to define the term "Trojan" objectively. The term "virus" can be defined, because it does something that can be measured objectively (it replicates). But the definition of a "Trojan" would require using such terms as "intentionally" and "harmful" - and these cannot be defined in an objective way.

A classic example is the program FORMAT.COM. Is it harmful? Well, it certainly can be used in a harmful way - but, in general, it does something useful. What if it is renamed to SEXYPICS.COM? Obviously, changing the name doesn't change the contents or the functionality of the program - but it will start doing something that the user is unlikely to expect. What if the prompt it issues asking whether to proceed with the formatting is in Swahili and "yes" is the default answer? Is that a Trojan or just bad design? And so on.

Personally, I agree that Trojans are non-replicating malware (it is important to distinguish between replicating and non-replicating malware) - but I disagree that all non-replicating malware are Trojans. The best definition I've seen of the term is "a program that claims to do something but does something else, which, if the user knew about, wouldn't approve" - but, again, this is informal and subjective.

As for why anti-virus programs handle viruses better than Trojans - there is a different reason for that. Most people rely on scanners, because this is what they can understand. Viruses replicate. So, if a virus is found on one computer, chances are that it will also be found on other computers later. So, it makes sense to create and distribute a program that will find it there. But the vast majority of Trojans are one-shot. They are generated on-the-fly (server-side polymorphism), attack the victim and are never seen again. Yes, we can update our scanners to detect them - but it isn't going to help anyone; the first victim has already been hit and the same Trojan won't hit anyone else.

As for worms and how to define them and whether they are viruses or not - let's not open this particular can of worms. :-) (The surest way to start a fight at a gathering of anti-virus researchers is to ask them what is a "worm".)

kurt wismer said...

i think we agree on most points, but with regards to the handling of trojans by av software i think perhaps i expressed myself poorly.

i actually think av software handles them fairly well these days - at the very least there's a commitment to trying to do so. i was thinking back to the late 90's and early 00's, when the anti-malware market fractured and anti-spyware applications started to proliferate.

i don't think the problem the av industry had back then had much to do with server-side polymorphism, i think it had more to do with a hesitance to expand the scope of anti-virus software into a poorly defined area like trojans. it took upstarts to show that while the trojan problem as a whole was hard, certain specific parts of it were not and they made a business out of doing what the av industry had failed to do up until that point.

obviously the av industry eventually managed to incorporate the same line of thinking and has made the anti-spyware industry all but obsolete.

David Harley said...

I love a definitional thread. (How I miss alt.comp.virus...) Still, I should maybe point out that my Mac Virus blog wasn't an attempt to cover all the bases on a definition of Trojans: if anything, it was an anti-definition statement.

My concern (and Chet's, IIRC) was more around the way that some parties reinvent definitions then draw unsafe conclusions based on those redefinitions. Mac fans are not the offenders in this respect, but the argument that "there are no OS X viruses so there is no malware problem" is one that always presses my buttons. :-D

kurt wismer said...

@david harley:
miss acv? you and me both. of course acv is still there, but somehow it's just not what it used to be.

i understand that you weren't trying to define trojan, but you did use it in a way that implied there was non-replicative malware outside the trojan set, which ran counter to chet's usage.

i also agree that the way things get redefined can lead to problems but it seems to me that trojan, with its long history of multiple and sometimes contradictory definitions, is best left behind.

David Harley said...

What I meant was that I -see- the term used as if it's limited to a subset, not that I was endorsing that usage. That said, I have to agree with Vess that not all non-replicating malware can usefully be described as a Trojan.

I'm bowing out at this point, as I'm travelling in a few hours and won't be connected for a few days, but I suspect that we'll all be back for another tilt at this windmill. :)

kurt wismer said...

@david harley:
on that point i agree with vess as well. i even tried to come up with some examples of non-replicative non-trojan malware in the post.

Chester Wisniewski said...

Great write up Kurt. When I was writing my post I was thinking along the same lines as your post, but was hoping to keep it simpler for my target audience.

Lumping "everything else" under Trojan is not really appropriate, but it is difficult for the average computer user or IT administrator to understand or care about the differences. I may not have been clear in my goal, which was to point out that malware is malware... The mechanism through which you receive it doesn't much matter: anyone can be a target and you cannot necessarily protect yourself through smarter surfing.

Whether drive-by infections are technically Trojans can be debated for a long time, but as an end user of the Internet I need technical mechanisms to look for what might be wrong. Simply not clicking on NudeAngelinajolie.exe is not enough to guard against being compromised.

As to the fracturing of the industry during the anti-spyware days, I am proud to say those of us at Sophos never took that direction. We have always defined spyware as just another definition of malware and blocked it from day one (to the best of our abilities) inside our anti-virus product, and we tried to argue that the purpose of AV was to block malicious applications regardless of how they may work on the inside.

I am leaving for Europe so I will be out of pocket as well, but I look forward to seeing us all converge on a similar result.

Chet

kurt wismer said...

@chester wisniewski:
i guess we're on more or less the same page about trojans, then.

i understand your concerns about your target audience being able to understand what you're talking about, but in that regard at least we seem to have philosophical differences - i don't like the idea of dumbing things down, i would prefer to lift the audience up, to elevate them.

i actually do believe drive-by downloads can qualify as trojans, my point in that regard was more to the effect that they didn't qualify as an example of trojans the user didn't need to execute. the user executes them by browsing to the page they're on. it's really more of a disagreement over execution than 'trojanity', though.

and in the matter of the fracturing of the industry/market - the anti-spyware industry arose precisely because the incumbents weren't doing an adequate job at the time. it seems to me that it took the anti-spyware upstarts to show the entrenched anti-malware industry how to move forward with protecting against non-replicative malware - and that was to ignore the difficulties in classifying things as trojans and define new, functionally defined subsets where such difficulties would not exist. the problem of trojan protection was and still is hard in the general case, so change the problem to something that's easier and do that. it's not a perfect solution, but it's turned out to be a lot better than nothing. so long as the industry keeps following that pattern and chipping away at the trojan set they'll continue to approach a solution.