Wednesday, January 20, 2010

the myth of in-the-wild prevalence

upon reading this article at ghacks.net about scanning linux systems for viruses i became aware that there are some misunderstandings over the meaning of the term 'in the wild'.

the article in question is not the only place i've seen these misunderstandings and i don't want to knock it too hard because it does advise scanning your linux systems, but the statement that
Linux is immune to viruses right? Well…mostly. Even though a proof of concept virus has been discussed, and nothing has actually made it into the wild…you still have email on your system.
fairly clearly indicates both a lack of awareness of the threat linux faces and a lack of understanding of what constitutes 'in the wild'.

so let's get this out of the way early in the discussion. 'in the wild' literally means that the malware in question is active and victimizing someone or some group, somewhere in the real world. that seems like an obvious and natural definition, but what isn't obvious is the implication it has for most people. you see, many people equate 'in the wild' with epidemic. they think that if something were really in the wild it would have affected a lot of people, and they would have seen it personally or known someone who had. they think that they can use their own experience as a measure of whether something is 'in the wild' or not. the reality is that something being 'in the wild' does not mean that it is common enough for you to have stumbled across it - there is a wide spectrum of prevalence possibilities for 'in the wild' malware.

to that end, there have of course been linux viruses in the wild. are there still some in the wild? well, given that old viruses never really die, i'm going to have to say yes. remember, rare and 'in the wild' are not mutually exclusive concepts - something can be both at the same time. once something goes into the wild, it's very difficult to conclusively show that it has left the wild. in fact you could say it's equivalent to proving a negative (which, as we all know, is impossible).

(note: just to be clear, i'm not talking about the wildlist from wildlist.org. things that are on the wildlist are definitely 'in the wild' but not everything 'in the wild' gets to go on the wildlist. the wildlist is a much more narrowly defined set than what's 'in the wild')

2 comments:

Didier Stevens said...

I consider malware used in targeted attacks (malware designed to attack one organisation and "delivered" to the users of this organisation) not to be in the wild. Not because of its prevalence, but because it is delivered to a small, identified group. What do you think?

kurt wismer said...

@didier stevens:
i think that complicates things unnecessarily.

in the wild pairs naturally with in the lab. if something is found active outside of the lab it seems natural to me to consider that 'in the wild'.

if we adopt your school of thought then we have to consider a 3rd place where you might find malware, and i'm not sure that a 3rd location adds any benefit. 'in the wild' denotes the possibility that people in the real world may be affected by it. even if something is only used in targeted attacks, people in the real world may still be affected by it simply because they may be the targets.

additionally, there are times when targeted attacks use malware that would be considered 'in the wild' regardless of the targeted attack itself, so it seems to me that targeted attacks are a bit of a non-sequitur with regard to whether something is in the wild or not.