Friday, January 11, 2008

vulnerability research vs. malware research

well, i said i was tempted to write a post on this topic, and even though it turns out i've written about it several times in the past (viruses and disclosure, malware and disclosure, full disclosure to name but a few) i'm going to try it again in hopes that after this one i won't ever have to repeat myself on this subject again (yeah, as if)...

as always, the most interesting debates wind up being about meaning, so let's define some terms for the sake of this discussion... a vulnerability is in essence a mistake that allows a component or service of a system to behave in an unexpected and unintended way that has a negative impact on the security of the system (i know not everyone subscribes to the mistake definition, but it's useful to draw a distinction between those vulnerabilities which can be fixed and those that can't - rest assured i will cover the alternative later)... vulnerability research, then, is the process of discovering these mistakes; sometimes by looking at source code, sometimes by reverse engineering the vulnerable system, sometimes even by accidentally or intentionally triggering the unintended behaviour... now there's a pretty broad consensus (with the exception of the bad guys) that vulnerabilities are bad so naturally we want to get rid of them, and because they're mistakes they can be fixed... however, because vulnerability research is often performed by people other than those who made the vulnerable system in the first place, it becomes necessary to find a way to demonstrate the existence of the vulnerability to others... further, those tasked with fixing the vulnerability require a reliable means of triggering the unintended behaviour in order to determine why that behaviour occurs and in order to ensure that whatever fix they put in place actually fixes the vulnerability... therefore we need an exploit to trigger the unintended behaviour so that the vulnerability can be demonstrated to others to convince them that a fix is needed, and also to test for the presence of the vulnerability when developing (or even applying) a fix...

of course with those uses in mind it's important to make the distinction that what we want in the way of exploits are actually benign exploits (as opposed to malicious/weaponized ones)... for example, when dealing with an arbitrary code execution vulnerability, an exploit that launches notepad is preferable to one that launches tftp with command line parameters to download something malicious from the internet and then subsequently launch that... you see exploits can be used for malicious purposes as well as the positive ones already described, and the less benign the exploit is the easier it is for a bad guy to do something bad with it...

now so far i've described exploits as things that are able to demonstrate the existence of a mistake that we want to fix... clearly when that mistake isn't present the exploit shouldn't work so we can say that the exploit depends on the existence of the mistake/vulnerability...
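
to make that dependency concrete, here's a minimal sketch in python of a toy path traversal mistake, a fix for it, and a benign probe that only reports whether the mistake is present... everything in it (BASE_DIR, resolve_vulnerable, resolve_fixed, benign_probe) is a made-up name for illustration and isn't taken from any real product... the probe never opens a file, it just looks at the string the resolver hands back...

```python
import posixpath

# hypothetical directory the toy service is supposed to stay inside
BASE_DIR = "/srv/files"


def resolve_vulnerable(user_path):
    """the mistake: join the user-supplied path without checking where it ends up."""
    return posixpath.normpath(posixpath.join(BASE_DIR, user_path))


def resolve_fixed(user_path):
    """the fix: refuse anything that normalises to a location outside BASE_DIR."""
    full = posixpath.normpath(posixpath.join(BASE_DIR, user_path))
    if posixpath.commonpath([BASE_DIR, full]) != BASE_DIR:
        raise ValueError("path escapes the base directory")
    return full


def benign_probe(resolve):
    """benign exploit: True if the resolver can be steered outside BASE_DIR.

    it only inspects the returned string and never touches the filesystem.
    """
    try:
        result = resolve("../../etc/hostname")
    except ValueError:
        return False
    return posixpath.commonpath([BASE_DIR, result]) != BASE_DIR


if __name__ == "__main__":
    print("vulnerable resolver affected:", benign_probe(resolve_vulnerable))
    print("fixed resolver affected:", benign_probe(resolve_fixed))
```

the same probe serves both purposes described earlier: it demonstrates the mistake to whoever has to fix it, and once the fix is in place it stops working, which is how you know the fix took...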

malware is a different beast entirely because it doesn't (in general) depend on the existence of a mistake or vulnerability... what does it depend on? well, from one of the seminal pieces of research on viruses we know that self-replicating malware depends on 3 things: the ability to share data (necessary for the attacker to get his malicious software from his own machine to the victim's machine), the ability to pass data that's been shared with you along to others (sharing transitivity - necessary for self-replicating malware to spread or pass itself along), and the ability to interpret data that's been shared with you as program code (ie. the ability to add to or change the set of things the computer is willing to execute - the generality of interpretation - necessary for the computer to be able to execute the new incoming malware)... none of these things can really be considered flaws or mistakes, in fact it's hard to imagine computers being as useful as they are without them... without sharing there would be no internet, no floppy/cd/dvd drives or printers (they would allow sharing over the sneakernet), no store-bought software (you'd have to make all the software, including the operating system, yourself), etc... not being able to pass anything along would be equally bizarre (though that's the world that hollywood envisions and tries to make real through the use of DRM), it would also make the division of labour very hard to manage because hierarchies would be impossible (your boss wouldn't be able to pass along to you anything his boss had given him, nor would he be able to pass along to his boss anything you had produced)... 
not being able to interpret data as program code would be the most profound change of all as it is this property that makes the general purpose computer a general purpose computer rather than a pocket calculator - it allows a single computer to be flexible enough to be used for many changing purposes and without that flexibility we'd need a different piece of hardware specially designed for each and every new task that came our way... it should be clear that these are things we neither can nor want to 'fix' and so self-replicating malware will always be possible... it turns out that all sharing transitivity does for self-replicating malware is allow it to spread, so we can therefore say that in general non-replicative malware doesn't need it (though that doesn't necessarily mean some specific instances won't use it, such as for the command and control of more sophisticated botnets)... there are no additional dependencies worth mentioning for non-replicative malware in general (the exception being exploits, which can be considered a type of malware and obviously depend on vulnerabilities) because the underlying functionality they use is the same functionality used by normal software, it's just that that functionality is used to perform actions we don't want performed... communicating with a remote site the way RATs, botnets, and spyware do is just sharing of messages, and we've already covered sharing... writing to the hard disk is pretty fundamental and can even be considered a kind of sharing between sessions or a prerequisite of media-based sharing... outputting to the screen (something adware necessarily has to do) is necessary in order for the user to properly interact with the computer (and really all input and output can be considered as falling under the umbrella of sharing)... since there are no dependencies in general that can be 'fixed', this case therefore covers the alternative of the vulnerability that isn't a mistake...
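
as a rough illustration of the generality of interpretation, here's a small python sketch in which a string that arrived as data gets interpreted as program code... run_shared_code and the two sample strings are hypothetical names invented for this example, and the 'unwanted' string is a harmless stand-in... the point is only that the mechanism which makes the first string useful is exactly the same one that runs the second...

```python
def run_shared_code(source):
    """interpret a string that arrived as data as program code.

    returns the namespace holding whatever the code defined.
    """
    namespace = {}
    exec(source, namespace)  # data becomes something the computer will execute
    return namespace


# hypothetical 'wanted' code that was shared with us as plain data
plugin = "def greet(name):\n    return 'hello ' + name\n"

# hypothetical 'unwanted' code that arrived the same way (harmless stand-in)
unwanted = "def greet(name):\n    return 'an unwanted action would go here'\n"

if __name__ == "__main__":
    for shared in (plugin, unwanted):
        greet = run_shared_code(shared)["greet"]
        print(greet("world"))
```

nothing in run_shared_code is a mistake that could be patched... take the exec call away and the wanted code stops working too, which is the pocket calculator scenario...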

now a clever reader at this point would be asking him/herself how much sense it makes to perform the previously described type of vulnerability research on a vulnerability that can't be fixed... a knowledgeable reader would know the answer: it doesn't make sense to perform that sort of research... an argumentative reader might point out that such malware can exploit fixable vulnerabilities too, but the answer to that is to research the fixable vulnerability and look at the exploit for that on its own rather than connected to more general malware...

so how does that change the nature of research around malware? well for one thing the previously discussed benefits of creating benign exploits don't apply in malware research because they were all predicated on the idea that what allowed the thing to exist could be fixed... we know that's not generally true for malware and in the cases where it is true it's because the malware contains an exploit that can exist separate from the malware and be used that way instead of connected to the malware (which only serves to make the exploit less benign)... additionally, because malware in general depends on non-fixable properties of a system (not to mention they're less benign), the benefits that society reaps from sharing benign exploits with the public at large also don't apply to malware for essentially the same reason...

another, more intrinsic difference between the two research fields is that in vulnerability research we create a special kind of malware called exploits as indicators of the presence of the vulnerability but in malware research we do not create any kind of malware as an indicator of the presence of malware (that just wouldn't make sense)... creating malware in the course of doing malware research would actually be analogous to creating vulnerabilities in the course of doing vulnerability research... the reason we create indicators for vulnerabilities is that the vulnerabilities themselves are not exactly visible or shareable, they aren't distinct entities in and of themselves, whereas if i want to show you a piece of malware i give you the actual malware...

a third difference between the two research fields is that there is (according to popular informed opinion) a discrete, finite number of vulnerabilities while the number of possible malware is, theoretically, countably infinite - limited in practice only by the available storage space... so while it may make a certain kind of sense to enumerate vulnerabilities, making an exploit for each one so that the vulnerability can be mitigated (hopefully before it gets used by the bad guys), it's a very different idea to try and enumerate all possible malware in hopes of being able to mitigate all of them (because there are just too many of them and because there are so many, most will never actually be made by either the good guys or the bad guys)... furthermore, while it might seem feasible (or at least within the realm of hope) to stay ahead of the bad guys with regard to vulnerabilities, no matter how many pieces of malware we preemptively make in order to mitigate them the chance that we will block a bad guy by beating him to the creation of a particular piece of malware is essentially zero... the malware we might make is practically guaranteed to be different from the malware the bad guys will make...
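
for a sense of scale, here's a back-of-the-envelope sketch in python... it leans on some loud simplifying assumptions that are mine and not established facts: it counts every byte string up to a given length as a stand-in for 'possible programs' (most byte strings aren't valid programs, so this is only an upper bound), and it pretends the bad guy picks uniformly at random from that space... count_byte_strings and chance_of_match are hypothetical helper names...

```python
from fractions import Fraction


def count_byte_strings(max_len):
    """number of distinct byte strings with length 0 through max_len."""
    return sum(256 ** k for k in range(max_len + 1))


def chance_of_match(made_in_advance, space_size):
    """chance that one uniformly chosen item is among those made in advance."""
    return Fraction(made_in_advance, space_size)


if __name__ == "__main__":
    # even a tiny 16 byte limit gives an enormous space
    space = count_byte_strings(16)
    # suppose the good guys pre-built a billion distinct samples
    odds = chance_of_match(10 ** 9, space)
    print("size of the space:", space)
    print("chance of beating the bad guy to one sample:", float(odds))
```

real malware is far larger than 16 bytes, so the space only gets bigger and the chance only gets smaller from there...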

by now you should be starting to see how different malware research is from vulnerability research and how pointless creating malware for the purposes of malware research is compared to creating benign exploits for the purposes of vulnerability research... instead, malware research focuses on malware that has already been made and classes of malware that already exist... rather than creating malware and hoping in vain that the bad guys' malware will have the same properties, malware researchers share data with each other about new malware trends while they're still in their early stages and make educated guesses about what the next step will be... sometimes their predictions are right (such as storm botnet nodes being used in phishing, which was predicted when it was observed that the botnet was becoming segmented) and sometimes they aren't (or at least aren't yet, like some of the more ambitious predictions about mobile malware)... as such, malware researchers also look at the human attackers creating/using the malware - trying to guess their motivations, gauging their skill level, determining their connections with others, etc, because all those things are important in predicting what they might do next... enumerating possible future malware is not nearly as useful or accurate as a predictive tool in malware research as its counterpart is in vulnerability research...

given these profound differences, it really makes very little sense to treat malware research the same as vulnerability research or malware in general the same as a benign exploit...
