Yesterday, we saw yet another example of how antivirus — not malicious code — can leave thousands of PCs useless.
What was intended to be a routine McAfee software update to its antivirus definitions for corporate customers has likely turned into a costly nightmare for the antivirus software maker and many of its customers. Instead of updating the security software, the faulty virus definitions removed the Svchost.exe file, a critical component of the Windows operating system.
The article “Defective McAfee update causes worldwide meltdown of XP PCs” points to the severity of the problem:
“Now, it is hard to imagine picking a more crucial file to torpedo.
Svchost.exe is one of the most crucial of all Windows system files. It hosts the services that make just about every OS function possible. As the symptoms described here suggest, Windows simply won’t start if Svchost.exe isn’t there.”
As a result, affected systems were left endlessly rebooting until tech support repaired the problem manually. Early reports have estimated tens of thousands of machines were affected worldwide. McAfee’s official recommendation for repairing the damage involved manually copying Svchost.exe from a working machine to an affected system.
If anything, yesterday’s incident highlights the fact that antivirus is not designed to stop any threat (even its own code) from doing harm.
Believe it or not, the McAfee debacle could have been avoided with application whitelisting, which doesn’t allow any unauthorized applications to run on a system. For example, in its default setting, CoreTrace’s BOUNCER application whitelisting solution prevents the deletion or modification of any whitelisted executables — which certainly includes critical OS files like Svchost.exe. In other words, machines protected by BOUNCER were working today rather than spending time in a reboot loop.
Toney,
Good post. Let me add a quick thought (or two :-)….
Certainly better QA could have/should have eliminated this issue.
Effective Application Whitelisting would likely have caught the problem before the endless boot loop as well (as you point out).
But it also clearly points to another risk.
There are several companies that confuse blacklist with graylist, and graylist with whitelist.
Some of the larger players are attempting to portray “reputation-based” lists as whitelists. It is our belief that they ARE NOT synonymous. Graylists and reputation-based offerings still suffer from the risk of false positives, i.e., believing something is bad, and stopping it, when it is actually good.
So a true whitelist must have known-provenance. Essentially code does have DNA. And the DNA is traceable to the source (the author or ISV). In order to minimize false positives, one must drive to the highest level of supply chain/provenance assurance. Traceable software DNA is a must in order to fully rely on these evolving security and compliance methods.
Adopting true whitelisting would have saved McAfee the embarrassment of this situation, not to mention the real market cap impact (short term) of over $100m. It could be more than $100m longer term if they don’t resolve these systemic issues.
Other graylist and reputation-service suppliers of whitelist enforcement should learn from this very expensive market lesson. And buyers need to know the differences here as they shift to Security 2.0 methods, such as proactive whitelisting.
Not all “whitelists” are created equally.
Wyatt.
Wyatt, Okay, hold it. If a product is on a reputation-based ‘whitelist’ and I use it, how is it going to be hurt by a false positive action which you say would thereby stop it from functioning even though it is actually good? Your post sounds like you’re mixing what goes onto a whitelist with what gets prevented from running by a whitelist. But my understanding is that if it’s ON a whitelist, then it operates. So if it’s on the whitelist, however the list was generated, whether by reputation (ATT.com) or traceable software dna (digital watermarking), it is allowed to run and does NOT get blocked. I guess I need some clarification on what you are trying to say, Wyatt. Thanks, Greg
Greg, sorry if I wasn’t clear.
There are really two issues here.
First, the principles of false positives and false negatives work differently with blacklists and whitelists.
Recall the principle of Type 1 (T1) and Type 2 (T2) errors:
T1 or False Positive is classifying something as true when it isn’t.
T2 or False Negative is classifying something as false when it isn’t.
With Blacklists (deny methods) – False Positive is stopping something from running when it is actually good (good code is denied).
With Whitelists (allow methods) – False Positive is allowing something to be run when it is really bad (bad code is allowed).
So yes Greg, what you’re saying is correct:
With blacklists, if it is on the list it is blocked. And with whitelists, if it is on the list it is allowed.
The question in both cases is how is the list created? To what degree am I certain that the item that is on the list belongs there? So list certainty is another critical dimension of both blacklisting and whitelisting. And false positives are bad in both cases.
So if I am a blacklist/deny vendor I want to make sure something good doesn’t get on my list. (the McAfee issue).
And if I am a whitelist/allow vendor I want to make sure something bad doesn’t get on my list.
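To make the two cases concrete, here is a minimal Python sketch of the allow/deny rule and of what a false positive means for each kind of list, as defined above. The names `ListKind`, `is_allowed` and `is_false_positive` are hypothetical, made up for illustration; they do not come from any product.

```python
from enum import Enum


class ListKind(Enum):
    BLACKLIST = "deny"
    WHITELIST = "allow"


def is_allowed(kind: ListKind, listed: bool) -> bool:
    """Blacklist: a listed item is blocked. Whitelist: a listed item is allowed."""
    if kind is ListKind.BLACKLIST:
        return not listed
    return listed


def is_false_positive(kind: ListKind, listed: bool, actually_good: bool) -> bool:
    """True when an item is on the list but does not belong there."""
    if not listed:
        return False
    if kind is ListKind.BLACKLIST:
        return actually_good       # good code is denied
    return not actually_good       # bad code is allowed
```

Under these definitions, the McAfee incident corresponds to `is_false_positive(ListKind.BLACKLIST, listed=True, actually_good=True)`: a good file ended up on the deny list.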
The question of Reputation vs. Provenance is really the question of HOW one selects WHAT GETS ON the White or allow LIST.
Graylist methods (the promiscuous and indirect gathering of whitelist objects in the wild) are risky, as the source and quality of those measurements are often ambiguous. Reputational methods can be risky as well, depending on HOW that reputation is established. Often it is based on PREVALENCE, i.e., how frequently this code has been seen.
Provenance-based whitelist creation relies on extracting the software DNA directly, from as close as possible to the source (the author/ISV), with the chain of custody of that software DNA carefully protected so that confidence in the true source is as high as possible.
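As a rough illustration of the difference, the sketch below builds an allow list two ways. Everything in it is an assumption made for the example: the function names, the `min_sightings` threshold, and the idea that a vendor publishes a manifest of hashes. It is not how any particular vendor builds its lists.

```python
from collections import Counter


def prevalence_whitelist(observed_hashes, min_sightings):
    """Admit any hash seen in the wild at least min_sightings times."""
    counts = Counter(observed_hashes)
    return {h for h, n in counts.items() if n >= min_sightings}


def provenance_whitelist(vendor_manifests):
    """Admit only hashes obtained directly from the author/ISV.

    vendor_manifests maps a vendor name to the hashes that vendor published.
    """
    return {h for hashes in vendor_manifests.values() for h in hashes}


# Hypothetical sightings: a widespread piece of bad code is seen more often
# than a legitimate but rarely installed tool.
sightings = ["bad-hash"] * 50 + ["rare-good-hash"] * 2
manifests = {"ExampleISV": ["rare-good-hash"]}

by_prevalence = prevalence_whitelist(sightings, min_sightings=10)
by_provenance = provenance_whitelist(manifests)
```

In this toy data the prevalence list admits the widespread bad code and misses the rare good tool, while the provenance list does the opposite, which is the whitelist false positive described earlier.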
If we don’t have the highest possible quality lists on which to base our allow / deny methods, we are simply adding to our risk and uncertainty. Effective policy decisions (allow/deny) completely depend on good data (both quality and quantity — but the quantity question I’ll reserve for another day).
That was my point. I am sorry if I was not clear. For more on the subject, please read my blog: http://signacert.wordpress.com/2009/06/
Wyatt.
The Whitelist product I used did not allow changed files to run until they were approved. In my case I would allow updates on a couple test PCs, then if there were no issues I would approve those files for the rest of the network. The files were identified by their HASH not just their name. I would perform the same steps for Microsoft Updates as well. You can have the best product in the world, but if you open holes it is worthless.
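The workflow described here (changed files blocked until approved, tried on a few test PCs first, identified by hash rather than name) might be sketched in Python roughly as follows. The `StagedWhitelist` class, its method names and the choice of SHA-256 are assumptions for illustration only; the comment does not say which product or hash algorithm was used.

```python
import hashlib


def file_hash(data: bytes) -> str:
    # SHA-256 is an assumption; the original comment only says "HASH".
    return hashlib.sha256(data).hexdigest()


class StagedWhitelist:
    def __init__(self, test_machines):
        self.test_machines = set(test_machines)
        self.pending = set()    # hashes allowed on test machines only
        self.approved = set()   # hashes allowed on the whole network

    def submit(self, data: bytes) -> str:
        """Register a new or changed file for trial on the test machines."""
        h = file_hash(data)
        if h not in self.approved:
            self.pending.add(h)
        return h

    def approve(self, h: str) -> None:
        """Promote a hash to the whole network after a clean trial."""
        self.pending.discard(h)
        self.approved.add(h)

    def may_run(self, machine: str, data: bytes) -> bool:
        """Decide by content hash, never by file name."""
        h = file_hash(data)
        if h in self.approved:
            return True
        return machine in self.test_machines and h in self.pending
```

Because the decision is keyed on the content hash, a changed file with the same name is treated as unknown until someone approves it.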