Steve Clayton: Fighting Email Spam Is Helping Search For HIV Vaccine
A new Microsoft tool originally developed as a complex email spam filter is now being used to further AIDS research: its data mining algorithm supports specific cell analysis that helps detail virus patterns.
Helping tackle some of the most urgent global challenges is firmly on the agenda for Microsoft Research (MSR): we apply our experience, our depth and breadth of expertise, our partnerships and the power of software to them. In the search for an HIV vaccine, all of this comes into play as we apply high-powered computation, fight HIV with data and, perhaps surprisingly, draw on our experience of building email spam filters to find a solution.
More than 1.8 million people die of HIV-related causes each year — approximately 5,000 deaths per day. One of the great challenges in fighting HIV is that the virus is constantly mutating to avoid attack by the immune system — so much so that it can change as much within one infected person as the influenza virus has throughout recorded history. This makes it incredibly difficult to accurately analyze the virus and develop therapies that attack its elusive weak points. Each mutation means another variable to identify and understand. To complicate things even more, individual immune response varies greatly; some people’s immune systems are able to robustly combat the virus, allowing them to live for years without treatment, while others become sick more quickly as their bodies fail to resist the invasive attack.
As with many MSR projects, this work is being undertaken in partnership with several other experts in the field. Testing of a vaccine in Durban, South Africa, is being led by Bruce Walker, at the Ragon Institute at Massachusetts General Hospital, MIT and Harvard, and a professor of medicine at the University of KwaZulu-Natal. They are joined by the Centre for the AIDS Programme of Research in South Africa and the KwaZulu-Natal Research Institute for Tuberculosis and HIV. This testing program generates vast quantities of data, and analyzing that data presents a huge hurdle.
That’s where David Heckerman and Jonathan Carlson of Microsoft Research, along with a Microsoft Computational Biology Tool called PhyloD, come in. This software enables efficient data mining, which in turn supports specific cell analysis that helps detail virus patterns for further study. PhyloD contains an algorithm, code and visualization tools to perform complex pattern recognition and analysis – enabling Heckerman and his colleagues to learn how different individual immune systems respond to the many mutations of the virus.
Ordinarily, processing this number of variables and possible correlations would take years, but by combining the PhyloD tool with Microsoft’s high-performance computing center, the work can be done in hours. Working with two groups, one led by Christian Brander in Barcelona and another led by Paul Goepfert at the University of Alabama, Heckerman and his team have discovered roughly six times as many possible attack points on HIV as had previously been identified.
When we first met Bruce, he had a very tricky problem to analyze. He had this great data set but he didn’t know how to analyze it. We happened to have just the right algorithm for it and this large bank of computers at Microsoft that could do this massive amount of computation. He gave us the problem on Friday. On Monday, we had a completed analysis for him.
What I personally find most fascinating about this work is that it builds on Heckerman’s earlier work on an email spam filter. The same principles that are being used to fight spam in Hotmail, Outlook and Exchange are being used to tackle HIV. It turns out there are a lot of similarities between the way spammers evolve their approaches to avoid filters and the way HIV constantly mutates.
Originally posted on Next at Microsoft.