Online Virus Tracking Tool Nextstrain Wins Inaugural Open Science Prize

Credit: Trevor Bedford and Richard Neher,

Over the past decade, scientists and clinicians have eagerly deposited their burgeoning biomedical data into publicly accessible databases. However, a lack of computational tools for sharing and synthesizing the data has prevented this wealth of information from being fully utilized.

In an attempt to unleash the power of open-access data, the National Institutes of Health, in collaboration with the Howard Hughes Medical Institute and Britain’s Wellcome Trust, launched the Open Science Prize. Last week, after a multi-stage public voting process, the inaugural award was announced. The winner of the grand prize, worth $230,000, is a prototype computational tool called nextstrain that tracks the spread of emerging viruses such as Ebola and Zika. This tool could be especially valuable in revealing the transmission patterns and geographic spread of new outbreaks before vaccines are available, such as during the 2013-2016 Ebola epidemic and the current Zika epidemic.

An international team of scientists developed nextstrain as an open-access system capable of sharing and analyzing viral genomes. The team was led by NIGMS grantee Trevor Bedford of the Fred Hutchinson Cancer Research Center, Seattle, and Richard Neher of Biozentrum at the University of Basel, Switzerland. The system mines viral genome sequence data that researchers have made publicly available online. nextstrain then rapidly determines the evolutionary relationships among all the viruses in its database and displays the results of its analyses on an interactive public website.

The image here shows nextstrain’s analysis of the genomes from Zika virus obtained in 25 countries over the past few years. Plotting the relatedness of these viral strains on a timeline provides investigators a sense of how the virus has spread and evolved, and which strains are genetically similar. Researchers can upload genome sequences of newly discovered viral strains—in this case Zika—and find out in short order how their new strain relates to previously discovered strains, which could potentially impact treatment decisions.
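To make the idea of "relating a new strain to previously discovered strains" concrete, here is a minimal sketch in Python. It is not nextstrain's method: it simply counts the positions at which two already-aligned sequences differ and reports the nearest known strain. The function names, strain names and sequences are all made up for illustration.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length, aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_strain(new_seq: str, known: dict[str, str]) -> tuple[str, int]:
    """Return the name of the known strain nearest to new_seq, and its distance."""
    name = min(known, key=lambda k: hamming_distance(new_seq, known[k]))
    return name, hamming_distance(new_seq, known[name])


# Toy, invented sequences (hypothetical; not real viral genomes).
known_strains = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTTCGTAC",
    "strain_C": "TCGTTCGAAT",
}
new_strain = "ACGTTCGAAC"

name, distance = closest_strain(new_strain, known_strains)
print(f"Closest known strain: {name} ({distance} difference(s))")
```

Real analyses work with full genomes, align the sequences first and build evolutionary trees rather than relying on a single nearest match, so this sketch only conveys the basic notion of genetic similarity.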

Nearly 100 interdisciplinary teams comprising 450 innovators from 45 nations competed for the Open Science Prize. More than 3,500 people from six continents voted online for the winner. Other finalists for the prize focused on brain maps, gene discovery, air-quality monitoring, neuroimaging and drug discovery.

nextstrain was funded in part by NIH under grant U54GM111274.

El Niño Season Temperatures Linked to Dengue Epidemics

Screen shot from the video showing dengue incidence in Southeast Asia.
Incidence of dengue fever across Southeast Asia, 1993-2010. Note increasing incidence (red) starting about June 1997, which corresponds to a period of higher temperatures driven by a strong El Niño season. At the end of the El Niño event, in January 1999, dengue incidence is much lower (green). Credit: Wilbert van Panhuis, University of Pittsburgh.

Weather forecasters are already warning about an intense El Niño season that’s expected to alter precipitation levels and temperatures worldwide. El Niño seasons, characterized by warmer Pacific Ocean water along the equator, may impact the spread of some infectious diseases transmitted by mosquitoes.

In a study published last month in the Proceedings of the National Academy of Sciences, researchers reported a link between intense dengue fever epidemics in Southeast Asia and the high temperatures that a previous El Niño weather event brought to that region.

Dengue fever, a viral infection transmitted by the Aedes mosquito, can cause life-threatening high fever, severe joint pain and bleeding. Infection rates soar every two to five years. Interested in understanding why, an international team of researchers collected and analyzed incidence reports including 3.5 million dengue fever cases across eight Southeast Asian countries spanning an 18-year period. The study is part of Project Tycho, an effort to study disease transmission dynamics by mining historical data and making that data freely available to others.

Data-Mining Study Explores Health Outcomes from Common Heartburn Drugs

Results of a data-mining study suggest a link between a common heartburn drug and heart attacks. Credit: Stock image.

Scouring anonymized health records of millions of Americans, data-mining scientists found a preliminary association between a common heartburn drug and an elevated risk of heart attacks.

For 60 million Americans, heartburn is a painful and common occurrence caused by stomach acid rising through the esophagus. It’s treated by drugs such as proton-pump inhibitors (PPIs) that lower acid production in the stomach. Taken by about one in every 14 Americans, PPIs, which include Nexium and Prilosec, are the most popular class of heartburn drugs.

PPIs have long been thought to be completely safe for most users. But a preliminary laboratory study published in 2013 suggested that this may not be the case. The study, led by a team of researchers at Stanford University, showed that PPIs could affect biochemical reactions outside of their regular acid suppression action that would have harmful effects on the heart.

Digging Deeply Into Data for the Causes of Disease

Hunting for the cause of a disease can be like tracing a river back to its many sources. Myriad factors, large and small, may contribute to a condition. One approach to the search focuses on the massive amounts of genomic and other biological data that scientists are gathering in the course of their studies. To examine this data and look for meaningful patterns and other clues, scientists turn to bioinformatics, a field focused on the development of analytical methods and software tools.

Here are a few examples of how National Institutes of Health-funded scientists are using bioinformatics to dig deeply into data and learn more about the development of diseases, including Huntington’s, preeclampsia and asthma.

Huntington’s Disease

Network of proteins that interact with huntingtin

Researchers have mapped a network of 2,141 proteins that all interact either directly or through one other protein with huntingtin (red), the protein associated with Huntington’s disease. Credit: Cendrine Tourette, Buck Institute for Research on Aging, J Biol Chem 2014 Mar 7;289(10):6709-26.

The cause of Huntington’s disease, a degenerative neurological disorder with no known cure, may appear simple. It begins with a change in a single gene that alters the shape and functioning of the huntingtin protein. But this protein, whether in its normal or altered form, does not act alone. It interacts with other proteins, which in turn interact with others.

A research team led by Robert Hughes of the Buck Institute for Research on Aging set out to understand how this ripple effect contributes to the breakdown in normal cellular function associated with Huntington’s disease. The scientists used experimental and computational approaches to map a network of 2,141 proteins that interact with the huntingtin protein either directly or through one other protein. They found that many of these proteins were involved in cell movement and intercellular communication. Understanding how the huntingtin protein leads to mistakes in these cellular processes could help scientists pursue new approaches to developing treatments.

Simulating the Potential Spread of Measles

Try out FRED Measles:

  1. Go to http://fred.publichealth.
  2. Select “Get Started”
  3. Pick a state and city
  4. Play both simulations

To help the public better understand how measles can spread, a team of infectious disease computer modelers at the University of Pittsburgh has launched a free, mobile-friendly tool that lets users simulate measles outbreaks in cities across the country.

The tool is part of the Pitt team’s Framework for Reconstructing Epidemiological Dynamics, or FRED, that it previously developed to simulate flu epidemics. FRED is based on anonymized U.S. census data that captures demographic and geographic distributions of different communities. It also incorporates details about the simulated disease, such as how contagious it is.

Screenshot of the FRED simulation.

A free, mobile-friendly tool lets users simulate potential measles outbreaks in cities across the country. Credit: University of Pittsburgh Graduate School of Public Health.


Forecasting Infectious Disease Spread with Web Data

Just as you might turn to Twitter or Facebook for a pulse on what’s happening around you, researchers involved in an infectious disease computational modeling project are turning to anonymized social media and other publicly available Web data to improve their ability to forecast emerging outbreaks and develop tools that can help health officials as they respond.

Mining Wikipedia Data

Screen shot of the Wikipedia site
Incorporating real-time, anonymized data from Wikipedia and other novel sources of information is aiding efforts to forecast and respond to emerging outbreaks. Credit: Stock image.

“When it comes to infectious disease forecasting, getting ahead of the curve is problematic because data from official public health sources is retrospective,” says Irene Eckstrand of the National Institutes of Health, which funds the project, called Models of Infectious Disease Agent Study (MIDAS). “Incorporating real-time, anonymized data from social media and other Web sources into disease modeling tools may be helpful, but it also presents challenges.”

To help evaluate the Web’s potential for improving infectious disease forecasting efforts, MIDAS researcher Sara Del Valle of Los Alamos National Laboratory conducted proof-of-concept experiments involving data that Wikipedia releases hourly to any interested party. Del Valle’s research group built models based on the page view histories of disease-related Wikipedia pages in seven languages. The scientists tested the new models against their other models, which rely on official health data reported from countries using those languages. By comparing the outcomes of the different modeling approaches, the Los Alamos team concluded that the Wikipedia-based modeling results for flu and dengue fever performed better than those for other diseases.
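As a rough illustration of what a model "based on page view histories" might look like, the Python sketch below fits a straight line relating weekly page views to officially reported case counts, then uses the line to estimate cases for a new week. The excerpt here does not describe the Los Alamos models in detail, so the choice of a simple least-squares line, the function names and all of the numbers are assumptions made purely for illustration.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, views: float) -> float:
    """Estimate case counts from a page-view count using the fitted line."""
    return slope * views + intercept


# Invented weekly data (hypothetical): page views of a disease-related
# article and the case counts later reported by health officials.
weekly_page_views = [100.0, 200.0, 300.0, 400.0]
reported_cases = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(weekly_page_views, reported_cases)
estimate = predict(slope, intercept, 500.0)
print(f"Estimated cases for a week with 500 page views: {estimate:.1f}")
```

Because page views are available almost immediately while official counts arrive later, a fitted relationship like this one could in principle give an early estimate, which is the appeal the researchers describe.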

Raking the Family Tree for Disease-Causing Variations

Silhouettes of people with nucleic acid sequences and a stethoscope.
A new software tool analyzes disease-causing genetic variations within a family. Credit: NIH’s National Human Genome Research Institute.

Changes in your DNA sequence occur randomly and rarely. But when they do happen, they can increase your risk of developing common, complex diseases, such as cancer. One way to identify disease-causing variations is to study the genomes of family members, since the changes typically are passed down to subsequent generations.

To rake through a family tree for genetic variations with the highest probabilities of causing a disease, researchers combined several commonly used statistical methods into a new software tool called pVAAST. The scientific team, which included Mark Yandell and Lynn Jorde of the University of Utah and Chad Huff of the University of Texas MD Anderson Cancer Center, used the tool to identify the genetic causes of a chronic intestinal inflammation disease and of developmental defects affecting the heart, face and limbs.

The results confirmed previously identified genetic variations for the developmental diseases and pinpointed a previously unknown variation for the intestinal inflammation. Together, the findings confirm the ability of the tool to detect disease-causing genetic changes within a family. Another research team has already used the software tool to discover rare genetic changes associated with family cases of breast cancer. These studies are likely just the beginning for studying genetic patterns of diseases that run in a family.

This work also was funded by NIH’s National Institute of Diabetes and Digestive and Kidney Diseases; National Cancer Institute; National Human Genome Research Institute; National Heart, Lung, and Blood Institute; and National Institute of Mental Health.

Learn more:
University of Utah News Release (no longer available)
Yandell, Jorde and Huff Labs

Understanding Complex Diseases Through Computation

Scientists developed a computational method that could help identify various subtypes of complex diseases. Credit: Stock image

Complex diseases such as diabetes, cancer and asthma are caused by the intricate interplay of genetic, environmental and lifestyle factors that vary among affected individuals. As a result, the same medications may not work for every patient. Now, scientists have shown that a computational method capable of analyzing more than 100 clinical variables for a large group of people can identify various subtypes of asthma, which could ultimately lead to more targeted and personalized treatments. The research team, led by Wei Wu of Carnegie Mellon University and Sally Wenzel of the University of Pittsburgh, used a computational approach developed by Wu to identify several patient clusters consistent with known subtypes of asthma, as well as a possible new subtype of severe asthma that does not respond well to conventional drug treatment. If supported by further studies, the researchers’ proposed approach could help improve the understanding, diagnosis and treatment not just of asthma but of other complex diseases.

This work also was funded by NIH’s National Heart, Lung, and Blood Institute.

Learn more:
Carnegie Mellon University News Release