30 June 2015
Scientists from the IMPPC publish a new tool to help clinicians and researchers to explore The Cancer Genome Atlas data
The technology available to decipher the genetic code and the surrounding factors that make this code readable is improving rapidly and more and more data is being produced and stored daily. In very general terms, the limiting factor for all this data is our ability to interpret it.
Bioinformaticians use computing techniques to "see" patterns through the mass of data points, but biologists and medical staff often struggle to find meaningful information and very often cannot even identify the correct technique to find it. The difficulty is compounded by the many different data sets held in different places.
The Cancer Genome Atlas (TCGA) is a huge database set up to coordinate efforts to store and analyse data. It is an extremely ambitious project containing information on more than 30 types of cancer. Information about the genetic code (genomics) and information about the regulation layer that controls which genes are active at any time (epigenomics) is combined with detailed clinical and pathological information about different cancers. Further data is being added all the time as new collections of samples are analysed. Both the number of genes involved in a particular cancerous process and the factors switching them on or off can vary tremendously, so knowing where to look, and for what, is a major challenge.
In this paper, scientists from the Epigenetic Mechanisms of Cancer and Cell Differentiation Group at the IMPPC present a tool that gives a wide variety of researchers real-time access to the activity and epigenetic landscape of their gene of interest, and lets them visualize it in any of the tumour types covered by the TCGA, with no need for sophisticated bioinformatic skills. The paper describing this tool has already been ranked highly by Altmetric, a respected measure of attention in this field, which shows that it has attracted the interest of the community.
Wanderer allows researchers to navigate through the many levels of information available in the database. For example, it can compare tumour and non-tumour samples for DNA methylation, one of the ways in which genes are switched on or off, to see which genes are active in the two types of tissue. It carries out statistical analysis and provides graphs of the results, which can be downloaded.
It then allows researchers to zoom in or out from different points of the genome, locate specific genes or regions and label them or analyse different aspects. It also facilitates data sharing and links to information in other resources.
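The tumour versus non-tumour comparison described above can be illustrated with a minimal sketch. It is not Wanderer's own code, and the article does not say which statistical test the tool applies; the rank-based Mann-Whitney U statistic used here is an assumption chosen for illustration. The probe names and methylation (beta) values are hypothetical, and the helper functions `mann_whitney_u` and `compare_probe` are invented for this example.

```python
from statistics import mean


def mann_whitney_u(a, b):
    """U statistic for sample a versus b.

    Counts the pairs (x, y) with x from a and y from b where x > y;
    ties count as 0.5. U ranges from 0 to len(a) * len(b).
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def compare_probe(tumour, normal):
    """Summarise one methylation probe across the two tissue groups."""
    return {
        "delta": mean(tumour) - mean(normal),  # mean beta difference
        "u": mann_whitney_u(tumour, normal),
        "u_max": len(tumour) * len(normal),
    }


# Hypothetical beta values: 0 = unmethylated, 1 = fully methylated.
probes = {
    "probe_A": {"tumour": [0.82, 0.79, 0.91, 0.85],
                "normal": [0.12, 0.20, 0.15, 0.18]},
    "probe_B": {"tumour": [0.40, 0.55, 0.47, 0.51],
                "normal": [0.45, 0.50, 0.49, 0.52]},
}

results = {
    name: compare_probe(groups["tumour"], groups["normal"])
    for name, groups in probes.items()
}

for name, r in results.items():
    print(f"{name}: delta={r['delta']:+.3f}, U={r['u']:.1f} of {r['u_max']}")
```

In this made-up data, probe_A is clearly more methylated in tumour samples (U reaches its maximum), while probe_B shows no separation between the groups (U sits at half its maximum). A real analysis would convert U into a p-value and correct for the number of probes tested.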
Wanderer is an intuitive and easy-to-use interface for looking at complex data regarding the genetic code and the structures controlling its behaviour for all types of cancer in the TCGA database.
Integration and mining of data is a hot topic in cancer research. Many tools that focus on data retrieval and indexing exist and others link layers of TCGA data to clinical information. Tools used by trained bioinformaticians are listed on the TCGA website but often they remain extremely difficult to use for classical experimental biologists and medical doctors who are looking for specific information. Wanderer offers gene-centred access to data in TCGA to a wider audience and the way it displays information is an outstanding feature. It can give an instant view of specific regions of the genome; such quick visualization of this complex information was not possible before and it will help researchers take a peek and then make informed decisions about how to proceed with treating the data and designing experiments. These needs are not restricted to cancer biology and other investigators may want to look in on their favourite genes, especially to see what is happening in normal tissues.
This tool will make some of the work routinely carried out by researchers easier and much quicker. It will help those who don't have training in bioinformatics find the information they are looking for more efficiently, saving them time and allowing them to plan further analysis and future experiments to improve their effectiveness. It is a powerful tool for extracting useful information from valuable sources of data, which contain huge quantities of information that we are only just beginning to be able to extract, organize and act on.
The original paper is:
Epigenetics and Chromatin
23 June 2015
Anna Díez-Villanueva, Izaskun Mallona, Miguel A Peinado