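Nanhu News Network (Correspondent Luo Zhihui): On June 13, the laboratory of Professor Chen Zhenxia of our university's College of Life Science and Technology, College of Biomedicine and Health, and Hubei Hongshan Laboratory, together with Professor Zhang Wen of the College of Informatics, developed DSEATM, a drug enrichment analysis method based on text mining that can be used to reveal the pathogenic mechanisms of diseases. The results were published in Briefings in Bioinformatics under the title "DSEATM: drug set enrichment analysis uncovering disease mechanisms by biomedical text mining".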

Figure 1. DSEATM drug enrichment analysis workflow
Disease pathogenesis has always been a major topic in biomedical research. With the exponential growth of biomedical information, drug effect analysis for specific phenotypes has shown great promise in uncovering disease-associated pathways. However, this method has only been applied to a limited number of drugs. Here, we extracted data on 4634 diseases, 3671 drugs, 112,809 disease–drug associations and 81,527 drug–gene associations by text mining of 29,168,919 publications. On this basis, we proposed a ‘Drug Set Enrichment Analysis by Text Mining (DSEATM)’ pipeline and applied it to 3250 diseases, where it outperformed the state-of-the-art method. Furthermore, disease pathways enriched by DSEATM were similar to those obtained using the TCGA cancer RNA-seq differentially expressed genes. In addition, the number of drugs, which showed a strong positive correlation of 0.73 with the AUC, plays a determining role in the performance of DSEATM. Taken together, DSEATM is a promising and accurate disease research tool that offers fresh insights.
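To illustrate the general idea of drug set enrichment, the sketch below ranks pathways by how strongly the drugs associated with a disease overlap the drugs targeting each pathway. It is not the DSEATM implementation: the abstract does not state which statistic the pipeline uses, so a one-sided hypergeometric over-representation test is assumed here, and the names `hypergeom_sf`, `enrich`, the pathway labels, and the toy drug sets are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement
    from N items, K of which count as hits."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich(disease_drugs, pathway_drug_sets, universe_size):
    """Rank pathways by over-representation of the disease's drugs.

    disease_drugs: set of drugs linked to the disease (assumed input).
    pathway_drug_sets: dict mapping pathway name -> set of drugs
        targeting genes in that pathway (assumed input).
    universe_size: total number of drugs considered.
    """
    results = []
    for pathway, drugs in pathway_drug_sets.items():
        overlap = len(disease_drugs & drugs)
        p = hypergeom_sf(overlap, universe_size, len(drugs), len(disease_drugs))
        results.append((pathway, overlap, p))
    # Smallest p-value first: strongest enrichment at the top.
    return sorted(results, key=lambda r: r[2])


# Toy data, invented for illustration only.
disease_drugs = {"a", "b", "c"}
pathways = {
    "P1": {"a", "b", "c", "d"},
    "P2": {"x", "y"},
}
ranked = enrich(disease_drugs, pathways, universe_size=10)
for pathway, overlap, p in ranked:
    print(pathway, overlap, round(p, 4))
```

In a real analysis the p-values would also need correction for multiple testing across pathways, which this sketch omits.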