Computational methods for predicting peptides that bind Major Histocompatibility Complex (MHC) class II molecules play an important role in understanding immune recognition and in epitope discovery. An effective computational method must account for two important characteristics of the problem: (1) the length of binding peptides is highly variable; and (2) MHC molecules are extremely polymorphic, and for the vast majority of them there is insufficient training data. We have recently developed three computational methods for MHC class II peptide binding prediction: TEPITOPEpan, MHC2SKpan and MHC2MIL. TEPITOPEpan is a PSSM (position-specific scoring matrix) based method that extends a classic method, TEPITOPE, to cover all HLA-DR molecules. MHC2SKpan is a kernel-based method that uses a new string kernel, MHC2SK, to measure similarity among peptides of variable length. Finally, MHC2MIL is a multiple instance learning based method that additionally considers both peptide flanking regions and residue positions. Experimental results on multiple benchmark datasets demonstrate the superior performance of these methods. They are freely available at http://datamining-iip.fudan.edu.cn/server.html.
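To make the PSSM idea concrete, the following is a minimal sketch of how a position-specific scoring matrix can be applied to a peptide of arbitrary length: every 9-residue window is scored as a candidate binding core and the best-scoring window is kept. It is an illustration under stated assumptions, not the TEPITOPEpan implementation. The function names (`toy_pssm`, `score_core`, `best_core`) are hypothetical, and the matrix values are invented for the example rather than taken from any real HLA-DR profile.

```python
from typing import Dict, List, Tuple

CORE_LEN = 9  # class II binding cores are conventionally modeled as 9-mers
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

Pssm = List[Dict[str, float]]  # one residue -> score table per core position


def toy_pssm() -> Pssm:
    """Build a hypothetical PSSM: zero everywhere except a few made-up weights."""
    pssm = [{aa: 0.0 for aa in AMINO_ACIDS} for _ in range(CORE_LEN)]
    pssm[0].update({"F": 2.0, "Y": 1.5, "W": 1.0})  # invented preference at position 1
    pssm[3].update({"D": 1.0})                      # invented preference at position 4
    pssm[5].update({"T": 0.5})                      # invented preference at position 6
    pssm[8].update({"L": 1.0})                      # invented preference at position 9
    return pssm


def score_core(core: str, pssm: Pssm) -> float:
    """Sum the per-position scores of a single 9-mer."""
    return sum(pssm[i][aa] for i, aa in enumerate(core))


def best_core(peptide: str, pssm: Pssm) -> Tuple[float, str]:
    """Slide a 9-residue window over the peptide and return (score, core) of the best one."""
    if len(peptide) < CORE_LEN:
        raise ValueError("peptide is shorter than the 9-residue core")
    windows = [peptide[i:i + CORE_LEN] for i in range(len(peptide) - CORE_LEN + 1)]
    return max((score_core(w, pssm), w) for w in windows)


if __name__ == "__main__":
    score, core = best_core("AAFAADATAALAA", toy_pssm())
    print(score, core)
```

Taking the maximum over windows is one simple way to cope with variable peptide length, because the score depends only on the best candidate core and not on how long the flanking regions are.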
Shanfeng Zhu is an associate professor in the School of Computer Science and the Shanghai Key Lab of Intelligent Information Processing at Fudan University. His research focuses on developing and applying data mining and machine learning methods for bioinformatics and information retrieval, especially biomedical text mining, immunological informatics and drug discovery.