Annotation: dorsal/arxiv

Authors	Laurent Jacob, Jean-Philippe Vert
Categories	q-bio.QM
ArXiv ID	q-bio/0702008
URL	https://arxiv.org/abs/q-bio/0702008
Journal	We use various multitask kernels to improve MHC-I-peptide binding prediction, in particular for MHC alleles for which little training data is available. (05/02/2007)

Authors

Laurent Jacob, Jean-Philippe Vert

Abstract

Motivation: In silico methods for predicting antigenic peptides that bind to MHC class I molecules play an increasingly important role in the identification of T-cell epitopes. Statistical and machine learning methods, in particular, are widely used to score candidate epitopes based on their similarity to known epitopes and non-epitopes. The genes coding for the MHC molecules, however, are highly polymorphic, and statistical methods have difficulty building models for alleles with few known epitopes. In this case, recent work has demonstrated the utility of leveraging information across alleles to improve prediction performance.

Results: We design a support vector machine algorithm that learns epitope models for all alleles simultaneously by sharing information across similar alleles. This sharing is controlled by a user-defined measure of similarity between alleles. We show that this similarity can be defined in terms of supertypes, or more directly by comparing key residues known to play a role in peptide-MHC binding. We illustrate the potential of this approach on various benchmark experiments, where it outperforms other state-of-the-art methods.
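The idea of sharing information across alleles through a similarity measure can be sketched as a kernel on (peptide, allele) pairs. The example below is a minimal illustration, not the authors' implementation: it assumes the joint kernel is the product of a peptide kernel and an allele kernel, and every name in it (`peptide_kernel`, `allele_kernel`, `multitask_kernel`, `SUPERTYPE`, the `shared` weight of 0.5) is hypothetical and chosen only for this sketch.

```python
# Illustrative sketch only: the kernel forms, the supertype table, and the
# `shared` weight are assumptions, not values taken from the paper.

# Assumed supertype assignment for a handful of alleles.
SUPERTYPE = {"A*0201": "A2", "A*0206": "A2", "B*0702": "B7"}


def peptide_kernel(p, q):
    """Fraction of positions at which two equal-length peptides agree."""
    if len(p) != len(q):
        raise ValueError("peptides must have equal length")
    return sum(a == b for a, b in zip(p, q)) / len(p)


def allele_kernel(a, b, shared=0.5):
    """1 for identical alleles, `shared` within a supertype, 0 otherwise."""
    if a == b:
        return 1.0
    if SUPERTYPE[a] == SUPERTYPE[b]:
        return shared
    return 0.0


def multitask_kernel(x, y):
    """Product kernel on (peptide, allele) pairs."""
    (p, a), (q, b) = x, y
    return peptide_kernel(p, q) * allele_kernel(a, b)


def gram_matrix(samples):
    """Pairwise kernel values for a list of (peptide, allele) samples."""
    return [[multitask_kernel(x, y) for y in samples] for x in samples]


samples = [
    ("SLYNTVATL", "A*0201"),
    ("SLYNTVATL", "A*0206"),
    ("SLYNTVATL", "B*0702"),
]
K = gram_matrix(samples)
```

With this choice, a training peptide for one allele contributes to the decision function of another allele only in proportion to the allele similarity, so alleles in different supertypes share nothing. A Gram matrix like `K` could be handed to any SVM solver that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.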

{ "annotation_id": "325263ee-3b57-4207-8242-65386d28ab7a", "date_created": "2026-03-02T18:01:35.713000Z", "date_modified": "2026-03-02T18:01:35.713000Z", "file_hash": "b5754f7420f0f837c96d119c37f00c929767d402d7489285eef499a31fea62bc", "private": false, "record": { "abstract": "Motivation: In silico methods for the prediction of antigenic peptides\nbinding to MHC class I molecules play an increasingly important role in the\nidentification of T-cell epitopes. Statistical and machine learning methods, in\nparticular, are widely used to score candidate epitopes based on their\nsimilarity with known epitopes and non epitopes. The genes coding for the MHC\nmolecules, however, are highly polymorphic, and statistical methods have\ndifficulties to build models for alleles with few known epitopes. In this case,\nrecent works have demonstrated the utility of leveraging information across\nalleles to improve the performance of the prediction. Results: We design a\nsupport vector machine algorithm that is able to learn epitope models for all\nalleles simultaneously, by sharing information across similar alleles. The\nsharing of information across alleles is controlled by a user-defined measure\nof similarity between alleles. We show that this similarity can be defined in\nterms of supertypes, or more directly by comparing key residues known to play a\nrole in the peptide-MHC binding. We illustrate the potential of this approach\non various benchmark experiments where it outperforms other state-of-the-art\nmethods.", "arxiv_id": "q-bio/0702008", "authors": [ "Laurent Jacob", "Jean-Philippe Vert" ], "categories": [ "q-bio.QM" ], "journal_ref": "We use various multitask kernels in order to improve MHC-I-peptide\n binding prediction, in particular for MHC alleles for which few training data\n is available. \n(05/02/2007)", "title": "Epitope prediction improved by multitask support vector machines", "url": "https://arxiv.org/abs/q-bio/0702008" }, "schema_id": "dorsal/arxiv", "source": { "execution_id": "ed3862bd-d8d7-491d-9e5d-a50738cf439a", "id": "arXiv Dataset IDs", "type": "Model", "variant": "snapshot-2026-03-01", "version": "0.1.0" }, "user_id": 1000002 }