A new paper in Science Advances details how scientists managed to map a central part of the immune system – HLA class II molecules – while accurately predicting how these molecules display pathogen fragments on cell surfaces.
When we are sick, the immune system relies on signs on the surface of cells indicating the presence of a foreign body inside. Immune cells, especially T cells, latch onto the cell surface and kill the cancer, virus, or other pathogen present, provided they can detect the threat.
Cells alert the immune system to the intruder with special proteins called human leukocyte antigen (HLA) molecules. These molecules are responsible for letting the immune system know that something is wrong.
“When a cell is infected, everything in it is hidden from the immune system, which lives outside the cells,” explains Morten Nielsen, professor at DTU Health Technology and corresponding author of the article in Science Advances announcing the mapping of over 96% of the entire HLA class II landscape.
“The reason the body can detect that something is hiding inside the cell is the HLA molecules and the fact that they take protein fragments from the pathogen inside the cell, transport them to the surface and display them. If the fragments have properties that are not recognizable, the immune system triggers a reaction that kills the cell.
“But the rules governing which protein fragments are displayed and which are not, as well as their other properties, have been very unclear for many years because there are many different HLA variants. One could argue that there are more than 50,000 ways to display our protein fragments.”
Nielsen has been working on HLA for 20 years and has made significant contributions to the process of developing treatments to help and train the immune system to fight disease. Much of the progress made in cancer immunotherapy is linked to tools developed by Nielsen.
In the paper, titled “Accurate prediction of HLA class II antigen presentation across all loci using tailored data acquisition and refined machine learning,” scientists from DTU, the University of Oklahoma, Leiden University and the company pureMHC complete the mapping of the entire system or, as it is called in the article, the “specificity tree” of HLA class II.
20 years of preparation
It took 20 years to complete the map of the HLA class II specificity landscape for several reasons. For one thing, HLA molecules are never the same from one person to another. The genes that encode them differ greatly, so different people have different HLA types that recognize different parts of a pathogen.
Although they all play a central role in the functioning of the immune system by displaying protein fragments, they affect health in different ways. Some make us more susceptible to autoimmune diseases, in which the immune system attacks the body. Some increase the likelihood of rejecting an organ transplant. Some affect how the immune system responds to treatments such as vaccines or medications.
Additionally, each HLA class II molecule has two parts: an alpha part and a beta part. These in turn come from three different groups of genes: DR, DP and DQ. The DR group has one main gene, DRB1, and three other genes, DRB3, DRB4 and DRB5. The DP and DQ groups each have two genes: DPA and DPB, and DQA and DQB, respectively. The alpha and beta parts can come from the same chromosome or from different chromosomes.
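To give a feel for how quickly the combinations multiply, here is a minimal Python sketch of the pairing described above. It assumes a hypothetical person who carries one alpha and one beta allele per chromosome copy at the DQ and DP groups. The allele labels (`DQA-x`, `DPB-y` and so on), the `genotype` layout and the `pairings` helper are placeholders invented for illustration, not real HLA allele names or anything taken from the paper. Whether every pairing actually forms a working molecule is not something the sketch addresses.

```python
from itertools import product

# Hypothetical genotype: one alpha and one beta allele per chromosome copy.
# Index 0 is the first chromosome copy, index 1 the second.
# Allele labels are placeholders, not real HLA allele names.
genotype = {
    "DQ": {"alpha": ["DQA-x", "DQA-y"], "beta": ["DQB-x", "DQB-y"]},
    "DP": {"alpha": ["DPA-x", "DPA-y"], "beta": ["DPB-x", "DPB-y"]},
}


def pairings(locus):
    """List every alpha/beta pairing at one gene group.

    A pairing is labeled "cis" when both parts come from the same
    chromosome copy and "trans" when they come from different copies.
    """
    result = []
    for (i, alpha), (j, beta) in product(
        enumerate(locus["alpha"]), enumerate(locus["beta"])
    ):
        result.append((alpha, beta, "cis" if i == j else "trans"))
    return result


for name, locus in genotype.items():
    for alpha, beta, kind in pairings(locus):
        print(f"{name}: {alpha} + {beta} ({kind})")
```

Under these assumptions, each gene group yields four candidate alpha/beta pairings per person, two from the same chromosome and two across chromosomes, which hints at why the number of possible molecules across a population becomes so large.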
It has sometimes been assumed that knowledge of DRB1 was sufficient, or that other combinations were less important, to characterize the HLA class II functional space. However, several other HLA class II molecules appear to play an essential role, for example in autoimmune diseases and in whether transplanted organs are rejected. They may also be essential in the treatment of other diseases, which is why interest in creating immunotherapy treatments that recognize them is increasing.
Regardless, there are many possible combinations in the HLA class II system, and because only DRB1 molecules have been studied and mapped extensively, understanding of the HLA class II complex as a whole has been lacking.
Large-scale datasets and machine learning
To understand how the myriad HLA class II genes affect health, Nielsen and his colleagues needed to know what types of pathogens the molecules recognize and how they present them to our immune systems. To do this and understand the rules defining HLA class II, they integrated large-scale, high-quality datasets covering a wide variety of HLA class II molecules and their specificities. They then applied bespoke machine learning frameworks, improving their ability to accurately predict how the molecules behave.
“Twenty years ago, we were looking at 500 data points from a molecule, but we quickly realized that there were rules to follow. We didn’t need to measure everything. So, gradually, our understanding has grown, as has the technology available. We have gone from our first paper containing one molecule to our latest paper, which covers 50,000 molecules. All are described in detail,” says Nielsen.
“We have overcome all obstacles and fully understand the role of each HLA class II molecule. For example, our tools have been used for 15 years in the development of cancer immunotherapy and have served as a cornerstone for many companies developing cancer vaccines. The tools are the most widely used.”
“With the current paper, we now offer a comprehensive toolbox, a toolbox that can also be used for viral infections or autoimmune diseases. There will still be a lot of research in this area, but conceptually, I think the journey is over, and I don’t believe anything else can happen.”
More information:
Jonas B. Nilsson et al, Accurate prediction of HLA class II antigen presentation across all loci using tailored data acquisition and refined machine learning, Science Advances (2023). DOI: 10.1126/sciadv.adj6367
Provided by the Technical University of Denmark
Citation: Scientists map the antigenic landscape (November 27, 2023), retrieved November 27, 2023