For most human proteins, no small molecules (so-called “ligands”) are known to bind them. Ligands frequently represent important starting points for drug development, so this knowledge gap critically hampers the development of novel medicines. Researchers at CeMM, in a collaboration with Pfizer, have now leveraged and scaled a method to measure the binding activity of hundreds of small molecules against thousands of human proteins. This large-scale study revealed tens of thousands of ligand-protein interactions that can now be explored for the development of chemical tools and therapeutics. Moreover, powered by machine learning and artificial intelligence, it allows unbiased predictions of how small molecules interact with all proteins present in living human cells. These groundbreaking results have been published in the journal Science, and all generated data and models are freely available to the scientific community.
The majority of all drugs are small molecules that influence the activity of proteins. These small molecules — if well understood — are also invaluable tools to characterize the behavior of proteins and to do basic biological research. Given these essential roles, it is surprising that for more than 80 percent of all proteins, no small-molecule binders have been identified so far. This hinders the development of novel drugs and therapeutic strategies, but likewise prevents novel biological insights into health and disease.
To close this gap, researchers at CeMM in collaboration with Pfizer have expanded and scaled an experimental platform that enables them to measure how hundreds of small molecules with various chemical structures interact with all expressed proteins in living cells. This yielded a rich catalog of tens of thousands of ligand-protein interactions that can now be optimized to serve as starting points for further therapeutic development. In their study, the team led by CeMM PI Georg Winter exemplified this by developing small-molecule binders of cellular transporters, of components of the cellular degradation machinery, and of understudied proteins involved in cellular signal transduction. Moreover, taking advantage of the large dataset, the team developed machine learning and artificial intelligence models that can predict how additional small molecules interact with proteins expressed in living human cells.
“We were amazed to see how artificial intelligence and machine learning can elevate our understanding of small-molecule behavior in human cells. We hope that our catalog of small molecule-protein interactions and the associated artificial intelligence models can now provide a shortcut in drug discovery approaches,” says Georg Winter. To maximize the potential impact and usefulness for the scientific community, all data and models are made freely available through a web application. “This was an outstanding partnership between industry and academia. We are delighted to present the results which were obtained through three years of close collaboration and teamwork between the groups. It’s been a great project,” says Dr Patrick Verhoest, Vice President and Head of Medicine Design at Pfizer.