Using Machine Learning to identify and improve protein-ligand binding affinity predictions

AI is expected to play an increasingly important role in the years to come, especially in biological studies. Neural networks, a major part of AI, depend on very large datasets to achieve their success.

Huge advances in biological sciences and technologies have led to the accumulation of unprecedented amounts of biomolecular data. For example, the Protein Data Bank holds about 150,000 three-dimensional biomolecular structures (Berman, Westbrook et al. 2000). Alongside this abundance of biological structures, there is a wide range of data analysis methods and models, including data mining, manifold learning, and graph or network models. Topological data analysis (TDA), for example, holds great promise in the big data era and has become increasingly popular in bioinformatics and computational biology over the past two decades.

Data-driven learning models are among the most important and rapidly evolving areas in chemoinformatics and bioinformatics. Featurization, or feature engineering, is key to the performance of machine learning models in material, chemical, and biological systems. For this reason, a group of researchers at the School of Physical and Mathematical Sciences at Nanyang Technological University Singapore is using high performance computing to develop a new molecular representation framework, known as persistent spectral (PerSpect), and PerSpect-based machine learning (PerSpect ML) for protein-ligand binding affinity prediction. The proposed PerSpect theory provides a powerful feature engineering framework, and PerSpect ML models show great potential to significantly improve the performance of learning models in molecular data analysis.

High performance computing plays a pivotal role in the team’s daily work. The project needs to process large databases that contain thousands of entries. The databases need to be divided into several pieces, and parallel computing has to be employed to process each part. Additionally, some algorithms are time- and memory-intensive, so computing resources with multiple cores and large memory are needed to run them.
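The divide-and-process pattern described above can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the function names, the `use_threads` option, and the placeholder `featurize_chunk` work are all hypothetical, chosen only to show how a list of database entries might be split into pieces and handled by a pool of workers.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import List, Optional, Sequence


def split_into_chunks(entries: Sequence[str], n_chunks: int) -> List[List[str]]:
    """Divide entries into at most n_chunks contiguous pieces of near-equal size."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, extra = divmod(len(entries), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional entry each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few entries
            chunks.append(list(entries[start:end]))
        start = end
    return chunks


def featurize_chunk(chunk: List[str]) -> List[int]:
    """Hypothetical placeholder for the per-entry computation.

    A real workload would load each structure and compute its features;
    here the length of the entry identifier stands in for that result.
    """
    return [len(entry_id) for entry_id in chunk]


def process_in_parallel(
    entries: Sequence[str],
    n_chunks: int,
    max_workers: Optional[int] = None,
    use_threads: bool = False,
) -> List[int]:
    """Process each chunk on its own worker and return results in input order."""
    chunks = split_into_chunks(entries, n_chunks)
    # Processes suit CPU-bound work; threads are an option for I/O-bound work.
    executor_cls = ThreadPoolExecutor if use_threads else ProcessPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        # map() yields chunk results in the same order as the chunks.
        results = pool.map(featurize_chunk, chunks)
        return [value for chunk_result in results for value in chunk_result]
```

When using the process-based pool from a script, the call to `process_in_parallel` would normally sit under an `if __name__ == "__main__":` guard. On a cluster, the same chunking idea is often applied at the job-scheduler level instead, with each chunk submitted as a separate job.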

To find out more about how NSCC’s HPC resources can help you, please contact [email protected].

NSCC NewsBytes May 2021
