The European-funded ExCAPE project investigates how the power of supercomputers can speed up drug discovery using machine learning. One of the machine learning algorithms it uses is “Matrix Factorization” (MF). MF is a core machine learning technique for collaborative filtering applications, such as recommender systems. In drug discovery it can be used to predict the interaction between chemical compounds and protein targets.
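To make the idea concrete, the following is a minimal sketch of matrix factorization on a partially observed compound-by-target activity matrix. It is not the ExCAPE implementation: the solver (alternating least squares), the function name `factorize`, the synthetic data, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def factorize(R, mask, rank=2, reg=0.1, n_iters=50, seed=0):
    """Approximate the observed entries of R by U @ V.T (alternating least squares).

    R    : (n_compounds, n_targets) activity matrix; unobserved entries are ignored
    mask : boolean array of the same shape, True where an activity was measured
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = R.shape
    U = rng.normal(scale=0.1, size=(n_rows, rank))
    V = rng.normal(scale=0.1, size=(n_cols, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix V and solve a small ridge regression for every compound (row).
        for i in range(n_rows):
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ R[i, mask[i]])
        # Fix U and solve a small ridge regression for every target (column).
        for j in range(n_cols):
            Uo = U[mask[:, j]]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ R[mask[:, j], j])
    return U, V

# Hypothetical synthetic data: 20 compounds, 15 targets, about 70% of entries measured.
rng = np.random.default_rng(1)
R = rng.normal(size=(20, 2)) @ rng.normal(size=(15, 2)).T
mask = rng.random(R.shape) < 0.7

U, V = factorize(R, mask)
predictions = U @ V.T  # includes predictions for the unmeasured pairs
rmse_observed = np.sqrt(np.mean((predictions[mask] - R[mask]) ** 2))
```

Each row of `U` is a latent vector for a compound and each row of `V` a latent vector for a target; their inner product is the predicted interaction, including for pairs that were never measured.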
The Matrix Factorization technique studied in ExCAPE is Bayesian Matrix Factorization (BMF). BMF has the advantage of providing confidence estimates, but it is more computationally intensive. A high-performance parallel implementation of BMF suitable for HPC was therefore needed. With help from the POP CoE, this implementation was developed and optimized. It allowed the ExCAPE team to gain new insights into compound-protein interaction, thanks to large-scale models built on datasets that were previously intractable.
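The sketch below illustrates, under simplifying assumptions, why a Bayesian treatment yields confidence estimates and why it costs more: instead of one point estimate, a Gibbs sampler draws many factorizations, and the spread of their predictions serves as an uncertainty measure. This is not the ExCAPE code. It assumes fixed Gaussian priors on the factors (precision `lam`) and a fixed noise precision `alpha`, whereas full BMF formulations usually also sample these hyperparameters. The names `sample_factor` and `gibbs_bmf` and the synthetic data are hypothetical.

```python
import numpy as np

def sample_factor(R, mask, other, alpha, lam, rng):
    """Draw every row of one factor matrix from its Gaussian conditional posterior."""
    rank = other.shape[1]
    out = np.empty((R.shape[0], rank))
    for i in range(R.shape[0]):
        Vo = other[mask[i]]
        precision = lam * np.eye(rank) + alpha * Vo.T @ Vo
        mean = np.linalg.solve(precision, alpha * Vo.T @ R[i, mask[i]])
        # If precision = L L^T, then mean + L^{-T} z has covariance precision^{-1}.
        chol = np.linalg.cholesky(precision)
        out[i] = mean + np.linalg.solve(chol.T, rng.standard_normal(rank))
    return out

def gibbs_bmf(R, mask, rank=2, alpha=25.0, lam=1.0,
              n_samples=200, burn_in=50, seed=0):
    """Return the posterior mean and standard deviation of the predictions."""
    rng = np.random.default_rng(seed)
    V = rng.normal(scale=0.1, size=(R.shape[1], rank))
    total = np.zeros(R.shape)
    total_sq = np.zeros(R.shape)
    for it in range(burn_in + n_samples):
        U = sample_factor(R, mask, V, alpha, lam, rng)
        V = sample_factor(R.T, mask.T, U, alpha, lam, rng)
        if it >= burn_in:  # discard early samples taken before the chain settles
            pred = U @ V.T
            total += pred
            total_sq += pred ** 2
    mean = total / n_samples
    std = np.sqrt(np.maximum(total_sq / n_samples - mean ** 2, 0.0))
    return mean, std

# Hypothetical synthetic data: noisy rank-2 activities, about 70% measured.
rng = np.random.default_rng(1)
R = rng.normal(size=(20, 2)) @ rng.normal(size=(15, 2)).T
R += 0.2 * rng.normal(size=R.shape)
mask = rng.random(R.shape) < 0.7

mean, std = gibbs_bmf(R, mask)
rmse_observed = np.sqrt(np.mean((mean[mask] - R[mask]) ** 2))
```

Every sample requires solving a small linear system per compound and per target, and hundreds of samples are drawn. The row updates within one factor are independent given the other factor, which is the kind of structure a parallel implementation can exploit.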
This webinar presented the ExCAPE project and the Bayesian Matrix Factorization techniques used, and explained how the POP CoE gave the team crucial insights into the scaling bottlenecks of their code and so helped remove them. It also demonstrated how the HPC infrastructure and implementations were crucial to producing insights that helped the pharma industry in its drug discovery process.
The presentation slides are also available here.
About the Presenter
Tom Vander Aa is a researcher and project coordinator in the ExaScience Life Lab at imec. This lab creates new supercomputer solutions to generate breakthroughs in life sciences and biotechnology. His interests are in software engineering for high-performance computing and machine learning. Before joining the ExaScience Lab, he was at Target Compiler Technologies and in the architecture and compiler group at imec, working on low-energy, high-performance architectures and compilation techniques.