Accelerating the drug discovery process through effective protein crystal recognition and classification for a leading Pharma company
Enabling a top Pharmaceutical company to significantly improve the efficiency of Protein Crystal recognition and classification, thereby accelerating the drug discovery process.
About the Customer
The customer is one of the top 10 drug companies (source: Forbes 2020) and a leading drug maker in the US. The company has been focused on deploying machine learning techniques to reduce operation time and improve the accuracy of its drug discovery process.
The customer contacted Brillio to leverage our experience in the Machine Learning and Product Engineering space to execute the classification of protein images for better drug delivery.
For the task at hand, the analytical problem may be stated as:
Leveraging the output of deep learning models to analyze the results of protein crystallization experiments and identify the formation of protein crystals from microscope images
Creating a scoring mechanism to score, and thereby classify, new images
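As an illustration of the scoring step, the sketch below turns raw model outputs into per-class probabilities and picks the most probable class. It is a minimal example under stated assumptions, not the scoring mechanism used in the project: the class names in CLASSES and the function names are hypothetical, since the actual labels are not listed here.

```python
import math

# Hypothetical outcome classes; the actual label set is not given in this case study.
CLASSES = ["clear", "precipitate", "crystal", "other"]


def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_image(logits):
    """Return (label, score) for the most probable class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]
```

Calling score_image with one model output per class returns the predicted label together with a score between 0 and 1 that can be stored alongside the image.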
Identifying protein crystals is vital to the process of drug discovery, but the volume of observational data can be overwhelming for a human to analyse. This task is traditionally completed manually by highly skilled crystallographers, and is very resource intensive as well as prone to errors.
Devising an AI-based solution for analysing results of Protein Crystallization experiments would enable an automated and reliable way to identify the elusive Protein Crystal outcome.
This problem is well suited to machine learning, specifically computer vision. The key challenge lies in designing a solution that is efficient, consistent, accurate and adaptable in recognizing and classifying protein crystallization outcomes.
Step 1: Preprocessing of Data
More than 1 million images were preprocessed to standardize images from different sources.
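The exact preprocessing steps are not described here, so the following is only a sketch of one common way to standardize images from different sources: resize every image to a fixed shape and rescale pixel intensities to a common range. It assumes grayscale images stored as nested lists, and the function names and the default target size are hypothetical.

```python
def normalize(image):
    """Rescale pixel intensities to the range [0.0, 1.0]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on uniform images
    return [[(p - lo) / span for p in row] for row in image]


def resize_nearest(image, out_h, out_w):
    """Resize a 2-D grayscale image using nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def standardize(image, size=(4, 4)):
    """Bring an image to a fixed size and a common intensity range."""
    return normalize(resize_nearest(image, size[0], size[1]))
```

In practice an image library would do the resizing, but the idea is the same: every image reaches the model with the same dimensions and value range, whichever microscope produced it.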
Step 2: Image classification algorithm
The latest hybrid convolutional neural network algorithms, based on architecture classes such as Inception, ResNet, MobileNet and DenseNet, were used for image recognition and classification.
Step 3: Training Algorithm on SageMaker
Multiple models were built and trained to improve prediction accuracy. Hyperparameter tuning was used to iterate efficiently over different model configurations and arrive at the one with the highest accuracy. Models were containerized using Docker.
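The idea behind hyperparameter tuning can be sketched as a simple random search: sample a configuration, train and evaluate it, and keep the best one. This is a generic illustration, not the SageMaker tuning setup used in the project. The search space is hypothetical, and train_and_evaluate is a stand-in that returns an arbitrary made-up score instead of running a real training job.

```python
import random

# Hypothetical search space; the values actually explored are not stated.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [16, 32, 64],
    "architecture": ["inception", "resnet", "mobilenet", "densenet"],
}


def train_and_evaluate(config):
    """Stand-in for a real training job; the returned score is arbitrary."""
    score = 0.80
    score += 0.05 if config["learning_rate"] == 1e-3 else 0.0
    score += 0.03 if config["batch_size"] == 32 else 0.0
    score += 0.02 if config["architecture"] == "resnet" else 0.0
    return score


def random_search(trials, seed=0):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        score = train_and_evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

A managed tuning service follows the same loop, but runs the trials as parallel training jobs and can use smarter sampling strategies than uniform random choice.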
Step 4: Testing & Deployment in AWS
The model with the highest accuracy was deployed in both real-time inference and batch inference modes.
Step 5: Creating Web Application
A web app was developed for running inference on protein outcome images. It enables review of crystallography images and their labels as inferred by the ML model, and includes sorting, filtering and searching functionality for ease of use.
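The review features of such an app can be pictured as queries over a list of prediction records. The sketch below is an assumption about how this might look, not the application's actual code: the Prediction record and the query function, including its parameters, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    image_id: str  # identifier of the microscope image
    label: str     # class assigned by the model
    score: float   # model confidence for that class


def query(predictions, label=None, min_score=0.0, text="", descending=True):
    """Filter by label and minimum score, search by image id, sort by score."""
    matches = [
        p for p in predictions
        if (label is None or p.label == label)
        and p.score >= min_score
        and text.lower() in p.image_id.lower()
    ]
    return sorted(matches, key=lambda p: p.score, reverse=descending)
```

For example, query(records, label="crystal", min_score=0.9) would list the most confident crystal predictions first, which is the kind of view a crystallographer might want when reviewing results.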
The hybrid algorithm detects patterns that cannot be identified by standard APIs, so images that do not share characteristics with previously identified classes can also be sorted.
~100x reduction in the resources required for protein crystallography analysis, which in turn speeds up the drug discovery process.
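How the hybrid algorithm handles images unlike any known class is not detailed here. One simple and widely used way to approximate that behaviour, shown below purely as an assumption, is to route an image to a separate bucket whenever the model's top class probability falls below a threshold. The function name, the "needs_review" bucket and the 0.6 threshold are all hypothetical.

```python
def assign_label(probabilities, classes, threshold=0.6):
    """Return the top class, or "needs_review" when confidence is too low."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] < threshold:
        # No known class is a confident match; keep the image for manual review.
        return "needs_review"
    return classes[best]
```

With this rule, an image whose probabilities are spread almost evenly across classes is set aside instead of being forced into the nearest known class.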