Enabling a top Temperature Control Logistics company to build a better Text Recognition system
Automating sorting and storing of customer data for a leading Temperature Control Logistics company through Optical Character Recognition
About the Customer
The customer is the largest player in the temperature control logistics industry in the US. The company has been focused on deploying machine learning techniques to solve problems in storing, moving, and preserving food, and in supporting activities.
The customer contacted Brillio to build a machine learning model for text recognition of Bill of Lading (BoL) documents. These are receipts that accompany customer consignments and vary widely in content and format. The process for capturing receipt contents was highly manual, time-consuming, error-prone, and not scalable.
For the task at hand, the analytical problem may be stated as:
Build an automated, real-time solution to extract receipt (BoL) content.
This will create an automated way of sorting and storing customer data and will save hours of manual work. The customer contacted Brillio to draw on our expertise in computer vision and deep learning to execute this text recognition task.
The Bill of Lading is one of the most important documents in the transportation process. The task is to correctly identify what is written in specific fields such as the Bill of Lading number, shipping date, ship-from address, ship-to address, and carrier name. The observational data, however, can be overwhelming for a human to analyze, so a machine learning solution would provide an automated and reliable way to perform text recognition. The problem is well suited to machine learning, and to computer vision techniques in particular. The key challenge lies in designing a text recognition solution that is efficient, consistent, accurate, and adaptable.
Solution Approach – Process
Step 1: Metrics creation – Intersection over Union (IoU)
Using the TensorFlow framework, we created a Faster R-CNN model for character recognition and set up an Intersection over Union (IoU) metric to evaluate it. IoU is calculated by dividing the area of overlap between the predicted and ground-truth regions by the area of their union. Average Precision over IoU thresholds [0.50:0.95] was 0.609 (> 0.5 is considered a good prediction).
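The overlap-over-union ratio described above (commonly called Intersection over Union) can be sketched in a few lines. This is a minimal illustration, not the project's evaluation code: the function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Overlap-over-union of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples;
    this layout is an assumption for illustration.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
ground_truth = (5, 0, 15, 10)
score = iou(predicted, ground_truth)  # overlap 50, union 150
```

Average Precision over [0.50:0.95] would then be computed by repeating the evaluation at a range of IoU thresholds and averaging, which is left out of this sketch.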
Step 2: Image Pre-processing
We started with binarization to convert the image to black and white, followed by color inversion, which swaps the foreground and background colors. We then adjusted the contrast to make the image clearer and de-skewed the image to correct any detected rotation.
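To make the first three operations concrete, here is a minimal pure-Python sketch that treats a grayscale image as a nested list of 0 to 255 values. The function names, the fixed threshold of 128, and the min-max form of contrast adjustment are all assumptions for illustration; a production pipeline would more likely use an image library, and de-skewing is omitted because it needs rotation support that plain lists do not offer.

```python
def binarize(gray, threshold=128):
    """Map each pixel to pure black (0) or pure white (255).

    The fixed threshold is an assumption; adaptive thresholds are
    common for scanned documents.
    """
    return [[255 if p >= threshold else 0 for p in row] for row in gray]


def invert(image):
    """Swap dark and light pixels."""
    return [[255 - p for p in row] for row in image]


def stretch_contrast(gray):
    """Linearly rescale pixel values so they span the full 0..255 range."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # Flat image: nothing to stretch.
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]


# A tiny 2x2 "scan" used only to exercise the functions.
gray = [[50, 100], [150, 200]]
black_and_white = binarize(gray)
inverted = invert(black_and_white)
stretched = stretch_contrast(gray)
```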
Step 3: Character recognition
We performed text line extraction using the pytesseract and OCRopus libraries.
Step 4: Text Post-processing
After rule-based extraction, we validated the formatting of dates and used fuzzy string matching to identify company names.
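The two post-processing checks might look roughly like the sketch below, which uses only the Python standard library. The accepted date formats, the carrier list, the 0.8 cutoff, and the function names are hypothetical. Note that `difflib` scores similarity with a Ratcliff/Obershelp-style ratio, which is a stand-in here and not necessarily the matching method used in the project.

```python
import difflib
from datetime import datetime

# Assumed set of date layouts seen on documents; purely illustrative.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%m-%d-%Y")

# Hypothetical reference list of known carrier names.
KNOWN_CARRIERS = [
    "Acme Freight Lines",
    "Northwind Transport",
    "Polar Express Logistics",
]


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def match_company(raw, known=KNOWN_CARRIERS, cutoff=0.8):
    """Return the closest known company name, or None below the cutoff."""
    lookup = {name.lower(): name for name in known}
    hits = difflib.get_close_matches(
        raw.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[hits[0]] if hits else None


shipping_date = normalize_date("03/15/2021")
carrier = match_company("Acme Frieght Lines")  # OCR-style misspelling
```

Returning `None` for unparseable dates or unmatched names lets a downstream step flag the field for manual review instead of silently storing a bad value.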
Finally, we get the output as text sorted by the required fields, and we label each result Green, Amber, or Red (RAG) based on accuracy. We achieved an overall accuracy of 86%. The RAG score is determined using a fuzzy matching algorithm such as Levenshtein distance: a match of 80% or more is Green, 50% to 80% is Amber, and less than 50% is Red.
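As a rough illustration of the scoring described above, the following sketch computes an edit distance between an extracted value and a reference value, converts it to a percentage match, and applies the stated thresholds. Normalizing by the longer string's length is an assumption, since the text does not say how the distance is turned into a percentage, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def match_percent(extracted, reference):
    """Similarity as a percentage (assumed normalization by longer length)."""
    longest = max(len(extracted), len(reference))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(extracted, reference) / longest)


def rag_label(percent):
    """Apply the thresholds: 80+ Green, 50 to 80 Amber, below 50 Red."""
    if percent >= 80:
        return "Green"
    if percent >= 50:
        return "Amber"
    return "Red"


# One wrong character out of eight gives an 87.5% match.
label = rag_label(match_percent("BOL12845", "BOL12345"))
```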
Business Impact and Benefits
Significantly reduced the effort (~15x) required to extract receipt contents