Kaitos GmbH

Tire data recognition using AI - Part 1

The recognition of tire data such as

Brand and tread pattern

Tire dimension

Load and speed index

DOT


from simple photos is considered one of the most challenging application fields of modern text recognition. The reasons for this are the low contrast between lettering and background, dirt and wear on the lettering, and the strong variability of font layout, size, and orientation within a tire image. In addition, external factors such as image blur, different viewing angles, and lighting can make text recognition difficult.

Reliable, fully automated tire recognition can enable completely new applications - for example in automated tire inspection.


Status Quo

Existing software systems usually rely on semi-automatic solutions: a mix of conventional OCR (Optical Character Recognition), historical data, and manual supplementation of the data or manual pre-selection of a tire area containing the required information. Often a very high-resolution image - created, for example, by laser scanning of the tire - is required to achieve the necessary reliability. This is both costly and time-consuming: the technology is expensive, and laser scanning usually takes several seconds.


This motivated us to develop KTR - Kaitos Tire Recognition - a new deep learning architecture that reads out tire data with high accuracy and speed from mobile phone photos of the complete tire sidewall, and that can thus serve as a building block for complete automation of tire data recognition.


KTR

In a pre-processing step, the tire is localised in the image via a U-Net, and the image section is flattened - with high stability and performance - by a transformation from Cartesian to cylindrical coordinates.
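The flattening step can be sketched roughly as follows - a minimal nearest-neighbour version in NumPy, where the centre and radii (which in the real pipeline would come from the U-Net segmentation) and all names are illustrative:

```python
import numpy as np

def unroll_tire(img, cx, cy, r_inner, r_outer, n_theta=720, n_r=64):
    """Unroll the annular tire sidewall into a flat strip.

    (cx, cy) and the radii would come from the U-Net segmentation
    in the real pipeline; here they are plain parameters.
    """
    thetas = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    radii = np.linspace(r_inner, r_outer, n_r)
    # Sample the image along concentric circles (nearest neighbour).
    xs = (cx + np.cos(thetas)[:, None] * radii[None, :]).round().astype(int)
    ys = (cy + np.sin(thetas)[:, None] * radii[None, :]).round().astype(int)
    xs = np.clip(xs, 0, img.shape[1] - 1)
    ys = np.clip(ys, 0, img.shape[0] - 1)
    return img[ys, xs]  # shape (n_theta, n_r): angle x radius
```

Each row of the output corresponds to one angle around the tire, so the curved lettering becomes a straight line of text that a standard OCR head can process.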

The flattened image of the tire is then transferred to our OCR architecture.

At its core, it consists of a heavily modified CRNN plus a pre-trained feature extractor based on EfficientNetV2. This allows all data types to be read out completely within a single model pass. The result of the model pass is then decoded by a beam search decoder we developed ourselves, so that the KTR architecture can provide the individual data types in structured form.

Image source: CRNN research paper
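To make the decoding step concrete, here is a minimal CTC prefix beam search in plain Python/NumPy - a generic textbook formulation, not the Kaitos decoder itself, which additionally produces structured per-data-type output:

```python
import numpy as np
from collections import defaultdict

def ctc_beam_search(probs, alphabet, beam_width=5, blank=0):
    """Decode a (T, C) matrix of per-timestep character probabilities.

    Each beam keeps two scores: probability of the prefix ending in a
    blank (p_b) and in a non-blank (p_nb), so repeated characters are
    merged correctly under the CTC collapsing rule.
    """
    beams = {(): (1.0, 0.0)}
    for t in range(probs.shape[0]):
        nxt = defaultdict(lambda: (0.0, 0.0))
        for prefix, (p_b, p_nb) in beams.items():
            # Case 1: emit blank -> prefix unchanged, now ends in blank.
            b, nb = nxt[prefix]
            nxt[prefix] = (b + (p_b + p_nb) * probs[t, blank], nb)
            for c in range(probs.shape[1]):
                if c == blank:
                    continue
                p = probs[t, c]
                if prefix and prefix[-1] == c:
                    # Case 2: repeat without an intervening blank
                    # collapses onto the same prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, nb + p_nb * p)
                    # Case 3: repeat after a blank is a new character.
                    ext = prefix + (c,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, nb + p_b * p)
                else:
                    ext = prefix + (c,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, nb + (p_b + p_nb) * p)
        beams = dict(sorted(nxt.items(), key=lambda kv: sum(kv[1]),
                            reverse=True)[:beam_width])
    return [("".join(alphabet[i] for i in pfx), sum(scores))
            for pfx, scores in sorted(beams.items(),
                                      key=lambda kv: sum(kv[1]),
                                      reverse=True)]
```

Because the decoder keeps the whole beam, it naturally yields a ranked list of candidate readings rather than a single string - the basis for top-5 outputs like those KTR reports further below.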

As loss, we use focal CTC loss with entropy regularisation. To reduce the amount of training data required and to increase readout quality, the KTR architecture was additionally supplemented with an internally developed attention mechanism per data type.
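The focal part of the loss can be sketched as follows - a generic focal weighting of the CTC loss in which the sequence probability p = exp(-CTC) damps the loss of already-confident samples; the entropy regularisation term is omitted here, and gamma is an illustrative choice:

```python
import math

def focal_ctc_loss(ctc_loss, gamma=2.0):
    """Focal re-weighting of a per-sample CTC loss value.

    p = exp(-ctc_loss) is the model's probability for the ground-truth
    sequence, so confident ("easy") samples get their loss scaled
    down by (1 - p)**gamma. gamma is an illustrative hyperparameter.
    """
    p = math.exp(-ctc_loss)
    return (1.0 - p) ** gamma * ctc_loss
```

The effect is that an easy sample (small CTC loss, p close to 1) contributes almost nothing, while a hard sample (large CTC loss, p close to 0) keeps nearly its full loss - the training signal concentrates on the difficult tire images.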


As described above, KTR outputs the tire data in structured form for each data type - e.g. for the DOT - including the top five predictions. KTR supplements each output with an estimate of the probability that it is correct. For training we used Nvidia GeForce RTX 3090 GPUs; the training set comprises 10,000 tire images. KTR is deployable as a RESTful service via Docker container and runs on a standard CPU in under 2 seconds per inference, making it well suited for productive use in an industrial environment.
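A structured per-data-type result with top-k candidates and confidence estimates could look something like this - all field names and values are invented for illustration and are not the actual KTR API:

```python
# Hypothetical shape of a structured KTR result for one data type;
# every field name and value here is invented for illustration.
ktr_result = {
    "data_type": "DOT",
    "predictions": [  # ranked candidates with confidence estimates
        {"text": "DOT 4B9R J1HR 2521", "confidence": 0.91},
        {"text": "DOT 4B9R J1HR 2527", "confidence": 0.04},
        {"text": "DOT 4B9R J1HB 2521", "confidence": 0.02},
    ],
}

# A downstream system can act on the most confident reading.
best = max(ktr_result["predictions"], key=lambda p: p["confidence"])
```

Keeping the full ranked list rather than only the top hit is what allows the top-5 evaluation reported further below.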


Starting point - the DOT

The DOT code on the tire sidewall encodes, among other things, the manufacturing plant and the production week and year of the tire. Since the DOT is usually very small, has low contrast against the tire, and is often worn or slightly dirty, it is an excellent starting point for developing and checking the performance of the KTR architecture.


Google vs Amazon vs Microsoft and Kaitos

To illustrate the difficulty of reading tire data, we first tested AI-based OCRs from Google - Google Cloud Vision, Amazon - AWS Rekognition and Microsoft - Azure Machine Vision - which are among the most powerful general purpose OCRs on the market - on a validation set consisting of 100 tire images as shown below.

For this purpose, we searched the complete text returned by the respective API for the correct DOT. In other words, in favour of the general purpose OCRs we did not use regex matching - i.e. a targeted search for the DOT along specific text layouts. Such matching is difficult to implement, because the DOT appears in many different representations: varying numbers of characters, combinations of letters and digits, varying spacing, and non-uniform spelling with or without a preceding "DOT". It would also have further reduced the number of correct results for the cloud providers.
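To illustrate why we skipped regex matching: a pattern like the following covers some common DOT layouts but quickly breaks on others. The regex and the sample strings are invented for this sketch:

```python
import re

# Illustrative only: real DOT codes vary in group lengths, spacing and
# the optional "DOT" prefix, which is why a single pattern under-matches.
DOT_RE = re.compile(
    r"(?:DOT\s*)?"              # optional "DOT" prefix
    r"([A-Z0-9]{2})\s*"         # plant code
    r"([A-Z0-9]{2})\s*"         # size code
    r"(?:([A-Z0-9]{3,4})\s*)?"  # optional brand/manufacturer code
    r"(\d{4})"                  # production week + year
)

hit = DOT_RE.search("DOT 4B9R J1HR 2521")        # matches
compact = DOT_RE.search("4B9RJ1HR2521")          # matches without spaces
broken = DOT_RE.search("DOT 4B9R J1HR 25 21")    # a single OCR-induced
# space inside the date already defeats the pattern -> no match
```

Every additional layout variant needs another branch in the pattern, and any OCR noise in the text breaks the match entirely - which is why we compared against the raw returned text instead.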


The OCRs on the data set produced the following results:

OCR                    Correct   Nearly correct   Wrong
Cloud OCR A            7%        9%               84%
Cloud OCR B            16%       3%               81%
Google Cloud Vision    17%       7%               76%
Kaitos KTR             83%       10%              7%

(Cloud OCR A and B: AWS Rekognition and Azure Machine Vision.)

On the part of Google - whose OCR performed best among the major cloud providers - the DOT could be read correctly in 17% of the cases. In a further 7% of cases, the DOT could be read, but the segmentation - i.e. the recognition of the different DOT elements as belonging together - was incorrect. In 76% of the cases, the DOT could either not be recognised at all or could only be partially read out correctly. This is very clear evidence of the difficulty of this use case of tire data recognition.


We then tested the performance of our KTR architecture on the same data set. To do this, we passed the images directly to the KTR architecture and compared the complete DOT returned with the ground truth of the image. KTR read the complete DOT correctly from the tire in a very convincing 83% of cases; in 93% of cases, the correct prediction was among the top 5 results.


Conclusion

Tire data recognition is a very challenging field due to factors such as low contrast between lettering and background or dirty and scratched lettering. This is borne out by the results of leading OCR providers such as Google, Amazon and Microsoft, which achieved a maximum recognition rate of 17% for the complete DOT on the data set used here. The deep learning application KTR, developed specifically for tire data recognition, overcomes these problems far better: with a recognition rate of 83%, it delivers extremely convincing results - almost 5 times better than the general purpose OCRs. In addition, the KTR architecture is very lightweight: it runs on a standard CPU in under 2 seconds per inference, making it ideal for productive use in industrial environments.


Of course, this is not the end of our development of KTR. We are currently training our architecture for productive use and are aiming at recognition rates beyond 98%. If you would like to be kept up to date on the further development of KTR, would like further information, or are generally interested in consulting or development services in the field of AI, please do not hesitate to contact us.

 
