deepcarbon
Ranked 2nd in the final standings of the Data Challenge 2023, organised by IA Pau and TotalEnergies, as part of the BOGA team. The project was a web application for the automated analysis of lithologies. I mainly focused on building a performant model.

Topics covered:
- data labelling
- training of a YOLOv5 model to detect patterns in an image
- integration of an OCR solution
- creation of a complete workflow integrating segmentation models and OCR
details
TotalEnergies has a library of documents describing the composition of the earth's subsoil in different geographical areas.

The project involves finding, in each document, one column of a table that is generally several pages long. This column uses textures to describe each mineral resource, and the height a texture occupies in the column represents the depth of that resource. The textures are defined in a legend elsewhere in the document, which also has to be located. The table is not standardised: the target column is not always in the same place, and the legend comes in different formats.

The result of the search should be presented as a pie chart, with each slice representing the proportion of one identified mineral resource, measured by the height (depth) it occupies in the column.
A dataset of PDF documents was provided.
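
The proportion calculation behind the pie chart can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the detection step yields one `(label, y_top, y_bottom)` tuple per texture segment found in the column, with coordinates in pixels, and that pixel height is proportional to depth. The function name `lithology_proportions` and the labels `sandstone`, `shale` and `limestone` are hypothetical.

```python
from collections import defaultdict


def lithology_proportions(segments):
    """Return {label: fraction of total column height} for detected segments.

    `segments` is an iterable of (label, y_top, y_bottom) tuples in pixels.
    This input format is an assumption made for the sketch.
    """
    heights = defaultdict(float)
    for label, y_top, y_bottom in segments:
        # Sum the height of every segment carrying the same label,
        # since a resource can appear at several depths.
        heights[label] += abs(y_bottom - y_top)

    total = sum(heights.values())
    if total == 0:
        return {}
    return {label: height / total for label, height in heights.items()}


# Hypothetical detections for one column (labels and values are made up).
segments = [
    ("sandstone", 0, 200),
    ("shale", 200, 300),
    ("sandstone", 300, 400),
    ("limestone", 400, 500),
]

proportions = lithology_proportions(segments)
for label, share in proportions.items():
    print(f"{label}: {share:.0%}")
# sandstone: 60%
# shale: 20%
# limestone: 20%
```

The resulting dictionary maps directly onto a pie chart, for example with `matplotlib.pyplot.pie(list(proportions.values()), labels=list(proportions.keys()))`. For tables spanning several pages, the segments of all pages would be concatenated before calling the function.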

The aim is to use artificial intelligence technologies to identify the mineral resources and their proportions in each PDF document.