Artificial intelligence in digitization and document management

Cadena used artificial intelligence techniques to digitize 77,153 documents for the National Federation of Coffee Growers. The approach made the process considerably more efficient in its use of time and of physical and technological resources, while keeping it accurate and meeting the customer's expectations.

The “Support for the renewal of coffee plantations” program is one of the most important run by the National Federation of Coffee Growers. It gives the country's coffee growers access to public resources for special treatment or renovation work on their plantations. Coffee trees become less productive as they age, so keeping plantations young is essential to guarantee good productivity and quality. The program was created in 1998 and, after a pause between 2012 and 2015, resumed in 2016. Since then, the Federation has needed to digitize the beneficiaries' documents every year.

“Because it is a program that works with public resources, we must answer to different entities, which is why it is key for us to have all the documents of the people who participate in the program digitized. That way we avoid losses and can store them more easily,” explains Sandra Milena Mojica Sanabria, of the Federation's Technical Management.

Cadena also developed a web platform for storing, consulting and viewing digitized documents (captured images and data).

In the 2018 edition, 77,153 coffee growers from 16 cities of the country participated. To digitize the supporting documents, which totaled 555,000 pages, the Federation contracted Cadena, which carried out three specific information management activities: image capture; data capture (registration format, verification format and farm visit format); and loading the results into an online platform that interested parties can consult.

How the process worked

Because so many pages had to be digitized, along with data capture and structured storage of the files, Cadena decided to automate part of the process with artificial intelligence techniques that could identify the different types of formats and classify them automatically. That is, after the images of the participating coffee growers' documents were captured, the system determined what type of format each one was and also identified the fields from which the data required by the Federation should be extracted.
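The classify-then-extract flow described above can be sketched roughly as follows. This is only an illustration and not Cadena's implementation: it assumes the page text has already been obtained by some OCR step, it stands in a simple keyword match for the trained classifier, and the names `FORMAT_KEYWORDS`, `FIELD_PATTERNS`, `classify_format`, `extract_fields` and `process_page`, as well as the field labels, are hypothetical.

```python
import re

# Hypothetical keywords per format type. A real system would use a trained
# classifier; keyword matching is only a stand-in for illustration.
FORMAT_KEYWORDS = {
    "registration": {"registration", "applicant"},
    "verification": {"verification", "verified"},
    "farm_visit": {"visit", "farm"},
}

# Hypothetical fields to extract for each format type.
FIELD_PATTERNS = {
    "registration": {"grower_id": r"ID:\s*(\d+)", "name": r"Name:\s*(.+)"},
    "verification": {"grower_id": r"ID:\s*(\d+)", "verified_by": r"Verified by:\s*(.+)"},
    "farm_visit": {"grower_id": r"ID:\s*(\d+)", "visit_date": r"Visit date:\s*(\d{4}-\d{2}-\d{2})"},
}


def classify_format(text):
    """Return the format type whose keywords best match the text, or None."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {fmt: len(words & kws) for fmt, kws in FORMAT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_fields(text, fmt):
    """Extract the fields defined for this format; missing fields map to None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS[fmt].items():
        match = re.search(pattern, text)
        fields[name] = match.group(1).strip() if match else None
    return fields


def process_page(text):
    """Classify one page of (already OCR'd) text, then extract its fields."""
    fmt = classify_format(text)
    if fmt is None:
        return {"format": None, "fields": {}}
    return {"format": fmt, "fields": extract_fields(text, fmt)}


if __name__ == "__main__":
    page = "FARM VISIT REPORT\nID: 12345\nVisit date: 2018-06-01\n"
    print(process_page(page))
```

Pages that match no known format come back with `"format": None`, which is the kind of case that could be routed to a person for manual review.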

Two artificial intelligence techniques were used: computer vision and natural language processing. In other words, these allow the computer to identify and interpret what a given document contains. Two types of training were also used for the system: supervised, in which part of the human team taught the system and audited the captured data to confirm that it really corresponded to what was recorded; and unsupervised, in which the system learned autonomously.

Additionally, as Juan Camilo Sánchez, one of those responsible for executing the project, explains, “we had to identify each of the folders, know the name of each one and the documents that we had to store in them. We did this with artificial intelligence. At Cadena we set out to complete this project in the shortest possible time and to make much more efficient use of technological resources.”

For Juan Camilo, the most challenging part of digitizing and managing these documents was that the coffee growers' data was handwritten, and the handwriting varied. “It was not easy to get the computer to properly identify and interpret the different letters and characters, but we did it and in the end, the result was satisfactory both for us and for the client,” he says. He emphasizes that applying these artificial intelligence techniques to document management is recommended because it makes the process much more practical and efficient in terms of time, resources, personnel and even physical space.


