In the dynamic world of Artificial Intelligence (AI), innovation never stops. At icaria Technology, we continuously explore new frontiers in data security, a growing concern in our hyper-connected society.
Our products, icaria TDM, icaria GDPR, and the upcoming icaria Data Governance (icaria DG), need to accurately identify the types of data stored in application databases and the purpose of that data. To that end, we have evaluated multiple pre-trained AI models available in Natural Language Processing (NLP) libraries such as spaCy, Flair, and Transformers, among others.
Our goal is to detect sensitive data in text, such as credit card numbers, email addresses, or names. Our findings indicate that these models are highly effective for this task. By applying transfer learning, that is, retraining pre-trained models rather than training new ones from scratch, we have significantly reduced the time and resources needed to identify sensitive data and improved the effectiveness of its detection.
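To make the task concrete, here is a minimal rule-based sketch in Python. It is not the pre-trained model approach described above: it uses regular expressions plus the Luhn checksum, and is only the kind of simple baseline that model output could be compared against. The patterns and the names `find_sensitive` and `luhn_valid` are assumptions made for illustration. Personal names are deliberately left out, since detecting them reliably is where a trained NLP model is needed.

```python
import re
from typing import List, Tuple

# Hypothetical patterns for illustration; real-world formats vary more widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> List[Tuple[str, str]]:
    """Return (label, matched_text) pairs for emails and card-like numbers."""
    found = [("EMAIL", m.group()) for m in EMAIL_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        # Keep only digit runs that pass the checksum, to drop order IDs etc.
        if luhn_valid(m.group()):
            found.append(("CREDIT_CARD", m.group()))
    return found


sample = "Contact ana@example.com, card 4111 1111 1111 1111, order 1234567890123."
print(find_sensitive(sample))
```

Rules like these are cheap and precise for well-structured values, but they cannot recognize free-form entities such as names, which is one reason to evaluate pre-trained models for this task.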
To take our research to the next level, we have developed an application using Streamlit, an open-source tool for building web applications, to visualize our machine learning algorithms interactively. With Streamlit, we have created graphs that make results easier to interpret and simplify comparison between different models and parameters. We have also developed a user interface for retraining AI models and visualizing their analysis results side by side.
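One common way to compare detection models is to score each model's predicted entities against a hand-labeled reference set. The sketch below computes exact-match precision, recall, and F1 in Python. The choice of metric, the `score` function, and the toy annotations are all assumptions made for illustration; they are not results or code from our application.

```python
from typing import Dict, Set, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, label)


def score(predicted: Set[Entity], gold: Set[Entity]) -> Dict[str, float]:
    """Exact-match precision, recall and F1 for one model's predictions."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical annotations and model outputs, invented for illustration only.
gold = {(0, 10, "PERSON"), (25, 40, "EMAIL"), (50, 69, "CREDIT_CARD")}
predictions = {
    "model_a": {(0, 10, "PERSON"), (25, 40, "EMAIL")},
    "model_b": {
        (0, 10, "PERSON"),
        (25, 40, "EMAIL"),
        (50, 69, "CREDIT_CARD"),
        (80, 85, "PERSON"),
    },
}

for name, predicted in predictions.items():
    print(name, score(predicted, gold))
```

A table of per-model scores like this is the kind of data that a bar chart or comparison view in a dashboard could be built on.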
Our goal is clear: to continue exploring, improving, and developing data discovery algorithms for our Data Governance and data protection tools.
The application will be extremely valuable both for our development team and for stakeholders who want to understand how sensitive data identification works.
At icaria Technology, we believe in researching and implementing solutions in the field of AI and data security. Combining pre-trained AI models, model retraining, and visualization tools like Streamlit allows us not only to develop more effective solutions but also to make them accessible and understandable to everyone.
We are pleased to share our progress and will continue to research and share our knowledge in the future. In addition, this work will soon be incorporated into the Sensitive Data Identification process.
If you would like to learn more about how we can help your organization identify and protect sensitive data, do not hesitate to contact us to see examples or discuss the many possibilities that Artificial Intelligence offers.