Manually processing massive data is like digging with a pick instead of an excavator

Identrics is a young technology company providing custom semantic solutions for companies in Southeast Europe (SEE), including media groups and business intelligence providers.

Iva Marinova, Data Scientist at Identrics

Why is Identrics focusing on SEE and what are the main challenges related to the adoption of Artificial Intelligence in the region?

With the growing volumes of information created every day, effective information management is a key business requirement. The problems that we solve are connected with automating the mundane, manual data processing tasks done in practically every business.

Processing data on a massive scale without automation is like harvesting wheat without a combine harvester. A human-only workforce is an expensive, fixed and finite resource unable to swiftly scale up to meet sudden demand or scale down at idle times. And this is where Identrics comes in. By applying smart technologies, we add value to businesses, letting them harness human expertise and grow their effectiveness.

Southeast Europe (SEE) is where Identrics is headquartered, and this naturally gives us know-how and experience of local and regional idiosyncrasies that few can match. Then there are the statistics, showing growing interest in Artificial Intelligence, or AI, in Bulgaria, Croatia, Romania, Serbia and Greece. A simple check in Google Trends on search terms and topics like “machine learning” or “deep learning” shows an exponential increase in interest over the last five years. Moreover, countries like Bulgaria have long embraced machine learning when it comes to academic and scientific developments. In other words, there is a lot of potential and many opportunities to help enterprises in SEE gain a competitive advantage over mature markets like those in Western Europe.

The challenges in the region are mostly connected with the richness and diversity of local languages. And not only that. Despite the huge potential of AI, a large part of the business community still relies on conventional work methods and technologies. So, as an innovative solution provider, we continuously put a lot of effort into educating the market and explaining the practical gains for businesses. On the bright side, as we work on more and more projects, we have plenty of reason to believe the current challenges will translate into future products.

What projects is Identrics currently working on?

We are working on a number of tailor-made projects. For example, a leading provider of business news and market intelligence for SEE has asked us to align its long-stored archives of 400,000 articles and 4,000 company profiles. The client wanted to link the news stories with the profiles so that the reader could make immediate reference to the company owners, sales, executives, locations, etc.

For a publishing and financial information company, we have developed software for text content analysis. One such service, for example, is the automatic identification in news articles of entities of special interest to the client, such as individuals, companies and products.

We have also developed a product for another client with a global presence, which provides a common working environment in which humans and machines collaborate on media monitoring. This environment solves the scalability and coherence problems in companies that use manual annotation processes to analyse huge amounts of data.

Another project we are working on is related to article deduplication and similarity measurement for big news groups. They are now able to analyse popularity trends and suggest the most relevant content on a particular topic to their users.
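
To make the idea of similarity measurement concrete, here is a minimal sketch that flags near-duplicate articles by comparing overlapping three-word sequences ("shingles") with Jaccard similarity. This is a common textbook technique chosen for illustration, not a description of the method Identrics uses; the function names, the shingle size and the 0.5 threshold are all assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased, punctuation dropped)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Texts shorter than k words become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(articles, threshold=0.5, k=3):
    """Return index pairs (i, j) whose shingle sets have similarity >= threshold."""
    sets = [shingles(text, k) for text in articles]
    pairs = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical sample headlines, invented for this example.
articles = [
    "The central bank raised interest rates on Monday",
    "The central bank raised interest rates on Tuesday",
    "A new restaurant opened in Bucharest",
]
print(near_duplicates(articles))
```

Comparing every pair of articles is quadratic in the number of articles, so an archive of hundreds of thousands of stories would normally need an approximate method such as MinHash rather than this brute-force loop.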

There are also some new inquiries from the e-commerce sector in Europe for our automated content generation solutions, which we are excited about.

What are some other more specific solutions you offer and what are their business implications?

A common and quite interesting example is our tool for sentiment analysis, which can find applications in absolutely any industry. Sentiment analysis helps discover how people feel about a particular topic. Say you want to know if people on Facebook think that restaurant X in the Romanian capital Bucharest is good or bad. Facebook sentiment analysis will answer the question. You can even learn why people think the restaurant is good or bad, by extracting the exact words that indicate why people did or didn’t like it.

Through such text mining solutions, you can get answers in seconds in the comfort of your office chair. Analyses of this kind are pretty accurate and can easily reach above 90% precision compared with human annotations. Generally, our production-ready models aim to provide between 90% and 99% precision, which can be improved over time.
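
As a rough illustration of what "precision compared with human annotations" means, the sketch below computes precision for one label: of all the items the model marked as positive, the share that a human annotator also marked as positive. The function name, the label strings and the sample data are hypothetical and are not taken from Identrics' tooling.

```python
def precision(predicted, gold, positive="positive"):
    """Fraction of items predicted as `positive` that the human also labelled `positive`."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    true_pos = sum(1 for p, g in zip(predicted, gold) if p == positive and g == positive)
    false_pos = sum(1 for p, g in zip(predicted, gold) if p == positive and g != positive)
    if true_pos + false_pos == 0:
        # The model never predicted the positive label, so precision is undefined;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return true_pos / (true_pos + false_pos)


# Invented labels for five comments: model output versus human annotation.
model_labels = ["positive", "positive", "negative", "positive", "negative"]
human_labels = ["positive", "negative", "negative", "positive", "positive"]
print(round(precision(model_labels, human_labels), 2))
```

Precision only measures how trustworthy the positive predictions are; it says nothing about positive comments the model missed, which is what recall measures.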

In recent years, fake news has taken centre stage. How can semantic technologies and deep learning help in the fight against fake news?

The issue with classifying news as “fake” is similar to sentiment analysis. Fake news has been around since the dawn of communication, but with some recent global and local political developments, online platforms are giving it the power to misinform on a larger scale, affecting the credibility of information providers, brands, companies and people.

The task is a global challenge at this time and scientists worldwide are working hard on choosing the best algorithm for the solution. Identrics too has joined the efforts to tackle fake news by working on algorithms capable of deciding whether or not specific news, comments, tweets and even profiles are fake.

In what way are you expecting Identrics to develop in the next couple of years?

We have always set ambitious goals for ourselves. In the short term, our aim is to raise awareness of our machine learning capabilities in order to help more companies from different industries solve their pressing problems with productivity, cost and business growth. Our work is part of the global trend to facilitate information extraction and use the power of big data. 2018 will validate this trend with best practices and routines in the business, which will lead to a more favourable business environment for companies like Identrics in the region.