Olfeo URL DB

Accurately and precisely categorize URLs and domains

Contact us to test the solution >
for software publishers
Categorizing a large volume of URLs is a complex challenge.
Heterogeneity and ambiguity of content
Websites can cover a wide range of topics, often with content that overlaps between different categories. Finding a way to classify these URLs so that the categories are both accurate and useful to the end user can be difficult.
Rapid evolution of the internet
Websites are constantly evolving, with new content being added, sites disappearing, and changes in the relevance of topics. Maintaining an up-to-date and properly categorized URL database in the face of this dynamic is a demanding task.
Massive volume of data
With billions of active websites, the sheer volume of data to be processed can be overwhelming. This requires automated solutions, such as machine learning, which must be trained, tested, and constantly refined.
Taking user experience into account
It is crucial that categorization systems are designed with the end user in mind, which means they must be intuitive, easy to navigate, and relevant to users' search needs.
OUR SOLUTION

Olfeo OEM offers the most reliable URL database on the market

Olfeo offers its white-label URL and domain database to software publishers who want to enhance their product's functionality.
Thanks to its reliability, sophisticated categorization, and comprehensiveness covering 99%+ of queries, Olfeo OEM brings greater precision, context, and value to the data collected by its customers.

Contact us for a free trial
The URL database for your software solution
100+ categories, divided into 9 themes
25 million classified domains, corresponding to hundreds of millions of URLs
99% query recognition rate
OUR STRENGTHS

The most reliable classification on the market thanks to our database of categorized domains and URLs

Capable of filtering hundreds of millions of URLs, Olfeo's URL database can effectively cover users' browsing profiles. With a unique recognition rate of over 99%, the vast majority of sites visited are recognized and correctly categorized thanks to our approach combining AI-based automatic pre-classification and validation by a human operator.

This unique approach effectively enriches the services provided by your software solutions.

All content categorized in our URL database undergoes a two-step analysis: automatic and human. The systematic manual analysis of each piece of classified content, carried out by an Olfeo expert, is a guarantee of quality. Our classification reliability rate is over 99%. 

 

Systematic human verification of each piece of content is facilitated by high-quality pre-classification tools. Powerful artificial intelligence algorithms analyze web pages, drawing on keyword databases enriched by our linguists, to produce high-quality pre-classification results.

However, automation alone is not enough: only human intervention, guided by a proven methodology shared across the team, can deliver a browsing recognition rate of over 99% and a false positive rate close to 0.

Integrating Olfeo databases does not require the installation of any third-party software. You retain control over data exchange with your customers. You are completely autonomous in the use of our Olfeo databases, and their integration is extremely simple.

The 2.5 GB database can be easily integrated into any physical or virtual appliance. Delivered as an LMDB file, the database is updated securely every week so that the data stays current.
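As a rough illustration of how an integrator might query a key-value file of this kind, the sketch below resolves a URL to a category by trying the full host name first and then its parent domains. It is only a sketch under stated assumptions: the key and value layout (host name as key, category label as value), the sample entries, and the function names `candidate_hosts` and `lookup_category` are all hypothetical and are not taken from the Olfeo format. A plain dictionary stands in for the database; with an LMDB binding, the `db.get(...)` call would typically be replaced by a `get` inside a read-only transaction.

```python
from typing import List, Mapping, Optional
from urllib.parse import urlsplit

# Hypothetical stand-in for the key-value database: keys are host names and
# values are category labels, both as bytes. The real layout of the Olfeo
# file is not described in this document, so this layout is an assumption.
SAMPLE_DB: Mapping[bytes, bytes] = {
    b"example.com": b"news",
    b"shop.example.com": b"e-commerce",
}


def candidate_hosts(url: str) -> List[str]:
    """Return the host and its parent domains, most specific first.

    The URL must include a scheme (e.g. "https://"), otherwise urlsplit
    does not report a host name and the result is an empty list.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    # Stop before the last label so a bare top-level domain is never tried.
    return [".".join(labels[i:]) for i in range(max(len(labels) - 1, 0))]


def lookup_category(url: str, db: Mapping[bytes, bytes]) -> Optional[str]:
    """Return the category of the most specific matching host, or None."""
    for host in candidate_hosts(url):
        value = db.get(host.encode("utf-8"))
        if value is not None:
            return value.decode("utf-8")
    return None
```

Trying the most specific host first lets a subdomain carry its own category while unknown subdomains still inherit the category of the parent domain.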

To ensure the quality of the data in the database, a continuous process is in place:

1. The database is continuously enriched by feedback from users of Olfeo products, by research into new domains covering targeted topics, and by monitoring and making use of domain lists available from open sources.

2. A new artificial intelligence algorithm, integrated into our automatic pre-processing chain in collaboration with the DGA (Direction générale de l'Armement), continuously improves accuracy during this initial domain classification stage.

3. Human analysis by our teams of experts allows for confirmation of the automatic analysis or correction where necessary.

4. Finally, continuous improvement allows us to monitor possible changes in content and update categories. New categories are also created to adapt to changes in internet usage. For example, the Generative AI category was recently deployed to meet the demand of our customers who needed to track the consumption of services related to the use of generative AI (ChatGPT, etc.).
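The two-step principle behind this process (an automatic proposal, then human confirmation or correction) can be sketched as follows. This is a toy model only: the keyword lists, the category names, and the names `Entry`, `pre_classify`, and `human_review` are hypothetical, and simple keyword counting stands in for the AI-based pre-classification, whose actual design is not described here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Toy keyword lists standing in for the real pre-classification stage
# (assumption: the actual algorithm and keyword databases are not public).
KEYWORDS: Dict[str, Set[str]] = {
    "generative-ai": {"chatbot", "llm", "prompt"},
    "news": {"headline", "breaking", "journal"},
}


@dataclass
class Entry:
    domain: str
    proposed: Optional[str] = None   # set by the automatic step
    validated: Optional[str] = None  # set only by a human reviewer


def pre_classify(domain: str, page_text: str) -> Entry:
    """Automatic step: propose the category with the most keyword hits."""
    words = set(page_text.lower().split())
    scores = {cat: len(words & kws) for cat, kws in KEYWORDS.items()}
    best = max(scores, key=lambda cat: scores[cat])
    # No keyword hit at all means no proposal; the reviewer decides alone.
    return Entry(domain, proposed=best if scores[best] > 0 else None)


def human_review(entry: Entry, correction: Optional[str] = None) -> Entry:
    """Human step: confirm the proposal, or replace it with a correction."""
    entry.validated = correction if correction is not None else entry.proposed
    return entry
```

Keeping `proposed` and `validated` as separate fields makes it explicit that nothing counts as classified until a reviewer has confirmed or corrected it.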

USE CASES

Olfeo OEM covers a wide range of use cases.

The Olfeo URL database helps improve cybersecurity solutions, particularly in terms of filtering.

With a recognition rate of over 99% and false positives close to 0, cybersecurity solution providers can implement filtering that ensures both a very high level of security against malicious sites or sites containing illegal content, and fine-grained access management based on groups or categories.
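A filtering product built on categorized URLs might express such a policy roughly as below. This is a minimal sketch, not a description of any vendor's implementation: the category names, the group names, and the `is_allowed` function are hypothetical, and denying uncategorized sites is just one possible policy choice.

```python
from typing import Dict, Optional, Set

# Hypothetical category and group names, for illustration only.
BLOCKED_FOR_ALL: Set[str] = {"malware", "illegal-content"}
ALLOWED_BY_GROUP: Dict[str, Set[str]] = {
    "marketing": {"news", "social-networks"},
    "engineering": {"news", "generative-ai"},
}


def is_allowed(group: str, category: Optional[str]) -> bool:
    """Decide whether a user group may access a site in the given category."""
    if category is None:
        # Uncategorized site: denied here; a product could also allow or log.
        return False
    if category in BLOCKED_FOR_ALL:
        # Security categories are blocked regardless of the group.
        return False
    return category in ALLOWED_BY_GROUP.get(group, set())
```

Checking the globally blocked categories before the per-group rules ensures that a group-level allowance can never override a security block.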

The Olfeo OEM database is regularly updated with numerous counterfeit websites in order to assist brands in monitoring the emergence of these threats.

Our identification of counterfeit websites, based on more than 20 years of classification experience, enables us to provide brands with an initial level of detection and classification of sites presumed to be counterfeit.

Our internal tools, which use a machine learning algorithm, make a difference in pre-identifying websites that could be considered fraudulent. But technology alone is not enough to provide a sufficiently detailed response. Olfeo's team of experienced analysts systematically verifies and confirms all sites before they are finally classified in the category in question.

This high-quality expertise is the basis for Olfeo's counterfeit detection service, which combines human analysis capabilities with our powerful internal tools.

Data from investigations and/or open sources is often very voluminous. It is therefore difficult to analyze, especially since the work must usually be done within a limited time frame, so relevant information has to be found very quickly.

Olfeo OEM enriches data and provides context to facilitate investigative actions with an excellent level of reliability.