Our open source crawler

15 July 2022

DNS Belgium has made Mercator, the crawler that checks the data of domain names in order to combat abuse, open source. This means that everyone can now use the code and the crawler for their own purposes. We hope that this will help the domain name sector.

The crawler developed by DNS Belgium looks at recently registered domain names and collects information that is publicly available for each of those domain names. This includes:

  • DNS records, data used to translate your domain into an IP address.
  • Location: where is the web page hosted, where are the name servers and SMTP servers located?
  • Web content such as HTML code, VAT number, a screenshot of the homepage, web technologies used.
  • Information about the outgoing mail server.
  • TLS configuration for a secure connection to the Internet.
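
To make the list above more concrete, here is a minimal sketch in Python of how a few of these public data points could be gathered and stored per domain. It is not Mercator's actual code: the names DomainSnapshot, resolve_addresses, probe_tls and crawl are hypothetical, and only the standard library is used, so mail server lookups, geolocation and screenshots are left as empty fields.

```python
import json
import socket
import ssl
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DomainSnapshot:
    """Hypothetical record of the public data gathered for one domain."""
    domain: str
    ip_addresses: List[str] = field(default_factory=list)  # from DNS records
    hosting_country: Optional[str] = None                   # location (not filled in here)
    mail_servers: List[str] = field(default_factory=list)   # mail servers (not filled in here)
    tls_version: Optional[str] = None                       # e.g. "TLSv1.3"
    tls_cipher: Optional[str] = None                        # negotiated cipher suite name

    def to_json(self) -> str:
        # Serialise with sorted keys so records are easy to compare.
        return json.dumps(asdict(self), sort_keys=True)


def resolve_addresses(domain: str) -> List[str]:
    """Return the IP addresses the system resolver gives for the domain."""
    infos = socket.getaddrinfo(domain, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def probe_tls(domain: str, timeout: float = 5.0) -> Tuple[Optional[str], Optional[str]]:
    """Open a TLS connection on port 443 and report protocol version and cipher."""
    context = ssl.create_default_context()
    with socket.create_connection((domain, 443), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=domain) as tls:
            cipher = tls.cipher()
            return tls.version(), (cipher[0] if cipher else None)


def crawl(domain: str) -> DomainSnapshot:
    """Collect the DNS and TLS parts of a snapshot (requires network access)."""
    snapshot = DomainSnapshot(domain=domain)
    snapshot.ip_addresses = resolve_addresses(domain)
    snapshot.tls_version, snapshot.tls_cipher = probe_tls(domain)
    return snapshot
```

Calling crawl("example.be").to_json() would return one JSON record for that domain. A real crawler would add error handling, rate limiting and the remaining data types.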

The information we collect here is already publicly available. We are therefore not breaking any privacy laws.

DNS Belgium checks all these data to make the .be zone as safe as possible. This monitoring enables us to quickly detect the misuse of domain names for malicious purposes.

The developers of DNS Belgium developed this crawler themselves. ‘We embarked on the project three years ago. Initially, we had limited resources to devote to it. But we wanted to do it anyway, because we thought it was important to collect data as quickly as possible,’ says Quentin Loos, who co-developed the crawler. ‘Our first objective was to detect fake web shops. For this we could rely mainly on the content of the website.’

Helping the sector move forward

The crawler that DNS Belgium developed is now open source. This means that anyone can access the code of the crawler and use it for free. ‘There are three reasons for doing this,’ says Quentin. ‘First, we want to help the sector. Small registries usually don't have the resources to develop crawlers, even though they could make good use of them.’

‘Furthermore, it is our mission to be a centre of excellence and to help society. We can do that by playing an exemplary role in terms of innovation. There are other registries that have also developed their own crawler. By making ours available for free, we hope to encourage them to do so as well. This would help us to develop new functionality together more easily, detect bugs, etc.’

And finally, by sharing knowledge and experience with other crawlers, we might standardise the way we present and exchange data. To combat fraud, it is useful to exchange data between registries or to share data in a uniform way with registrars.

This is not the first time that DNS Belgium has made code publicly available. We often do that because we try to help our sector move forward. We have also contributed to other open-source projects in the CENTR community and to products from AWS or Spring.

‘To encourage the spread of our crawler, we are promoting it heavily in the CENTR community. For example, we've already held a workshop with multiple registries where each participant set up the crawler in a new AWS account,’ says Quentin.

Making our crawler open source is one of the many actions we take to make the Internet safer. By doing so, we achieve our mission and contribute actively to the SDGs, which we hold dear.

With this article we support the Sustainable Development Goals of the United Nations.