Rspamd 1.3.1 has been released

2016-08-01 00:00:00 +0000

We have released new updates for Rspamd and Rmilter: 1.3.1 and 1.9.1 respectively. These releases contain a couple of important bugfixes and some useful new features.

Systemd activation has been removed from Rspamd and Rmilter

Over recent years, users have constantly reported issues with Systemd socket activation. This feature seems to be completely broken on systems with both IPv4 and IPv6 enabled. Moreover, it seems to be harmful, as Rspamd setup time is quite significant whilst socket activation is mostly intended for interactive on-demand daemons. Socket activation was added under the pressure of Debian packaging rules, though Rspamd is completely rotten in the official Debian repos and we strongly discourage Debian users from using the official Debian packages. While we are still looking for a new Debian maintainer for Rspamd, I can declare that socket activation will not be enabled by default in either Rspamd or Rmilter.

The switch to standalone mode should be transparent for users. The only significant difference is the use of rspamd.service instead of rspamd.socket in service management commands. In some cases you might need to restart Rmilter after the upgrade (for example, on Debian Jessie):

systemctl restart rmilter.service

Rmilter crash on BSD systems

There was a bug caused by the use of pthread_specific variables. Unfortunately, libmilter is poorly designed and can destroy or move certain threads, which causes horrific errors. Linux is somehow not affected; however, there is still a use-after-free bug with this approach. This issue has been fixed in Rmilter 1.9.1.

New multimap features in Rspamd

The multimap module has been significantly updated in Rspamd 1.3.1. It now supports maps that are checked conditionally depending on combinations of other symbols. There is now also support for multiple symbols and scores per map. A new hostname map type has been added to support matching of hostnames. Many new tests have been written to cover multimap functions and to provide resistance against regressions.

Greylisting fixes in Rspamd

There are a couple of fixes in the greylisting module, including authenticated users whitelisting, general logic fixes and restoration of selective greylisting.

Critical issue with Catena password scheme

There was a regression in 1.3 that prevented the Catena password encryption scheme from being correctly read by the controller. It has been fixed in 1.3.1, so Catena passwords (those with the $2$ prefix) can be used again.

Critical fix for Hyperscan cache

There was a race condition when a worker that was writing the Hyperscan cache file was killed in the middle of the process. The cache file was then left in an inconsistent state that was neither detected nor corrected on subsequent Rspamd runs. That caused multiple issues with regular expression processing, including false and failed matches on arbitrary texts. Since 1.3.1, Rspamd strictly ensures that a Hyperscan cache file has been correctly written to the filesystem.
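A common way to avoid leaving a half-written cache file behind is to write to a temporary file and rename it into place only once the data is safely on disk. The sketch below illustrates that general technique in Python; it is an assumption for illustration and not a description of how Rspamd itself implements the fix. The function name and the temporary file prefix are hypothetical.

```python
import os
import tempfile


def write_cache_atomically(path: str, data: bytes) -> None:
    """Write `data` to `path` so that readers never see a partial file.

    Generic illustration only: the data goes to a temporary file in the same
    directory, is flushed to disk, and is then renamed over the target.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cache-tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reached the disk
        # If the writer is killed before this line, the old file (if any)
        # is still intact; after it, the new file is complete.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern, a writer killed mid-way leaves only a stray temporary file, never a truncated cache under the real name.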

Message size limit in Rspamd

We now limit the size of incoming messages to prevent crashes on insane input. By default, this limit is set to 50Mb; however, it can be changed with the max_message setting in the options section:

# local.d/options.inc
max_message = 100Mb;

Include logic for Rspamd configuration files

Starting with this version, the behaviour of local.d and override.d is consistent with their description: values in local.d are added to the existing configuration, overwriting any keys that already exist, whereas values in override.d overwrite whole sections. For example, if we have an original configuration that looks like this:

# orig.conf
rule "something" {
  key1 = value1;
  key2 = {
    subkey1 = "subvalue1";
  }
}
rule "other" {
  key3 = value3;
}

and some local.d/orig.conf that looks like this:

# local.d/orig.conf
rule "something" {
  key1 = other_value; # overwrite "value1"
  key2 = {
    subkey2 = "subvalue2"; # append new value
  }
}
rule "local" { # add new rule
  key_local = "value_local";
}

then we will have the following merged configuration:

# config with local.d/orig.conf
rule "something" {
  key1 = other_value; # from local
  key2 = {
    subkey1 = "subvalue1";
    subkey2 = "subvalue2"; # from local
  }
}
rule "other" {
  key3 = value3;
}
rule "local" { # from local
  key_local = "value_local";
}

If you place the same config in the override.d directory instead, it will completely override all rules defined in the original file:

# config with override.d/orig.conf
rule "something" {
  key1 = other_value;
  key2 = {
    subkey2 = "subvalue2";
  }
}
rule "local" {
  key_local = "value_local";
}
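The two behaviours can be modelled with plain Python dictionaries. This is only a minimal sketch of the semantics described above, not Rspamd's actual UCL merging code; the function names merge_local and apply_override are hypothetical, and the "rule" section is represented as a dictionary keyed by rule name.

```python
import copy


def merge_local(original: dict, local: dict) -> dict:
    """local.d semantics: add keys, overwrite scalars, recurse into objects."""
    merged = copy.deepcopy(original)
    for key, value in local.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_local(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_override(original: dict, override: dict) -> dict:
    """override.d semantics: each section present in override replaces
    the original section wholesale, with no recursive merging."""
    result = copy.deepcopy(original)
    for section, value in override.items():
        result[section] = copy.deepcopy(value)
    return result


# The example configuration from the text, as nested dictionaries.
orig_conf = {
    "rule": {
        "something": {"key1": "value1", "key2": {"subkey1": "subvalue1"}},
        "other": {"key3": "value3"},
    }
}
extra_conf = {
    "rule": {
        "something": {"key1": "other_value", "key2": {"subkey2": "subvalue2"}},
        "local": {"key_local": "value_local"},
    }
}

with_local = merge_local(orig_conf, extra_conf)
with_override = apply_override(orig_conf, extra_conf)
```

Here with_local keeps rule "other" and both subkeys of key2, while with_override contains only the rules "something" and "local" exactly as written in the override file.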

Other bugfixes and improvements

There are a couple of other important bugfixes and improvements, including a critical fix for extracting values from the top received header. The test framework has been improved with new functional tests and better integration with CircleCI.

Rspamd 1.3 has been released

2016-07-25 00:00:00 +0000

Today, we’ve released major updates for both Rspamd and Rmilter: Rspamd is updated to version 1.3 and Rmilter is updated to version 1.9. These updates include many new features, including Rspamd proxy and fuzzy storage mirroring. Here is the list of the most important changes introduced in Rspamd 1.3:

Rspamd proxy support

We understand the importance of testing when building spam filtering engines. Most testing work requires checking with production mail flows. The idea behind Rspamd proxy is to create a lightweight shim that can forward requests to the main Rspamd instance and mirror a certain percentage of mail to some testing environment and compare scan results afterwards. Moreover, Rspamd proxy can encrypt traffic and provide zero-copy forwarding for local connections. You can read more about Rspamd proxy in the media section where there is a link to a presentation that describes proxy features in detail.

Fuzzy storage mirroring

Fuzzy hashes are a powerful method to filter spam mail. In version 1.3, Rspamd supports hash mirroring enabling master-slave replication for different storages. Furthermore, it is possible to organise collections of hashes from multiple sources. The synchronization structure is quite simple but it allows for building distributed systems providing the necessary level of redundancy and security. All these features are described in the fuzzy storage operational guide.

HTTPS maps support

Rspamd now supports the HTTPS protocol when accessing HTTP resources. This allows the creation of maps that query HTTPS resources. For example, the phishing module now supports the OpenPhish and PhishTank feeds.

Redis replication support

It is now possible to use Redis cluster in all Rspamd modules, since Redis requests are split between read_servers and write_servers. Another useful addition is the ability to specify Redis servers in a dedicated section to configure all modules that use Redis together. You can read more information about it in the Redis integration guide.

Improved content filtering

There are many features added to the multimap and mime_types plugins.

First of all, Rspamd can now read file lists from some archive types, namely zip and rar. This feature makes it possible to ban certain bad attachments that could be archived (or even encrypted) by spammers. Since archives are the main source of malware, this feature should be extremely useful in filtering this sort of malicious message. The mime_types plugin has been updated to find the following bad patterns:

archive in archive
double extension to hide malware (e.g. pdf.exe)
a list of blacklisted extensions (user-configurable)
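To make the three patterns concrete, here is a small Python sketch that classifies a single file name taken from an archive listing. It is not the mime_types plugin's code: the function name, the labels and all three extension sets are hypothetical values chosen for illustration.

```python
# Hypothetical extension lists; real deployments would configure their own.
ARCHIVE_EXTENSIONS = {"zip", "rar"}
BLACKLISTED_EXTENSIONS = {"exe", "scr", "js"}
DECOY_EXTENSIONS = {"pdf", "doc", "txt", "jpg"}


def classify_archive_entry(filename: str) -> set:
    """Return the set of bad patterns matched by one file name in an archive."""
    findings = set()
    parts = filename.lower().split(".")
    ext = parts[-1] if len(parts) > 1 else ""

    # Pattern 1: an archive stored inside another archive.
    if ext in ARCHIVE_EXTENSIONS:
        findings.add("archive_in_archive")

    # Pattern 2: a harmless-looking extension followed by a dangerous one,
    # e.g. "invoice.pdf.exe".
    if (len(parts) >= 3 and ext in BLACKLISTED_EXTENSIONS
            and parts[-2] in DECOY_EXTENSIONS):
        findings.add("double_extension")

    # Pattern 3: the final extension is on the blacklist.
    if ext in BLACKLISTED_EXTENSIONS:
        findings.add("blacklisted_extension")

    return findings
```

For example, classify_archive_entry("invoice.pdf.exe") matches both the double-extension and the blacklist checks, whereas a plain "report.pdf" matches none of them.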

Secondly, the multimap plugin can now scan message content. This allows you to create, for example, a regular expression map (powered by Hyperscan) that can filter messages using a comprehensive list of bad patterns.

There are a couple of other improvements to the multimap module, for example, new filters and map deduplication.

Internal greylisting support

Rspamd can now delay suspicious messages internally. In earlier versions, you would require some external tool (for example, Rmilter) to do this job for Rspamd. Since 1.3, you can do this task just within Rspamd regardless of the integration method you use.

Replies module

As with the previous module, this Rmilter feature is now available within Rspamd. By means of this module, Rspamd can store the IDs of outgoing emails and automatically whitelist external replies to these messages. This feature allows for immediate delivery of replies, automatic notifications and bounces for local users.

DKIM signatures support

Rspamd now supports DKIM signing in addition to DKIM checks. The signing condition is defined by a custom Lua script that lets you select the conditions for signing, the appropriate signing key and the selector, all by using Rspamd Lua API features. A couple of examples are provided in the module’s documentation. The rspamadm utility can now generate DKIM key pairs and DNS records for your domains.

WebUI improvements

There are various improvements in the Rspamd web interface. For example, it now includes a throughput graph powered by d3.js and the custom module contributed by Alexander Moisseev. You can now also learn fuzzy hashes using the WebUI, and many bugs and visual issues have been fixed in this version.

Other changes

Rspamd 1.3 also includes other changes that improve stability, filtering quality and performance, for example faster hash function selection. There are many critical bug fixes that were not backported to the 1.2 branch, for instance, major Redis statistics rework (which can now be used in highly loaded production environments). Many rules have been rescored and reworked. There are also many bug fixes to the URL detection logic and phishing detection. The chartable module has been completely rewritten to provide more useful homograph detection. There are massive changes to the documentation: new guides, better FAQ section and completely reworked Rmilter section.

WARNING: There are a couple of incompatible changes for Rmilter, so please take a look at the migration document.

Rspamd 1.2 has been released

2016-03-21 00:00:00 +0000

The next major release of rspamd, 1.2.0, is now available.

Key features:

Dynamic rule updates
Regular expressions maps support
Better performance: pcre2 support, faster fuzzy hashes, faster IP lookups
Improved stability: fixed many important bugs and memory leaks

This version is a gradual improvement over the previous 1.1 branch. It is the first release with rule update support. I believe that this will make it easier to backport new rules or critical score changes from the experimental branch to the stable one. Updates are signed to protect their integrity and authenticate the update source.

Among other features introduced by this version are regular expression maps support (with hyperscan acceleration if available). These maps could be used to match many regular expressions and, at the same time, detect certain patterns in the messages being scanned.

Rspamd 1.2 has a couple of performance improvements: it now supports the PCRE 2 regular expressions library that is usually faster than pcre 1. Fuzzy hashing gets further improvements by utilizing AVX2 instructions which are available for the Intel Haswell CPU family. From version 1.2 onwards, rspamd uses a better algorithm to store IP addresses allowing lookups among millions of IPv4 and IPv6 records in almost zero time.

The new release is scanned with Coverity scan and other static analysis tools that helped to fix many potential bugs and leaks. I believe that rspamd 1.2 is stable, solid and completely production-ready so far.

The complete log of changes can be found here: https://github.com/rspamd/rspamd/blob/1.2.0/ChangeLog

There are many important additions in the documentation shipped with rspamd. There is now a frequently asked questions article that describes many aspects of practical rspamd use. The quick start guide has also been updated to improve new users’ experience when installing and running rspamd.

Rmilter has also been upgraded to version 1.7.5, which fixes important greylisting and clamav issues. The rmilter changelog is available here: https://github.com/vstakhov/rmilter/blob/1.7.5/ChangeLog

Rspamd vs Spamassassin performance comparison

2016-03-03 00:00:00 +0000

Just before the 1.2 release, I measured the performance of rspamd compared to SA (SpamAssassin). In this experiment, I took the rspamd master branch with default rules and then added all rules from SA using the spamassassin plugin. Hence, the two scanners ran with almost exactly the same set of rules.

This set is quite large and includes about 3k custom regexp rules. Rspamd ran without hyperscan and pcre2, so it performed literally the same job as SA does. Here are the results for about 100k messages scanned:

Total False Positives: 517
Total False Negatives: 348
Total messages: 101349

Total SA time: 423942 seconds, total rspamd time: 33149 seconds
Average SA time: 4182ms/msg, average rspamd time: 327ms/msg

So the difference in checks is less than 1%, and in many cases rspamd does a better job than SA because of, for example, multiple hits of URIBL rules, phishing detection and some other differences. And it is still about 13 times faster than SA. Moreover, it eats less memory and can process more messages in parallel. In other experiments, rspamd was able to process about 450 messages per second on a single 4-core Sandy Bridge scanner box.
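The figures quoted above follow directly from the raw totals. The short Python snippet below redoes the arithmetic using only the numbers from the results listing; the variable names are mine.

```python
# Raw totals from the results listing.
total_messages = 101349
false_positives = 517
false_negatives = 348
sa_seconds = 423942
rspamd_seconds = 33149

# Share of messages where the two scanners reached different verdicts.
disagreement = (false_positives + false_negatives) / total_messages

# Average scan time per message, in milliseconds.
sa_avg_ms = sa_seconds / total_messages * 1000
rspamd_avg_ms = rspamd_seconds / total_messages * 1000

# How many times faster rspamd was overall.
speedup = sa_seconds / rspamd_seconds

print(f"disagreement: {disagreement:.2%}")
print(f"SA: {sa_avg_ms:.0f} ms/msg, rspamd: {rspamd_avg_ms:.0f} ms/msg")
print(f"speedup: {speedup:.1f}x")
```

The disagreement rate comes out at roughly 0.85%, and the speedup at about 12.8, which the text rounds to 13.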

I plan to release rspamd 1.2 very soon with a lot of cool features, including dynamic rules updates. I would appreciate any help in testing of the experimental packages. In fact, they are already used in production and are even more stable than 1.1 branch.

Rspamd switches to Apache 2 license

2016-02-04 00:00:00 +0000

In the modern world, software patents are a significant threat to Open Source software. Therefore, I have decided to switch from the original BSD license to the Apache 2 license. Whilst the Apache license has the same permissive clauses as the BSD license, it also deals with software patents explicitly: the Apache License contains both a patent grant and a patent retaliation clause.

The other licensing terms have not changed: you can still use the code in your projects, and you are not obliged to open your modifications to the code as the GPL requires. Contributed code is still licensed under the BSD license.