Rspamd 1.3 has been released

2016-07-25 00:00:00 +0200

Today, we’ve released major updates for both Rspamd and Rmilter: Rspamd is updated to version 1.3 and Rmilter to version 1.9. These updates bring many new features, including Rspamd proxy and fuzzy storage mirroring. Here is the list of the most important changes introduced in Rspamd 1.3:

Rspamd proxy support

We understand the importance of testing when building spam filtering engines. Most testing work requires checking with production mail flows. The idea behind Rspamd proxy is to create a lightweight shim that can forward requests to the main Rspamd instance, mirror a certain percentage of mail to a testing environment, and compare the scan results afterwards. Moreover, Rspamd proxy can encrypt traffic and provide zero-copy forwarding for local connections. You can read more about Rspamd proxy in the media section, where there is a link to a presentation that describes proxy features in detail.
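The mirroring idea can be illustrated with a minimal Python sketch. It is not Rspamd proxy code: the function name, the destination labels and the `mirror_probability` parameter are all hypothetical and only show how a fixed share of requests could be copied to a test instance.

```python
import random


def route_request(mirror_probability, rng=random.random):
    """Return the destinations for one scan request.

    Every request goes to the main instance; a random share of them
    (mirror_probability, between 0.0 and 1.0) is also copied to the
    testing instance. All names here are hypothetical.
    """
    destinations = ["main"]
    # rng() returns a float in [0.0, 1.0), so 0.0 never mirrors
    # and 1.0 always mirrors.
    if rng() < mirror_probability:
        destinations.append("mirror")
    return destinations
```

With `mirror_probability=0.1`, roughly one request in ten would also reach the testing environment, where its result could later be compared with the one from the main instance.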

Fuzzy storage mirroring

Fuzzy hashes are a powerful method to filter spam mail. In version 1.3, Rspamd supports hash mirroring enabling master-slave replication for different storages. Furthermore, it is possible to organise collections of hashes from multiple sources. The synchronization structure is quite simple but it allows for building distributed systems providing the necessary level of redundancy and security. All these features are described in the fuzzy storage operational guide.

HTTPS maps support

Rspamd now supports the HTTPS protocol when accessing HTTP resources. This allows you to create maps that query HTTPS resources. For example, the phishing module now supports OpenPhish and PhishTank feeds.

Redis replication support

It is now possible to use Redis cluster in all Rspamd modules, since Redis requests are split between read_servers and write_servers. Another useful addition is the ability to specify Redis servers in a dedicated section to configure all modules that use Redis together. You can read more information about it in the Redis integration guide.
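A rough Python sketch of the read/write split follows. Only the names read_servers and write_servers come from the text above; the `RedisRouter` class, the command list and the round-robin policy are assumptions made for illustration and do not describe how Rspamd actually selects servers.

```python
from itertools import cycle


class RedisRouter:
    """Pick a server for a Redis command (hypothetical helper)."""

    # A small, incomplete set of commands that modify data.
    WRITE_COMMANDS = {"SET", "DEL", "INCR", "HSET", "EXPIRE"}

    def __init__(self, read_servers, write_servers):
        # Round-robin is an assumption; any balancing policy would do.
        self._readers = cycle(read_servers)
        self._writers = cycle(write_servers)

    def pick(self, command):
        """Return the next server suitable for the given command."""
        if command.upper() in self.WRITE_COMMANDS:
            return next(self._writers)
        return next(self._readers)
```

In a master-replica setup, the master would be listed as the write server and the replicas as read servers, so lookups are spread over the replicas while updates always reach the master.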

Improved content filtering

There are many features added to the multimap and mime_types plugins.

First of all, Rspamd can now read file lists from some archive types, namely zip and rar. This feature makes it possible to ban certain bad attachments that spammers have archived (or even encrypted). Since archives are the main source of malware, this feature should be extremely useful in filtering this sort of malicious message. The mime_types plugin has been updated to find the following bad patterns:

  • archive in archive
  • double extension to hide malware (e.g. pdf.exe)
  • a list of blacklisted extensions (user-configurable)
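The three patterns above can be sketched in Python as checks over the file names listed in an archive. This is only an illustration: the extension sets and the `check_archive_listing` function are hypothetical and are not taken from the mime_types plugin.

```python
ARCHIVE_EXTENSIONS = {"zip", "rar"}
# Hypothetical example lists; the real plugin has its own configuration.
BLACKLISTED_EXTENSIONS = {"exe", "scr", "js"}
DOCUMENT_EXTENSIONS = {"pdf", "doc", "docx", "xls", "jpg"}


def check_archive_listing(names):
    """Return the set of bad patterns found in an archive's file list."""
    findings = set()
    for name in names:
        # Split off at most the last two dot-separated components.
        parts = name.lower().rsplit(".", 2)
        ext = parts[-1] if len(parts) > 1 else ""
        inner = parts[-2] if len(parts) > 2 else ""
        if ext in ARCHIVE_EXTENSIONS:
            findings.add("archive_in_archive")
        if ext in BLACKLISTED_EXTENSIONS:
            findings.add("blacklisted_extension")
            # A harmless-looking extension in front of a dangerous one.
            if inner in DOCUMENT_EXTENSIONS:
                findings.add("double_extension")
    return findings
```

For example, a listing containing invoice.pdf.exe would be reported both as a blacklisted extension and as a double extension.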

Secondly, the multimap plugin can now scan message content. This allows you to create, for example, a regular expressions map (powered by Hyperscan) that can filter messages using a comprehensive list of bad patterns.
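As an illustration of content matching against a regular expressions map, here is a small Python sketch. The map format (one expression per line, with "#" comments) and both function names are assumptions. The standard `re` module also tests the patterns one by one, whereas Hyperscan is designed to match many expressions at once.

```python
import re


def load_regexp_map(lines):
    """Compile one case-insensitive pattern per non-empty, non-comment line."""
    patterns = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def match_content(patterns, text):
    """Return the source of every pattern found anywhere in the text."""
    return [p.pattern for p in patterns if p.search(text)]
```

A message would then be flagged whenever `match_content` returns a non-empty list for its body.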

There are a couple of other improvements to the multimap module, for example, new filters and map deduplication.

Internal greylisting support

Rspamd can now delay suspicious messages internally. In earlier versions, you needed an external tool (for example, Rmilter) to do this job for Rspamd. Since 1.3, you can do this within Rspamd itself, regardless of the integration method you use.
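For readers unfamiliar with greylisting, the general technique can be sketched in Python as follows. The `Greylist` class, the (IP, sender, recipient) key and the 300 second delay are assumptions based on classic greylisting, not a description of the Rspamd module.

```python
class Greylist:
    """Defer a message until its sender retries after a minimum delay."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        self.first_seen = {}  # (ip, sender, recipient) -> first timestamp

    def check(self, sender_ip, sender, recipient, now):
        """Return "defer" for early attempts and "accept" after the delay."""
        key = (sender_ip, sender, recipient)
        # Remember when this combination was first seen.
        first = self.first_seen.setdefault(key, now)
        return "accept" if now - first >= self.delay else "defer"
```

Legitimate mail servers retry deferred messages, so they pass on a later attempt, while much spam software never retries.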

Replies module

Similarly to the previous module, this Rmilter feature is now available within Rspamd. With this module, Rspamd can store the IDs of outgoing messages and automatically whitelist external replies to them. This allows replies, automatic notifications and bounces to be delivered to local users immediately.
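The reply whitelisting logic can be sketched in a few lines of Python. The `ReplyTracker` class is hypothetical, and the in-memory set stands in for whatever storage the module really uses; a production setup would also need persistence and expiry of old IDs.

```python
class ReplyTracker:
    """Remember outgoing Message-IDs and recognise replies to them."""

    def __init__(self):
        self.sent_ids = set()

    def record_outgoing(self, message_id):
        """Store the Message-ID header value of an outgoing message."""
        self.sent_ids.add(message_id.strip())

    def is_reply_to_us(self, in_reply_to):
        """True if the In-Reply-To value refers to a message we sent."""
        return bool(in_reply_to) and in_reply_to.strip() in self.sent_ids
```

An incoming message whose In-Reply-To header matches a stored ID could then skip greylisting or receive a negative score.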

DKIM signatures support

Rspamd now supports DKIM signing in addition to DKIM checks. The signing condition is defined by a custom Lua script, which lets you choose when to sign, as well as the appropriate signing key and selector, using Rspamd Lua API features. A couple of examples are provided in the module’s documentation. The rspamadm utility can now generate DKIM key pairs and DNS records for your domains.

WebUI improvements

There are various improvements in the Rspamd web interface. For example, it now includes a throughput graph powered by d3.js and the custom module contributed by Alexander Moisseev. You can now also learn fuzzy hashes using the WebUI, and many bugs and visual issues have been fixed in this version.

Other changes

Rspamd 1.3 also includes other changes that improve stability, filtering quality and performance, for example faster hash function selection. There are many critical bug fixes that were not backported to the 1.2 branch, for instance, major Redis statistics rework (which can now be used in highly loaded production environments). Many rules have been rescored and reworked. There are also many bug fixes to the URL detection logic and phishing detection. The chartable module has been completely rewritten to provide more useful homograph detection. There are massive changes to the documentation: new guides, better FAQ section and completely reworked Rmilter section.

WARNING: There are a couple of incompatible changes for Rmilter, so please take a look at the migration document.

Rspamd 1.2 has been released

2016-03-21 00:00:00 +0100

The next major release of rspamd, 1.2.0, is now available.

Key features:

  • Dynamic rule updates
  • Regular expressions maps support
  • Better performance: pcre2 support, faster fuzzy hashes, faster IP lookups
  • Improved stability: fixed many important bugs and memory leaks

This version is a gradual improvement over the previous 1.1 branch. It is the first release with rule updates support. I believe this will make it easier to backport new rules or critical score changes from the experimental branch to the stable one. Updates are signed to protect their integrity and authenticate the update source.

Among other features introduced by this version are regular expression maps support (with hyperscan acceleration if available). These maps could be used to match many regular expressions and, at the same time, detect certain patterns in the messages being scanned.

Rspamd 1.2 has a couple of performance improvements: it now supports the PCRE 2 regular expressions library, which is usually faster than PCRE 1. Fuzzy hashing is further improved by using AVX2 instructions, which are available on the Intel Haswell CPU family. From version 1.2 onwards, rspamd uses a better algorithm to store IP addresses, allowing lookups among millions of IPv4 and IPv6 records in almost zero time.
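The post does not name the algorithm used for IP storage. One common structure for fast lookups among many networks is a bitwise prefix trie, sketched below in Python purely as an illustration; the `PrefixTrie` class is hypothetical and says nothing about the actual rspamd implementation.

```python
import ipaddress


class PrefixTrie:
    """Longest-prefix match over IPv4 and IPv6 networks."""

    def __init__(self):
        # Separate tries so IPv4 and IPv6 prefixes never mix.
        self.roots = {4: {}, 6: {}}

    @staticmethod
    def _bits(addr, count):
        """Return the first `count` bits of an address, high bit first."""
        value, width = int(addr), addr.max_prefixlen
        return [(value >> (width - 1 - i)) & 1 for i in range(count)]

    def insert(self, cidr, value):
        net = ipaddress.ip_network(cidr)
        node = self.roots[net.version]
        for bit in self._bits(net.network_address, net.prefixlen):
            node = node.setdefault(bit, {})
        node["value"] = value

    def lookup(self, address):
        """Return the value of the most specific matching network, or None."""
        addr = ipaddress.ip_address(address)
        node, best = self.roots[addr.version], None
        for bit in self._bits(addr, addr.max_prefixlen):
            if "value" in node:
                best = node["value"]
            node = node.get(bit)
            if node is None:
                return best
        return node.get("value", best)
```

A lookup walks at most 32 bits for IPv4 or 128 bits for IPv6, so its cost does not depend on how many networks are stored.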

The new release has been scanned with Coverity Scan and other static analysis tools, which helped to fix many potential bugs and leaks. I believe that rspamd 1.2 is stable, solid and completely production-ready.

The complete log of changes can be found here: https://github.com/vstakhov/rspamd/blob/1.2.0/ChangeLog

There are many important additions in the documentation shipped with rspamd. There is now a frequently asked questions article that describes many aspects of practical rspamd use. The quick start guide has also been updated to improve new users’ experience when installing and running rspamd.

Rmilter has also been upgraded to version 1.7.5, which fixes important greylisting and clamav issues. The rmilter changelog is available here: https://github.com/vstakhov/rmilter/blob/1.7.5/ChangeLog

Rspamd vs Spamassassin performance comparison

2016-03-03 00:00:00 +0100

Just before the 1.2 release, I measured the performance of rspamd compared to SA. In this experiment, I took the rspamd master branch with default rules. Then I added all rules from SA using the spamassassin plugin. Hence, the two scanners run with almost exactly the same set of rules.

This set is quite large and includes about 3k custom regexp rules. Rspamd runs without hyperscan and pcre2, so it performs literally the same job as SA does. Here are the results for about 100k messages scanned:

Total False Positives: 517
Total False Negatives: 348
Total messages: 101349

Total SA time: 423942 seconds, total rspamd time: 33149 seconds
Average SA time: 4182ms/msg, average rspamd time: 327ms/msg

So the difference in checks is less than 1%, and in many cases rspamd does a better job than SA thanks to, for example, multiple hits of URIBL rules, phishing detection and some other differences. It is also about 13 times faster than SA. Moreover, it uses less memory and can process more messages in parallel. In other experiments, rspamd was able to process about 450 messages per second on a single 4-core Sandy Bridge scanner box.
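The figures quoted above can be rechecked from the raw totals with a few lines of Python. The variable names are mine; the numbers are the ones from the results listing.

```python
total_messages = 101349
false_positives = 517
false_negatives = 348
sa_seconds = 423942
rspamd_seconds = 33149

# Share of messages where the two scanners disagree.
disagreement_rate = (false_positives + false_negatives) / total_messages

# Average scan time per message, in milliseconds.
sa_ms_per_msg = sa_seconds / total_messages * 1000
rspamd_ms_per_msg = rspamd_seconds / total_messages * 1000

# How many times faster rspamd is overall.
speedup = sa_seconds / rspamd_seconds

print(f"disagreement: {disagreement_rate:.2%}")
print(f"SA: {sa_ms_per_msg:.0f} ms/msg, rspamd: {rspamd_ms_per_msg:.0f} ms/msg")
print(f"speedup: {speedup:.1f}x")
```

The disagreement rate comes out at about 0.85% and the speedup at about 12.8, which is where "less than 1%" and "13 times" come from.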

I plan to release rspamd 1.2 very soon with a lot of cool features, including dynamic rule updates. I would appreciate any help in testing the experimental packages. In fact, they are already used in production and are even more stable than the 1.1 branch.