# Rspamd 1.4.1 has been released

The next stable release of Rspamd, 1.4.1, is available for download. This release includes various bugfixes and a couple of new features. The most notable new feature is the Clickhouse plugin.

### Clickhouse plugin

This plugin exports scan data to the Clickhouse column-oriented database. This lets you perform very deep analysis of the data and use advanced statistical tools to examine your mail flows and the efficiency of Rspamd. For example, you can find the most abused domains, the largest spam senders, attachment statistics, URL statistics and so on. The module documentation includes some samples of what you can do with this tool.

### Universal maps

It was not very convenient that maps could only contain references to external resources. Starting with version 1.4.1, you can also embed maps into the configuration to simplify the definition of small maps:

    map = ["elt1", "elt2" ...]; # Embedded map
    map = "/some/file.map"; # External map
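
To illustrate the distinction, here is a minimal Python sketch (not Rspamd's implementation) of how a loader might tell the two forms apart: a list is treated as an embedded map, while a string is treated as a reference to an external resource. The function name `load_map` and the one-element-per-line file format are assumptions made for this example.

```python
from pathlib import Path
from typing import Union


def load_map(definition: Union[list, str]) -> set:
    """Return the set of map elements for an embedded or external definition."""
    if isinstance(definition, list):
        # Embedded map: the elements are given inline in the configuration.
        return {str(elt) for elt in definition}
    # External map: assumed here to be a local file with one element per line;
    # blank lines and '#' comments are skipped.
    elements = set()
    for line in Path(definition).read_text().splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            elements.add(line)
    return elements
```

Both forms end up as the same in-memory structure, so the code that consults the map does not need to know how it was defined.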


### Lua modules debugging improvements

You can now list Lua modules in debug_modules to investigate a specific module without enabling global debug.

### New rules

Steve Freegard has added a bunch of new rules that target current spam trends, including rules for:

• Freemail and disposable email addresses
• Common Message-ID abuse
• Compromised hosts rules
• Rules for upstream services that have already run spam checks
• Commonly abused patterns in From, To and other headers
• Suspicious subjects
• MIME misuse

### Multiple fixes to the ANN module

Neural networks have been fixed to work in a distributed environment. A couple of consistency bugs in Redis operations have been found and eliminated.

### Other bugfixes

A couple of other bugs and memory leaks were fixed in this release. Please check the full release notes for details.

# Rspamd 1.4 has been released

Today, after 4 months of development, we’ve released major updates for both Rspamd and Rmilter: Rspamd is updated to version 1.4 and Rmilter is updated to version 1.10. These updates bring many new features, including Redis pool support, new modules, improved neural networks support, zstd compression for the protocol and many other important improvements.

### Redis pool support

Rspamd now connects to Redis using a pool of persistent connections. This feature does not require any special setup and allows reuse of existing connections improving load profile for Redis instances. Enabling this feature allowed Rspamd to use Redis more extensively for different tasks.

### New neural nets plugin

The neural nets plugin has been reworked to store both training vectors and neural nets in Redis. This change makes it possible to use a single neural network for a whole cluster of Rspamd scanners, which improves both the quality of classification and the speed of training.

### Bayes improvements

Some work has been done to improve the Bayesian statistical classifier. Rspamd now uses more metadata to estimate ham/spam probability. You can read more about how the Bayes classifier in Rspamd compares to other spam filters here: https://rspamd.com/misc/2016/10/14/bayes-performance.html.

### New Antivirus plugin

Rspamd can now check messages for viruses using the Antivirus plugin. This module provides multiple features, including:

• support for different antivirus types: ClamAV, Sophos and F-Prot
• support for custom patterns (e.g. experimental databases for ClamAV)
• caching of check results
• an attachments-only mode to save AV resources
• whitelists, size limits and custom condition scripts

### New MX check plugin

Rspamd can now verify MX validity for scanned messages using the new MX check plugin. This plugin is useful for protecting from messages with invalid return paths.

### Compression support in the protocol

Rmilter and Rspamd now support zstd compression. This algorithm is fast and efficient, which reduces both network and CPU load when transferring data over the network. Zstd is also used to store large chunks of data in Redis (e.g. serialized neural nets).

### Reworked model for DNS failures in SPF, DKIM and DMARC

Rspamd now has a better understanding of temporary failures when performing DNS-related checks, e.g. DKIM, DMARC or SPF. There are special symbols to represent both temporary and permanent errors for these plugins.

The ratelimit module now supports adaptive ratelimits, meaning that limits can be made stricter for new and/or bad reputation senders and more lenient for good reputation senders. Furthermore, ratelimits are now composable from keywords, which provides greater flexibility, and user-defined keywords can be created with Lua functions to support custom requirements.

### Monitored objects

There is a new concept in Rspamd: monitored resources. This means that Rspamd periodically checks whether a resource is still available and healthy. For example, this feature is enabled for RBLs and URIBLs. In this mode, Rspamd checks that the DNSBL is available and that it does not blacklist the world. If these checks fail, the monitored resource is ignored in further checks.
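
The two health conditions can be sketched in Python as follows. This is only an illustration of the idea, not Rspamd's code: the class name, the probe label and the `lookup` callback (which stands in for a real DNS resolver) are all assumptions made for this example.

```python
from typing import Callable, Optional


class MonitoredDnsbl:
    """Tracks whether a DNSBL looks healthy enough to be consulted."""

    def __init__(self, zone: str, lookup: Callable[[str], Optional[bool]]):
        self.zone = zone
        # `lookup` is a hypothetical resolver callback: it returns True if the
        # name is listed, False if it is not, and None on timeout or error.
        self.lookup = lookup
        self.alive = True

    def check(self) -> bool:
        """Run one periodic probe and update the health flag."""
        # Probe a name that should never be listed (the label is an assumption).
        result = self.lookup("never-listed.invalid." + self.zone)
        if result is None:
            self.alive = False  # the list is unreachable
        elif result:
            self.alive = False  # the list "blacklists the world"
        else:
            self.alive = True
        return self.alive

    def is_listed(self, name: str) -> bool:
        """Query the list; an unhealthy list is ignored and never matches."""
        if not self.alive:
            return False
        return bool(self.lookup(name + "." + self.zone))
```

A scheduler would call `check()` at a fixed interval, while message processing only calls `is_listed()`, so a broken list cannot add score to every message.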

### Redis backend for fuzzy storage

It is now possible to store fuzzy hashes in Redis. This storage is faster, more scalable and more featureful than SQLite. The rspamadm utility can convert fuzzy hashes from SQLite storage to Redis using the fuzzy_convert tool.

### Delhash support for fuzzy storage

You can now remove a specific hash from fuzzy storage without having the message: find the hash in the logs and call rspamc fuzzy_delhash <hex>. Multiple hashes can be specified for this command.

### Metric exporter and metadata exporter

The metric exporter periodically pushes Rspamd’s internal statistics to an external monitoring system (currently only Graphite is supported). The metadata exporter is a flexible mechanism for conditionally pushing user-defined message metadata to an external system (the current backends are Redis Pub/Sub and HTTP).

### Dynamic configuration in Redis

This feature is useful when you want to manage multiple instances of Rspamd centrally. Currently, dynamic configuration is limited to scores of symbols, actions and global enable/disable definitions for symbols only. This functionality is planned to be extended in the future.

### Users settings in Redis

The users settings module now supports loading users’ settings from a Redis server. This is useful for dynamic configuration of users’ preferences without reloading the whole set of settings.

### Errors ring buffer

The Rspamd logger now stores errors in a central ring buffer that contains information about the most recent errors that occurred in all Rspamd processes. The controller worker returns this buffer as JSON when the /errors path is requested (this requires enable_password).
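
For readers unfamiliar with the data structure, a ring buffer keeps only the newest N entries and silently drops the oldest ones. The Python sketch below illustrates the idea; the class name and the field names (`ts`, `pid`, `module`, `message`) are assumptions for this example and do not describe the actual format of the /errors reply.

```python
import json
import time
from collections import deque


class ErrorRing:
    """Fixed-size buffer that keeps only the most recent error records."""

    def __init__(self, capacity: int = 10):
        # A deque with maxlen discards the oldest item when it is full.
        self._items = deque(maxlen=capacity)

    def add(self, pid: int, module: str, message: str, ts: float = None) -> None:
        """Record one error; the timestamp defaults to the current time."""
        self._items.append({
            "ts": time.time() if ts is None else ts,
            "pid": pid,
            "module": module,
            "message": message,
        })

    def to_json(self) -> str:
        """Serialize the buffer, oldest entry first."""
        return json.dumps(list(self._items))
```

Memory use stays bounded no matter how many errors the workers produce, which is the main reason to prefer a ring buffer over an ever-growing log list.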

### Messages rework

It is now possible to have multiple messages when returning Rspamd reply, e.g.

    {"messages": {"smtp_message": "Try again later"}}


Rmilter 1.10 also supports this and uses it to pass a specific error message to the MTA (e.g. for ratelimit or greylisting).
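
A client that consumes such a reply could extract the SMTP message along these lines. This is a minimal Python sketch based only on the JSON shape shown above; the function name is hypothetical and the rest of the reply structure is not assumed.

```python
import json
from typing import Optional


def smtp_message_from_reply(reply_body: str) -> Optional[str]:
    """Return the "smtp_message" entry of a JSON reply, or None if absent."""
    reply = json.loads(reply_body)
    messages = reply.get("messages")
    if isinstance(messages, dict):
        return messages.get("smtp_message")
    return None
```

For the reply shown above, the function returns "Try again later"; a reply without a "messages" object yields None, so the caller can fall back to a default text.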

### Multiple updates to Rspamd Lua API

There are many new features in Rspamd Lua API:

• periodic events:

    local i = 0 -- counter kept between invocations
    rspamd_config:add_periodic(ev_base, 1.0, function(cfg, ev_base)
      local logger = require "rspamd_logger"
      i = i + 1
      logger.infox(cfg, "periodic function, %s", i)
      return false -- if return false, then the periodic event is removed
    end, true)

• on_load and on_terminate scripts:

    rspamd_config:add_on_load(function(cfg, ev_base, worker)
      if worker:get_name() == 'normal' then
        -- Do something
      end
    end)

• multiple hashes support:

    local hash = require "rspamd_cryptobox_hash"
    hash.create_specific('md5', 'string'):hex()
    -- b45cffe084dd3d20d928bee85e7b0f21

• HTTPS support in lua_http
• many improvements in ANN module, including batch training and threaded training
• zstd compression and decompression support has been added to rspamd_util

### Rules improvements

Various new rules have been added to detect suspicious patterns, along with fixes to improve accuracy. HTML rules have been improved and various bugs in DNS-related services have been fixed. In particular, a couple of untrusted DNSBLs (SORBS and UCEPROTECT) have been removed.

### WebUI improvements

There are many major improvements to the Rspamd Web Interface including the following:

• new symbols scores configuration tab
• new last errors table in the history tab
• WebUI is now loaded on demand for each tab
• updated d3 graphs scripts
• the default passwords are now banned from use in the WebUI

## Conclusions

Rspamd 1.4 and Rmilter 1.10 are the current stable branches and all users are recommended to update their Rspamd versions. Please read the migration guide if you are unsure about the upgrade process.

# Rspamd bayes engine benchmark

I recently decided to compare the Bayes classifier in Rspamd with its closest analogues. I tested three filters:

1. Rspamd (version 1.4 git master)
2. Bogofilter - classical bayesian filter
3. Dspam - the most advanced bayesian filter used by many projects and people

For Dspam, I tested both the chain and osb tokenization modes. I also tried to test the chi-square probabilities combiner (since the same algorithm is used in Rspamd), but I could not get it to work.

## Testing methodology

First of all, I collected a corpus of about 1k spam messages and 1k ham messages. All messages were carefully selected and manually checked. Then I wrote a small script that performs the following steps:

1. Split the corpus randomly into two equal parts, each with about 500 ham and 500 spam messages.
2. Learn bayes classifier using the desired spam filtering engine (-d for Dspam, -b for Bogofilter).
3. Use the rest of messages to test classifier after learning procedure.
4. Use a 95% confidence threshold for Rspamd and Dspam: when the probability of spam is below 95%, the classifier is considered to be in an undefined state. Bogofilter, in turn, automatically provides three results: spam, ham and undefined.
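
Steps 1 and 4 can be sketched in Python as follows. This is not the original test script, only an illustration of the procedure; the function names are hypothetical, and treating the ham side symmetrically (ham when the spam probability is at most 5%) is an assumption, since the text only spells out the spam side.

```python
import random


def split_corpus(messages, seed=0):
    """Shuffle the messages and split them into two equal halves."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # (learn set, test set)


def verdict(spam_probability: float, confidence: float = 0.95) -> str:
    """Map a spam probability to spam, ham or undefined."""
    if spam_probability >= confidence:
        return "spam"
    # Assumption: the same confidence is required before calling a message ham.
    if spam_probability <= 1.0 - confidence:
        return "ham"
    return "undefined"
```

The split would be applied to the spam and ham corpora separately so that each half keeps roughly the same class balance.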

This script collects 6 main values for each classifier:

1. Spam/Ham detection rate - number of messages that are correctly recognized as spam and ham
2. Spam FP rate - number of false positives for Spam: HAM messages that are recognized as SPAM
3. Ham FP rate - number of false positives for Ham: SPAM messages that are recognized as HAM
4. Ham and Spam FN rate - number of messages that are not recognized as Ham or Spam (but not classified as the opposite class, meaning uncertainty for a classifier)
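
To make the six counters concrete, here is a small Python sketch that tallies them from pairs of (actual class, classifier verdict). It follows the definitions in the list above; the function name and key names are chosen for this example and are not taken from the original script.

```python
from collections import Counter

KEYS = ("spam_detected", "ham_detected", "spam_fp", "ham_fp", "spam_fn", "ham_fn")


def score(results):
    """Tally outcomes from (actual, predicted) pairs.

    `actual` is "spam" or "ham"; `predicted` is "spam", "ham" or "undefined".
    """
    counts = Counter()
    for actual, predicted in results:
        if predicted == "undefined":
            counts[actual + "_fn"] += 1        # not recognized at all
        elif predicted == actual:
            counts[actual + "_detected"] += 1  # correctly recognized
        else:
            # A false positive is counted for the class that was wrongly
            # assigned: ham marked as spam increments spam_fp.
            counts[predicted + "_fp"] += 1
    return {key: counts[key] for key in KEYS}
```

Dividing each counter by the number of test messages of the relevant class turns these counts into the rates used in the comparison.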

The worst error for a classifier is a Spam False Positive, since it marks an innocent message as Spam. Ham FP and false negatives are more tolerable: they just mean that you receive more spam than you want.

## Results

The raw results are pasted at the following gist.

Here are the corresponding graphs for detection rate and errors for the competitors.

## Conclusions

Rspamd Bayes performs very well compared to the competitors. It provides a higher spam detection rate than both Dspam and Bogofilter. All three filters showed a similar spam false positive rate. However, Dspam is more aggressive in marking messages as Ham (which is not bad because Bayes is the only check Dspam provides).

Rspamd is also much faster in learning and testing. With Redis backend, it learns 1k messages in less than 5 seconds. Dspam and Bogofilter both require about 30 seconds to learn.

I have not included SpamAssassin in the comparison since it uses a naive Bayes classifier similar to Bogofilter's. Hence, its quality is very close to that of Bogofilter.

Furthermore, unlike its competitors, Rspamd provides a lot of other checks and features. The goal of this particular benchmark was to compare only the Bayesian engines of different spam filters. To summarise, the quality of the Bayes classifier in Rspamd is high enough to recommend it for production use or as a replacement for Dspam or Bogofilter in your email system.

# Rspamd 1.3.5 has been released

The next stable version of Rspamd is now available to download. This release contains a couple of bugfixes and minor improvements.

### Termination handlers

Rspamd can now perform some actions on termination of worker processes. For example, it is useful for neural network plugin to save training data on exit. It was also essential for RRD statistics to synchronize RRD on controller’s termination to avoid negative message rates on graphs.

### Minimum learns has been fixed

This option was previously misconfigured, so it did not work as intended. It is nevertheless useful for skipping statistical classification until the Bayes classifier has received enough training. This option has been fixed in the 1.3.5 release.
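
The intent of such a threshold can be illustrated with a few lines of Python. This is a sketch of the general idea only; the function name is hypothetical, and requiring the minimum for both classes is an assumption rather than a description of Rspamd's exact rule.

```python
def should_classify(spam_learns: int, ham_learns: int, min_learns: int) -> bool:
    """Return True only once enough messages of each class have been learned."""
    # Assumption: both the spam and the ham class must reach the minimum,
    # otherwise the classifier output is skipped as unreliable.
    return spam_learns >= min_learns and ham_learns >= min_learns
```

With a minimum of 200, a classifier trained on 500 spam but only 50 ham messages would stay silent until more ham has been learned.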

### Rspamd on OpenBSD

A couple of bugs were fixed, which allows Rspamd to run on OpenBSD again. These bugs were masked on other systems, but they were potentially dangerous for those systems as well.

### DMARC and DKIM improvements

Andrew Lewis has added various improvements for DKIM, DMARC and SPF plugins to handle cases when the corresponding policies are not listed by senders: e.g. when there is no SPF record or DKIM key for some domain.

### Ratelimits improvements

It is now possible to disable ratelimits for specific users.

### Mailbox messages and rspamc

Rspamd command line client rspamc can now work with messages in UNIX mailbox format which is sometimes used to store messages on the disk.

### Spamhaus DROP Support

Rspamd now supports the Spamhaus DROP DNS block list, which is used to block large botnets around the world.

### DKIM verification improvements

Some bugs related to canonicalization of empty messages have been fixed in the DKIM plugin.

### Fix critical issue with line endings finding

There was a critical bug in Rspamd related to parsing of newline offsets in a message. In certain cases it could lead to serious malfunction in the URL detector and some other crucial parts of Rspamd.

### Minor bugfixes

There are a couple of minor bugfixes in this release, for example, parsing of \0 symbol in lua_tcp module. HFILTER_URL_ONLY is fixed not to produce overly high scores. All invocations of table.maxn have been removed from Lua plugins as this function is deprecated in Lua.

# Rspamd 1.3.4 has been released

The new stable versions of Rspamd and Rmilter have been released: 1.3.4 and 1.9.2 respectively. There are a couple of improvements and some important bugfixes. Please note that in the unlikely case that you have used regexp rules in Rmilter, you SHOULD NOT upgrade Rmilter and should file a bug report instead (however, I’m pretty sure that this feature is not used by anybody since it has never been documented). Here is a list of notable changes in Rmilter and Rspamd.

### Rspamd reload command has been fixed

It is now possible to gracefully reload the Rspamd configuration by sending the HUP signal or by using the reload subcommand of the init scripts. Graceful reload is useful when the configuration must be updated without stopping email processing. During this process, Rspamd starts new worker processes with the new configuration while the existing ones finish processing the pending messages.

### Better ASN/country support

ASN/country detection module has been split from the ip_score module allowing use of this data in other modules, for example, in the multimap module to match maps based on country or ASN.

### Variable maps in the multimap module

It’s now possible to create maps based on the results of other Lua or internal Rspamd modules. This is particularly useful for linking different modules with multimap.

### DNSSEC stub resolver support

It’s now possible to enable DNSSEC checks in Rspamd through use of a DNSSEC compatible recursive resolver (e.g. Unbound) and check for DNSSEC authentication results in Lua DNS module.

### DMARC and DKIM module fixes

There are some important fixes for DMARC and DKIM modules in this version of Rspamd that are related to canonicalization in DKIM and subdomains policies in DMARC.

### Redis backend configuration

The Redis backend in the statistical module can now use the global Redis settings, like other modules.

Each task and each MIME part now has its own checksum that could be used to detect the same message or the same attachment.
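
The idea of per-part checksums can be illustrated with Python's standard library. This sketch is not how Rspamd computes its checksums: the function name is hypothetical and the digest algorithm (BLAKE2b truncated to 16 bytes) is an assumption chosen only for the example.

```python
import hashlib
from email import message_from_bytes


def part_digests(raw_message: bytes) -> list:
    """Return one hex digest per leaf MIME part of the message."""
    msg = message_from_bytes(raw_message)
    digests = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no content of their own
        # Decode the transfer encoding so that the same attachment gives the
        # same digest regardless of how it was encoded in transit.
        payload = part.get_payload(decode=True) or b""
        digests.append(hashlib.blake2b(payload, digest_size=16).hexdigest())
    return digests
```

Two messages that carry the same attachment then share a digest for that part, even if their text bodies differ.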

### DKIM signature header is now folded by Rspamd

Since the DKIM signature header can be quite long, Rspamd now folds it to fit within the 80-character line width that is common for MIME messages.
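
Header folding itself is simple: a long header is broken at whitespace and each continuation line starts with a whitespace character. The Python sketch below shows the general technique under simplifying assumptions; it is not Rspamd's implementation, the function name is hypothetical, and a single token longer than the width (such as an unbroken base64 signature) is left on its own line rather than split.

```python
def fold_header(name: str, value: str, width: int = 80) -> str:
    """Fold a header at whitespace so lines stay within `width` characters."""
    lines = []
    current = name + ":"
    for token in value.split():
        # +1 accounts for the space that would separate the token.
        if len(current) + 1 + len(token) > width:
            lines.append(current)
            current = "\t" + token  # continuation lines begin with a tab
        else:
            current = current + " " + token
    lines.append(current)
    return "\r\n".join(lines)
```

Unfolding is the reverse operation: removing each CRLF that is followed by whitespace restores a single logical header line.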

### Ratelimit module fixed

This release of Rspamd fixes a regression introduced in 1.3.3 which prevented the ratelimit module from working properly.

Processing of X-Forwarded-For header in the controller has been fixed.

### Rmilter configuration improvements

It is now possible to use += operator to append elements to Rmilter lists (e.g. whitelists) and = to redefine the parameter completely. Hosts lists now can contain hostnames along with IP addresses. List parameters can now be empty to clear lists that are non-empty by default. DKIM signing can be completely disabled in the configuration.
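
The difference between the two operators can be modelled in a few lines of Python. This only illustrates the semantics described above and is not Rmilter's configuration parser; the function name and the sample values are hypothetical.

```python
def apply_list_setting(current: list, operator: str, values: list) -> list:
    """Return the new list after applying '+=' (append) or '=' (redefine)."""
    if operator == "+=":
        return list(current) + list(values)  # keep defaults, add new elements
    if operator == "=":
        return list(values)  # replace entirely; an empty list clears it
    raise ValueError("unsupported operator: " + operator)
```

Starting from a default of ["127.0.0.1"], "+=" with ["mx.example.com"] keeps the default and adds the hostname, while "=" with an empty list clears the parameter.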

### Rmilter regexp rules are removed

Support for regexp rules has been removed from Rmilter. This is an old feature that has never been documented and, as far as I know, has never been used. It was likely broken, so I have decided to remove it from Rmilter completely to simplify the configuration parser and the overall processing logic. If you are using it, do not update Rmilter and please file a bug report in the GitHub issue tracker.

### Rmilter bugfixes

Unconditional greylisting support is now restored in Rmilter. Headers added or removed by Rspamd are now treated by Rmilter correctly.