2018-03-12 00:00:00 +0000
We are proud to announce the next major version of Rspamd today: 1.7.0. This version includes more than 1000 changes since the Rspamd 1.6 branch. The most significant changes are:
- Better machine learning support and embedding of Lua Torch
- Language detection support based on n-grams and heuristics
- New rspamadm configwizard command for a simple configuration setup
- New statistics model for the Redis backend allowing expiration and better analytics
- Improved wizards for statistics conversion and management
- Added automatic corpus test and rescoring utility based on Google Summer of Code 2017 project completed by @cpragadeesh
- New Elasticsearch plugin
- New experimental Reputation plugin
- Various other important improvements and fixes
Release cycle model change
From this version onwards, Rspamd will no longer have stable and experimental branches. All development will be concentrated in the main branch with more frequent releases of both minor and major versions. A major version change (e.g. from 1.7 to 1.8) will mean some important change with backward compatibility or a clear conversion path. Minor releases (e.g. 1.7.0 -> 1.7.1) will be released as soon as there are enough important changes. Any critical fix will cause a new version to be released.
We have decided to eliminate the concept of stable branches since it makes the processes of development and migration more complicated for both developers and Rspamd users. We will concentrate on stability and backward compatibility of the main branch instead.
Migration notes
Please read the migration notes. We believe that no explicit configuration changes are required to upgrade from Rspamd 1.6.6.
New features in Rspamd 1.7
Here is a brief description of the new features that appeared in Rspamd 1.7. The full list is available on the changes page.
Better machine learning support
Rspamd now bundles Torch support, which is enabled by default. This framework is one of the principal Machine Learning frameworks implemented in Lua. We have decided to bundle it directly in Rspamd due to poor support for packaged versions of Torch in the vast majority of Linux and Unix distributions. It is currently available for the Intel x86_64 architecture only. Rspamd includes the following components of this framework:
- basic Tensors and math framework (torch7)
- neural networks support (nn)
- optimization algorithms package (optim)
- random forests support
So far, Torch is used in the neural network module to improve the neural network model and the speed of processing. This module has been improved to support clustered configurations, mail flows separation, and multiple neural networks.
Torch also empowers a new rescore utility described later in this article.
You can use Torch to build your own Machine Learning models to improve the quality of your spam filtering rules.
Language detection support
Rspamd 1.7 includes new language detection support. It uses an n-gram model to support more than 50 languages. Rspamd implements a fast and sophisticated algorithm that detects the language of a text using unicode properties, n-grams and statistical methods to provide more precise language detection. This information can be used for training models (e.g. word2vec embedding), better Bayes classification (e.g. by removing stop words), or by individual rules.
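To illustrate the general idea of n-gram based detection, here is a minimal Python sketch. It is not Rspamd's implementation: the three sample sentences, the use of character trigrams and the cosine similarity measure are all assumptions chosen for brevity, and a real detector would rely on large corpora and additional heuristics.

```python
from collections import Counter

def trigrams(text):
    """Return a Counter of character trigrams for the lowercased text."""
    cleaned = " " + " ".join(text.lower().split()) + " "
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))

def similarity(a, b):
    """Cosine similarity between two trigram Counters."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sum(c * c for c in a.values()) ** 0.5
    norm_b = sum(c * c for c in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical, tiny training samples (assumption for illustration only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
    "fr": "le renard brun rapide saute par dessus le chien paresseux et le chat",
}
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}

def detect_language(text):
    """Pick the language whose trigram profile is closest to the text."""
    probe = trigrams(text)
    return max(PROFILES, key=lambda lang: similarity(probe, PROFILES[lang]))
```

Calling detect_language on a short phrase returns the key of the closest profile, which could then be attached to a message as a hint for other rules.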
New rspamadm configwizard command
This command is intended to simplify Rspamd configuration and migration. It provides an interactive console UI to set up the most commonly used Rspamd functions that are not necessarily configured out of the box.
It can also be used at any stage to adjust the configuration:
- DKIM signing setup: rspamadm configwizard dkim
- Controller password: rspamadm configwizard controller
- Statistics tools: rspamadm configwizard statistic
- Redis setup: rspamadm configwizard redis
New model for Redis backend of statistical data
Rspamd has switched from the old layout of Redis storage, where tokens were stored in two large hash tables (RSBAYES_SPAM and RSBAYES_HAM), to a model where each token is stored separately under RS_<token_id> with two buckets: S for spam hits and H for ham hits. The new model requires more space in Redis; however, it allows meaningless or infrequently used tokens to be expired efficiently, reducing storage requirements. A new, explicitly enabled plugin called bayes_expire provides intelligent renewal and eviction of statistical tokens.
It is also possible to store token values inside statistical buckets for debugging and analytics purposes.
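The per-token layout can be pictured with a small in-memory model. The Python sketch below is only an illustration under stated assumptions: the TokenStore class, its fixed TTL and the "renew on every update" policy are hypothetical and do not describe how bayes_expire actually decides what to renew or evict. Only the RS_<token_id> key shape and the S and H buckets come from the description above.

```python
import time

class TokenStore:
    """In-memory stand-in for Redis: one small hash per token plus a TTL."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.hashes = {}    # key -> {"S": spam hits, "H": ham hits}
        self.expires = {}   # key -> absolute expiry time

    def learn(self, token_id, is_spam, now=None):
        """Increment the spam or ham bucket of a token and renew its TTL."""
        now = time.time() if now is None else now
        key = f"RS_{token_id}"
        bucket = self.hashes.setdefault(key, {"S": 0, "H": 0})
        bucket["S" if is_spam else "H"] += 1
        self.expires[key] = now + self.ttl  # assumption: renew on every update

    def evict_expired(self, now=None):
        """Drop tokens whose TTL has passed; return how many were removed."""
        now = time.time() if now is None else now
        dead = [key for key, when in self.expires.items() if when <= now]
        for key in dead:
            del self.hashes[key]
            del self.expires[key]
        return len(dead)
```

Because every token has its own key, a rarely seen token can disappear on its own schedule, which is not possible when all tokens live in two shared hash tables.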
You can convert old statistics in Redis (or sqlite) to the new format using the rspamadm configwizard statistic command.
The automatic corpus test and rescoring utility started as a Google Summer of Code project and was completed by @cpragadeesh in 2017. It allows Rspamd to run against a pre-labeled corpus of spam and ham (the rspamadm corpus_test tool) and then analyze the anonymous logs produced by this command to find the best possible scores for Rspamd rules (rspamadm rescore). Here is a sample run of this command:
$ rspamadm rescore -l ./rescore_logs -o scores.ucl -d -i 10 --penalty-weight=0 --momentum=0.1 --learning-rate=0.01 --l1=0.0001
...
We plan to improve our scores using this tool and distribute updates using the rspamd_update plugin.
Elasticsearch plugin
Rspamd now includes an Elasticsearch plugin that integrates with the Elasticsearch engine and Kibana dashboards to visualise Rspamd data. Its features include a symbols heat map, the geographical distribution of spam/ham sources, proportions of scanned traffic and so on. Together, the Elasticsearch and Clickhouse plugins provide a decent platform to collect analytics about your mail flows and spam filter efficiency.
Reputation plugin
Rspamd 1.7 includes an experimental reputation plugin. It is currently a work in progress; however, we have a working prototype with the following features:
- DKIM domain, URL domains and IP reputation support
- Flexible backend support: both Redis for private usage and DNS for public services
- Flexible time buckets: long term buckets and short term buckets
- Flexible aggregation tool (weighted aggregation, decision trees and so on)
This plugin is intended to replace the original ip score plugin and will provide many more reputation types (e.g. URL and DKIM reputation). It is also possible to build systems that combine public reputation data provided via DNS with internal reputation data stored in Redis buckets. This could be particularly beneficial for large email service providers.
In this version, the plugin is still at an experimental stage, but it is close to production testing.
7Zip support
Rspamd can now detect and process data from 7zip files. This functionality lives within the mime types module and allows filtering of malicious files in 7zip attachments.
Various improvements and changes
In conclusion, we can add that this version of Rspamd includes a lot of improvements in stability, performance and quality of filtering areas. You can take a look at the changes page to get the full changelog.
2017-06-12 00:00:00 +0000
Today, we release the new major version 1.6.0 of Rspamd. The most significant change in this version is the addition of Milter protocol support in Rspamd. Therefore, the Rmilter project is now finally abandoned and should not be used in new installations. All Rmilter users should consider migrating to Rspamd milter support. This release has some incompatible changes, so please check the migration guide.
Here is the list of most noticeable changes. The full list is available on the changes page.
Milter protocol support
From Rspamd 1.6, the rspamd proxy worker supports the milter protocol, which is supported by some of the popular MTAs, such as Postfix or Sendmail. The introduction of this feature also finally obsoletes the Rmilter project in favor of the new integration method. Milter support is present in rspamd_proxy only; however, there are two ways to use the milter protocol:
- Proxy mode (for large instances) with a dedicated scan layer
- Self-scan mode (for small instances)
Here, we describe the simplest option, self-scan mode.
In this mode, rspamd_proxy scans messages itself and talks to the MTA directly using the Milter protocol. The advantage of this mode is its simplicity. Here is a sample configuration for this mode:
# local.d/worker-proxy.inc
milter = yes; # Enable milter mode
timeout = 120s; # Needed for Milter usually
upstream "local" {
  default = yes; # Self-scan upstreams are always default
  self_scan = yes; # Enable self-scan
}
For more advanced proxy usage, please see the corresponding documentation.
ARC support added
There is full support for ARC signatures and seals for emails scanned in Rspamd 1.6.0. ARC signatures can establish that a specific message has been signed and then forwarded by a number of trusted relays. There is a good overview of the ARC standard here: https://dmarc.org/presentations/ARC-Overview-2016Q2-v03.pdf.
The Rspamd ARC module supports both verification and signing for outbound messages. Its configuration is very similar to that of the dkim_signing module.
New statistics model for Redis storage
Rspamd 1.6 includes experimental support for a new token storage scheme in Redis. In this scheme, it is easier to get data about specific tokens and to expire tokens. This support is not enabled by default in this release, but you can try it yourself, along with the Bayes expiration plugin. In future releases, this model will be the default and you will be able to convert the existing storage to the new scheme without data loss.
New expiration algorithm for internal caches
Rspamd now has an implementation of the Least Frequently Used (LFU) algorithm instead of the classic Least Recently Used (LRU) algorithm used before. The idea comes from the Redis server, where it has been used for a long time. With this algorithm, Rspamd will cache frequently used items for a longer time and the overall performance of the caches is expected to be better.
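The difference between the two policies can be shown with a toy cache. The Python sketch below is a simplified illustration, not Rspamd's cache: the LFUCache class is hypothetical and keeps exact hit counters, whereas a production implementation would typically use something cheaper and let counters decay over time.

```python
class LFUCache:
    """Toy cache that evicts the least frequently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.hits = {}   # key -> number of accesses (insert counts as one)

    def get(self, key):
        """Return the cached value (or None) and count the access."""
        if key not in self.values:
            return None
        self.hits[key] += 1
        return self.values[key]

    def put(self, key, value):
        """Store a value, evicting the least used key if there is no room."""
        if self.capacity <= 0:
            return
        if key not in self.values and len(self.values) >= self.capacity:
            # Ties are broken in favour of the oldest inserted key.
            victim = min(self.hits, key=self.hits.get)
            del self.values[victim]
            del self.hits[victim]
        self.values[key] = value
        self.hits[key] = self.hits.get(key, 0) + 1
```

For example, with a capacity of two, inserting "a", reading it twice, then inserting "b" and "c" evicts "b": an LRU cache would have dropped "a" instead, even though it is the most popular item.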
DMARC reports support
The DMARC module now supports sending reports (using an SMTP smarthost) for specific domains and policies. There are many options available for fine tuning the content, frequency and domains of these reports, among others. DMARC reports are intended to give resources that use DMARC (e.g. paypal.com) better feedback from their recipients. Namely, they can detect phishing trends and react to them.
Spam trap plugin
A new spam trap plugin has been added to Rspamd to simplify the organization of spamtraps. This plugin allows fuzzy storages and/or Bayes to be trained from honeypots.
URL redirector improvements
There are various changes in the url redirector module. Namely, it now expires processing items more aggressively to avoid leftovers. Some dependency issues have also been resolved. Furthermore, this plugin now has a list of top redirection destinations, which helps to deal with some bad URLs exploited by spammers.
Multiple metrics support has been removed from Rspamd
From version 1.6, multiple metrics support is completely removed from Rspamd. The only valid metric is now default. This feature has never been used since version 0.2 of Rspamd; however, it consumed some resources and introduced extra complications to the protocol and configuration.
Hence, this feature has been removed and the new endpoint /checkv2 has been added to the protocol. The legacy /check and /symbols endpoints still use the old protocol definition and will be kept for backward compatibility in the future.
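As a rough illustration of calling the new endpoint from a client, the Python sketch below builds an HTTP request for /checkv2 without sending it. The helper name build_checkv2_request is hypothetical, and the host and port defaults are assumptions that must be adjusted to wherever your scanning worker listens; the format of the reply is not covered here.

```python
import urllib.request

def build_checkv2_request(message: bytes, host: str = "localhost", port: int = 11333):
    """Build (but do not send) a scan request for the /checkv2 endpoint."""
    return urllib.request.Request(
        url=f"http://{host}:{port}/checkv2",  # host and port are assumptions
        data=message,                         # the raw message is the request body
        method="POST",
    )

request = build_checkv2_request(b"From: a@example.com\r\n\r\nHello\r\n")
# To actually scan, pass the request to urllib.request.urlopen() and read the reply.
```

Keeping request construction separate from sending makes it easy to inspect or log what would be submitted before pointing the client at a live instance.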
Compression support in proxy
Rspamd proxy now supports transport compression when sending messages to the scanning layer.
Here is a list of significant performance related changes:
- Hfilter regular expressions now can use hyperscan
- DKIM body hash is now cached to improve multiple signatures support
- Snowball stemmers are also cached for better performance
Miscellaneous
Here is a list of other changes made in this release:
- Various rule fixes (FORWARDED, URI_COUNT_ODD and others)
- Bugfixes and other improvements
- New Lua API functions
2017-05-15 00:00:00 +0000
Synopsis
We have migrated the hardware that served the https://rspamd.com site and related services, including fuzzy storage.
Problem description
All Rspamd users who are using the rspamd.com fuzzy storage might see the following messages in the log:
fuzzy_check_timer_callback: got IO timeout with server rspamd.com(5.9.155.182), after 3 retransmits
Normally, Rspamd re-resolves hostnames in this case. However, if there is a single server specified (as is the case by default), hostnames are not re-resolved on errors. Unfortunately, this bug has so far been fixed only in the master branch and the fix is not yet released in the stable versions.
Potential outcome
The quality of filtering might be temporarily reduced as fuzzy storage helps to filter certain spam types.
Workaround
You just need to restart Rspamd and it will use the new IP address as intended. We do apologise for any inconveniences caused.
2017-03-17 00:00:00 +0000
We have released the new stable version of Rspamd today. It includes a couple of important fixes and improvements. Here is the list of the most important ones.
Base64 decoding fix
We have found and resolved a serious flaw in the current base64 decoder in Rspamd. It could produce corrupted output when the decoder encountered non-base64 characters, for example, spaces or newlines. This bug could affect statistics, fuzzy checks and a couple of other fields in Rspamd. Hence, we recommend updating to 1.5.3 as soon as possible.
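To show what tolerant decoding of such input looks like, here is a small Python sketch. It is not the Rspamd decoder: decode_lenient is a hypothetical helper that simply drops every byte outside the base64 alphabet before decoding, which is one possible way to handle spaces and newlines inside encoded data.

```python
import base64
import re

_NON_BASE64 = re.compile(rb"[^A-Za-z0-9+/]")

def decode_lenient(data: bytes) -> bytes:
    """Decode base64 while ignoring whitespace and other stray bytes."""
    cleaned = _NON_BASE64.sub(b"", data)    # also drops existing '=' padding
    if len(cleaned) % 4 == 1:
        cleaned = cleaned[:-1]              # a lone trailing character is undecodable
    cleaned += b"=" * (-len(cleaned) % 4)   # restore padding to a multiple of 4
    return base64.b64decode(cleaned)
```

Note that this sketch treats the input as a single encoded stream; input made of several independently padded chunks would need to be split first.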
Redis history
This release includes an experimental feature that allows history to be saved in Redis. There is initial WebUI support for this feature; however, it is not yet enabled by default. In the future, we plan to enable it and to enhance history with a set of new options:
- displaying of sender and recipient in history table;
- support of symbols options;
- clustered history;
- dynamic load of history rows;
- compressed history;
All these features are implemented for the backend part (namely, the Rspamd controller) but they still require some major rework of the web interface itself; therefore, this work is postponed until the next version.
DKIM plugin improvements
The DKIM signing module now supports several types of private key passed to the module: in addition to PEM format stored in a file, DKIM signing now supports raw keys, base64 encoded keys and PEM keys from raw strings.
DKIM signing now also supports maps for selecting domains to sign.
Other plugins improvements
- greylist plugin now supports excluding low-scoring messages from greylisting
- whitelist plugin can now load list of maps
- ratelimit plugin now excludes greylisted messages
- metadata exporter uses rule-specific settings for emails
- metadata exporter can now use non-ASCII characters in reports
Rules update
Here is the list of rules that are fixed or reworked:
- URI_COUNT_ODD rule now excludes visual URLs which reduces its FP rate
- RCPT_COUNT* and HAS_X_PRIO* rules are reworked to the normal Rspamd symbols conventions
- misc.lua has been split to multiple modules that share the common rules
Other bug fixes
- imported important fixes for ac-trie module
- fixed local networks proxying
- fixed memory corruption in periodic tasks during worker cleanup phase
- fixed subject rewriting
- improved zstd lua API to avoid extra reallocation
2017-03-01 00:00:00 +0000
We are pleased to announce the new major Rspamd release 1.5 today. This release includes a lot of major reworks, new cool features and a significant number of bug fixes. The update from the previous versions shouldn’t be hard; however, please check the migration document to be sure that the new version will not break the existing configuration.
Here is a list of major changes for this version.
New MIME parser
Rspamd has used the GMime library for a very long time but we decided to switch away from it for several reasons.
The main problem is that Rspamd requires very precise control of MIME parsing, as it has to deal with broken messages not for displaying purposes but for extracting data from them. This procedure has some simplifications and some complications compared to a generic MIME parser such as GMime: for example, we do not need to support streaming mode but we have to deal with many non-standard messages that are deliberately crafted by an adversary, e.g. spammers, to be parsed incorrectly. The current architecture is described here: https://gist.github.com/vstakhov/937f253d5935ee4158688932589b1dcc
Through use of the new parser, Rspamd can now deal with the following messages:
- Messages with redundant Content-Type headers:
Content-Type: text/plain
Content-Type: multipart/alternative
Currently, Rspamd always prefers multipart types over plain types and text types unless there is not specific binary type (e.g. if there is text/plain and application/octet-stream)
- Messages with broken multipart structures:
- new parts after closing boundary (e.g. attachment in multipart/mixed after the closing part)
- incorrect inheritance
- incorrect multipart type (now Rspamd just ignores the exact multipart/* type)
- Filenames that are badly encoded (non-utf8)
- Incorrect Content-Transfer-Encoding (now heuristic based):
- 8bit when content is Base64
- base64 or qp when content is 8bit
- Bad Content-Type, e.g. text
- Messages with no headers or messages with no body
- Messages with mixed newlines in headers and/or body
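The heuristic handling of an incorrect Content-Transfer-Encoding from the list above can be sketched as follows. This Python example is purely illustrative and is not the logic used by the Rspamd parser: both functions, the 16-character minimum and the "no spaces" rule are assumptions made to keep the sketch short.

```python
import re

_BASE64_RUN = re.compile(rb"[A-Za-z0-9+/]+={0,2}")

def looks_like_base64(body: bytes) -> bool:
    """Heuristic: space-free lines made only of base64 characters."""
    lines = [line for line in body.splitlines() if line.strip()]
    if not lines or any(b" " in line for line in lines):
        return False
    compact = b"".join(line.strip() for line in lines)
    return (len(compact) >= 16 and len(compact) % 4 == 0
            and _BASE64_RUN.fullmatch(compact) is not None)

def guess_transfer_encoding(declared: str, body: bytes) -> str:
    """Return the encoding the body appears to use, given the declared one."""
    declared = declared.strip().lower()
    has_8bit = any(byte > 0x7F for byte in body)
    if declared in ("base64", "quoted-printable") and has_8bit:
        return "8bit"      # declared base64/qp, but raw 8bit bytes are present
    if declared in ("", "7bit", "8bit") and looks_like_base64(body):
        return "base64"    # declared plain, but the body looks encoded
    return declared or "7bit"
```

A parser built this way trusts the content over the header when the two clearly disagree, which is the behaviour a filter needs when headers are deliberately misleading.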
Switching from libiconv to libicu
Rspamd has switched charset conversion from libiconv to libicu. This speeds up conversion since libicu is much faster (~100MB of text from windows-1251 to utf-8):
0,83s user 0,08s system 98% cpu 0,921 total - iconv
0,36s user 0,07s system 95% cpu 0,450 total - libicu
Furthermore, switching to libicu allowed for implementation of many useful features:
- heuristic charset detection (n-grams for single-byte charsets);
- visual obfuscation detector (e.g. google.com -> gооgle.com)
- better IDNA processing
- better unicode manipulation
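One simple way to spot visual obfuscation of the kind shown above is to check whether a hostname label mixes letters from different scripts. The Python sketch below uses the first word of each character's Unicode name as a coarse stand-in for its script. This is an assumption made for illustration and is not how Rspamd or libicu implement the check; legitimate names that mix scripts would also be flagged.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Collect a coarse script tag for every letter in the label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Unicode names start with the script, e.g. "CYRILLIC SMALL LETTER O".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_obfuscated(hostname: str) -> bool:
    """Flag hostnames where a single label mixes letters of several scripts."""
    return any(len(scripts_in(label)) > 1 for label in hostname.split("."))
```

In the example from the list, the second hostname contains two Cyrillic letters among Latin ones, so its first label yields two script tags and is flagged.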
WebUI rework
The Web interface has been reworked for better representation and configuration:
- The web interface now supports displaying & aggregating statistics from a cluster of Rspamd machines
- The internal structure of the Web Interface has changed to a set of modules so that new features could be implemented without touching the overall logic
- The throughput graph has been improved and now displays a small pie chart for the specified time range
Lua TCP module rework
In Rspamd 1.5, the Lua TCP module now supports complex protocols with dialogs and states, similar to the AnyEvent module in Perl. For example, it is now possible to set a reaction for each communication stage and perform a full SMTP or IMAP dialog.
URL redirector module
URL shorteners and redirectors are part of the modern email ecosystem and they are widely used in many emails, both legitimate and not (e.g. in spam and phishing). Rspamd has an old and outdated utility service intended to resolve such redirects, called redirector.pl. It is written in Perl and hasn’t been updated for a long time. It has a long list of dependencies and performs a lot of unnecessary tasks. In Rspamd 1.5, there is a new lightweight Lua redirector module which is intended to resolve URL redirects in a more efficient and simple way. Dereferenced links are processed by the SURBL module and added as tags for other modules. Redis is used for caching. This module is not enabled by default so far, but it can easily be enabled by placing redirector_hosts_map = "/etc/rspamd/redirectors.inc"; in /etc/rspamd/local.d/surbl.conf.
The Rmilter headers module provides an easy way to add common headers; support is available for Authentication-Results, SpamAssassin-compatible headers and user-defined headers among others.
DKIM signing module
The DKIM signing module provides a simple policy-based approach to DKIM signing similar to Rmilter. It supports multiple cool features, for example, you can now store your DKIM keys in Redis.
Force actions module
The Force actions module provides a way to force actions for messages based on flexible conditions (an expression consisting of symbols to verify presence/absence of & the already-assigned action of a message), optionally setting SMTP messages & rewritten subjects.
Configuration of this module has been reworked to provide more flexible operation & library functions have been added to provide JSON-formatted general message metadata, e-Mail alerts and more - making this module readily useful for quarantines, logging & alerting.
URLs can now be assigned tags and it is the job of the URL tags plugin to persist these in Redis for a period of time; which could be used to avoid redundant checks.
URL reputation plugin
The URL reputation plugin filters URLs for relevance and assigns dynamic reputation to selected TLDs which is persisted in Redis.
Multimap ‘received’ maps
Now multimap can be used to match information extracted from Received headers (which could be filtered based on their position in the message). It is also possible to use SMTP HELO messages in maps for this module. There are also new URL filters, SMTP message setup depending on map data and the ability to skip archives checks for certain filetypes or maps.
Changes in RBL module
Support has been added for using hashes in email and helo RBLs (so that information which can’t be represented in a DNS record could be queried).
Support for Avira SAVAPI in antivirus module
Rspamd antivirus module now supports AVIRA antivirus. This code has been contributed by Christian Rößner.
Neural net plugin improvements
We have fixed a couple of issues in the neural network plugin, making it possible to have multiple configurations in the cluster. We have also fixed a couple of issues with storing and loading of learning vectors, especially in error handling paths. New metatokens have been added to improve neural network classification quality.
Fuzzy matching for images
Rspamd fuzzy hashing now supports matching of images attached to the emails being checked. To enable this feature, Rspamd should be built with libgd support (provided by the pre-built packages). However, this feature is not currently enabled by default as it seems to be too aggressive when used in conjunction with large fuzzy storages, producing a lot of false positive hits.
New rules
There are a couple of new rules added to Rspamd 1.5:
- OMOGRAPH_URL: detects visually confusable URLs
- FROM_NAME_HAS_TITLE: fixed title match
- Add REPLYTO_EMAIL_HAS_TITLE rule
- Add FROM_NAME_EXCESS_SPACE rule
Rspamadm grep
A grep-like tool inspired by exigrep has been added to rspamadm (see rspamadm grep --help for usage information): this provides a convenient way to produce logically collated logs based on search strings/regular expressions.
There are a number of improvements regarding the performance of processing:
- Base64 decoder now has sse4.2 and avx backends
- Better internal caching of various ‘heavy’ objects
- Switching to a faster hash function t1ha
- Enabled link time optimizations for the pre-built packages
- Bundled LuaJIT 2.1, which brings significant performance improvements, in the provided Debian packages
Stability improvements and bug fixes
We constantly improve the stability of Rspamd, and in this version we have fixed a number of issues related to the graceful reload. Historically, this command has had very poor support, and there were a number of issues related to memory leaks and corruption that could occur during reload. In this release, we have fixed a lot of such issues; therefore, you can use reload more safely now. We have also eliminated various issues related to unicode processing, the Lua API, signal race conditions and other important problems found by Rspamd users.