Rspamd 1.7.9 has been released

2018-08-01 00:00:00 +0000

We have released Rspamd 1.7.9 today. To the best of our knowledge, this version introduces no incompatible changes.

The most important features and fixes

  • Ratelimits are reworked and now work as intended (and documented)
  • Clickhouse module supports data retention policies
  • Reworked C modules to avoid global contexts (simplifies leak detection on reload)
  • Reputation plugin now supports SPF records reputation
  • WebUI code is now even more conformant to modern JS standards
  • Maps are now distributed remotely, with a local file as a safety fallback, so that maps can be updated faster without waiting for a new release
  • Antivirus module checks only attachments (as decoded content) in attachments_only mode, which improves AV performance by hiding the MIME content from the scanners

Full list of the meaningful changes

  • [CritFix] Fix caseless comparison of equal length strings
  • [Feature] Add HTTP basic auth support to elastic and clickhouse plugins
  • [Feature] Add SPF selector to reputation
  • [Feature] Add support of the fallback backends for HTTP maps
  • [Feature] Allow to print full mime structure when extracting mime data
  • [Feature] Allow to split symbols in reputation plugin
  • [Feature] Check attachments only on AV scanners in attachments_only mode
  • [Feature] Disable all SSL checks if ssl_no_verify flag is set
  • [Feature] Implement parsing of scoped IPv6 addresses
  • [Feature] Improve rspamc counters output
  • [Fix] Add sanity checks when expanding SPF macros
  • [Fix] Allow to parse SA rules with no spaces around =~ (dirty hack)
  • [Fix] Avoid one extra byte writing
  • [Fix] Deal with direct hash table
  • [Fix] Detect empty text part as text, not HTML
  • [Fix] Do not reduce map watch timeout for mixed http/file maps
  • [Fix] Fix HTML part detection heuristic
  • [Fix] Fix double free in redirectors cleanup
  • [Fix] Fix legacy history handling in the controller
  • [Fix] Fix messages insertion
  • [Fix] Fix sending string method
  • [Fix] Fix statconvert command line arguments
  • [Fix] Fixed argument checking for being null
  • [Fix] Fixed issues reported by luacheck
  • [Fix] Freeze updates queue when do actual storage update
  • [Fix] HTTP map hash is per-backend and not per-map
  • [Fix] Plug memory leak in fuzzy updates
  • [Fix] Prefer ‘MTA-Name’ when producing authentication results
  • [Fix] Replace bad unicode sequences instead of stopping on them
  • [Fix] Set classifier version on learning
  • [Project] Reworked ratelimits
  • [Project] Apply topological sorting for symbols in Rspamd
  • [Project] Remove global contexts from C modules
  • [Project] Move performance critical hash tables to khash
  • [WebUI] Avoid unused indexes
  • [WebUI] Do not execute on_success callback
  • [WebUI] Fix history reset for “All SERVERS” (#2346)
  • [WebUI] Fix query URL for selected server
  • [WebUI] Fix symbols display in legacy history
  • [WebUI] Hide symbols order selector for legacy history
  • [WebUI] Refactor query functions into one
  • [WebUI] Remove previously-attached event handlers
  • [WebUI] Save symbols to the selected server
  • [WebUI] Unify arguments of query functions
  • [WebUI] Use common query functions to get graph data
  • [WebUI] Use common query functions to save symbols

Rspamd 1.7.8 has been released

2018-07-12 00:00:00 +0000

We have released Rspamd 1.7.8 today. To the best of our knowledge, this version introduces no incompatible changes.

The most important features and fixes

  • Rspamd mime tool can now show you fuzzy hashes extracted from text
  • Fuzzy hashes are now refreshed when they are hit, to prevent important hashes from expiring
  • Fuzzy updates queue is now deduplicated, which reduces the number of Redis update requests by a factor of 10 in some cases
  • HTTP maps are now cached on disk to provide preload on startup
  • WebUI code is now more conformant to the modern JS standards (special thanks to Alexander Moisseev)
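
The queue deduplication mentioned above can be pictured with a small Python sketch. This is not Rspamd's code (the fuzzy storage is written in C); the `(command, digest)` tuple shape and the function name are hypothetical, chosen only to show why collapsing repeated entries reduces the number of backend requests.

```python
def deduplicate_updates(queue):
    """Collapse repeated (command, digest) entries, keeping first-seen order."""
    seen = set()
    unique = []
    for command, digest in queue:
        key = (command, digest)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical queue: the same hash is reported several times before a flush.
pending = [
    ("add", "abc123"),
    ("add", "abc123"),
    ("refresh", "def456"),
    ("add", "abc123"),
    ("refresh", "def456"),
]
flushed = deduplicate_updates(pending)
print(len(pending), "->", len(flushed))
```

In this toy queue five pending updates shrink to two requests; the real saving depends entirely on how repetitive the traffic is.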

Full list of the meaningful changes

  • [Feature] Add more extended statistics about fuzzy updates
  • [Feature] Add more non-conformant Received headers support
  • [Feature] Add preliminary function to get fuzzy hashes from text in Lua
  • [Feature] Allow to configure AV module rejection message
  • [Feature] Implement fuzzy hashes extraction in mime tool
  • [Feature] Improve WHITE_ON_WHITE rule
  • [Feature] Improve integer -> string conversion
  • [Feature] Reuse maps in multimap module more aggressively
  • [Fix] Avoid race condition in skip map as pool lifetime is not enough
  • [Fix] Eliminate all specific C plugins pools
  • [Fix] Fix DKIM check rule if DNS is unavailable
  • [Fix] Fix build where ucontext is defined in ucontext.h
  • [Fix] Fix crash in base url handling
  • [Fix] Fix descriptors leak in sqlite3 locking code
  • [Fix] Fix messages quarantine
  • [Fix] Fix padded numbers printing
  • [Fix] Fix race condition on maps reinit
  • [Fix] Fix regexp functions when no data is passed
  • [Fix] Fix specific urls extraction
  • [Fix] Fix styles propagation
  • [Fix] Improve resetting of the limit buckets
  • [Fix] Initialize sqlite3 properly
  • [Fix] Work with broken resolvers in resolv.conf
  • [Project] Implement HTTP maps caching
  • [Project] Refresh fuzzy hashes when matched
  • [Project] Add logic to deduplicate fuzzy updates queue
  • [WebUI] Add missed declarations
  • [WebUI] Avoid using “undefined” property
  • [WebUI] Do not accept passwords containing control characters
  • [WebUI] Do not redeclare variables
  • [WebUI] Enable strict mode
  • [WebUI] Fix variable assignment
  • [WebUI] Initialize variables at declaration
  • [WebUI] Remove duplicated path from RequireJS config
  • [WebUI] Remove unused block
  • [WebUI] Remove unused variable
  • [WebUI] Remove unused variables
  • [WebUI] Use self-explanatory notation
  • [WebUI] Use type-safe equality operators

Rspamd 1.7.7 has been released

2018-07-02 00:00:00 +0000

We have released Rspamd 1.7.7 today. To the best of our knowledge, this version introduces no incompatible changes.

The most important features and fixes

  • Add rspamadm mime tool to do various email operations:
    • extract text/HTML content
    • extract statistical tokens
    • extract URLs
  • Fixed encryption mode in Rspamd proxy
  • Fixed various crashes in maps during reload
  • Preload maps data before starting the worker processes when possible
  • Better HTML styles processing: add ZeroFont exploit filtering rules
  • Fix ED25519 DKIM signatures as described by the latest RFC draft
  • Added crash reporting system via libunwind

Full list of the meaningful changes

  • [CritFix] Check NM part of pubkey to match it with rotating keypairs
  • [CritFix] Do not overwrite PID of the main process
  • [CritFix] Fix maps after reload
  • [CritFix] Fix maps race conditions on reload
  • [CritFix] Fix shmem leak in encrypting proxy mode
  • [Feature] Add a concept of ignored symbols to avoid race conditions
  • [Feature] Add ability to print bayes tokens in rspamadm mime
  • [Feature] Add method to get statistical tokens in Lua API
  • [Feature] Add preliminary mime stat command
  • [Feature] Add rspamadm mime tool
  • [Feature] Add urls extraction tool
  • [Feature] Address ZeroFont exploit
  • [Feature] Allow rspamadm mime to process multiple files
  • [Feature] Allow to extract words in rspamadm mime
  • [Feature] Allow to print mime part data
  • [Feature] Allow to show HTML structure on extraction
  • [Feature] Distinguish IP failures from connection failures
  • [Feature] Improve output for mime command
  • [Feature] Improve styles propagation
  • [Feature] Main process crash will now cleanup all children
  • [Feature] Preload file and static maps in main process
  • [Feature] Print stack trace on crash
  • [Feature] Process font size in HTML parser
  • [Feature] Propagate content length of invisible tags
  • [Feature] Read ordinary file maps in chunks to be more safe on rewrites
  • [Feature] Support base tag in HTML
  • [Feature] Support more size suffixes when parsing HTML styles
  • [Feature] Support opacity style
  • [Fix] Another fix for nested composites
  • [Fix] Fill nm id in keypairs cache code
  • [Fix] Fix colors alpha channel handling
  • [Fix] Fix destruction logic
  • [Fix] Fix double free
  • [Fix] Fix maps preload logic
  • [Fix] Fix nested composites process
  • [Fix] Fix proxying of Exim connections
  • [Fix] Fix reload crash
  • [Fix] Fix rspamadm -l command
  • [Fix] Update ed25519 signing schema
  • [WebUI] Stop using “const” declaration
  • [WebUI] Update RequireJS to 2.3.5

Rspamd 1.7.6 has been released

2018-06-15 00:00:00 +0000

We have released Rspamd 1.7.6 today. To the best of our knowledge, this version introduces no incompatible changes.

The most important features and fixes

  • Fix multiple neural networks support: it is now possible to learn multiple neural networks with different settings as documented
  • Rework rspamadm to use mostly Lua for subcommands for better documentation and extensions support
  • Add pubkey checks for dkim_signing module (#2277)
  • DMARC reports are now compressed using gzip as suggested by the RFC
  • Settings module can now skip message processing to improve performance
  • Bayes classifier now considers more metatokens from the headers
  • ED25519 DKIM signatures are now supported
  • Fixed serious issues with composites, maps and other components
  • Major memory leak hunting and elimination (especially leaks that occur during reload)
  • Add more tests and allow creating fake DNS records to make certain tests self-contained (e.g. DKIM or DMARC)

Full list of the meaningful changes

  • [CritFix] Fix multiple neural networks support
  • [Feature] Add decryption function to keypair command
  • [Feature] Add gzip compression for HTTP requests in elastic module
  • [Feature] Add gzip methods to lua util
  • [Feature] Add maps based on Top Level Domains
  • [Feature] Add pubkey checks for dkim_signing
  • [Feature] Add support of fake DNS records
  • [Feature] Add tool to encrypt files
  • [Feature] Allow to add symbols using settings directly
  • [Feature] Allow to match private and public keys for DKIM signatures
  • [Feature] Allow to set task flags via settings
  • [Feature] Allow to specify fake DNS address from the config
  • [Feature] Implement signatures verification using rspamadm keypair
  • [Feature] Implement signing using rspamadm keypair
  • [Feature] Improve error reporting for DKIM key access issues
  • [Feature] Provide $HOSTNAME variable in UCL
  • [Feature] Rework levenshtein distance computation
  • [Feature] Split message parsing and processing
  • [Feature] Support ED25519 DKIM signatures
  • [Feature] Support encrypted configs in UCL
  • [Feature] Suppress duplicate warning on very large radix tries
  • [Feature] Use OSB to combine header names
  • [Fix] Cleanup maps data on shutdown
  • [Fix] Fix ‘~’ behaviour in composites
  • [Fix] Fix HTTP maps updates
  • [Fix] Fix NIST signatures
  • [Fix] Fix RFC822 comments when processing a mime address
  • [Fix] Fix double free
  • [Fix] Fix dynamic settings application
  • [Fix] Fix for CommuniGate Pro maillist
  • [Fix] Fix keypair creation method to actually create keypair…
  • [Fix] Fix matching patterns with no paths
  • [Fix] Fix memory leak in parsing comments
  • [Fix] Fix parsing of urls with numeric password
  • [Fix] Fix plugins initialisation in configwizard
  • [Fix] Fix potential crash on reload
  • [Fix] Fix potential race condition for finished HTTP connections
  • [Fix] Fix race-condition leak on processes reload
  • [Fix] Fix signing in openssl mode
  • [Fix] Free language detector structures
  • [Fix] Relax alignment requirements
  • [Fix] Send DMARC reports compressed
  • [Fix] Try to fix leak in dmarc module
  • [Fix] Try to plug memory leak in metric exporter
  • [Project] Convert rspamadm subcommands to Lua
  • [WebUI] Display smtp sender/recipient in history
  • [WebUI] Fix elements disabling in “Symbols” tab
  • [WebUI] Limit recipients list in history column to 3
  • [WebUI] Match envelope and mime addresses following in arbitrary order
  • [WebUI] Update column header
  • [WebUI] Wrap addresses in history

Rspamd 1.7.0 has been released

2018-03-12 00:00:00 +0000

We are proud to announce the next major version of Rspamd today: 1.7.0. This version includes more than 1000 changes since the Rspamd 1.6 branch. The most significant changes are:

  • Better machine learning support and embedding of Lua Torch
  • Language detection support based on n-grams and heuristics
  • New rspamadm configwizard command for a simple configuration setup
  • New statistics model for Redis backend allowing expiration and better analytics
  • Improved wizards for statistics conversion and management
  • Added an automatic corpus test and rescoring utility based on the Google Summer of Code 2017 project completed by @cpragadeesh
  • New Elasticsearch plugin
  • New experimental Reputation plugin
  • Various other important improvements and fixes

Release cycle model change

From this version onwards, Rspamd will no longer have stable and experimental branches. All development will be concentrated in the main branch, with more frequent releases of both minor and major versions. A major version change (e.g. from 1.7 to 1.8) will mean some important change, with backward compatibility or a clear conversion path. Minor releases (e.g. 1.7.0 -> 1.7.1) will be released as soon as there are enough important changes. Any critical fix will cause a new version to be released.

We have decided to eliminate the concept of stable branches since it makes the processes of development and migration more complicated for both developers and Rspamd users. We will concentrate on stability and backward compatibility of the main branch instead.

Migration notes

Please read the migration notes. We believe that no explicit configuration changes are required to upgrade from Rspamd 1.6.6.

New features in Rspamd 1.7

Here is a brief description of the new features that appeared in Rspamd 1.7.

Better machine learning support

Rspamd now bundles Torch support, which is enabled by default. This framework is one of the principal Machine Learning frameworks implemented in Lua. We have decided to bundle it directly in Rspamd because packaged versions of Torch are poorly supported in the vast majority of Linux and Unix distributions. It is currently available for the Intel x86_64 architecture only. Rspamd includes the following components of this framework:

  • basic Tensors and math framework (torch7)
  • neural networks support (nn)
  • optimization algorithms package (optim)
  • random forests support

So far, Torch is used in the neural network module to improve the neural network model and the speed of processing. This module has been improved to support clustered configurations, mail flows separation, and multiple neural networks.

Torch also empowers a new rescore utility described later in this article.

You can use Torch to build your own Machine Learning models to improve the quality of your spam filtering rules.

Language detection support

Rspamd 1.7 includes new language detection support. It uses an n-gram model to support more than 50 languages. Rspamd implements a fast and sophisticated algorithm that combines Unicode properties, n-grams, and statistical methods to detect the language of a text more precisely. This information can be used for training models (e.g. word2vec embeddings), for better Bayes classification (e.g. by removing stop words), or by individual rules.
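
To give a feel for the general idea behind n-gram based detection, here is a minimal Python sketch. It is not Rspamd's implementation: the two tiny training samples, the function names, and the choice of cosine similarity over character trigrams are all assumptions made purely for illustration.

```python
from collections import Counter
import math


def trigrams(text):
    """Count character trigrams of the lowercased, whitespace-normalised text."""
    cleaned = " " + " ".join(text.lower().split()) + " "
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical, deliberately tiny training samples (real profiles need far more text).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat is on the table",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze ist auf dem tisch",
}
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}


def detect(text):
    """Return the language whose trigram profile is closest to the text."""
    probe = trigrams(text)
    return max(PROFILES, key=lambda lang: cosine(probe, PROFILES[lang]))


print(detect("the dog is on the table"))
print(detect("die katze und der hund"))
```

A production detector would also use Unicode script information to narrow the candidate languages before comparing n-gram statistics, as the paragraph above describes.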

New rspamadm configwizard command

This command is intended to simplify Rspamd configuration and migration. It provides an interactive console UI for setting up the most commonly used Rspamd functions that are not necessarily configured out of the box.

It can also be used at any stage to adjust the configuration:

  • DKIM signing setup: rspamadm configwizard dkim
  • Controller password: rspamadm configwizard controller
  • Statistics tools: rspamadm configwizard statistic
  • Redis setup: rspamadm configwizard redis

New model for Redis backend of statistical data

Rspamd has switched from the old layout of Redis storage, where tokens were stored in two large hash tables (RSBAYES_SPAM and RSBAYES_HAM), to a model where each token is stored separately under RS_<token_id> with two buckets: S for spam hits and H for ham hits. The new model requires more space in Redis; however, it allows meaningless or infrequently used tokens to be expired efficiently, which reduces storage requirements. A new plugin called bayes_expire, which must be enabled explicitly, provides intelligent renewal and eviction of statistical tokens.
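
The difference between the two layouts can be sketched in Python with plain dictionaries standing in for Redis. This is an illustration only: the function names are hypothetical, and the hit-count threshold is a simplified stand-in for expiry, not the actual logic of bayes_expire.

```python
def convert_layout(old_spam, old_ham, prefix="RS_"):
    """Turn two large token->hits hashes into one small S/H hash per token."""
    new_store = {}
    for token_id in set(old_spam) | set(old_ham):
        new_store[f"{prefix}{token_id}"] = {
            "S": old_spam.get(token_id, 0),  # spam hits
            "H": old_ham.get(token_id, 0),   # ham hits
        }
    return new_store


def expire_rare(store, min_hits):
    """Drop per-token keys whose total hits are below min_hits.

    With one key per token, each token can be evicted on its own, which the
    old two-hash layout did not allow.
    """
    return {key: hits for key, hits in store.items()
            if hits["S"] + hits["H"] >= min_hits}


# Old layout: everything lives in two big hashes (token ids are made up).
rsbayes_spam = {"a1": 5, "b2": 1}
rsbayes_ham = {"a1": 2, "c3": 1}

store = convert_layout(rsbayes_spam, rsbayes_ham)
print(sorted(store))
print(sorted(expire_rare(store, min_hits=2)))
```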

It is also possible to store token values inside statistical buckets for debugging and analytics purposes.

You can convert old statistics in Redis (or sqlite) to the new format using the rspamadm configwizard statistic command.

Corpus test and rescore tool

This project started as a Google Summer of Code project and was completed by @cpragadeesh in 2017. It allows Rspamd to be run against a pre-labeled corpus of spam and ham (the rspamadm corpus_test tool) and then analyzes the anonymous logs produced by that command to find the best possible scores for Rspamd rules (rspamadm rescore). Here is a sample run of this command:

$ rspamadm rescore -l ./rescore_logs -o scores.ucl -d -i 10 --penalty-weight=0 --momentum=0.1 --learning-rate=0.01 --l1=0.0001

...

We plan to improve our scores using this tool and distribute updates using rspamd_update plugin.

Elasticsearch plugin

Rspamd now provides an Elasticsearch plugin that integrates with the Elasticsearch engine and Kibana dashboards to visualise Rspamd data. Its features include a symbols heat map, geographical distribution of spam/ham sources, proportions of traffic scanned, and so on. Together, the Elasticsearch and Clickhouse plugins provide a decent platform for collecting analytics about your mail flows and spam filter efficiency.

Reputation plugin

Rspamd 1.7 includes an experimental reputation plugin. It is still a work in progress; however, we have a working prototype with the following features:

  • DKIM domain, URL domains and IP reputation support
  • Flexible backend support: both Redis for private usage and DNS for public services
  • Flexible time buckets: long term buckets and short term buckets
  • Flexible aggregation tool (weighted aggregation, decision trees and so on)

This plugin is intended to replace the original ip score plugin and will provide many more reputation types (e.g. URL and DKIM reputation). It is also possible to build systems that combine public reputation data provided via DNS with internal reputation data stored in Redis buckets. This could be particularly beneficial for large email service providers.
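
As a rough illustration of how short-term and long-term buckets might be combined by weighted aggregation, consider the following Python sketch. The score range, the weights, and the function name are assumptions for this example; the plugin's real aggregation rules are configurable and are not reproduced here.

```python
def aggregate_reputation(buckets):
    """Weighted average of per-bucket scores.

    `buckets` is a list of (score, weight) pairs. In this sketch a score lies
    in [-1, 1] (negative means bad reputation) and weights are non-negative.
    """
    total_weight = sum(weight for _, weight in buckets)
    if total_weight == 0:
        return 0.0  # no data: treat the reputation as neutral
    return sum(score * weight for score, weight in buckets) / total_weight


# Hypothetical values: a recent burst of spam outweighs a good long-term history.
short_term = (-1.0, 3)
long_term = (1.0, 1)
print(aggregate_reputation([short_term, long_term]))
```

Giving the short-term bucket a larger weight makes the result react quickly to sudden changes in sender behaviour, while the long-term bucket damps one-off spikes.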

In this version, the plugin is still at an experimental stage, but it is close to production testing.

7Zip support

Rspamd can now detect and process data from 7zip files. This functionality lives within the mime types module and makes it possible to filter malicious files in 7zip attachments.

Various improvements and changes

In conclusion, this version of Rspamd also includes many improvements in stability, performance, and filtering quality.