Rspamd proxy worker

The proxy worker is used to build multi-layered systems and to handle the Milter protocol. In brief, it provides the following functions:

Forwarding messages to the scanning layer
Direct interaction with the MTA using the Milter protocol
Performing load balancing, retransmitting, and health checks for the scanning layer
Adding encryption and/or compression to scan requests
Mirroring some traffic to a test server
Comparing results of mirrored requests
Performing message scans autonomously (self-scan mode)

The hosts option for the upstream and mirror can specify IP addresses or Unix domain sockets, as described in the upstreams documentation. If the port number is omitted, port 11333 is assumed.
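As a rough illustration of the hosts forms described above, the sketch below shows three alternative upstream definitions: one with an explicit port, one relying on the default port, and one using a Unix domain socket. The upstream names, the address 192.0.2.10 (a documentation-only address) and the socket path are assumptions made for this example, not defaults; a real configuration would normally pick just one form.

```ucl
# local.d/worker-proxy.inc (illustrative sketch; names, address and path are made up)
upstream "explicit_port" {
  hosts = "192.0.2.10:11333"; # IP address with an explicit port
}

upstream "default_port" {
  hosts = "192.0.2.10"; # port omitted, so 11333 is assumed
}

upstream "unix_socket" {
  hosts = "/run/rspamd/scan.sock"; # Unix domain socket (hypothetical path)
}
```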

For a full list of options, run rspamadm confighelp workers.rspamd_proxy.

Default configuration

The proxy worker’s most widely useful feature is its ability to communicate using the Milter protocol, and the default configuration is designed with this in mind. By default, the proxy worker is enabled and listening on localhost:11332 in milter mode, with localhost configured as an upstream (refer to $CONFDIR/worker-proxy.inc).

This means that users who require Milter protocol support in their installations can use it straight out of the box.

For users who do not need Milter support, it’s generally more efficient to use normal workers directly and disable the proxy worker to save resources.
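A minimal sketch of disabling the proxy worker, assuming the same enabled option and local.d override file that this page uses for the normal worker:

```ucl
# local.d/worker-proxy.inc
enabled = false; # available since Rspamd 1.6.2; older versions can use count = 0;
```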

Milter support

Since Rspamd 1.6, the proxy worker supports the Milter protocol, which is used by popular MTAs such as Postfix and Sendmail. This feature makes the Rmilter project obsolete in favor of the new integration method.

To enable Milter mode, use the milter boolean worker option. When enabled, the proxy communicates exclusively in the Milter protocol. If disabled, the proxy can be used with Rspamd’s native HTTP protocol and the legacy protocol used by Exim.
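For illustration, the option can be set in the proxy worker's local override file. This is only a sketch; the stock configuration described above already enables milter mode, so an explicit setting is mostly useful when switching it off.

```ucl
# local.d/worker-proxy.inc
milter = yes; # speak Milter to the MTA; set to no to use the HTTP and legacy protocols instead
```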

Milter support is available only in the rspamd_proxy worker. There are two ways to use the Milter protocol:

Proxy mode (for large instances) with a dedicated scan layer
Self-scan mode (for small instances)

If your setup doesn’t allow your MTA to reject emails, you can set discard_on_reject (available from version 1.6.2 onwards) to true to discard spam emails.
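A short sketch of that setting, assuming it is placed in the same override file as the other proxy worker options:

```ucl
# local.d/worker-proxy.inc
discard_on_reject = true; # discard the message instead of rejecting it (Rspamd 1.6.2+)
```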

Self-scan mode

In this mode, the rspamd_proxy worker scans messages independently and communicates directly with the MTA using the Milter protocol. The advantage of this mode is its simplicity. Below is a sample configuration for this mode:

# local.d/worker-proxy.inc
upstream "local" {
  self_scan = yes; # Enable self-scan
}

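# By default the proxy worker listens on localhost:11332 (see "Default configuration");
# uncomment and adjust the line below to bind to a different address
#bind_socket = localhost:11332;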

You can also disable¹ the normal worker to free up system resources, as it is not needed in self-scan mode:

# local.d/worker-normal.inc
enabled = false;

However, this has a drawback: because rspamc uses the normal worker by default, you need to point it explicitly to the controller worker port (11334)²:

rspamc -h rspamd.example.org:11334 input-file

1. The enabled option is available for workers since Rspamd 1.6.2; in earlier versions, use count = 0; instead.

2. When connecting to a local IP address, rspamc uses the controller port by default (1.7+).

Proxy mode

In this mode, a dedicated layer of Rspamd scanners is used, with load balancing and optional encryption and/or compression. The exact configuration depends on your setup. Below is a concise example of proxy mode with four scanners, two of which have more resources and are set up to handle a higher volume of requests. The local upstream is also disabled:

# local.d/worker-proxy.inc
upstream "local" {
  disabled = true;
}

upstream "scan" {
  default = yes;
  hosts = "round-robin:host1:11333:10,host2:11333:10,host3:11333:5,host4:11333:5";
  key = "..."; # Public key for encryption, generated by rspamadm keypair (optional)
  compression = yes; # Use zstd compression (optional)
}
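For encryption to work, the scanners need the key pair that matches the public key given to the proxy. The fragment below is a sketch of the scanner side, assuming the scanners run normal workers configured through local.d/worker-normal.inc and that the key pair comes from rspamadm keypair. The key values are placeholders, and the bind address is an assumption about how the scanners are made reachable from the proxy host.

```ucl
# local.d/worker-normal.inc on each scanner (sketch; key values are placeholders)
bind_socket = "*:11333"; # assumption: listen on an address the proxy can reach

keypair {
  pubkey = "...";  # the same public key that the proxy uses in its key option
  privkey = "..."; # the private half, kept only on the scanners
}
```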

Mirroring

The proxy can be utilized for testing purposes, including:

evaluating new versions of Rspamd
testing new plugins
validating new rules
experimenting with configuration changes
assessing ML models

In this mode, Rspamd mirrors a portion of its traffic to a test cluster. Scan results from the test cluster are ignored when replying to clients, but optional comparison scripts can be run to evaluate the mirrored results. Below is a sample configuration for this setup; milter mode is not used in this example:

# local.d/worker-proxy.inc
# Main scan layer
upstream "scan" {
  default = yes;
  hosts = "round-robin:host1:11333:10,host2:11333:10,host3:11333:5,host4:11333:5";
  key = "..."; # Public key for encryption, generated by rspamadm keypair
  compression = yes; # Use zstd compression
}

mirror "test" {
  hosts = "test:11333";
  probability = 0.1; # Mirror 10% of traffic
  key = "..."; # Public key for encryption, generated by rspamadm keypair
  compression = yes; # Use zstd compression
}

Compare scripts

Comparison scripts perform simple actions on the results obtained from a mirror and from the main cluster. These scripts do not support asynchronous requests, so your options are limited to logging or writing to files. Below is a basic example of such a script in the configuration:

# local.d/worker-proxy.inc
script =<<EOD
return function(results)
  local log = require "rspamd_logger"

  -- Log the score of every entry that is a result table;
  -- any other value is logged as an error
  for k,v in pairs(results) do
    if type(v) == 'table' then
      log.infox("%s: %s", k, v['score'])
    else
      log.infox("err: %s: %s", k, v)
    end
  end
end
EOD;
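As a further sketch of what such a script might do, the variant below logs only when the scores returned by the different clusters diverge noticeably. It relies solely on what the example above shows (entries that are tables carry a score field). The 1.0 threshold and the log message are arbitrary choices for illustration, not recommended values.

```ucl
# local.d/worker-proxy.inc (illustrative sketch; threshold is an assumption)
script = <<EOD
return function(results)
  local log = require "rspamd_logger"
  local lowest, highest

  -- Track the lowest and highest numeric score among the result tables
  for _, v in pairs(results) do
    if type(v) == 'table' and type(v['score']) == 'number' then
      local s = v['score']
      if not lowest or s < lowest then lowest = s end
      if not highest or s > highest then highest = s end
    end
  end

  -- Log only when the results disagree by more than the chosen threshold
  if lowest and highest - lowest > 1.0 then
    log.infox("score spread between clusters: %s", highest - lowest)
  end
end
EOD;
```

Because compare scripts cannot make asynchronous requests, the sketch limits itself to logging, in line with the restriction described above.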