Module rspamd_task

This module provides routines for task manipulation in rspamd. Tasks usually represent messages being scanned, and this API provides access to elements such as headers, symbols, and metrics. Normally, task objects are passed to Lua callbacks, which can check specific properties of messages and add the corresponding symbols to the scan’s results.

Example:

rspamd_config.DATE_IN_PAST = function(task)
	if rspamd_config:get_api_version() >= 5 then
		local dm = task:get_date{format = 'message', gmt = true}
		local dt = task:get_date{format = 'connect', gmt = true}
		-- A day
		if dt - dm > 86400 then
			return true
		end
	end

	return false
end

Brief content:

Methods:

task:get_cfg()

task:get_mempool()

task:get_session()

task:get_ev_base()

task:insert_result(symbol, weight[, option1, ...])

task:set_pre_result(action, description)

task:append_message(message)

task:get_urls([need_emails])

task:has_urls([need_emails])

task:get_content()

task:get_rawbody()

task:get_emails()

task:get_text_parts()

task:get_parts()

task:get_request_header(name)

task:set_request_header(name, value)

task:get_header(name[, case_sensitive])

task:get_header_raw(name[, case_sensitive])

task:get_header_full(name[, case_sensitive])

task:get_raw_headers()

task:get_received_headers()

task:get_queue_id()

task:get_uid()

task:get_resolver()

task:inc_dns_req()

task:get_dns_req()

task:has_recipients([type])

task:get_recipients([type])

task:has_from([type])

task:get_from([type])

task:get_user()

task:get_from_ip()

task:set_from_ip(str)

task:get_client_ip()

task:get_helo()

task:get_hostname()

task:get_images()

task:get_archives()

task:get_symbol(name)

task:get_symbols()

task:get_symbols_numeric()

task:has_symbol(name)

task:get_date(type[, gmt])

task:get_message_id()

task:get_metric_score(name)

task:get_metric_action(name)

task:set_metric_score(name, score)

task:set_metric_action(name, action)

task:set_metric_subject(subject)

task:learn(is_spam[, classifier])

task:set_settings(obj)

task:get_settings()

task:lookup_settings(key)

task:get_settings_id()

task:set_rmilter_reply(obj)

task:process_re(params)

task:cache_set(key, value)

task:cache_get(key)

task:get_size()

task:set_flag(flag_name[, set])

task:has_flag(flag_name)

task:get_flags()

task:get_digest()

task:store_in_file([mode])

Methods

The module rspamd_task defines the following methods.

Method task:get_cfg()

Get configuration object for a task.

Parameters:

No parameters

Returns:

  • {rspamd_config}: configuration object for the task (see config.html)

Method task:get_mempool()

Returns the memory pool that is valid for the lifetime of the task. It is used internally by many rspamd routines.

Parameters:

No parameters

Returns:

  • {rspamd_mempool}: memory pool object

Method task:get_session()

Returns asynchronous session object that is used by many rspamd asynchronous utilities internally.

Parameters:

No parameters

Returns:

  • {rspamd_session}: session object

Method task:get_ev_base()

Returns the asynchronous event base for use in callbacks and the resolver.

Parameters:

No parameters

Returns:

  • {rspamd_ev_base}: event base

Method task:insert_result(symbol, weight[, option1, ...])

Insert the specified symbol into the task’s scanning results, assigning it the initial weight.

Parameters:

  • symbol {string}: symbol to insert
  • weight {number}: initial weight (this weight is multiplied by the metric weight)
  • options {string}: optional list of options attached to the inserted symbol

Returns:

No return

Example:

local function cb(task)
	if task:get_header('Some header') then
		task:insert_result('SOME_HEADER', 1.0, 'Got some header')
	end
end

Method task:set_pre_result(action, description)

Sets pre-result for a task. It is used in pre-filters to specify early result of the task scanned. If a pre-filter sets some result, then further processing may be skipped. For selecting action it is possible to use global table rspamd_actions or a string value:

  • reject: reject message permanently
  • add header: add spam header
  • rewrite subject: rewrite subject to spam subject
  • greylist: greylist message
  • accept or no action: whitelist message

Parameters:

  • action {rspamd_action or string}: a numeric or string action value
  • description {string}: optional description

Returns:

No return

Example:

local function cb(task)
	local gr = task:get_header('Greylist')
	if gr and gr == 'greylist' then
		task:set_pre_result(rspamd_actions['greylist'], 'Greylisting required')
	end
end

Method task:append_message(message)

Adds a message to scanning output.

Parameters:

  • message {string}: message to append

Returns:

No return

Example:

local function cb(task)
	task:append_message('Example message')
end

Method task:get_urls([need_emails])

Get all URLs found in a message.

Parameters:

  • need_emails {boolean}: if true then also return email urls

Returns:

  • {table rspamd_url}: list of all urls found

Example:

local function phishing_cb(task)
	local urls = task:get_urls();

	if urls then
		for _,url in ipairs(urls) do
			if url:is_phished() then
				return true
			end
		end
	end
	return false
end

Method task:has_urls([need_emails])

Returns ‘true’ if a task has urls listed

Parameters:

  • need_emails {boolean}: if true then also check for email urls

Returns:

  • {boolean}: true if a task has urls (urls or emails if need_emails is true)
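
As a small illustration, has_urls can serve as a cheap guard before the URL list is requested. The sketch below is not taken from rspamd itself: count_urls and stub_task are hypothetical names, and the stub only imitates a task (with plain strings instead of rspamd_url objects) so that the snippet runs in a plain Lua interpreter.

```lua
-- Count URLs, calling get_urls only when has_urls reports some.
local function count_urls(task, need_emails)
	if not task:has_urls(need_emails) then
		return 0
	end
	return #task:get_urls(need_emails)
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	urls = { 'http://example.com/' },
	emails = { 'mailto:blah@foo.com' },
}
function stub_task:has_urls(need_emails)
	return #self.urls > 0 or (need_emails and #self.emails > 0) or false
end
function stub_task:get_urls(need_emails)
	local res = {}
	for _, u in ipairs(self.urls) do res[#res + 1] = u end
	if need_emails then
		for _, e in ipairs(self.emails) do res[#res + 1] = e end
	end
	return res
end

print(count_urls(stub_task))        -- 1
print(count_urls(stub_task, true))  -- 2
```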

Method task:get_content()

Get raw content for the specified task

Parameters:

No parameters

Returns:

  • {text}: the data contained in the task

Method task:get_rawbody()

Get raw body for the specified task

Parameters:

No parameters

Returns:

  • {text}: the data contained in the task

Method task:get_emails()

Get all email addresses found in a message.

Parameters:

No parameters

Returns:

  • {table rspamd_url}: list of all email addresses found

Method task:get_text_parts()

Get all text (and HTML) parts found in a message

Parameters:

No parameters

Returns:

  • {table rspamd_text_part}: list of text parts

Method task:get_parts()

Get all mime parts found in a message

Parameters:

No parameters

Returns:

  • {table rspamd_mime_part}: list of mime parts

Method task:get_request_header(name)

Get the value of an HTTP request header.

Parameters:

  • name {string}: name of header to get

Returns:

  • {rspamd_text}: value of an HTTP header

Method task:set_request_header(name, value)

Set the value of an HTTP request header. If value is omitted, the header is removed.

Parameters:

  • name {string}: name of header to set
  • value {rspamd_text/string}: new header’s value

Returns:

No return

Method task:get_header(name[, case_sensitive])

Get the decoded value of the specified header, with an optional case_sensitive flag. By default, headers are searched in a case-insensitive manner.

Parameters:

  • name {string}: name of header to get
  • case_sensitive {boolean}: case sensitiveness flag to search for a header

Returns:

  • {string}: decoded value of a header

Method task:get_header_raw(name[, case_sensitive])

Get the raw value of the specified header, with an optional case_sensitive flag. By default, headers are searched in a case-insensitive manner.

Parameters:

  • name {string}: name of header to get
  • case_sensitive {boolean}: case sensitiveness flag to search for a header

Returns:

  • {string}: raw value of a header

Method task:get_header_full(name[, case_sensitive])

Get the raw value of the specified header, with an optional case_sensitive flag. By default, headers are searched in a case-insensitive manner. This method returns more information about the header as a list of tables with the following structure:

  • name - name of a header
  • value - raw value of a header
  • decoded - decoded value of a header
  • tab_separated - true if a header and a value are separated by tab character
  • empty_separator - true if there is no separator between a header and a value

Parameters:

  • name {string}: name of header to get
  • case_sensitive {boolean}: case sensitiveness flag to search for a header

Returns:

  • {list of tables}: all values of a header as specified above

Example:

local function check_header_delimiter_tab(task, header_name)
	-- guard against a nil result when the header is absent
	for _,rh in ipairs(task:get_header_full(header_name) or {}) do
		if rh['tab_separated'] then return true end
	end
	return false
end

Method task:get_raw_headers()

Get all undecoded headers of a message as a string

Parameters:

No parameters

Returns:

  • {rspamd_text}: all raw headers for a message as opaque text

Method task:get_received_headers()

Returns a list of tables of parsed received headers. The returned tables have the following structure:

  • from_hostname - string that represents hostname provided by a peer
  • from_ip - string representation of IP address as provided by a peer
  • real_hostname - hostname as resolved by MTA
  • real_ip - string representation of IP as resolved by PTR request of MTA
  • by_hostname - MTA hostname
  • proto - protocol, e.g. ESMTP or ESMTPS
  • timestamp - received timestamp
  • for - for value (unparsed mailbox)

Please note that in some situations rspamd cannot parse all the fields of received headers. In that case you should check all strings for validity.

Parameters:

No parameters

Returns:

  • {table of tables}: list of received headers described above
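
Because some fields may be missing when a received header cannot be fully parsed, callers usually validate each field before using it. The following is a minimal sketch of that pattern, assuming the field names listed above; received_by_hosts and stub_task are hypothetical names, and the stub exists only so the snippet runs outside rspamd.

```lua
-- Collect the 'by_hostname' values of all received headers,
-- skipping entries where that field is missing or not a string.
local function received_by_hosts(task)
	local hosts = {}
	for _, rh in ipairs(task:get_received_headers() or {}) do
		local by = rh['by_hostname']
		if type(by) == 'string' and #by > 0 then
			hosts[#hosts + 1] = by
		end
	end
	return hosts
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	get_received_headers = function(self)
		return {
			{ by_hostname = 'mx1.example.com', proto = 'ESMTP' },
			{ from_hostname = 'client.example.net' }, -- by_hostname not parsed
			{ by_hostname = 'mx2.example.com', proto = 'ESMTPS' },
		}
	end,
}

local hosts = received_by_hosts(stub_task)
print(table.concat(hosts, ', ')) -- mx1.example.com, mx2.example.com
```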

Method task:get_queue_id()

Returns queue ID of the message being processed.

Parameters:

No parameters

Returns:

  • {string}: queue ID of the message, if one is defined

Method task:get_uid()

Returns ID of the task being processed.

Parameters:

No parameters

Returns:

  • {string}: ID of the task

Method task:get_resolver()

Returns ready to use rspamd_resolver object suitable for making asynchronous DNS requests.

Parameters:

No parameters

Returns:

  • {rspamd_resolver}: resolver object associated with the task’s session

Example:

local logger = require "rspamd_logger"

local function task_cb(task)
	local function dns_cb(resolver, to_resolve, results, err)
		-- task object is available due to closure
		task:inc_dns_req()
		if results then
			logger.info(string.format('<%s> [%s] resolved for symbol: %s',
				task:get_message_id(), to_resolve, 'EXAMPLE_SYMBOL'))
			task:insert_result('EXAMPLE_SYMBOL', 1)
		end
	end
	local r = task:get_resolver()
	r:resolve_a(task:get_session(), task:get_mempool(), 'example.com', dns_cb)
end

Method task:inc_dns_req()

Increment the number of DNS requests for the task. It is used only for logging purposes.

Parameters:

No parameters

Returns:

No return

Method task:get_dns_req()

Get number of dns requests being sent in the task

Parameters:

No parameters

Returns:

  • {number}: number of DNS requests

Method task:has_recipients([type])

Return true if there are SMTP or MIME recipients for a task.

Parameters:

  • type {integer|string}: if specified has the following meaning: 0 or any means try SMTP recipients and fallback to MIME if failed, 1 or smtp means checking merely SMTP recipients and 2 or mime means MIME recipients only

Returns:

  • {bool}: true if there are recipients of the specified type

Method task:get_recipients([type])

Return SMTP or MIME recipients for a task. This function returns a list of internet addresses, each of which is a table with the following structure:

  • name - name of internet address in UTF8, e.g. for Vsevolod Stakhov <blah@foo.com> it returns Vsevolod Stakhov
  • addr - address part of the address
  • user - user part (if present) of the address, e.g. blah
  • domain - domain part (if present), e.g. foo.com

Parameters:

  • type {integer|string}: if specified has the following meaning: 0 or any means try SMTP recipients and fallback to MIME if failed, 1 or smtp means checking merely SMTP recipients and 2 or mime means MIME recipients only

Returns:

  • {list of addresses}: list of recipients or nil
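
The address tables can be inspected field by field. The sketch below checks whether any recipient belongs to a given domain and handles the nil result mentioned above; has_rcpt_in_domain and stub_task are hypothetical names introduced for illustration, and the stub merely imitates a task so the snippet runs in a plain Lua interpreter.

```lua
-- Return true if any recipient of the given type is in `domain`
-- (compared case-insensitively).
local function has_rcpt_in_domain(task, domain, rcpt_type)
	local rcpts = task:get_recipients(rcpt_type)
	if not rcpts then
		return false -- no recipients of this type
	end
	for _, r in ipairs(rcpts) do
		if r['domain'] and r['domain']:lower() == domain:lower() then
			return true
		end
	end
	return false
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	get_recipients = function(self, rcpt_type)
		if rcpt_type == 'smtp' then
			return {
				{ name = 'Vsevolod Stakhov', addr = 'blah@foo.com',
				  user = 'blah', domain = 'foo.com' },
			}
		end
		return nil
	end,
}

print(has_rcpt_in_domain(stub_task, 'FOO.com', 'smtp')) -- true
print(has_rcpt_in_domain(stub_task, 'foo.com', 'mime')) -- false
```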

Method task:has_from([type])

Return true if there is SMTP or MIME sender for a task.

Parameters:

  • type {integer|string}: if specified has the following meaning: 0 or any means try SMTP sender and fallback to MIME if failed, 1 or smtp means checking merely SMTP sender and 2 or mime means MIME From: only

Returns:

  • {bool}: true if there is a sender of the specified type

Method task:get_from([type])

Return SMTP or MIME sender for a task. This function returns an internet address, which is a table with the following structure:

  • name - name of internet address in UTF8, e.g. for Vsevolod Stakhov <blah@foo.com> it returns Vsevolod Stakhov
  • addr - address part of the address
  • user - user part (if present) of the address, e.g. blah
  • domain - domain part (if present), e.g. foo.com

Parameters:

  • type {integer|string}: if specified has the following meaning: 0 or any means try SMTP sender and fallback to MIME if failed, 1 or smtp means checking merely SMTP sender and 2 or mime means MIME From: only

Returns:

  • {address}: sender or nil

Method task:get_user()

Returns authenticated user name for this task if specified by an MTA.

Parameters:

No parameters

Returns:

  • {string}: username or nil

Method task:get_from_ip()

Returns ip_addr object of a sender that is provided by MTA

Parameters:

No parameters

Returns:

  • {rspamd_ip}: ip address object

Method task:set_from_ip(str)

Set the task’s IP address based on the passed string

Parameters:

  • str {string}: string representation of ip

Returns:

No return

Method task:get_client_ip()

Returns ip_addr object of a client connected to rspamd (normally, it is an IP address of MTA)

Parameters:

No parameters

Returns:

  • {rspamd_ip}: ip address object

Method task:get_helo()

Returns the value of SMTP helo provided by MTA.

Parameters:

No parameters

Returns:

  • {string}: HELO value

Method task:get_hostname()

Returns the value of sender’s hostname provided by MTA

Parameters:

No parameters

Returns:

  • {string}: hostname value

Method task:get_images()

Returns list of all images found in a task as a table of rspamd_image. Each image has the following methods:

  • get_width - return width of an image in pixels
  • get_height - return height of an image in pixels
  • get_type - return string representation of image’s type (e.g. ‘jpeg’)
  • get_filename - return string with image’s file name
  • get_size - return size in bytes

Parameters:

No parameters

Returns:

  • {list of rspamd_image}: images found in a message
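
To illustrate the image methods listed above, the sketch below looks for very small images, a hypothetical heuristic chosen only as an example. The names has_tiny_image, make_image and stub_task are not part of rspamd; the stubs imitate only the get_width and get_height methods so that the snippet is runnable on its own.

```lua
-- Return true if the task contains an image of at most 1x1 pixels.
local function has_tiny_image(task)
	for _, img in ipairs(task:get_images() or {}) do
		if img:get_width() <= 1 and img:get_height() <= 1 then
			return true
		end
	end
	return false
end

-- Hypothetical image stub exposing two of the documented methods
local function make_image(w, h)
	return {
		get_width = function() return w end,
		get_height = function() return h end,
	}
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	get_images = function(self)
		return { make_image(640, 480), make_image(1, 1) }
	end,
}

print(has_tiny_image(stub_task)) -- true
```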

Method task:get_archives()

Returns list of all archives found in a task as a table of rspamd_archive. Each archive has the following methods available:

  • get_files - return list of strings with filenames inside archive
  • get_files_full - return list of tables with all information about files
  • is_encrypted - return true if an archive is encrypted
  • get_type - return string representation of archive’s type (e.g. ‘zip’)
  • get_filename - return string with archive’s file name
  • get_size - return size in bytes

Parameters:

No parameters

Returns:

  • {list of rspamd_archive}: archives found in a message
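
A possible use of the archive methods is scanning the file names inside each archive. The sketch below assumes only the get_files method described above; archive_has_exe and stub_task are hypothetical names, and the '.exe' check is just an example criterion rather than anything rspamd prescribes.

```lua
-- Return true if any archive in the task lists a file ending in '.exe'
-- (compared case-insensitively).
local function archive_has_exe(task)
	for _, arch in ipairs(task:get_archives() or {}) do
		for _, fname in ipairs(arch:get_files() or {}) do
			if fname:lower():match('%.exe$') then
				return true
			end
		end
	end
	return false
end

-- Hypothetical stand-in for a task with one archive (not real rspamd objects)
local stub_task = {
	get_archives = function(self)
		return {
			{
				get_files = function() return { 'readme.txt', 'Setup.EXE' } end,
				is_encrypted = function() return false end,
			},
		}
	end,
}

print(archive_has_exe(stub_task)) -- true
```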

Method task:get_symbol(name)

Searches for a symbol name in all metrics results and returns a list of tables one per metric that describes the symbol inserted. Please note that this function is intended to return values for inserted symbols, so if this symbol was not inserted it won’t be in the function’s output. This method is useful for post-filters mainly. The symbols are returned as the list of the following tables:

  • metric - name of metric
  • score - score of a symbol in that metric
  • options - a table of strings representing options of a symbol
  • group - a group of symbol (or ‘ungrouped’)

Parameters:

  • name {string}: symbol’s name

Returns:

  • {list of tables}: list of tables or nil if symbol was not found in any metric
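
As a rough sketch of how the per-metric tables might be consumed, the helper below returns the score of a symbol in one metric, or nil when the symbol was not inserted. symbol_score and stub_task are hypothetical names; the metric name 'default' is used only as sample data.

```lua
-- Return the score of symbol `name` in metric `metric`, or nil if absent.
local function symbol_score(task, name, metric)
	local res = task:get_symbol(name)
	if not res then
		return nil -- symbol was not inserted in any metric
	end
	for _, s in ipairs(res) do
		if s['metric'] == metric then
			return s['score']
		end
	end
	return nil
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	get_symbol = function(self, name)
		if name == 'SOME_HEADER' then
			return {
				{ metric = 'default', score = 2.5,
				  options = { 'Got some header' }, group = 'ungrouped' },
			}
		end
		return nil
	end,
}

print(symbol_score(stub_task, 'SOME_HEADER', 'default'))
print(symbol_score(stub_task, 'MISSING', 'default'))
```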

Method task:get_symbols()

Returns array of all symbols matched for this task

Parameters:

No parameters

Returns:

  • {table, table}: table of strings with symbol names + table of their scores
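
Since two tables are returned, it can be convenient to merge them into a single name-to-score map. The sketch below assumes the two tables are parallel arrays (the score at index i belongs to the name at index i); that ordering is an assumption, not something stated above. symbols_to_map and stub_task are hypothetical names.

```lua
-- Build a { name = score } map from the two tables returned by get_symbols.
-- Assumption: names[i] and scores[i] describe the same symbol.
local function symbols_to_map(task)
	local names, scores = task:get_symbols()
	local map = {}
	for i, name in ipairs(names or {}) do
		map[name] = scores[i]
	end
	return map
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	get_symbols = function(self)
		return { 'DATE_IN_PAST', 'SOME_HEADER' }, { 1.5, 0.5 }
	end,
}

local map = symbols_to_map(stub_task)
print(map['DATE_IN_PAST'], map['SOME_HEADER'])
```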

Method task:get_symbols_numeric()

Returns array of all symbols matched for this task

Parameters:

No parameters

Returns:

  • {table|number, table|number}: table of numbers with symbol ids + table of their scores

Method task:has_symbol(name)

Fast path to check if a specified symbol is in the task’s results

Parameters:

  • name {string}: symbol’s name

Returns:

  • {boolean}: true if symbol has been found

Method task:get_date(type[, gmt])

Returns timestamp for a connection or for a MIME message. This function can be called with a single table argument with the following fields:

  • format - a format of date returned:
      • message - returns a mime date as integer (unix timestamp)
      • message_str - returns a mime date as string (UTC format)
      • connect - returns a unix timestamp of a connection to rspamd
      • connect_str - returns connection time in UTC format
  • gmt - returns date in GMT timezone (normal for unix timestamps)

By default this function returns connection time in numeric format.

Parameters:

  • type {string}: date format as described above
  • gmt {boolean}: gmt flag as described above

Returns:

  • {string/number}: date representation according to format

Example:

rspamd_config.DATE_IN_PAST = function(task)
	local dm = task:get_date{format = 'message', gmt = true}
	local dt = task:get_date{format = 'connect', gmt = true}
	-- A day
	if dt - dm > 86400 then
		return true
	end

	return false
end

Method task:get_message_id()

Returns message id of the specified task

Parameters:

No parameters

Returns:

  • {string}: id of a message

Method task:get_metric_score(name)

Get the current score of metric name. Should be used in post-filters only.

Parameters:

  • name {string}: name of a metric

Returns:

  • {table}: table containing the current score and required score of the metric

Method task:get_metric_action(name)

Get the current action of metric name. Should be used in post-filters only.

Parameters:

  • name {string}: name of a metric

Returns:

  • {string}: the current action of the metric as a string

Method task:set_metric_score(name, score)

Set the current score of metric name. Should be used in post-filters only.

Parameters:

  • name {string}: name of a metric
  • score {number}: the current score of the metric

Returns:

No return

Method task:set_metric_action(name, action)

Set the current action of metric name. Should be used in post-filters only.

Parameters:

  • name {string}: name of a metric
  • action {string}: name to set

Returns:

No return

Method task:set_metric_subject(subject)

Set the subject in the default metric

Parameters:

  • subject {string}: subject to set

Returns:

No return

Method task:learn(is_spam[, classifier])

Learn the classifier named classifier with the task. If is_spam is true, the message is learnt as spam; otherwise it is learnt as ham. By default, this function learns the bayes classifier.

Parameters:

  • is_spam {boolean}: learn spam or ham
  • classifier {string}: classifier’s name

Returns:

  • {boolean}: true if classifier has been learnt successfully

Method task:set_settings(obj)

Set users settings object for a task. The format of this object is described here.

Parameters:

  • obj {any}: any lua object that corresponds to the settings format

Returns:

No return

Method task:get_settings()

Gets users settings object for a task. The format of this object is described here.

Parameters:

No parameters

Returns:

  • {lua object}: lua object generated from UCL

Method task:lookup_settings(key)

Gets users settings object with the specified key for a task.

Parameters:

  • key {string}: key to lookup

Returns:

  • {lua object}: lua object generated from UCL

Method task:get_settings_id()

Get numeric hash of settings id if specified for this task. 0 is returned otherwise.

Parameters:

No parameters

Returns:

  • {number}: settings-id hash

Method task:set_rmilter_reply(obj)

Set special reply for rmilter

Parameters:

  • obj {any}: any lua object that corresponds to the settings format

Returns:

No return

Example:

task:set_rmilter_reply({
	add_headers = {['X-Lua'] = 'test'},
	-- 1 is the position of header to remove
	remove_headers = {['DKIM-Signature'] = 1},
})

Method task:process_re(params)

Processes the specified regexp and returns the number of captures (cached or new). Params is a table with the following fields (mandatory fields are marked with *):

  • re*: regular expression object
  • type*: type of regular expression:
      • mime: mime regexp
      • header: header regexp
      • rawheader: raw header expression
      • rawmime: raw mime regexp
      • body: raw body regexp
      • url: url regexp
  • header: for header and rawheader regexp means the name of header
  • strong: case sensitive match for headers

Parameters:

  • params {table}: table with the fields described above

Returns:

  • {number}: number of regexp occurrences in the task (limited to 255 so far)

Method task:cache_set(key, value)

Store some value to the task cache

Parameters:

  • key {string}: key to use
  • value {any}: any value (including functions and tables)

Returns:

No return

Method task:cache_get(key)

Returns cached value or nil if nothing is cached

Parameters:

  • key {string}: key to use

Returns:

  • {any}: cached value
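
One way cache_set and cache_get might be combined is to memoize a value so that several callbacks for the same task do not recompute it. The sketch below is illustrative only: get_sender_domain, the 'sender_domain' key and stub_task are hypothetical, and the stub counts get_from calls to show that the second lookup is served from the cache.

```lua
-- Return the SMTP sender's domain, computing it at most once per task.
local function get_sender_domain(task)
	local cached = task:cache_get('sender_domain')
	if cached ~= nil then
		return cached
	end
	local from = task:get_from('smtp')
	local domain = from and from['domain'] or ''
	task:cache_set('sender_domain', domain)
	return domain
end

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = { _cache = {}, from_calls = 0 }
function stub_task:cache_set(key, value) self._cache[key] = value end
function stub_task:cache_get(key) return self._cache[key] end
function stub_task:get_from(addr_type)
	self.from_calls = self.from_calls + 1
	return { addr = 'blah@foo.com', user = 'blah', domain = 'foo.com' }
end

print(get_sender_domain(stub_task)) -- foo.com
print(get_sender_domain(stub_task)) -- foo.com (from the cache)
print(stub_task.from_calls)         -- 1
```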

Method task:get_size()

Returns size of the task in bytes (that includes headers + parts size)

Parameters:

No parameters

Returns:

  • {number}: size in bytes

Method task:set_flag(flag_name[, set])

Set specific flag for task:

  • no_log: do not log task summary
  • no_stat: do not include task into scanned stats
  • pass_all: check all filters for task
  • extended_urls: output extended info about urls
  • skip: skip task processing
  • learn_spam: learn message as spam
  • learn_ham: learn message as ham
  • broken_headers: header data is broken for a message

Parameters:

  • flag_name {string}: flag to set
  • set {boolean}: set or clear flag (default is set)

Returns:

No return

Example:

--[[
For messages with undefined queue ID (scanned with rspamc or WebUI)
do not include results into statistics and do not log task summary
(it will not appear in the WebUI history as well).
]]--

-- Callback function to set flags
local function no_log_stat_cb(task)
  if not task:get_queue_id() then
    task:set_flag('no_log')
    task:set_flag('no_stat')
  end
end

rspamd_config:register_symbol({
  name = 'LOCAL_NO_LOG_STAT',
  type = 'postfilter',
  callback = no_log_stat_cb
})

Method task:has_flag(flag_name)

Checks for a specific flag in task:

  • no_log: do not log task summary
  • no_stat: do not include task into scanned stats
  • pass_all: check all filters for task
  • extended_urls: output extended info about urls
  • skip: skip task processing
  • learn_spam: learn message as spam
  • learn_ham: learn message as ham
  • broken_headers: header data is broken for a message

Parameters:

  • flag_name {string}: flag to check

Returns:

  • {boolean}: true if the flag is set

Method task:get_flags()

Get list of flags for task:

  • no_log: do not log task summary
  • no_stat: do not include task into scanned stats
  • pass_all: check all filters for task
  • extended_urls: output extended info about urls
  • skip: skip task processing
  • learn_spam: learn message as spam
  • learn_ham: learn message as ham
  • broken_headers: header data is broken for a message

Parameters:

No parameters

Returns:

  • {array of strings}: table with all flags as strings

Method task:get_digest()

Returns message’s unique digest (32 hex symbols)

Parameters:

No parameters

Returns:

  • {string}: hex digest

Method task:store_in_file([mode])

If the task was loaded using file scan, this method just returns the file’s name; otherwise, a fresh temporary file is created and its name is returned. The default mode is 0600. To express an octal mode as a Lua number, you can use the following trick: tonumber("0644", 8). The file is automatically removed when the task is destroyed.

Parameters:

  • mode {number}: mode for new file

Returns:

  • {string}: file name with task content
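
The octal conversion mentioned above can be checked in a plain Lua interpreter. In the sketch below, stub_task and the returned path are hypothetical placeholders used only to show how the converted mode would be passed; they do not reflect how rspamd names its temporary files.

```lua
-- Lua has no octal literals, so parse the mode string in base 8.
local mode = tonumber('0644', 8)
print(mode)                      -- 420
print(string.format('%o', mode)) -- 644

-- Hypothetical stand-in for a task (not a real rspamd object)
local stub_task = {
	store_in_file = function(self, file_mode)
		-- fall back to the documented default of 0600
		self.mode = file_mode or tonumber('0600', 8)
		return '/tmp/hypothetical-task-file'
	end,
}

local fname = stub_task:store_in_file(mode)
print(fname)
```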
