Module rspamd_util

This module contains general-purpose utilities that can be useful both for testing and for production rules.

Brief content:

Functions:

util.create_event_base()

util.load_rspamd_config(filename)

util.config_from_ucl(any, string)

util.encode_base64(input[, str_len, [newlines_type]])

util.decode_base64(input)

util.encode_base32(input)

util.decode_base32(input)

util.decode_url(input)

util.tokenize_text(input[, exceptions])

util.tanh(num)

util.parse_html(input)

util.levenshtein_distance(s1, s2)

util.parse_addr(str)

util.fold_header(name, value, [how, [stop_chars]])

util.is_uppercase(str)

util.humanize_number(num)

util.get_tld(host)

util.glob(pattern)

util.parse_mail_address(str, pool)

util.strlen_utf8(str)

util.strcasecmp_utf8(str1, str2)

util.strcasecmp_ascii(str1, str2)

util.strequal_caseless(str1, str2)

util.get_ticks()

util.get_time()

util.time_to_string(seconds)

util.stat(fname)

util.unlink(fname)

util.lock_file(fname, [fd])

util.unlock_file(fd, [close_fd])

util.create_file(fname, [mode])

util.close_file(fd)

util.random_hex(size)

util.zstd_compress(data)

util.zstd_decompress(data)

util.gzip_decompress(data)

util.gzip_compress(data)

util.normalize_prob(prob, [bias = 0.5])

util.is_utf_spoofed(str, [str2])

util.is_valid_utf8(str)

util.readline([prompt])

util.readpassphrase([prompt])

util.file_exists(file)

util.mkdir(dir[, recursive])

util.umask(mask)

util.isatty()

util.pack(fmt, ...)

util.packsize(fmt)

util.unpack(fmt, s [, pos])

util.caseless_hash(str[, seed])

util.caseless_hash_fast(str[, seed])

util.get_hostname()

Functions

The module rspamd_util defines the following functions.

Function util.create_event_base()

Creates new event base for processing asynchronous events

Parameters:

No parameters

Returns:

  • {ev_base}: new event processing base

Function util.load_rspamd_config(filename)

Load rspamd config from the specified file

Parameters:

  • filename {string}: path to the configuration file to load

Returns:

  • {config}: new configuration object suitable for access

Function util.config_from_ucl(any, string)

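Loads rspamd config from UCL represented by any Lua table

Parameters:

  • any {table}: Lua table representing the UCL configuration
  • string {string}: second argument of the signature (not described further in this reference)

Returns:

  • {config}: new configuration object suitable for access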

Function util.encode_base64(input[, str_len, [newlines_type]])

Encodes data in base64 breaking lines if needed

Parameters:

  • input {text or string}: input data
  • str_len {number}: optional size of lines or 0 if split is not needed

Returns:

  • {rspamd_text}: encoded data chunk

Function util.decode_base64(input)

Decodes data from base64 ignoring whitespace characters

Parameters:

  • input {text or string}: data to decode; if rspamd{text} is used then the string is modified in-place

Returns:

  • {rspamd_text}: decoded data chunk
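
Example:

A round trip through the two base64 helpers can be sketched as follows. This is a minimal illustration, assuming it runs inside rspamd's Lua environment where rspamd_util can be loaded with require; the payload is arbitrary.

```lua
local util = require "rspamd_util"

local payload = "hello world"

-- encode_base64 returns an rspamd_text object; tostring() yields a Lua string
local encoded = util.encode_base64(payload)
print(tostring(encoded)) -- standard base64 of the payload: aGVsbG8gd29ybGQ=

-- Pass a plain Lua string so the encoded value is not modified in place
local decoded = util.decode_base64(tostring(encoded))
assert(tostring(decoded) == payload)
```

Converting the encoded text to a string before decoding is a deliberate choice here, because decoding an rspamd_text object modifies it in place.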

Function util.encode_base32(input)

Encodes data in base32 breaking lines if needed

Parameters:

  • input {text or string}: input data

Returns:

  • {rspamd_text}: encoded data chunk

Function util.decode_base32(input)

Decodes data from base32 ignoring whitespace characters

Parameters:

  • input {text or string}: data to decode

Returns:

  • {rspamd_text}: decoded data chunk

Function util.decode_url(input)

Decodes data from url encoding

Parameters:

  • input {text or string}: data to decode

Returns:

  • {rspamd_text}: decoded data chunk

Function util.tokenize_text(input[, exceptions])

Create tokens from a text using optional exceptions list

Parameters:

  • input {text/string}: input data
  • exceptions {table}: a table of pairs containing <start_pos,length> of exceptions in the input

Returns:

  • {table/strings}: list of strings representing words in the text
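
Example:

A minimal sketch of tokenizing a short text, assuming it runs inside rspamd's Lua environment. The optional exceptions table is omitted here; the sample sentence is arbitrary.

```lua
local util = require "rspamd_util"

local text = "Hello world, this is a short test"

-- The result is a list of strings, one per word found in the text
local words = util.tokenize_text(text)
for i, word in ipairs(words) do
  print(i, word)
end
```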

Function util.tanh(num)

Calculates hyperbolic tangent of the specified floating point value

Parameters:

  • num {number}: input number

Returns:

  • {number}: hyperbolic tangent of the variable

Function util.parse_html(input)

Parses HTML and returns the corresponding text

Parameters:

  • input {string|text}: input HTML

Returns:

  • {rspamd_text}: processed text with no HTML tags

Function util.levenshtein_distance(s1, s2)

Returns the Levenshtein distance between two strings

Parameters:

  • s1 {string}: the first string
  • s2 {string}: the second string

Returns:

  • {number}: number of differences in two strings

Function util.parse_addr(str)

Parses an RFC 822 address into components. Returns a table of components:

  • name: name of address (e.g. Some User)
  • addr: address part (e.g. user@example.com)

Parameters:

  • str {string}: input string

Returns:

  • {table}: resulting table of components

Function util.fold_header(name, value, [how, [stop_chars]])

Folds an RFC 822 header according to the folding rules

Parameters:

  • name {string}: name of the header
  • value {string}: value of the header
  • how {string}: “cr” for \r, “lf” for \n and “crlf” for \r\n (default)
  • stop_chars {string}: also fold header when the

Returns:

  • {string}: Folded value of the header
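
Example:

A short sketch of folding a long header value with "lf" line endings, assuming it runs inside rspamd's Lua environment. The header name X-Example and the generated value are hypothetical.

```lua
local util = require "rspamd_util"

-- Build a value that is long enough to require folding
local value = string.rep("token ", 30)

-- "lf" asks for \n line endings instead of the default \r\n
local folded = util.fold_header("X-Example", value, "lf")
print(folded)
```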

Function util.is_uppercase(str)

Returns true if a string is all uppercase

Parameters:

  • str {string}: input string

Returns:

  • {bool}: true if a string is all uppercase

Function util.humanize_number(num)

Returns humanized representation of given number (like 1k instead of 1000)

Parameters:

  • num {number}: number to humanize

Returns:

  • {string}: humanized representation of a number

Function util.get_tld(host)

Returns effective second level domain part (eSLD) for the specified host

Parameters:

  • host {string}: hostname

Returns:

  • {string}: eSLD part of the hostname or the full hostname if eSLD was not found

Function util.glob(pattern)

Returns results for the glob match for the specified pattern

Parameters:

  • pattern {string}: glob pattern to match (‘?’ and ‘*’ are supported)

Returns:

  • {table/string}: list of matched files

Function util.parse_mail_address(str, pool)

Parses email address and returns a table of tables in the following format:

  • name - name of internet address in UTF8, e.g. for Vsevolod Stakhov <blah@foo.com> it returns Vsevolod Stakhov
  • addr - address part of the address
  • user - user part (if present) of the address, e.g. blah
  • domain - domain part (if present), e.g. foo.com

Parameters:

  • str {string}: input string
  • pool {rspamd_mempool}: memory pool to use

Returns:

  • {table/tables}: parsed list of mail addresses

Function util.strlen_utf8(str)

Returns length of string encoded in utf-8 in characters. If invalid characters are found, then this function returns number of bytes.

Parameters:

  • str {string}: utf8 encoded string

Returns:

  • {number}: number of characters in string
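
Example:

The difference between byte length and character length can be illustrated with a short sketch, assuming it runs inside rspamd's Lua environment. The sample word is arbitrary.

```lua
local util = require "rspamd_util"

local s = "привет" -- six Cyrillic letters, two bytes each in UTF-8

print(#s)                  -- byte length: 12
print(util.strlen_utf8(s)) -- character length: 6, per the description above
```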

Function util.strcasecmp_utf8(str1, str2)

Compares two utf8 strings regardless of their case. Returns a value >0, 0 or <0 if str1 is greater than, equal to or less than str2

Parameters:

  • str1 {string}: utf8 encoded string
  • str2 {string}: utf8 encoded string

Returns:

  • {number}: result of comparison

Function util.strcasecmp_ascii(str1, str2)

Compares two ascii strings regardless of their case. Returns a value >0, 0 or <0 if str1 is greater than, equal to or less than str2

Parameters:

  • str1 {string}: plain string
  • str2 {string}: plain string

Returns:

  • {number}: result of comparison

Function util.strequal_caseless(str1, str2)

Compares two utf8 strings regardless of their case. Returns true if str1 is equal to str2

Parameters:

  • str1 {string}: utf8 encoded string
  • str2 {string}: utf8 encoded string

Returns:

  • {bool}: result of comparison

Function util.get_ticks()

Returns current number of ticks as floating point number

Parameters:

No parameters

Returns:

  • {number}: number of current clock ticks (monotonically increasing)

Function util.get_time()

Returns current time as unix time in floating point representation

Parameters:

No parameters

Returns:

  • {number}: number of seconds since 01.01.1970

Function util.time_to_string(seconds)

Converts time from Unix time to HTTP date format

Parameters:

  • seconds {number}: unix timestamp

Returns:

  • {string}: date as HTTP date
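
Example:

The three time helpers can be combined as in the sketch below, assuming it runs inside rspamd's Lua environment. The busy loop only stands in for real work, and the unit of a tick is not specified here, so the elapsed value is printed as is.

```lua
local util = require "rspamd_util"

-- Measure elapsed ticks around some work using the monotonic counter
local started = util.get_ticks()
local sum = 0
for i = 1, 1000000 do
  sum = sum + i
end
local elapsed = util.get_ticks() - started
print(string.format("loop took %f ticks", elapsed))

-- Format the current wall-clock time as an HTTP date
local now = util.get_time()
print(util.time_to_string(now))
```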

Function util.stat(fname)

Performs stat(2) on the specified filepath and returns a table of values:

  • size: size of file in bytes
  • type: type of filepath: regular, directory, special
  • mtime: modification time as unix time

Parameters:

  • fname {string}: path to examine

Returns:

  • {string,table}: error string (set only when an error has occurred) and the table of values

Example:

local util = require "rspamd_util"
local err, st = util.stat('/etc/passwd')

if err then
  -- handle error
else
  print(st['size'])
end

Function util.unlink(fname)

Removes the specified file from the filesystem

Parameters:

  • fname {string}: filename to remove

Returns:

  • {boolean,[string]}: true if the file has been deleted, or false and an error string otherwise

Function util.lock_file(fname, [fd])

Locks the specified file. This function returns a {number} that must be passed to util.unlock_file after use; otherwise the file descriptor is leaked

Parameters:

  • fname {string}: filename to lock
  • fd {number}: use the specified fd instead of opening one

Returns:

  • {number|nil,string}: number if locking was successful or nil + error otherwise

Function util.unlock_file(fd, [close_fd])

Unlocks the specified file, closing the associated file descriptor by default.

Parameters:

  • fd {number}: descriptor to unlock
  • close_fd {boolean}: close descriptor on unlocking (default: TRUE)

Returns:

  • {boolean[,string]}: true if a file was unlocked
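
Example:

The lock and unlock calls are meant to be paired. A minimal sketch, assuming it runs inside rspamd's Lua environment; the lock file path is hypothetical.

```lua
local util = require "rspamd_util"

local path = "/tmp/example.lock" -- hypothetical path

local fd, err = util.lock_file(path)
if not fd then
  -- On failure the first value is nil and the second is the error
  print("cannot lock " .. path .. ": " .. tostring(err))
else
  -- ... work that needs exclusive access goes here ...

  -- Releases the lock; the descriptor is closed too, as close_fd defaults to true
  util.unlock_file(fd)
end
```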

Function util.create_file(fname, [mode])

Creates the specified file with the default mode 0644

Parameters:

  • fname {string}: filename to create
  • mode {number}: open mode (you should use octal number here)

Returns:

  • {number|nil,string}: file descriptor or pair nil + error string

Function util.close_file(fd)

Closes descriptor fd

Parameters:

  • fd {number}: descriptor to close

Returns:

  • {boolean[,string]}: true if a file was closed
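
Example:

Creating, closing and removing a file can be sketched as below, assuming it runs inside rspamd's Lua environment. The path is hypothetical. Because Lua has no octal literals, the sketch converts the octal mode with tonumber; passing the resulting integer as mode is an assumption about how the argument is interpreted.

```lua
local util = require "rspamd_util"

local path = "/tmp/rspamd_util_example.txt" -- hypothetical path
local mode = tonumber("600", 8)             -- octal 0600, i.e. 384 in decimal

local fd, err = util.create_file(path, mode)
if not fd then
  print("create failed: " .. tostring(err))
else
  util.close_file(fd)

  -- Clean up the file created above
  local ok, unlink_err = util.unlink(path)
  if not ok then
    print("unlink failed: " .. tostring(unlink_err))
  end
end
```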

Function util.random_hex(size)

Returns random hex string of the specified size

Parameters:

  • size {number}: length of desired string in bytes

Returns:

  • {string}: string with random hex digits

Function util.zstd_compress(data)

Compresses input using zstd compression

Parameters:

  • data {string/rspamd_text}: input data

Returns:

  • {rspamd_text}: compressed data

Function util.zstd_decompress(data)

Decompresses input using zstd algorithm

Parameters:

  • data {string/rspamd_text}: compressed data

Returns:

  • {error,rspamd_text}: pair of error + decompressed text
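
Example:

A compress and decompress round trip, following the documented return values, might look like this. It is a sketch that assumes rspamd's Lua environment; the input data is arbitrary.

```lua
local util = require "rspamd_util"

local original = string.rep("rspamd ", 100)

-- zstd_compress returns an rspamd_text object with the compressed bytes
local compressed = util.zstd_compress(original)

-- zstd_decompress returns a pair: error first, then the decompressed text
local err, restored = util.zstd_decompress(compressed)
if err then
  print("decompression failed: " .. tostring(err))
else
  assert(tostring(restored) == original)
  print(#original, #tostring(compressed))
end
```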

Function util.gzip_decompress(data)

Decompresses input using gzip algorithm

Parameters:

  • data {string/rspamd_text}: compressed data

Returns:

  • {error,rspamd_text}: pair of error + decompressed text

Function util.gzip_compress(data)

Compresses input using gzip compression

Parameters:

  • data {string/rspamd_text}: input data

Returns:

  • {rspamd_text}: compressed data

Function util.normalize_prob(prob, [bias = 0.5])

Normalizes probabilities using a polynomial

Parameters:

  • prob {number}: probability param
  • bias {number}: number to subtract for making the final solution

Returns:

  • {number}: normalized number

Function util.is_utf_spoofed(str, [str2])

Returns true if a string is spoofed (possibly with another string str2)

Parameters:

  • str {string}: string to check
  • str2 {string}: optional second string to check str against

Returns:

  • {boolean}: true if a string is spoofed

Function util.is_valid_utf8(str)

Returns true if a string is a valid UTF-8 string

Parameters:

  • str {string}: string to check

Returns:

  • {boolean}: true if the string is valid UTF-8

Function util.readline([prompt])

Returns string read from stdin with history and editing support

Parameters:

  • prompt {string}: optional prompt to display

Returns:

  • {string}: string read from the input (with line endings stripped)

Function util.readpassphrase([prompt])

Returns string read from stdin disabling echo

Parameters:

  • prompt {string}: optional prompt to display

Returns:

  • {string}: string read from the input (with line endings stripped)

Function util.file_exists(file)

Checks if a specified file exists and is available for reading

Parameters:

  • file {string}: path of the file to check

Returns:

  • {boolean,string}: true if the file exists, otherwise false and an error string

Function util.mkdir(dir[, recursive])

Creates a specified directory

Parameters:

  • dir {string}: directory to create
  • recursive {boolean}: optional flag requesting recursive creation

Returns:

  • {boolean[,error]}: true if directory has been created

Function util.umask(mask)

Sets new umask. Accepts either a numeric octal string, e.g. '022', or a plain number, e.g. 0x12 (since Lua does not support octal integer literals)

Parameters:

  • mask {string or number}: new umask

Returns:

  • {number}: old umask

Function util.isatty()

Returns if stdout is a tty

Parameters:

No parameters

Returns:

  • {boolean}: true in case of output being tty

Function util.pack(fmt, ...)

Backport of the Lua 5.3 string.pack function. Returns a binary string containing the values v1, v2, etc. packed (that is, serialized in binary form) according to the format string fmt. A format string is a sequence of conversion options. The conversion options are as follows:

  • <: sets little endian
  • >: sets big endian
  • =: sets native endian
  • ![n]: sets maximum alignment to n (default is native alignment)
  • b: a signed byte (char)
  • B: an unsigned byte (char)
  • h: a signed short (native size)
  • H: an unsigned short (native size)
  • l: a signed long (native size)
  • L: an unsigned long (native size)
  • j: a lua_Integer
  • J: a lua_Unsigned
  • T: a size_t (native size)
  • i[n]: a signed int with n bytes (default is native size)
  • I[n]: an unsigned int with n bytes (default is native size)
  • f: a float (native size)
  • d: a double (native size)
  • n: a lua_Number
  • cn: a fixed-sized string with n bytes
  • z: a zero-terminated string
  • s[n]: a string preceded by its length coded as an unsigned integer with n bytes (default is a size_t)
  • x: one byte of padding
  • Xop: an empty item that aligns according to option op (which is otherwise ignored)
  • ' ': (empty space) ignored

(A “[n]” means an optional integral numeral.) Except for padding, spaces, and configurations (options “xX <=>!”), each option corresponds to an argument (in string.pack) or a result (in string.unpack).

For options “!n”, “sn”, “in”, and “In”, n can be any integer between 1 and 16. All integral options check overflows; string.pack checks whether the given value fits in the given size; string.unpack checks whether the read value fits in a Lua integer.

Any format string starts as if prefixed by “!1=”, that is, with maximum alignment of 1 (no alignment) and native endianness.

Alignment works as follows: For each option, the format gets extra padding until the data starts at an offset that is a multiple of the minimum between the option size and the maximum alignment; this minimum must be a power of 2. Options “c” and “z” are not aligned; option “s” follows the alignment of its starting integer.

All padding is filled with zeros by string.pack (and ignored by unpack).
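
Example:

The format options above can be tried out with a small sketch. It prefers the rspamd backport and, as an assumption made only so that the sketch also runs in a standalone Lua 5.3 or newer interpreter, falls back to the standard string library, which provides the same three functions.

```lua
-- Use rspamd_util when available, otherwise the Lua 5.3+ string library
local ok, util = pcall(require, "rspamd_util")
local lib = ok and util or string

-- Little endian: 2-byte unsigned int, 4-byte unsigned int, zero-terminated string
local fmt = "<I2I4z"
local blob = lib.pack(fmt, 0x1234, 0xdeadbeef, "hi")
print(#blob) -- 2 + 4 + 3 (two characters plus the terminator) = 9 bytes

-- unpack returns the values in the order given by the format string
local a, b, s = lib.unpack(fmt, blob)
print(a == 0x1234, b == 0xdeadbeef, s)

-- packsize works for fixed-size options only, so "z" is left out here
print(lib.packsize("<I2I4")) -- 6
```

With the "<" option the first byte of the result is the low byte of the first value (0x34), which is a quick way to confirm the byte order.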

Parameters:

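  • fmt {string}: format string
  • ...: values to pack according to fmt

Returns:

  • {string}: binary string containing the packed values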

Function util.packsize(fmt)

Returns size of the packed binary string returned for the same fmt argument by util.pack

Parameters:

  • fmt {string}: format string

Returns:

  • {number}: size of the packed binary string

Function util.unpack(fmt, s [, pos])

Unpacks string s according to the format string fmt as described in util.pack

Parameters:

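  • fmt {string}: format string
  • s {string}: binary string to unpack
  • pos {number}: optional position in s to start reading from

Returns:

  • {multiple}: list of unpacked values according to fmt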

Function util.caseless_hash(str[, seed])

Calculates caseless non-crypto hash from a string or rspamd text

Parameters:

  • str: string or lua_text
  • seed: optional seed (0xdeadbabe by default)

Returns:

  • {int64}: boxed int64_t

Function util.caseless_hash_fast(str[, seed])

Calculates caseless non-crypto hash from a string or rspamd text

Parameters:

  • str: string or lua_text
  • seed: optional seed (0xdeadbabe by default)

Returns:

  • {number}: number from int64_t
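
Example:

Since the hash ignores case, inputs that differ only in case are expected to hash to the same value. A minimal sketch, assuming it runs inside rspamd's Lua environment; the input strings and the seed 42 are arbitrary.

```lua
local util = require "rspamd_util"

-- The fast variant returns plain Lua numbers, so values compare directly
local h1 = util.caseless_hash_fast("Example.COM")
local h2 = util.caseless_hash_fast("example.com")
print(h1 == h2) -- expected to be true, as case is ignored

-- An explicit seed replaces the default one
local h3 = util.caseless_hash_fast("example.com", 42)
print(h3)
```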

Function util.get_hostname()

Returns hostname for this machine

Parameters:

No parameters

Returns:

  • {string}: hostname
