Imap Spam Filter
Overview
Enables the imap spam filter. Please read carefully. Click an example image to open the full size version in a new tab/window.
Understanding Spam Filter - Read First
Before you change any spam filter settings, please click here to better understand the spam filter system.
Base Settings
Enable Learning - Enables the learning filters. The filter learns automatically when tickets are accepted or rejected on the spam tickets screen.
Spam Relevancy Tokens - Sets how many tokens b8 uses to calculate the spamminess of a text. The default is 15 (integer), which is a reasonable value. With too many tokens, the filter will fail on texts padded with filler or with passages from a newspaper, etc., that are not very spammish.
Spam Score Deviation - Defines the minimum deviation from 0.5 that a token's rating must have to be considered when calculating the spamminess. Tokens with a rating closer to 0.5 than this value are skipped. To disable this feature, set it to 0. Defaults to 0.5 (float).
Gary Robinsons X Constant - This is Gary Robinson's X constant. A completely unknown token is rated with this value. The default of 0.5 (float) is reasonable, as we can't say whether a token that also can't be rated by degeneration is good or bad. If you receive much more spam than ham, or vice versa, you can change this setting accordingly.
Gary Robinsons S Constant - This is Gary Robinson's S constant. It is essentially the probability that the "Gary Robinsons X Constant" value is correct for a completely unknown token. It also shifts the rating of rarely seen tokens towards that value. The default is 0.3 (float).
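How these four settings interact can be sketched as follows. This is a minimal illustration based on Gary Robinson's published token formula, f(w) = (s * x + n * p) / (s + n), and is not b8's actual code. The function names, the (p, n) input format and the example numbers are assumptions made for illustration only.

```python
def robinson_rating(p, n, s=0.3, x=0.5):
    """Rating of one token.

    p: observed spam probability of the token (0..1)
    n: how often the token has been seen during learning
    s: S constant, the weight given to the assumed rating x (must be > 0)
    x: X constant, the rating assumed for a completely unknown token
    """
    return (s * x + n * p) / (s + n)


def relevant_ratings(ratings, min_dev, max_tokens=15):
    """Skip ratings closer to 0.5 than min_dev, then keep the
    max_tokens ratings that deviate most from 0.5."""
    kept = [r for r in ratings if abs(r - 0.5) >= min_dev]
    kept.sort(key=lambda r: abs(r - 0.5), reverse=True)
    return kept[:max_tokens]


# Hypothetical (p, n) pairs; the last one is a token never seen before.
ratings = [robinson_rating(p, n) for p, n in [(0.99, 40), (0.5, 3), (0.02, 25), (1.0, 0)]]
print(relevant_ratings(ratings, min_dev=0.2))
```

A token that has never been seen (n = 0) is rated exactly x, and the more often a token has been seen, the closer its rating moves to its observed probability p.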
Lexer Settings
Check Pure Numbers - Whether tokens consisting only of digits are also considered.
Check URIs (Uniform Resource Identifiers) - Whether the lexer looks for URIs in the text.
Extract HTML - Whether HTML in the text is also extracted.
Min Token Length - The minimum length a token must have to be considered when calculating the rating of a text.
Max Token Length - The maximum length a token may have to be considered when calculating the rating of a text.
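The effect of the lexer settings can be illustrated with a small tokenizer. This is a rough sketch, not b8's lexer: the regular expressions, the function name tokenize, the parameter names and the length defaults of 3 and 30 are all assumptions made for illustration.

```python
import re

# Simplified patterns (assumptions): http(s) URIs and anything that looks like a tag.
URI_RE = re.compile(r"https?://[^\s<>\"']+")
TAG_RE = re.compile(r"<[^>]+>")


def tokenize(text, min_len=3, max_len=30,
             check_numbers=False, check_uris=True, extract_html=True):
    """Split text into tokens according to the lexer switches."""
    candidates = []
    if check_uris:
        # Collect whole URIs as tokens and remove them from the text.
        candidates += URI_RE.findall(text)
        text = URI_RE.sub(" ", text)
    if extract_html:
        # Collect HTML tags as tokens and remove them from the text.
        candidates += TAG_RE.findall(text)
        text = TAG_RE.sub(" ", text)
    candidates += re.findall(r"\w+", text)

    tokens = []
    for token in candidates:
        if not min_len <= len(token) <= max_len:
            continue  # outside Min/Max Token Length
        if token.isdigit() and not check_numbers:
            continue  # pure number, and Check Pure Numbers is off
        tokens.append(token)
    return tokens


print(tokenize("Buy now <b>cheap</b> pills 12345 at http://x.io/a ok"))
```

With the switches turned off, the URI and the markup are simply split into ordinary words, and most of the pieces are then dropped by the minimum length.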
Degenerator Settings
Enable Multibyte Operations - Use multibyte operations when searching for degenerated versions of an unknown token. When this is enabled, b8 requires PHP's mbstring module.
Internal Encoding Set for Multibyte Operations - Using multibyte operations makes the degenerator more effective for foreign characters. Specify your preferred encoding set. The drop down lists only the encodings supported by the server.
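Why multibyte operations matter can be shown with a small sketch. It is not b8's degenerator: the function name degenerate, its parameters and the choice of case variants are assumptions, and the byte-wise branch merely imitates what happens when case conversion only understands single-byte characters.

```python
def degenerate(token, multibyte=True, encoding="utf-8"):
    """Return case variants of a token that could be looked up
    when the token itself is unknown."""
    if multibyte:
        # Character-aware conversion: works for non-ASCII letters too.
        variants = [token.lower(), token.upper(), token.capitalize()]
    else:
        # Byte-wise conversion: only ASCII letters change case.
        raw = token.encode(encoding)
        variants = [v.decode(encoding, errors="replace")
                    for v in (raw.lower(), raw.upper(), raw.capitalize())]

    result = []
    for variant in variants:
        if variant != token and variant not in result:
            result.append(variant)
    return result


print(degenerate("über"))                    # character-aware variants
print(degenerate("über", multibyte=False))   # the "ü" is left untouched
```

Without multibyte handling, the variants of a token containing non-ASCII letters are incomplete, so a match against the learned tokens is less likely.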
Learning Options - Add to Learning Filters
Enter Text Block (ie: Email Message Body) - Enter the block of text to add to the learning filters.
Analyse & Classify as Spam - Use this option to classify the entered text as spam.
Analyse & Classify as Ham - Use this option to classify the entered text as ham.
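Conceptually, classifying a text block as spam or ham updates per-token counters that later feed the ratings. The following sketch shows the idea only. The class LearningStore, its methods and the whitespace tokenization are hypothetical and do not reflect how the learning filters are actually stored.

```python
from collections import Counter


class LearningStore:
    """Keeps, per token, how many spam and ham texts contained it."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def learn(self, text, category):
        if category not in ("spam", "ham"):
            raise ValueError("category must be 'spam' or 'ham'")
        counts = self.spam if category == "spam" else self.ham
        # Count each token once per text, ignoring case.
        counts.update(set(text.lower().split()))

    def counts(self, token):
        """Return (spam_count, ham_count) for a token."""
        token = token.lower()
        return self.spam[token], self.ham[token]


store = LearningStore()
store.learn("Cheap pills", "spam")
store.learn("Meeting notes cheap", "ham")
print(store.counts("cheap"), store.counts("pills"))
```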
Learning Options - Reset Learning Filters
Reset Learning Filters - Deletes the learning filters.
Reset Filters Older Than X Days - Only resets filters older than X days. It can be efficient to occasionally clear old learning filters and remove old entries from the database.
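The difference between a full reset and an age-limited reset can be sketched like this. The function reset_filters and the token-to-timestamp dictionary are assumptions for illustration. How the entries are really stored and dated is not described here.

```python
from datetime import datetime, timedelta


def reset_filters(entries, older_than_days=None, now=None):
    """Return the entries that survive a reset.

    entries: dict mapping a token to the datetime it was last learned
    older_than_days: None resets everything, otherwise only entries
                     older than this many days are removed
    """
    if older_than_days is None:
        return {}
    now = now or datetime.now()
    cutoff = now - timedelta(days=older_than_days)
    return {token: seen for token, seen in entries.items() if seen >= cutoff}


entries = {"old": datetime(2023, 12, 1), "new": datetime(2024, 1, 30)}
print(reset_filters(entries, older_than_days=30, now=datetime(2024, 1, 31)))
```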
Learning Options - Skip Filters
Skip Filters - If a match is found, the message is always ignored and deleted. Any text entered here automatically flags a message for deletion, so use this with caution. Separate multiple entries with a comma.
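A short sketch shows why skip filters call for caution. It assumes simple case-insensitive substring matching, which may differ from the real matching rules. The function names are hypothetical.

```python
def parse_skip_filters(raw):
    """Split the comma delimited setting into clean, lowercased entries."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def should_skip(message, raw_filters):
    """True if any skip filter entry occurs anywhere in the message."""
    body = message.lower()
    return any(entry in body for entry in parse_skip_filters(raw_filters))


# "free" also matches inside "freedom", so a harmless message is deleted.
print(should_skip("Enjoy your freedom this weekend", "free, casino"))
```

Under this assumption a short entry can match inside longer, unrelated words, so longer and more specific entries are safer.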
Imap Logs
All operations of the imap filters are logged if logging is enabled in the settings (Settings > General > Imap Settings). If you find something has been caught by the spam filters, view the logs for more information.
Enable / Disable Spam Filter
Spam filters are enabled / disabled per account. More info here.
