BayesFilter Class
Provides properties and methods for checking e-mail messages for spam probability and for training the filter on known spam and non-spam messages.
Inheritance Hierarchy
System.Object
  MailBee.AntiSpam.BayesFilter

Namespace: MailBee.AntiSpam
Assembly: MailBee.NET (in MailBee.NET.dll) Version: 11.2.0 build 590 for .NET 4.5
Syntax
public class BayesFilter

The BayesFilter type exposes the following members.

Constructors
  Name / Description
Public method: BayesFilter()
    Creates an instance of the BayesFilter class.
Public method: BayesFilter(String)
    Creates and unlocks an instance of the BayesFilter class.
Methods
  Name / Description
Public method: Equals
    Determines whether the specified object is equal to the current object.
    (Inherited from Object.)
Protected method: Finalize
    Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
    (Inherited from Object.)
Public method: GetHashCode
    Serves as the default hash function.
    (Inherited from Object.)
Public method: GetType
    Gets the Type of the current instance.
    (Inherited from Object.)
Public method: LoadDatabase(Stream, Stream)
    Loads the Bayesian database from streams.
Public method (code example): LoadDatabase(String, String)
    Loads the Bayesian database from disk.
Public method: LoadDatabaseAsync(Stream, Stream)
    async/await version of LoadDatabase(Stream, Stream).
Public method: LoadDatabaseAsync(String, String)
    async/await version of LoadDatabase(String, String).
Protected method: MemberwiseClone
    Creates a shallow copy of the current Object.
    (Inherited from Object.)
Public method (code example): SaveDatabase(Stream, Stream)
    Saves the Bayesian database to streams.
Public method (code example): SaveDatabase(String, String)
    Saves the Bayesian database to disk.
Public method: SaveDatabase(Stream, Stream, Int32, Boolean)
    Compacts the database by removing non-significant data and saves the database to streams.
Public method (code example): SaveDatabase(String, String, Int32, Boolean)
    Compacts the database by removing non-significant data and saves the database to disk.
Public method: SaveDatabaseAsync(Stream, Stream)
    async/await version of SaveDatabase(Stream, Stream).
Public method: SaveDatabaseAsync(String, String)
    async/await version of SaveDatabase(String, String).
Public method: SaveDatabaseAsync(Stream, Stream, Int32, Boolean)
    async/await version of SaveDatabase(Stream, Stream, Int32, Boolean).
Public method: SaveDatabaseAsync(String, String, Int32, Boolean)
    async/await version of SaveDatabase(String, String, Int32, Boolean).
Public method (code example): ScoreMessage
    Analyses the message and returns the probability of the message being spam.
Public method: ToString
    Returns a string that represents the current object.
    (Inherited from Object.)
Public method (code example): TrainFilter
    Trains the filter on the specified message as either spam or non-spam.
Properties
  Name / Description
Public property: Algorithm
    Gets or sets the algorithm to be used for scoring messages.
Public property: AutoLearning
    Gets or sets whether the filter should automatically learn, while scoring messages, from words with sufficient spam/non-spam weight.
Public property: AutoLearningGradeAbove
    Gets or sets the minimum score a message must get for the filter to treat it as spam and automatically learn from it.
Public property: AutoLearningGradeBelow
    Gets or sets the maximum score a message may get for the filter to treat it as non-spam and automatically learn from it.
Public property, static member (obsolete): LicenseKey
    Assigns the license key.
Public property: OnLockedDatabase
    Gets or sets the application-supplied method to be executed if MailBee finds that the spam or non-spam database file is locked when it needs to perform an I/O operation on it.
Public property: TrialDaysLeft
    Gets the number of days left until the trial license key expires.
Remarks

The filter uses the existing database of spam and non-spam messages to score new messages in the range of 0-100%. A score of 0% means the message is definitely not spam; 100% means it is definitely spam.

The key step is teaching the filter which messages are spam and which are not. This is called training the filter.

Initially, while the database is empty, the filter has no existing messages to compare the message in question against.

Thus, the first task is to train the filter with several hundred typical spam and non-spam messages of the kind you usually receive. This raises the filter's efficiency from zero to a usable level. Training should, however, continue afterwards to further improve the quality of spam recognition. The larger the database, the better the spam/non-spam recognition.
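
One way to keep training going after the initial pass is the auto-learning properties listed in the Properties table. The following is only a minimal sketch: it assumes AutoLearning is a Boolean and that AutoLearningGradeAbove and AutoLearningGradeBelow accept whole-number percentages. The values 95 and 5 and the helper name AutoLearningSetup are hypothetical, chosen for illustration and not taken from the MailBee documentation.

```csharp
using MailBee.AntiSpam;

static class AutoLearningSetup
{
    // Hypothetical helper: returns a filter configured to learn automatically
    // from messages it scores as clearly spam or clearly non-spam.
    public static BayesFilter CreateFilter()
    {
        BayesFilter filter = new BayesFilter();

        // Assumption: AutoLearning is a bool property.
        filter.AutoLearning = true;

        // Assumption: both thresholds are on the same 0-100 scale as the score.
        // 95 and 5 are illustrative values only.
        filter.AutoLearningGradeAbove = 95; // learn as spam at or above this score
        filter.AutoLearningGradeBelow = 5;  // learn as non-spam at or below this score

        return filter;
    }
}
```

Auto-learning only reinforces what the filter already believes, so it is probably best treated as a supplement to explicit training on verified messages, not a replacement for it.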

The filter will operate correctly ONLY if it has been trained with a good number of spam AND non-spam messages.

You can train the filter using the TrainFilter(MailMessage, Boolean) method.

To score a message (determine whether it is spam or not), use the ScoreMessage(MailMessage) method.
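
Since ScoreMessage returns a probability and not a yes/no answer, the application has to pick its own cut-off. The sketch below shows one possible way to do that. The SpamCheck class, the IsSpam helpers and the threshold of 90 are hypothetical, and Convert.ToDouble is used because this page does not state the exact numeric type ScoreMessage returns.

```csharp
using System;
using MailBee.Mime;
using MailBee.AntiSpam;

static class SpamCheck
{
    // Hypothetical cut-off for illustration; tune it against your own mail.
    const double SpamThreshold = 90;

    // Pure helper: turns a 0-100 score into a yes/no decision.
    public static bool IsSpam(double score, double threshold = SpamThreshold)
    {
        return score >= threshold;
    }

    // Loads an .EML file, scores it with an already trained filter,
    // and applies the cut-off above.
    public static bool IsSpam(BayesFilter filter, string emlPath)
    {
        MailMessage msg = new MailMessage();
        msg.LoadMessage(emlPath);

        // Assumption: the returned score is numeric and on a 0-100 scale.
        double score = Convert.ToDouble(filter.ScoreMessage(msg));
        return IsSpam(score);
    }
}
```

A lower threshold catches more spam at the cost of more false positives; which trade-off is acceptable depends on the application.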

Note
MailBee also supports other technologies that can serve as antispam checks. They include the DNS RBL filter (RblFilter class), full support for DomainKeys (signing and verifying e-mails using DomainKeys signatures), and DNS MX/Reverse DNS checks (see the GetMXHosts(String) and GetPtrData(String) methods).
Examples

This sample trains the Bayesian filter on spam and non-spam messages, saves the resulting database, and then scores sample e-mails for spam probability using that database.

It is assumed that the spam and non-spam samples are .EML files located in the C:\AntiSpam\Spam and C:\AntiSpam\NonSpam folders respectively, and that the e-mails to be scored are .EML files in C:\AntiSpam\Emails. The database itself (spam.dat and nonspam.dat) will be saved in the C:\AntiSpam folder.

// To use the code below, import these namespaces at the top of your code.
using System;
using System.IO;
using MailBee.Mime;
using MailBee.AntiSpam;

class Sample
{
    static void Main(string[] args)
    {
        BayesFilter filter = new BayesFilter();
        MailMessage msg = new MailMessage();

        // Train Bayesian filter for spam messages.
        string[] files = Directory.GetFiles(@"C:\AntiSpam\Spam", "*.eml");
        foreach (string file in files)
        {
            msg.LoadMessage(file);
            filter.TrainFilter(msg, true); // Mark as spam.
        }

        // Train Bayesian filter for non-spam messages.
        files = Directory.GetFiles(@"C:\AntiSpam\NonSpam", "*.eml");
        foreach (string file in files)
        {
            msg.LoadMessage(file);
            filter.TrainFilter(msg, false); // Mark as non-spam.
        }

        // Save Bayesian database to disk.
        filter.SaveDatabase(@"C:\AntiSpam\spam.dat", @"C:\AntiSpam\nonspam.dat");

        // Test our emails for spam.
        files = Directory.GetFiles(@"C:\AntiSpam\Emails", "*.eml");
        foreach (string file in files)
        {
            msg.LoadMessage(file);
            Console.WriteLine("Spam probability is: {0}%", filter.ScoreMessage(msg));
        }
    }
}
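
Once the database has been saved as in the sample above, a later run does not need to retrain from scratch. The following sketch reloads the saved files with LoadDatabase(String, String) and scores a single message. It assumes the arguments are passed in the same order as for SaveDatabase in the sample (spam file first, then non-spam file); the file name test.eml is hypothetical.

```csharp
using System;
using MailBee.Mime;
using MailBee.AntiSpam;

class ScoreWithSavedDatabase
{
    static void Main(string[] args)
    {
        BayesFilter filter = new BayesFilter();

        // Assumption: same argument order as SaveDatabase in the sample above
        // (spam database first, non-spam database second).
        filter.LoadDatabase(@"C:\AntiSpam\spam.dat", @"C:\AntiSpam\nonspam.dat");

        // Score one message against the previously trained database.
        MailMessage msg = new MailMessage();
        msg.LoadMessage(@"C:\AntiSpam\Emails\test.eml"); // hypothetical file name
        Console.WriteLine("Spam probability is: {0}%", filter.ScoreMessage(msg));
    }
}
```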
See Also