BayesFilter Class
Namespace: MailBee.AntiSpam
The BayesFilter type exposes the following members.
Name | Description
---|---
BayesFilter | Creates an instance of BayesFilter class.
BayesFilter(String) | Creates and unlocks an instance of BayesFilter class.
Name | Description
---|---
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetType | Gets the Type of the current instance. (Inherited from Object.)
LoadDatabase(Stream, Stream) | Loads Bayesian database from a stream.
LoadDatabase(String, String) | Loads Bayesian database from disk.
LoadDatabaseAsync(Stream, Stream) | async/await version of LoadDatabase(Stream, Stream).
LoadDatabaseAsync(String, String) | async/await version of LoadDatabase(String, String).
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
SaveDatabase(Stream, Stream) | Saves the Bayesian database to stream.
SaveDatabase(String, String) | Saves the Bayesian database to disk.
SaveDatabase(Stream, Stream, Int32, Boolean) | Compacts the database by removing non-significant data and saves the database to stream.
SaveDatabase(String, String, Int32, Boolean) | Compacts the database by removing non-significant data and saves the database to disk.
SaveDatabaseAsync(Stream, Stream) | async/await version of SaveDatabase(Stream, Stream).
SaveDatabaseAsync(String, String) | async/await version of SaveDatabase(String, String).
SaveDatabaseAsync(Stream, Stream, Int32, Boolean) | async/await version of SaveDatabase(Stream, Stream, Int32, Boolean).
SaveDatabaseAsync(String, String, Int32, Boolean) | async/await version of SaveDatabase(String, String, Int32, Boolean).
ScoreMessage | Analyses the message and returns the probability of the message being spam.
ToString | Returns a string that represents the current object. (Inherited from Object.)
TrainFilter | Learns from the specified message as from spam or non-spam source.
Name | Description
---|---
Algorithm | Gets or sets the algorithm to be used for scoring messages.
AutoLearning | Gets or sets whether the filter should automatically learn, while scoring messages, from the words with sufficient spam/non-spam weight.
AutoLearningGradeAbove | Gets or sets the minimum score a message must get for the filter to treat it as spam and automatically learn from it.
AutoLearningGradeBelow | Gets or sets the maximum score a message may get for the filter to treat it as non-spam and automatically learn from it.
LicenseKey | Obsolete. Assigns the license key.
OnLockedDatabase | Gets or sets the application-supplied method that is executed if MailBee finds that the spam or non-spam database file is locked when it needs to perform an I/O operation on it.
TrialDaysLeft | Gets the number of days left until the trial license key expires.
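The auto-learning properties listed above could be combined roughly as follows. This is a minimal sketch, not a recommended configuration: it assumes AutoLearning is a Boolean switch and that the two grade properties accept numeric values on the same 0-100 scale as the score. The values 95 and 5 are purely illustrative.

```csharp
using MailBee.AntiSpam;

class AutoLearningSample
{
    static void Main(string[] args)
    {
        BayesFilter filter = new BayesFilter();

        // Assumption: AutoLearning is a Boolean on/off switch.
        filter.AutoLearning = true;

        // Assumption: both thresholds use the 0-100 score scale.
        // 95 and 5 are illustrative values, not documented defaults.
        filter.AutoLearningGradeAbove = 95; // Learn as spam at or above this score.
        filter.AutoLearningGradeBelow = 5;  // Learn as non-spam at or below this score.
    }
}
```

Keeping the two thresholds far apart means that only messages the filter is already very sure about feed back into the database, which limits the risk of the filter reinforcing its own mistakes.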
The filter uses the existing database of spam and non-spam messages to score new messages on a scale of 0-100%. A score of 0% means the message is definitely not spam; 100% means it is definitely spam.
The key point is teaching the filter which messages are spam and which are not. This is called training the filter.
Initially, while the database is empty, the filter has no existing messages to compare the message in question against.
Thus, the first task is to train the filter with several hundred typical spam and non-spam messages of the kind you usually get. This raises the filter's efficiency from zero to a usable level. Training should, however, continue afterwards to further improve the quality of spam recognition. The larger the database, the better the spam/non-spam recognition.
The filter will operate correctly ONLY if it has been trained with a good number of spam AND non-spam messages.
You can train the filter using the TrainFilter(MailMessage, Boolean) method.
To score a message (determine whether it's spam or not), use the ScoreMessage(MailMessage) method.
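Because ScoreMessage returns a probability rather than a yes/no answer, the application has to choose its own cut-off. The sketch below loads a previously saved database, scores one message, and applies a threshold. The threshold of 80, the IsSpam helper, and the sample.eml file name are assumptions made for illustration; the database paths are the ones used by the sample on this page.

```csharp
using System;
using MailBee.Mime;
using MailBee.AntiSpam;

class ScoreSample
{
    // Hypothetical cut-off, not a MailBee default. Tune it on your own mail.
    const double SpamThreshold = 80.0;

    // Hypothetical helper: true when the score reaches the cut-off.
    public static bool IsSpam(double score, double threshold)
    {
        return score >= threshold;
    }

    static void Main(string[] args)
    {
        BayesFilter filter = new BayesFilter();

        // Reuse a database produced by an earlier training run.
        filter.LoadDatabase(@"C:\AntiSpam\spam.dat", @"C:\AntiSpam\nonspam.dat");

        MailMessage msg = new MailMessage();
        msg.LoadMessage(@"C:\AntiSpam\Emails\sample.eml"); // Hypothetical file name.

        // Convert.ToDouble keeps the sketch independent of the exact
        // numeric type that ScoreMessage returns.
        double score = Convert.ToDouble(filter.ScoreMessage(msg));

        Console.WriteLine("Score: {0}% -> {1}", score,
            IsSpam(score, SpamThreshold) ? "spam" : "not spam");
    }
}
```

A higher threshold lets more spam through but reduces the chance of flagging legitimate mail, which is usually the costlier mistake.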
Note |
---|
MailBee also supports other technologies that can serve as antispam checks. They include the DNS RBL filter (RblFilter class), full support for DomainKeys (signing and verifying e-mails using DomainKeys signatures), and DNS MX/Reverse DNS checks (see the GetMXHosts(String) and GetPtrData(String) methods). |
This sample trains the Bayesian filter on spam and non-spam messages, saves the resulting database, and scores sample e-mails for spam probability using this database.
It's assumed that the spam and non-spam samples are .EML files located in the C:\AntiSpam\Spam and C:\AntiSpam\NonSpam folders respectively. The e-mails to score are read from the C:\AntiSpam\Emails folder. The database itself (spam.dat and nonspam.dat) will be saved in the C:\AntiSpam folder.
// To use the code below, import these namespaces at the top of your code.
using System;
using System.IO;
using MailBee.Mime;
using MailBee.AntiSpam;

class Sample
{
    static void Main(string[] args)
    {
        BayesFilter filter = new BayesFilter();
        MailMessage msg = new MailMessage();

        // Train Bayesian filter for spam messages.
        string[] files = Directory.GetFiles(@"C:\AntiSpam\Spam", "*.eml");
        foreach (string file in files)
        {
            msg.LoadMessage(file);
            filter.TrainFilter(msg, true); // Mark as spam.
        }

        // Train Bayesian filter for non-spam messages.
        files = Directory.GetFiles(@"C:\AntiSpam\NonSpam", "*.eml");
        foreach (string file in files)
        {
            msg.LoadMessage(file);
            filter.TrainFilter(msg, false); // Mark as non-spam.
        }

        // Save Bayesian database to disk.
        filter.SaveDatabase(@"C:\AntiSpam\spam.dat", @"C:\AntiSpam\nonspam.dat");

        // Test our emails for spam.
        files = Directory.GetFiles(@"C:\AntiSpam\Emails", "*.eml");
        foreach (string file in files)
        {
            msg.LoadMessage(file);
            Console.WriteLine("Spam probability is: {0}%", filter.ScoreMessage(msg));
        }
    }
}
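Since training is meant to continue after the initial run, a later session might reload the saved database, learn from one newly reported message, and save again. The following is a sketch under stated assumptions: the HaveDatabase helper and the C:\AntiSpam\Reported\missed-spam.eml path are hypothetical, and it is assumed (not confirmed on this page) that TrainFilter called after LoadDatabase adds to the loaded data rather than replacing it.

```csharp
using System.IO;
using MailBee.Mime;
using MailBee.AntiSpam;

class RetrainSample
{
    // Hypothetical helper: true only if both database files already exist.
    public static bool HaveDatabase(string spamDb, string nonSpamDb)
    {
        return File.Exists(spamDb) && File.Exists(nonSpamDb);
    }

    static void Main(string[] args)
    {
        const string spamDb = @"C:\AntiSpam\spam.dat";
        const string nonSpamDb = @"C:\AntiSpam\nonspam.dat";

        BayesFilter filter = new BayesFilter();

        // Start from the earlier training results when they are available.
        if (HaveDatabase(spamDb, nonSpamDb))
        {
            filter.LoadDatabase(spamDb, nonSpamDb);
        }

        // Learn from a message a user reported as spam (hypothetical path).
        MailMessage msg = new MailMessage();
        msg.LoadMessage(@"C:\AntiSpam\Reported\missed-spam.eml");
        filter.TrainFilter(msg, true); // Mark as spam.

        // Persist the updated database for the next session.
        filter.SaveDatabase(spamDb, nonSpamDb);
    }
}
```

Checking for both files before loading avoids relying on how LoadDatabase behaves when a file is missing, which this page does not describe.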