MailBee.NET HTML Component

Parse or adjust HTML data, access its DOM structure, strip unwanted content, extract BODY section, find images or links
The MailBee.NET Objects bundle includes the SMTP, POP3, IMAP, EWS, Security, Antispam, Outlook Converter, Address Validator, and PDF components, plus the BounceMail, HTML, MIME, and ICalVCard components, which are provided at no additional cost.

The MailBee.NET HTML component lets .NET applications process HTML documents, such as the HTML body of an e-mail, in a number of ways. It parses HTML data and exposes its Document Object Model (DOM). You can traverse this DOM structure, examining or adjusting HTML tags and their content, or simply define rules to be applied to every tag (e.g. remove all <SCRIPT> blocks and any tag which contains JavaScript in its attributes).

In effect, MailBee.NET HTML does for HTML what XML parsers do for XML documents. Unlike XML parsers, however, it copes with HTML documents which do not conform to the XHTML standard: unclosed or overlapping tags and other formatting problems won't break it.

This component is included free of charge with MailBee.NET Objects: you can use it with any licensed component of the MailBee.NET Objects family at no additional cost.

Typically, you will use MailBee.NET HTML with the MailBee.NET POP3 or MailBee.NET IMAP components and their supporting MIME classes. For instance, the MIME classes already provide built-in functionality to make links open in a new window. With MailBee.NET HTML, you can also remove any unsafe content. Here is how to do both for an e-mail downloaded from a POP3 server:

// Download last e-mail from inbox (assuming pop is Pop3 instance)
MailMessage msg = pop.DownloadEntireMessage(pop.InboxMessageCount);
// Use the MIME parser to strip the selected attributes from A HREF tags
// and insert target=_blank instead.
msg.Parser.AHRefCleanup = AHRefTagAttributes.ClassAndStyle |
    AHRefTagAttributes.Onclick | AHRefTagAttributes.Target;
msg.Parser.AHRefSuffix = "target=_blank";
// Also use MailBee.NET HTML to remove all unsafe stuff from the HTML body.
Processor htmlProcessor = new Processor();
htmlProcessor.Dom.OuterHtml = msg.BodyHtmlText;
RuleSet rules = RuleSet.GetSafeHtmlRules();
string result = htmlProcessor.Dom.ProcessToString(rules, null);
Console.WriteLine(result);

' Download last e-mail from inbox (assuming pop is Pop3 instance)
Dim msg As MailMessage = pop.DownloadEntireMessage(pop.InboxMessageCount)
' Use the MIME parser to strip the selected attributes from A HREF tags
' and insert target=_blank instead.
msg.Parser.AHRefCleanup = AHRefTagAttributes.ClassAndStyle Or _
    AHRefTagAttributes.Onclick Or AHRefTagAttributes.Target
msg.Parser.AHRefSuffix = "target=_blank"
' Also use MailBee.NET HTML to remove all unsafe stuff from the HTML body.
Dim htmlProcessor As Processor = New Processor()
htmlProcessor.Dom.OuterHtml = msg.BodyHtmlText
Dim rules As RuleSet = RuleSet.GetSafeHtmlRules()
Dim result As String = htmlProcessor.Dom.ProcessToString(rules, Nothing)
Console.WriteLine(result)

The code above assumes that the MailBee, MailBee.Mime, MailBee.Html, and MailBee.Pop3Mail namespaces are imported ("using" in C#, "Imports" in VB) and that the 'pop' variable refers to a MailBee.Pop3Mail.Pop3 instance in the authenticated state.

Current version: 11.0. Last update: 14 March 2017.

Common problems you can solve with MailBee.NET HTML classes:

  • The HTML e-mail body includes unsafe content such as JavaScript, applets, or iframes. You want to get rid of it.
  • Links in the HTML e-mail body open in the same window. You want them to open in a new window.
  • External images get displayed. You want to block them and display only the images embedded in the e-mail.
  • The HTML e-mail is a complete HTML document with <HTML> and <BODY> tags, but you need to display it inside another HTML document, such as your web page, which already has its own <HTML> and <BODY> tags. You want only the content of the e-mail's BODY tag, to avoid the incorrect formatting caused by multiple <HTML> and <BODY> sections on one page.
  • An HTML page with rich formatting contains useful data which you need to extract (for instance, if you’re working on a web crawler or search engine). You want to easily traverse the DOM to find the nodes containing the data you’re looking for.

Besides that, you can use MailBee.NET HTML to parse or adjust any HTML data, even data unrelated to e-mail. This is because MailBee.NET HTML works directly with strings and streams, not with the MailMessage object.
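
As a minimal sketch of that string-based usage, the snippet below runs the same Processor and RuleSet calls as the POP3 sample above on a plain HTML string, with no MailMessage involved. It assumes the MailBee.NET assembly is referenced and that whatever licensing setup your installation requires is already done. The class name and the input HTML are made up for illustration, and the exact output depends on the safe-HTML rules, so it is not shown here.

```csharp
using System;
using MailBee.Html;

// Hypothetical class name, used for illustration only.
class SanitizeHtmlString
{
    static void Main()
    {
        // Made-up input that never came from an e-mail.
        // Note the unclosed <b> tag and the <script> block.
        string html = "<html><body><p>Hello, <b>world</p>" +
            "<script>alert('x');</script></body></html>";

        // Same calls as in the POP3 sample, fed from a plain string.
        Processor htmlProcessor = new Processor();
        htmlProcessor.Dom.OuterHtml = html;
        RuleSet rules = RuleSet.GetSafeHtmlRules();
        string result = htmlProcessor.Dom.ProcessToString(rules, null);

        Console.WriteLine(result);
    }
}
```

The input deliberately contains malformed markup, since the component is described as tolerating unclosed tags. If your HTML comes from a file or a web response instead, read it into a string first and assign it the same way.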

Written in 100% managed code, MailBee.NET HTML requires only the .NET Framework to be installed on the computer.

The MailBee.NET HTML component can be used from any .NET language, including C# and VB.NET. Supported .NET Framework versions are 2.0, 3.0, 3.5, 4.0, 4.5, and 4.6, both 32-bit and 64-bit. It also supports Xamarin Mono, iOS, and Android.

Clients Say:

"I've been looking at the MailBee.NET application and love it." Matt Yeager
"Thanks again for your help. I like your products..." Ron Hill
"By the way I love your software. Great work thank you." Dennis Drogemuller