Aspose.Words, Find All Emails in Word Document

Aspose.Words component enables reading, editing and printing of Word documents and converting of Word document to other formats, without using of Microsoft Office Automation.

By using of Aspose.Words instead of Microsoft Office Automation we get faster, more secure, stable and scalable solution. In this article, I will explain how to extract all email addresses and other valuable information from selected Word document.

Please download sample Visual Studio 2015 project, used in this article from AsposeWords Get Emails From Word Example. Zip package also contains sample SampleDocWithEmails.docx Word document you can use to test the project.

To get all emails from Microsoft Word document, follow this process: Open Visual Studio and create new Windows Forms project. In order to start work with Aspose.Words, first step is to add reference in your project like on image bellow:

Add reference to project

Now, find Aspose.Words.dll file on location where you installed Aspose.Words (by default it should be in /Program Files (x86) folder. After you select the Aspose.Words.dll file, click OK and your project References list should look like this:

Aspose.Words Reference

Aspose.Words Reference

Now you can use Aspose.Words component in your project. First create some user interface like in next image:

Aspose.Words project UI

Aspose.Words project UI

The idea of project is pretty simple: User will first click on Select Document button to find Word document. Then, click on “Get Emails From Document” button should read the document, extract email addresses and show result in Emails found text box on the right side.

Code for “Select Document” button shows open file dialog, so user can find the document:

private void btnSelectFile_Click(object sender, EventArgs e)
{
    OpenFileDialog docDialog = new OpenFileDialog();
    DialogResult result = docDialog.ShowDialog(); // Show the dialog.
    if (result == DialogResult.OK) // Test result.
    {
        tbFileName.Text = docDialog.FileName;
    }
}

After file is selected, click on “Get Emails From Document” button will process the Word document and extract wanted data:

private void btnGetEmails_click(object sender, EventArgs e)
{
    // Check if file is selected and exists, exit procedure if not
    if (!validate())
    {
        return;
    }

    // Create new instance of Word document
    Aspose.Words.Document doc = new Document(tbFileName.Text);
    // Read text from document
    string docText = doc.GetText();
    // Get emails to list
    List<string> emails = getEmailsFromString(docText);
    // Show found emails on form
    tbEmails.Text = String.Join(Environment.NewLine, emails.Distinct().ToArray());
}

Please notice validate() function on the top, which we use first to ensure that correct file is selected, and inform user about appropriate action to correct the problem:

private bool validate()
{
    if (tbFileName.Text == "")
    {
        MessageBox.Show("Please select Word document first.", "Warning", MessageBoxButtons.OK);
        return false;
    }
    else if (!File.Exists(tbFileName.Text))
    {
        MessageBox.Show("Selected file does not exist", "Warning", MessageBoxButtons.OK);
        return false;
    }
    return true;
}

To get emails from text, I use .Net Regular Expressions. To keep main function clean, extracting of emails is encapsulated in getEmailsFromString() function:

private List<string> getEmailsFromString(string text)
{
    List<string> emails = new List<string>();

    Regex emailPattern = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
    RegexOptions.IgnoreCase);
    // find emails
    MatchCollection emailsFound = emailPattern.Matches(text);

    foreach (Match email in emailsFound)
    {
        emails.Add(email.Value);
    }

    return emails;
}

And that’s all! The project is ready to test now. Start the project and find some Word document on disk, or use sample SampleDocWithEmails.docx file. Click on Get Emails button will find all email addresses in document and display them in text box on the right side.

Conclusion

As you can see, with help of Aspose.Words component, reading and extracting data from Word document is incredibly easy and fast process. You can use this approach to get any other kind of valuable information from single document like URLs, phone numbers and any other formatted data. As an exercise you can try to change project code to process all documents in selected folder.

Aspose.Words.Document is main class which represents Word document. It contains over a hundred of properties and methods which can be used to read and manipulate Word document in any possible way.

Happy Coding!

T-SQL Update Table From Another Table

Let say you have two tables and you want to update values in first table with values from second table. That could be for example if you have Product table and you have to load new prices and discounts from Import_Prices table.

The T-SQL code to update data could look like this:

UPDATE Product
SET Product.Price = Import_Prices.Price,
    Product.Discount = Import_Prices.Discount
FROM Product INNER JOIN Import_Prices
   ON Product.ID = Import_Prices.ProductID

In this case column ID of table Product corresponds to column ProductID of table Import_Prices .

Be careful to define relationship correctly and eventually filter only needed rows with WHERE clause. UPDATE command could change all table rows if you don’t narrow it correctly, so be sure that you have backup created before using ad-hoc UPDATE query :)

ASP.NET MVC @Html.EditorFor TextArea

@Html.EditorFor creates single line text box for string data by default. That is good for short fields like user name or article title, but not comfortable for longer multi line content.

For long fields, using of textarea tag is more suitable solution. To inform Razor to use multi line textarea instead of single line text box, add this attribute to your model property:

[DataType(DataType.MultilineText)]
public string LongText { get; set; }

When you use DataType.MultilineText parameter, Razor will render textarea tag instead of single-line text box. That’s good enough in many cases, but sometimes you want more control over width, height and other styles of textarea. You can change text area styles with CSS:

textarea
{
    width: 600px;
    height: 300px;
}

This CSS will adjust all textarea tags to same styles. If you want different styles for text fields, define CSS class in view in EditorFor method like this:

@Html.EditorFor(model => model.LongText, new { htmlAttributes = new { @class = "long-text-editor" } })

Another option is to use @Html.TextAreaFor instead of @Html.EditorFor. Advantage of this method is that you don’t need to change model class. That could be preferred option if you work with automatically generated model, like when Entity Framework database first is used. Code in view could look like this:

@Html.TextAreaFor(model => model.LongText, 10, 80, new { htmlAttributes = new { @class = "form-control" } })

Please notice there are 4 parameters used for TextAreaFor method. Second and third parameters define rows and columns of textarea. Sample above will create text area of 10 rows and 80 columns. So, now you have textarea without need to change model.

Happy coding!

T-SQL Dynamic Order By

When you have some list of records in web or Windows application, sooner or later your client will say: Hey could you make columns sortable? It would be great if I could just click on header and order rows?

When coders try to implement dynamic Order By in T-SQL, first logical approach is to create variable and place column name in it. For example:

SELECT * FROM SomeTable ORDER BY @Variable

It seems logical on first sight, but this approach is not working. Variable is not understood as column, but as string value. Thus, rows will be ordered by constant. In other words, rows will not be ordered on any way.

So, what to do? Fortunately, there is better solution than writing several same queries with different ORDER BY clause. Check this example with using of CASE:

CREATE PROCEDURE uspGetOrderedItems
(
  @sortField VARCHAR(100) = 'ItemName',
  @sortDirection VARCHAR(4) = 'DESC'
)
AS
BEGIN         
  SET NOCOUNT ON

  SELECT * FROM Item 

  ORDER BY 
    (CASE WHEN @sortField = 'ItemName' AND @sortDirection = 'ASC' THEN ItemName END) ASC,
    (CASE WHEN @sortField = 'ItemName' AND @sortDirection = 'DESC' THEN ItemName END) DESC,
    (CASE WHEN @sortField = 'Price' AND @sortDirection = 'ASC' THEN Price END) ASC,
    (CASE WHEN @sortField = 'Price' AND @sortDirection = 'DESC' THEN Price END) DESC;
END

One more variation of same idea is to use ROW_NUM() function. If table contains lots of records, it’s better to provide paging. It’s usually bad idea to show thousands or millions of records in grid. For paging and ordering at same time, we can use ROW_NUMBER() function.

Here is example stored procedure which returns data, both paged and ordered, based on 4 input parameters:

CREATE PROCEDURE uspGetOrderedItemsWithPaging
(
  @sortField VARCHAR(100) = 'ItemName',
  @sortDirection VARCHAR(4) = 'DESC',
  @startRow int = 1,
  @endRow int = 20
)
AS
BEGIN         
  SET NOCOUNT ON

  SELECT * FROM 
    (
    SELECT ROW_NUMBER() OVER (	
      ORDER BY 
        (CASE WHEN @sortField = 'ItemName' AND @sortDirection = 'ASC' THEN ItemName END) ASC,
        (CASE WHEN @sortField = 'ItemName' AND @sortDirection = 'DESC' THEN ItemName END) DESC,
        (CASE WHEN @sortField = 'Price' AND @sortDirection = 'ASC' THEN Price END) ASC,
        (CASE WHEN @sortField = 'Price' AND @sortDirection = 'DESC' THEN Price END) DESC
      ) AS ROW_NUM, *
    FROM Item
    ) Result
    WHERE ROW_NUM BETWEEN @startRow AND @endRow;

END

In addition to these 4 parameters, based on your business needs you could add additional parameters which will be used in WHERE clause to filter the data.

How To Print Only Part Of Web Page

I got a request to implement print feature on page, but without opening new window. Instead of classic using of window.open and then window.print methods, all should stay on same page. Only content of one DIV tag should be printed.

After some search on Google I found solutions that use JavaScript, but personally I don’t like the idea to remove content of page to string, print what is needed and then take content back. It’s complicated and it could break something else on page.

Here is better solution to print one DIV only, without JavaScript. I organized HTML like this:

<header>
    Site header
</header
<section>
    Main template section
</section>
<footer>
    Site footer
</footer>

<div id="printDoc">
    Div which will be printed
</div>

Now, add this CSS:

#printTicket{
    display:none;
}

@media print {
  header, section, footer {
    display:none;
  }

  #printDoc {
    display:block;
  }
}

So, the idea is simple. Use @media print in CSS to define what will be shown in print time.

You can even create Print Preview feature on page. With JavaScript or jQuery place DIV on center of the screen. Add Print button which will call window.print(); method to print the window. Of course, only single DIV will be printed. :)

Happy coding!

Using DATEADD To Increase Or Decrease DateTime Value On SQL Server

If you want to add or subtract some time from DateTime value, SQL Server has very useful DATEADD function.

For example, to add two days to SomeDate field, use this code:

SELECT DATEADD(dd, 2, SomeDate) FROM SomeTable;

 

To subtract time, just use negative value of second parameter.

Syntax of DATEADD function is:

DATEADD( TimeIntervalName, Value, FieldOrVariable )

In previous example, d is abbreviation for day. Here is complete list of interval names and their abbreviations that can be used with DATEADD function:

Nanosecond: ns
Microsecond: mcs
Millisecond:  ms
Second:  ss or s
Minute: mi or n
Hour: hh
WeekDay: dw or w
Week: wk or ww
Day: dd or d
DayOfYear: dy or y
Month:  mm or just m
Quarter:  qq or q
Year:  yy or yyyy

Note that Nanosecond and Microsecond can be used only on SQL Server 2008 or later.

T-SQL To Create Full Text Catalog On SQL Server Express

SQL Express Management Studio has one irritating behavior when you try to create Full Text catalog. Even if you install SQL Server with advanced services and FullText is installed, Management Studio will still return an error like this:

SQL Server Express Create Full Text Catalog Error

SQL Server Express Create Full Text Catalog Error

Solution is to create Full Text Catalog directly with SQL query:

USE YourDatabaseName;
GO
CREATE FULLTEXT CATALOG ftMainCatalog AS DEFAULT;

After catalog is created, you can edit in Management Studio by using right click -> Properties, where you can add tables and columns to catalog by using Management Studio GUI.

How To Implement Google Analytics Tracking in WordPress Website

Google Analytics is powerful and free tool which provides all possible information about website audience.

To install Analytics tracking code on WordPress website, there are two basic ways:

Use Google Analytics Plugin

The method is to use specialized plugin. One very popular is plugin Google Analytics by Yoast. It has a lot of options, and even a payable Premium edition. Yoast is integrated with Google Analytics and provides a lot of data inside WordPress backend.

If you are not technical person, their free version is probably method you should choose. Just install plugin from Plugins section of Dashboard, set Analytics login information and you’re done.

Insert Google Analytics tracking code directly in theme code

For me, it’s a little too much to install big plugin just to place few lines of tracking code. Yoast has additional features, like showing of Analytics data directly in WordPress Dashboard. But, I prefer to load directly to Analytics website since there I see more information.

One option is to insert tracking code to footer.php or header.php file of WordPress theme. Go to Dashboard -> Appearance -> Themes -> Editor page and click on Header on file list on right side. Place Google Analytics code just before ending </head> tag. Click Update File button to save the changes.

This option is also pretty easy and fast, and there is no additional plugin involved. However, there is one flaw. Since you edit theme file, if you update theme later to new version, that will remove tracking code too. So, you can either not update theme after you install Analytics tracking code, or add it again on same way after theme update.
Sometimes this is good enough, like when you use custom theme and have full control over update process.

Finally, I found Super Simple Google Analytics. It’s small plugin which only inserts Google tracking code on right place and nothing else. On this way you can avoid huge plugin and also Analytics tracking code will survive theme update.

How To Colorize Source Code Snippet In WordPress Post

There are lots of WordPress plugins for colorizing source code syntax, and many of them are pretty good.

Two most impressive I found are Crayon Syntax Highlighter and SyntaxHighlighter.

Both are great, and it’s hard to decide which is better. Both probably have more options than common user needs. Finally, I decided to use Crayon because it has slightly better ranking in WordPress plugins directory. Here is one small sample:

// Hello World in JavaScript
alert('Hello World!');

Websites hosted on WordPress.com have pre-installed syntax plugin. To place source code in article on blog hosted on WordPress.com, just use [sourcecode] or just [code] tag in text view. For previous code example if your blog is on WordPress.com you, should write something like this:

[code language=”javascript”]
// Hello World in JavaScript
alert(‘Hello World!’);
[/code]

More info about this feature you can find on http://en.support.wordpress.com/code/posting-source-code/