NovelEssay.com Programming Blog

Exploration of Big Data, Machine Learning, Natural Language Processing, and other fun problems.

C# IQueryable with a Dynamic Predicate Builder for Entity Framework and Linq

Scenario:

You have a table in a web page (or WinForms) with filterable columns. A user may or may not add filter requirements, which makes your query's "where" clause very dynamic. This is easy to solve if you have just a few columns, but what if you have a dozen columns, any of which may or may not have a filter that needs to be applied to the "where" clause?


Solution:

Use the PredicateBuilder in the LinqKit NuGet package.


1) Install the LinqKit NuGet package.


2) Add the typical using statement for LinqKit:

using LinqKit;


3) Create a function that populates your PredicateBuilder and returns the filtered query:

IQueryable<person> GetPeoplePredicate(SearchParametersPeople searchParameters)
{
	var predicate = PredicateBuilder.True<person>();

	if (!string.IsNullOrEmpty(searchParameters.personName))
	{
		predicate = predicate.And(p => p.Name.Contains(searchParameters.personName));
	}
	if (!string.IsNullOrEmpty(searchParameters.personTitle))
	{
		predicate = predicate.And(p => p.JobTitle.Contains(searchParameters.personTitle));
	}
	if (!string.IsNullOrEmpty(searchParameters.personLocation))
	{
		predicate = predicate.And(p => p.Locations.Contains(searchParameters.personLocation));
	}
	if (!string.IsNullOrEmpty(searchParameters.personBio))
	{
		predicate = predicate.And(p => p.Bio.Contains(searchParameters.personBio));
	}
	// Entity Framework requires AsExpandable() so that LinqKit can expand the
	// combined predicate before the query is translated to SQL.
	return _db.people.AsExpandable().Where(predicate);
}

This example shows several Person fields that might be filterable. For each filter the user supplied, it calls predicate.And to append one more requirement to the "where" clause of our query. Because every condition is combined with "and", a row must match all of the active filters to be part of the result set. Filters the user left empty are simply skipped.

Notice that Entity Framework requires the AsExpandable() call. We are using EF in this example.
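To see why AsExpandable() matters, it helps to look at roughly how an "and" combinator for predicates can be written. The sketch below is not LinqKit's source. MiniPredicateBuilder, Person, and Demo are hypothetical names used only for illustration, and the code follows the common Expression.Invoke approach. The Invoke node it produces is the kind of expression that a query provider may be unable to translate directly, which is the situation AsExpandable() is meant to handle.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for illustration only; not LinqKit's actual source.
public static class MiniPredicateBuilder
{
    // Starting point for an "and" chain: a predicate that accepts every row.
    public static Expression<Func<T, bool>> True<T>() { return item => true; }

    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        // Invoke 'right' with the parameters of 'left' so both sides test the same row.
        var invokedRight = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, invokedRight), left.Parameters);
    }
}

public class Person
{
    public string Name { get; set; }
    public string JobTitle { get; set; }
}

public static class Demo
{
    public static void Main()
    {
        var predicate = MiniPredicateBuilder.True<Person>();
        predicate = predicate.And(p => p.Name.Contains("Ann"));
        predicate = predicate.And(p => p.JobTitle.Contains("Engineer"));

        // In memory we can simply compile the tree; a database provider has to translate it.
        Func<Person, bool> matches = predicate.Compile();
        Console.WriteLine(matches(new Person { Name = "Ann Lee", JobTitle = "Engineer" })); // True
        Console.WriteLine(matches(new Person { Name = "Ann Lee", JobTitle = "Designer" })); // False
    }
}
```

Each And call wraps the previous tree, so a dozen optional filters still end up as one expression that is handed to Where in a single step.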

Let's say you have a collection of keywords and want every row that matches any one of them. In that case, start from PredicateBuilder.False instead of PredicateBuilder.True, and combine the conditions with the "or" operator via the predicate.Or call. (An "or" chain has to start from false; starting from true would match every row.)

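IQueryable<Product> SearchProducts (params string[] keywords)
{
  // Start from "false" so that a row is included only if at least one keyword matches.
  var predicate = PredicateBuilder.False<Product>();
  foreach (string keyword in keywords)
  {
    // Copy the loop variable so each lambda captures its own keyword
    // (needed with compilers older than C# 5).
    string temp = keyword;
    predicate = predicate.Or (p => p.Description.Contains (temp));
  }
  // As above, AsExpandable() is needed when querying through Entity Framework.
  return db.Products.AsExpandable().Where (predicate);
}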
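The "and" and "or" styles can also be nested, for example "category matches AND any keyword matches". The following is a minimal sketch under a few assumptions: it requires the LinqKit package in a version that still exposes PredicateBuilder.True and PredicateBuilder.False as used above, the Product class with Category and Description is made up for the example, and it runs against an in-memory list via AsQueryable() rather than a real database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using LinqKit;

// Hypothetical entity for illustration.
public class Product
{
    public string Category { get; set; }
    public string Description { get; set; }
}

public static class ProductSearch
{
    public static IQueryable<Product> Search(
        IQueryable<Product> products, string category, params string[] keywords)
    {
        // Outer predicate: every active filter must match ("and").
        var predicate = PredicateBuilder.True<Product>();

        if (!string.IsNullOrEmpty(category))
        {
            predicate = predicate.And(p => p.Category == category);
        }

        if (keywords != null && keywords.Length > 0)
        {
            // Inner predicate: at least one keyword must match ("or").
            var anyKeyword = PredicateBuilder.False<Product>();
            foreach (string keyword in keywords)
            {
                string temp = keyword;
                anyKeyword = anyKeyword.Or(p => p.Description.Contains(temp));
            }
            // Attach the whole "or" group as one more "and" condition.
            predicate = predicate.And(anyKeyword);
        }

        return products.AsExpandable().Where(predicate);
    }
}

public static class Program
{
    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Category = "Tools", Description = "Claw hammer" },
            new Product { Category = "Tools", Description = "Hand saw" },
            new Product { Category = "Tools", Description = "Tape measure" },
            new Product { Category = "Toys",  Description = "Toy hammer" }
        }.AsQueryable();

        foreach (var product in ProductSearch.Search(products, "Tools", "hammer", "saw"))
        {
            Console.WriteLine(product.Description);
        }
    }
}
```

Skipping the keyword group when no keywords are supplied avoids the case where an empty "or" chain stays false and filters out every row.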


Now, let's take the IQueryable built from our dynamic predicate and join it with another table. Here we join a Person table with a Company table:

IQueryable<person> query = GetPeoplePredicate(searchParametersPeople);
IQueryable<company> companyQuery = from company in _db.companies
		where company.NameLower.Contains(companyName.ToLower())
		select company;
query = from peopleResult in query
		join c in companyQuery on peopleResult.Company_Id equals c.Id
		select peopleResult;

In this example, we use our PredicateBuilder function to get a dynamically created Person query. Then, we create a second query that filters our Company table by company name. Lastly, we use LINQ to join the Person query with the Company query. The result set contains only the people who are related to a company that passes the company name filter.
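The join can be made optional in the same spirit as the other filters, so it is applied only when the user actually typed a company name. This also avoids calling ToLower() on a null string. The sketch below is an assumption-laden illustration: the Person and Company classes are simplified, FilterByCompany is a hypothetical helper name, and it runs against in-memory lists instead of an EF context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical entities for illustration.
public class Person
{
    public string Name { get; set; }
    public int Company_Id { get; set; }
}

public class Company
{
    public int Id { get; set; }
    public string NameLower { get; set; }
}

public static class PeopleSearch
{
    public static IQueryable<Person> FilterByCompany(
        IQueryable<Person> people, IQueryable<Company> companies, string companyName)
    {
        // No company filter requested: leave the people query untouched.
        if (string.IsNullOrEmpty(companyName))
        {
            return people;
        }

        string companyNameLower = companyName.ToLower();
        IQueryable<Company> companyQuery = from company in companies
                                           where company.NameLower.Contains(companyNameLower)
                                           select company;

        // Keep only the people whose company passed the name filter.
        return from person in people
               join c in companyQuery on person.Company_Id equals c.Id
               select person;
    }
}

public static class Program
{
    public static void Main()
    {
        var companies = new List<Company>
        {
            new Company { Id = 1, NameLower = "acme corp" },
            new Company { Id = 2, NameLower = "globex" }
        }.AsQueryable();

        var people = new List<Person>
        {
            new Person { Name = "Ann", Company_Id = 1 },
            new Person { Name = "Bob", Company_Id = 2 },
            new Person { Name = "Cy",  Company_Id = 1 }
        }.AsQueryable();

        Console.WriteLine(PeopleSearch.FilterByCompany(people, companies, "ACME").Count()); // 2
        Console.WriteLine(PeopleSearch.FilterByCompany(people, companies, "").Count());     // 3
    }
}
```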


Now we have a really powerful dynamic query builder that lets our users run ad hoc queries against any number of fields across multiple related tables!

Text Extraction using C# .Net and Apache Tika


You want to use C# to extract text from documents and web pages. You want the extraction to be high quality, and you want it to be free. Try the .NET wrapper for the Apache Tika library!


Let's build a sample app to show the use case. As a first step, create a C# console application in Visual Studio. Then use the NuGet package manager to install the TikaOnDotNet.TextExtractor package.



Then, try this sample code. It shows text extraction from three kinds of sources: a byte array, a URL, and a file.

using System;
using System.Text;
using TikaOnDotNet.TextExtraction;

namespace TikaTest
{
    class Program
    {
        static void Main(string[] args)
        {

            TextExtractor textExtractor = new TextExtractor();

            // Fun Utf8 strings found here: http://www.columbia.edu/~fdc/utf8/
            string utf8InputString = @"It's a small village in eastern Lower Saxony. The ""oe"" in this case turns out to be the Lower Saxon ""lengthening e""(Dehnungs-e), which makes the previous vowel long (used in a number of Lower Saxon place names such as Soest and Itzehoe), not the ""e"" that indicates umlaut of the preceding vowel. Many thanks to the Óechtringen-Namenschreibungsuntersuchungskomitee (Alex Bochannek, Manfred Erren, Asmus Freytag, Christoph Päper, plus Werner Lemberg who serves as Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer) for their relentless pursuit of the facts in this case. Conclusion: the accent almost certainly does not belong on this (or any other native German) word, but neither can it be dismissed as dirt on the page. To add to the mystery, it has been reported that other copies of the same edition of the PLZB do not show the accent! UPDATE (March 2006): David Krings was intrigued enough by this report to contact the mayor of Ebstorf, of which Oechtringen is a borough, who responded:";
            // Convert string to byte array
            byte[] byteArrayInput = Encoding.UTF8.GetBytes(utf8InputString);
            // Text Extraction Example for Byte Array
            TextExtractionResult result = textExtractor.Extract(byteArrayInput);
            Console.WriteLine(result.Text);

            // Text Extraction Example for Uri:
            result = textExtractor.Extract(new Uri("http://blog.novelessay.com"));
            Console.WriteLine(result.Text);

            // Text Extraction Example for File
            result = textExtractor.Extract(@"c:\myPdf.pdf");
            Console.WriteLine(result.Text);

            // Note that result also has metadata collection and content type attributes
            //result.Metadata
            //result.ContentType
        }
    }
}

Notice that the TextExtractionResult has a Metadata collection and also a ContentType attribute. The metadata provided along with the extracted text contains many things, including author, dates, keywords, title, and description.
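One way to inspect the metadata is to print every entry next to the content type. This is a small sketch, assuming that result.Metadata is exposed as a dictionary of string keys and string values. Describe and MetadataDemo are hypothetical names introduced for the example, and the actual keys you see depend on the document type.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using TikaOnDotNet.TextExtraction;

public static class MetadataDemo
{
    // Hypothetical helper: formats the content type and every metadata entry, one per line.
    public static string Describe(string contentType, IDictionary<string, string> metadata)
    {
        var builder = new StringBuilder();
        builder.AppendLine("Content type: " + contentType);
        foreach (KeyValuePair<string, string> entry in metadata)
        {
            builder.AppendLine(entry.Key + " = " + entry.Value);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        var textExtractor = new TextExtractor();

        // Same byte-array overload as in the sample above, with a short input.
        TextExtractionResult result = textExtractor.Extract(Encoding.UTF8.GetBytes("Hello from Tika"));

        // Assumes Metadata is an IDictionary<string, string>.
        Console.WriteLine(Describe(result.ContentType, result.Metadata));
    }
}
```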


      

I've been very pleased with Tika's quality and ability to handle many different file types. I hope you try it out and enjoy it too.