NovelEssay.com Programming Blog

Exploration of Big Data, Machine Learning, Natural Language Processing, and other fun problems.

C# Bitmap Image Scaling for Bmp, Jpg, and Png

How do you scale bmp, jpg, or png images in C#? Use System.Drawing.Bitmap and follow the instructions in this article.


(Be sure to wrap all of your MemoryStream and Bitmap objects in using statements so they get disposed; otherwise you'll end up with memory leaks and locked files.)


First, add a reference to System.Drawing in your C# Visual Studio project, and you'll want a using statement near the top of your code too:

using System.Drawing;

Then, we'll load our image into a Bitmap object and save it to a memory stream like this:

string sourceFile = @"c:\my.png"; // note the @ verbatim string, since \m is not a valid escape

using (Bitmap srcBitmap = new Bitmap(sourceFile))
using (MemoryStream streamSrc = new MemoryStream())
{
    srcBitmap.Save(streamSrc, System.Drawing.Imaging.ImageFormat.Jpeg);

Next, we'll calculate the area of the source image using its Width and Height properties:

    int sourceImageArea = srcBitmap.Width * srcBitmap.Height;


If the source image area exceeds a threshold, we'll calculate a scale factor to reduce the height and width by. That scale factor is then applied to get the new width and height.

// Area shrinks with the square of the linear scale, so take the square root of the area ratio
float scaleRatio = (float)Math.Sqrt((double)maxOutputPixelArea / sourceImageArea); // scaleRatio is less than 1
float newW = srcBitmap.Width * scaleRatio;
float newH = srcBitmap.Height * scaleRatio;

Lastly, we'll create a new Bitmap with our new dimensions and save the converted image to it:

using (Bitmap resizedImg = ResizeImage(srcBitmap, Convert.ToInt32(newW), Convert.ToInt32(newH)))
using (MemoryStream resizedImgStrm = new MemoryStream())
{
    resizedImg.Save(resizedImgStrm, System.Drawing.Imaging.ImageFormat.Jpeg);
    byte[] imageBytes = resizedImgStrm.ToArray();
    // Save imageBytes to file, store to database, or whatever
}


The full code looks like this:

string sourceFile = @"c:\my.png";
int maxOutputPixelArea = 3000000;
using (Bitmap srcBitmap = new Bitmap(sourceFile))
using (MemoryStream streamSrc = new MemoryStream())
{
    // Convert the image to byte[]
    srcBitmap.Save(streamSrc, System.Drawing.Imaging.ImageFormat.Jpeg);
    int sourceImageArea = srcBitmap.Width * srcBitmap.Height;
    if (sourceImageArea > maxOutputPixelArea)
    {
        // Area shrinks with the square of the linear scale, so take the square root of the area ratio
        float scaleRatio = (float)Math.Sqrt((double)maxOutputPixelArea / sourceImageArea); // scaleRatio is less than 1
        float newW = srcBitmap.Width * scaleRatio;
        float newH = srcBitmap.Height * scaleRatio;
        using (Bitmap resizedImg = ResizeImage(srcBitmap, Convert.ToInt32(newW), Convert.ToInt32(newH)))
        using (MemoryStream resizedImgStrm = new MemoryStream())
        {
            resizedImg.Save(resizedImgStrm, System.Drawing.Imaging.ImageFormat.Jpeg);
            byte[] imageBytes = resizedImgStrm.ToArray();
            // Save imageBytes to file, store to database, or whatever
        }
    }
}


Finally, the ResizeImage code is this (it also needs using System.Drawing.Drawing2D; and using System.Drawing.Imaging; for the graphics quality settings and ImageAttributes):

private static Bitmap ResizeImage(Image image, int width, int height)
{
var destRect = new Rectangle(0, 0, width, height);
var destImage = new Bitmap(width, height);
destImage.SetResolution(image.HorizontalResolution, image.VerticalResolution);
using (var graphics = Graphics.FromImage(destImage))
{
graphics.CompositingMode = CompositingMode.SourceCopy;
graphics.CompositingQuality = CompositingQuality.HighQuality;
graphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
graphics.SmoothingMode = SmoothingMode.HighQuality;
graphics.PixelOffsetMode = PixelOffsetMode.HighQuality;
using (var wrapMode = new ImageAttributes())
{
wrapMode.SetWrapMode(WrapMode.TileFlipXY);
graphics.DrawImage(image, destRect, 0, 0, image.Width, image.Height, GraphicsUnit.Pixel, wrapMode);
}
}
return destImage;
}
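As a quick cross-check of the scaling arithmetic above, here's a short standard-library Python sketch (no imaging code; the function name is my own). Because area shrinks with the square of the per-axis scale factor, the factor that hits a target area exactly is the square root of the area ratio:

```python
import math

def capped_size(width, height, max_area):
    """Return (new_width, new_height) whose area is at most max_area,
    preserving the aspect ratio. Images already under the cap are
    returned unchanged."""
    area = width * height
    if area <= max_area:
        return width, height
    # Area shrinks with the square of the linear scale factor,
    # so the per-axis ratio is the square root of the area ratio.
    ratio = math.sqrt(max_area / area)
    return int(width * ratio), int(height * ratio)

# Example: cap a 4000x3000 (12 MP) image at 3,000,000 pixels.
w, h = capped_size(4000, 3000, 3_000_000)
print(w, h)                 # 2000 1500
print(w * h <= 3_000_000)   # True
```

The same cap (3,000,000) as the C# example is used here just for illustration.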

Social Like and Share Buttons with AngularJS: 'The Angular Way'

You're using AngularJS for your web site, and you want to add some social media share and like buttons that look roughly like this:



Here's what you should do: set up your app to inject the angulike module like this:

(Full Angulike code is near the bottom of this article.)


var myApp = angular.module('myApp', ['angulike'])
.run([
      '$rootScope', function ($rootScope) {
          $rootScope.facebookAppId = 'Your FB App Id Here'; // set your facebook app id here
      }
])

Be sure to buzz over to Facebook.com and obtain an App ID, and set it in the code above.


Then, in your controller code, create a scope variable that has your Url, Name, and an Image to be shared:

    $scope.myModel = {
        Url: 'http://blog.novelessay.com',
        Name: "blog.novelessay.com makes you smarter!",
        ImageUrl: 'http://blog.novelessay.com/Images/awesome.jpg'
    };


In your HTML, simply add a few divs like this:

<div fb-like="myModel.Url"></div>
<div tweet="myModel.Name" tweet-url="myModel.Url"></div>
<div google-plus="myModel.Url"></div>
<div pin-it="myModel.Name" data-pin-it-image="myModel.ImageUrl" pin-it-url="myModel.Url"></div>

They will each become the various like and share buttons. Feel free to add CSS class styling to the divs to fit your needs.

Of course, we need to add the AngularJS and angulike.js script tags to our HTML too (angulike loads the social SDKs with jQuery's $.getScript, so jQuery must also be included on the page):

    <script src="https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.5.5/angular.min.js"></script>
    <script type='text/javascript' src='/scripts/angulike.js'></script>

And, finally, Angulike.js is the following:

/**
 * AngularJS directives for social sharing buttons - Facebook Like, Google+, Twitter and Pinterest 
 * @author Jason Watmore <jason@pointblankdevelopment.com.au> (http://jasonwatmore.com)
 * @version 1.2.0
 */
(function () {
    angular.module('angulike', [])

      .directive('fbLike', [
          '$window', '$rootScope', function ($window, $rootScope) {
              return {
                  restrict: 'A',
                  scope: {
                      fbLike: '=?'
                  },
                  link: function (scope, element, attrs) {
                      if (!$window.FB) {
                          // Load Facebook SDK if not already loaded
                          $.getScript('//connect.facebook.net/en_US/sdk.js', function () {
                              $window.FB.init({
                                  appId: $rootScope.facebookAppId,
                                  xfbml: true,
                                  version: 'v2.0'
                              });
                              renderLikeButton();
                          });
                      } else {
                          renderLikeButton();
                      }

                      var watchAdded = false;
                      function renderLikeButton() {
                          if (!!attrs.fbLike && !scope.fbLike && !watchAdded) {
                              // wait for data if it hasn't loaded yet
                              watchAdded = true;
                              var unbindWatch = scope.$watch('fbLike', function (newValue, oldValue) {
                                  if (newValue) {
                                      renderLikeButton();

                                      // only need to run once
                                      unbindWatch();
                                  }

                              });
                              return;
                          } else {
                              element.html('<div class="fb-like"' + (!!scope.fbLike ? ' data-href="' + scope.fbLike + '"' : '') + ' data-layout="button_count" data-action="like" data-show-faces="true" data-share="true"></div>');
                              $window.FB.XFBML.parse(element.parent()[0]);
                          }
                      }
                  }
              };
          }
      ])

      .directive('googlePlus', [
          '$window', function ($window) {
              return {
                  restrict: 'A',
                  scope: {
                      googlePlus: '=?'
                  },
                  link: function (scope, element, attrs) {
                      if (!$window.gapi) {
                          // Load Google SDK if not already loaded
                          $.getScript('//apis.google.com/js/platform.js', function () {
                              renderPlusButton();
                          });
                      } else {
                          renderPlusButton();
                      }

                      var watchAdded = false;
                      function renderPlusButton() {
                          if (!!attrs.googlePlus && !scope.googlePlus && !watchAdded) {
                              // wait for data if it hasn't loaded yet
                              watchAdded = true;
                              var unbindWatch = scope.$watch('googlePlus', function (newValue, oldValue) {
                                  if (newValue) {
                                      renderPlusButton();

                                      // only need to run once
                                      unbindWatch();
                                  }

                              });
                              return;
                          } else {
                              element.html('<div class="g-plusone"' + (!!scope.googlePlus ? ' data-href="' + scope.googlePlus + '"' : '') + ' data-size="medium"></div>');
                              $window.gapi.plusone.go(element.parent()[0]);
                          }
                      }
                  }
              };
          }
      ])

      .directive('tweet', [
          '$window', '$location',
          function ($window, $location) {
              return {
                  restrict: 'A',
                  scope: {
                      tweet: '=',
                      tweetUrl: '='
                  },
                  link: function (scope, element, attrs) {
                      if (!$window.twttr) {
                          // Load Twitter SDK if not already loaded
                          $.getScript('//platform.twitter.com/widgets.js', function () {
                              renderTweetButton();
                          });
                      } else {
                          renderTweetButton();
                      }

                      var watchAdded = false;
                      function renderTweetButton() {
                          if (!scope.tweet && !watchAdded) {
                              // wait for data if it hasn't loaded yet
                              watchAdded = true;
                              var unbindWatch = scope.$watch('tweet', function (newValue, oldValue) {
                                  if (newValue) {
                                      renderTweetButton();

                                      // only need to run once
                                      unbindWatch();
                                  }
                              });
                              return;
                          } else {
                              element.html('<a href="https://twitter.com/share" class="twitter-share-button" data-text="' + scope.tweet + '" data-url="' + (scope.tweetUrl || $location.absUrl()) + '">Tweet</a>');
                              $window.twttr.widgets.load(element.parent()[0]);
                          }
                      }
                  }
              };
          }
      ])

      .directive('pinIt', [
          '$window', '$location',
          function ($window, $location) {
              return {
                  restrict: 'A',
                  scope: {
                      pinIt: '=',
                      pinItImage: '=',
                      pinItUrl: '='
                  },
                  link: function (scope, element, attrs) {
                      if (!$window.parsePins) {
                          // Load Pinterest SDK if not already loaded
                          (function (d) {
                              var f = d.getElementsByTagName('SCRIPT')[0], p = d.createElement('SCRIPT');
                              p.type = 'text/javascript';
                              p.async = true;
                              p.src = '//assets.pinterest.com/js/pinit.js';
                              p['data-pin-build'] = 'parsePins';
                              p.onload = function () {
                                  if (!!$window.parsePins) {
                                      renderPinItButton();
                                  } else {
                                      setTimeout(p.onload, 100);
                                  }
                              };
                              f.parentNode.insertBefore(p, f);
                          }($window.document));
                      } else {
                          renderPinItButton();
                      }

                      var watchAdded = false;
                      function renderPinItButton() {
                          if (!scope.pinIt && !watchAdded) {
                              // wait for data if it hasn't loaded yet
                              watchAdded = true;
                              var unbindWatch = scope.$watch('pinIt', function (newValue, oldValue) {
                                  if (newValue) {
                                      renderPinItButton();

                                      // only need to run once
                                      unbindWatch();
                                  }
                              });
                              return;
                          } else {
                              element.html('<a href="//www.pinterest.com/pin/create/button/?url=' + (scope.pinItUrl || $location.absUrl()) + '&media=' + scope.pinItImage + '&description=' + scope.pinIt + '" data-pin-do="buttonPin" data-pin-config="beside"></a>');
                              $window.parsePins(element.parent()[0]);
                          }
                      }
                  }
              };
          }
      ]);

})();

A big thanks to Jason Watmore, the author of angulike, for making the base set of Angular-friendly code for this scenario.


After following this process, you should get some social buttons to like and share your content that look like this:



Installing Python Chainer and Theano on Windows with Anaconda for GPU Processing

Let's say you want to do some GPU processing on Windows and you want to use Python, because of awesome things like this:


We'll show the setup steps for installing Python Chainer and Theano on Windows 10 in this blog article.


Some Terms:

CUDA - an API model created by Nvidia for GPU processing

cuDNN - a neural network plugin library for CUDA

Chainer - a Python neural network framework package

Theano - a Python deep learning package


Initial Hardware and OS Requirements:

You need an Nvidia CUDA-supported video card. (I have an Nvidia GeForce GTX 750 Ti.) Check for your GPU card in the support list found here: https://developer.nvidia.com/cuda-gpus

You need Windows 10. (Everything in this procedure is x64.)


Important: 

Versions matter a lot. I tried to do this exact same setup with Python 2.7, and I was not successful. I tried to do the same thing with Anaconda 2, and that didn't work. I tried to do this same thing with cuDNN 5.5, and that didn't work. - So many combinations didn't work for me that I decided to write about what did work.


Procedure:

1) Install Visual Studio 2015. You must install Visual Studio before installing the CUDA toolkit. You need the \bin\cl.exe compiler. I have the VS2015 Enterprise Edition, but the VS2015 Community Edition is free here: https://www.microsoft.com/en-us/download/details.aspx?id=48146


2) Install the CUDA toolkit found here: https://developer.nvidia.com/cuda-downloads

That installs v8.0 to a path like this: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0


3) Download the cuDNN v5.0 here: https://developer.nvidia.com/cudnn

There is a v5.1 there, but it did not work for me. Feel free to try it, but I suggest trying v5.0 first.

The cuDNN package is just 3 files. You'll want to drop them into the CUDA path:

  • Drop the cudnn.h file in the folder:  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include\
  • Drop the cudnn64_5.dll file in the folder:  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin\
  • Drop the cudnn.lib file in the folder:  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64\

4) Install Anaconda (the Python 3.6, Windows x64 build) found here: https://repo.continuum.io/archive/Anaconda3-4.3.1-Windows-x86_64.exe

In case that link breaks, this is the page I found it at: https://www.continuum.io/downloads

You'll be doing most of your Anaconda/Python work in the Anaconda Console window. If Windows does not give you a nice link to the Anaconda Console, make a short cut with a link that looks like this:

"%windir%\system32\cmd.exe " "/K" C:\ProgramData\Anaconda3\Scripts\activate.bat C:\ProgramData\Anaconda3

I installed Anaconda for "All Users", so it put it at ProgramData. If you install to just one user, it puts Anaconda at a c:\users\<your name>\ path.


5) Building Python packages requires a gcc/g++ compiler. Install MinGW for x64 from here: https://sourceforge.net/projects/mingw-w64/

WARNING: During this install, be sure to pick the x86_64 install and not the i686 install!

The default install for MinGW is at c:\Program Files\mingw-w64\x86_64-6.3.0-posix-seh-rt_v5-rev1\mingw64\bin

The space in "Program Files" will break things later, so move it to a path without spaces, like this:

C:\mingw-w64\x86_64-6.3.0-posix-seh-rt_v5-rev1\mingw64\bin


6) Environment paths! 

If you have no idea how to set environment variables in Windows, here's a link that describes how to do that: http://www.computerhope.com/issues/ch000549.htm

Add a variable called "CFlags" with this value:

  • -IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include

Add a variable called "CUDA_PATH" with this value:

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0

Add a variable called "LD_LIBRARY_PATH" with this value:

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64

Add a variable called "LDFLAGS" with this value:

  • -LC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64

Add all of the following to your PATH variable (or ensure they exist):

  • C:\mingw-w64\x86_64-6.3.0-posix-seh-rt_v5-rev1\mingw64\bin
  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin
  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\libnvvp
  • C:\Program Files (x86)\Microsoft Visual Studio 14.0\vc\bin
  • C:\ProgramData\Anaconda3
  • C:\ProgramData\Anaconda3\Scripts
  • C:\ProgramData\Anaconda3\Library\bin

(The Anaconda3 paths might get set automatically for you.)
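To verify the variables from this step, a quick standard-library Python sketch can report which ones are missing. The helper below is my own illustration; it only checks that the variables are present, not that the paths in them are correct:

```python
import os

def missing_vars(names, env=None):
    """Return the subset of names not present (or empty) in the environment.
    Pass a dict as env for testing; defaults to os.environ."""
    env = os.environ if env is None else env
    return [n for n in names if not env.get(n)]

# The variables defined in step 6 above.
required = ["CFlags", "CUDA_PATH", "LD_LIBRARY_PATH", "LDFLAGS"]
print(missing_vars(required))  # [] when everything is set
```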


7) Next, bring up your Anaconda console prompt and install some packages. Type the following lines:

pip install Pillow
pip install Pycuda
pip install Theano
pip install Chainer

If any of those fail to install, stop and figure out why. Append -vvvv to the end of an install line to get a very-very-very verbose dump of the install process.

Note: If you can't get pycuda to install due to "missing stdlib.h" errors, you can get the pycuda .whl file and install that directly instead.

It's likely because one of steps 1-6 isn't quite right, or because your gcc compiler is picking up an old 32-bit version that you installed long ago. (That was the case for me: I had Cygwin and a 32-bit gcc that caused failing pip package installs.)

I also had some build failures on Chainer with errors about "_hypot" being undefined. I fixed those by opening C:\ProgramData\Anaconda3\include\pyconfig.h and commenting out the two places in that file that do this:

//#define hypot _hypot

That appears to have fixed the issue for me, but there's probably a better solution.

8) Sanity checks and smoke tests:

First, try to import the packages from Python. You can run these directly from your Anaconda console like this:

  • python -c "import theano"
  • python -c "import chainer"
  • python -c "import cupy"

If one of them fails, identify the error message and ask the Google about it. They should all work:
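The same checks can be scripted so one run reports on every package. This standard-library sketch (my own helper, not part of the setup) takes any list of module names; for this setup you'd pass theano, chainer, and cupy:

```python
import importlib

def check_imports(names):
    """Try to import each module; return {name: error message or None}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None  # imported cleanly
        except Exception as exc:  # ImportError, DLL load failures, etc.
            results[name] = str(exc)
    return results

# Demo with a module that exists and one that doesn't:
for mod, err in check_imports(["json", "no_such_module"]).items():
    print(mod, "OK" if err is None else "FAILED: " + err)
```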


A last smoke test is to run the "Hello GPU" test code. Here's a copy:
import pycuda.autoinit
import pycuda.driver as drv
import numpy
from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")
multiply_them = mod.get_function("multiply_them")
a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)
dest = numpy.zeros_like(a)
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1), grid=(1,1))
print (dest-a*b)

I had to change the last line of that code to have parentheses around the expression, since Python 3 treats print as a function:

print(dest-a*b)

When you run that with a command like this:

python pycuda_test.py

you should get an output of zeros that looks like this:



Conclusion:
If you've gotten this far, congratulations! Your Windows 10 environment should be all set up to run Python GPU processing.

My GPU has been running for days at 90%, and my CPU is free for other work. This was a seriously miserable process to figure out, but now it feels like my computer's processing power has doubled!

Enjoy the awesome:

Handling inconsistent Url formats in C#

Let's say you have a whole bunch of Urls in a database or some other collection, but some look like these examples:

  • NovelEssay.com
  • www.NovelEssay.com
  • http://NovelEssay.com
  • http://www.NovelEssay.com

Is there an easy way to handle this variety of formats in C#?


Yes, we'll do some prefix detection and then use a Uri class to help parse the data.


First, we assume our original data is in the HomePage property on the Record object. We'll detect whether HomePage starts with "http://", and if it doesn't, we'll add it:

const string HTTP_PREFIX = "http://";
if (!Record.HomePage.StartsWith(HTTP_PREFIX))
{
    Record.HomePage = HTTP_PREFIX + Record.HomePage;
}

Next, we'll try to parse the HomePage with the Uri class like this:

Uri myUri = null;
try
{
    myUri = new Uri(Record.HomePage);
}
catch (Exception e)
{
    // HomePage is not parsable as a Uri
}

If HomePage is still in a bad format, the Uri constructor will throw an exception that we want to catch.


Finally, if an exception isn't thrown, we can use myUri to extract the base domain and other parts of the Uri like this:

Record.Domain = myUri.Host;

One last trick is to check Record.Domain (or myUri.Host) to see if it begins with a www. prefix. Depending on how you want to normalize your host information, you may want to add or remove the www prefix on the Host (or Domain) value.
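The same prefix-then-parse approach can be sketched in Python with the standard library's urllib.parse. This is my own illustration of the idea, not the article's C# code; the strip_www flag is an assumption added to show the www normalization:

```python
from urllib.parse import urlparse

def extract_domain(raw_url, strip_www=True):
    """Normalize a loosely formatted URL and return its host name.
    Prepends http:// when no scheme is present, mirroring the C# check."""
    if not raw_url.lower().startswith(("http://", "https://")):
        raw_url = "http://" + raw_url
    host = urlparse(raw_url).hostname or ""
    if strip_www and host.startswith("www."):
        host = host[len("www."):]
    return host

for u in ["NovelEssay.com", "www.NovelEssay.com",
          "http://NovelEssay.com", "http://www.NovelEssay.com"]:
    print(extract_domain(u))  # novelessay.com each time
```

Note that urlparse lowercases the hostname for you, which is another normalization you'd otherwise do by hand.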

That's all. Happy Uri parsing!

Using IEqualityComparer for finding near duplicates with custom business logic in C#

Example use case:

Let's say you're crawling the web gathering up information about people, and you want to group any matches of "John Smith" that might actually be the same person. 


In the generic case, we're walking through how to manage business logic cases for finding near duplicates using good software design principles.


We'll start by having a class like the Person example that follows:

    public class Person
    {
        [Key]
        public long Id { get; set; }
        public string Name { get; set; }
        public string LinkedInUrl { get; set; }
        public string TwitterUrl { get; set; }
        public string FacebookUrl { get; set; }
    }


Let's say we get a hit from some blog about "John Smith", and his TwitterUrl is twitter.com/jsmith. We'll populate a Person object like that and toss it into our database, repository, or whatever storage we're using like this:

Person foundPerson = new Person() { Name = "John Smith", TwitterUrl = "twitter.com/jsmith" };
List<Person> allPeopleFound = new List<Person>();
allPeopleFound.Add(foundPerson);

Later, we find another "John Smith", and his LinkedInUrl is linkedin.com/in/jsmith. We'll add that Person to our collection:

Person foundPerson2 = new Person() { Name = "John Smith", LinkedInUrl = "linkedin.com/in/jsmith" };
allPeopleFound.Add(foundPerson2);

Finally, we find another "J. P. Smith", and his LinkedInUrl is linkedin.com/in/jsmith and Facebook URL is facebook.com/jps. We'll add that Person to our collection:

Person foundPerson3 = new Person() { Name = "J. P. Smith", LinkedInUrl = "linkedin.com/in/jsmith", FacebookUrl = "facebook.com/jps" };
allPeopleFound.Add(foundPerson3);



We could group allPeopleFound by Name, but that's certainly going to have many false positives in our "John Smith" group. That approach will also not let us group "J. P. Smith" with "John Smith".

Let's show the code we want to happen before we show the solution we need.

var nearDupePeople = allPeopleFound.GroupBy(c => c, new PersonComparer());
foreach (var nearDupePerson in nearDupePeople)
{
    foreach (var person in nearDupePerson)
    {
        // Here we are iterating through all Person objects that were grouped together by the PersonComparer above
        // TODO: Now that "J. P. Smith" and "John Smith" are found equal, we need business rules about multi-valued fields
    }
}
Now, you should be thinking - What's with that PersonComparer class?


Not messing around, we'll show off the PersonComparer class that implements an IEqualityComparer.

public class PersonComparer : IEqualityComparer<Person>
{
    public bool Equals(Person p1, Person p2)
    {
        // Social media matches, various social network identity matching here:
        if (!string.IsNullOrEmpty(p1.LinkedInUrl) && !string.IsNullOrEmpty(p2.LinkedInUrl) && p1.LinkedInUrl.Equals(p2.LinkedInUrl))
        {
            return true;
        }
        if (!string.IsNullOrEmpty(p1.TwitterUrl) && !string.IsNullOrEmpty(p2.TwitterUrl) && p1.TwitterUrl.Equals(p2.TwitterUrl))
        {
            return true;
        }
        if (!string.IsNullOrEmpty(p1.FacebookUrl) && !string.IsNullOrEmpty(p2.FacebookUrl) && p1.FacebookUrl.Equals(p2.FacebookUrl))
        {
            return true;
        }
        return false;
    }
    public int GetHashCode(Person p)
    {
        // Equals is an "any one Url matches" relation, so two equal people can
        // have different Url sets. Returning a constant forces GroupBy to rely
        // on Equals for every candidate pair instead of ruling pairs out by hash.
        return 0;
    }
}

Notice that our IEqualityComparer implementation needs to have two functions implemented: Equals and GetHashCode.


In our code, we'll call two Person objects equal if their LinkedIn Urls are the same, or their Twitter Urls are the same, or their Facebook Urls are the same. We don't consider two Person instances equal merely because their Names match.


Our GetHashCode function must return equal hash codes for any two Person objects that Equals calls equal. Because Equals matches when any one of the three Urls matches, two "equal" people can have different Url combinations, so a hash built by concatenating the three Url properties can place them in different buckets and the Equals call never happens. The safe (if slower) option is to return a constant hash code so GroupBy always falls back on Equals.


That's all there is to executing a custom "near duplicate" grouping and easily handling the business logic inside your implementation for IEqualityComparer.
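For comparison, here's a small Python sketch of the same idea (my own illustration, not the article's code): treat two records as the same person when any social Url matches, and group them with a simple union-find so chains of matches merge too:

```python
class UnionFind:
    """Minimal disjoint-set structure with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def group_people(people):
    """people: list of dicts with optional 'linkedin'/'twitter'/'facebook' keys.
    Returns groups (lists of indices) of near-duplicate records."""
    uf = UnionFind(len(people))
    seen = {}  # (field, url) -> first index that had it
    for i, p in enumerate(people):
        for field in ("linkedin", "twitter", "facebook"):
            url = p.get(field)
            if url:
                if (field, url) in seen:
                    uf.union(i, seen[(field, url)])
                else:
                    seen[(field, url)] = i
    groups = {}
    for i in range(len(people)):
        groups.setdefault(uf.find(i), []).append(i)
    return list(groups.values())

# The three records from the examples above:
people = [
    {"name": "John Smith", "twitter": "twitter.com/jsmith"},
    {"name": "John Smith", "linkedin": "linkedin.com/in/jsmith"},
    {"name": "J. P. Smith", "linkedin": "linkedin.com/in/jsmith",
     "facebook": "facebook.com/jps"},
]
print(group_people(people))  # [[0], [1, 2]]
```

Note that under the "any Url matches" rule, the first record (Twitter only) shares no Url with the other two, so it stays in its own group, exactly as the comparer's Equals would decide.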


C# Entity Framework implementing 'Contains' Query for MySQL

This article will show some tricks on how to use C# Entity Framework queries with 'Contains' predicates for a MySQL database.

The goal is to use C# queries like this against a MySQL database:

var result = dbContext.People.Where(p => p.Name.Contains("john"));

First, we need to create a Full-Text Search Interceptor. We'll start with the one found at the following link:

http://www.entityframework.info/Home/FullTextSearch

That one is specific to SQL Server, so we'll need to make some changes. After my changes, a version for MySQL is this (it needs using statements for System.Data, System.Data.Common, System.Data.Entity.Infrastructure.Interception, and System.Text.RegularExpressions):

static class LanguageExtensions
{
    public static bool In<T>(this T source, params T[] list)
    {
        return (list as IList<T>).Contains(source);
    }
}
public class FtsInterceptor : IDbCommandInterceptor
{
    private const string FullTextPrefix = "-FTSPREFIX-";
    public static string Fts(string search)
    {
        return string.Format("({0}{1})", FullTextPrefix, search);
    }
    public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<int> interceptionContext)
    {
    }
    public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<int> interceptionContext)
    {
    }
    public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        RewriteFullTextQuery(command);
    }
    public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
    }
    public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<object> interceptionContext)
    {
        RewriteFullTextQuery(command);
    }
    public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<object> interceptionContext)
    {
    }
    public static void RewriteFullTextQuery(DbCommand cmd)
    {
        string text = cmd.CommandText;
        for (int i = 0; i < cmd.Parameters.Count; i++)
        {
            DbParameter parameter = cmd.Parameters[i];
            if (parameter.DbType.In(DbType.String, DbType.AnsiString, DbType.StringFixedLength, DbType.AnsiStringFixedLength))
            {
                if (parameter.Value == DBNull.Value)
                    continue;
                var value = (string)parameter.Value;
                if (value.IndexOf(FullTextPrefix) >= 0)
                {
                    parameter.Size = 4096;
                    parameter.DbType = DbType.AnsiStringFixedLength;
                    value = value.Replace(FullTextPrefix, "");
                    value = value.Substring(1, value.Length - 2);
                    parameter.Value = value;
                    cmd.CommandText = Regex.Replace(
                        text,
                        string.Format(@"`(\w*)`.`(\w*)`\s*LIKE\s*@{0}\s?", parameter.ParameterName),
                        string.Format(@"match(`$1`.`$2`) against(@{0})", parameter.ParameterName)
                    );

                    if (text == cmd.CommandText)
                        throw new Exception("FTS was not replaced on: " + text);
                    text = cmd.CommandText;
                }
            }
        }
    }
}

The most significant difference is in the Regex.Replace. For MySQL, we need to make our SQL use Full Text indices using the Match-Against combo like this:

select * from people WHERE MATCH (Name) against ('john')
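The transformation that Regex.Replace performs can be previewed with the equivalent pattern in Python. The parameter name p__linq__0 below is just a typical EF-generated placeholder used for illustration, and the dot between the backtick groups is escaped here for strictness:

```python
import re

def rewrite_like_to_match(sql, param_name):
    """Rewrite `tbl`.`col` LIKE @param into MySQL's
    match(`tbl`.`col`) against(@param), like RewriteFullTextQuery does."""
    pattern = r"`(\w*)`\.`(\w*)`\s*LIKE\s*@%s\s?" % param_name
    return re.sub(pattern, r"match(`\1`.`\2`) against(@%s)" % param_name, sql)

sql = "SELECT * FROM `people` AS p WHERE `p`.`Name` LIKE @p__linq__0"
print(rewrite_like_to_match(sql, "p__linq__0"))
# SELECT * FROM `people` AS p WHERE match(`p`.`Name`) against(@p__linq__0)
```

This is handy for testing the pattern against SQL you've captured from the interceptor before wiring it into the C# code.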

Note: Be sure that your table has a Full Text index on the columns you are querying. You can always add a Full Text index to your table like this:

ALTER TABLE people ADD FULLTEXT INDEX people_name_FT_IDX (Name);


Next, we'll need to add our FtsInterceptor to our DbContext like this:

public partial class MyAppMySqlDbEntities : DbContext
{
    public MyAppMySqlDbEntities()
        : base("name=MyAppMySqlDbEntities")
    {
        DbInterception.Add(new FtsInterceptor());
        Configuration.UseDatabaseNullSemantics = true;
    }
}
You'll also need a using statement to go with those DbInterception calls.

using System.Data.Entity.Infrastructure;
using System.Data.Entity.Infrastructure.Interception;

Finally, we're ready to make some queries:

var personNameFtsI = FtsInterceptor.Fts("john");
var query = myDbContext.People.Where(p => p.Name.Contains(personNameFtsI));
int count = query.Count();

I suggest you set a break point in the FtsInterceptor.RewriteFullTextQuery function, and when the query.Count line is executed that breakpoint should get hit. Then, you can step through that function and see how your SQL is changing to use the Match-Against combo that MySQL needs for a Full Text index search.


A special big thank you goes out to our friends at AlphaProspect.com for help figuring all of this out.


One final "gotcha" to watch out for is that MySQL seems to only like using one Full Text index per query, so if you do something like this:

var personNameFtsI = FtsInterceptor.Fts("john");
var personJobFtsI = FtsInterceptor.Fts("engineer");
var query = myDbContext.People.Where(p => p.Name.Contains(personNameFtsI) && p.JobTitle.Contains(personJobFtsI));
int count = query.Count();

The SQL result will look correct, and it should have two Match-Against clauses, but MySQL still won't use both Full Text indices. I suggest you capture your SQL from inside the FtsInterceptor.RewriteFullTextQuery function and execute it in MySQL Workbench. Get the execution plan or "explain" result for your query. That'll show that MySQL is really only using one of the Full Text indices.
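To see this for yourself without Entity Framework in the loop, you can run a hypothetical two-clause query (modeled on the examples above) through EXPLAIN in Workbench:

EXPLAIN SELECT * FROM people
WHERE MATCH (Name) AGAINST ('john')
  AND MATCH (JobTitle) AGAINST ('engineer');

The "key" column of the EXPLAIN output should list only one of the two FULLTEXT indices, even though both clauses could use one.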


There's a way to rewrite the queries to make use of both Full Text indices, but I have no idea how to make that happen using C# Entity Framework and LINQ. Here's a link about how MySQL can do it with temp tables:

http://stackoverflow.com/questions/42097593/mysql-slow-query-when-combining-two-very-fast-queries

If you solve this, I'd love to hear from you about how you did it. Thank you.


AngularJS Pagination with Custom Total Count Reuse

I have an AngularJS application that paginates through a table. You can filter the table by column. I'm using angular-paginate-anything as my AngularJS plugin for my paginated table. My table has over a million rows, and the filtered queries are not fast. Selecting "top 10" records is much faster than "get total count", so I'd like to reuse the total count value if the table filters don't change between pages. Unfortunately, angular-paginate-anything doesn't do that out of the box. Fortunately, this article shows how to do that.


Background: I have another article that shows how to use angular-paginate-anything in an Mvc.Net project with pagination. Consider that the starting point, and go ahead and give it a quick read. We'll wait right here...

http://blog.novelessay.com/post/mvc-net-with-angularjs-and-pagination


Next, we'll open the paginate-anything-tpls.js file and work with that (since it's not minified). Again, our goal is to only request a new total count when the filter parameters change.


When the pagination app initially loads, we need to get a total count so we'll initialize that state to true:

              controller: ['$scope', '$http', function ($scope, $http) {
                  $scope.urlParamsChanged = true;

After we get the paginated results from the server, we want to set the urlParamsChanged state to false.

function requestRange(request) {
    $scope.$emit('pagination:loadStart', request);
    $http({
    ...
    }).success(function (data, status, headers, config) {
        $scope.urlParamsChanged = false;

Then, we need to set urlParamsChanged to true again when the filter parameters do change:

                  $scope.$watch('urlParams', function (newParams, oldParams) {
                      if ($scope.passive === 'true') { return; }
                      if (!angular.equals(newParams, oldParams)) {
                          $scope.urlParamsChanged = true;

When we do make a request to the server, we need to check urlParamsChanged and indicate whether we need the total count:

                      $http({
                          method: 'GET',
                          url: $scope.url,
                          params: $scope.urlParams,
                          headers: angular.extend(
                            {}, $scope.headers,
                            {
                                'Range-Unit': 'items',
                                // Params changed: omit the total so the server recounts.
                                // Params unchanged: append the cached total so the server can reuse it.
                                Range: $scope.urlParamsChanged
                                    ? [request.from, request.to].join('-')
                                    : [request.from, request.to, $scope.numItems].join('-')
                            }
                          ),

That's all the JavaScript changes we need. Next, we need to modify our server to check whether a total count was passed back. Originally, the total count on the input paging model was always 0, so the new code checks whether it's non-zero.

[Route("GetSearchResults"), HttpGet()]
[WithPaging]
public IHttpActionResult Search([FromUri] PagingModel pagingModel
    , [FromUri] string Name = null
    , [FromUri] string Region = null
    )
{
    IQueryable<company> query = ...
    PagingModeSearch result = new PagingModeSearch();
    result.From = pagingModel.From;
    result.To = pagingModel.To;
    if (pagingModel.TotalRecordCount > 0)
    {
        // The client passed its cached count back; reuse it
        result.TotalRecordCount = pagingModel.TotalRecordCount;
    }
    else
    {
        result.TotalRecordCount = query.Count();
    }


Finally, we need to update our WithPagingAttribute class in our NgPaginateModel.cs file to parse the range header for the total count value like this:

    public class WithPagingAttribute : ActionFilterAttribute
    {
        private readonly string _paramName;
        // The trailing "-total" is optional so a plain "from-to" range (no cached count) still parses
        private static readonly Regex RangeHeaderRegex = new Regex(@"(?<from>\d+)-(?<to>\d*)(?:-(?<total>\d+))?", RegexOptions.Compiled);
And lastly, the method that parses the range header:
        private static PagingModel IncomingPagingModel(HttpRequestHeaders headers, PagingModel pagingModel)
        {
            if (pagingModel == null)
                pagingModel = new PagingModel();
            IEnumerable<string> rangeUnitValues;
            IEnumerable<string> rangeValues;
            if (headers.TryGetValues("Range-Unit", out rangeUnitValues)
                && rangeUnitValues.First().Equals("items", StringComparison.OrdinalIgnoreCase)
                && headers.TryGetValues("Range", out rangeValues))
            {
                var rangeHeaderMatch = RangeHeaderRegex.Match(rangeValues.First());
                if (!string.IsNullOrWhiteSpace(rangeHeaderMatch.Groups["from"].Value))
                    pagingModel.From = int.Parse(rangeHeaderMatch.Groups["from"].Value);
                if (!string.IsNullOrWhiteSpace(rangeHeaderMatch.Groups["to"].Value))
                    pagingModel.To = int.Parse(rangeHeaderMatch.Groups["to"].Value);
                if (!string.IsNullOrWhiteSpace(rangeHeaderMatch.Groups["total"].Value))
                    pagingModel.TotalRecordCount = int.Parse(rangeHeaderMatch.Groups["total"].Value);

            }
            return pagingModel;
        }
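To see concretely what the parser pulls out of the Range header, here's a small self-contained sketch (the header values are illustrative, and the pattern makes the trailing "-total" optional so a plain from-to range still matches):

```csharp
using System;
using System.Text.RegularExpressions;

class RangeHeaderDemo
{
    static void Main()
    {
        // Same shape as the WithPagingAttribute pattern, with "-total" optional
        var rangeRegex = new Regex(@"(?<from>\d+)-(?<to>\d*)(?:-(?<total>\d+))?", RegexOptions.Compiled);

        // First request after a filter change: the client omits the total
        var m1 = rangeRegex.Match("0-9");
        Console.WriteLine("{0}..{1} total='{2}'",
            m1.Groups["from"].Value, m1.Groups["to"].Value, m1.Groups["total"].Value);
        // prints: 0..9 total=''

        // Later page: the cached total count rides along in the header
        var m2 = rangeRegex.Match("10-19-1048576");
        Console.WriteLine("{0}..{1} total='{2}'",
            m2.Groups["from"].Value, m2.Groups["to"].Value, m2.Groups["total"].Value);
        // prints: 10..19 total='1048576'
    }
}
```

An empty "total" group is what tells the server to run the expensive Count() query; a populated one means it can be skipped.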


This makes navigating through each page of a search result much faster when the total count can be reused.


A Must-Change Setting When Configuring MySQL on Windows

I recently installed MySQL and the MySQL Workbench on my Windows 10 PC. It's working great. I loaded up about 2 GB of data across a few tables with roughly 2M rows total. Most of my simple queries (single table by ID or Full Text index) ran really fast, and I was happy.


Then, I started working on queries that join tables. They were slow - like 26 seconds slow for a two-table join by integer ID over a foreign key (simple stuff).


First, I checked my indices. I added several indices just trying things and hoping it'd work. That did not solve the slowness.


I had to tinker in Workbench quite a bit, trying to work out what could be wrong and how to identify the root cause, which turned out to be a good learning exercise.


Eventually, I saw (on the Workbench Server's dashboard) that InnoDB buffer usage went to 100% during my long running queries. I didn't really know what that meant, but I was out of "good" ideas to try so I was happy to have any leads.


I asked the internet sages (search engines) about this "InnoDB buffer usage at 100%", and I found some leads. It turns out that the buffer size is set in "my.ini" (on Windows) or "my.cnf" (on Linux/macOS). The default was 8MB, which is amazingly low for any modern computer. (MySQL on my phone would want a bigger default than that.)


I found my.ini at this path on my Windows 10 PC:

C:\programdata\MySQL\MySQL Server 5.7\my.ini

Edit that file with Notepad or Notepad++, and look for this value:

innodb_buffer_pool_size=8M

I changed mine to 1G first, and now I'm using 2G. I don't even think that's a lot, but this is all on my dev-box so I don't want to give it all of my RAM.

innodb_buffer_pool_size=2G

After you change that my.ini value, be sure to restart your MySQL service. 

  1. Type services.msc in to the Run box on Windows to bring up your services manager. 
  2. Find the MySQL service, and right click it. 
  3. There should be a "restart" option, so click that.
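To double-check that the new setting took effect after the restart, you can query it from Workbench (the value is reported in bytes, so 2G shows up as 2147483648):

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';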

After this change, my 26-second table-join query dropped to sub-second time.

C# IQueryable with a Dynamic Predicate Builder for Entity Framework and Linq

Scenario:

You have a table in a web page (or WinForms) with filterable columns. A user can add filter requirements or not, which makes your query's "where" clause very dynamic. This is easy to solve if you just have a few columns, but what if you have a dozen columns, any of which may or may not have filters that need to be applied to the "where" clause?


Solution:

Use the PredicateBuilder in the LinqKit NuGet package.


1) Install the LinqKit NuGet package.


2) Add the typical using statement for LinqKit

using LinqKit;


3) Create a function to populate your PredicateBuilder

IQueryable<person> GetPeoplePredicate(SearchParametersPeople searchParameters)
{
	var predicate = PredicateBuilder.True<person>();

	if (!string.IsNullOrEmpty(searchParameters.personName))
	{
		predicate = predicate.And(p => p.Name.Contains(searchParameters.personName));
	}
	if (!string.IsNullOrEmpty(searchParameters.personTitle))
	{
		predicate = predicate.And(p => p.JobTitle.Contains(searchParameters.personTitle));
	}
	if (!string.IsNullOrEmpty(searchParameters.personLocation))
	{
		predicate = predicate.And(p => p.Locations.Contains(searchParameters.personLocation));
	}
	if (!string.IsNullOrEmpty(searchParameters.personBio))
	{
		predicate = predicate.And(p => p.Bio.Contains(searchParameters.personBio));
	}
	// Entity Framework requires AsExpandable() so LinqKit can expand the predicate
	return _db.people.AsExpandable().Where(predicate);
}

This example shows several Person fields that might be filterable, and makes a predicate.And call to append each new requirement to the "where" clause of our query. Since this example uses the "and" operation, a row must match all applied filters to be part of the result set.

Notice that Entity Framework requires the AsExpandable() call; we are using EF in this example.

Let's say you have a Keywords collection and want hits for any keyword. Then, use PredicateBuilder.False instead of PredicateBuilder.True. Notice that in this example the conditions are combined with the "or" operator via the predicate.Or call.

IQueryable<Product> SearchProducts (params string[] keywords)
{
  var predicate = PredicateBuilder.False<Product>();
  foreach (string keyword in keywords)
  {
    string temp = keyword; // copy avoids capturing the loop variable in the closure
    predicate = predicate.Or (p => p.Description.Contains (temp));
  }
  return db.Products.Where (predicate);
}
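To see the Or-composition in isolation, here's a self-contained sketch (assuming the LinqKit NuGet package is referenced; the product data is made up) that runs the same keyword predicate against an in-memory list instead of EF:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using LinqKit;

class Product { public string Description { get; set; } }

class PredicateDemo
{
    static void Main()
    {
        var products = new List<Product>
        {
            new Product { Description = "red widget" },
            new Product { Description = "blue gadget" },
            new Product { Description = "green gizmo" },
        };

        // Start from False so the first Or() establishes the initial condition
        var predicate = PredicateBuilder.False<Product>();
        foreach (var keyword in new[] { "widget", "gizmo" })
        {
            string temp = keyword; // copy avoids capturing the loop variable
            predicate = predicate.Or(p => p.Description.Contains(temp));
        }

        // AsExpandable() is only needed for EF; LINQ-to-Objects can just Compile()
        var matches = products.Where(predicate.Compile()).ToList();
        Console.WriteLine(matches.Count); // prints: 2
    }
}
```

The same predicate object could be handed to an EF query via AsExpandable().Where(predicate), which is what the GetPeoplePredicate function above does.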


Now, let's take our dynamic predicate IQueryable query and join with another table. Here we join a Person table with a Company table:

IQueryable<person> query = GetPeoplePredicate(searchParametersPeople);
IQueryable<company> companyQuery = from company in _db.companies
		where company.NameLower.Contains(companyName.ToLower())
		select company;
query = from peopleResult in query
		join c in companyQuery on peopleResult.Company_Id equals c.Id
		select peopleResult;

In this example, we use our PredicateBuilder function to get a dynamically created Person query. Then, we create a second query to filter our Company table by a company name. Lastly, we use LINQ to join our Person query with our Company query to get a result set containing People that have a relationship to our Company table, filtered by the company name predicate.


Now, we have a really powerful dynamic query builder that lets our users run ad hoc queries against any number of fields across multiple related tables!

Make local MySQL or ElasticSearch instance publicly available for a MVC.Net website with NgRok

Background:

I have a GoDaddy hosting account for NovelEssay.com, and I want to use and provide a large data set (like a personal copy of Wikipedia). GoDaddy's MySQL database limit is 1GB, which is good - but not awesome. I could host MySQL at home on my personal computer, but how could my GoDaddy MVC.Net project access the data without a public address on my local instance?

(Note: I use ngrok, but they are certainly not paying me to write about it. Please read the security disclaimer at the bottom of this article. Exposing local resources to the internet is awesome and will have extra security concerns.)


Introducing: NgRok

If you are thinking something like "I want to expose my local service to a public address" then ngrok is a great (free) tool that I like using. My current usage is exposing my local MySQL database service to the internet, so my public services can use its data.


I kick off ngrok on my local machine like this:

ngrok.exe tcp 3306

That starts up a local process that gives me a public address and port that I can connect to using my MVC.Net projects:

After running ngrok.exe, its console output shows the tunnel details: TCP traffic at the public address 0.tcp.ngrok.io on port 123456 (actual value obfuscated) is routed to my localhost:3306, which is exactly where my local MySQL is listening.

Setup:

Head over to ngrok.com and create a free account, and download their ngrok.exe. Your free account has a secret key associated with it. The first time you run ngrok.exe you'll need to set that key like this:

ngrok.exe authtoken <YOUR_AUTHTOKEN>

After that, you're all set to spin up ngrok for exposing local services to the internet.


MVC.Net Connection String:

In order for your MVC.Net connection string to hit the ngrok public address, you'll need to change the data source part of the connection string.

An example server and port for connecting to an exposed MySQL instance:

server=0.tcp.ngrok.io;Port=123456

An example data source for connecting to an exposed SQL Server:

data source=0.tcp.ngrok.io,55555

Notice the comma before the port number in the SQL Server connection string format.


In my experience, I only had to change the data source or server/port. All of the other fields (user, password, timeouts, etc...) did not need any changes when switching to an ngrok publicly exposed database.


Other Uses:

I also use ngrok.com to expose my local ElasticSearch and some local IIS hosted websites for various (short term) demos and other purposes.


When you publish a website to local IIS, that will be serving over some port that you can configure. We'll use 5555 for this example. You can expose that local IIS hosted website to the internet like this:

ngrok.exe http 5555

Or, expose a local ElasticSearch (which usually runs on localhost:9200) like this:

ngrok.exe http 9200

Warning: ElasticSearch doesn't have built-in authentication, so wrapping it with a local nginx proxy is recommended. That's how NovelEssay.com works: the website is hosted via GoDaddy, and it uses an ElasticSearch instance in Azure exposed through ngrok with nginx authentication.


Security Disclaimer:

Be careful with ngrok. Exposing your local ElasticSearch directly is not recommended, because ElasticSearch doesn't have authentication built in. You can use tools like nginx to create a basic-authentication wrapper around ElasticSearch, and then expose the nginx service to the internet via ngrok. The same goes for any other local service. Ngrok is awesome, but be careful.