NovelEssay.com Programming Blog

Exploration of Big Data, Machine Learning, Natural Language Processing, and other fun problems.

Avoid Deadlocks when Reading StandardOutput from a Process in C#

Let's start with some C# code that kicks off a new Process like this:

System.Diagnostics.Process p = System.Diagnostics.Process.Start(info);
p.WaitForExit();
Console.WriteLine(p.StandardOutput.ReadToEnd());

The child process writes to the console, and we want to capture that output by reading StandardOutput (which requires RedirectStandardOutput = true and UseShellExecute = false on the ProcessStartInfo). That will often work great... until it doesn't.


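This code can deadlock, leaving your parent process stuck waiting, in scenarios such as:

  1. The child process never exits, so WaitForExit() never returns.
  2. The child process writes more data than the redirected output pipe's buffer can hold. The child then blocks on its write until the parent reads, but the parent is still blocked in WaitForExit(), so the output has to be consumed as a stream while the child runs.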
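For the second scenario, when only StandardOutput is redirected and you don't need a timeout, a minimal fix is to read first and wait afterwards. The following is just a sketch under those assumptions; the ReadBeforeWait class and Capture method names are made up for illustration. It does not help with a child that never exits, and it is not safe if you also redirect StandardError without draining it.

```csharp
using System;
using System.Diagnostics;

static class ReadBeforeWait
{
    // Hypothetical helper: runs a child process and returns everything it wrote to stdout.
    public static string Capture(string fileName, string arguments)
    {
        var info = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,        // required for redirection
            RedirectStandardOutput = true
        };

        using (Process p = Process.Start(info))
        {
            // Drain the pipe first so the child can never block on a full buffer...
            string output = p.StandardOutput.ReadToEnd();
            // ...and only then wait for the exit code to become available.
            p.WaitForExit();
            return output;
        }
    }
}
```

ReadToEnd returns once the child closes its end of the pipe, which normally happens when it exits, so the following WaitForExit should return almost immediately.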

A good pattern is to attach event handlers to OutputDataReceived (and ErrorDataReceived). After you attach your handlers and start the process, you need to call BeginOutputReadLine (and BeginErrorReadLine) to begin receiving data:

// Requires: using System.Diagnostics; using System.Text; using System.Threading;
// filename, arguments and timeout (in milliseconds) are supplied by the caller.
using (AutoResetEvent outputWaitHandle = new AutoResetEvent(false))
using (AutoResetEvent errorWaitHandle = new AutoResetEvent(false))
{
    using (Process process = new Process())
    {
        process.StartInfo.FileName = filename;
        process.StartInfo.Arguments = arguments;
        process.StartInfo.UseShellExecute = false;
        process.StartInfo.RedirectStandardOutput = true;
        process.StartInfo.RedirectStandardError = true;
        StringBuilder output = new StringBuilder();
        StringBuilder error = new StringBuilder();

        process.OutputDataReceived += (sender, e) =>
        {
            if (e.Data == null)
            {
                // A null Data value signals the end of the stream.
                outputWaitHandle.Set();
            }
            else
            {
                output.AppendLine(e.Data);
            }
        };
        process.ErrorDataReceived += (sender, e) =>
        {
            if (e.Data == null)
            {
                errorWaitHandle.Set();
            }
            else
            {
                error.AppendLine(e.Data);
            }
        };
        process.Start();
        process.BeginOutputReadLine();
        process.BeginErrorReadLine();
        if (process.WaitForExit(timeout) &&
            outputWaitHandle.WaitOne(timeout) &&
            errorWaitHandle.WaitOne(timeout))
        {
            // Your process finished; check process.ExitCode now.
        }
        else
        {
            // Your process timed out.
        }
    }
}


This pattern also makes it easy to give your child process a finite timeout. Many other patterns for managing child processes assume an infinite wait, which is rarely a good choice when the child might hang.
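One thing a timeout leaves open is what to do with a child that is still running. Below is a rough sketch of one option: kill it. Note that this swaps in a different technique from the event handlers, namely reading both streams with StreamReader.ReadToEndAsync, to keep the example short. The ProcessRunner class, the TryRun method and its parameters are hypothetical names, and the sketch assumes the child has no grandchildren holding the pipes open after it exits.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class ProcessRunner
{
    // Hypothetical helper: returns true if the child exited within timeoutMs.
    public static bool TryRun(string fileName, string arguments, int timeoutMs,
                              out string stdout, out string stderr, out int exitCode)
    {
        stdout = string.Empty;
        stderr = string.Empty;
        exitCode = -1;

        var info = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(info))
        {
            // Start draining both pipes right away so the child cannot block on a full buffer.
            Task<string> outTask = process.StandardOutput.ReadToEndAsync();
            Task<string> errTask = process.StandardError.ReadToEndAsync();

            if (!process.WaitForExit(timeoutMs))
            {
                // Timed out: stop the child instead of leaving it running.
                try { process.Kill(); }
                catch (InvalidOperationException) { /* it exited just before the Kill call */ }
                return false;
            }

            // The child has exited, so both reads complete once the pipes close.
            stdout = outTask.Result;
            stderr = errTask.Result;
            exitCode = process.ExitCode;
            return true;
        }
    }
}
```

A call might look like `ProcessRunner.TryRun("myTool.exe", "--verbose", 30000, out o, out e, out code)`, where the file name and arguments are placeholders for your own child process.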

Converting a text string to a Brushes Color object in C#

We want to go dynamically from a list of color names (e.g., Red, Yellow, White) to the Brush objects used in my previous article, Adding Text to Images with C# .Net Bitmap objects. Here are a few ways to map strings to Color and Brush objects.


Try this one first. Simply map your string to a Color using the FromName function like this:

Color yellowColor = Color.FromName("Yellow");

Alternatively, you can use a ColorConverter like this:

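// Requires: using System.ComponentModel; using System.Drawing;
// Either way of obtaining the converter works.
TypeConverter typeConverter1 = TypeDescriptor.GetConverter(typeof(Color));
TypeConverter typeConverter2 = new ColorConverter();
Color yellowColor2 = (Color)typeConverter1.ConvertFromString("Yellow");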

Lastly, you can use reflection on the static properties of Color (or Brushes) like this:

Color yellowColor3 = (Color)typeof(Color).GetProperty("Yellow").GetValue(null, null);
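To get from a color name all the way to a Brush, something like the following sketch may help. The ColorNames class and its TryGetColor and CreateBrush methods are made-up names for illustration. It relies on the fact that Color.FromName does not throw for an unknown name; it returns a Color whose IsKnownColor property is false, so you can fall back to a default.

```csharp
using System;
using System.Drawing;

static class ColorNames
{
    // Hypothetical helper: true if the name matches a predefined (known) color.
    public static bool TryGetColor(string name, out Color color)
    {
        color = Color.FromName(name);
        return color.IsKnownColor;
    }

    // Hypothetical helper: builds a brush from a color name, using the fallback
    // color when the name is not recognized. The caller owns (and should dispose) the brush.
    public static SolidBrush CreateBrush(string name, Color fallback)
    {
        Color color;
        return new SolidBrush(TryGetColor(name, out color) ? color : fallback);
    }
}
```

You could then write `using (SolidBrush brush = ColorNames.CreateBrush("Yellow", Color.Black))` and pass the brush to DrawString in place of Brushes.Black.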



Adding Text to Images with C# .Net Bitmap objects

This article will show you some examples of how to add text to images with C# .Net Bitmap objects. This works for jpg, png, bitmap, and other image format types.

Procedure Overview:

  1. Create a Bitmap object with your source image.
  2. Create a RectangleF object around your source image.
  3. Create a Graphics object using your source Bitmap object
  4. Set several configuration values on your Graphics object that make the text look better in most cases.
  5. Draw your text string to the rectangle with all of the specified settings.
  6. Flush the changes and save your final output.

Here's some example code that implements the above procedure:

// Requires: using System.Drawing; using System.Drawing.Drawing2D; using System.Drawing.Text;
// Load the original image. Can be jpg, png, bmp, etc...
Bitmap bmp = new Bitmap("myImage.jpg");
// Create a rectangle for the entire bitmap
RectangleF rectf = new RectangleF(0, 0, bmp.Width, bmp.Height);
// Create graphic object that will draw onto the bitmap
Graphics g = Graphics.FromImage(bmp);
// ------------------------------------------
// Ensure the best possible quality rendering
// ------------------------------------------
// The smoothing mode specifies whether lines, curves, and the edges of filled areas use smoothing (also called antialiasing). One exception is that path gradient brushes do not obey the smoothing mode. Areas filled using a PathGradientBrush are rendered the same way (aliased) regardless of the SmoothingMode property.
g.SmoothingMode = SmoothingMode.AntiAlias;
// The interpolation mode determines how intermediate values between two endpoints are calculated.
g.InterpolationMode = InterpolationMode.HighQualityBicubic;
// Use this property to specify either higher quality, slower rendering, or lower quality, faster rendering of the contents of this Graphics object.
g.PixelOffsetMode = PixelOffsetMode.HighQuality;
// This one is important
g.TextRenderingHint = TextRenderingHint.AntiAliasGridFit;
// Create string formatting options (used for alignment)
StringFormat format = new StringFormat()
{
    Alignment = StringAlignment.Center,
    LineAlignment = StringAlignment.Center
};
// Draw the text onto the image
g.DrawString("Visit StyleMyImage.com", new Font("Tahoma",8), Brushes.Black, rectf, format);
// Flush all graphics changes to the bitmap
g.Flush();
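// Now save or use the bitmap. Here "image" is assumed to be a PictureBox on your form;
// call bmp.Save(...) instead if you want to write the result to a file.
image.Image = bmp;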

The following are common items you may want to customize: Fonts, Size, Color, Text Position, etc...


If you want to change your font type or font size, edit the font name and point size passed to the Font constructor in the DrawString call, for example:

new Font("Tahoma",14)

If you want the text to be Yellow, change the 

Brushes.Black 

to 

Brushes.Yellow


If you want the text to be in the bottom right corner, change the Alignment values in the StringFormat object.

StringFormat format = new StringFormat()
{
    Alignment = StringAlignment.Far,
    LineAlignment = StringAlignment.Far
};


Finally, if you want to change the Text drawn on to the image, change the first argument passed to g.DrawString from Visit StyleMyImage.com to whatever you'd like it to say.
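Putting those customizations together, here is a sketch of a reusable method that also disposes the GDI+ objects and saves the result to a file. The ImageCaption class, the AddText method and its parameters are hypothetical names; the font, the alignment and the PNG output format are just example choices. It assumes the source image is not in an indexed pixel format (Graphics.FromImage throws for those) and that outputPath differs from inputPath, since the Bitmap keeps the source file locked while it is open.

```csharp
using System;
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;
using System.Drawing.Text;

static class ImageCaption
{
    // Hypothetical helper: draws text in the bottom right corner and saves a PNG copy.
    public static void AddText(string inputPath, string outputPath, string text, Brush brush)
    {
        using (Bitmap bmp = new Bitmap(inputPath))
        using (Graphics g = Graphics.FromImage(bmp))
        using (Font font = new Font("Tahoma", 14))
        using (StringFormat format = new StringFormat())
        {
            // Far/Far places the text in the bottom right corner of the rectangle.
            format.Alignment = StringAlignment.Far;
            format.LineAlignment = StringAlignment.Far;

            // Same quality settings as in the example above.
            g.SmoothingMode = SmoothingMode.AntiAlias;
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.PixelOffsetMode = PixelOffsetMode.HighQuality;
            g.TextRenderingHint = TextRenderingHint.AntiAliasGridFit;

            RectangleF rectf = new RectangleF(0, 0, bmp.Width, bmp.Height);
            g.DrawString(text, font, brush, rectf, format);
            g.Flush();

            // Write the modified bitmap to a new file.
            bmp.Save(outputPath, ImageFormat.Png);
        }
    }
}
```

For example, `ImageCaption.AddText("myImage.jpg", "myImage-captioned.png", "Visit StyleMyImage.com", Brushes.Yellow)` would produce a captioned copy next to the original.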


Tesseract 4.0 C# .Net Wrapper Released!

This article is about the Tesseract 4.0 C# .Net Wrapper that is only a few days old as of April 2017.


You are probably familiar with the Tesseract 3.04 C# .Net Wrapper found here:

https://github.com/charlesw/tesseract

That is already available as a NuGet package and has many downloads.


Just about a week ago, an Alpha release of the Tesseract 4.0 C# .Net wrapper was published here:

https://github.com/tdhintz/tesseract4win64

This is an x64 only .Net assembly. 


Find the Tesseract 4.0 language packs here:

https://github.com/tesseract-ocr/tessdata

When I load only the English language pack, it uses a reasonable 180MB of RAM. When I tried to load all languages, it used over 8GB of RAM.


This build is incredibly slow in debug mode. It runs 5-8X slower in debug mode than in release mode, so watch out for that.


Amazingly, the .Net wrapper API works exactly the same as the Tesseract C# .Net 3.0 wrapper! (When you read about how much the engine has changed and that it now uses LSTM networks, this will be even more amazing to you.)


A very simple usage example works like this:

// myImage is the image to recognize, already loaded for the wrapper.
using (var tessEngine = new TesseractEngine(tessdataPath, "eng"))
using (Page page = tessEngine.Process(myImage))
{
    string resultText = page.GetText();
}

Be sure to drop these two files into an x64 sub-folder of your \bin\debug or \bin\release folder, like this:

.\bin\release\x64\libtesseract400.dll
.\bin\release\x64\liblept1741.dll

When the Tesseract.dll 4.0 assembly loads, it needs to find those DLLs; otherwise it will throw an exception in your application.
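If you'd rather fail early with a clear message than hit that exception, you could check for the files yourself before creating the engine. This is only a sketch using plain file system calls; the NativeDependencyCheck class and FindMissing method are made-up names, and the two file names are simply the ones listed above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class NativeDependencyCheck
{
    // The native DLLs named above for this Alpha build.
    static readonly string[] RequiredDlls = { "libtesseract400.dll", "liblept1741.dll" };

    // Hypothetical helper: returns the full paths of required DLLs that are missing
    // from the x64 sub-folder of the given base directory.
    public static List<string> FindMissing(string baseDirectory)
    {
        var missing = new List<string>();
        foreach (string dll in RequiredDlls)
        {
            string path = Path.Combine(baseDirectory, "x64", dll);
            if (!File.Exists(path))
            {
                missing.Add(path);
            }
        }
        return missing;
    }
}
```

At startup you might call `NativeDependencyCheck.FindMissing(AppDomain.CurrentDomain.BaseDirectory)` and log or report any paths it returns.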


There is a very nice Accuracy and Performance overview report of 3.04 versus 4.0 here:

https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance

I generally agree with its findings, but my own tests show much less improvement over 3.04. I have a regression test that contains about 2200 pages, and I'm observing plenty of slower and less precise OCR results with Tesseract 4.0. It is certainly not all "better and faster" as of April 2017. Since this is an extremely new Alpha release, I have high hopes that it will improve over time.