NovelEssay.com Programming Blog

Exploration of Big Data, Machine Learning, Natural Language Processing, and other fun problems.

Make local MySQL or ElasticSearch instance publicly available for a MVC.Net website with NgRok

Background:

I have a GoDaddy hosting account for NovelEssay.com, and I want to use and provide a large data set (like a personal copy of Wikipedia). GoDaddy's MySQL database limits are 1GB, which is good - but not awesome. I could host MySQL at home on my personal computer, but how could my GoDaddy MVC.Net project access the data without a public address on my local instance?

(Note: I use ngrok, but they are certainly not paying me to write about it. Please read the security disclaimer at the bottom of this article. Exposing local resources to the internet is awesome and will have extra security concerns.)


Introducting: NgRok

If you are thinking something like "I want to expose my local service to a public address" then ngrok is a great (free) tool that I like using. My current usage is exposing my local MySQL database service to the internet, so my public services can use its data.


I kick off ngrok on my local machine like this:

ngrok.exe tcp 3306

That starts up a local process that gives me a public address and port that I can connect to using my MVC.Net projects:

After running ngrok.exe, you should see something like the above. This shows TCP traffic at the public address 0.tcp.ngrok.io on port 123456 (actual value obfuscated) is routed to my localhost:3306, which is exactly where my local MySQL is listening.

Setup:

Head over to ngrok.com and create a free account, and download their ngrok.exe. Your free account has a secret key associated with it. The first time you run ngrok.exe you'll need to set that key like this:

ngrok.exe authtoken <YOUR_AUTHTOKEN>

After that, you're all set to spin up ngrok for exposing local services to the internet.


MVC.Net Connection String:

In order for your mvc.net connection string to hit the ngrok pubilc address, you'll need to change the data source part of the connection string.

An example server and port for connecting to an exposed MySQL instance:

server=0.tcp.ngrok.io;Port=123456
An example data source for connecting to an exposed SQL Server:
data source=0.tcp.ngrok.io,55555

Notice the comma before the port number in the SQL Server connection string format.


In my experience, I only had to change data source or server/port. All of the other fields (user, password, timeouts, etc...) did not need any changes when switching to a ngrok publicly exposed database.


Other Uses:

I also use ngrok.com to expose my local ElasticSearch and some local IIS hosted websites for various (short term) demos and other purposes.


When you publish a website to local IIS, that will be serving over some port that you can configure. We'll use 5555 for this example. You can expose that local IIS hosted website to the internet like this:

ngrok.exe http 5555

Or, a local ElasticSearch like this (with usually runs on localhost:9200):

ngrok.exe http 9200

Warning: ElasticSearch doesn't have built in authentication, so wrapping that with an nginx locally is recommended. I do that with NovelEssay.com in that I'm hosting my website via GoDaddy and use an ElasticSearch in Azure with ngrok and nginx authentication.


Security Disclaimer:

Be careful with ngrok. Exposing your local ElasticSearch directly is not recommended, because ElasticSearch doesn't have authentication built in. You can use things like nginx to create a basic authentication wrapper around ElasticSearch, and then expose the nginx service to the internet via ngrok. The same is with any other local service. Ngrok is awesome, but be careful.

Creating an Excel report with C# using a MySql data source

This blog article will share how to create a multi-worksheet Excel workbook file using C# that fetches report data from a MySql data source.


First, create a Visual Studio C# project and install the EPPlus and MySql.Data Nuget packages.




Next, get yourself a MySqlConnection object connected to your MySql server. If you have a specific database, you'll want to add that to your connection string. My case didn't need to do that. Obviously, change your user, password, and server values to whatever makes sense for your environment. 

                string connstring = string.Format("Server={0}; UID={1}; password={2};default command timeout={3};", serverName, userName, Password, commandTimeout);
                MySqlConnection connection = new MySqlConnection(connstring);
                connection.Open();

Next, you set up an ExcelPackage object and create worksheets as the code below shows. Notice how the column headers are set and styled.

Then, I fetch my Sql query from a helper class and execute that using a MySqlCommand. The results of my query are fetched using the data reader. As it iterates through the data reader, the code sets the query result values in to specific Excel cell locations using the row and column index values.


                string eppOutput = @"C:\myEppOutput.xlsx";
                FileInfo fi = new FileInfo(eppOutput);
                using (ExcelPackage package = new ExcelPackage(fi))
                {
                    // I two queries that run. Results are put on separate worksheets.
                    for (int queryIdx = 1; queryIdx <= 2; queryIdx++)
                    {
                        string worksheetName = "Worksheet" + queryIdx;
                        ExcelWorksheet ws = package.Workbook.Worksheets.Add(worksheetName);
                        // Each page has this four column headers:
                        // Set worksheet's column headers
                        ws.Cells["A1"].Value = "ColumnA";
                        ws.Cells["B1"].Value = "ColumnB";
                        ws.Cells["C1"].Value = "ColumnC";
                        ws.Cells["D1"].Value = "ColumnD";
                        // Make column headers bold
                        ws.Cells["A1:D1"].Style.Font.Bold = true;
                        query = Utils.GetDatabaseQueryString(queryIdx);
                        cmd = new MySqlCommand(query, dbCon.Connection);
                        int row = 2;
                        int col = 1;
                        using (var dataReader = cmd.ExecuteReader())
                        {
                            while (dataReader.Read())
                            {
                                col = 1;
                                ws.SetValue(row, col++, dataReader.GetString(0));
// Column B and C are always integer values:
                                ws.SetValue(row, col++, int.Parse(dataReader.GetString(1)));
                                ws.SetValue(row, col++, int.Parse(dataReader.GetString(2)));
                                ws.SetValue(row, col++, dataReader.GetString(3));
                                row++;
                            }
                        }
                    }
                    package.Save();
                }


Note that the col index resets each pass through the dataReader.Read, and the row index resets each time I create a new worksheet.

Finally, we can call package.Save to write out my ExcelPackage object to a file specified in the argument string that we passed to its constructor.



Writing a Windows Service with TopShelf

This article will show the C# code to use the TopShelf framework to write a Windows Service.


First, use Visual Studio and create a new C# console application and install the TopShelf nuget package.


Next, change your Main function to look roughly like this:

    public class Program
    {
        public static void Main()
        {
            TownCrier tc = new TownCrier();
            tc.Start();
           
            HostFactory.Run(x =>                                 //1
            {
                x.Service<TownCrier>(s =>                        //2
                {
                    s.ConstructUsing(name => new TownCrier());     //3
                    s.WhenStarted(tc => tc.Start());              //4
                    s.WhenStopped(tc => tc.Stop());               //5
                });
                x.RunAsLocalSystem();                            //6
                x.SetDescription("NovelEssayAgent v" + Assembly.GetExecutingAssembly().GetName().Version);        //7
                x.SetDisplayName("NovelEssayAgent");                       //8
                x.SetServiceName("NovelEssayAgent");                       //9
                x.AfterInstall(() => NotificationHelper.DoAfterInstall());
                x.BeforeUninstall(() => NotificationHelper.DoBeforeUninstall());
            });                                                  //10
             
            return;
        }
    }

Now, you should be asking:

  1. What is the NotificationHelper?
  2. What is the TownCrier?

The NotificationHelper is a class that lets you handle before and after install events. I'm simple writing out some log messages on those events, but you might want to automate additional deployment tasks in those events.

    static public class NotificationHelper
    {
        static public void DoAfterInstall()
        {
            File.AppendAllText("c:\\mylog.txt", "NovelEssayAgent v" + VersionHelper.GetAssemblyVersion() + " was installed.", EventLogEntryType.Information);
        }
        static public void DoBeforeUninstall()
        {
            File.AppendAllText("c:\\mylog.txt", "NovelEssayAgent v" + VersionHelper.GetAssemblyVersion() + " was uninstalled.", EventLogEntryType.Information);
        }
    }


The TownCrier is a class that I setup a never ending timer that polls looking for work. It also handles the OnStart and OnStop service events, so I can log or do other tasks when those events happen.


The EssayMgr class encapsulates work being done. EssayMgr's RunLifeCycle function is called every hbSetting seconds. You can make that amount of time configurable in many different ways (database, config file, etc...).

    public class TownCrier
    {
        private static EssayMgr mMgr = new EssayMgr();
        readonly System.Timers.Timer _timer;
        /// <summary>
        /// The constructor sets up a timer that will call RunLifeCycle when it fires.
        /// </summary>
        public TownCrier()
        {
            int hbSetting = 1;
            _timer = new System.Timers.Timer(hbSetting * 1000) { AutoReset = true };
            _timer.Elapsed += (sender, eventArgs) => mMgr.RunLifeCycle();
        }
        public void Start()
        {
            File.AppendAllText("c:\\mylog.txt", "NovelEssayAgent v" + VersionHelper.GetAssemblyVersion() + " was started.", EventLogEntryType.Information);
            _timer.Start();
        }
        public void Stop()
        {
            _timer.Stop();
            mMgr.StopAll();
            File.AppendAllText("c:\\mylog.txt", "NovelEssayAgent v" + VersionHelper.GetAssemblyVersion() + " was stopped.", EventLogEntryType.Information);
        }
    }


Add some code to do some work in the EssayMgr.RunLifeCycle function, and that's you've created your very own C# Windows Service!


Your code should build in to an exe, and you can installer and start your service using topshelf's commands on your executable via the command line:

http://docs.topshelf-project.com/en/latest/overview/commandline.html




Mvc.Net with AngularJS and Pagination

My personal current "goto" for web based pagination uses angular-paginate-anything plugin with a Mvc.Net backend. As with most web development tools, there's many ways to use something like this. This blog will show one full example of how to do Mvc.Net with AngularJS and Pagination using angular-paginate-anything.


I'm going to try and keep this very simple, so excuse some of the nuance being skipped.

0) Create a MVC.Net project and a controller/view. I'll leave that outside the scope of this article.


1) Create an AngularJS App & Controller. Nothing fancy. Just simple stuff:

Link the AngularJS:

    <script src="https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.3.15/angular.min.js"></script>
Create a JS file with the AngularJS code:
var app = angular.module('MyPersonalApp', ['bgf.paginateAnything']);
app.controller('MyPersonalCtrl', ['$scope', '$http', '$filter', '$interval', '$location', function ($scope, $http, $filter, $interval, $location) {
...
}]);
In your cshtml view, link your new javascript file, and make some divs that hold your app:
<script src="~/Scripts/Search/MyPersonalApp.js"></script>
<div id="MyPersonalAppId" ng-app="MyPersonalApp">
    <div ng-controller="MyPersonalCtrl" ng-cloak>
...


2) Download and link the javascript in your view (cshtml).

        <script src="~/Scripts/paginate-anything-tpls.js"></script>


3) In side your app divs, add a table to hold the paginated results and bgf-pagination markup like this:

                            <table>
                                <tr>
                                    <th>Name</th>
                                </tr>
                                <tr ng-repeat="result in searchResults">
                                    <td>{{result.Name}}</td>
                                </tr>
                            </table>
                            <bgf-pagination collection="searchResults"
                                            url="'/api/GetMySearchResults'"
                                            url-params="searchParams"
                                            num-items="searchResultsTotalItems">
                            </bgf-pagination>


4) You'll notice that the bgf-pagination references several things. First, lets look at the searchParams. Ideally, you'll want to hook that up to a searchbox of some sort. For now, we'll hard code a query string to the searchParams in our angular controller like this:

app.controller('MyPersonalCtrl', ['$scope', '$http', '$filter', '$interval', '$location', function ($scope, $http, $filter, $interval, $location) {
    $scope.searchParams = {};
    $scope.searchParams.queryName= "Jo";


5) Next, notice that /api/GetFreeSearchResults URL points at a web API endpoint. We'll set that up by installing the WebAPI Nuget project in our MVC.Net project.

    


6) Create new WebAPI Controller in your project, and make it look roughly like this:

    [RoutePrefix("api")]
    public class SearchApiController : ApiController
    {
        MyDbEntities _db;

        public SearchApiController()
        {
            _db = new MyDbEntities();
_db.Configuration.ProxyCreationEnabled = false; } [Route("GetMySearchResults"), HttpGet()]
[WithPaging] public IHttpActionResult Search([FromUri] PagingModeFreeSearch pagingModel , [FromUri] string queryName = null
) { string localQueryName = string.Empty; if(!string.IsNullOrEmpty(queryName ))
{ localQueryName = queryName;
} var peopleResultsTotal= _db.People.Where(t => t.Name.ToLower().Contains(localQueryName));
var peopleResultsSubset = peopleResultsTotal.OrderBy(o => o.Id).Skip(pagingModel.From).Take(pagingModel.Size);
PagingModeMySearch result = new PagingModeMySearch();
result.From = pagingModel.From; result.To = pagingModel.To; result.TotalRecordCount = peopleResultsTotal.Count();
result.People = peopleResultsSubset.ToArray();
return this.PagedPartialContent(result.People , result.From, result.To, (int)result.TotalRecordCount);
} }

Notice the routing annotations are setup. This also assumes you have a DB with a table named People as your MyDbEntities data source.


7) Where does the [WithPaging] annotation come from? Make a file named NgPaginateModels.cs, and use this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;
using System.Web;
using System.Web.Http;
using System.Web.Http.Controllers;
using System.Web.Http.Filters;
using System.Web.Http.Results;
namespace MyProject.MVC.Models
{
    // Found this awesome sauce here:
    // https://github.com/begriffs/angular-paginate-anything/wiki/How-To-Configure-ASP.NET-Web-API
    public class WithPagingAttribute : ActionFilterAttribute
    {
        private readonly string _paramName;
        private static readonly Regex RangeHeaderRegex = new Regex(@"(?<from>\d+)-(?<to>\d*)", RegexOptions.Compiled);
        public WithPagingAttribute(string paramName = "pagingModel")
        {
            _paramName = paramName;
        }
        public override void OnActionExecuting(HttpActionContext actionContext)
        {
            object data;
            if (actionContext.ActionArguments.TryGetValue(_paramName, out data))
            {
                //manipulate/inject paging info from headers
                var pagingModel = data as PagingModel;
                pagingModel = IncomingPagingModel(actionContext.Request.Headers, pagingModel);
                actionContext.ActionArguments[_paramName] = pagingModel;
            }
        }
        private static PagingModel IncomingPagingModel(HttpRequestHeaders headers, PagingModel pagingModel)
        {
            if (pagingModel == null)
                pagingModel = new PagingModel();
            IEnumerable<string> rangeUnitValues;
            IEnumerable<string> rangeValues;
            if (headers.TryGetValues("Range-Unit", out rangeUnitValues)
                && rangeUnitValues.First().Equals("items", StringComparison.OrdinalIgnoreCase)
                && headers.TryGetValues("Range", out rangeValues))
            {
                var rangeHeaderMatch = RangeHeaderRegex.Match(rangeValues.First());
                if (!string.IsNullOrWhiteSpace(rangeHeaderMatch.Groups["from"].Value))
                    pagingModel.From = int.Parse(rangeHeaderMatch.Groups["from"].Value);
                if (!string.IsNullOrWhiteSpace(rangeHeaderMatch.Groups["to"].Value))
                    pagingModel.To = int.Parse(rangeHeaderMatch.Groups["to"].Value);
            }
            return pagingModel;
        }
    }
    public class PartialNegotiatedContentResult<T> : OkNegotiatedContentResult<T>
    {
        readonly int _requestedFrom;
        readonly int _requestedTo;
        readonly int _totalCount;
        public PartialNegotiatedContentResult(T content, ApiController controller, int requestedFrom, int requestedTo, int totalCount)
            : base(content, controller)
        {
            _requestedFrom = requestedFrom;
            _requestedTo = requestedTo;
            _totalCount = totalCount;
            if (_requestedTo > _totalCount && _totalCount > 0)
                _requestedTo = _totalCount;
        }
        public override async Task<HttpResponseMessage> ExecuteAsync(CancellationToken cancellationToken)
        {
            var response = await base.ExecuteAsync(cancellationToken);
            const string unit = "items";
            response.Headers.Add("Accept-Ranges", unit);
            response.Headers.Add("Range-Unit", unit);
            if (_totalCount > 0)
            {
                response.StatusCode = HttpStatusCode.PartialContent;
                response.Content.Headers.ContentRange = new ContentRangeHeaderValue(_requestedFrom, _requestedTo, _totalCount)
                {
                    Unit = unit
                };
            }
            else
            {
                response.StatusCode = HttpStatusCode.NoContent;
                response.Content.Headers.ContentRange = new ContentRangeHeaderValue(0);
            }
            return response;
        }
    }
    public static class ApiControllerExtensions
    {
        public static IHttpActionResult PagedPartialContent<T>(
            this ApiController controller,
            IEnumerable<T> content, int from, int to, int count) where T : class
        {
            return new PartialNegotiatedContentResult<IEnumerable<T>>(content, controller, from, to, count);
        }
    }
}


8) What about the PagingModel class? Create a PagingModel base class, and then inherit from it like this:

    public class PagingModeMySearch: PagingModel
    {
        public PagingModeMySearch()
        {
        }
        public Person[] People { get; set; }
    }
    public class PagingModel
    {
        public PagingModel()
        {
            From = 0;
            To = 4;
        }
        public int From { get; set; }
        public int To { get; set; }
        public long TotalRecordCount { get; set; }
        public int Page { get { return (To / Size) + 1; } }
        public int Size { get { return To - From + 1; } }
    }


If you spin this up, your web page should load and the table should show Names like "Joe", "Joseph", "John", etc... with the hard coded "Jo" search string we put in our AngularJS controller.


That's really about all there is to making this work. The rest is search controls and styling up the results.


You can get angular-paginate-anything on Github here:

https://github.com/begriffs/angular-paginate-anything

Machine Learning for Network Security with C# and Vowpal Wabbit


This article will discuss a solution to the KDD99 Network Security contest using C# and the Vowpal Wabbit machine learning library. This is a very quick and naive approach that yields 97%+ accuracy. Proper data preprocessing and better feature selection should result in much better predictions. Details about that Network Security contest can be found here:

http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html


Start by creating a new Visual Studio C# project and import the Vowpal Wabbit Nuget package.


Then, we build a class that describes the data records and the target labels that we have for training.

    public class DataRecord : VWRecord
    {
        public VWRecord GetVWRecord()
        {
            return (VWRecord) this;
        }

        public string label { get; set; }

        public int labelInt { get; set; }

        public bool isKnownAttackType { get; set; }
    }


Create a VWRecord class with Vowpal Wabbit annotations. This is used by the VW to map features to the correct format. In this example, I set Enumerize=true as much as I can and lump the features in to a single feature group. I didn't try splitting up the features in to different groups, but that seems like a smart and reasonable thing to explore.

    public class VWRecord
    {
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float duration { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public string protocol_type { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public string service { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public string flag { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float src_bytes { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_bytes { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float land { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float wrong_fragment { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float urgent { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float hot { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_failed_logins { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float logged_in { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_compromised { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float root_shell { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float su_attempted { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_root { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_file_creations { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_shells { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_access_files { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float num_outbound_cmds { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float is_host_login { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float is_guest_login { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float count { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float srv_count { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float serror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float srv_serror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float rerror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float srv_rerror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float same_srv_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float diff_srv_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float srv_diff_host_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_count { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_srv_count { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_same_srv_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_diff_srv_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_same_src_port_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_srv_diff_host_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_serror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_srv_serror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_rerror_rate { get; set; }
        [Feature(FeatureGroup = 'a', Enumerize = true)]
        public float dst_host_srv_rerror_rate { get; set; }

    }


Next, we create a Vowpal Wabbit wrapper class. After the VWWrapper instantiation, call Init before any calls to Train or Predict.

    public class VWWrapper
    {
        VW.VowpalWabbit<VWRecord> vw = null;

        public void Init(bool train = true)
        {
            vw = new VW.VowpalWabbit<VWRecord>(new VowpalWabbitSettings
            {
                EnableStringExampleGeneration = true,
                Verbose = true,
                Arguments = string.Join(" "
                , "-f vw.model"
                , "--progress 100000"
                , "-b 27"
                )
            });

        }

        public void Train(DataRecord record)
        {
            VWRecord vwRecord = record.GetVWRecord();
            SimpleLabel label = new SimpleLabel() { Label = record.labelInt };
            vw.Learn(vwRecord, label);
        }

        public float Predict(VWRecord record)
        {
            return vw.Predict(record, VowpalWabbitPredictionType.Scalar);
        }

The whole program is just a few lines. Open a data source to the training or test data set. Loop through the records and call train or predict. On the predictions, compare the prediction against the actual label and score appropriately.

Notice the cutoff score of 0.4 in the prediction function. Vowpal Wabbit will give a prediction between 0 and 1. You can tune your cutoff score to meet whatever precision and recall behaviors suite your needs. A higher cutoff score will result in some more "attack" records being predicted as "normal".

        static string trainingSource = @"C:\kaggle\kdd99\kddcup.data.corrected";
        static string testSource = @"C:\kaggle\kdd99\corrected";

        static VWWrapper vw = new VWWrapper();

        static int trainingRecordCount = 0;
        static int testRecordCount = 0;
        static int evaluateRecordCount = 0;

        static int correctNormal = 0;
        static int correctAttack = 0;
        static int totalNormal = 0;
        static int totalAttack = 0;

        static void Main(string[] args)
        {
            Stopwatch swTotal = new Stopwatch();
            swTotal.Start();
            vw.Init();
            DoTraining();
            DoEvaluate();
            swTotal.Stop();
            Console.WriteLine("Done. ElapsedTime: " + swTotal.Elapsed);
        }


        static void DoEvaluate()
        {
            float cutoffScore = 0.4f;

            DataSource sourceEval = new DataSource(testSource);
            while (sourceEval.NextRecord())
            {
                if(sourceEval.Record.isKnownAttackType)
                {
                    float prediction = vw.Predict(sourceEval.Record.GetVWRecord());

                    if(sourceEval.Record.labelInt == 0)
                    {
                        totalNormal++;
                        if (prediction < cutoffScore) correctNormal++;
                    }
                    if(sourceEval.Record.labelInt == 1)
                    {
                        totalAttack++;
                        if (prediction >= cutoffScore) correctAttack++;
                    }
                    evaluateRecordCount++;
                }
            }

            Console.WriteLine("Evaluate Complete. evaluateRecordCount = " + evaluateRecordCount);
            Console.WriteLine("Evaluate totalNormal = " + totalNormal + " correctNormal = " + correctNormal);
            Console.WriteLine("Evaluate totalAttack = " + totalAttack + " correctAttack = " + correctAttack);
            Console.WriteLine("Evaluate DONE!");
        }

        static void DoTraining()
        {
            DataSource source = new DataSource(trainingSource);
            while (source.NextRecord())
            {
                vw.Train(source.Record);
            }
            Console.WriteLine("Train Complete. trainingRecordCount = " + trainingRecordCount);
        }


This solution was incredibly quick and easy to implement and yields a 99.6% correct prediction of normal records, and about 97% correct prediction of attack records.


Finally, here is the data source example so you can cut & paste to try this out yourself:

    public class DataSource
    {
        // Columns
        private static int COLUMN_COUNT = 42;
        private static int COLUMN_COUNT_TEST = 42;

        // Current Record Attributes
        public DataRecord Record = new DataRecord();


        private string sourceReport;

        private System.IO.StreamReader fileReader;
        private int sourceIndex;

        public DataSource(string sourceReport)
        {
            this.sourceReport = sourceReport;
            Reset();
        }

        public bool NextRecord()
        {
            bool foundRecord = false;
            while (!fileReader.EndOfStream)
            {
                try
                {
                    //Processing row
                    string line = fileReader.ReadLine();
                    string[] fields = line.TrimEnd('.').Split(',');

                    // Expect COLUMN_COUNT columns
                    if (fields.Count() != COLUMN_COUNT && fields.Count() != COLUMN_COUNT_TEST)
                    {
                        throw new Exception(string.Format("sourceReportParser column count [{0}] != expected COLUMN_COUNT [{1}]", fields.Count(), COLUMN_COUNT));
                    }

                    Record = new DataRecord();

                    if (fields.Count() == COLUMN_COUNT)
                    {
                        Record.duration = float.Parse(fields[0]);
                        Record.protocol_type = fields[1];
                        Record.service = fields[2];
                        Record.flag = fields[3];
                        Record.src_bytes = float.Parse(fields[4]);
                        Record.dst_bytes = float.Parse(fields[5]);
                        Record.land = float.Parse(fields[6]);
                        Record.wrong_fragment = float.Parse(fields[7]);
                        Record.urgent = float.Parse(fields[8]);
                        Record.hot = float.Parse(fields[9]);
                        Record.num_failed_logins = float.Parse(fields[10]);
                        Record.logged_in = float.Parse(fields[11]);
                        Record.num_compromised = float.Parse(fields[12]);
                        Record.root_shell = float.Parse(fields[13]);
                        Record.su_attempted = float.Parse(fields[14]);
                        Record.num_root = float.Parse(fields[15]);
                        Record.num_file_creations = float.Parse(fields[16]);
                        Record.num_shells = float.Parse(fields[17]);
                        Record.num_access_files = float.Parse(fields[18]);
                        Record.num_outbound_cmds = float.Parse(fields[19]);
                        Record.is_host_login = float.Parse(fields[20]);
                        Record.is_guest_login = float.Parse(fields[21]);
                        Record.count = float.Parse(fields[22]);
                        Record.srv_count = float.Parse(fields[23]);
                        Record.serror_rate = float.Parse(fields[24]);
                        Record.srv_serror_rate = float.Parse(fields[25]);
                        Record.rerror_rate = float.Parse(fields[26]);
                        Record.srv_rerror_rate = float.Parse(fields[27]);
                        Record.same_srv_rate = float.Parse(fields[28]);
                        Record.diff_srv_rate = float.Parse(fields[29]);
                        Record.srv_diff_host_rate = float.Parse(fields[30]);
                        Record.dst_host_count = float.Parse(fields[31]);
                        Record.dst_host_srv_count = float.Parse(fields[32]);
                        Record.dst_host_same_srv_rate = float.Parse(fields[33]);
                        Record.dst_host_diff_srv_rate = float.Parse(fields[34]);
                        Record.dst_host_same_src_port_rate = float.Parse(fields[35]);
                        Record.dst_host_srv_diff_host_rate = float.Parse(fields[36]);
                        Record.dst_host_serror_rate = float.Parse(fields[37]);
                        Record.dst_host_srv_serror_rate = float.Parse(fields[38]);
                        Record.dst_host_rerror_rate = float.Parse(fields[39]);
                        Record.dst_host_srv_rerror_rate = float.Parse(fields[40]);

                        Record.label = fields[41];
                        Record.isKnownAttackType = true;

                        switch (Record.label)
                        {
                            case "buffer_overflow":
                                Record.labelInt = 1;
                                break;
                            case "ftp_write":
                                Record.labelInt = 2;
                                break;
                            case "guess_passwd":
                                Record.labelInt = 3;
                                break;
                            case "imap":
                                Record.labelInt = 4;
                                break;
                            case "ipsweep":
                                Record.labelInt = 5;
                                break;
                            case "land":
                                Record.labelInt = 6;
                                break;
                            case "loadmodule":
                                Record.labelInt = 7;
                                break;
                            case "multihop":
                                Record.labelInt = 8;
                                break;
                            case "neptune":
                                Record.labelInt = 9;
                                break;
                            case "nmap":
                                Record.labelInt = 10;
                                break;
                            case "normal":
                                Record.labelInt = 11;
                                break;
                            case "perl":
                                Record.labelInt = 12;
                                break;
                            case "phf":
                                Record.labelInt = 13;
                                break;
                            case "pod":
                                Record.labelInt = 14;
                                break;
                            case "portsweep":
                                Record.labelInt = 15;
                                break;
                            case "rootkit":
                                Record.labelInt = 16;
                                break;
                            case "satan":
                                Record.labelInt = 17;
                                break;
                            case "smurf":
                                Record.labelInt = 18;
                                break;
                            case "spy":
                                Record.labelInt = 19;
                                break;
                            case "teardrop":
                                Record.labelInt = 20;
                                break;
                            case "warezclient":
                                Record.labelInt = 21;
                                break;
                            case "warezmaster":
                                Record.labelInt = 22;
                                break;
                            case "back":
                                Record.labelInt = 23;
                                break;
                            default:
                                //Console.WriteLine("ERROR: Invalid Label Type");
                                Record.isKnownAttackType = false;
                                break;
                        }

                        if(Record.label == "normal")
                        {
                            Record.labelInt = 0;
                        }
                        else
                        {
                            Record.labelInt = 1;
                        }
                    }
                    else
                    {


                    }

                    sourceIndex++;

                    // Getting here means we have a good record. Break the loop.
                    foundRecord = true;
                    break;
                }
                catch (Exception e)
                {
                    Console.WriteLine("ERROR: NextRecord failed for line: " + sourceIndex + " with exception: " + e.Message + " Stack: " + e.StackTrace);
                    sourceIndex++;
                }
            }
            return foundRecord;
        }

        public void Reset()
        {
            fileReader = new System.IO.StreamReader(sourceReport);
            // Burn column headers
            string line = fileReader.ReadLine();
            string[] fields = line.Split(',');
            if (fields.Count() != COLUMN_COUNT && fields.Count() != COLUMN_COUNT_TEST)
            {
                throw new Exception(string.Format("sourceReportParser column count [{0}] != expected COLUMN_COUNT [{1}]", fields.Count(), COLUMN_COUNT));
            }
            sourceIndex = 0;
        }