Evaluating artificial intelligence and machine learning-based systems for cyber security

All indicators suggest that 2017 is shaping up to be the year of artificial intelligence and machine learning technology for cyber security. As with most trends in our industry, the available protection solutions range from elegantly designed platforms to clumsily arranged offerings. The big problem is that many enterprise security teams cannot tell the difference.

I’ve spent the past few months digging in with a variety of vendors providing products and services in this important area. I’ve also been actively engaged with CISO teams of all shapes and sizes, trying to learn about their experiences using AI and ML to deal with hackers and malware.

The good news is that a few intensely practical suggestions have emerged from these investigations, and I feel compelled to share them with you here. These suggestions are intended for cyber security professionals who are actively engaged in the evaluation process for AI and ML offerings, presumably to deal with advanced threats. I hope my suggestions are useful to you.

First and foremost, you must demand a clear explanation of the underlying mathematics that drive your potential vendor’s offering. Whenever I sit down with an AI or ML vendor, the first thing I ask to see is the basis for their algorithms, and it is astounding how widely the responses vary. Of course, you must be careful to ensure that you are not evaluating the description from a weak salesperson, so demand to speak with a technical expert.

What I suggest you do is this: For each mathematical heuristic you are shown, try to write down a simple prose description to highlight your understanding. For example, if your vendor brags about conditional predictive probabilistic algorithms based on volumetric attacks and customer usage patterns, then jot down that “the vendor checks to see if DDoS will occur when your website is busy.” This can be a sobering, but enlightening exercise. Practice and you’ll get good.
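To make the prose translation concrete, here is a minimal sketch of what that DDoS example might reduce to: a frequency estimate of how often an attack occurs given that the site is busy. The `HourlyObservation` record, its fields, and the sample data are all hypothetical, invented for illustration; a real vendor's algorithm will almost certainly be more elaborate, which is exactly why you should ask them to explain it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HourlyObservation:
    busy: bool  # hypothetical flag: site traffic exceeded some chosen threshold
    ddos: bool  # hypothetical flag: a volumetric attack was seen in this hour


def ddos_given_busy(observations: List[HourlyObservation]) -> Optional[float]:
    """Estimate P(ddos | busy) as a plain frequency ratio.

    Returns None when there are no busy hours to condition on.
    """
    busy_hours = [o for o in observations if o.busy]
    if not busy_hours:
        return None
    return sum(o.ddos for o in busy_hours) / len(busy_hours)


# Made-up sample data, purely for illustration.
observations = [
    HourlyObservation(busy=True, ddos=True),
    HourlyObservation(busy=True, ddos=False),
    HourlyObservation(busy=False, ddos=False),
    HourlyObservation(busy=True, ddos=True),
    HourlyObservation(busy=False, ddos=True),
    HourlyObservation(busy=True, ddos=False),
]

print(ddos_given_busy(observations))
```

If the vendor's explanation cannot be boiled down to something you could sketch at roughly this level, that is worth noting in your evaluation.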

Second, you must carefully observe the degree of human assistance that might be occurring during any execution analysis – and this includes initial installation periods. If, for example, your vendor requests VPN access to your proof-of-concept AI and ML deployment, then make sure that they are not using this access to enhance the accuracy or completeness of their algorithms. You are buying automation, not hand-holding.

What I suggest you do is this: Upon installation of your new AI or ML tool, perhaps as part of your evaluation, selection, and procurement cycle, demand that the system operate without any access or intervention from the vendor. If vendor access cannot be avoided, then at least demand full visibility into what they are doing. We all watched The Wizard of Oz as kids, and we all know what can happen when a wizard waves his arms behind a curtain.

Third, I believe you must require that your AI and ML solution demonstrate rapid results. All too often, there is the suggestion from vendors, and an attendant belief by security teams, that the tool must have time to observe, and learn, and absorb, and soak, and so on. I simply do not understand this claim. With proper installation, tuning, administration, and configuration, the tool you select should demonstrate value quickly. That is the promise of automation.

What I suggest you do is this: During any proof of concept or execution evaluation, begin by creating a clear definition of what it is that the tool seeks to accomplish – in most cases, detection of some advanced threat condition. Then do what every scientist does: Grab your lab notebook and accurately record on a timeline when such defined accomplishments occur. If the automation works properly, you should see results begin early and continue throughout the evaluation, not only after a long silent stretch.
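The lab notebook can be as simple as a list of timestamps. The sketch below assumes you record the moment each defined accomplishment (say, a confirmed detection) occurs, then summarizes how long the first result took and how results spread across the evaluation days. The function names, the start date, and the sample detections are hypothetical, chosen only to illustrate the bookkeeping.

```python
from datetime import datetime, timedelta
from typing import List


def hours_to_detection(start: datetime, detections: List[datetime]) -> List[float]:
    """Elapsed hours from evaluation start to each recorded detection, sorted."""
    return sorted((t - start).total_seconds() / 3600 for t in detections)


def detections_per_day(start: datetime, detections: List[datetime], days: int) -> List[int]:
    """Count recorded detections in each 24-hour window of the evaluation."""
    counts = [0] * days
    for t in detections:
        day = (t - start).days  # whole 24-hour periods since start
        if 0 <= day < days:
            counts[day] += 1
    return counts


# Hypothetical evaluation start and made-up detection times.
start = datetime(2017, 1, 9, 9, 0)
detections = [start + timedelta(hours=h) for h in (3, 20, 26, 50, 51)]

elapsed = hours_to_detection(start, detections)
print("first detection after", elapsed[0], "hours")
print("detections per day:", detections_per_day(start, detections, days=3))
```

A per-day count of zero for the first several days, followed by a sudden burst, is the kind of pattern worth questioning the vendor about.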

Like many of you, I am super-excited at the prospect of employing advanced algorithmic analysis to stop hackers. This is a natural application of our best thinking in computing, a direct descendant of Knuth trying to search and sort lists. But also like many of you, I am fearful of being tricked by the syntactic sugar of marketers, and by promises that are based more on aspiration than real computing.

Please give my three suggestions above a try – and then share your experiences publicly. If my earlier prediction about 2017 being the year of AI and ML is correct, then we will need your help.