Blue Hexagon Blog

Deep Learning and Network Threat Protection FAQ

You’ve Got Questions, We’ve Got Answers

On February 5, Blue Hexagon took the wraps off our venture with a flourish of media attention. It generated some excitement. Artificial intelligence and cybersecurity are hot, after all, so it makes sense that combining the two would be of interest, especially when you have respected customers and partners vouching for you and when you break related malware news on the same day.

But along with the accolades came a lot of questions about deep learning and the Blue Hexagon technology. In fact, just last week Gartner analysts Augusto Barros and Anton Chuvakin raised some very interesting points in a TechTarget/Search Security article on deep learning.

We’re hearing some similar questions from the security community as well, so let’s get these questions answered:

(1) What’s the big deal with deep learning?

The application of artificial intelligence in cybersecurity is not new, so what’s the big deal?

When you look at the evolution of “machine learning” in the graphic below, you can see that the majority of earlier applications (including cybersecurity) were manual in nature. Features had to be defined by experts and fed into the machine learning system. For example, in the case of Kasparov versus Deep Blue, the rules for evaluating chess positions were handcrafted by experts, and the IBM Deep Blue supercomputer crunched through an enormous number of candidate moves before selecting one.

The advent of deep learning changed all this, enabling the system to be much more autonomous. Features no longer have to be defined by hand. As long as the data are properly labeled, the system learns the appropriate features and function from the data itself. Read our recent blog on the difference between machine learning and deep learning.

Figure 1: Evolution towards deep learning
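
To make the contrast concrete, here is a minimal Python sketch of the two kinds of model input. It is an illustration only, not Blue Hexagon's pipeline: the function names, the choice of features, the fixed length of 16, and the padding value 256 are all assumptions made for this example.

```python
import math
from collections import Counter


def handcrafted_features(data: bytes) -> dict:
    """Classic machine learning input: an expert decides up front which
    properties of a file matter (these three are illustrative choices)."""
    total = len(data)
    counts = Counter(data)
    # Shannon entropy of the byte distribution, in bits per byte.
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values()) if total else 0.0
    printable = sum(1 for b in data if 32 <= b < 127)
    return {
        "size": total,
        "byte_entropy": entropy,
        "printable_ratio": printable / total if total else 0.0,
    }


def raw_byte_input(data: bytes, length: int = 16) -> list:
    """Deep learning input: the bytes themselves, truncated or padded to a
    fixed length. The model is left to learn which patterns matter."""
    pad = 256  # hypothetical padding token, outside the 0-255 byte range
    clipped = list(data[:length])
    return clipped + [pad] * (length - len(clipped))


sample = b"MZ\x90\x00hello"  # made-up bytes, not a real file
features = handcrafted_features(sample)
sequence = raw_byte_input(sample)
print(features)
print(sequence)
```

In the first function a human has already decided what the model is allowed to see. In the second, nothing is decided in advance, which is why properly labeled training data carries so much weight.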

The possibilities of deep learning in cybersecurity are enormous. It means the ability to train a system to detect known and unknown threats without having to handcraft what constitutes a “threat.” This makes deep learning highly scalable and, if designed correctly, very accurate.

(2) But doesn’t the accuracy of a system using deep learning depend on the training data?

We’ve heard comments like this before: “Vendors have been accumulating data on malware, nobody has been collecting malicious network traffic at scale.” To be clear, while we are “inspecting” the complete network flow for threats, this does not mean we train on NetFlow. Our deep learning models are trained on the things we are trying to detect: threats in payloads (files) and threats in headers (such as C2 communications and malicious URLs). Yes, our deep learning models do rely on good threat data, which is widely available. See question 3.

(3) Clearly, the training process is a very important one for deep learning systems, but how much data do you really need to be effective?

There is a lot of threat data available, more than what is available for image and speech recognition applications. However, while data quantity is important, what is more important is the quality and diversity of the data, so that our deep learning models can detect all types of threats. With smaller but more diverse data sets, one can achieve high accuracy.

(4) What about detecting threats like “fileless malware”? Can Blue Hexagon detect this?

Great question. The concept of fileless malware applies mainly to endpoints. Instead of relying on malicious executable files written to disk, fileless malware often resides only in memory (RAM). This means it avoids detection by traditional endpoint security.

With Blue Hexagon’s deep learning technology, we detect “fileless” malware because it passes through the network before it is assembled on the endpoint. We can also detect any malicious protocols that may be part of the “fileless malware.”

(5) Deep learning can identify new malware that has common characteristics with what we already know as malware, but can it really identify new threats?

Most attackers will reuse parts of an attack. Therefore, even if the initial vulnerability is new, other parts of the attack kill chain may have common characteristics. This is why Blue Hexagon inspects files, malicious domains, and C2 communications. Even in the unlikely event that we miss the initial exploit, we’ll catch the attack when it tries to download additional malware or receive instructions from the attacker.

Having said that, the Blue Hexagon deep learning models have been effective at detecting novel malware variants of Anatova and TrickBot.
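
The layered argument above can be put into rough numbers. The sketch below is a back-of-the-envelope illustration only: the per-stage detection rates are hypothetical values chosen for the example, not measured Blue Hexagon results, and it assumes the stages are detected independently of one another, which real attacks may not satisfy.

```python
from math import prod


def chance_of_catching(stage_detection_rates):
    """Probability that at least one stage of the attack is detected,
    assuming each stage is detected independently (an assumption)."""
    miss_every_stage = prod(1.0 - p for p in stage_detection_rates)
    return 1.0 - miss_every_stage


# Hypothetical rates: a brand-new exploit is hard to spot, while the
# follow-on stages reuse familiar tooling and are easier to recognize.
hypothetical_rates = {
    "initial exploit": 0.5,
    "malware download": 0.9,
    "C2 communication": 0.9,
}

overall = chance_of_catching(hypothetical_rates.values())
print(f"Chance of catching the attack at some stage: {overall:.3f}")
```

Under these assumed numbers, the attacker has to slip past every stage to succeed, so the overall chance of detection is well above the rate for the new exploit alone.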

(6) Can deep learning inspection really be as fast as simple signature matching?

Simple signature matching is very fast and accurate for known threats. No one will dispute that. But simple signature matching only covers 1 million out of a billion known threats, and none of the unknown threats. The new threat landscape of variants renders signature-based threat detection all but obsolete. Additionally, only a limited number of signatures can be stored on an actual security product, both because of storage constraints and because traffic has to be checked against those signatures at line rate. This limitation applies to both network security and endpoint security products.
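
A small Python sketch shows why variants are such a problem for signatures. It uses the simplest possible form of signature, an exact SHA-256 hash lookup; real products also use pattern-based signatures, and the payload bytes and names here are made up for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical "known threat" and the signature database built from it.
known_sample = b"hypothetical malicious payload v1"
signature_db = {sha256_hex(known_sample)}


def matches_signature(data: bytes) -> bool:
    """Exact-match lookup: very fast, but only for bytes seen before."""
    return sha256_hex(data) in signature_db


# A trivial variant: the same payload with a single byte appended.
variant = known_sample + b"\x00"

print(matches_signature(known_sample))  # True
print(matches_signature(variant))       # False
```

The lookup is a single set membership test, which is why signature matching is so fast. The same property makes it brittle: changing one byte produces a sample the database has never seen.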

Our deep learning-based threat protection platform may not match the microsecond speeds of simple signature matching, but sub-second detection of known and unknown threats is groundbreaking.

(7) Is what Blue Hexagon is doing with deep learning inspection the same as security vendors doing network traffic analytics or UEBA?

Blue Hexagon delivers actual threat inference on network perimeter traffic. Other security vendors “assume breach,” and use machine learning to inspect internal or East/West traffic for anomalies. This means machine learning or analytics on logs, alerts, or “headers” to identify “anomalies”.

However, the problem with the “anomaly detection” approach is that anomalies are common in network traffic, and not all anomalies are a threat indicator. If you’re focused on flagging anomalies, your AI-powered threat detection system is going to become overwhelmed by false positives. That can run your team of security analysts ragged and have a detrimental effect on your security posture.
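
The false-positive problem is easiest to see with simple arithmetic. The figures in this Python sketch are hypothetical and chosen only to illustrate the effect; they are not measurements from any product or network.

```python
def alert_breakdown(anomalies_flagged: int, real_threats_among_them: int):
    """Split flagged anomalies into false alerts and the share of alerts
    that point at a real threat (precision)."""
    false_alerts = anomalies_flagged - real_threats_among_them
    precision = real_threats_among_them / anomalies_flagged
    return false_alerts, precision


# Hypothetical day: 10,000 flows flagged as anomalous, of which only
# 10 are actually part of an attack.
false_alerts, precision = alert_breakdown(10_000, 10)

print(f"False alerts to triage: {false_alerts}")
print(f"Share of alerts that are real threats: {precision:.1%}")
```

With these assumed numbers, analysts would have to work through 9,990 harmless anomalies to find the 10 alerts that matter.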

By ensuring your network perimeter threat detection is effective, you will end up relieving the burden on your UEBA and NTA products because the primary threats would already be blocked at the perimeter.

(8) With enterprise network complexity increasing over time, isn’t training a deep learning system to tell good from bad going to get much harder?

Yes! Deep learning inspection on network traffic is hard. We’re solving hard problems here at Blue Hexagon and will continue to do so as complexity increases. Why? Because we believe there are many benefits to applying deep learning to network threat protection.

Network traffic is immensely rich and interesting. Applying deep learning to network payloads and headers brings complexity because of the variety in the structure, but there are multiple ways to identify malicious intent (files, headers, C2 communications, malicious domains). Architected correctly, resources for advanced deep learning models can be available on network security appliances, and models can be easily updated when needed. Additionally, applying deep learning to network traffic at the perimeter of the enterprise brings the benefit of stopping the threat closest to the source of entry before it has the opportunity to move laterally through the enterprise.

Did we answer your questions?  

Hackers have adopted AI and automated techniques in their relentless campaign against the enterprise. Now the enterprise has the ability to fight fire with fire. Blue Hexagon’s team of deep learning experts recognized that cybersecurity is an ideal application for deep learning, one that can meet challenges that have bedeviled the industry for years. Now, for the first time, by deploying deep learning-based prevention at the network layer, we can identify and block threats in under a second, which means we’ll be ready long before the next attack arrives. That includes malware samples that have never been seen before.

This is new and exciting. We’re breaking ground and challenging assumptions, so if you’ve still got questions about how deep learning works for network threat protection, let us know. We’d be happy to tell you more. Meanwhile, we’ll keep our eyes and ears open for more questions that come up and we will answer them here again soon.