Stephen Scharf, DTCC Chief Security Officer
Artificial intelligence (AI) capabilities for defending against cyberattacks on data and banking systems have been steadily developing over the last 10 to 15 years. More appropriately described as “machine learning,” this technology is now becoming potentially effective for cybersecurity.
Historical approaches to cyber defenses provide value, but can only accomplish so much. It’s time to focus on building machine learning capabilities by choosing key criteria, studying patterns and making correlations between cybersecurity events.
Big Data and Predictive Analysis
Big data is a cornerstone of a robust and dynamic cyber defense program, built by feeding large data elements into machine learning processes to find correlations between cyber events. Connecting similar events, which look innocuous when seen individually in separate platforms, can put together a holistic picture of threatening cyber activity.
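As a rough illustration of connecting events across platforms, the sketch below groups events by user and keeps only users whose activity shows up in more than one source. The event fields (`user`, `source`), the source names and the `min_sources` cutoff are all hypothetical, chosen for the example rather than taken from any particular product.

```python
from collections import defaultdict

def correlate_by_user(events, min_sources=2):
    """Return users whose events span at least `min_sources` distinct sources.

    `events` is a list of dicts with hypothetical keys "user" and "source".
    """
    sources_by_user = defaultdict(set)
    for event in events:
        sources_by_user[event["user"]].add(event["source"])
    # Keep only users seen on several platforms; sort sources for stable output.
    return {
        user: sorted(sources)
        for user, sources in sources_by_user.items()
        if len(sources) >= min_sources
    }

# Made-up sample events from three separate platforms.
events = [
    {"user": "alice", "source": "vpn"},
    {"user": "alice", "source": "file_server"},
    {"user": "alice", "source": "email_gateway"},
    {"user": "bob", "source": "vpn"},
]
correlated = correlate_by_user(events)
```

Each of alice's events might look innocuous on its own platform; only the combined view shows activity spread across three systems.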
These correlations, when found, describe events that have already occurred. To improve on existing cyber defense systems, we need to be able to predict cyber activity before it occurs. That is where predictive analysis enters the picture.
Machine learning can leverage cyber event patterns to provide predictive analysis. There are several key threat indicators one can program machine learning systems to monitor. If enough of these indicators show suspicious activity at the same time, then the system defending against a cyberattack can alert security professionals. Some of these indicators can also help detect inappropriate activity by a firm’s own staff.
Cyber Threat Indicators
1) Times when systems are being accessed vary from normal patterns.
2) The amount of data being downloaded or transferred is rising.
3) The number of data systems being accessed is rising.
4) The frequency at which systems are being accessed is abnormal.
5) The amount of data someone may be exporting outside the firm is unusual.
6) The types of external websites users are accessing are unusual.
7) Users are trying to access systems they have not accessed before.
These are just a few examples. A cyber defense system using machine learning may be programmed to issue a warning or halt activity if, for example, seven of 10 criteria are met, or 10 of 15.
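The "seven of ten" rule can be sketched as a simple count of triggered indicators compared against a threshold. This is a minimal illustration, not a description of any real product: the indicator names and the function names are invented for the example, and a production system would derive each true/false value from learned baselines rather than hard-code it.

```python
def count_triggered(indicators):
    """Count how many indicators are currently flagged as suspicious."""
    return sum(1 for flagged in indicators.values() if flagged)

def should_alert(indicators, min_triggered):
    """Return True when at least `min_triggered` indicators fire together."""
    return count_triggered(indicators) >= min_triggered

# Hypothetical snapshot of ten indicators for one user at one moment.
snapshot = {
    "unusual_access_time": True,
    "download_volume_rising": True,
    "systems_accessed_rising": True,
    "abnormal_access_frequency": True,
    "unusual_external_export": True,
    "unusual_websites": True,
    "new_systems_accessed": True,
    "indicator_8": False,
    "indicator_9": False,
    "indicator_10": False,
}

alert = should_alert(snapshot, min_triggered=7)
```

Here seven of the ten indicators are flagged, so the check meets a threshold of seven; raising the threshold to eight would leave the same snapshot below it.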
Passive vs. Active Systems
Once a designated number of indicators are detected and connected, cybersecurity systems can work on either a passive or active basis. Passive security tools monitor activity and issue alerts, but fall short of acting to stop the activity. Active security tools are programmed to act when they detect bad activity and block a user’s access or stop a connection in the system.
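The difference between the two modes can be sketched in a few lines. The mode names, the `respond` function and the action strings below are illustrative assumptions; a real tool would call out to alerting and access-control systems instead of returning text.

```python
from enum import Enum

class Mode(Enum):
    PASSIVE = "passive"  # observe and report only
    ACTIVE = "active"    # observe, report, then act

def respond(mode, user, triggered, threshold):
    """Return the list of actions taken for a user's current indicator count."""
    actions = []
    if triggered < threshold:
        return actions  # not enough indicators: do nothing
    # Both modes raise an alert once the threshold is reached.
    actions.append(f"alert: suspicious activity by {user}")
    # Only an active tool goes on to stop the activity.
    if mode is Mode.ACTIVE:
        actions.append(f"block: access suspended for {user}")
    return actions

passive_actions = respond(Mode.PASSIVE, "user42", triggered=7, threshold=7)
active_actions = respond(Mode.ACTIVE, "user42", triggered=7, threshold=7)
```

With the same inputs, the passive call yields only the alert, while the active call yields the alert followed by the block.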
Choosing whether to use passive or active tools depends on a firm’s tolerances or preferences for false positives and false negatives. Most professionals prefer passive cybersecurity tools because they are nervous about an active system issuing a false positive result and blocking activity that is legitimate and should be completed. But, as active security tools increase in capability and sophistication, they should issue fewer false positives. Once there are fewer false positives, firms can be more confident about shifting from passive, “observe and report” tools to tools that observe, then act on those observations.
In most cases, false negatives are worse than false positives, because a false negative means you have a tool that you think is working, but bad actions are occurring and your tool is not detecting or stopping them. An active system that is accurately programmed and tuned to be stricter about detecting, connecting and acting on indicators is less likely to produce false negatives or miss bad activity.
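The trade-off between the two error types can be made concrete by counting them at different alert thresholds. The sketch below uses a small, made-up set of labeled events, each pairing the number of triggered indicators with whether the activity was actually malicious. The data and the function name are hypothetical and only illustrate the direction of the effect.

```python
def error_counts(events, threshold):
    """Return (false_positives, false_negatives) for a given alert threshold.

    `events` is a list of (triggered_indicator_count, is_malicious) pairs.
    An event is flagged when its count is at or above the threshold.
    """
    false_positives = sum(
        1 for count, malicious in events if count >= threshold and not malicious
    )
    false_negatives = sum(
        1 for count, malicious in events if count < threshold and malicious
    )
    return false_positives, false_negatives

# Made-up labeled history: three legitimate events, three malicious ones.
history = [(2, False), (5, False), (8, False), (6, True), (9, True), (7, True)]

strict = error_counts(history, threshold=5)   # flags more activity
middle = error_counts(history, threshold=7)
lenient = error_counts(history, threshold=9)  # flags less activity
```

On this sample, the strictest threshold misses no malicious events but flags two legitimate ones, while the most lenient threshold flags nothing legitimate but misses two of the three malicious events.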
Machine Learning: A Look Ahead
What’s the next frontier for using machine learning to protect financial data and banking systems? Cloud computing, once seen as too risky for the financial industry, is now considered a resource that gives machine learning greater capability to leverage vast amounts of data to yield insights on potential cyber intrusions. Aside from greater capability, using the cloud can be far less costly than storing huge amounts of data on site and using large amounts of computing resources to work across all that data.
We may also see improvements in tools that understand data context. Not all data is equal. Having systems that can correctly profile data elements and apply dynamic rules based on those profiles will help to reduce cyber exposure. For example, a user connecting to four new systems may not be a large concern. But if those four new systems were all considered highly sensitive, then multiple new access attempts would raise higher suspicion than access attempts at multiple non-sensitive systems.
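One way to picture context-aware rules is to weight each new access attempt by the sensitivity of the system involved. In the sketch below, the weights, the system names and the choice to treat unprofiled systems as highly sensitive are all assumptions made for illustration.

```python
# Hypothetical weights: a highly sensitive system counts five times as much.
SENSITIVITY_WEIGHT = {"low": 1, "high": 5}

def new_access_score(new_systems, profiles):
    """Sum sensitivity weights over the systems a user accessed for the first time.

    `profiles` maps system name to "low" or "high". Systems without a profile
    are treated as "high" here, which is a cautious assumption of this sketch.
    """
    return sum(
        SENSITIVITY_WEIGHT[profiles.get(system, "high")] for system in new_systems
    )

profiles = {
    "wiki": "low",
    "cafeteria_menu": "low",
    "room_booking": "low",
    "travel_portal": "low",
    "payments_core": "high",
    "settlement_db": "high",
    "client_records": "high",
    "key_vault": "high",
}

routine = new_access_score(
    ["wiki", "cafeteria_menu", "room_booking", "travel_portal"], profiles
)
sensitive = new_access_score(
    ["payments_core", "settlement_db", "client_records", "key_vault"], profiles
)
```

Both users touched four new systems, but the second scores five times higher, so a rule keyed to the score rather than the raw count would flag only the second.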
While there is great potential for AI to improve cyber defenses, there is more work to do to learn how we, as an industry, can use this technology going forward in the context of cybersecurity, both on the offensive and defensive side.