Big Data and Cybersecurity

Big Data and its Role in Cybersecurity

“Big Data” has become a catch-all term for data sets so large and complex that traditional methods of analysis fail and that hold more information than a human analyst could ever process.  The idea of Big Data matters in modern business because businesses are inundated with terabytes of data every single day.  Defining what is and is not Big Data comes down to the “5 Vs”: volume, velocity, variety, veracity, and value.  Put simply, Big Data is highly varied data that arrives in massive volumes and at high speed, and that is of value to a business.  Some examples of Big Data include:

  • Real-time financial data across a broad business portfolio.

  • Web traffic across an enterprise of websites.

  • Transaction histories for online businesses.

  • User directories / customer address books.

While these data contain critical, actionable information, a business must understand how to work with these data to get their full benefit.

Big Data and Cybersecurity

So, where does Big Data fit into modern cybersecurity?  The volume and complexity of cyber attacks a modern business faces are growing by the day.  All the way back in 2016, users of a popular malware detection tool reported 1 billion events from June to November.  Four years later, the number of attacks has only increased.  Business networks and systems face far more attacks each day than human operators could possibly detect, classify, unravel, and defeat.

So far, the jury is out on how Big Data can support cybersecurity.  Some call it the biggest threat to cybersecurity, and some call it the savior of cybersecurity.  Like most new technologies, its value is all in how you use it.  Big Data in cybersecurity takes the form of:

  • Total network traffic.

  • User data, such as sign-in time, sign-out time, and time spent on the network.

  • Details on all cybersecurity events in a day/week/month.

  • Metadata about data on the network, such as size, number of times accessed, etc.

Some of the ways Big Data supports cybersecurity operations include:

  • Using ML to recognize common threats and reduce the time to detect and defeat attacks.

  • Monitoring network traffic and detecting irregularities.

  • Classifying attacks and detecting malware/ransomware attacks.

  • Observing file system data and finding compromised or weak devices on the network.

  • Finding insider threats.
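
To make "monitoring network traffic and detecting irregularities" concrete, here is a minimal sketch in Python.  It assumes traffic has already been summarized as request counts per hour and flags any hour that sits far from a known-normal baseline using a simple z-score.  The function name, the sample numbers, and the threshold of 3 are all illustrative assumptions, and real systems typically use far richer models than this.

```python
from statistics import mean, stdev

def find_irregular_hours(baseline, observed, threshold=3.0):
    """Return (hour_index, count, z_score) for observed counts far from the baseline.

    baseline: request counts from hours assumed to be normal (hypothetical data).
    observed: request counts to check against that baseline.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different count is irregular.
        return [(i, c, float("inf")) for i, c in enumerate(observed) if c != mu]
    flagged = []
    for i, count in enumerate(observed):
        z = (count - mu) / sigma  # distance from normal, in standard deviations
        if abs(z) >= threshold:
            flagged.append((i, count, round(z, 2)))
    return flagged

# Illustrative numbers only: eight "normal" hours, then four hours to check.
baseline_counts = [980, 1010, 995, 1020, 1005, 990, 1000, 1015]
observed_counts = [1002, 998, 4500, 1011]

irregular = find_irregular_hours(baseline_counts, observed_counts)
print(irregular)
```

With these sample numbers, only the hour with 4,500 requests is flagged.  Keeping the baseline separate from the data being checked matters: a large spike included in its own baseline would inflate the standard deviation and could hide itself.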

Big Data Tools

While many tools exist for working with Big Data, two tools tend to dominate the field: Hadoop and Splunk.

Hadoop is an open-source software platform for storing and processing Big Data.  In broad strokes, Hadoop is the “backbone” of Big Data.  It uses a network of distributed hardware to store a business’s data and offers high-level tools for reasoning about this data.  Splunk, on the other hand, is a software platform for analyzing Big Data.  In general, Splunk indexes Big Data and turns it into actionable information.  It gives users a web-based interface for storing, sorting, and searching Big Data, along with analysis tools for extracting information from that data in real time.

Hadoop and Splunk work together in many cases.  Splunk can be used on Big Data stored and sorted using Hadoop.  To be most effective as a Big Data analyst in a cybersecurity career, you are best served by learning the ins and outs of both.
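
To give a feel for the kind of processing a distributed platform performs, the sketch below imitates the map, shuffle, and reduce steps of a MapReduce-style job in plain Python, counting failed sign-ins per source address.  It does not use Hadoop's actual API and runs on a single machine.  The log format (`timestamp source_ip event`), the `FAILED_LOGIN` event name, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(log_line):
    """Emit (source_ip, 1) for each failed sign-in; emit nothing otherwise."""
    parts = log_line.split()
    # Hypothetical log format: "<timestamp> <source_ip> <event>"
    if len(parts) == 3 and parts[2] == "FAILED_LOGIN":
        yield parts[1], 1

def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)

def count_failed_logins(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Illustrative log lines only.
sample_log = [
    "2020-06-01T10:00:01 10.0.0.5 FAILED_LOGIN",
    "2020-06-01T10:00:02 10.0.0.5 FAILED_LOGIN",
    "2020-06-01T10:00:03 10.0.0.9 LOGIN_OK",
    "2020-06-01T10:00:04 10.0.0.9 FAILED_LOGIN",
    "2020-06-01T10:00:05 10.0.0.5 FAILED_LOGIN",
]

failed_by_source = count_failed_logins(sample_log)
print(failed_by_source)
```

In a real cluster, the map and reduce steps would run in parallel across many machines, each working on its own slice of the stored data.  That parallelism is what makes this pattern practical at Big Data volumes.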
