We are living in the era of the IoT (Internet of Things), the cloud, connected devices, and the digitalization of many aspects of life, including mobility, shopping, and even medical care. As a result, the amount of data produced is growing exponentially every day.
What is “Big Data”?
We have all heard the term “big data”, but it is not always clear exactly what it means. Is it just a large amount of data, or does it have other characteristics?
The term big data has been in use since the 1990s, and John Mashey is often credited with popularizing it. It refers to data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. Although the big data philosophy encompasses unstructured, semi-structured, and structured data, the term is mainly used to refer to unstructured data. “Volume”, “Variety”, “Velocity” and many other “V”s are used to characterize big data. This means that the amount (or volume) is not the only characteristic of the data; its variety, the velocity at which it is produced, and so on matter as well.
On one hand, big data has helped many industries build powerful BI and analytics tools and improve their products. On the other hand, safeguarding this constantly and exponentially growing amount of diverse data has become a major challenge for businesses. As a result, the term cybersecurity is now regularly heard together with big data. Safeguarding large amounts of sensitive (including personal) data against cyber-attacks is challenging.
As a quick example, according to Computer World, “a medium–size network with 20,000 devices (laptops, smartphones and servers) will transmit more than 50 TB of data in a 24–hour period. That means that over 5 Gbits must be analyzed every second to detect cyber-attacks, potential threats and malware attributed to malicious hackers.”
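The arithmetic behind the quoted figure can be checked with a short sketch. It assumes the traffic is spread evenly over the day, and it shows both readings of "TB" (10^12 bytes and 2^40 bytes), since the quote does not say which one is meant. The function name `gbits_per_second` is only illustrative.

```python
# Sanity check of the quoted figure: 50 TB per 24 hours as a sustained bit rate.
# Assumption: traffic is spread evenly over the day (real traffic is bursty).
SECONDS_PER_DAY = 24 * 60 * 60


def gbits_per_second(terabytes_per_day: float, bytes_per_tb: int) -> float:
    """Convert a daily volume in terabytes to an average rate in Gbit/s."""
    bits_per_day = terabytes_per_day * bytes_per_tb * 8
    return bits_per_day / SECONDS_PER_DAY / 1e9


decimal_rate = gbits_per_second(50, 10**12)  # 1 TB read as 10^12 bytes
binary_rate = gbits_per_second(50, 2**40)    # 1 TB read as 2^40 bytes

print(f"decimal TB: {decimal_rate:.2f} Gbit/s")
print(f"binary TB:  {binary_rate:.2f} Gbit/s")
```

Under the decimal reading the average comes to roughly 4.6 Gbit/s, and under the binary reading to roughly 5.1 Gbit/s, so "over 5 Gbits" per second is in the right range for "more than 50 TB" per day.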
Big data for cybersecurity
Nowadays, it has become almost impossible to handle cybersecurity with traditional technologies alone, but many new technologies and advances use big data to detect attacks and anomalies. Many businesses have now shifted to using big data for cybersecurity because of the advantages it offers:
- Analyzing a large amount of data in real-time
- Learning from historical data (attacks)
- Training accurate models based on a large amount of data
Big data can define what is “normal”, and any deviation from that normality can raise an alert in a system. Irregularities can be detected in real time using AI and new technologies. For example, if the historical data of a business show that online sales drop by 70-80% between 10 p.m. and 6 a.m., then a sudden increase above the expected value during those hours can alert the system to possible fraud. Or, if the login data of employees of a large corporation across different countries show a certain pattern over time, a deviation from that pattern can alert the system to a possible data breach.
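The idea of learning what is "normal" from historical data and flagging deviations can be illustrated with a minimal sketch. It is not a production detector: the order counts are made up, the helper names `build_baseline` and `is_anomalous` are hypothetical, and the rule used here (a count more than three standard deviations from the mean for that hour) is just one simple choice among many.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Map hour of day -> (mean, standard deviation) of past order counts.

    `history` is a list of (hour, orders) pairs. Each hour needs at least
    two observations, because stdev() is undefined for a single value.
    """
    by_hour = {}
    for hour, orders in history:
        by_hour.setdefault(hour, []).append(orders)
    return {hour: (mean(counts), stdev(counts)) for hour, counts in by_hour.items()}


def is_anomalous(baseline, hour, orders, threshold=3.0):
    """Flag a count more than `threshold` standard deviations from the hourly mean."""
    mu, sigma = baseline[hour]
    if sigma == 0:
        # No variation in the history: any different value is a deviation.
        return orders != mu
    return abs(orders - mu) / sigma > threshold


# Made-up history: quiet nights (3 a.m.) and busy afternoons (3 p.m.).
history = [(3, n) for n in (18, 22, 20, 19, 21)]
history += [(15, n) for n in (95, 105, 100, 98, 102)]
baseline = build_baseline(history)

print(is_anomalous(baseline, 3, 21))   # typical night-time volume
print(is_anomalous(baseline, 3, 90))   # afternoon-level volume at 3 a.m.
print(is_anomalous(baseline, 15, 90))  # the same volume in the afternoon
```

The same count of 90 orders is flagged at 3 a.m. but not at 3 p.m., which is the point of a time-aware baseline: "normal" depends on context, not on a single global limit.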
It is even possible to share data between threat detection systems and new technologies to improve the detection process.
These are mostly examples of reactive processes, but big data makes proactive processes possible as well. Proactive processes detect threats before they actually materialize. Insights from big data analytics tools can be used to detect cybersecurity threats, including malware and ransomware attacks, compromised and weak devices, and malicious insider programs. This is where big data analytics looks most promising for improving cybersecurity.
Cyber-attacks using big data
The main question is whether it is possible to stay completely protected in the big data era.
Most experts accept that “they can’t use the power of big data analytics to its full potential for several reasons, such as the overwhelming volume of data; lack of the right tools, systems, and experts; and obsolete data. Big data doesn’t provide rock-solid security due to poor mining and the absence of experts who know how to use analytics trends to fix gaps.”
While cybersecurity is benefiting from big data, hackers are benefiting from it as well! Hackers (in groups or as individuals) are becoming more skilled as technology improves and the AI field grows. With AI, an attacker can get into a system and penetrate it slowly, training models on the data of that network without suddenly extracting a large, noticeable amount of data. AI even makes it possible to make small changes in big datasets that are not easily detectable but that affect major decisions by CEOs or managers of companies.
Strategies for implementing cybersecurity in the big data era
As businesses and non-profit organizations benefit from the big data era, they should also be aware of cybersecurity threats, because a cyber-attack could affect them too. Individuals, businesses, and non-profit organizations depend much more on digital data nowadays. The harm of a cyber-attack or a data breach can be devastating because so many aspects of our lives and businesses are digitalized.
Businesses should think about securing data using new technologies, such as big data analytics, and implement necessary steps to secure sensitive data:
- Be mindful about storing data
Making mindful decisions about which data to store and how to store it is the first step in guarding data. From a cybersecurity perspective, it is advisable to store only the data that is necessary.
- Automated process for risk detection
Old technologies need to be replaced with new cybersecurity technologies that use big data. Otherwise, it is almost impossible to maintain an effective safeguard against cyber-attacks.
- Using visualization tools
Big data analytics tools make it possible to visualize patterns, which makes it easier to understand and analyze an attack.
- Using AI
Predictive models can detect an attack at its entry point and alert the system to act against it before it is too late. If a business has enough data, it can build and use predictive fraud detection models against cyber-attacks.