IT Briefcase Interview: Machine Learning – the Key for Securing Big Data?
November 29, 2016
Dale Kim is the Sr. Director of Industry Solutions at MapR
The deployment of big data for fraud detection, in place of security information and event management (SIEM) systems, is increasing. However, big data has expanded the boundaries of existing information security responsibilities. While implementing security measures remains a complex process, the stakes are continually raised as the ways to defeat security controls become more sophisticated. Threat intelligence-related data can be prohibitively massive, in its scale as well as its diverse nature. The data can range from server logs to application logs to network packets. No traditional database can handle the size, speed and kind of data that security teams need to plow through.
In this interview, Dale Kim, senior director of industry solutions at MapR, speaks with IT Briefcase about the critical role machine learning plays in cyber-security today: identifying non-obvious patterns that reveal anomalies indicative of cyber-attacks, and helping security teams widen the scale and accelerate the speed of threat analysis and improve risk assessment.
- Q. Securing data has been a daunting requirement for decades. How has the explosion of big data made this objective harder?
A. As security breaches become more frequent and sophisticated, traditional security solutions are not able to protect company assets. Organizations realize that just putting up walls around data is no longer enough protection. CISOs are trying to prevent incidents from impacting business operations as well as company credibility. It is estimated that 92 percent of security breaches go undetected. What’s needed today is deeper insight into the wide variety of data being generated every day from many more sources, to identify threats by monitoring and analyzing all events across the network in real time. However, this results in the generation of even larger amounts of security-related data that must be stored and analyzed. In addition, increased regulations require storing and archiving security event data for longer time periods.
- Q. What can be done in the big data era to identify threats, reduce risk, address fraud and improve compliance monitoring activities?
A. Information security functions need better analytics to proactively identify threats and reduce risk. Leading analysts estimate that by 2016 nearly 25 percent of global companies will have adopted big data analytics for security use cases, with a positive return on investment within six months. Key benefits of security analytics include reduced likelihood of fines and lawsuits, greater levels of automation to meet compliance and audit mandates, and reduced maintenance overhead for IT. Providing an agile platform is where big data comes in and where data science meets threat intelligence. Threat intelligence-related data can be prohibitively massive, in its scale as well as its diverse nature. The data can range from event logs, systems logs, and application logs to network packet information. No traditional database can handle the size, speed and kind of data that security teams need to plow through.
- Q. How does machine learning play a role in cybersecurity solutions?
A. From collection and correlation to visualization to machine learning solutions, products have emerged that can sift through data and get a better signal from the noise. For example, let’s say multiple threat intelligence feeds have the same indicator of compromise but are slightly different, making it hard for humans to see the similarities. A big data engine can find these anomalies much more quickly. Using machine learning, security systems can leverage advanced analytics techniques to identify non-obvious patterns that are indicative of cyber-attacks. In addition, an adaptable analytics architecture is necessary to continue exploring new algorithms. Microservices architectures have emerged as an ideal paradigm for evolving large-scale analytics on rapidly changing data. We can’t forget how important automation is: with the many possible attacks, speed is essential, so the initial threat detection process should be mostly automated to remove the latency of human intervention. In many cases, threats can even be prevented with automated processes.
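To make the threat-feed example concrete, here is a minimal Python sketch of how the same indicator of compromise, written slightly differently in several feeds, might be reduced to one canonical form and correlated. It is an illustration only, not a description of any MapR product: the feed names, the sample indicators, and the `normalize_ioc` and `correlate` helpers are hypothetical, and the normalization rules cover only simple domain and URL variations.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def normalize_ioc(raw: str) -> str:
    """Reduce a domain- or URL-style indicator to a canonical form.

    The rules here are illustrative assumptions, not a standard.
    """
    s = raw.strip().lower()
    # Undo common "defanging" conventions used in threat reports.
    s = s.replace("[.]", ".").replace("(.)", ".").replace("hxxp", "http")
    # Drop the URL scheme, then any path, then any port.
    s = re.sub(r"^[a-z][a-z0-9+.-]*://", "", s)
    s = s.split("/", 1)[0]
    s = s.split(":", 1)[0]
    # Treat "www.example.com" and "example.com." as "example.com".
    if s.startswith("www."):
        s = s[4:]
    return s.rstrip(".")


def correlate(feeds: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each normalized indicator to the set of feeds that reported it."""
    seen = defaultdict(set)
    for feed_name, indicators in feeds.items():
        for raw in indicators:
            seen[normalize_ioc(raw)].add(feed_name)
    return dict(seen)


# Hypothetical feeds reporting the same domain in three different spellings.
feeds = {
    "feed_a": ["hxxp://Evil-Example[.]com/login", "10.0.0.5"],
    "feed_b": ["www.evil-example.com", "benign.example.org"],
    "feed_c": ["evil-example.com."],
}

# Keep only indicators that more than one feed agrees on.
shared = {ioc: names for ioc, names in correlate(feeds).items() if len(names) > 1}
print(shared)
```

Run at scale across many feeds, this kind of normalization lets exact-match grouping catch variants that a human reader would likely miss. A production system would need far more rules (IPv6 addresses, file hashes, internationalized domains) and could layer fuzzy matching or learned models on top.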
- Q. What does MapR provide to security teams in this era of data-driven security?
A. A highly efficient system is necessary to ensure costs don’t get out of control. If cyber security requires significant resources, then organizations are less likely to put the necessary controls in place. A system must be able to easily scale in a cost-effective manner, and quickly run a broad range of large-scale analytics to return results immediately.
Of course, cyber security involves not only stopping threats, but also putting controls in place to protect your data. To make securing big data easier and more reliable, MapR builds powerful and flexible authentication and authorization controls into the MapR Converged Data Platform to secure your business data. The built-in auditing capabilities let you analyze data accesses within the platform to help determine if any anomalous activities represent actions of a bad actor within your firewall. Delivering out-of-the-box security for big data lets customers more easily comply with security needs and avoid complicated coding or integration tasks. Our customers turn to us to help with their security needs, and we are committed to reducing the risk and the complexity of securing big data.
Author Bio
Dale Kim is the Sr. Director of Industry Solutions at MapR. His background includes a variety of technical and management roles at information technology companies. While his experience includes work with relational databases, much of his career pertains to non-relational data in the areas of search, content management, and NoSQL, and includes senior roles in technical marketing, sales engineering, and support engineering. Dale holds an MBA from Santa Clara University, and a BA in Computer Science from the University of California, Berkeley.