IT Briefcase Exclusive Interview: Applying Machine Learning (ML) and Artificial Intelligence (AI) to Data Discovery, in order to Meet Business, Legal and Regulatory Goals
March 20, 2018
Enterprise organizations around the world continue to struggle with managing exponentially growing data volumes, while at the same time trying to manage IT infrastructures that are correspondingly growing out of control. Instead of fighting fires, IT’s time and expertise would be better spent leveraging new and innovative technologies and methodologies, such as machine learning, artificial intelligence, and data discovery, that could positively contribute to their organization meeting its business, legal and regulatory goals. Today, we speak with Oksana Sokolovsky, CEO of Io-Tahoe, on this critical topic.
- 1.) Why is the ability to discover data in lakes and relational data instances so important? And, until more recently how were organizations attempting to accomplish this (or weren’t they?)
The growth in data volumes and the variety of data sources means organizations have amassed data footprints which have become difficult to manage. Many companies struggle with a growing number of platforms, databases and data lakes, and with the ability to monitor or track how data moves through the enterprise. With the inability to manage these data volumes, governance processes may be inadequate, and the value of enterprise data may not be fully realized or monetized. Organizations simply aren’t able to cope with such volumes of data, and either exhaust resources on time-consuming and potentially error-prone manual methodologies, or in the worst-case scenario, may unwillingly neglect their data landscape.
However, there has been a momentous leap towards data-driven decision making. The first step in becoming a data-driven enterprise is data discovery. Additionally, there are a number of market drivers which are helping to fuel the importance of data discovery, whether in relational database management systems (RDBMSes), data lakes or other modern repositories. For example, regulatory and audit pressures require that firms have the ability to discover and understand their data; more businesses are starting to view data as an asset with measurable return; and limited resources and availability of in-house experts have reduced the dependence on manual options in favor of automated data discovery platforms.
To achieve this, every enterprise must now manage its data: it needs to discover the data, understand each and every instance, and understand how it flows through the organization.
- 2.) How has machine learning altered the data discovery landscape?
We continue to see the evolution of Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning technologies. ML algorithms have been around for some time, and they have transformed traditional computing. The idea of machine learning is that given enough data with the associated outcomes, together with features relevant to predicting those outcomes, software can be trained to make those associations in future cases.
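As an editorial illustration of that idea, here is a minimal sketch in Python of learning from labeled examples. It is not Io-Tahoe's method: the nearest-centroid technique, the three character-level features, the sample values, and the function names `features`, `train` and `predict` are all assumptions chosen for brevity.

```python
from collections import defaultdict
from math import dist


def features(value: str) -> tuple:
    """Turn a raw value into features: digit share, letter share, '@' flag."""
    n = max(len(value), 1)
    digits = sum(ch.isdigit() for ch in value) / n
    letters = sum(ch.isalpha() for ch in value) / n
    has_at = 1.0 if "@" in value else 0.0
    return (digits, letters, has_at)


def train(examples):
    """Average the feature vectors of each label to get one centroid per label."""
    grouped = defaultdict(list)
    for value, label in examples:
        grouped[label].append(features(value))
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, value: str) -> str:
    """Assign the label whose centroid is closest to the value's features."""
    point = features(value)
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Hypothetical labeled data: values paired with their known outcomes.
examples = [
    ("alice@example.com", "email"),
    ("bob@corp.org", "email"),
    ("555-123-4567", "phone"),
    ("020 7946 0958", "phone"),
    ("Jane Smith", "name"),
    ("Carlos Ruiz", "name"),
]
centroids = train(examples)

print(predict(centroids, "carol@site.net"))  # email
print(predict(centroids, "212-555-0199"))    # phone
print(predict(centroids, "Mary Jones"))      # name
```

The labeled examples play the role of "data with the associated outcomes", and `features` supplies the "features relevant to predicting those outcomes". A production system would use far more data and richer features.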
In the case of data discovery, ML makes it easier to discover data and to extract insights, patterns and relationships that can be used to make decisions. By using machine learning algorithms and heuristics, enterprises can quickly and automatically discover and untangle the complex maze of data relationships, making for a smart data discovery process.
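To make the notion of automatically discovered relationships concrete, the sketch below shows one simple heuristic in Python: flagging a column as a candidate reference to another when nearly all of its distinct values also appear there. This is an assumed, generic value-overlap check, not a description of Io-Tahoe's algorithms; the function name `candidate_relationships`, the 0.9 threshold, and the sample columns are hypothetical.

```python
from itertools import permutations


def candidate_relationships(columns, threshold=0.9):
    """Return (child, parent, containment) for column pairs where at least
    `threshold` of the child's distinct values also occur in the parent."""
    distinct = {name: set(values) for name, values in columns.items()}
    found = []
    for child, parent in permutations(distinct, 2):
        child_values = distinct[child]
        if not child_values:
            continue  # an empty column carries no evidence either way
        containment = len(child_values & distinct[parent]) / len(child_values)
        if containment >= threshold:
            found.append((child, parent, containment))
    return found


# Hypothetical sample data drawn from two tables.
columns = {
    "customers.id": [1, 2, 3, 4, 5],
    "orders.customer_id": [1, 1, 3, 5, 5],
    "orders.amount": [250, 90, 410, 75, 120],
}

relationships = candidate_relationships(columns)
print(relationships)  # [('orders.customer_id', 'customers.id', 1.0)]
```

The check inspects the data values themselves, not the column names. It only proposes candidates: accidental overlaps are possible, so results like these would normally be reviewed or scored further.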
- 3.) Are there particular verticals for which this ability is especially important?
The majority of industries with a significant data footprint have recognized the value of machine learning. Regardless of vertical, any organization with enormous volumes of data will see the value and benefit of machine learning-driven data discovery. In particular, we see use cases in financial services, healthcare, oil and gas, and transportation.
- 4.) The ability to apply machine learning to data discovery must make many lives easier – who in particular within organizations enjoys the greatest benefit?
Machine learning-driven data discovery introduces a level of empowerment to the organization. The automation of tasks that were previously manual and/or dependent on Subject Matter Experts (SMEs) can relieve demands and pressures on human and financial resources. From this perspective, there are enterprise-wide benefits.
However, at a functional level or for specific roles, data owners and data stewards will of course benefit from improved visibility and understanding of the data landscape. Automated discovery will free up resources for other data-related activities such as data governance and data analytics.
- 5.) You have launched the GA of your product, together with a new feature enhancement – could you provide an overview of this news?
Io-Tahoe is unique in that it allows the organization to conduct data discovery across heterogeneous enterprise landscapes, ranging from databases to data warehouses and data lakes, bringing disparate data worlds together into a common view which will lead to a universal metadata store. This enables organizations to have full insight into their data, in order to better achieve their business goals, drive data analytics, enhance data governance and meet regulatory demands in advance of regulations such as GDPR.
- 6.) Anything else? Any parting advice for organizations seeking to improve their data discovery with a solution such as Io-Tahoe?
Anything other than automated data discovery is not practical, so we encourage organizations to seek out automated platforms. Io-Tahoe’s smart data discovery platform features a unique algorithmic approach to auto-discover rich information about data and data relationships. Its machine learning technology looks beyond metadata, at the data itself, for greater insight and visibility into complex data sets across the enterprise. Built to scale for even the largest of enterprises, Io-Tahoe makes data available to everyone in the organization, untangling the complex maze of data relationships and enabling applications such as data science, data analytics, data governance and data management.
The technology-agnostic platform spans silos of data and creates a centralized repository of discovered data, which users can search and govern through Io-Tahoe’s Data Catalog. Through convenient self-service features, users can bolster team engagement through the simplified and accurate sharing of data knowledge, business rules and reports. Here, users have a greater ability to analyze, visualize and leverage business intelligence and other tools, all of which have become the foundation to power data processes.
To learn more, we would encourage your readers to visit: https://io-tahoe.com.