It’s an accepted fact that data is valuable. It can fuel revenue-driving activities from sales to product innovation, drive better decision making and so much more.
However, not all data has great value. Much of it may have little or none – particularly if it’s redundant, obsolete and trivial (ROT) data. If you’re holding on to it, it could be costing your organization – and not just in direct financial costs.
That’s the focus of this blog, the second of a three-part series on managing ROT data. In Part 1, we discussed how ROT data accumulates. In Part 3, we’ll cover the benefits of removing ROT data – and a recommendation for how to do it.
ROT Storage Costs are a Waste
Given the potential value of data, it’s not surprising that companies tend to hold on to as much of it as they can. The problem is that many don’t assess their data to ensure it’s something they should keep. The more data a company holds on to, the more storage it needs. Often, companies don’t even know how quickly they’re accumulating data – both good and bad. Every time data is backed up – a critical action for business continuity – the total amount of data increases.
Granted, data storage is considered to be more affordable than ever. Still, it’s not free. You may be paying very little per gigabyte, but the costs can add up quickly – particularly as data volumes increase. Things like capacity stored, retrieval frequency, networking and egress fees, storage tiers and the actions taken on data can all come into play, and can add cost creep into the mix as well.
As various studies have noted, anywhere from 25 to 80 percent of unstructured data – the kind that makes up the majority of all enterprise data – is redundant, obsolete and/or trivial. It’s estimated that ROT data can cost companies thousands of dollars in monthly storage fees, and it’s easy to see why. If even 25 percent of your data is useless, you’re wasting space and money on that data.
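To make that arithmetic concrete, here is a minimal Python sketch of a back-of-the-envelope estimate. The function name, the data volume, the per-gigabyte price and the number of backup copies are all hypothetical assumptions chosen for illustration, not figures from any study or vendor. It also ignores retrieval, egress and tiering charges, which would push a real bill higher.

```python
def rot_storage_cost(total_gb, rot_fraction, price_per_gb_month, backup_copies=0):
    """Rough monthly cost of storing ROT data (hypothetical helper, storage fees only)."""
    if not 0.0 <= rot_fraction <= 1.0:
        raise ValueError("rot_fraction must be between 0 and 1")
    # Portion of the total data set that is redundant, obsolete or trivial
    rot_gb = total_gb * rot_fraction
    # Every backup copy stores the same ROT data again
    stored_gb = rot_gb * (1 + backup_copies)
    return stored_gb * price_per_gb_month


if __name__ == "__main__":
    # Assumed inputs: 500 TB of data, 25% ROT, $0.02 per GB per month, 2 backup copies
    monthly = rot_storage_cost(500_000, 0.25, 0.02, backup_copies=2)
    print(f"Estimated monthly spend on ROT data: ${monthly:,.2f}")
```

Under these assumed inputs, the estimate comes to roughly $7,500 a month, spent entirely on data that has no value. Swapping in your own volumes and rates gives a quick first sense of the scale of the problem.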
The cost of keeping ROT data isn’t limited to storage. Data has to be backed up and protected, so there are costs associated with those activities – even when they involve data that has no value. That includes costs for any software, services, equipment or technologies required, as well as labor.
Storing large volumes of useless data also makes it harder for employees to locate the data they need to do their jobs. That affects productivity, and can lead to employee frustration and job dissatisfaction as well. An often-cited IDC brief found that data professionals were losing 50% of their time every week – 30% searching for, governing and preparing data plus 20% duplicating work. Wading through large volumes of ROT data doesn’t help.
In addition, it’s not just data professionals who are affected. Any employee searching for specific information amidst large volumes of ROT data could be wasting valuable time that could have been spent on more strategic endeavors. They may also be more prone to making a mistake if the data includes redundant, obsolete or trivial information.
Using useless information also makes data analysis difficult and time consuming, and is more likely to generate inaccurate results that could affect business decision making. Spending time on analyzing ROT data is wasteful enough for inside teams. When it’s included in datasets that vendors are contracted to work on – vendors who may charge by data volume or by the time required for the work – the costs can go up significantly.
Data Migration Complications
Whether it’s due to cloud adoption or the need to upgrade IT systems, an increasing number of organizations are finding the need to migrate their data. That takes time – and incurs costs and risks, in terms of planning, setting up the target environments, protecting the data, testing, the actual migration process and any downtime or business disruptions that may be involved.
Including ROT data in the process can waste both time and money at all stages, and open organizations up to greater risks.
ROT data can also hinder data integration and consolidation efforts, both of which are common in mergers and acquisitions. Removing ROT data beforehand can simplify and enhance workflows and processes, and help avoid potential human errors.
Matters of Compliance and Risk
ROT data doesn’t just affect productivity. It can also prevent employees from accessing specific data quickly and efficiently, putting their organizations at risk for compliance issues. For example, many data privacy regulations require that organizations deal with subject access requests (SARs) in a timely manner. Under GDPR, organizations must respond to a SAR within one month of receiving the request. Failure to do so can result in fines or worse.
Not being able to quickly identify specific data that may be subject to compliance regulations also exposes companies to unnecessary, potentially costly non-compliance risks and penalties. For example, organizations may not know if their ROT data contains sensitive information or personally identifiable information (PII) that is subject to compliance requirements and at risk of cyber theft, ransomware or other threats.
ROT data tends to not be accessed for long periods of time. So not only do companies not know about the potential presence of PII and sensitive data, they also don’t know if permissions are outdated, if the data is associated with employees or brands no longer associated with their company, or if data access is based on obsolete file security policies. The result: ROT data is more likely to contain important information that is highly susceptible to data breaches.
Cyberthieves know that companies are likely to be hoarding volumes of data that’s considered ROT. And they know there’s valuable data among that ROT data that they can hijack or corrupt, often while attracting little or no attention.
With so much ROT data, it’s easy to see why some organizations may wish to just avoid the issue. But dealing with it by simply allocating more storage only makes things worse – and not just in terms of increasing costs. Failing to deal with ROT data prevents organizations from knowing what data they have, determining what they can do with it and then actually optimizing their use of it.
An IDC report noted that 45.7% of organizations believe they derive less than half of the potential value from their data due to data management deficiencies. Without understanding their data, companies are likely to miss out on opportunities to leverage new trends, make informed decisions, make optimal use of technologies like machine learning, and more.
Data that isn’t “clean” also makes it difficult to get optimal results from analyses, as well as from the use of technologies such as machine learning.
Get the ROT (Data) Out
Data may be considered the lifeblood of organizations – but only if it’s not clogged by information that’s useless. In the last blog in this three-part series, we’ll discuss how to remove ROT data and some of the additional benefits that will come from doing so.