How can you make sure you are leveraging all the potential insights of your company’s data, whether for regulatory compliance or legal discovery? The first step is knowing what you have.
Here are two use cases that stem from a basic lack of awareness or understanding about the type of data an organization keeps, and how it might interact with that data for storage or litigation purposes.
A large bank was facing hefty fines from the SEC, as well as damage to its reputation, for non-compliance with securities regulations. The bank managed multiple data centers holding billions of files of many different types, totaling 56 petabytes (PB) of data, 90% of which was unstructured.
One issue was data growth. While the average data growth across all organizations is 20% per year, this bank’s was higher because they had a policy of never deleting anything. All files created in the system stayed in the system forever.
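To put that growth in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes simple compound growth with nothing ever deleted, and it uses the 20% average as a floor, since the bank’s actual rate is only described as higher. The function names are illustrative.

```python
import math


def projected_size_pb(current_pb, annual_growth, years):
    """Projected data volume after `years` of compound growth with no deletion."""
    return current_pb * (1 + annual_growth) ** years


def years_to_double(annual_growth):
    """Years until the data volume doubles at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # 56 PB today, growing at the 20% industry average (assumed lower bound).
    for year in range(0, 6):
        print(f"Year {year}: {projected_size_pb(56, 0.20, year):.1f} PB")
    print(f"Doubling time: {years_to_double(0.20):.1f} years")
```

Even at the average rate, a keep-everything policy roughly doubles the storage footprint in under four years.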
To rectify the situation, the company had to locate billions of files, evaluate each one based on its importance to the company, and then either secure and store it according to its specific value or remove it from the system. In the process, the organization would rein in its infrastructure sprawl and spending. To give you an idea of how complicated this project was, it would take a single human worker more than three thousand years to finish it.
This type of project becomes manageable with Aparavi. The Aparavi Platform offers search and intelligent classification of files, automation for unstructured content, and more than 140 preset classification policies organized by regulation. By automatically classifying your data, no matter where it lives, and triggering action, it removes human error and drastically reduces staffing concerns.
In the legal discovery phase of litigation, plaintiffs and defendants exchange the information that is “responsive” (or relevant) to the lawsuit. The exchange of electronic files is commonly known as eDiscovery, and often involves testimony from forensic specialists to confirm that nothing relevant to the case was inappropriately accessed or deleted. Any time PII or privileged information appears in a document, it must be redacted (meaning deleted or otherwise censored from the document) before the document is given to the opposing side.
Many organizations facing litigation decide to hand over to legal counsel all the data that might possibly be relevant, and allow their attorneys to winnow the files down to those that are relevant to the case before producing them to the other side. Those attorneys often work with legal services companies that specialize in eDiscovery and managed review. However, this process requires those attorneys to read through all of the documents to find the most relevant ones. Even with eDiscovery software, the data must first be processed by the software company and hosted on separate servers, which can quickly become incredibly costly.
The average cost to review a single document is between $8 and $12. A typical enterprise lawsuit involves 100,000 files. Do the math: the cost just to collect evidence – only the first phase of the lawsuit – is between $800,000 and $1.2 million. Ultimately, the files that are not needed for the lawsuit can be culled (or deleted), but not without first going through the review process.
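The arithmetic above is easy to rerun for your own matter. The short Python sketch below reproduces the figures from this article; the 20,000-document comparison is a purely hypothetical example of a smaller, pre-curated set, not a measured result.

```python
def review_cost_range(num_documents, low_per_doc=8, high_per_doc=12):
    """Return (low, high) total review cost in dollars for a document set."""
    return num_documents * low_per_doc, num_documents * high_per_doc


if __name__ == "__main__":
    # Figures from the article: 100,000 files at $8 to $12 per document.
    low, high = review_cost_range(100_000)
    print(f"Full set: ${low:,} to ${high:,}")

    # Hypothetical: the same matter if only 20,000 documents reach review.
    low, high = review_cost_range(20_000)
    print(f"Curated set: ${low:,} to ${high:,}")
```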
As you might expect, finding and reviewing all those files to produce to opposing counsel is the most laborious part of the process. It accounts for about half the time and money companies spend on eDiscovery. And attorneys know there’s probably only one file, or a handful of files, that will be the “smoking gun” that makes or breaks the case, so finding that needle in the haystack is imperative.
Imagine if your organization could deliver to legal counsel only the documents it knows are relevant, before the attorneys even review the files. The Aparavi Platform allows your in-house team to carefully curate documents before sending them to your outside counsel for review, eliminating potentially thousands or millions of dollars in processing, hosting, and review fees. In addition, you can preemptively withhold or redact privileged information without exposing the data to the additional security risks inherent in transferring it, because those documents can remain searchable within your existing storage solutions.
The Aparavi Platform can also be used to understand the strength of your case prior to litigation. If you find the smoking gun in your data before the discovery phase even starts, you may decide to settle the lawsuit before spending massive amounts of time, internal resources and money on litigation. Perhaps you are conducting an internal investigation – the ability to access and search all of your data can allow your organization to quickly conclude the investigation at a low cost.
Both of these examples require accessing, searching, and classifying data, whether for regulatory compliance or legal purposes, to generate meaningful insights.
Aparavi provides the tools for categorizing metadata and creating an index of all files. With access to an index, you will finally be able to create meaningful insights by scanning, identifying, classifying, tagging, filtering, and reporting your data, and ultimately creating rules for how to treat certain classifications of files so these practices can be carried out automatically.
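To make the idea of an index plus automated rules concrete, here is a minimal Python sketch. It is not Aparavi’s API or implementation: the `FileRecord` type, the tag names, the one-year “cold” threshold, and the single Social Security number pattern are all assumptions chosen for illustration, and a real system would handle far more file types and policies.

```python
import os
import re
import time
from dataclasses import dataclass, field

# Hypothetical rule: text that looks like a US Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical mapping from a classification tag to an automated action.
ACTIONS = {"pii": "restrict access", "cold": "move to archive storage"}


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    modified: float  # last-modified time, seconds since the epoch
    tags: set = field(default_factory=set)


def build_index(root):
    """Walk a directory tree and record basic metadata for every file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            records.append(FileRecord(path, info.st_size, info.st_mtime))
    return records


def classify(record, now=None, cold_after_days=365):
    """Tag a record as 'cold' and/or 'pii' using the simple rules above."""
    now = time.time() if now is None else now
    if now - record.modified > cold_after_days * 86400:
        record.tags.add("cold")
    try:
        # Reads the whole file as text; fine for a sketch, not for huge files.
        with open(record.path, "r", encoding="utf-8", errors="ignore") as fh:
            if SSN_PATTERN.search(fh.read()):
                record.tags.add("pii")
    except OSError:
        record.tags.add("unreadable")
    return record


def plan_actions(records):
    """Return (path, action) pairs for every tag that has a rule attached."""
    return [
        (rec.path, ACTIONS[tag])
        for rec in records
        for tag in sorted(rec.tags)
        if tag in ACTIONS
    ]


if __name__ == "__main__":
    index = [classify(rec) for rec in build_index(".")]
    for path, action in plan_actions(index):
        print(f"{action}: {path}")
```

The point of the design is the separation of steps: the index is built once from metadata, classification adds tags, and rules turn tags into actions, so new policies can be added without rescanning by hand.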
With these insights, you’ll be able to discern smoking guns from red herrings. You’ll know what data has to be protected from cyber-related threats according to the SEC. You’ll know which data has no value to the organization and can be deleted, and which cold files can be moved from primary storage to a cheaper archiving solution.
The needs may vary, but the answer is the same – know your data.
For more information about how Aparavi can help with regulatory compliance and legal discovery, check out our on-demand webinar “Best Practices in Data Governance and Legal Use Cases.”