The data center is full of software and hardware designed to help solve a specific problem. Having too many of these solutions can create an operational issue for the IT team, but so can having too few. IT professionals should be meticulous when selecting purpose-built solutions. At Aparavi, we think there are plenty of good reasons that unstructured/file data is a category of data that benefits from a purpose-built solution. Here are the top 5.
It comes as no surprise to almost any IT professional that unstructured data is growing faster than any other data set. Users are creating more files than ever, and those files are increasing in size. These files are more than just office productivity files. There are video and audio recordings too. A more significant area of unstructured data growth, though, is data generated by machines. Users are getting good at creating data, but machines are the experts. The data types can range from log files to image files to video surveillance files as well as data from IoT sensors.
The result is that most data centers’ unstructured data storage footprint is now far more substantial than their transactional footprint. This growth means an ever-expanding primary storage infrastructure and a data protection architecture that is five to ten times its size.
Many organizations are looking to get out of the infrastructure business, and backup infrastructure is a great place to start. Still, many file backup software solutions don’t understand files at a granular enough level to know which data they should move off-premises.
Another significant change is the attitude toward unstructured data. Most organizations consider unstructured data key to their business. In many cases, they can’t operate without it. This criticality makes protecting data and, more importantly, finding it quickly a business imperative.
Unfortunately, as we recently discussed in a presentation with Storage Switzerland, most backup vendors treat unstructured data as a second-class citizen. Its protection is a “nice to have” feature instead of a requirement. Often, they have the attitude of, “If files are important to users, they should copy them to one of the organization’s Network Attached Storage (NAS) systems or file servers.” Not only do users not excel at copying data, but many data centers are also ill-equipped to properly protect data on their NAS systems or file servers.
Most backup solutions either back up by blindly crawling through the file system one file at a time, or they protect data by ignoring files altogether and performing an image backup. Blindly crawling the file system is so slow that many organizations can’t fit unstructured data backups into their backup windows. Image backups enable them to meet their backup windows, but they provide no visibility into the files they are protecting.
The General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) set a new bar for both data protection and data management. These regulations not only make it clear that data needs regular protection but also that organizations need to retain specific content for a predetermined time. Even if the backup solution uses a file-by-file method and the organization can live with long backup windows, it is still at risk from GDPR and CCPA requirements because these solutions only provide a limited amount of categorization and inspection to meet these new demands. Image backups of file data offer no insight or categorization. While some of these solutions provide individual file restores, most can’t search for a specific file across multiple jobs.
Neither file-by-file backup nor image backup products can meet the “right to be forgotten” requirements of these regulations. The “right to be forgotten” rules mean that a user has a right to have all their personal data that the business is storing deleted. These regulations do not exclude backup data. The problem is the way that most backup solutions store their data: they don’t allow for the removal of a specific set of files within a backup job. IT needs to remove the entire job to satisfy a right-to-be-forgotten request, but in doing so, it may violate other retention requirements.
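To make the granularity problem concrete, here is a minimal Python sketch of a per-file backup catalog. It is an illustration under stated assumptions, not a description of how any particular product stores data: the `BackupEntry` record, its `owner` tag, and the `erase_subject` helper are all hypothetical names. The point is that when every file is tracked individually, one person's files can be removed from every job while the rest of each job stays intact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupEntry:
    """One file inside one backup job (hypothetical catalog record)."""
    job_id: str
    path: str
    owner: str  # assumed tag linking the file to a data subject


def erase_subject(catalog, subject):
    """Split the catalog into entries to keep and entries to erase."""
    kept = [e for e in catalog if e.owner != subject]
    removed = [e for e in catalog if e.owner == subject]
    return kept, removed


catalog = [
    BackupEntry("job-1", "/hr/alice.pdf", "alice"),
    BackupEntry("job-1", "/finance/q1.xlsx", "finance"),
    BackupEntry("job-2", "/hr/alice.pdf", "alice"),
    BackupEntry("job-2", "/finance/q2.xlsx", "finance"),
]

# Both jobs survive; only the two entries tagged "alice" are erased.
kept, removed = erase_subject(catalog, "alice")
```

With a job-level store, the only equivalent operation would be deleting `job-1` and `job-2` outright, which also discards the finance files that may still be under retention.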
Ransomware is an ongoing threat that can impact almost any data center at any time. Ransomware developers are becoming increasingly sophisticated. Instead of infiltrating the environment and encrypting files as fast as possible, they either wait before attacking or encrypt files slowly to avoid detection. As a result, multiple backup jobs may have data that the ransomware attack encrypted as well as multiple copies of the ransomware itself. Recovery now means restoring select files from numerous previous jobs. Identifying the right files for recovery is difficult, and so is executing the restore so that the correct files come back from the proper backups.
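The recovery problem described above can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's implementation: it assumes a catalog that already records, per file version, which job holds it and whether that copy was encrypted. The `FileVersion` record, its `encrypted` flag, and `plan_restore` are hypothetical names, and detecting encryption in the first place is outside the scope of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileVersion:
    """One backed-up version of a file (hypothetical catalog record)."""
    path: str
    job_id: str
    backed_up_at: datetime
    encrypted: bool  # assumed to be known from some earlier detection step


def plan_restore(versions):
    """Pick the newest unencrypted version of each file across all jobs."""
    plan = {}
    for v in versions:
        if v.encrypted:
            continue  # never restore a copy the attack already touched
        best = plan.get(v.path)
        if best is None or v.backed_up_at > best.backed_up_at:
            plan[v.path] = v
    return plan


versions = [
    FileVersion("/docs/a.docx", "job-1", datetime(2024, 1, 1), False),
    FileVersion("/docs/a.docx", "job-2", datetime(2024, 1, 2), False),
    FileVersion("/docs/a.docx", "job-3", datetime(2024, 1, 3), True),
    FileVersion("/docs/b.xlsx", "job-1", datetime(2024, 1, 1), False),
    FileVersion("/docs/b.xlsx", "job-2", datetime(2024, 1, 2), True),
    FileVersion("/docs/b.xlsx", "job-3", datetime(2024, 1, 3), True),
]

# A slow attack hit the two files on different days, so the clean copies
# live in different jobs: a.docx comes from job-2, b.xlsx from job-1.
plan = plan_restore(versions)
```

A job-level restore cannot express this plan, because no single job contains the latest clean copy of both files.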
Most backup solutions today claim some form of cloud backup support, but how that support plays out is suspect. A basic level of cloud storage support is using the cloud as a mirror of what’s on-premises. This method increases costs because the organization is paying for storage twice. Some vendors claim to tier to the cloud, but the lack of granularity limits just how efficient they can be in the movement of data to the cloud. As a result, most backup solutions that tier data to the cloud are only tiering old full backups. The difference between any two full backups is relatively small, so much of the data they move is redundant.
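By contrast, tiering at file granularity can be as simple as selecting individual files by age. The following Python sketch assumes a list of `(path, last_modified)` pairs and a 90-day threshold; the function name, the threshold, and the input shape are illustrative assumptions rather than any product's policy engine.

```python
from datetime import datetime, timedelta


def select_for_cloud_tier(files, now, max_age_days=90):
    """Return the paths whose last modification is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [path for path, modified in files if modified < cutoff]


now = datetime(2024, 1, 1)
files = [
    ("/logs/old.log", datetime(2023, 6, 1)),     # untouched for months
    ("/docs/new.docx", datetime(2023, 12, 15)),  # recently edited
]

# Only the stale file is selected; the active one stays on-premises.
to_tier = select_for_cloud_tier(files, now)
```

Because the decision is made per file, only data that is actually cold moves to the cloud, rather than an entire full backup.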
Aparavi File Protect & Insight addresses each of these reasons. It is scalable to handle the size of today’s backups, and it uses an advanced file-by-file backup that is fast and provides details on each piece of data it protects. We treat unstructured data as a full citizen of the data center, and we protect it no matter where it is: file servers, NAS systems, and even user laptops.
File Protect & Insight is GDPR and CCPA ready. Because we protect data file by file and store data intelligently, each file or type of file can have different retention policies. We can also let you know which files have personally identifiable information (PII) in them. Additionally, we can remove data from anywhere within our infrastructure to satisfy right-to-be-forgotten requests.
For these same reasons, we are ready for ransomware: we can alert you to even the smallest of changes to old files, and we make it easy to find and restore groups of unencrypted files across multiple backups.
Finally, we are cloud-ready. File Protect & Insight can send data directly to the cloud or cache it on-premises first.
How about one more reason why you want a purpose-built backup for file data? Aparavi can save you money. Leveraging the cloud to store old copies of unstructured data can significantly reduce the cost of on-premises storage infrastructure. We can also help you identify files that should no longer be stored anywhere, neither in the backup nor on-premises. These are files that have no value to the organization. They might be copies of files or various versions of a file as users developed it. How many iterations of the corporate presentation do you need?
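One common way to spot exact copies of a file is to group files by a hash of their contents. The Python sketch below shows that general technique using the standard library's `hashlib.sha256`; it is not a description of how File Protect & Insight identifies redundant files. The `find_duplicates` helper is a hypothetical name, and file contents are passed in memory to keep the example self-contained.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group paths whose contents hash to the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    # Keep only digests shared by more than one path.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


files = {
    "deck_v1.pptx": b"corporate presentation",
    "deck_copy.pptx": b"corporate presentation",
    "notes.txt": b"meeting notes",
}

# The two identical decks are reported together; notes.txt is unique.
duplicates = find_duplicates(files)
```

Content hashing only finds byte-identical copies. Successive edited versions of the same presentation would need a different approach, such as grouping by name and modification history.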
We can also reduce the cost of your backup software licensing fees. While File Protect & Insight doesn’t protect databases, most organizations are already double-protecting those environments because they are using a replication solution, snapshotting the replicas, and still performing a backup with traditional backup software. Database backup is typically the most expensive part of a backup software solution. File Protect & Insight customers can count on their replication solution for rapid recovery and the occasional database dump to Aparavi for long-term retention.
Want to learn more? We recently presented on stage with Storage Switzerland about the new challenges and requirements of protecting file data. The good news is we recorded the presentation, “Are You Treating Data Like a Second Class Citizen?”, and you can watch it here.
Ready for a demo of Aparavi’s purpose-built file backup software? Schedule a call now.