Tape Storage is THE HPC Solution!
By   |  June 30, 2016

You mean that we must reduce the number of cartridges used by SMEs. Is that not a danger for the tape industry?
What is data archiving? It is the act of making a copy from disk or server to tape. I have two teenage children at home. I take pictures of them because I know they will change physically and that, if I do not keep pictures, all I will have left to remember what they looked like at the age of 15 is my memory. They will change, for sure, and so will computer data stored on hard disk. In the specific case of the hard drive, it is even simpler: we know that the data will eventually disappear. Making a second copy on tape is the equivalent of taking a very precise picture of the data. Tape is a bit like those mammoths found encased in a block of ice: it freezes the data so that we are sure to find it when we need it. It is important to put this question into perspective: if the storage medium we are using is that solid, why multiply copies across 4 or 5 different sets of tapes? All those years of research on tape quality should help improve the situation: simplifying the backup process was a priority for the SMBs with whom we spoke. Why deprive yourself of this benefit now that we can afford it? On the other hand, I still advise those who prefer tapes that are not manufactured with Barium Ferrite to multiply copies in order to secure their data.

The second criticism that you mention is the lack of technical support or hotline…
Let us come back to this lack of support for SMEs. Looking at the tape market today, the presence of tape can be described as an inverted triangle: tape is highly present in the larger companies, and the smaller the company, the lower the use of tape. Yet everyone needs to keep data for 20 to 30 years or more. Why this difference? Simply because SMBs have been left out for too many years: most hotline centres know how to act on a specific request made by the user. If you call and say you do not understand why the transfer speed of your drive has dropped dramatically, you are not always certain to get an answer. By contrast, if you call to say your drive head needs maintenance, which is the likely cause of your problem, they know how to press the right buttons. If you say you do not understand why you cannot write more than 2TB on your LTO6 tape even though its native capacity is 2.5TB, you sometimes get crazy answers, such as being told that you bought the wrong tape brand! But if you ask for a firmware upgrade, which is most likely what you need, they will offer you the service you are seeking. Basically, users need us to learn to listen to them. They need us to solve their problems faster. They even need to be able to contact us with the most basic questions if necessary. Today, we can easily solve tape users’ problems, yet not all users know this, and some have told me that they have simply stopped using tape, even taking the risk of using only disk, because they got tired of the smugness and passivity of the technical support they had contacted. However, I must admit that we cannot tar everyone with the same brush: some library manufacturers are extremely professional, others are catastrophic.

Can you give us names?
Those who are no good already know it. As surprising as it may seem, the manufacturer with which I feel most safe and supported is Oracle.

If this is such a problem, why don’t you offer a solution?
This is precisely what we do: we are a group of European players who have created a network of technical service centres with support from manufacturers such as IBM, Oracle and Fujifilm. We offer permanent, free support to those who need our help. The range of services consists of two parts: a preventive service, which simply answers user questions in a kind of personalized technical hotline, and a corrective service, with interventions on tapes when there is a real problem. Distributors able to offer such services include Media Resources and PMD in the UK, UFP in Germany, CI90 in Spain, Cygate in Scandinavia, Diskus in Poland and, of course, Octant in France and Benelux.

Beyond these two criticisms, there are still other sensitive points that you do not mention, starting with the fact that people often blame tape for not being an open format, since the data is saved with backup software and is therefore not as easily accessible as with disk technology…
I’ve heard this kind of argument. The reason I did not mention it is simply that it is wrong. The problem is the same with HDD. What is backup software made for? Primarily to reduce the workload. When you work on a hard drive and do not have many files to back up, you can do it manually in a “copy and paste” mode, so to speak. When the workload is too heavy and you have a very large number of files to back up, it is wiser to use backup software that lets you control, order and schedule your backups over time. Nevertheless, different backup software packages are obviously not compatible with each other. In addition, there is a time lag when opening a file, since it is necessary, in a way, to “unzip” the file in order to restore its original format. So you see that this phenomenon also occurs with hard disks. It is exactly the same with tape: you can choose to operate manually by installing LTFS, software that lets you handle your backups in exactly the same way as with a desktop hard drive. However, do not back up large numbers of small files on LTFS. That would take a crazy amount of time: it is better to use backup software in such a case. Each system has its advantage: backup software is easier to use when it comes to writing data, while LTFS allows easier access to data.

For example, in the pro-video or remote sensing environment, those who save a small number of large files often opt for LTFS. Conversely, a company that produces lots of small invoices will choose backup software.

Another weak point of tape technology is that its data access time is longer than with disk…
There is a lot of confusion about that precise point, as much by users as by vendors. It can be answered in two ways.
The first point is about how to calculate the access time to a file saved on a tape cartridge. We can break this access time into three stages. First, the library must load the cartridge into the drive. This takes 30 seconds on average on a standard system. Second, the drive must locate the data on the tape: it has to wind the tape to the location of the file, which is the argument of the pro-disk party. However, be aware that it takes no more than 2 minutes 30 seconds to wind an LTO7 tape cartridge from end to end. In short, this second stage takes between 10 seconds and 2 minutes 30 seconds depending on the file location. The third step is to load the file, or to open it, so to speak: the loading time depends on the transfer rate. In this third phase, tape is essentially faster than the hard drive. We can therefore conclude that disk begins the sprint with a lead of 40 seconds to 3 minutes. In general, access to data is faster on tape only for very large files above 70GB, which are mostly used in the scientific area, in remote sensing or, in the future, in the broadcast market. In contrast, for very small files, access time on hard disk is faster. Worse yet, when it comes to opening multiple files simultaneously, multiple winds of the tape naturally widen the gap.
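Editor's illustration: the three-stage reasoning above can be turned into a small calculation. The sketch below is only a rough model, not a benchmark. The 30-second load time and the 10 to 150 second locate time come from the interview; the transfer rates passed in the usage lines (300 MB/s for tape, 150 MB/s for disk) are assumptions chosen for illustration, and the function names are hypothetical. The break-even file size depends heavily on those assumed rates, so it will not necessarily match the 70GB figure quoted above.

```python
def tape_access_seconds(size_gb, tape_mb_s, load_s=30.0, locate_s=10.0):
    """Stage 1 (load) + stage 2 (locate) + stage 3 (transfer) for one file on tape."""
    return load_s + locate_s + size_gb * 1000.0 / tape_mb_s


def disk_access_seconds(size_gb, disk_mb_s):
    """Disk has no load or locate stage in this simplified model, only transfer."""
    return size_gb * 1000.0 / disk_mb_s


def break_even_gb(tape_mb_s, disk_mb_s, load_s=30.0, locate_s=10.0):
    """File size (GB) above which tape finishes first.

    Returns None when tape is not faster than disk in stage 3,
    because tape can then never catch up with disk's head start.
    """
    if tape_mb_s <= disk_mb_s:
        return None
    head_start_s = load_s + locate_s
    # Seconds that tape gains on disk for every gigabyte transferred.
    gain_per_gb = 1000.0 / disk_mb_s - 1000.0 / tape_mb_s
    return head_start_s / gain_per_gb


# Assumed rates, for illustration only: tape 300 MB/s, disk 150 MB/s.
best_case = break_even_gb(300.0, 150.0, locate_s=10.0)    # file near the start of the tape
worst_case = break_even_gb(300.0, 150.0, locate_s=150.0)  # file at the far end of the tape
print(f"break-even between {best_case:.0f} GB and {worst_case:.0f} GB")
```

With a slower assumed tape rate or a faster assumed disk rate the break-even size grows quickly, which is consistent with the point that tape only wins on very large files.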

The second key idea is that, despite this advantage for disk, I still think the tape industry should not treat this problem as a priority and should instead continue to invest in communicating to a wider public about the features and benefits of tape technology. You should know that 85% of archived files are never accessed again. An invoice issued in 2004 and saved digitally should always be available in case of audit, incident or litigation, although there is obviously a good chance we will never need to access it. At worst, if that happens, it takes 3 to 4 minutes of patience for files saved on an LTO7 tape cartridge. Is this reason enough to invest so much money in such an optional need? On the other hand, some companies need frequent access to a certain amount of data, and this is precisely why data management software exists, known as HSM or tiering software. The user can save all of their data on tape and then decide to keep a second copy of the specific data that needs very frequent access on a hard drive or SSD. This is not incompatible with the use of tape.

Let’s play devil’s advocate. You say that tape is more reliable than disk over time, but is it not precisely the role of disk plus RAID to secure the data?
I call it the vicious circle of data loss, or how to solve one problem by creating a new one and finally end up back at the starting point. A hard disk becomes defective after 3 to 4 years, which is why the disk industry created the RAID system: basically, we make several copies (generally two) hoping that if two of the three disks die, the third will survive. Beyond the surreal aspect of this idea, a natural consequence of RAID is that it greatly reduces the recording speed. To solve the recording speed problem, which itself comes from the attempt to solve the problem of data loss, mankind invented deduplication. The purpose of deduplication is to isolate newly recorded data. It lets the hardware save only the new, incremental data and consequently reduces the storage capacity needed. By the way, contrary to what people think, deduplication is no danger to tape technology for large users, since the vast majority of them already practice incremental backups to tape and, by nature, do not need software that would help them achieve what they are already doing with their tape library.

Nevertheless, you should be aware that deduplication will use your CPU cycles and significantly reduce the performance of your system. We can compare this to someone who downloads a movie on his laptop and may not be able to use his PC until the download is complete. If this user wants to send e-mails while downloading the file, his only option is to use another PC. It is exactly the same with deduplication: I currently sell this type of solution, and users often tell me that they ended up buying a second server in order to keep working in decent conditions while saving their files. In short, it costs more money.

But the worst is yet to come: deduplication implies maximum risk-taking in terms of data loss. Indeed, during the backup, the system breaks the files into chunks that are spread chaotically across all available disks. Each file has a database that can locate these chunks of data, gather them together and finally restore the file when the user wishes to access it. This is what we call “data rehydration”. However, if the “database” unfortunately sits on the wrong hard disk, one that has passed away, you can no longer recover your data. We are back at the starting point: the problem with disk is data loss. We can invent as many tricks as we like to drive people to purchase a new IT device, and it may keep that industry turning, but just how much of the user’s interest are we taking into consideration? I have serious doubts about this. Again, hard disk is a great tool. It plays a major role in the storage system, but the long-term preservation of data is not one of its attributes. The new LTO7 tape is a simple, reliable and fast solution. It is the ideal complement to hard disk.
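Editor's illustration: the chunking and “rehydration” mechanism described above can be sketched in a few lines. This is a deliberately simplified toy, not how any particular product works: the fixed 4-byte chunk size, the in-memory dictionaries and the function names are all assumptions made for the example. It shows the two points made in the answer: only chunks not seen before are written, and a file cannot be rebuilt once the index (the “database”) that lists its chunks is lost, even if every chunk is still present.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example stays readable


def store_file(name, data, chunk_store, index):
    """Split data into chunks, write only unseen chunks, record the file's recipe.

    Returns the number of chunks actually written to the store.
    """
    recipe = []
    written = 0
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:  # new data: store it once
            chunk_store[digest] = chunk
            written += 1
        recipe.append(digest)          # known or new, the recipe references it
    index[name] = recipe
    return written


def rehydrate(name, chunk_store, index):
    """Rebuild a file from its recipe; raises KeyError if the index entry is gone."""
    return b"".join(chunk_store[digest] for digest in index[name])


chunk_store, index = {}, {}
store_file("invoice_a", b"AAAABBBBAAAA", chunk_store, index)  # writes 2 chunks, not 3
store_file("invoice_b", b"AAAACCCC", chunk_store, index)      # writes only the new "CCCC"
restored = rehydrate("invoice_a", chunk_store, index)

index.clear()  # simulate losing the disk that held the "database"
try:
    rehydrate("invoice_a", chunk_store, index)
    recoverable = True
except KeyError:
    recoverable = False  # the chunks still exist, but nothing says how to assemble them
```

Real systems typically protect the index with redundancy of its own; the sketch only illustrates why it is a single point that everything else depends on.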

Today, a user who holds 40TB of data and faces annual growth of 10% over the next 5 years can acquire a tape system that will cost him roughly 36,500 euros over 5 years, full TCO. The amortized cost is about 610 euros per month over five years. This price includes the tape cartridges he will need over the 5-year period, the maintenance contracts, the installation of both the library and the drives, and a real operational transfer rate of 720MB/s. 610 euros per month: this is the price of peace of mind.
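Editor's illustration: the arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted above (40TB, 10% annual growth, 5 years, 36,500 euros); the function names are hypothetical, and compound growth is an assumption about how the 10% is applied.

```python
def projected_capacity_tb(start_tb, annual_growth, years):
    """Capacity after compound growth, e.g. annual_growth=0.10 for 10% per year."""
    return start_tb * (1.0 + annual_growth) ** years


def monthly_cost(total_cost, years):
    """Spread a total cost evenly over the given number of years."""
    return total_cost / (years * 12)


capacity = projected_capacity_tb(40.0, 0.10, 5)  # data volume at the end of year 5
per_month = monthly_cost(36_500.0, 5)            # 36,500 euros over 60 months
print(f"{capacity:.1f} TB after 5 years, {per_month:.2f} euros per month")
```

The monthly figure comes out slightly above 608 euros, which the interview rounds to 610, and the data volume grows from 40TB to a little over 64TB by the end of the period.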


© HPC Today 2024 - All rights reserved.
