The Real Story Behind The Carbonite Data Loss
I was catching up on a bunch of RSS feeds this evening and caught this article from MSPMentor. The gist of the story was that Carbonite, an online backup service focused on home and small business users, had lost 7,500 clients' data in a catastrophic RAID failure.
Then I went and checked out Carbonite's blog, and saw this post by Carbonite CEO David Friend.
Carbonite is suing a vendor over some equipment that we bought back in 2006 and 2007 (see posts below). From a news standpoint, we thought that this was an inconsequential story about a minor trade dispute. Wrong. It has turned into a PR fiasco for Carbonite, and highlights the danger of Internet "news" where every writer is just copying what he or she has read elsewhere and NOBODY is doing what a real reporter does: check the primary sources.
Hundreds of blogs sensationalized our lawsuit by implying that 7500 Carbonite customers had lost data (the real number was 54) and that it is a current ongoing problem (it was over a year ago and we no longer buy servers from Promise).
Throughout all of this, NOT ONE person bothered to pick up the phone and call me to get the facts. Few if any read what was actually in the lawsuit. The story simply passed from one blogger to another, getting juicier along the way.
He has a point: no one checked with Carbonite on exactly how much data was lost. But I don't think that is the real story here. From what I can infer from David's post, the loss of a single server caused this loss of data. His post also gives the impression that he thinks of data sitting on RAID as a backup, something any decent systems engineer will tell you is NOT the case. RAID protects you from losing data if a drive fails, but not if a RAID controller goes bad or a software error overwrites your data.
This whole situation just reeks of an incredibly awful infrastructure. If Carbonite were truly looking out for its customers, it would have that data duplicated to multiple servers, preferably in multiple data centers! For a company touting itself as a backup expert, this is incredibly embarrassing.
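To make the "duplicate it to multiple servers" point concrete, here is a minimal Python sketch. It is not how Carbonite or anyone else actually does it: plain directories stand in for independent servers, and the names `replicate`, `healthy_copies`, and `sha256_of` are made up for illustration. The idea is simply that each copy is checked against a checksum, so one corrupted or missing copy can be detected while another copy is still good.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, replica_dirs: list[Path]) -> list[Path]:
    """Copy source into every replica directory and verify each copy.

    Assumption: each directory represents a separate server or data center.
    """
    expected = sha256_of(source)
    copies = []
    for directory in replica_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)
        # Re-read the copy so a bad write is caught right away.
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies


def healthy_copies(copies: list[Path], expected: str) -> list[Path]:
    """Return the copies that still exist and still match the expected digest."""
    return [c for c in copies if c.exists() and sha256_of(c) == expected]
```

With two or more replicas, losing one (a dead controller, an overwritten file) still leaves a verified copy to restore from, which is exactly what RAID on a single server cannot promise.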
I want to be careful here, but something doesn't add up between the TechCrunch article about this mess...
(David) Friend checked in again to state that no data was lost in the event, but a commentor says otherwise (anyone else affected who would like to weigh in?
...and what David says in his own post.
Hundreds of blogs sensationalized our lawsuit by implying that 7500 Carbonite customers had lost data (the real number was 54)
Either Robin from TechCrunch misquoted or misunderstood David, or David lied to Robin.
This is also coming from a company that, according to David Pogue (a real journalist!!), had employees write reviews on Amazon without stating that they were Carbonite employees.
My personal opinion? Stay far, far away from Carbonite. Check out Mozy or JungleDisk instead.
Tuesday, April 14, 2009 at 8:34PM
Reader Comments (2)
1. The Carbonite guy wasn't necessarily lying when he said no data was lost. It quite possibly comes down to perspective and spin. If you consider "backup" to mean that you still have a copy of data that was lost in another storage location, Carbonite may consider their customers' original copies to be backups for what Carbonite is storing. If that is the case and the customers still had their original copies, it may be *technically* true that no data was "lost." It might be stretching the truth or misleading, but isn't necessarily false - unless there were customers who no longer had their original files. (Of course, the customers would then be at least partially at fault since more than one copy of a file needs to exist for it to be "backed up.") I'm not defending them, but I'm also not prepared to write them off unless it can be proven that they were dishonest and/or not taking measures to ensure a similar situation does not occur in the future.
2. You imply that Mozy and Jungle Disk may provide better redundancy for the data they store, but it's not necessarily true. In my mind, Jungle Disk's redundancy is presumably better when used in conjunction with Amazon's S3 storage. Without doing some research, I wouldn't be comfortable depending on Rackspace (Jungle Disk's parent company) storage when Amazon S3 costs nearly the same.
To me, the problem here is that they are promising that your data will be there when you need it. But if that data is only held on a single server, any type of hardware failure will take it offline and make it inaccessible, as will a software error that writes over the data. It seems that copying the data to a secondary data center, or at least dumping things to tape once a week, would go a long way towards preventing an issue like this.
But instead of talking about improving their infrastructure significantly, they just moved to RAID 6, which does provide better redundancy but still doesn't duplicate the data anywhere else.
I absolutely believe that Jungledisk, using S3 as the back end, is a great and reliable way to store your data. I still use S3 with Jungledisk for my backups, but I would certainly have more trust in Rackspace's Mosso service than in Carbonite after this failure. Mozy also has a bit more credibility in my opinion because its parent company is EMC, one of the giants in the storage industry.
Thanks for stopping by :)