Online note-saving service Evernote on Monday acknowledged that it had suffered a hardware fault at the beginning of July that resulted in potential data loss for more than 6,000 of its users worldwide.
The issue was first reported by the blog Techwave, citing a report from Japanese newspaper Mainichi Shinbun. In a note posted Monday on the company's blog, Evernote CEO Phil Libin explained that the loss stemmed from bad server hardware:
"Every user's data is stored on a 'shard.' A shard is made up of a server together with a redundant fail-over server. If there is any problem with a server, the system automatically fails over to the second server in the shard. We currently have 37 shards. Shard 22 was the one that had problems last month."
Evernote's backup system stores user data in up to six different places, using both on- and off-site servers as well as the user's local copy of the software. During the problem, which lasted four days, however, user data was being overwritten because one of these systems did not have a working failure routine. "Basically, the shard kept failing over back and forth between two servers over the time period causing some of the data created during that time to get overwritten," Libin explained.
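To illustrate the kind of failure Libin describes, here is a minimal toy sketch in Python. It is not Evernote's actual code: the `Server` and `Shard` classes, the `fail_over` and `resync` methods, and the assumption that replication between the two servers was broken during the flapping are all hypothetical, chosen only to show how back-and-forth fail-over without reconciliation can cause newer data to be overwritten.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    notes: dict = field(default_factory=dict)  # note id -> content


@dataclass
class Shard:
    """Toy shard: one active server plus one standby (hypothetical model)."""
    active: Server
    standby: Server

    def write(self, note_id, content):
        # Assumption: replication is broken, so writes reach only the active server.
        self.active.notes[note_id] = content

    def read(self, note_id):
        return self.active.notes.get(note_id)

    def fail_over(self):
        # Swap roles without first reconciling the two copies.
        self.active, self.standby = self.standby, self.active

    def resync(self):
        # Naive recovery: the active copy replaces the standby copy wholesale.
        self.standby.notes = dict(self.active.notes)


def simulate():
    a = Server("a", {"note-1": "v1"})
    b = Server("b", {"note-1": "v1"})
    shard = Shard(active=a, standby=b)

    shard.write("note-1", "v2")              # stored on server a only
    shard.fail_over()                        # b becomes active, still holding v1
    seen_after_first = shard.read("note-1")

    shard.write("note-1", "v3")              # stored on server b only
    shard.fail_over()                        # a becomes active again, holding v2
    seen_after_second = shard.read("note-1")

    shard.resync()                           # b's v3 is overwritten by a's v2
    return seen_after_first, seen_after_second, b.notes["note-1"]
```

In this sketch, the edit saved as `v3` exists only on one server and disappears once the other server's older copy is treated as authoritative. A real system would need to compare versions or timestamps before letting one copy replace the other.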
In a call with CNET on Monday morning, Libin said that of the 6,323 users affected by the outage, approximately 70 percent were able to get their data back.
Evernote's software saves a local copy of a work in progress before syncing it with whatever is stored online, so the company was able to recover complete copies of many files once the problem had been fixed. However, those who had been working purely on Evernote's site, and whose work was being stored on the faulty shard, had no such protection.
As an apology, Evernote has provided affected users with a free year of the company's $45-a-year premium service. Those who were already premium subscribers get an extra year.
As for whether this could happen again, Libin said it's extremely unlikely. "This was a freak of hardware failures. But we've changed the fail-over process so it won't happen again."
Data loss on large-scale Web services is uncommon, but can be extremely hard to recover from. In 2009, social-bookmarking site Magnolia suffered a massive data corruption that resulted in the loss of all its user data. It has since started from scratch with a new version of the site. Prior to that, one of the most high-profile outages was a multi-hour downtime for Amazon's S3 cloud storage service, which many sites use as their built-in storage solution.
At a press conference three weeks ago, Evernote announced it has reached 3.7 million users since launching in June of 2008. In that time, its users have saved 145 million notes, which Libin said works out to 312 new ones every minute.