"I’m writing this post freshly after a single restore of 370 gigabytes of database that took almost a week of wrestling with broken storage, incomplete archives, interrupted transfers, stuck communication, and — most of all — Waiting for Stuff to Complete, for hours and hours. In fact, much of this post has been written during the Waiting. Many of the issues I have wrestled with were caused by mistakes on my side: misconfiguration, insufficient monitoring that should have detected issues earlier, and the fact that over last month I was not able to pay enough attention to the day-to-day maintenance, which allowed the suckage to accumulate. At the same time, the software systems should automate out the common parts, make it easy to get things right, and be easy debug when they aren’t. All of the backup systems I’ve ever used or seen fails miserably at two or more out of these three points. But let’s start from the beginning." - urbansheep@gmail.com