Small Businesses and Database Backups

A few years ago, one of the worst things imaginable happened to the software I wrote for my brother: ransomware took their database.

Luckily for them, they had kept the paper trails as well. But it was still a fairly impactful event. They had relied on that software for years and had years' worth of customer data in there. Now they needed to collect everything all over again.

It was an eye-opener. I had always assumed that what would take the data down would be faulty hardware. And that, of course, was another nightmare always looming over us, especially after I moved 4.5 hours away (I used to be about a 30-minute drive away). But I looked at that as less likely. My experience had been that CPU fans give out before hard drives. Yes, I know, optimism like that can certainly come back to bite you.

So, after that we started making sure to take backups every time I went down to get my tires changed. But that meant just two backups a year, and maybe a third if I was down that way for the holidays or something else. That still meant that if the system became unrecoverable again, they would likely choose to start fresh again. Not ideal.

After that, I eventually started on a new backend for them. One of the problems I wanted to solve was this DB backup problem. I had thought about automating the SQL backup and then having another process push the full DB backup elsewhere. But I felt like there were too many unknowns and failure points. "What if the DB backup failed because of a record lock or some other issue?" "What if the upload credentials or API changed?" Or worse, "what if they decide that they need to shut it off because it hogs too much bandwidth?" This last one was actually a MAJOR concern for me. They are running some garbage dial-up or satellite connection, and when there is too much traffic their PoS system stops working. So even uploading a few MB could take several minutes. And they shut down the computers at night.

But anything which left the data local meant it was still susceptible to both ransomware and hardware failure. The data NEEDED to get offsite.

These are concerns which, I think, affect most small businesses. They aren't large enough to have IT staff and likely aren't tech-savvy. They also likely lack good policies to protect their data. There are also limited guarantees on things like bandwidth, in terms of both speed and volume. In short, I was dealing with what I think is a fairly pervasive issue. And I think I finally found one potential solution.

I had recently gotten my own network exposed to the internet using DuckDNS and Let's Encrypt. So, I decided that the new service should wrap every write to the DB in a call that also writes a "sync" record, and push those records to a service running alongside a backup copy of the DB. That service then reads the sync data and applies it to the backup in sequential order.
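
To make that concrete, here is a minimal sketch of the write-wrapping side. Everything in it is an assumption for illustration: SQLite as the engine, a `sync_log` table, and a statement-plus-parameters shape for the sync records; the real software does its own equivalent.

```python
import json
import sqlite3

def apply_write(conn: sqlite3.Connection, statement: str, params: tuple) -> None:
    """Run a write and record a sync entry for it in the same transaction."""
    with conn:  # one transaction: the write and its sync record both land, or neither does
        conn.execute(statement, params)
        conn.execute(
            "INSERT INTO sync_log (recorded_at, statement, params)"
            " VALUES (datetime('now'), ?, ?)",
            (statement, json.dumps(params)),
        )
```

The key design choice is that the write and its sync record share one transaction, so the sync log can never silently drift from the actual data.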

The fault points here are perhaps no less numerous, but more of them are under our direct control. For instance, I have to ensure that every required sync is recorded, recorded in sequence, and that each one is applied, applied successfully, and applied in sequence. That is a lot which can go wrong. But I own the code on both sides, and bugs can be worked out. The data is applied on a regular basis and consists only of deltas, so the payloads are small and sent much closer to real time (the process which sends the syncs acts as an aggregator and sends batches of data every minute... at their volume this is typically just one record per sync when there is data).
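
The receiving side can enforce that ordering mechanically. Here is a sketch, continuing the same hypothetical schema (the `sync_cursor` table and the record shape are also made up for illustration):

```python
import json
import sqlite3

def apply_batch(backup: sqlite3.Connection, batch: list[dict], last_applied: int) -> int:
    """Apply a batch of sync records strictly in id order; refuse to skip gaps."""
    for record in sorted(batch, key=lambda r: r["id"]):
        if record["id"] != last_applied + 1:
            raise RuntimeError(f"sync gap: expected {last_applied + 1}, got {record['id']}")
        with backup:  # the write and the cursor update commit together
            backup.execute(record["statement"], tuple(json.loads(record["params"])))
            backup.execute("UPDATE sync_cursor SET last_applied = ?", (record["id"],))
        last_applied = record["id"]
    return last_applied
```

Halting on a gap instead of skipping past it is what turns "applied in sequence" from a hope into a checked invariant.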

I can still take periodic backups when I'm in town. I just shut down the receiving end of the sync service, take a DB backup while there is no pending work to sync, restore the DB when I get back and start the sync service again. I can even backup the synced DB prior to the restore and compare it to the new backup. They should contain identical data. I can use this to validate if there are any gaps in the data being sent. On the receiving side I can see errors or even debug if there are issues.
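
That comparison can be as blunt as diffing full dumps. A rough sketch, again assuming SQLite (the paths are placeholders); two databases built from the same writes in the same order should produce identical dumps:

```python
import sqlite3

def databases_match(path_a: str, path_b: str) -> bool:
    """Compare two databases by dumping each to SQL text."""
    a, b = sqlite3.connect(path_a), sqlite3.connect(path_b)
    try:
        # iterdump emits schema and rows as SQL statements
        return list(a.iterdump()) == list(b.iterdump())
    finally:
        a.close()
        b.close()
```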

There are a few intrinsic benefits to this approach:

  1. The backup I have will likely contain 100% of the data in the event of a catastrophic failure.
  2. The backup I have should be an exact copy of the production database, ready to deploy to a local instance of the software in the event that a failover is required.

And yes, that is only two points. But they are BIG points. For a small business, this means the odds are good of losing NO data, even if they are hacked. This is bespoke software, so it is unlikely that malicious code would be able to pollute the data stream on my end. And it is offsite, communicating over SSL, so there are no VPN tunnels or anything exposing my servers directly. It also means I can get them back online in a matter of minutes if I'm home when it happens, or in less than 24 hours if I'm not.

I hope it never comes to that. But this is getting pretty close to enterprise-level reliability for small businesses. And I could even automate a lot of that. Slightly larger businesses with multiple locations could become their own backup sites, each location keeping a local copy of the other location's data; or, if they share a DB, both sites could each maintain a synced copy.

Though, at that point I would need a turnkey solution for switching over to the backup. However, that is also something I've considered in the past.
