Well, that was close.

What I *REALLY* dislike about myself is WHEN I'm right.

I don't speak up often and I don't make a lot of strong suggestions. But, when I do, they tend to seem almost prophetic.

In reality, my brain is probably just doing that normal human brain stuff. It is finding patterns and relying on my past knowledge and experience to predict probable outcomes. The purpose of this post isn't to brag or to blow smoke up my own butt. It is to talk about the most recent such experience.

A while back I started rewriting the application for my brother's business as a web application. It was a Windows Forms app before. My motivation was... weak. I just kind of wanted to learn Vue and make some use of my knowledge of microservices. So, I rewrote the backend in .NET Core (later upgraded to .NET 7) and wrote a Vue front end.

Cool. I hadn't actually told anyone about this, I had just sort of done it. It is something I do sporadically. Some of these branches get scrapped. Some end up replacing the old.

And then, my brother wanted a whole bunch of new features. So, I said, "Sure, but I'm doing it in the new stack and not the old." A few weeks later the new version was deployed. And it was largely ignored. End users don't love change.

Anyway, by the time I had put it in production, one of the major features was "real-time" backups. Any update to the DB was written to a sync table which was then sent to an offsite Sync Service to apply against a recent DB backup.

And for a while this was kind of pointless. The old system didn't pass through this. But then I had an epiphany: I can just make the old freakin' software populate the Sync table as well. And voilà! Almost a full year later, I have fairly near real-time backups.

To me this was a big deal. They had lost their data in the past due to ransomware. And their computers live bolted under a desk in a hot tire shop. I knew that sooner or later the hardware would crater if they didn't lose it to another virus. Oh. Also, their internet is horrible. Like, so bad that it demolished my plans for automated backups uploaded to the cloud.

The real-time thing was brilliant because the syncs were purely timestamped deltas. A few dozen to a few hundred KB over the course of a day. It was painless and fast.
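
Roughly, the idea looks something like the sketch below. This is not the real .NET code, just a minimal Python illustration using an in-memory SQLite database. The table name `sync_log`, the payload shape, and the function names are all made up for the example. I also use the row id as the "where did I leave off" cursor instead of the timestamp, which is an assumption on my part to avoid ties between changes that land in the same instant.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_db() -> sqlite3.Connection:
    """Create an in-memory DB with a hypothetical sync table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sync_log ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " changed_at TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def record_change(conn, table, key, values, changed_at=None):
    """Write one timestamped delta as a JSON payload (shape is assumed)."""
    changed_at = changed_at or datetime.now(timezone.utc).isoformat()
    payload = json.dumps({"table": table, "key": key, "values": values})
    conn.execute(
        "INSERT INTO sync_log (changed_at, payload) VALUES (?, ?)",
        (changed_at, payload),
    )
    conn.commit()


def deltas_since(conn, last_id):
    """Return every delta recorded after the last id the remote side has seen."""
    rows = conn.execute(
        "SELECT id, changed_at, payload FROM sync_log WHERE id > ? ORDER BY id",
        (last_id,),
    )
    return [(row_id, ts, json.loads(payload)) for row_id, ts, payload in rows]
```

Only the rows returned by `deltas_since` ever have to cross the wire, which is why the daily traffic stays so small.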

And then we got the first test of that system; the DB died at random this week. Not sure what happened yet. I know that the contents of at least one table are just gone. I assume that all other tables are gone as well. It was a bit of a panic. But, within about 5 minutes I had a backup running from my home server. And how much data was lost? At present, we assume 0%.

I'm still not sure of the cause. I'm pretty sure that the old software is somehow to blame. It puts me in a bit of an annoying spot though. Now the database is running on my home server. And only on my home server. I've also had to secure the app a bit further as it is no longer just running on-premises. The DB is backed up regularly and to 3 different drives. But, they are all in the same physical computer.

Not sure what the broader takeaway here is. I think that I want to take things a bit further and automate a couple of offsite backups. What basically needs to happen is that another offsite DB needs to exist to dump the Sync records into. Then, a maintenance service needs to periodically take the main service offline. While it is offline, it takes a backup, pushes the backup to a remote server, purges the offsite sync DB, and spins everything back up. In between backups, the offsite sync DB provides storage for the deltas between the backups. But, I also need to prove that this table can serve as a source for my Sync process. Thankfully it is a single table which holds a JSON serialized value.
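
To make the "can this table serve as a source" question concrete, here is a minimal Python sketch of replaying JSON sync records on top of a restored backup. Everything about the record shape (`op`, `table`, `key`, `values`) is an assumption for illustration, and the "backup" is just a nested dictionary standing in for the real database.

```python
import json
from typing import Any, Dict, List

# Hypothetical stand-in for a restored backup: table name -> row key -> row.
Snapshot = Dict[str, Dict[str, Dict[str, Any]]]


def apply_sync_records(snapshot: Snapshot, records: List[str]) -> Snapshot:
    """Replay JSON-serialized deltas, oldest first, onto a restored snapshot."""
    for raw in records:
        change = json.loads(raw)
        table = snapshot.setdefault(change["table"], {})
        key = str(change["key"])
        if change["op"] == "delete":
            # Deleting a row that is already gone is treated as a no-op.
            table.pop(key, None)
        else:
            # Upsert: merge the changed columns over whatever the backup had.
            table[key] = {**table.get(key, {}), **change["values"]}
    return snapshot
```

If the records are replayed in the order they were written, the snapshot should end up matching the live data as of the last synced change. Proving that against the real tables is the actual work.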

Well, this is the next adventure I think.

The cool thing though: I learned that I can back up and restore to S3 in SQL Server 2022. I have MinIO, which is S3-compatible storage. I tested it tonight and it works. This means that I can make a version of the application which is self-healing.
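
For reference, the T-SQL involved is short. The Python helpers below just build the statements as strings so they can be handed to whatever runs them. The endpoint, bucket, file, and database names in the usage example are placeholders, not my real setup, and the helpers assume a credential for the `s3://` URL already exists on the SQL Server side.

```python
def s3_url(endpoint: str, bucket: str, file_name: str) -> str:
    """Build the s3:// URL form SQL Server 2022 expects (host:port/bucket/file)."""
    return f"s3://{endpoint}/{bucket}/{file_name}"


def backup_statement(database: str, url: str) -> str:
    """T-SQL to back up a database to S3-compatible storage.

    Values are interpolated directly, so only pass trusted configuration here.
    """
    return f"BACKUP DATABASE [{database}] TO URL = '{url}' WITH FORMAT, COMPRESSION;"


def restore_statement(database: str, url: str) -> str:
    """T-SQL to restore over an existing database from the same URL."""
    return f"RESTORE DATABASE [{database}] FROM URL = '{url}' WITH REPLACE;"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    url = s3_url("minio.example.local:9000", "backups", "shop.bak")
    print(backup_statement("ShopDb", url))
    print(restore_statement("ShopDb", url))
```

One thing to watch for: as far as I can tell from the docs, SQL Server wants the S3 endpoint to be served over TLS, so a plain HTTP MinIO instance may need a certificate in front of it.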

And that is the long-term goal now:

  • The application syncs to a remote Sync DB. Probably my free Mongo Atlas DB.
  • Once a week, outside of business hours it runs the process above for backing up and purging the tables.
  • If the data goes sideways, I tell the server to load a recovery service which takes the DB down, restores the DB from S3, and syncs the data back in from the remote store.
  • The recovery service then generates a new DB upload and purge.
  • Finally, it restarts the main service.
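
The recovery part of that list is really just an ordered set of steps where a failure has to stop everything before the main service comes back up. A minimal Python sketch of that shape is below. The step names and the idea of passing each step in as a callable are my own illustration; none of this exists yet.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


class RecoveryError(RuntimeError):
    """Raised when a step fails; records how far the run got."""

    def __init__(self, failed_step: str, completed: List[str]):
        super().__init__(f"recovery stopped at '{failed_step}'")
        self.failed_step = failed_step
        self.completed = completed


def run_recovery(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Later steps (such as restarting the main service) never run
    if an earlier one raised.
    """
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            raise RecoveryError(name, completed) from exc
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Hypothetical step names; each lambda stands in for real work.
    plan: List[Step] = [
        ("stop main service", lambda: None),
        ("restore DB from S3", lambda: None),
        ("replay remote sync records", lambda: None),
        ("upload fresh backup and purge sync store", lambda: None),
        ("restart main service", lambda: None),
    ]
    print(run_recovery(plan))
```

Keeping the restart as the last step means a half-restored database never gets put back in front of users.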

The craziest part to me is that all of this should be low enough data usage that I can power it entirely via either on-site or completely free services. I can even broker it all through my home server to maintain a third copy of the data if necessary. Not sure what I want to do there yet. But it is exciting. This is a level of stability which is generally reserved for much larger companies and I love finding ways to bring this to smaller scale solutions.

Also, while I hate being right that I had to worry about the data, I love seeing the solutions work. This system has been running for years. It was likely just a random blip. But I've seen enterprise customers fail to recover 100% of the data in the timespan I managed. Granted, my solution would hit other problems at scale. It is important to know your limitations.

But yeah, that is the plan. It derails my Anki replacement by a bit. Though, this is also a lot more interesting and currently more practical.
