Is StackOverflow truly a monolithic application?

So I watched this video earlier and was a bit shocked. Not by the claims so much as the conclusion.

Now, if you said I had to classify it as EITHER a microservices architecture or a monolith, then yes, monolith. But unconstrained, I would say it is actually a hybrid.

Am I crazy? I think not.

A few things jumped out when I looked at their architecture and claims. 

Firstly, ElasticSearch. SO did not write ElasticSearch, but it is a search and analytics engine that is widely used for centralized logging. It covers a small slice of the problem domain and has its own data store. It is a microservice. The fact that it is not part of their code base does not mean that it is not part of their architecture. They COULD have written their own solution in-house and included it within the main monolith.

I'm sure you can guess where the next point is going: Redis. Again, this is typically used for caching, another small slice of the problem domain. And once again, it has its own data store.

These two cases are microservices in the purest and truest sense. If you include them in your project, then your project, even if it is developed in a monorepo and in a monolith-like fashion, is NOT a true monolith.

Some people might not bat an eye at their inclusion. I think this is because both of these are solutions typically leveraged by microservices. They provide a distributed cache and centralized logging. If you're running a single instance of a monolith, you may not need something like Redis, and you probably aren't at a large enough scale to get much benefit from ES beyond not needing to roll your own logging platform. But IF you are using them in a "monolith", then you don't have a monolith.

And this might sound like mere semantics. But it isn't. It is no different from using something like Auth0 for handling authentication, or a 3rd party service for localization. These are things which would CLEARLY be microservices if you had built your own and kept the same structure. Or put another way, MAINTAINED THE ARCHITECTURE. Architecture is not just about YOUR code. It is about how the whole system is composed and deployed. And SO is clearly deployed with MULTIPLE services. Not one.

The SQL DB sits at the same layer as Redis and ES in their diagram. I would say it should instead be folded into the main service, in the same way that the data stores behind ES and Redis are not called out explicitly.

But there is one other piece which, in this day and age, I don't think should get a free pass anymore, especially not when you have 2+ instances running. And since they have 9, it counts. That piece is the load balancer. More than likely, this is yet another service they did not write themselves. And it is likely NOT baked into that main service.

The point of this isn't to call them liars. I honestly think that most people would make the same classification. But, architecturally speaking, Redis and ES are clearly additional microservices which the SO application depends upon. So, while they have a monorepo and one service which is monolithic, StackOverflow itself is not based on a purely monolithic ARCHITECTURE. Don't mistake the code base for the architecture.
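To make the code-base-versus-architecture distinction concrete, here is a minimal sketch in Python. The component list only mirrors the pieces named in this post. The `Component` type, its fields, the `classify` function, and every instance count other than the 9 application instances are my own hypothetical illustration, not anything taken from SO's actual setup. The SQL DB is deliberately left out because I fold it into the main service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    in_house_code: bool  # True if it lives in the team's own repo
    instances: int       # assumed, except the 9 app instances from the post


# Hypothetical deployment list built from the components named in this post.
# The SQL DB is treated as part of the main service, so it is not listed.
DEPLOYMENT = [
    Component("main application", in_house_code=True, instances=9),
    Component("load balancer", in_house_code=False, instances=1),
    Component("Redis", in_house_code=False, instances=1),
    Component("ElasticSearch", in_house_code=False, instances=1),
]


def codebase_view(components):
    """What you see if you only look at your own repository."""
    return [c.name for c in components if c.in_house_code]


def architecture_view(components):
    """What you see if you look at everything that gets deployed."""
    return [c.name for c in components]


def classify(components):
    """Label the system using the distinction argued for in this post."""
    own = len(codebase_view(components))
    total = len(architecture_view(components))
    if total == 1:
        return "monolith"
    if own == 1:
        return "hybrid"  # one monolithic service plus supporting services
    return "multi-service"


if __name__ == "__main__":
    print("code base:", codebase_view(DEPLOYMENT))
    print("architecture:", architecture_view(DEPLOYMENT))
    print("classification:", classify(DEPLOYMENT))
```

Looking only at `codebase_view` gives you a single service, which is where the "monolith" label comes from. Looking at `architecture_view` gives you four deployed pieces, which is why I call it a hybrid.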

And, while the size of the site and the performance are impressive, I would also say that success does not necessarily imply the right solution. They need a SQL server instance with 1.5TB RAM to maintain those characteristics. That is quite the beast. And they say that it is able to keep 1/3 of the site in memory. It seems unlikely that anywhere near 1/3 of the site NEEDS to be in memory at any one time. Which in turn tells me that the MASSIVE amount of RAM is compensating for their architecture decisions.

Does the solution work? Yes. Does it work well for them? Presumably. Could it deliver comparable speeds with more efficiency? We'll probably never know. But the answer seems like it really should be a yes.

And to anyone who thinks I'm insane for suggesting that they could write their own in-house ES or Redis replacement, consider that they did write their own ORM just to squeeze even more performance out of the data access layers. YES, they very much have the skill and capacity to roll their own ElasticSearch replacement. They simply chose not to.
