Addressing the zombie in the room: Microservices

A while back I stumbled across the term "zombie services," used to describe Microservices that continue to run after the developer or team that built them has left the team or company. I was a little confused, because I've come across a similar phenomenon in monolith applications, and the problem there is FAR worse. What confused me is that the term seems to exist specifically to brand this as a Microservices issue.

To explain WHY it is worse in Monoliths, we will first assume that we are talking about:

  1. "true" microservices
  2. "true" monoliths (IE - not hybrids)
  3. sufficiently large/complex projects

The first two points are just to keep the argument clear and to head off any squabbling about compromises which COULD be made in a hybrid scenario.

Now, what makes a Microservice a Microservice is that the domain of the service is much smaller than that of a monolith. It might be a single function, or it might be a small slice of the application, like authentication. (In a Monolithic application, everything is in one service.) On the one hand, the smaller domain means the service is more likely to have been the work of one person or a small team. On the other hand... well, we'll get there.

Architecturally, Microservices draw a hard line in the sand for decoupling dependencies; Monoliths are a free-for-all. I will note that the same decoupling CAN be achieved in Monoliths, and that Microservices can be poorly scoped for their domains. That is part of the reason for the "true" Microservices caveat: here we are assuming the scoping is appropriate. Under that assumption, the scope of any given area of the application is not much different than in a Microservice, so there are the same odds that a particular area is owned by one or a small number of people.

But the first problem with "zombie services" is identifying them. With Microservices, that is easy: go to your production environment, look at all of the services running, and ask your teams who owns each one. Any that don't come back with an answer are potential zombies.

With Monoliths? Well, you only have one service, so you're at the mercy of how the project was structured. If it was well structured, you may have separate namespaces for separate teams, with the services in the same or consistently named areas. Then you might be able to write a tool to find all of your services. And that is probably the best-case scenario. Worst case, the functionality in question is just one service call buried within a larger controller. And THAT is the scenario I see more often. While it isn't guaranteed that this is the case, the lack of architectural constraints on code structure in Monoliths does mean that this can (and does) happen.
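In that best-case scenario, the tool can be almost trivial. A sketch under assumed conventions: service classes live in files ending in `Service.java` (the naming convention and the Java extension are assumptions, not something every monolith follows), so a directory walk produces an inventory you can then check ownership against:

```python
# Best-case monolith sketch: if service classes follow a consistent
# naming convention, a simple source-tree scan yields a service inventory.
import re
import tempfile
from pathlib import Path

SERVICE_FILE = re.compile(r".*Service\.java$")  # assumed convention

def inventory_services(root):
    """Walk the source tree and collect files matching the convention."""
    return sorted(
        str(p.relative_to(root))
        for p in Path(root).rglob("*.java")
        if SERVICE_FILE.match(p.name)
    )

# Demo on a throwaway source tree.
with tempfile.TemporaryDirectory() as root:
    src = Path(root, "com", "example", "billing")
    src.mkdir(parents=True)
    (src / "InvoiceService.java").write_text("class InvoiceService {}")
    (src / "InvoiceController.java").write_text("class InvoiceController {}")
    print(inventory_services(root))
```

In the worst-case scenario described above, no such regex exists, and you are back to reading controllers line by line. That asymmetry is the point.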

After you've identified your zombies, the next task is to deal with them, either by replacing them or by finding someone to take over the project.

Once again, true Microservices have an edge here regardless of the approach. The scope of the project is small, and the project is decoupled from the rest of the application. This means a smaller code base to dig through and a single focus for all of that code. The unit tests are likely all in one place, and thus easier to find to aid in that comprehension. And if, in the end, you decide to rewrite, then the scope of the rewrite is small and well-defined as well. Not to mention, the service is separate, so you can always revert to the last known working copy of that one particular service.

(Note: another reason for choosing true Microservices for this example is the separate database. This guarantees that breaking DB changes in other areas of the project won't force us to address the zombie service on a timeline driven by another area of the code.)

With Monoliths, without the guard rails, more things tend to end up more heavily coupled for convenience. So even if you CAN identify your zombie code, once the SME (subject-matter expert) leaves it can be a lot harder to determine what the code even did, or how to begin fixing issues. Tests can be mixed in all over, and so on.

Being consistent and following good practices can address a lot of these issues. My experience, however, is that any sufficiently large group of people is sufficiently difficult to control that some corners will eventually be cut, and that the rate of corner-cutting will increase as the project grows.

Architectural decisions that bake consistency into the application at a level which is harder to subvert are one of the best mechanisms for combating this tendency. Microservices are one such architectural decision. 

The only large Monolith projects I have seen succeed at maintaining their code standards are open source projects where merge requests (MRs) are reviewed and controlled by one or a small number of people with similar ideals and a high degree of expertise.

Large commercial projects do not tend to have a single person, or a group of people hired for their shared ideals, managing all code submissions. Inconsistency is baked into the process in those environments.

Another interesting byproduct of this "tyrannical" approach to code submissions is that the maintainers, at some point, see and review every piece of code. And maintainers tend to stick around the longest, with their departures often marking the end of the project. As a result, zombies rarely appear in such projects. But companies tend not to operate that way, because it means the fate of the entire product likely lies on the shoulders of one or a few individuals, and that is not fiscally responsible.

So, I suppose my conclusion is that human nature is the true enemy, and that commercial projects should utilize strategies which architecturally enforce more maintainable code (including the adoption of Microservices, or at the least hybrids). Open source and other smaller projects should do what feels right. But if you get big in open source, be sure to lock down submissions and scrutinize the heck out of everything.
