Containerization has failed... long live Containerization.

Recent shifts in the Docker ecosystem and other container communities have led to some headaches. In short, the "default" position once touted for containerization is dead. But I still think containerization is a good strategy. We just need to reset our "defaults".

One of the biggest default assumptions around containers is that if you host on and pull from Docker Hub, you will be fine. And this mostly works. But, interestingly, it fails where you might least expect it: with some seriously big players, like Microsoft.

I dabble in ML in my spare time, hoping to get good at it. My progress there isn't important. What is important is that on numerous occasions I have found images no longer present on Docker Hub, or tags that were updated in place, breaking some other image that depended on them during my build process.

So what should we do about it? Just build from source? That is what a lot of people suggest. But there are plenty of companies that publish proprietary software via Docker Hub and therefore don't provide the source. Staying on the ML bandwagon, I can cite a major player here: NVIDIA.

Ultimately, I think the right answer has shifted substantially over the past two-plus years, and it is becoming increasingly important to run your own container infrastructure. The whole point of containerized images was to eliminate version and environment issues.

However, if you can't reliably build a container in the first place because the maintainers update or remove tags (or even entire image repos), then the whole argument is moot. Sure, once you have an image produced, anyone who downloads that image successfully and runs it in a supported environment should get a reliable experience.
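As an aside, one partial mitigation for mutable tags is to pin images by digest rather than by tag. A minimal sketch (the image name is just an example, and the digest is a placeholder, not a real value):

```bash
# Discover the digest behind a tag you currently trust
# (the image name here is just an example):
docker manifest inspect python:3.12

# Then pull by digest instead of by tag. A digest pull fails loudly
# if the content changes or disappears, instead of silently resolving
# to something new. The digest below is a placeholder.
docker pull python@sha256:<digest-from-the-manifest-output>
```

That still doesn't solve deletion, though: if the upstream repo disappears, a pinned digest is just a more precise way of failing, which is where the private registry comes in.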

But as developers who often rely on third-party containers, we also need a way of ensuring that our dependencies are at least as consistent as we expect them to be. And I think the only way to achieve that is to pull a container and then push it to a private repo that is maintained on your own internal schedule and follows your own internal policies.
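In practice, that workflow is only a few commands. A minimal sketch, assuming a private registry reachable at registry.internal.example (the hostname is a placeholder, and the exact image tag is illustrative):

```bash
# Pull the upstream image while it is still available
# (tag shown is illustrative).
docker pull nvidia/cuda:12.4.1-runtime-ubuntu22.04

# Re-tag it into your own registry's namespace
# (registry.internal.example is a placeholder hostname).
docker tag nvidia/cuda:12.4.1-runtime-ubuntu22.04 \
    registry.internal.example/mirrors/nvidia-cuda:12.4.1-runtime-ubuntu22.04

# Push the copy. From here on, builds reference your registry,
# retained on your schedule under your policies.
docker push registry.internal.example/mirrors/nvidia-cuda:12.4.1-runtime-ubuntu22.04
```

Once the copy exists, your Dockerfiles and deployment configs reference your registry's name instead of the upstream's, so an upstream deletion or re-tag can't break your builds.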

Interestingly, I think the same is slowly becoming true of building from Dockerfiles. Compromised GitHub accounts and owners sabotaging their own repos are an increasing problem; the risks have started becoming real. So, similar to the container image problem, it is becoming increasingly prudent that if you rely on a container from an open source project, you either build the image from source or pull it, and then push it to a private repo. It may even be prudent to clone the GitHub repo to a local GitLab (or similar) version control system.
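Mirroring a Git repo works the same way in spirit. A minimal sketch, assuming a self-hosted GitLab at gitlab.internal.example (the hostnames and repo paths are placeholders):

```bash
# Take a full mirror clone of the upstream, including all refs.
git clone --mirror https://github.com/some-org/some-project.git
cd some-project.git

# Point pushes at your internal GitLab instance.
git remote set-url --push origin \
    git@gitlab.internal.example:mirrors/some-project.git

# Push everything. Re-run `git fetch -p origin && git push --mirror`
# on your own schedule to stay current.
git push --mirror
```

The key point is that updates flow in on your schedule: if the upstream is compromised or sabotaged, you can simply stop syncing and keep building from the last state you reviewed.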

The reality is that it is becoming increasingly easy to self-host a whole range of DevOps tools at the same time as it is becoming more important to do so.
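For the registry side specifically, the barrier to entry is low: Docker's own open-source registry ships as an image. A minimal sketch for a local experiment (not a production setup, which would want TLS, authentication, and proper storage):

```bash
# Run the open-source Docker registry locally on port 5000, with a
# named volume so stored images survive container restarts.
docker run -d --name registry -p 5000:5000 \
    -v registry-data:/var/lib/registry \
    registry:2

# Smoke test: push a copy of a small image to it.
docker pull alpine:3
docker tag alpine:3 localhost:5000/mirrors/alpine:3
docker push localhost:5000/mirrors/alpine:3
```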
