Docker and Kubernetes | What the Heck Is a Container and How Does It Really Work?
Image: A single file that contains the dependencies and configuration required to run a program
Container: A running instance of an image
You have probably heard these terms a lot while looking for resources on how Docker works. An image is easy enough to picture: we can imagine it as a file that contains all the information about the program. It is harder to understand what a container is and how it really works. In this article, you and I will discover what a real container is.
How do programs run on an operating system?
Before diving into the container concept, we should first understand how programs run on an operating system without container technology.
The diagram above shows a simple flow of how a program runs on an OS. Let’s take a closer look at it:
Kernel: The core software layer of the operating system. It governs access between all the programs running on your computer and all the physical hardware connected to it.
System Call: It’s like a system function. For example, if we try to open a file on the hard disk using Python, we don’t directly communicate with the hard disk itself. We first make a system call to the kernel and it handles our request.
So the kernel is always an intermediate layer that governs access between these programs and your actual hardware. The other important thing to understand is that running programs interact with the kernel through system calls, which are essentially function invocations. The kernel exposes different endpoints and says “hey, if you want to write a file to the hard drive, call this endpoint or this function right here”. The call takes some information, and that information is eventually written to the hard disk, to memory, or to wherever else it is required.
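To make the idea of system calls more concrete, here is a minimal Python sketch. It uses the low-level functions in the standard `os` module, which are thin wrappers around the kernel's open, write, read, and close calls. The helper name `write_and_read_back` and the file name `demo.txt` are made up for this illustration.

```python
import os
import tempfile


def write_and_read_back(message: bytes) -> bytes:
    """Write bytes to a scratch file and read them back using os-level calls."""
    # Hypothetical scratch location, used only for this illustration.
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")

    # os.open asks the kernel for a file descriptor; we never touch the disk directly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, message)  # hands the bytes to the kernel
    finally:
        os.close(fd)  # tells the kernel we are done with the descriptor

    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, len(message))  # asks the kernel for the bytes back
    finally:
        os.close(fd)

    # Clean up the scratch file and directory.
    os.remove(path)
    os.rmdir(directory)
    return data


if __name__ == "__main__":
    print(write_and_read_back(b"hello kernel"))
```

The program only ever holds a small integer (the file descriptor). Everything that actually touches the disk happens inside the kernel.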
Now, thinking about this entire system, I want to consider a situation where my Discord app needs a dependency called X and Chrome needs a dependency called Y. Assume that X and Y conflict, so they cannot both work on the same OS at once.
In the diagram shown above, we have two different segments so that Discord and Chrome can run together (because we’re assuming that X and Y don’t work together). That’s where namespacing comes into play. We can namespace a process to restrict the area of the hard drive that is available to it, the network devices that are available, or its ability to see and talk to other processes. In this situation, we’ve created separate namespaces for X and Y so that Discord and Chrome can run side by side.
Namespacing and cgroups
Namespacing allows us to isolate resources per process or per group of processes. We are essentially saying that any time a particular process asks for a resource, we direct it to one specific area of the given piece of hardware. Namespacing is not only used for hardware; it can also be used for software elements, such as the process IDs or the hostname that a process sees.
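On Linux you can look at the namespaces a process belongs to without any special tooling. The sketch below assumes a Linux system with `/proc` mounted and simply returns an empty result elsewhere. The function name `list_namespaces` is made up for this illustration.

```python
import os


def list_namespaces(pid: str = "self") -> dict:
    """Map each namespace type (for example 'pid', 'net', 'mnt') to its identifier."""
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        # Not Linux, /proc is unavailable, or the process does not exist.
        return namespaces
    for name in entries:
        try:
            # Each entry is a symlink whose target looks like 'pid:[4026531836]'.
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # skip entries we are not allowed to read
    return namespaces


if __name__ == "__main__":
    for kind, identifier in list_namespaces().items():
        print(kind, identifier)
```

Two processes that show the same identifier for a given type share that namespace. A process started inside a container typically shows different identifiers from a process started directly on the host.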
Namespacing is not the only way to restrict resources. We can also use control groups (cgroups) for resource metering and limiting.
OK, so what is the difference between namespacing and cgroups?
Namespacing limits what you can see
cgroups limit what you can use
With namespacing, you can fool the container: the process inside thinks it is the only process running and that no other processes are around. So you can limit what your container actually sees.
With cgroups you can limit resources like memory, network, and CPU.
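Just as with namespaces, Linux exposes the cgroup membership of a process through `/proc`. The following sketch parses the documented `hierarchy-ID:controller-list:cgroup-path` line format of `/proc/self/cgroup`. The function names and the sample paths in the usage example are made up for this illustration, and the code assumes a Linux system (it returns an empty list elsewhere).

```python
def parse_cgroup_file(text: str) -> list:
    """Parse lines of the form 'hierarchy-ID:controller-list:cgroup-path'."""
    entries = []
    for line in text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        hierarchy_id, controllers, path = parts
        entries.append({
            "hierarchy": hierarchy_id,
            # The controller list is empty for cgroup v2 entries.
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries


def read_own_cgroups() -> list:
    """Return the cgroups of the current process, or [] if unavailable."""
    try:
        with open("/proc/self/cgroup") as handle:
            return parse_cgroup_file(handle.read())
    except OSError:
        return []  # not Linux, or /proc is unavailable


if __name__ == "__main__":
    # Hypothetical sample input, not taken from a real machine.
    sample = "0::/user.slice/session-1.scope\n12:cpu,cpuacct:/docker/abc\n"
    for entry in parse_cgroup_file(sample):
        print(entry)
    print(read_own_cgroups())
```

The path in each entry names the group the process belongs to. The limits themselves (for example a memory ceiling) are set on that group by whoever created it, such as a container runtime.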
What is a Container?
As we said earlier, a container is a running instance of an image, and we have taken a little journey through what “running instance” means. Finally, we can talk about what a real container looks like.
This entire vertical of a running process plus this little segment of resource that it can talk to is what we refer to as a container.
When people say “oh yeah I have a container”, you really should not think of it as a physical construct that exists inside your computer. Instead,
A container is a process or a set of processes that has a grouping of resources specifically assigned to it.