Difference between a virtual machine and container
Containers share the host OS kernel and only package the app with its bins/libs, whereas each VM ships a full guest OS on top of a hypervisor.
Container based
- Infrastructure
- Host OS
- Docker
  - Container 1
    - app A
    - bins/libs
  - Container 2
    - app B
    - bins/libs
Virtual machine based
- Infrastructure
- Hypervisor
  - VM 1
    - Guest OS
    - app A
    - bins/libs
  - VM 2
    - Guest OS
    - app B
    - bins/libs
Docker architecture
The heart of Docker is the daemon.
- Client (talks to the daemon)
  - Terminal: where you talk to your daemon
  - Remote API: lets other programs talk to the daemon
- Host (runs the daemon)
  - Images (everything that is persistent in an app)
    - If you run node, for example, you will have an image of node and all of its data
  - Containers (where the images run)
    - your images run inside containers
- Registry (where we get the images from)
  - Hub: hub.docker.com
    - Ex: NGINX, Alpine, neo4j, node, etc.
    - You can also publish your own docker images to hub.docker.com
Using docker
Dockerfile
FROM node:alpine
WORKDIR /usr/src/app
# copy the dependency manifest first so npm i has something to install (and so this layer can be cached)
COPY package*.json ./
RUN npm i
COPY . .
EXPOSE 4007
CMD ["node", "app"]
List of running containers:
docker container ps
Tell docker to build the file:
cd backendFolder
docker build .
now check what images we have:
docker images
Now that we have our image built, we want to make a container out of it. Let's first check the list of containers:
docker ps -a
Let's run the image in a container:
docker run -p 3000:4007 <imageId>
Above it is really interesting what we are doing: we are saying "map host port 3000 to container port 4007", meaning that even though our app listens on port 4007 inside the container, we can tell docker to pipe the traffic coming in on host port 3000 through to 4007.
See the <imageId> placeholder above? We need to replace it with the id of the image we built before. Using docker images we can look up that id and paste it:
docker run -p 3000:4007 7ea7bd93k339
So now if we go to the browser and go to localhost:3000 we should see our node app running.
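Alternatively (a small sketch; the tag name my-backend is just an illustration), you can tag the image at build time and run it by name instead of looking up the id:
docker build -t my-backend .
docker run -p 3000:4007 my-backend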
IPC Communicating between containers
Remember the docker logo: a big whale (the docker engine) with a set of containers on top, each one containing an image for a containerized app. One can have nodejs, the other neo4j, which is the database for the former. But how do we let docker know that these two separate containers are actually part of the same app and thus meant to run at the same time?
That is where docker compose comes into play.
It basically consists of a docker-compose.yml file:
version: '3.4'
services:
  backend_node_yt:
    build: ./backend
    command: node app
    ports:
      - "2000:4007"
    networks:
      - myytnetwork
  neo4j:
    image: neo4j:3.5
    ports:
      - "17475:7474"
      - "7688:7687"
    networks:
      - myytnetwork
networks:
  myytnetwork:
    driver: bridge
Then run docker-compose up to start the whole stack, or docker-compose down to stop and remove it.
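A few useful variations (a sketch; the service name matches the compose file above):
docker-compose up -d                       # start the whole stack in the background
docker-compose logs -f backend_node_yt     # follow the logs of one service
docker-compose down                        # stop and remove the containers and the network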
Definition
Docker containers run on top of the host's kernel and can run any OS/distribution that shares that kernel. If the OS does not share the kernel (e.g. Linux containers on Windows or macOS), Docker can still run it, but it uses a VM as a bridge.
The purpose of docker is to allow a developer to define the kind of environment that they need. Create an image of it, and then pass that image to the Operations team which will simply run that image and start the container.
Previously the dev had to explain to the ops team exactly what their app needed, and ops had to try to recreate it. With Docker, if the developer can create that image, ops is guaranteed to be able to run the exact same image, as long as their machine shares the kernel. And even when it does not, they can use the native Docker CE edition for their OS and let Docker handle the virtualization.
- image: a file specifying the software that will be run in the container
- container: a running image. Can start many containers from a single image.
- repository / hub: a place for sharing premade images (an OS base plus the software needed to run a certain application, etc.)
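A quick way to see that a container shares the host's kernel (a minimal sketch using the alpine image as an example):
docker run --rm alpine uname -r
# prints the *host's* kernel version, because the container has no kernel of its own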
Commands
- docker run: start a container from the specified image (docker will attempt to pull the image from the hub if it is not already available on the host): docker run kodekloud/simple-webapp
  When you run a docker image like this it runs in the foreground, or attached mode: it is attached to the console (standard output), so we see the output of the container on the screen, and Ctrl+C will stop it.
- docker run -d: run the container in the background (detached mode): docker run -d heroku/simple-webapp # a4dfi3023ik0idkd0823383h8347857501dkdjf9iu333
- docker run redis:4.0: the :4.0 is a tag that lets you specify the version you want to run. The default tag (when none is specified) is :latest.
- docker run -p 80:5000 my_image_accessible_outside: map incoming requests on docker host port 80 to the internal docker container port 5000. We can then use the IP of the host and port 80 to access the container's port 5000. This allows running many containers on different host ports.
- docker run -v /opt/myhostdir:/var/lib/mycontainerdir mysql: persist data generated by a container on the host. This prevents losing the data when the container is removed.
- docker ps: list running containers along with their random id and name, the image and command used to start them, the ports, and the creation time. docker ps -a: displays both active and stopped containers.
- docker stop my_container_name_or_id: stops a container. To confirm that it has stopped, run docker ps -a, which should display it in the list.
- docker rm my_container_name_or_id: removes a stopped container from the list entirely, so it frees more space than a simple stop.
- docker images: get the list of images available on the host.
- docker rmi some_image_name: removes the image from the local host. Warning: make sure no container is running from that image before you delete it.
- docker run some_image_name: if the image is not available on the host, it will be pulled from the hub.
- docker history some_image_name: outputs the layers and sizes of an image.
- docker build .: builds an image based on the Dockerfile in the current directory. It also outputs an id for every step, so a failed build can resume starting from that layer.
- docker build -t mmumshad/my-custom-app .: build an image from the Dockerfile in the current directory and tag it mmumshad/my-custom-app. The layers produced during the build are cached, so if something fails not everything needs to be rebuilt from scratch.
- You will notice that when you run an OS image, the container stops immediately, so it is not displayed by docker ps. Ubuntu is just an image of an operating system that is used as the base image for applications, so there is no long-running process in it. The container only lives as long as the process inside it is alive:
docker run ubuntu
docker ps # nothing
- docker run ubuntu sleep 5: if an image is just an OS, you need to give docker a command for the container to actually stay alive. Here we pass sleep 5, but any other command would work.
- docker run -i some_image_requiring_stdin: the -i parameter maps the standard input to the container (interactive mode). But even in interactive mode, the input is not passed to the application's prompt, because we have not attached to the container's terminal; use -t for that.
- docker run -it some_prompt: interactive mode plus attached terminal mode.
- docker pull some_hosted_image_name: pull an image from the central hub/repo in advance, to avoid waiting for the download when you later run docker run that_image.
- docker exec: run a command in a running docker container. Use docker ps to list the running containers, then use the container name to exec: docker exec my_running_container_name cat /etc/hosts
  Here we run the command cat /etc/hosts inside the running container named my_running_container_name.
- docker attach someID: reattach a container (specified by id or name) to the current console, as if it had been run in attached mode: docker attach a4dfi
- docker inspect blissful_hopper: output the docker container's information in JSON format. Gives you more details than docker ps.
- docker image inspect myimagename: same as above, for an image.
- docker logs blissful_hopper: outputs the logs of a container (useful when running in detached mode).
- docker run -e APP_COLOR=blue simple-webapp-color: sets an environment variable. To run multiple containers with different environment variable values, run the container multiple times with a different value passed to -e. Use docker inspect blissful_hopper to find out more about a container's environment variables.
Run STDIN
If we have a script that waits for some standard input, like a prompt asking for our name (for example ./app.sh), and we dockerize it and then run it:
docker run kodekloud/simple-prompt-docker
# Hello and Welcome !
It will not wait for our answer on the prompt, it will immediately output.
Important: by default, Docker does not listen to stdin
If you want docker to wait for the input, you need to map the standard input of your host to the docker container using the -it parameters.
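So the dockerized prompt example above becomes (sketch):
docker run -it kodekloud/simple-prompt-docker
# now the prompt actually waits for our name before printing the welcome message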
Run PORT mapping
Remember, the underlying host where docker is installed is called the Docker Host / Docker Engine and within it, we run Docker Containers based on Docker images.
docker run kodekloud/webapp
# running on http://0.0.0.0:5000/ (Press CTRL + C to quit)
As you can see above, the app is running on port 5000
, but what IP should we use to access it?
Use the IP of the docker container: the container gets an IP assigned by default (here 172.17.0.2). It is an internal IP, only accessible within the docker host/engine. So if we open a browser within the docker host, we can go to http://172.17.0.2:5000 and access the app. Users outside the docker host cannot access it. For them we could use the IP of the docker host, which is 192.168.1.5, but for that to work we must map the port inside the docker container to a free port on the docker host. For example, if we want to allow access from outside through port 80, we map 80:5000:
docker run -p 80:5000 kodecloud/simple-webapp
# users can access 192.168.1.5:80 to reach the internal 172.17.0.2:5000
This way you can run multiple instances of your container on the same host through different ports (see the sketch below).
You can also map to the same port:
docker run -p 3306:3306 mysql
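A minimal sketch of the multi-instance case (the host ports and image are only an illustration):
docker run -d -p 5001:5000 kodekloud/webapp
docker run -d -p 5002:5000 kodekloud/webapp
docker run -d -p 5003:5000 kodekloud/webapp
# three copies of the same app, reachable on host ports 5001, 5002 and 5003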
Run Volume mapping (Persisting data into the container)
When running for example docker run mysql
, the data is stored within the docker container in /var/lib/mysql
IMPORTANT each docker container has its own isolated filesystem, and any changes to any files, happen within the container.
Let's assume we dump a lot of data into the database, what happens if you want to delete the mysql container and remove it?
docker stop mysql
docker rm mysql
IMPORTANT as soon as you run docker rm mysql, all the data inside the container running the mysql image gets thrown away. ALL DATA IS GONE.
If you want to persist data, you need to map a directory outside the container to a directory within the container:
docker run -v /opt/datadir:/var/lib/mysql mysql
In this case above, we are sending all data that will be created within the container in /var/lib/mysql
to our host directory /opt/datadir
. This will ensure no data is deleted when we rm
the container.
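A minimal sketch of what this buys us (the container names db1/db2 and the password env var are just for illustration; the official mysql image requires MYSQL_ROOT_PASSWORD to start):
docker run -d --name db1 -e MYSQL_ROOT_PASSWORD=secret -v /opt/datadir:/var/lib/mysql mysql
# ...the app writes data, then we throw the container away
docker stop db1 && docker rm db1
# a new container mounting the same host directory still sees the old data
docker run -d --name db2 -e MYSQL_ROOT_PASSWORD=secret -v /opt/datadir:/var/lib/mysql mysql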
Creating docker images: containerizing
Why would you create an image? Aka why would you containerize an application?
- Because you cannot find an image suitable to your needs on docker hub
- Your app will be dockerized for ease of shipping and deployment
Steps
To create a docker image, first think through the steps required to set up your application environment properly.
For example:
- OS - Ubuntu
- Update apt repo
- Install dependencies using apt
- Install python dependencies using pip
- Copy source code to /opt folder
- Run the web server using "flask" command
We would achieve these steps using a Dockerfile
:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python
RUN pip install flask
RUN pip install flask-mysql
COPY . /opt/source-code
ENTRYPOINT FLASK_APP=/opt/source-code/app.py flask run
Once this file is written, build the image:
docker build -t mmumshad/my-custom-app .
If you decide to make it available on the public docker hub registry, use the push command, specifying the name of the image you just created (here mmumshad/my-custom-app, i.e. mmumshad, the account name, and my-custom-app, the image name):
docker push mmumshad/my-custom-app
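Note that pushing to the hub requires being logged in with the account that owns the image name (sketch):
docker login
docker push mmumshad/my-custom-app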
Dockerfile
A Dockerfile is a file that contains a set of instructions that docker can understand. It is in an <instruction> <argument> format.
IMPORTANT every docker image must be based on another image (using FROM <image_name>): either an OS, or another image that was itself created based on an OS. You can find official releases of all supported OSs on docker hub.
IMPORTANT all docker files must start with a FROM
instruction.
The RUN
command tells docker to run the particular command on those images (e.g. RUN apt-get update
: runs apt-get update
on the Ubuntu
docker image).
The COPY
instruction, copies files from the local system onto the docker image (e.g. COPY . /opt/source-code
: copy files in current dir onto the /opt/source-code
dir of the docker image)
The ENTRYPOINT
command allows us to specify the command that will be run when the image is run as a container.
Layered architecture
When we create a docker image from a Dockerfile, every instruction creates a new layer on top of the previous one. For example:
- FROM Ubuntu: first layer, 120 MB
- RUN apt-get update && apt-get -y install python: second layer, adds 306 MB
- RUN pip install flask flask-mysql: third layer, adds 6 MB
- COPY . /opt/source-code: fourth layer, adds 229 B
- ENTRYPOINT FLASK_APP=/opt/source-code/app.py flask run: fifth layer, adds 0 B
You can find out the size footprint for every layer in an image running the history
command followed by the image name:
docker history mmumshad/simple-webapp
What can we containerize
Everything! No one will install anything anymore; instead they will run it using docker, and when they don't need it anymore, they will remove it without having to clean up much.
Docker commands vs entrypoint
A container lives as long as the process inside it is running. If for some reason the process inside the container stops, the container exits as well.
When we run an ubuntu container, it will automatically exit. Because by default, the Dockerfile for the ubuntu image runs the command bash
as it starts (see the instruction at the bottom of Dockerfile: CMD ["bash"]
). bash
is not really a process like a webserver or a database server. It is a shell that listens for inputs from a terminal. And if it cannot find a terminal it exits.
Overriding the default command in the docker file
When you start a docker container, you can override the CMD ["some-command", "with", "params"]
, to do so you simply append them at call time docker run ubuntu <command>
:
docker run ubuntu sleep 5
With the above instruction, the ubuntu
container will no longer run its CMD ["bash"]
that was specified in the Dockerfile. Instead it will run the sleep 5
command.
To make this persistent, you can extend the Ubuntu
image and specify your own commands:
FROM Ubuntu
CMD sleep 5
# or more flexible is
CMD ["sleep", "5"]
With the above Dockerfile
, we use the base Ubuntu
image and run our own command instead.
Note: every element in the CMD JSON array must be a single token (no spaces); the command and its arguments go in separate elements, therefore CMD ["sleep 5"] is NOT allowed.
So now we just build our image and let's call it ubuntu-sleeper
and run it.
docker build -t ubuntu-sleeper .
docker run ubuntu-sleeper
But what if we want to make the 5
overridable at call time? We can pass the parameters at call time and they will replace CMD
.
docker run ubuntu-sleeper sleep 10
But having to repeat the command at call time doesn't look really good. Use ENTRYPOINT ["<command>"]
instead and combine it with CMD ["parameter"]
. The ENTRYPOINT
will always be called, and the CMD
will be used as default parameter when none is provided at call time:
FROM Ubuntu
ENTRYPOINT ["sleep"]
CMD ["5"]
You can now do:
docker build -t ubuntu-sleeper .
docker run ubuntu-sleeper 10
Notice how we are not passing sleep
at call time anymore, since the ENTRYPOINT
already has it. If we did, the sleep
would be appended to what's in ENTRYPOINT
so we would end up with say: sleep sleep 10
.
IMPORTANT: for all this to work, remember to always pass the commands in the Dockerfile to ENTRYPOINT
and CMD
in a JSON format.
NOTE: if you REALLY want to override the ENTRYPOINT
, you can use the --entrypoint
parameter:
docker run --entrypoint sleep2.0 ubuntu-sleeper 10
# The final command at startup will be
# sleep2.0 10
Networks
Docker creates 3 networks by default:
- bridge (the default)
- none
- host
By default docker run ubuntu will attach the docker container to bridge. If you want to attach it to another network, use the --network parameter:
docker run --network=none ubuntu
# or
docker run --network=host ubuntu
1. bridge network (default)
bridge
is a private internal network created by docker on the host. All containers are attached to this network by default and get assigned an internal IP address in range: 172.17.0.x
. The containers can access each other using this internal IP if required. To access any of these containers from the outside world, map the ports of these containers to ports on the docker host (as seen before).
In docker compose fashion:
my-service:
ports:
- 5000:80
Or using docker directly with the --publish
ports option -p HP:DP
:
docker run -d --name=vote -p 5000:80 voting-app
Above we are mapping TCP port 80
of the container's network (Docker Port) to the Host Port 5000.
2. host
Another way to access the containers externally, is to associate them to the host
network. This takes away any network isolation between the docker host and the docker container. Meaning: if you run a web server on port 5000
in a web app container, it is automatically accessible on the same port externally, without requiring any port mapping as the web container uses the host
network.
IMPORTANT: using the host
network also means that you cannot run multiple instances of the same image using the same port, since ports would be already in use.
3. none
With the none
network, the containers are not attached to any network and do not have any access to the external network or other containers.
User defined network
The bridge
network assigns all containers to the same private network, but what if we wanted to have a few containers in a private network1 and some others in a private network2?
We can create our own internal network using command:
docker network create \
  --driver bridge \
  --subnet 182.18.0.0/16 \
  custom-isolated-network
Now if we run the docker network ls command, we get a list of all networks with their NETWORK ID, NAME (<myname>|bridge|none|host), DRIVER (bridge|host|overlay|null) and SCOPE (local|swarm):
docker network ls
Inspecting a network
How do we see the network settings and the IP address assigned to an existing container?
docker inspect blissful_hopper
Will output a JSON formatted object containing NetworkSettings
entry with Networks
etc.
Embedded DNS
Containers can reach each other using their names. For example, let's say we are running two containers on the same node:
- web (172.17.0.2)
- mysql-container (172.17.0.3)
How can we access the mysql-container container from the web server with mysql.connect(...)?
- Through the mysql container's IP: mysql.connect(172.17.0.3). The drawback is that the IP is NOT guaranteed to stay the same when the system reboots.
- Through the container name: mysql.connect(mysql-container). Docker runs an internal DNS server that resolves container names.
IMPORTANT: the container name is basically a hostname assigned internally for a specific container. Therefore you can use the host name at all times to talk to a container.
The DNS server always runs on IP 127.0.0.11
How does docker implement networking? I.e. how come we can use a container name instead of a proper hostname, and how are the containers isolated within the host?
Docker creates a separate network namespace for each container, and then uses virtual ethernet pairs to connect the containers together.
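A minimal sketch of both ideas together (the network name my-app-net and the image my-web-image are just illustrations): on a user-defined network, the embedded DNS lets web reach mysql-container by name.
docker network create my-app-net
docker run -d --name mysql-container --network my-app-net -e MYSQL_ROOT_PASSWORD=secret mysql
docker run -d --name web --network my-app-net -p 5000:80 my-web-image
# inside "web", the hostname "mysql-container" now resolves to that container's internal IP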
Summary on Network drivers
- User-defined bridge networks are best when you need multiple containers to communicate on the same Docker host.
- host networks are best when the network stack should not be isolated from the Docker host, but you want other aspects of the container to be isolated.
- overlay networks are best when you need containers running on different Docker hosts to communicate, or when multiple applications work together using swarm services.
- macvlan networks are best when you are migrating from a VM setup or need your containers to look like physical hosts on your network, each with a unique MAC address.
- Third-party network plugins allow you to integrate Docker with specialized network stacks.
Storage: File System
How does Docker store data on the local file system?
On installation, Docker will create a directory under /var/lib/docker, where it stores all its data:
- /aufs
- /containers: files related to containers
- /image: files related to images
- /volumes: any volumes created by the docker containers
By data we mean all data concerning containers and images running on the docker host.
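To get a quick overview of how much space images, containers and volumes are using on the host (sketch):
docker system df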
Where and in what format does docker store files
Layered Architecture: remember that each line of instructions in a Dockerfile
creates a new layer in the docker image. With just the changes from the previous layers.
FROM ubuntu
RUN apt-get update && apt-get -y install python
RUN pip install flask flask-mysql
COPY . /opt/source-code
ENTRYPOINT FLASK_APP=/opt/source-code/app.py flask run
docker build -t gbili/my-custom-app .
So with the above docker file we have 5 layers (5 lines):
- Layer 1 for Ubuntu base (120MB)
- Layer 2 Changes in apt packages (306MB)
- Layer 3 Changes in pip packages (6.3MB)
- Layer 4 Source code (229B)
- Layer 5 Update Entrypoint (0B)
If we create another Dockerfile by just changing the last two lines, Docker will reuse all previous layers from cache:
# ...same as above until line 3
COPY app2.py /opt/source-code
ENTRYPOINT FLASK_APP=/opt/source-code/app2.py flask run
docker build -t gbili/my-custom-app-2 .
Docker will reuse the first 3 layers from cache, which therefore do not use any additional space on the host.
- Layer 1 for Ubuntu base (0B)
- Layer 2 Changes in apt packages (0B)
- Layer 3 Changes in pip packages (0B)
- Layer 4 Source code (229B)
- Layer 5 Update Entrypoint (0B)
NOTE: whenever you update the source code and build
, Docker will reuse the previous layers from cache and quickly rebuild the application image by updating the latest source code. Thus saving us a lot of time during rebuilds and updates.
Read Only
Whenever we build
the image, all layers are readonly
. Once the build is complete, you cannot modify the contents of these layers. The only way is to build
again.
When we run
a container based on this image, Docker creates a new writable layer (the container layer) on top of the image layers. The writable layer is used to store data generated by the container, such as log files, any type of files generated by the container, or any file modified by the user on that container.
IMPORTANT the life of the writable layer is only as long as the container runs. As soon as we kill it, the layer is destroyed.
Volume mount
We can create volumes that live in the /var/lib/docker/volumes/<my_volume_name>
.
docker volume create data_volume
# creates a volume in /var/lib/docker/volumes/data_volume
Later on, when we run the docker run
command, we can mount this volume data_volume
inside the docker containers' read-write layer using the -v
option:
docker run -v data_volume:/var/lib/mysql mysql
# New style
docker run --mount type=volume,source=data_volume,target=/var/lib/mysql mysql
where /var/lib/mysql
is the location inside my container where we want to mount that volume. Finally mysql
is the image name. Here we use /var/lib/mysql
because it is the default location where MySQL stores data. Effectively, MySQL will be writing data into the volume data_volume
Note: if the volume does not preexist, Docker will automatically create it for us.
Note: we can list the contents of the volumes using standard: ls /var/lib/docker/volumes/<my_volume_name>
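Volumes can also be listed and inspected with the docker CLI (sketch):
docker volume ls
docker volume inspect data_volume
# the "Mountpoint" field shows where the volume lives under /var/lib/docker/volumes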
Bind Mount (External storage)
Let's say we want to mount some external storage that is outside /var/lib/docker/volumes
?
We can use the :
to specify the mapping:
docker run -v /some/other/external/location:/var/lib/mysql mysql
# New style
docker run --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
So the above will create the container and mount the folder to the container. This is called bind mount.
Who is responsible for:
- maintaining the layered architecture
- creating a writable layer
- moving files across layers to enable copy and write
- etc.
It is the storage driver. The selection of the storage driver depends on the underlying OS; for example, with Ubuntu the default storage driver is AUFS. Docker chooses the best one based on the OS.
Compose
docker run mmushad/simple-webapp
docker run mongodb
docker run redis:alpine
docker run ansible
Instead of running the above in series, we can bunch them up in a docker-compose.yml
file such that they always work in tandem:
services:
  web:
    image: "mmushad/simple-webapp"
  database:
    image: "mongodb"
  messaging:
    image: "redis:alpine"
  orchestration:
    image: "ansible"
We can therefore put the services and the options specific to running them in this file, and then bring up the entire application stack with:
docker-compose up
IMPORTANT this is applicable to run the entire stack on a single docker host.
Example Application
The application will be composed of several interacting pieces:
- Python web app that lets people vote
- Redis in-memory storage to store votes temporarily
- Worker that will take the votes from redis and persist them in the permanent storage
- DB for permanent storage
- Another web app in Nodejs to display the vote results
- Start by running each container in order:
docker run -d --name=redis-app redis
docker run -d --name=db postgres:9.4
docker run -d --name=vote -p 5000:80 voting-app
docker run -d --name=result -p 5001:80 result-app
docker run -d --name=worker worker
- This will produce an error since we have not linked the containers together: we haven't told the voting-app to use this particular redis instance (there could be multiple redis instances running), and we haven't told the worker and the result app to use the particular postgres db that we created. That is where we use links. The voting-app, for example, looks up a host called redis-host:
def get_redis():
    if not hasattr(g, 'redis'):
        g.redis = Redis(host='redis-host', db=0, socket_timeout=5)
    return g.redis
- Link containers together. Instead of the plain run above, run:
docker run -d --name=vote -p 5000:80 --link redis-app:redis-host voting-app
- We are specifying the --link <container_name>:<host_name> option: the name of the container, followed by the name of the host that the voting-app is looking for.
- Note: notice how we specified --name=redis-app while creating the redis based container. This is what allows us to use that name as <container_name> in the --link option. When we use --link redis-app:redis-host, what it actually does is create an entry in the /etc/hosts file of the voting-app container, mapping redis-host to the internal IP of the redis container:
172.17.0.2 redis-host 89cdb8eb563da
- We do the same for the result-app so that it can reference the host of the postgres db:
docker run -d --name=result -p 5001:80 --link db:my-db-host result-app
pg.connect('postgress://postgress@my-db-host/postgress', function (err, client, done) {
  if (err) console.error('Waiting for db');
  callback(err, client);
});
- Finally, the worker application requires access to both the redis and the postgres containers:
try {
    Jedis redis = connectToRedis("redis-host");
    Connection dbConn = connectToDb("db");
    System.err.println("Watching vote queue");
docker run -d --name=worker --link db:db --link redis-app:redis-host worker
- The --link notation used above is outdated; there are better ways of achieving the same thing with swarm and networking.
But let's finish the example to see how we can convert these bash run instructions with --link into a docker-compose.yml:
docker run -d --name=redis-container redis
docker run -d --name=db postgres:9.4
docker run -d --name=vote-container -p 5000:80 --link redis-container:redis-host voting-image
docker run -d --name=result-container -p 5001:80 --link db:my-db-host result-app
docker run -d --name=worker --link db:db --link redis-container:redis-host worker-image
Note: --name=<container_name> gives the container its name, and at the end of the run command we have the <image_name>.
In the compose file we create an entry for each --name=<container_name>, here:
redis-container:
  image: redis
db:
  image: postgres:9.4
vote-container:
  image: voting-image
  ports:
    - 5000:80
  links:
    - redis-container:redis-host
result-container:
  image: result-app
  ports:
    - 5001:80
  links:
    - db:my-db-host
worker:
  image: worker-image
  links:
    - db
    - redis-container:redis-host
Now bringing up the entire stack up is really easy:
docker-compose up
Docker compose - build
While creating our compose file, we assumed that all images were already available. Still if they are not, we can instruct docker to build them within the docker-compose.yml
file.
To do so, we can simply replace the image:
line with a build:<./directory/containing/app/code/and/dockerfile/with/instructions>
line.
Docker compose - versions
There are many versions of docker-compose
and therefore the format of the files changes.
IMPORTANT: for docker-compose v2 and up, you need to specify the docker-compose version:
version: "2"
services:
  redis-container:
    image: redis
    #...
version 1 docker-compose
attaches all the containers to the default bridge
network and then uses links
to enable inter container communication.
version 2 docker-compose
automatically creates a dedicated bridge
network for the application, and then attaches each container to this new network. All containers are able to communicate with each other using each other's service name. Basically you don't need to use links
.
Version 2 also has a depends_on
feature to allow you to define an order of start. So if vote-app
depends on redis
we can easily say:
version: "2"
services:
  redis-container:
    image: redis
  vote-container:
    image: voting-image
    depends_on:
      - redis-container
  # ...
version 3 is similar to version 2, but comes with support for docker swarm:
version: "3"
services:
  redis-container:
    image: redis
  vote-container:
    image: voting-image
    depends_on:
      - redis-container
  # ...
Docker compose - networks
So far we have been deploying all containers on the default bridge
network. Let's modify the architecture a little bit such that we contain the traffic from the different sources.
Let's separate the front-end traffic from the back-end traffic. We will create two networks:
- front-end: dedicated to traffic from users
  - voting-container
  - result-container
- back-end: dedicated to internal traffic from the app
  - redis-container
  - worker
  - db
  - voting-container
  - result-container
We still need communication between the voting-app
and result-app
to the back-end
network, therefore we will add these two to both the front-end
and the back-end
. Let's do this in the docker-compose.yml
file:
version: "2"
services:
  redis-container:
    image: redis
    networks:
      - back-end
  db:
    image: postgres:9.4
    networks:
      - back-end
  vote-container:
    image: voting-image
    ports:
      - 5000:80
    links:
      - redis-container:redis-host
    networks:
      - front-end
      - back-end
  result-container:
    image: result-app
    ports:
      - 5001:80
    links:
      - db:my-db-host
    networks:
      - front-end
      - back-end
  worker:
    image: worker-image
    links:
      - db
      - redis-container:redis-host
    networks:
      - back-end
networks:
  front-end:
  back-end:
Basically what we do is create a top-level networks entry listing each network name we create. Then we assign each service to its networks through a networks entry inside the service config. If a service needs access to several networks, we just list them all.
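To check the result (sketch): after bringing the stack up, compose creates the networks prefixed with the project (directory) name, and they show up in the network list.
docker-compose up -d
docker network ls
# should list something like <project>_front-end and <project>_back-end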
Repository
When we run:
docker run nginx
what we are actually doing is calling:
docker run docker.io/nginx/nginx
where we are referencing an image docker.io/nginx/nginx
being <registry_host>/<user_name>/<image_name>
.
docker.io
is the default registry, but there are many others, and you can create your own.
For example gcr.io
for google, where you will find kubernetes
images etc.
Private Registry
When you have a project where images should not be accessible to the public, creating a private repository may be a great idea. AWS, Azure or GCP provide a private registry by default, when you open an account with them.
From docker's perspective to use an image from a private registry, you first login to your private registry:
docker login my-private-registry.io
Username: my-name
Password: ****
# WARNING! Your password will be stored unencrypted in /home/vagrant/.docker/config.json
# login successful
IMPORTANT: if you forget to login to your private registry, it will tell you that your image cannot be found. So NEVER FORGET TO LOGIN.
Once you are logged in, you can easily run a private registry image:
docker run my-private-registry.io/apps/internal-app
Deploy private registry
The Docker registry is of course just another image, and it is available under the name registry
:
docker run -d -p 5000:5000 --name registry registry:2
Now that we have it running under port 5000, how do we push our own image to it? Use the tag
command to tag your image with the private registry url in it.
docker image tag my-image localhost:5000/my-image
NOTE: here we use localhost
in place of the repository host because we are running it on the same machine. But make sure to replace it with the proper private repository url and its port.
Once we have tagged it, we can push the image to the repository and also pull it
docker push localhost:5000/my-image
# then we can pull it
docker pull localhost:5000/my-image
# or
docker pull 192.168.56.100:5000/my-image
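To verify what the local registry actually holds, you can query its HTTP API (sketch):
curl http://localhost:5000/v2/_catalog
# e.g. {"repositories":["my-image"]}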
Docker engine
Let's understand docker's architecture.
Docker engine simply refers to a host with docker installed on it. When you install docker on a linux host, you are actually installing 3 components:
- Docker CLI: the command line interface that we have been using so far to perform actions such as starting containers. It uses the REST API to interact with the Docker Daemon.
- REST API server: the interface that programs can use to talk to the Docker Daemon.
- Docker Daemon: the background process that manages docker objects such as images, containers, volumes and networks.
NOTE: the Docker CLI does not have to be on the same host; it could be on another system, like a laptop, and still work with a remote docker REST API. To do so, simply use the -H option to specify the remote docker engine address: docker -H=remote-docker-engine:2375. For example, to run an nginx container on the host, you can run this instruction from the laptop: docker -H=10.123.2.1:2375 run nginx
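The same thing can be done by exporting the DOCKER_HOST environment variable instead of repeating -H (sketch; the address is the example one from above):
export DOCKER_HOST="tcp://10.123.2.1:2375"
docker ps   # now talks to the remote daemon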
Containerization
Docker uses namespaces to isolate workspaces:
- Process ID
- Network
- InterProcess
- Mount
- Unix Timesharing
All of the above are created in their own namespace in order to provide isolation between containers.
Namespace - PID
Is one of the isolation techniques.
Whenever a Linux system starts, it begins with a single process with a PID of 1; this is the root process, and it kicks off all the other processes in the system:
- PID: 1
- PID: 2
- PID: 3
- PID: 4
Now, if we were to create a container (as a child system within Linux), the child system needs to think that it is an independent system on its own, with its own set of processes rooted at a process with PID 1. But in reality the contained processes are still running on the host, and there cannot be two processes with the same PID 1. This is where namespaces come into play.
Whenever a process starts in the child system (the container), it gets assigned the next available PID on the Linux host, in this case PID 5, PID 6, etc. BUT at the same time it gets assigned PID 1 within the container's own namespace (this PID is only visible inside the container). Thus the container thinks it has its own process tree and so believes it is an independent system.
So even though the container is running as process on the main host, the container thinks (thanks to namespaces) that the processes within itself have these PID:
- PID: 1
- PID: 2
But in reality, assuming the container started as the process with PID 5 on the main host, these namespaced processes look like the following on the main host:
- PID: 1
- PID: 2
- PID: 3
- PID: 4
- PID: 5 (container believes that it is PID:1)
- PID: 6 (container believes that it is PID:2)
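A small sketch to observe this (the container name pid-demo is just an illustration):
docker run -d --name pid-demo ubuntu sleep 1000
docker exec pid-demo cat /proc/1/comm   # inside the container, PID 1 is "sleep"
docker top pid-demo                     # the host view shows the same process with a much higher PID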
cgroups
Each Docker container running on the same host, will have to share CPU and Memory resources of the host with the other containers.
How are these resources allocated between containers?
By default there are no restrictions on resources usable by a container. Hence a container may end up using all the resources of the underlying host. But there is a way to restrict those using cgroups.
docker run --cpus=.5 --memory=100m ubuntu
The above starts an ubuntu based container that is not allowed to use more than 50% of CPU and 100MB of memory at any given time.
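You can watch the effect of these limits with the live resource view (sketch):
docker stats
# shows per-container CPU %, memory usage / limit, network and block I/O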
Container Orchestration
Is required when you have to start running multiple instances of the same application to handle an increase in traffic.
If a container was to fail, you should be able to know about it and start off another container with the same image to replace it.
If the host itself crashes, the containers on that host become inaccessible as well.
To solve these issues without orchestration, you would need a dedicated engineer sitting and watching over the health and performance of the containers.
Container Orchestration is a set of tools and scripts that can help host containers in a production environment. Typically a container orchestration solution consists of multiple hosts, if one fails the application is still served by the others:
docker service create --replicas=100 nodejs
See the --replicas option: it allows you to deploy hundreds or thousands of instances of your application with a single command. This is the command used with docker swarm.
Docker Swarm
With docker swarm, you can combine multiple machines together into a single cluster
Docker will take care of distributing your different application instances into separate hosts for high availability and for load balancing across different systems and hardware.
Setup swarm
To set up swarm you must first have a set of hosts with docker installed on them.
Then you must designate one of them to be the manager / master or Swarm Master and the others as slaves or workers.
Once you are done with that, execute:
docker swarm init --advertise-addr 102.168.1.12
# Swarm initialized: current node (0j3k3j3893d9dhdkj9j95857hnvbvn) is now a manager
# to add a worker to this swarm run the following command
To add a worker to the swarm, run the join command printed by swarm init on each worker, passing the token and the manager's address (2377 is the default swarm port):
docker swarm join --token 0j3k3j3893d9dhdkj9j95857hnvbvn 102.168.1.12:2377
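Back on the manager, you can verify that the workers joined (sketch):
docker node ls
# lists every node in the swarm with its status and whether it is a manager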
Docker service
As we already know, when we want to run, say, a web server for which we have built an image named "my-web-ser", we run:
docker run my-web-ser
So now that we have learned how to create a swarm cluster, how do we utilize the cluster to run multiple instances of my web server?
One (not ideal) way to do this would be to run the docker run
command on each worker node. But this would require lots of repetition if there are 100s of nodes, we'd have to manage the instances ourselves, if they fail restart manually etc.
The key component of swarm orchestration is the docker service. A docker service is one or more instances of a single application that run across the nodes of a swarm cluster and together form a single service. To create one we use:
docker service create --replicas=3 my-web-ser
IMPORTANT: the service command has to be run on the manager node, not on a worker node.
The docker service command is similar to the docker run command in terms of the options it accepts (-p, --network, etc.).
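Once the service is created, you can inspect it from the manager (sketch):
docker service ls                # one line per service with its replica count
docker service ps my-web-ser     # shows on which node each replica (task) is running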
Kubernetes
Kubernetes cluster consists of a set of nodes. A node is a machine: physical or virtual, on which the kubernetes software tools are installed. A node (or minion) is a worker machine, and that is where the containers will be launched by kubernetes.
But what if a node fails? Obviously our application will go down.
A cluster is a set of nodes grouped together. So whenever one of the apps or nodes fails, you still have the other nodes within the cluster running and serving your app.
Who is responsible for managing the cluster? Where are the information about the members of the cluster stored? How do the nodes monitor when one of them fails? How do we monitor the workload etc? That is where the master comes in. The master lives inside the cluster as well.
Kubernetes components:
- etcd
- scheduler
- kubelet
- controller
- container runtime
- API server
(The API server, etcd, controller and scheduler run on the master; the kubelet and the container runtime run on the worker nodes.)
kubectl
kubectl run hello-minikube
kubectl cluster-info
kubectl get nodes
kubectl run my-web-app --image=my-web-app --replicas=1000
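A couple of follow-up commands (a sketch; my-web-app is the deployment created above, assuming kubectl run created a deployment as it did in older kubectl versions):
kubectl get pods                                       # list the running pods
kubectl scale deployment my-web-app --replicas=2000    # change the number of replicas later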