OpenShift 4 in an Air Gap (disconnected) environment (Part 2 — installation)

Oren Oichman
22 min read · Jan 18, 2020


The Installation

In this part we will focus on the deployment itself and on what we need to do for the installation process to succeed in our air-gapped (disconnected) environment.
If you need any help with setting up the infrastructure prior to the installation, you can refer to the first part of this article (should be published by the end of February).

Author’s advice

Copy this article to a PDF and make sure it is available in your air-gapped environment, to avoid typos or any other misconfigurations.

Scenario

For a very basic scenario we will need the following servers:

  • 1 external server (preferably RHEL 7/8)
  • 1 installation server (preferably RHEL 7/8)
  • 1 guest VM without an OS for the bootstrap (will be RHCOS)
  • 3 guest VMs without an OS for the masters (will be RHCOS)
  • 3 guest VMs without an OS for the workers (will be RHCOS)

All servers must meet the minimum requirements in the OpenShift documentation, which are 4 vCPUs and 16G of memory for each node.

I assume that you have the external and internal server already installed at this point.

Creating the registry

Before we begin, as I always recommend, let’s start a screen session in case we get disconnected for any reason:

$ screen -S ocp
# or: tmux new-session -s ocp

First let’s create a base directory for the repository on the external server.
For the purpose of this document I will refer to this server as “external”.

On the external server run the following commands:

$ mkdir /opt/registry
$ export REGISTRY_BASE="/opt/registry"

Now let’s create the directories we need for the repository and for everything we will want to take to the internal server:

$ mkdir -p ${REGISTRY_BASE}/{auth,certs,data,downloads}
$ mkdir -p ${REGISTRY_BASE}/downloads/{images,tools,secrets}

Here is a simple but tricky part: we want the registry to have the same name here as it will have in the internal LAN, but we probably do not want to write our internal domain on an external server, so we will use the short hostname and not the FQDN.

We will edit the /etc/hosts file of the external server and add a “registry” record to it:

$ vi /etc/hosts
127.0.0.1 registry

From now on our registry will be named “registry”.
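
If you prefer not to edit the file by hand, the same change can be scripted so that it is safe to run more than once. This is only a minimal sketch: add_hosts_entry is a hypothetical helper name of my own, and the demonstration works on a scratch file instead of the real /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME does
# not already appear as a whole word on some line of that file.
add_hosts_entry() {
  hosts_file=$1; ip=$2; name=$3
  if ! grep -qw -- "$name" "$hosts_file"; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}

# Demonstration on a scratch file (point it at /etc/hosts, as root, for real use).
demo_hosts=$(mktemp)
printf '127.0.0.1 localhost\n' > "$demo_hosts"
add_hosts_entry "$demo_hosts" 127.0.0.1 registry
add_hosts_entry "$demo_hosts" 127.0.0.1 registry   # second call changes nothing
cat "$demo_hosts"
```

Note that the whole-word match also treats a line with “registry.example.com” as already containing “registry”, which is good enough for this tutorial but may not be what you want elsewhere.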

Packages

Both the external and the internal server will need a few packages to make our work easier, so let’s go ahead and install them:
(you will need to enable EPEL in order to obtain all the tools)

$ yum install -y jq openssl podman p7zip httpd-tools curl wget screen nmap telnet ftp tftp openldap-clients tcpdump wireshark xorg-x11-xauth tmux net-tools nfs-utils sg3_utils bind-utils rlwrap uucp

NOTE

If you choose to use docker (and the docker service) instead of podman, make sure you stay consistent with this decision throughout the tutorial.

Next we will create a kind of “answer file” for our self-signed certificate:

$ cd ${REGISTRY_BASE}/certs/
$ cat >csr_answer.txt << EOF
[req]
default_bits = 4096
prompt = no
default_md = sha256
distinguished_name = dn
[ dn ]
C=US
ST=New York
L=New York
O=MyOrg
OU=MyOU
emailAddress=me@working.me
CN = registry
EOF

Change the values under the [ dn ] section as you see fit (here it does not really matter).

Now let’s generate the self-signed certificate:

$ openssl req -newkey rsa:4096 -nodes -sha256 -keyout domain.key -x509 -days 365 -out domain.crt -config <( cat csr_answer.txt )

The output of this command will be 2 new files which we will use for our registry’s SSL certificate:

$ ls -al
total 20
drwxr-xr-x. 2 root root 4096 Jan 8 13:49 .
drwxr-xr-x. 7 root root 4096 Jan 8 09:57 ..
-rw-r--r--. 1 root root 175 Jan 8 13:48 csr_answer.txt
-rw-r--r--. 1 root root 1972 Jan 8 13:49 domain.crt
-rw-r--r--. 1 root root 3272 Jan 8 13:49 domain.key

Also, if needed and you haven’t done so already, make sure you trust the self-signed certificate. This is needed in order for oc to be able to login to your registry during the mirror process.

$ cp ${REGISTRY_BASE}/certs/domain.crt /etc/pki/ca-trust/source/anchors/
$ update-ca-trust extract

Generate a username and password (must use bcrypt formatted passwords), for access to your registry.

$ htpasswd -bBc ${REGISTRY_BASE}/auth/htpasswd myuser mypassword

First get your firewalld zone:

$ export FIREWALLD_DEFAULT_ZONE=`firewall-cmd --get-default-zone`
$ echo ${FIREWALLD_DEFAULT_ZONE}
public

My output is “public”, but for you it may be “dmz”, “internal” or another zone.

Make sure to open port 5000 on your host, as this is the default port for the registry.

$ firewall-cmd --add-port=5000/tcp --zone=${FIREWALLD_DEFAULT_ZONE} --permanent
$ firewall-cmd --reload

Now you’re ready to run the container. Here I specify the directories I want to mount inside the container. I also specify that I want it to listen on port 5000 and run in daemon (detached) mode.
I recommend you put this in a shell script under ${REGISTRY_BASE}/downloads/tools so it will be easy to run it again on the internal server:

$ echo 'podman run --name my-registry --rm -d -p 5000:5000 \
-v ${REGISTRY_BASE}/data:/var/lib/registry:z \
-v ${REGISTRY_BASE}/auth:/auth:z -e "REGISTRY_AUTH=htpasswd" \
-e "REGISTRY_AUTH_HTPASSWD_REALM=Registry" \
-e "REGISTRY_HTTP_SECRET=ALongRandomSecretForRegistry" \
-e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
-v ${REGISTRY_BASE}/certs:/certs:z \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
docker.io/library/registry:2' > ${REGISTRY_BASE}/downloads/tools/start_registry.sh

I am using “echo” with single quotes here instead of “cat” with a heredoc because I want to keep our variables unexpanded within the command.
That allows us to select a different base directory for our internal registry.
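
The quoting behaviour is easy to check in isolation. The sketch below is my own illustration (the file names literal.sh and expanded.sh are made up) and writes only to a scratch directory: a heredoc with a quoted delimiter behaves like the single-quoted echo and keeps the variable as text, while an unquoted delimiter expands it at write time.

```shell
#!/bin/sh
# Scratch directory so nothing under the real REGISTRY_BASE is touched.
demo_dir=$(mktemp -d)
REGISTRY_BASE=/opt/registry

# Quoted delimiter ('EOF'): the body is written literally, like echo '...'.
cat > "$demo_dir/literal.sh" << 'EOF'
echo "data dir: ${REGISTRY_BASE}/data"
EOF

# Unquoted delimiter (EOF): the shell expands the variable while writing.
cat > "$demo_dir/expanded.sh" << EOF
echo "data dir: ${REGISTRY_BASE}/data"
EOF

cat "$demo_dir/literal.sh" "$demo_dir/expanded.sh"
```

The first file still contains ${REGISTRY_BASE}, so it picks up whatever value is exported when the script runs; the second one has /opt/registry baked in.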

Now change the file permissions and run it:

$ chmod a+x ${REGISTRY_BASE}/downloads/tools/start_registry.sh
$ ${REGISTRY_BASE}/downloads/tools/start_registry.sh

Verify connectivity to your registry with curl. Provide it the username and password you created.

$ curl -u myuser:mypassword -k https://registry:5000/v2/_catalog 
{"repositories":[]}

This should return an “empty” repository for now

Syncing the repository

First let’s grab the build version we are going to download.
It is located in the release.txt file in the OpenShift download directory.
While we are at it, let’s put the output in an “env” file so we can use it on our internal server:

$ export OCP_RELEASE=$(curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/release.txt | grep 'Name:' | awk '{print $NF}')
$ echo "export OCP_RELEASE=${OCP_RELEASE}" >> ${REGISTRY_BASE}/downloads/tools/env_ocp
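
To see what the grep and awk part of that pipeline does without touching the network, here is a small sketch. The sample text is a hypothetical stand-in for release.txt (only the “Name:” line matters), and parse_release_name is a helper name I made up.

```shell
#!/bin/sh
# Print the last field of the first line containing "Name:" read from stdin.
parse_release_name() {
  awk '/Name:/ { print $NF; exit }'
}

# Hypothetical excerpt standing in for the downloaded release.txt.
sample_release_txt='Client tools for OpenShift
Name:      4.3.0
Digest:    sha256:0000'

OCP_RELEASE=$(printf '%s\n' "$sample_release_txt" | parse_release_name)

# An empty value here usually means the download failed or the format changed.
if [ -z "$OCP_RELEASE" ]; then
  echo "could not determine the release" >&2
fi
echo "OCP_RELEASE=${OCP_RELEASE}"
```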

Next let’s download and untar the oc binary, which we will use for syncing the repositories:

$ wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux-${OCP_RELEASE}.tar.gz -P ${REGISTRY_BASE}/downloads/tools/
$ tar -xzf ${REGISTRY_BASE}/downloads/tools/openshift-client-linux-${OCP_RELEASE}.tar.gz -C ${REGISTRY_BASE}/downloads/tools/
$ ln -s ${REGISTRY_BASE}/downloads/tools/oc /usr/local/bin/oc

Next we will download the kernel, the initramfs and the raw image files we need for the PXE installation.

$ export OCP_VERSION=4.6
$ echo "export OCP_VERSION=4.6" >> ${REGISTRY_BASE}/downloads/tools/env_ocp

Grab the ISO version :

$ OCP_ISO_VERSION=$(curl -s https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/sha256sum.txt | grep live| awk -F\- '{print $2}' | head -1)
$ echo ${OCP_ISO_VERSION}

Now let’s start the downloads:

$ wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-${OCP_ISO_VERSION}-x86_64-live-initramfs.x86_64.img -P ${REGISTRY_BASE}/downloads/images/
$ wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-${OCP_ISO_VERSION}-x86_64-live-kernel-x86_64 -P ${REGISTRY_BASE}/downloads/images/
$ wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-${OCP_ISO_VERSION}-x86_64-metal.x86_64.raw.gz -P ${REGISTRY_BASE}/downloads/images/

Obtaining the pull secret

In order to obtain the pull secret we will need to go to :

  1. Log in with our Red Hat credentials
  2. Click on the “create cluster” button
  3. Select “OpenShift Container Platform”
  4. Select our installation type (select Bare Metal even if you are planning to deploy on VMs)
  5. Download the pull secret (or click on “copy Pull Secret”)

Make sure you place the secret in a file called pull-secret.json under the “${REGISTRY_BASE}/downloads/secrets/” directory.

For convenience we give the file a .json suffix:

$ cd ${REGISTRY_BASE}/downloads/secrets/
$ cat > pull-secret.json << EOF
(CTRL+V)
EOF

To make sure the file is valid we will use the jq command. It will also take the one-line file and spread it out in a human-readable layout.

$ jq . pull-secret.json

If the previous command printed clean JSON to the screen, we can now add the registry credentials to the file.

First we will generate a base64 output from our user+password string:

$ echo -n 'myuser:mypassword' | base64 -w0

And now let’s put it in a variable:

$ REG_SECRET=`echo -n 'myuser:mypassword' | base64 -w0`
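
A quick round trip is a cheap way to confirm that the encoded value is what the registry will expect. The sketch below wraps the same base64 call in a helper (registry_auth is a name I made up) and decodes the result again; it uses the tutorial’s example credentials.

```shell
#!/bin/sh
# Encode "user:password" the way the "auth" field of a pull secret expects it.
registry_auth() {
  printf '%s' "$1:$2" | base64 -w0
}

REG_SECRET=$(registry_auth myuser mypassword)
echo "encoded: ${REG_SECRET}"

# Decoding must give back the original string; a stray newline or a
# typographic quote in the input would show up here.
echo "decoded: $(printf '%s' "$REG_SECRET" | base64 -d)"
```

If the decoded line shows anything other than myuser:mypassword (for example curly quotes copied from a web page), regenerate the value before continuing.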

And now let’s create a bundle JSON file with all the registries:

$ cat pull-secret.json | jq '.auths += {"registry:5000": {"auth": "REG_SECRET","email": "me@working.me"}}' | sed "s/REG_SECRET/$REG_SECRET/" > pull-secret-bundle.json

Now test the file again to make sure it is valid:

$ jq . pull-secret-bundle.json

We will also need a small file with only the registry login for the install-config.yaml file once we are inside, so we will create it now:

$ echo '{ "auths": {}}' | jq '.auths += {"registry:5000": {"auth": "REG_SECRET","email": "me@working.me"}}' | sed "s/REG_SECRET/$REG_SECRET/" | jq -c . > pull-secret-registry.json

NOTE!!!

If you are using your own CA at this point make sure to copy it and not the domain.crt file

Let’s export a few more variables for the mirroring process:

$ export LOCAL_REGISTRY='registry:5000'
$ export OCP_RELEASE="${OCP_RELEASE}-x86_64"
$ export LOCAL_REPOSITORY='ocp/openshift4'
$ export PRODUCT_REPO='openshift-release-dev'
$ export LOCAL_SECRET_JSON="${REGISTRY_BASE}/downloads/secrets/pull-secret-bundle.json"
$ export RELEASE_NAME="ocp-release"

We want to save those variables as well:

$ echo "export LOCAL_REGISTRY='registry:5000'" >> ${REGISTRY_BASE}/downloads/tools/env_ocp
$ echo '[[ ! ${OCP_RELEASE} =~ 'x86_64' ]] && export OCP_RELEASE="${OCP_RELEASE}-x86_64"' >> ${REGISTRY_BASE}/downloads/tools/env_ocp
$ echo "export LOCAL_REPOSITORY='ocp/openshift4'" >> ${REGISTRY_BASE}/downloads/tools/env_ocp
$ echo "export PRODUCT_REPO='openshift-release-dev'" >> ${REGISTRY_BASE}/downloads/tools/env_ocp
$ echo 'export LOCAL_SECRET_JSON="${REGISTRY_BASE}/downloads/secrets/pull-secret-bundle.json"' >> ${REGISTRY_BASE}/downloads/tools/env_ocp
$ echo 'export RELEASE_NAME="ocp-release"' >> ${REGISTRY_BASE}/downloads/tools/env_ocp
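
The guard around OCP_RELEASE in the env file exists so that sourcing the file more than once does not append the suffix twice. Here is a minimal sketch of the same idea on a scratch file; I use a POSIX “case” instead of [[ ... =~ ... ]] only so that it also runs under plain sh, and the release number is a made-up example.

```shell
#!/bin/sh
# Scratch env file with the same kind of guard as the one saved to env_ocp.
demo_env=$(mktemp)
cat > "$demo_env" << 'EOF'
case "${OCP_RELEASE}" in
  *x86_64*) ;;                                      # suffix already present
  *) export OCP_RELEASE="${OCP_RELEASE}-x86_64" ;;  # add it exactly once
esac
EOF

OCP_RELEASE=4.3.0        # hypothetical release number
. "$demo_env"
. "$demo_env"            # sourcing twice must not append the suffix twice
echo "${OCP_RELEASE}"
```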

Start the Mirroring

Now let’s start the “oc release” mirroring :

$ oc adm -a ${LOCAL_SECRET_JSON} release mirror \
--from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE} \
--to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} \
--to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE} \
2>&1 | tee ${REGISTRY_BASE}/downloads/secrets/mirror-output.txt

This process should take between one and two hours, depending on your internet bandwidth.

Generating the openshift-install binary

This is the most important part of the installation, so don’t skip it!
In order to create an installation program that is based on the content and name of the registry you’ve just mirrored, we will run the “oc” command, which will extract an “openshift-install” binary that fits our needs.

$ cd ${REGISTRY_BASE}/downloads/tools/
$ oc adm -a ${LOCAL_SECRET_JSON} release extract --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}"
$ echo $?

This binary, named “openshift-install”, will be the command for the installation itself.
(basically we are telling openshift-install to work with our internal registry)
The “echo $?” at that point should print “0”, which tells us that the command succeeded.

Install config

Now we can create our install-config.yaml file, which will be needed for our installation process. The reason we are doing it now is to save us a few typos and to make sure we bring everything we need from the internet into our air-gapped environment.

NOTE!!!

The file name must be "install-config.yaml".
This is the file our installation command expects to read from.
This is how the file should look like:

$ cd ${REGISTRY_BASE}/downloads/tools
$ cat > install-config.yaml << EOF
apiVersion: v1
baseDomain: example.com
controlPlane:
  name: master
  hyperthreading: Enabled
  replicas: 3
compute:
- name: worker
  hyperthreading: Enabled
  replicas: 3
metadata:
  name: test-cluster
networking:
  clusterNetworks:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 172.18.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths": ...}'
sshKey: 'ssh-ed25519 AAAA...'
additionalTrustBundle: |
     -----BEGIN CERTIFICATE-----
     <...base-64-encoded, DER - CA certificate>
     -----END CERTIFICATE-----
EOF

That is all we need for now , the rest we will generate from the output of our mirroring command and from our internal CA certificate and our SSH public key.

Saving the Registry

After we have completed the mirroring and generated the binary files, the only thing left is making sure we work with the same registry on the internal server as we did on the external server so far.
In order to achieve that we simply export the registry image to a tar file and save it in our REGISTRY_BASE directory, but first we will stop the registry:

$ podman stop my-registry

$ podman rm --force my-registry
$ podman save docker.io/library/registry:2 -o ${REGISTRY_BASE}/downloads/images/registry.tar

Generating the TAR files

Since we put everything under one directory, all we need to do is tar it into a single file, create a checksum of this file and split it so it will be easier to import into the air-gapped environment:

$ cd ${REGISTRY_BASE}
$ tar -zcf ocp43-registry.tar.gz *
$ md5sum ocp43-registry.tar.gz
8103e6d50b622c0879b602f187d26327 ocp43-registry.tar.gz

In order to avoid handling one huge file, it is recommended to split the 10G file into 10 x 1G files.

$ split -b 1G ocp43-registry.tar.gz "ocp43-registry.tar.gz.part"

Now take all the files and put them in a CD/USB and bring them into your environment.
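
When the parts travel on several disks it can help to checksum every part as well, so that a single damaged part can be identified and copied again. This is an optional addition of mine, not a required step; the sketch runs on a small dummy file in a scratch directory, and the demo.* file names are placeholders.

```shell
#!/bin/sh
# Work in a scratch directory with a small stand-in for the real archive.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
head -c 300000 /dev/zero | tr '\0' 'x' > demo.tar.gz   # 300 kB of dummy data

# Split into 100 kB parts and record a checksum for every part.
split -b 100000 demo.tar.gz demo.tar.gz.part
md5sum demo.tar.gz.part* > demo.parts.md5

# On the receiving side: verify each part, then reassemble and compare.
md5sum -c demo.parts.md5
cat demo.tar.gz.part* > rebuilt.tar.gz
original_sum=$(md5sum < demo.tar.gz)
rebuilt_sum=$(md5sum < rebuilt.tar.gz)
echo "original: ${original_sum}"
echo "rebuilt:  ${rebuilt_sum}"
```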

Generating the 7zip files

Alternatively, it is much easier to split the file with 7zip:

$ 7za a -t7z -v1g -m0=lzma -mx=9 -mfb=64 -md=32m -ms=on ocp43-registry.7z ${REGISTRY_BASE}

That will generate 1G volume files for the registry directory.

Deploying the Registry internally

This tutorial continues from the point where all the files are in the “Air Gap” environment.
Everything we do from now on will be done on the internal server!!!
It will also be referred to as the installation (or bastion) server.

As we are now working in a disconnected environment, the installation expects the following DNS records to resolve (I use example.com but you should make sure it is in your own domain scope):

$ORIGIN example.com.
ntp A RECORD
registry A RECORD
bastion A RECORD
bootstrap A RECORD
master-01 A RECORD
master-02 A RECORD
master-03 A RECORD
worker-01 A RECORD
worker-02 A RECORD
worker-03 A RECORD
$ORIGIN ocp4.example.com.
control-plane-0 A RECORD (master-01)
control-plane-1 A RECORD (master-02)
control-plane-2 A RECORD (master-03)
etcd-0 A RECORD (master-01)
etcd-1 A RECORD (master-02)
etcd-2 A RECORD (master-03)
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-0
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-1
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-2
ocp4-bootstrap A RECORD (bootstrap)
bootstrap-0 A RECORD
api A RECORD (Load Balancer VIP)
api-int A RECORD (Load Balancer VIP)
$ORIGIN apps.ocp4.example.com.
* A RECORD (Load Balancer VIP)

First let’s take all our files and bring them together:

$ cat ocp43-registry.tar.gz.part* > ocp43-registry.tar.gz
#(you can skip this part if you are using 7zip)

In order to verify that the files were transferred correctly to the air-gapped environment, compare the md5sum output from before with the current md5sum output.

$ md5sum ocp43-registry.tar.gz
8103e6d50b622c0879b602f187d26327 ocp43-registry.tar.gz

We need to make sure the internal server has the same tools as the external one, so let’s install all the relevant packages:

$ yum install -y jq ftp openssl iperf3 weechat p7zip curl wget tftp telnet podman httpd-tools tcpdump nmap net-tools screen tmux bind-utils nfs-utils sg3_utils nmap-ncat rlwrap uucp openldap-clients xorg-x11-xauth wireshark unix2dos unixODBC policycoreutils-python-utils vim-*

Now let’s create our base registry directory here and untar all the content into it:

NOTE

Make sure you have at least 15G of space available in your destination directory:

$ mkdir /opt/registry
$ export REGISTRY_BASE=/opt/registry
$ tar -zxvf ocp43-registry.tar.gz -C ${REGISTRY_BASE}
#(or '7za x ocp43-registry.7z.001')

If you are planning to use a different DNS name than “registry”, you need to run the process again with the different name.
What I do recommend is to generate a new certificate. It will make your life easier if you use your organization’s CA from now on, because that will save you a lot of pain in the advanced configuration if you take care of it during the installation (we will see how later on). Be consistent with your CA signer; as long as you don’t change it (or move away from the CA chain) you will be just fine.

We need to source the environment variable we defined in the env_ocp file

$ source ${REGISTRY_BASE}/downloads/tools/env_ocp

Now let’s remove the old certificates:

$ cd ${REGISTRY_BASE}/certs
$ rm -f domain.*

Now let’s edit the csr_answer.txt and add alternative names with your internal domain (I will use example.com and example.local):

$ cat > csr_answer.txt << EOF
[req]
default_bits = 4096
prompt = no
default_md = sha256
x509_extensions = req_ext
req_extensions = req_ext
distinguished_name = dn

[ dn ]
C=US
ST=New York
L=New York
O=MyOrg
OU=MyOU
emailAddress=me@working.me
CN = registry

[ req_ext ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = registry
DNS.2 = registry.example.com
DNS.3 = registry.example.local
EOF

Now let’s generate the CSR (certificate signing request):

$ openssl req -new -newkey rsa:4096 -nodes -sha256 -keyout domain.key -out domain.csr -config <( cat csr_answer.txt )

You can make sure the DNS alternative names are in the request by running:

$ openssl req -in domain.csr -noout -text | grep DNS

Now all you need to do is to get the certificate signed. Make sure you make a copy of your CA (or chain) and call it ca.crt; we will need it for the install-config.yaml file.

The registry certificate file needs to be named domain.crt, so we will concatenate the certificate you received from your internal CA and ca.crt into that single file:

$ cat new-cert.crt ca.crt > ${REGISTRY_BASE}/certs/domain.crt

Now, since we believe in an easy life, all we need to do is import the registry image, grab the image ID, retag it to our required path (same as the external path) and run the same start script that we used on the external server:

$ podman load -i ${REGISTRY_BASE}/downloads/images/registry.tar
$ podman image list
REPOSITORY   TAG      IMAGE ID       CREATED       SIZE
<none>       <none>   708bc6af7e5e   6 weeks ago   26.3 MB
$ podman tag 708bc6af7e5e docker.io/library/registry:2
$ ${REGISTRY_BASE}/downloads/tools/start_registry.sh

Open the firewall port like you did on the external server:

$ export FIREWALLD_DEFAULT_ZONE=`firewall-cmd --get-default-zone`
$ firewall-cmd --add-port=5000/tcp --zone=${FIREWALLD_DEFAULT_ZONE} --permanent
$ firewall-cmd --reload

Service

Because it is our internal server and OpenShift needs the registry available

NOTE!

If you are getting errors at this point, make sure that docker is not running, OR run the same command with docker if you wish.
This part can be a bit tricky but it is very important in this tutorial.

Now let’s use the same curl from before to make sure the registry is available:

$ curl -u myuser:mypassword -k https://registry:5000/v2/_catalog
{"repositories":["ocp/openshift4"]}

I’m sure you noticed that this time the output lists a repository, which is our synced repository.

Installation

For the installation itself I recommend creating a new user rather than using root, mostly because it is best practice. More specifically, the rest of the installation process does not require root privileges, so there is no need to use it.

$ useradd ocp

Before we continue I suggest making sure that our newly created user is able to access everything it needs.
For that we will use Linux ACLs and a tool called alternatives.

ACL

The first thing we need to do is make sure our user “ocp” has access to the directories (and sub directories) we created in our infrastructure (read part 1 for more information).

So first let’s grant read, write and execute permissions on our download directory and make sure they are the default for future directories:

$ setfacl -m u:ocp:rwx -R ${REGISTRY_BASE}
$ setfacl -d -m u:ocp:rwx -R ${REGISTRY_BASE}

And we will do the same for our httpd pub directory, which we created in part 1 (or we can create it now):

$ mkdir /var/www/html/pub
$ mkdir /var/www/html/pub/{pxe,ign}
$ setfacl -m u:ocp:rwx -R /var/www/html/pub
$ setfacl -d -m u:ocp:rwx -R /var/www/html/pub

Alternatives

In our world development moves fast, so we need a simple tool that keeps us (operators) focused on the job and not on the versioning, which is where alternatives comes into play.
The tool gives us the ability to keep multiple versions of the same binary and set a different default depending on our needs. Moreover, it is a great tracking mechanism for what has been installed and used.

We will use this tool for our openshift-install, oc and kubectl binaries.
First let’s build a directory where we will store them.

$ mkdir /opt/openshift
$ mkdir /opt/openshift/bin

Copy all our binaries to the directory:

$ source ${REGISTRY_BASE}/downloads/tools/env_ocp
$ cp ${REGISTRY_BASE}/downloads/tools/oc /opt/openshift/bin/oc-${OCP_RELEASE}
$ cp ${REGISTRY_BASE}/downloads/tools/kubectl /opt/openshift/bin/kubectl-${OCP_RELEASE}
$ cp ${REGISTRY_BASE}/downloads/tools/openshift-install /opt/openshift/bin/openshift-install-${OCP_RELEASE}

Now let’s create an alternatives entry for each of the binaries:

$ alternatives --install /usr/bin/oc oc /opt/openshift/bin/oc-${OCP_RELEASE} 10
$ alternatives --install /usr/bin/kubectl kubectl /opt/openshift/bin/kubectl-${OCP_RELEASE} 10
$ alternatives --install /usr/bin/openshift-install openshift-install /opt/openshift/bin/openshift-install-${OCP_RELEASE} 10

Bash auto completion

To make our life easier, the tools ship with a set of templates that enable us to use them with bash auto completion.
To generate the bash auto completion scripts run the following commands:

$ yum install -y bash-completion.noarch bash-completion-extras.noarch
$ oc completion bash > /etc/bash_completion.d/oc
$ openshift-install completion bash > /etc/bash_completion.d/openshift-install
(log out and log in again to use it)

Install config

To get the installation running we need to switch to our new user and start a screen session:

$ su - ocp 
$ screen -S ocp

Generate SSH key

One of the keys we need to add to the installation template is the public SSH key of the user that will be able to log in as the “core” user on our cluster servers.
In order to generate the key run the ssh-keygen command:

$ ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa

Now, under our new user’s home directory, we will create a new directory named “ocp4” (or whatever you want) and switch to it:

$ mkdir ocp4
$ cd ocp4

Let’s set a few variables that will come in handy.
If we want, we can capture the repository name in an environment variable which we will use later on:

$ INTERNAL_REPO_NAME=`curl -s -u myuser:mypassword -k https://registry:5000/v2/_catalog | jq -r '.repositories[]' | grep ocp`
#(or export INTERNAL_REPO_NAME="ocp/openshift4" )

You should see your repository in the variable:

$ echo $INTERNAL_REPO_NAME
ocp/openshift4

Optional

If you have other repositories and you see a comma at the end of the output , then you can remove it with the following command :

$ INTERNAL_REPO_NAME=`echo ${INTERNAL_REPO_NAME} | sed 's/\,//'`

If you remember (and you do), we created a template for our “install-config.yaml” file in the “${REGISTRY_BASE}/downloads/tools/” directory, so let’s copy it from there:

$ cp ${REGISTRY_BASE}/downloads/tools/install-config.yaml ~/ocp4/

Now we need to add our internal CA (which signed the certificate for our registry) to the template file and redirect the container repositories to our registry.
You can use the following template: add the CA certificate (it must be in PEM format, i.e. base64-encoded DER), then cat the ~/.ssh/id_rsa.pub file and add its content to the sshKey section:

apiVersion: v1
baseDomain: example.com
controlPlane:
  name: master
  hyperthreading: Enabled
  replicas: 3
compute:
- name: worker
  hyperthreading: Enabled
  replicas: 3
metadata:
  name: test-cluster
networking:
  clusterNetworks:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 172.18.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths":{"registry:5000":{"auth":"bXl1c2VyOm15cGFzc3dvcmQ=","email":"me@working.me"}}}'
sshKey: '< Your Public SSH Key>'
additionalTrustBundle: |
     -----BEGIN CERTIFICATE-----
     <YOUR CA certificate>
     -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - registry:5000/ocp/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry:5000/ocp/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

NOTE !!!

  1. The domain that should contain all our A and SRV records is “test-cluster.example.com”, but if your DHCP is setting the hostnames in the format “host.example.com” that is valid as well.
  2. The “imageContentSources” section is taken from the mirror-output.txt file and should be exactly as shown in that file (make sure you fix the indentation).
  3. When we add the custom CA we need to make sure we use the right indentation, which here means 5 spaces from the left.
  4. Your pullSecret should now be the pull secret of your internal registry only.
    If you changed the username/password of the registry, you should regenerate pull-secret-registry.json as you did on the external server. If you didn’t, you can just take its content as a one-line output.
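
Since a missing section only shows up much later as a failed installation, a quick check of the file can save a round. The following is a rough sketch of my own (missing_keys is a hypothetical helper, and it only looks for the top-level key names, not for valid values), shown against a scratch file that deliberately lacks one section.

```shell
#!/bin/sh
# Hypothetical helper: list the top-level keys from the template above that
# are missing in the given file; prints nothing when all are present.
missing_keys() {
  for key in pullSecret sshKey additionalTrustBundle imageContentSources; do
    grep -q "^${key}:" "$1" || echo "$key"
  done
}

# Scratch file that deliberately lacks imageContentSources.
demo_cfg=$(mktemp)
cat > "$demo_cfg" << 'EOF'
apiVersion: v1
pullSecret: '{"auths":{}}'
sshKey: 'ssh-rsa AAAA...'
additionalTrustBundle: |
     -----BEGIN CERTIFICATE-----
EOF

missing_keys "$demo_cfg"
```

Against your real file you would run missing_keys ~/ocp4/install-config.yaml and expect no output.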

Lazy

If you are lazy like me and want to create the file quickly, just run the following commands.

Create the install-config.yaml skeleton:

$ export CLUSTER_NAME="test-cluster"
$ export CLUSTER_DOMAIN="example.com"
$ cat > install-config.yaml << EOF
apiVersion: v1
baseDomain: ${CLUSTER_DOMAIN}
controlPlane:
  name: master
  hyperthreading: Disabled
  replicas: 3
compute:
- name: worker
  hyperthreading: Disabled
  replicas: 3
metadata:
  name: ${CLUSTER_NAME}
networking:
  clusterNetworks:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 172.18.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
EOF

Now add the registry pull secret:

$ REG_SECRET=`echo -n 'myuser:mypassword' | base64 -w0`
$ echo -n "pullSecret: '" >> install-config.yaml && echo '{ "auths": {}}' | jq '.auths += {"registry:5000": {"auth": "REG_SECRET","email": "me@working.me"}}' | sed "s/REG_SECRET/$REG_SECRET/" | jq -c . | sed "s/$/\'/g" >> install-config.yaml

Attaching the ssh key:

$ echo -n "sshKey: '" >> install-config.yaml && cat ~/.ssh/id_rsa.pub | sed "s/$/\'/g" >> install-config.yaml

Adding the registry CA:
(make sure you obtain the registry CA and save it in PEM (base64) format in a file named ca.crt)

$ echo "additionalTrustBundle: |" >> install-config.yaml
$ cat ca.crt | sed 's/^/\ \ \ \ \ /g' >> install-config.yaml
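
The sed expression simply prefixes every line of the certificate with five spaces so that it sits inside the YAML block. Here is a small sketch of that step on placeholder text (not a real certificate), written to a scratch file; indent5 is a helper name I made up.

```shell
#!/bin/sh
# Prefix every line read from stdin with five spaces.
indent5() {
  sed 's/^/     /'
}

# Placeholder text, not a real certificate.
demo_yaml=$(mktemp)
echo "additionalTrustBundle: |" > "$demo_yaml"
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'MIIB...' '-----END CERTIFICATE-----' \
  | indent5 >> "$demo_yaml"

# Every line after the key should start with exactly five spaces.
cat "$demo_yaml"
```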

And finally, adding the “imageContentSources” extension :

$ cat ${REGISTRY_BASE}/downloads/secrets/mirror-output.txt | grep -A7 imageContentSources >> install-config.yaml

Backup

There is a very good chance you will need to run this installation more than once, and the installer consumes the install-config.yaml file when it generates the manifests. In order to save time, keep a backup of the file in your home directory:

$ cp install-config.yaml ~/install-config.yaml.bck

Installation begin

Generate the Kubernetes manifests for the cluster:

$ openshift-install create manifests --dir=./

Modify the manifests/cluster-scheduler-02-config.yml Kubernetes manifest file to prevent Pods from being scheduled on the control plane machines:
A. Open the manifests/cluster-scheduler-02-config.yml file.
B. Locate the mastersSchedulable parameter and set its value to false.
C. Save and exit the file.
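
If you would rather not open an editor, the same change can be made with sed. The sketch below runs on a scratch file that contains only the relevant line (the real manifest has more fields) and assumes GNU sed for the -i option.

```shell
#!/bin/sh
# Scratch copy with the one line we care about; the real file is
# manifests/cluster-scheduler-02-config.yml and contains more fields.
demo_manifest=$(mktemp)
cat > "$demo_manifest" << 'EOF'
spec:
  mastersSchedulable: true
EOF

# Flip true to false in place (GNU sed syntax for -i).
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' "$demo_manifest"
grep mastersSchedulable "$demo_manifest"
```

For the real run, point the sed command at manifests/cluster-scheduler-02-config.yml inside your installation directory.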

Our next step is to generate the ignition files for the bootstrap, the masters and the workers. In order to do that we need to run openshift-install with the following arguments:

$ openshift-install create ignition-configs --dir=./

After running this command you will see the list of the relevant folders and ignition files:

[Image: openshift-install ignition output]

Next we need to copy our ignition files to our Apache directory so they will be available over HTTP during the installation:

$ cp *.ign /var/www/html/pub/ign/
$ chmod a+r /var/www/html/pub/ign/*.ign

PXE Install

Now create the new directories under the tftpboot directory:

$ mkdir /var/lib/tftpboot/rhcos/
$ mkdir /var/lib/tftpboot/pxelinux.cfg/

(look at the first part for more information about PXE)

And copy the kernel and initramfs files to it:

$ cp ${REGISTRY_BASE}/downloads/images/rhcos-${OCP_ISO_VERSION}-x86_64-live-initramfs.x86_64.img /var/lib/tftpboot/rhcos/rhcos-initramfs.img
$ cp ${REGISTRY_BASE}/downloads/images/rhcos-${OCP_ISO_VERSION}-x86_64-live-kernel-x86_64 /var/lib/tftpboot/rhcos/rhcos-kernel

The RAW file goes into the directory served by our HTTPD server:

$ cp ${REGISTRY_BASE}/downloads/images/rhcos-${OCP_ISO_VERSION}-x86_64-metal.x86_64.raw.gz /var/www/html/pub/pxe/rhcos-metal.raw.gz

I am keeping SELinux in enforcing mode, so let’s make sure the files are accessible (as root):

$ semanage fcontext -a -t httpd_sys_content_t "/var/www/html/pub(/.*)?"
$ restorecon -R -v /var/www/html/pub

Now we need to create, under the /var/lib/tftpboot/pxelinux.cfg/ directory, a file for each server named after its MAC address, with 01- at the beginning and a dash between each pair of hex digits, and add the necessary boot options to it.
A simple way is to create 3 files (bootstrap, master and worker) and link each MAC-named file to one of them.

For the bootstrap server the file would look like :

$ cat > /var/lib/tftpboot/pxelinux.cfg/bootstrap << EOF
DEFAULT pxeboot
TIMEOUT 5
PROMPT 0
LABEL pxeboot
KERNEL rhcos/rhcos-kernel
APPEND ip=dhcp rd.neednet=1 initrd=rhcos/rhcos-initramfs.img console=tty0 console=ttyS0 coreos.inst=yes coreos.inst.install_dev=sda coreos.inst.image_url=http://<HTTP_server>/pub/pxe/rhcos-metal.raw.gz coreos.inst.ignition_url=http://<HTTP_server>/pub/ign/bootstrap.ign
EOF

For the masters:

$ cat > /var/lib/tftpboot/pxelinux.cfg/master << EOF
DEFAULT pxeboot
TIMEOUT 5
PROMPT 0
LABEL pxeboot
KERNEL rhcos/rhcos-kernel
APPEND ip=dhcp rd.neednet=1 initrd=rhcos/rhcos-initramfs.img console=tty0 console=ttyS0 coreos.inst=yes coreos.inst.install_dev=sda coreos.inst.image_url=http://<HTTP_server>/pub/pxe/rhcos-metal.raw.gz coreos.inst.ignition_url=http://<HTTP_server>/pub/ign/master.ign
EOF

For the worker:

$ cat > /var/lib/tftpboot/pxelinux.cfg/worker << EOF
DEFAULT pxeboot
TIMEOUT 5
PROMPT 0
LABEL pxeboot
KERNEL rhcos/rhcos-kernel
APPEND ip=dhcp rd.neednet=1 initrd=rhcos/rhcos-initramfs.img console=tty0 console=ttyS0 coreos.inst=yes coreos.inst.install_dev=sda coreos.inst.image_url=http://<HTTP_server>/pub/pxe/rhcos-metal.raw.gz coreos.inst.ignition_url=http://<HTTP_server>/pub/ign/worker.ign
EOF
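
The three files above differ only in the Ignition file they point to, so they can also be generated from a single template. The following is a minimal sketch and not part of the original procedure: the function name render_pxe_cfg and the address 192.168.1.10 are placeholders I made up, so substitute your own HTTP server.

```shell
# Hypothetical helper: print the PXE menu for one role.
# $1 = role (bootstrap, master or worker), $2 = HTTP server name or IP
render_pxe_cfg() {
  role="$1"
  http_server="$2"
  cat << EOF
DEFAULT pxeboot
TIMEOUT 5
PROMPT 0
LABEL pxeboot
KERNEL rhcos/rhcos-kernel
APPEND ip=dhcp rd.neednet=1 initrd=rhcos/rhcos-initramfs.img console=tty0 console=ttyS0 coreos.inst=yes coreos.inst.install_dev=sda coreos.inst.image_url=http://${http_server}/pub/pxe/rhcos-metal.raw.gz coreos.inst.ignition_url=http://${http_server}/pub/ign/${role}.ign
EOF
}

# Preview the worker menu on screen; nothing is written to disk here.
render_pxe_cfg worker 192.168.1.10

# To write all three files (as root), redirect the output per role:
#   for role in bootstrap master worker; do
#     render_pxe_cfg "$role" 192.168.1.10 > /var/lib/tftpboot/pxelinux.cfg/"$role"
#   done
```

Printing to the screen first lets you review the APPEND line before anything lands in the TFTP directory.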

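To link the MACs I created a simple Bash function:

mac-pxe-update() {
  # $1 = profile file (bootstrap, master or worker), $2 = MAC address (aa:bb:cc:dd:ee:ff)
  # pxelinux looks for 01-<mac> in lower case with dashes, so convert before linking
  ln -s "$1" "/var/lib/tftpboot/pxelinux.cfg/01-$(echo "$2" | tr 'A-Z:' 'a-z-')"
}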

Now we can link a MAC, for example:

mac-pxe-update bootstrap <BOOTSTRAP MAC> #(and so on for all the MACs)
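
Before creating the links for real, it can help to preview the file names pxelinux will look for. This is only a sketch: the helper name pxe_link_name and the MAC addresses are made up for illustration, and the loop prints the ln commands without running them.

```shell
# Directory used throughout this article for the PXE menus
PXE_DIR=/var/lib/tftpboot/pxelinux.cfg

# Hypothetical helper: turn a MAC address into the pxelinux file name
# (01- prefix, lower-case hex digits, dashes instead of colons).
pxe_link_name() {
  printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-Z:' 'a-z-')"
}

# Dry run: print one ln command per "role MAC" pair (example MACs only).
while read -r role mac; do
  echo "ln -s ${role} ${PXE_DIR}/$(pxe_link_name "$mac")"
done << 'EOF'
bootstrap 52:54:00:AA:BB:01
master 52:54:00:AA:BB:02
worker 52:54:00:AA:BB:05
EOF
```

Once the printed commands look right, you can run them as they are or pipe the output to sh.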

Boot the Server

Now that everything is set, you can boot all the servers in the following order:

  1. bootstrap
  2. masters
  3. workers

NOTE!!!

Do not continue to the next server until you have made sure the previous one booted with no errors.

Once all the servers are booted, make sure you can SSH to them. If you succeed, it means that your Ignition file was loaded successfully.

$ ssh core@bootstrap

The installation process will NOT continue at this point, but we do have access to our servers, so let’s make sure the relevant settings are in place.

Timezone

Check that the timezone on all the RHCOS and installer machines is set to Asia/Jerusalem (or whichever timezone you use), and that the time is synchronized:

$ timedatectl
Local time: Tue 2019-12-24 15:10:05 IST
Universal time: Tue 2019-12-24 13:10:05 UTC
RTC time: Tue 2019-12-24 13:10:04
Time zone: Asia/Jerusalem (IST, +0200)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no

If you need to change the time zone execute:

$ sudo timedatectl set-timezone Asia/Jerusalem
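
Checking every machine by eye gets tedious, so here is a small sketch that reads timedatectl output and reports whether the clock is synchronized. The helper name clock_is_synced is my own; it accepts both the "System clock synchronized" wording shown above and the "NTP synchronized" wording printed by older systemd versions such as the one in RHEL 7.

```shell
# Hypothetical helper: succeeds if the timedatectl output on stdin
# reports a synchronized clock.
clock_is_synced() {
  grep -Eq '(System clock|NTP) synchronized: yes'
}

# Example with a captured line; on a live system pipe the real output:
#   ssh core@bootstrap timedatectl | clock_is_synced
if printf 'System clock synchronized: yes\n' | clock_is_synced; then
  echo "clock synchronized"
else
  echo "clock NOT synchronized"
fi
```

The same pipe can be wrapped in a loop over your own host names to cover the masters and workers.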

Registry testing

This is a very important point: make sure you are able to access your registry from the bootstrap server. This test will save you a lot of time (and frustration) later on.

$ curl -u myuser:mypassword https://registry:5000/v2/_catalog
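
A successful request returns a JSON document with a "repositories" list, as defined by the registry v2 API. As a rough sketch, the reply can be checked for the repository you mirrored. The helper name catalog_has_repo is hypothetical, and ocp4/openshift4 is only an assumed repository name, so use the one you mirrored to.

```shell
# Hypothetical helper: succeeds if repository $1 appears in the
# _catalog JSON read from stdin.
catalog_has_repo() {
  tr -d ' \n' | grep -q "\"repositories\":\[[^]]*\"$1\""
}

# Sample reply for illustration; on the bootstrap server use the real one:
#   curl -s -u myuser:mypassword https://registry:5000/v2/_catalog | catalog_has_repo ocp4/openshift4
sample='{"repositories": ["ocp4/openshift4"]}'
if printf '%s\n' "$sample" | catalog_has_repo ocp4/openshift4; then
  echo "mirror repository found"
else
  echo "mirror repository missing"
fi
```

An empty list or an authentication error at this stage usually means the installation would fail later while pulling images, so it is worth stopping here to fix it.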

Static IP

If DHCP is not working properly in your environment, this is the point where you need to configure a static IP locally on each server before you continue with the installation.

If everything went well we can continue with the installation, so first we need to exit the bootstrap server:
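
One alternative to configuring the address on the node itself is to pass it at PXE boot time, replacing ip=dhcp in the APPEND line with the dracut static form. This is a different technique from the local configuration described above, shown only as a sketch: the helper name static_ip_args and every address, host name and interface name below are placeholders.

```shell
# Hypothetical helper: build the dracut ip= and nameserver= kernel arguments.
# $1 = IP, $2 = gateway, $3 = netmask, $4 = host name, $5 = interface, $6 = DNS server
static_ip_args() {
  printf 'ip=%s::%s:%s:%s:%s:none nameserver=%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example values only; use the addressing of your own network.
static_ip_args 192.168.1.21 192.168.1.1 255.255.255.0 master-0 ens3 192.168.1.10
```

Because the address is specific to one machine, this approach needs one PXE file per server instead of one per role.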

$ exit

Openshift-install (continued)

First we run the bootstrap phase of the installation:

$ openshift-install --dir=./ wait-for bootstrap-complete --log-level debug

We will see an output similar to this one:

INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocp4.example.com:6443...
INFO API v1.13.4+b626c2fe1 up
INFO Waiting up to 30m0s for the bootstrap-complete event...

You can follow the installation process on the bootstrap server.
I suggest you look for errors in the journal; they are usually self-explanatory and help you understand what went wrong:

$ ssh core@bootstrap "journalctl -xe"

After the bootstrap process is complete, remove the bootstrap machine from the load balancer.

IMPORTANT
You must remove the bootstrap machine from the load balancer at this
point. You can also remove or reformat the machine itself !!!

Logging into the cluster

$ export KUBECONFIG=/home/ocp/ocp/auth/kubeconfig
$ oc whoami
system:admin

Approving the CSRs for your machines

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself.
Confirm that the cluster recognizes the machines:

$ oc get node
NAME       STATUS     ROLES    AGE   VERSION
master-0   Ready      master   63m   v1.13.4+b626c2fe1
master-1   Ready      master   63m   v1.13.4+b626c2fe1
master-2   Ready      master   64m   v1.13.4+b626c2fe1
worker-0   NotReady   worker   76s   v1.13.4+b626c2fe1
worker-1   NotReady   worker   70s   v1.13.4+b626c2fe1

NOTE

If you only see the masters, it means you need to approve the CSRs for the worker nodes. Once you approve them, the workers will show up in “NotReady” state at first.
This is normal behavior.

Let’s list the CSRs:

$ oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-8b2br   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending

If the CSRs were not approved automatically, wait until all of the CSRs for the machines you added are in Pending status, and then approve them.
To approve them individually, run the following command for each valid CSR:

$ oc adm certificate approve <csr_name>

If all the CSRs are valid, approve them all by running the following command:

$ oc get csr -o name | xargs oc adm certificate approve
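
The one-liner above passes every CSR to the approve command, including the ones that are already approved. As a sketch, the listing can be filtered down to the Pending ones first. The helper name pending_csrs and the second sample CSR are made up, and the filter assumes CONDITION is the last column, as in the listing above.

```shell
# Hypothetical helper: print the names of CSRs whose last column is "Pending".
# Reads "oc get csr" table output (with its header line) on stdin.
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Sample listing for illustration; on a live cluster run:
#   oc get csr | pending_csrs | xargs -r oc adm certificate approve
printf '%s\n' \
  'NAME AGE REQUESTOR CONDITION' \
  'csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending' \
  'csr-8vnps 15m system:node:master-0 Approved,Issued' | pending_csrs
```

Since each machine generates two CSRs one after the other, expect to run the approval more than once until no Pending entries remain.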

Initial Operator configuration

$ watch -n5 oc get clusteroperators

In this phase you have to wait up to 15 minutes for all the operators to reach the Available=True state, possibly except for image-registry. For image-registry you need to apply a patch that defines its storage. For a production environment, please follow the procedure in the Installation guide.
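
While you wait, it can be handy to print only the operators that are not yet available instead of scanning the whole table. The following is a sketch under assumptions: the helper name unavailable_operators is hypothetical, it expects two columns per line (NAME and AVAILABLE), and the custom-columns expression in the comment is one possible way to produce that input rather than something taken from the installation guide.

```shell
# Hypothetical helper: print operators whose AVAILABLE value is not "True".
# Expects two whitespace-separated columns per line on stdin: NAME AVAILABLE
unavailable_operators() {
  awk '$2 != "True" { print $1 }'
}

# On a live cluster the input could be produced like this (assumed form):
#   oc get clusteroperators --no-headers \
#     -o custom-columns='NAME:.metadata.name,AVAILABLE:.status.conditions[?(@.type=="Available")].status' \
#     | unavailable_operators
printf '%s\n' 'authentication True' 'image-registry False' 'ingress True' | unavailable_operators
```

An empty result means every listed operator reports Available=True.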

For a non-production environment you can use an emptyDir as the registry storage (its content is lost when the registry pod restarts) by running the following command:

$ oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'

After running this command, wait until the image-registry cluster operator reaches the Available=True state, and make sure all the certificates are signed by listing the CSRs again.

Completing installation on user-provisioned infrastructure

After you complete the Operator configuration, you can finish installing the cluster on the infrastructure that you provide.

Confirm that all cluster components are online

Completing the installation

Now, to complete the installation, run:

$ openshift-install --dir=./ wait-for install-complete | tee install-complete

This will print your console URL together with the admin user and the credentials to log in with.

If you have any questions feel free to respond or leave a comment.
You can find me on LinkedIn at: https://www.linkedin.com/in/orenoichman
Or on Twitter at: https://twitter.com/ooichman

HAVE FUN !!!
