Commit b677bb97 authored by Muhammad Umair Khan

fix kubernetes taint role & add bash script for automating prerequisites

parent 641ba237
+73 −143
@@ -8,26 +8,47 @@ This folder provides an **Ansible-based automation framework** to set up a multi

Before running the playbooks, ensure:

1. You have **Ansible** installed on your control machine.
2. You have **SSH access** to all remote nodes (master & workers, if applicable).
3. Both repositories are cloned as siblings in the same parent directory:
1. **Ubuntu OS** (required by the setup script)
2. **Python 3** with `python3-venv` and `python3-pip` packages
3. Both repositories cloned as siblings:
   - `etsi-mec-sandbox` (backend)
   - `etsi-mec-sandbox-frontend` (frontend)
4. You have a **GitHub OAuth application** configured (Client ID & Secret for MEC Sandbox authentication).
4. A **GitHub OAuth application** configured (Client ID & Secret)

   > **Note:** If your playbooks are running on `localhost` (control machine itself), **SSH is not required**. SSH setup is only necessary for remote worker or master nodes.  
> **Note:** SSH setup is only required for remote worker nodes, not for localhost deployments.

   For remote worker nodes, follow these steps:
---

## Environment Setup (Required)

Before running any playbooks, set up the Ansible environment:

```bash
chmod +x ~/etsi-mec-sandbox/playbooks/setup_ansible_env.sh
cd ~/etsi-mec-sandbox/playbooks
./setup_ansible_env.sh
source ~/etsi-mec-sandbox/playbooks/ansible-venv/bin/activate
```

---

## Quick Start

```bash
# Generate a new SSH key (ED25519)
ssh-keygen -t ed25519 -C "<your-username>@<your-local-host>"
# Activate virtual environment
source ~/etsi-mec-sandbox/playbooks/ansible-venv/bin/activate

# Copy the public key to the remote host
ssh-copy-id -i ~/.ssh/id_ed25519.pub <your-username>@<remote-host-ip>
# Run the playbook
cd ~/etsi-mec-sandbox/playbooks
ansible-playbook -i inventories/dev/hosts.ini site.yml
```

> Replace `<remote-host-ip>` with the IP of your worker node.
You will be prompted for:
- Sudo password
- MEC host IP/domain
- GitHub OAuth Client ID & Secret

> **For detailed deployment instructions**, see [RUNBOOK.md](RUNBOOK.md)

---

@@ -35,154 +56,63 @@ ssh-copy-id -i ~/.ssh/id_ed25519.pub <your-username>@<remote-host-ip>

```
playbooks/
├── ansible.cfg                # Ansible configuration
├── requirements.yml           # External role/collection dependencies
├── setup_ansible_env.sh        # Environment setup script (run first!)
├── site.yml                    # Main playbook entrypoint
├── inventories/
│   └── dev/
│       ├── hosts.ini          # Inventory file (IP addresses & groups)
│       └── group_vars/
│           └── all.yml        # Global variables
└── roles/
    ├── common/                # Base setup (packages, users, system prep)
    ├── kernel/                # Kernel tuning & modules for Kubernetes
    ├── containerd/            # Install & configure containerd runtime
    ├── docker/                # Install Docker runtime & daemon configs
    ├── cni_calico/            # Deploy Calico CNI plugin
    ├── kubernetes/
    │   ├── common/            # Common Kubernetes configs
    │   ├── master/            # Master node setup (API server, etcd, controller)
    │   └── worker/            # Worker node join configuration
    ├── helm/                  # Install Helm package manager
    ├── dev_env/
    │   ├── golang/            # Install Go environment
    │   └── node/              # Install Node.js environment (via NVM)
    └── mec_sandbox/
        ├── mec_config/        # Configure MEC Sandbox (charts, secrets, OAuth)
        └── mec_deploy/        # Build & deploy MEC Sandbox (meepctl, frontend, core)
├── ansible.cfg                 # Ansible configuration
├── collections/requirements.yml
├── inventories/dev/
│   ├── hosts.ini               # Inventory (hosts & groups)
│   └── group_vars/all.yml      # Variables
└── roles/                      # Ansible roles (see below)
```

---

## Roles & Tasks Overview
## Roles Overview

| Role                         | Purpose                                   |
| ---------------------------- | ---------------------------------------------------- |
| **common**                   | Base system setup and dependencies                   |
| **kernel**                   | Kernel modules, sysctl, network tuning               |
| **containerd**               | Install & configure containerd runtime               |
| **docker**                   | Docker runtime setup & daemon configuration          |
| **cni\_calico**              | Deploy Calico CNI networking                         |
| **kubernetes/common**        | Install kubeadm, kubelet, kubectl                    |
| **kubernetes/master**        | Initialize master & control-plane setup              |
| **kubernetes/worker**        | Join worker nodes (requires SSH)                     |
| **helm**                     | Install Helm for package management                  |
| **dev\_env/golang**          | Setup Go development environment (conditional)       |
| **dev\_env/node**            | Setup Node.js/NVM environment (conditional)          |
| **mec\_sandbox/mec\_config** | Configure MEC Sandbox charts, secrets & OAuth        |
| **mec\_sandbox/mec\_deploy** | Build & deploy MEC Sandbox (meepctl, frontend, core) |
| ---------------------------- | ----------------------------------------- |
| **common**                   | Base system packages                      |
| **kernel**                   | Kernel modules & sysctl tuning            |
| **containerd**               | Containerd runtime                        |
| **docker**                   | Docker engine                             |
| **cni\_calico**              | Calico CNI networking                     |
| **kubernetes/master**        | Initialize Kubernetes control plane       |
| **kubernetes/worker**        | Join worker nodes to cluster              |
| **helm**                     | Helm package manager                      |
| **dev\_env/golang**          | Go development environment (conditional)  |
| **dev\_env/node**            | Node.js/NVM environment (conditional)     |
| **mec\_sandbox/mec\_config** | Configure MEC Sandbox                     |
| **mec\_sandbox/mec\_deploy** | Build & deploy MEC Sandbox                |

---

## Running the Playbooks

### Interactive Prompts
## Key Variables

When running the playbook, you will be prompted for:
- **Sudo password** for privilege escalation
- **MEC host IP/domain** (e.g., `192.168.1.100` or `mec.example.com`)
- **GitHub OAuth Client ID** and **Client Secret**
Variables are defined in `inventories/dev/group_vars/all.yml`.

### Worker Nodes (Optional)
| Variable              | Default      | Description                        |
|-----------------------|--------------|------------------------------------|
| `kubernetes_version`  | `v1.35.1`    | Kubernetes version                 |
| `calico_version`      | `v3.31.4`    | Calico CNI version                 |
| `install_dev_env`     | `true`       | Enable Go & Node.js setup          |
| `install_mec_sandbox` | `true`       | Enable MEC Sandbox deployment      |

If you want to add worker nodes to your cluster, you need to **uncomment both the inventory entries and the worker play in `site.yml`**.

1. Update the **inventory file** with your node IPs:

`inventories/dev/hosts.ini`

```ini
[k8s_masters]
#localhost ansible_connection=local ansible_python_interpreter=auto_silent ansible_user=xflow
<control-node> ansible_connection=local ansible_python_interpreter=auto_silent ansible_user=<username>
# Optional: define worker nodes here. Uncomment to enable workers
#[k8s_workers]
#worker1 ansible_host=192.168.40.59 ansible_user=ubuntu # change ansible_user if needed
#worker2 ansible_host=192.168.56.12 ansible_user=ubuntu
```

2. Uncomment the worker play in `site.yml`:

```yaml
# Uncomment to run worker setup
#- hosts: k8s_workers
#  become: true
#  vars_prompt:
#    - name: ansible_become_pass
#      prompt: "Enter sudo password for workers"
#      private: true
#  roles:
#    - common
#    - kernel
#    - containerd
#    - kubernetes/common
#    - kubernetes/worker
```

3. Run the main playbook:

```bash
ansible-playbook -i inventories/dev/hosts.ini site.yml -K
```

> `-K` prompts for sudo password if required. You can also export `ANSIBLE_BECOME_PASSWORD` or configure passwordless sudo.

## Variables

### Kubernetes & Container Runtime
* `kubernetes_version`: `"v1.35"`
* `pod_network_cidr`: `"92.68.0.0/16"`
* `service_cidr`: `"10.96.0.0/12"`
* `calico_version`: `"v3.30.0"`

### Development Environment
* `install_dev_env`: `true` → set to `false` to disable Node/Go tooling
* `go_version`: `"1.17"`
* `node_version`: `"12.19.0"`

### MEC Sandbox
* `install_mec_sandbox`: `true` → set to `false` to skip MEC Sandbox deployment
* `mec_sandbox_dir`: Path to the etsi-mec-sandbox repository
* `mec_frontend_dir`: Path to the etsi-mec-sandbox-frontend repository

## Tags

You can run just parts of the setup with `--tags` or skip parts with `--skip-tags`. (The roles here are intentionally simple and do not define custom tags; feel free to add them if you want finer control.)
---

## MEC Sandbox Role Details
## Documentation

### mec_sandbox/mec_config
Configures the MEC Sandbox environment:
- Patches chart security contexts (uid/gid 1001 → 1000)
- Updates GitHub OAuth credentials in secrets
- Configures ingress host address
- Adds Kubernetes CA to system trust store
| Document                     | Description                                          |
| ---------------------------- | ---------------------------------------------------- |
| **[RUNBOOK.md](RUNBOOK.md)** | Step-by-step deployment guide, troubleshooting, multi-node setup, and detailed configuration |

### mec_sandbox/mec_deploy
Builds and deploys MEC Sandbox components:
1. Installs `meepctl` CLI tool
2. Configures meepctl with IP and gitdir
3. Builds and deploys frontend
4. Deploys dependencies (`meepctl deploy dep`)
5. Builds all components (`meepctl build --nolint all`)
6. Dockerizes all components (`meepctl dockerize all`)
7. Deploys core (`meepctl deploy core`)
---

## Notes

* Ensure worker nodes have SSH access configured before running.
* Use `--tags` if you want to run specific roles (e.g. `--tags kubernetes,helm`).
* The MEC Sandbox roles require both `etsi-mec-sandbox` and `etsi-mec-sandbox-frontend` repositories to be present as siblings.
* Thanos/Prometheus deployment failures during `deploy dep` are expected and ignored.
* Run `setup_ansible_env.sh` first before executing any playbooks
* Both `etsi-mec-sandbox` and `etsi-mec-sandbox-frontend` repositories must be siblings
* Thanos/Prometheus failures during deployment are expected and ignored

---
+140 −45
@@ -8,12 +8,43 @@ This runbook provides step-by-step instructions for deploying the ETSI MEC Sandb

Before running the playbooks, ensure you have:

1. **Ansible** installed on your control machine
2. **Both repositories** cloned as siblings:
1. **Ubuntu OS** (required by the setup script)
2. **Python 3** with `python3-venv` and `python3-pip` packages
3. **Both repositories** cloned as siblings:
   - `~/etsi-mec-sandbox` (backend)
   - `~/etsi-mec-sandbox-frontend` (frontend)
3. **GitHub OAuth Application** credentials (Client ID & Client Secret)
4. A **target IP address or domain** for your MEC Sandbox installation
4. **GitHub OAuth Application** credentials (Client ID & Client Secret)
5. A **target IP address or domain** for your MEC Sandbox installation

---

## Environment Setup (Required First Step)

Before running any playbooks, you must set up the Ansible environment:

```bash
# Make the setup script executable
chmod +x ~/etsi-mec-sandbox/playbooks/setup_ansible_env.sh

# Navigate to the playbooks directory
cd ~/etsi-mec-sandbox/playbooks

# Run the setup script
./setup_ansible_env.sh

# Activate the virtual environment
source ~/etsi-mec-sandbox/playbooks/ansible-venv/bin/activate
```

The setup script:
- Creates a Python virtual environment (`ansible-venv`)
- Installs `pip`, `ansible`, and `kubernetes` Python packages
- Installs Ansible collections from `collections/requirements.yml`:
  - `community.general`
  - `ansible.posix`
  - `community.docker`
  - `kubernetes.core`
- Updates `.gitignore` to exclude the virtual environment
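
The steps listed above could be sketched roughly as follows. This is an illustration only, not the contents of the actual `setup_ansible_env.sh`: the function names `ensure_gitignore_entry` and `setup_ansible_env`, and the exact `.gitignore` entry, are assumptions made for the example.

```bash
# Illustrative sketch; the real setup_ansible_env.sh may differ.

# Append an entry to a .gitignore file only if it is not already present.
ensure_gitignore_entry() {
    local entry="$1"
    local file="${2:-.gitignore}"
    touch "$file"
    if grep -qxF -- "$entry" "$file"; then
        return 0
    fi
    # If the file does not end with a newline, add one so entries do not merge.
    if [ -n "$(tail -c1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$entry" >> "$file"
}

# Create the virtual environment, then install Ansible and its collections.
# Expects to be run from the playbooks directory (assumption).
setup_ansible_env() {
    local venv_dir="${1:-ansible-venv}"
    python3 -m venv "$venv_dir" &&
        "$venv_dir/bin/pip" install --upgrade pip &&
        "$venv_dir/bin/pip" install ansible kubernetes &&
        "$venv_dir/bin/ansible-galaxy" collection install -r collections/requirements.yml &&
        ensure_gitignore_entry "${venv_dir}/"
}
```

Calling the venv's own `pip` and `ansible-galaxy` by path avoids depending on whether the environment has been activated in the current shell.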

---

@@ -25,7 +56,7 @@ Before running the playbooks, ensure you have:
Example `inventories/dev/hosts.ini`:
```ini
[k8s_masters]
localhost ansible_connection=local ansible_python_interpreter=auto_silent
localhost ansible_connection=local ansible_python_interpreter=auto_silent ansible_user=<your-username>

[k8s_workers]
# worker1 ansible_host=192.168.1.11 ansible_user=ubuntu
@@ -40,15 +71,19 @@ ansible_become_method=sudo

## Quick Start (Single-Node Deployment)

### Step 1: Install Ansible Collections
### Step 1: Set Up Environment (if not already done)

```bash
ansible-galaxy collection install -r requirements.yml
chmod +x ~/etsi-mec-sandbox/playbooks/setup_ansible_env.sh
cd ~/etsi-mec-sandbox/playbooks
./setup_ansible_env.sh
source ~/etsi-mec-sandbox/playbooks/ansible-venv/bin/activate
```

### Step 2: Run the Playbook

```bash
cd ~/etsi-mec-sandbox/playbooks
ansible-playbook -i inventories/dev/hosts.ini site.yml
```

@@ -73,16 +108,17 @@ The playbook executes the following roles in order:

| Order | Role                         | Description                                      |
|-------|------------------------------|--------------------------------------------------|
| 1     | common                       | Base packages and system configuration           |
| 2     | kernel                       | Kernel modules and sysctl tuning                 |
| 3     | docker                       | Docker runtime installation and configuration    |
| 4     | kubernetes/master            | Initialize Kubernetes control plane              |
| 5     | cni_calico                   | Deploy Calico CNI networking                     |
| 6     | helm                         | Install Helm package manager                     |
| 7     | dev_env/golang (conditional) | Go development environment                       |
| 8     | dev_env/node (conditional)   | Node.js/NVM environment                          |
| 9     | mec_sandbox/mec_config       | Configure MEC Sandbox (charts, secrets, OAuth)   |
| 10    | mec_sandbox/mec_deploy       | Build and deploy MEC Sandbox components          |
| 1     | common                       | Base packages, APT keyring setup                 |
| 2     | kernel                       | Disable swap, kernel modules, sysctl tuning      |
| 3     | containerd                   | Install & configure containerd with SystemdCgroup |
| 4     | docker                       | Docker engine installation & daemon config       |
| 5     | kubernetes/master            | Initialize Kubernetes control plane (kubeadm init) |
| 6     | cni_calico                   | Deploy Calico CNI via Tigera operator            |
| 7     | helm                         | Install Helm package manager (via snap)          |
| 8     | dev_env/golang (conditional) | Go development environment + GolangCI-Lint       |
| 9     | dev_env/node (conditional)   | Node.js/NVM environment                          |
| 10    | mec_sandbox/mec_config       | Configure MEC Sandbox (charts, secrets, OAuth)   |
| 11    | mec_sandbox/mec_deploy       | Build and deploy MEC Sandbox components          |

---

@@ -93,35 +129,40 @@ The playbook executes the following roles in order:
This role configures the MEC Sandbox environment:

1. **Adds kubectl bash completion** to `.bashrc`
2. **Updates /etc/hosts** with docker registry entry
3. **Copies Kubernetes CA** to system trust store
4. **Patches chart values**:
   - Updates `fsGroup` and `runAsUser` from 1001 → 1000 in:
2. **Updates /etc/hosts** with docker registry entry (`meep-docker-registry`)
3. **Copies Kubernetes CA** to system trust store (`/usr/local/share/ca-certificates/`)
4. **Runs `update-ca-certificates`** to refresh system CA store
5. **Restarts docker and containerd** daemons
6. **Patches chart values** (uid/gid 1001 → 1000):
   - `charts/postgis/values.yaml`
   - `charts/redis/values.yaml`
   - `charts/docker-registry/values.yaml`
5. **Updates GitHub OAuth credentials** in `secrets.yaml`
6. **Updates ingress host address** in `.meepctl-repocfg.yaml`
7. **Patches frontend config** (uid/gid 1001 → 1000):
   - `etsi-mec-sandbox-frontend/config/.meepctl-repocfg.yaml`
8. **Updates GitHub OAuth credentials** in `secrets.yaml`
9. **Updates ingress host address** in `.meepctl-repocfg.yaml`

### mec_sandbox/mec_deploy Role

This role builds and deploys all MEC Sandbox components:

1. **Install meepctl**: Runs `install.sh` from `go-apps/meepctl`
2. **Configure meepctl**:
2. **Verify meepctl**: Checks `meepctl version` is available
3. **Configure meepctl**:
   ```bash
   meepctl config ip <mec-host-address>
   meepctl config gitdir <sandbox-dir>
   ```
3. **Build & Deploy Frontend**:
4. **Build & Deploy Frontend**:
   ```bash
   cd etsi-mec-sandbox-frontend && bash build.sh && bash deploy.sh
   ```
4. **Configure Secrets**: Runs `configure-secrets.py`
5. **Deploy Dependencies**: `meepctl deploy dep` (with retries)
6. **Build All**: `meepctl build --nolint all`
7. **Dockerize All**: `meepctl dockerize all`
8. **Deploy Core**: `meepctl deploy core`
5. **Configure Secrets**: Runs `configure-secrets.py set`
6. **Deploy Dependencies**: `meepctl deploy dep` (with up to 3 retries using the `-f` flag)
7. **Build All**: `meepctl build --nolint all`
8. **Dockerize All**: `meepctl dockerize all` (runs via `sg docker`)
9. **Prune Docker Images**: `docker image prune -f`
10. **Deploy Core**: `meepctl deploy core`
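
The retry behaviour described for the dependency deployment step can be illustrated with a small shell helper. This is a sketch under assumptions, not the role's actual task logic: the `retry` function, the delay between attempts, and the choice to pass `-f` on every attempt are made up for the example.

```bash
# Run a command up to max_attempts times, pausing between failed attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Assumed invocation for the dependency step:
#   retry 3 10 meepctl deploy dep -f
```

The helper returns the command's success as soon as any attempt succeeds, so later steps only run once the dependencies are reported as deployed.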

---

@@ -129,7 +170,7 @@ This role builds and deploys all MEC Sandbox components:

If you want to add worker nodes (separate machines), follow these steps:

1. On each worker node prepare SSH access and ensure Ansible can reach them (or run the play locally on that host).
1. On each worker node, ensure SSH access is configured and that Ansible can reach it.

2. Edit `inventories/dev/hosts.ini` and add entries under `[k8s_workers]`:
   ```ini
@@ -138,22 +179,36 @@ If you want to add worker nodes (separate machines), follow these steps:
   worker2 ansible_host=192.168.56.12 ansible_user=ubuntu
   ```

3. Uncomment the worker play in `site.yml`.
3. Uncomment the worker play in `site.yml`:
   ```yaml
   - hosts: k8s_workers
     become: true
     vars_prompt:
       - name: ansible_become_pass
         prompt: "Enter sudo password for workers"
         private: true
     roles:
       - common
       - kernel
       - containerd
       - kubernetes/common
       - kubernetes/worker
   ```

4. Run the playbook for master first (to initialize control plane and produce join script):
   ```bash
   ansible-playbook -K -l k8s_masters site.yml
   ansible-playbook -l k8s_masters site.yml
   ```
   After successful run, a join command will be generated on the master at `/tmp/kube_join_cmd.sh`.
   After a successful run, the join command is written to `/tmp/kubeadm_join.sh` on the master.

5. Copy the `/tmp/kube_join_cmd.sh` to each worker node:
5. Copy `/tmp/kubeadm_join.sh` to each worker node:
   ```bash
   scp /tmp/kube_join_cmd.sh user@worker1:/tmp/kube_join_cmd.sh
   scp /tmp/kubeadm_join.sh user@worker1:/tmp/kubeadm_join.sh
   ```

6. Run the worker play:
   ```bash
   ansible-playbook -K -l k8s_workers site.yml
   ansible-playbook -l k8s_workers site.yml
   ```

---
@@ -172,10 +227,21 @@ To skip MEC Sandbox deployment:
ansible-playbook -i inventories/dev/hosts.ini site.yml -e "install_mec_sandbox=false"
```

To skip development environment setup:
```bash
ansible-playbook -i inventories/dev/hosts.ini site.yml -e "install_dev_env=false"
```

---

## Troubleshooting

### Virtual Environment Not Activated
If you see "ansible: command not found", activate the virtual environment:
```bash
source ~/etsi-mec-sandbox/playbooks/ansible-venv/bin/activate
```
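
For wrapper scripts, a guard along the following lines can catch a missing activation before `ansible-playbook` is invoked. It is a minimal sketch: `venv_is_active` and `ensure_venv` are hypothetical helper names, and the check relies only on the `VIRTUAL_ENV` variable that a standard venv `activate` script exports.

```bash
# Return 0 if a Python virtual environment is active in this shell.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Activate the given virtual environment unless one is already active.
ensure_venv() {
    local venv_dir="$1"
    if venv_is_active; then
        return 0
    fi
    if [ ! -f "$venv_dir/bin/activate" ]; then
        echo "no virtual environment at $venv_dir; run setup_ansible_env.sh first" >&2
        return 1
    fi
    . "$venv_dir/bin/activate"
}

# Assumed usage:
#   ensure_venv ~/etsi-mec-sandbox/playbooks/ansible-venv || exit 1
```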

### Thanos/Prometheus Deployment Failures
During `meepctl deploy dep`, thanos and prometheus failures are **expected and ignored**. The deployment will continue.

@@ -187,7 +253,10 @@ Ensure both repositories are cloned as siblings:
```

### Permission Issues (uid/gid 1001)
The `mec_config` role automatically patches chart values from uid/gid 1001 → 1000. If you still encounter issues, verify the patches were applied.
The `mec_config` role automatically patches chart values from uid/gid 1001 → 1000. If you still encounter issues, verify the patches were applied:
```bash
grep -r "runAsUser\|fsGroup" ~/etsi-mec-sandbox/charts/*/values.yaml
```

### Docker Group Issues
If dockerize fails, ensure your user is in the docker group:
@@ -196,20 +265,46 @@ sudo usermod -aG docker $USER
newgrp docker
```

### Kubernetes Collection Errors
If you see errors about `kubernetes.core.k8s`, ensure collections are installed:
```bash
source ~/etsi-mec-sandbox/playbooks/ansible-venv/bin/activate
ansible-galaxy collection install -r collections/requirements.yml
```

---

## Logs

Deployment logs are saved to `/tmp/`:
- `/tmp/meepctl_deploy_dep.log`
- `/tmp/meepctl_deploy_dep.log` (and retry logs)
- `/tmp/meepctl_build.log`
- `/tmp/meepctl_dockerize.log`
- `/tmp/meepctl_deploy_core.log`

---

## Key Variables

Default values from `inventories/dev/group_vars/all.yml`:

| Variable              | Default Value        |
|-----------------------|----------------------|
| `kubernetes_version`  | `v1.35.1`            |
| `calico_version`      | `v3.31.4`            |
| `containerd_version`  | `1.7.27-1`           |
| `pod_network_cidr`    | `192.168.0.0/16`     |
| `go_version`          | `1.17`               |
| `node_version`        | `12.19.0`            |
| `npm_version`         | `6.14.8`             |
| `eslint_version`      | `5.16.0`             |

---

## Notes

- Worker nodes will only run `common`, `kernel`, `containerd`, `kubernetes/common`, and `kubernetes/worker` roles.
- The `kubernetes/worker` role expects a join script (created on master) at `/tmp/kube_join_cmd.sh`.
- The MEC Sandbox deployment requires significant resources; ensure adequate CPU, memory, and disk space.
- **Always run `setup_ansible_env.sh` first** and activate the virtual environment
- Worker nodes will only run `common`, `kernel`, `containerd`, `kubernetes/common`, and `kubernetes/worker` roles
- The `kubernetes/worker` role expects a join script (created on master) at `/tmp/kubeadm_join.sh`
- The MEC Sandbox deployment requires significant resources; ensure adequate CPU, memory, and disk space
- The playbook uses `vars_prompt` for interactive input; for automation, pass variables via the `-e` flag
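
As a sketch of the non-interactive case: Ansible skips a `vars_prompt` prompt when the same variable is already supplied through `-e`. The variable names below (`mec_host_address`, `github_client_id`, `github_client_secret`) and the helper `run_playbook_noninteractive` are hypothetical; substitute the names actually declared under `vars_prompt` in `site.yml`.

```bash
# Run site.yml without prompts by supplying the prompted values via -e.
# The three variable names are placeholders, not taken from site.yml.
# Set DRY_RUN=1 to print the command instead of executing it.
run_playbook_noninteractive() {
    local host="$1"
    local client_id="$2"
    local client_secret="$3"
    set -- ansible-playbook -i inventories/dev/hosts.ini site.yml \
        -e "mec_host_address=${host}" \
        -e "github_client_id=${client_id}" \
        -e "github_client_secret=${client_secret}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Assumed usage:
#   DRY_RUN=1 run_playbook_noninteractive 192.0.2.10 my-client-id my-client-secret
```

Values passed on the command line are visible in the process list, so for real credentials an extra-vars file or Ansible Vault may be a better fit. The sudo password also still needs to be provided, for example through passwordless sudo.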
+10 −10
@@ -123,18 +123,18 @@
  register: calico_installation
  become: true

- name: Remove master/control-plane taints to allow scheduling on single-node
  command: kubectl taint nodes {{ inventory_hostname }} {{ item }}-
# - name: Remove master/control-plane taints to allow scheduling on single-node
#   command: kubectl taint nodes {{ target_user }} {{ item }}-
#   loop:
#     - node-role.kubernetes.io/master
#     - node-role.kubernetes.io/control-plane
#   failed_when: false
#   changed_when: false

- name: Remove control-plane taint
  command: kubectl taint nodes --all {{ item }}-
  loop:
    - node-role.kubernetes.io/master
    - node-role.kubernetes.io/control-plane
  failed_when: false
  changed_when: false

# - name: Display CNI installation notice
#   debug:
#     msg: |
#       CNI (Calico) is being installed — this involves downloading container images and may take several minutes.
#       You can check the status in another terminal by running:
#         kubectl get po -A
#       Wait until every pod (especially coredns, calico-node, tigera-operator) shows Running/Ready.
+85 −0

File added.

Preview size limit exceeded, changes collapsed.