Commit 34e9bd3f authored by Sergio Gimenez

docs(ansible): refresh dual-oop workflow

Document the working dual-host deployment and e2e smoke flow, and make quick undeploy remove legacy Kind clusters so the documented commands match the current VM setup.
parent 2c3152ba
+16 −10
@@ -2,17 +2,18 @@

Deploy Operator Platform environments with Ansible.

This repository is mainly used to:
Repository scope:

- prepare target hosts
- create Kind-based Kubernetes clusters
- prepare remote hosts
- create Kind clusters
- deploy Operator Platform components
- run single-host or dual-host federation scenarios
- deploy two-host federation setups
- fetch kubeconfigs back to local machine

## Start here

- `docs/getting-started.md`: environment, inventory, secrets, first deployment
- `docs/deployment.md`: scenario selection, commands, kubeconfigs, undeploy
- `docs/deployment.md`: working deployment, undeploy, verification, smoke test flow

## Quick start

@@ -23,14 +24,13 @@ source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yml

ansible-playbook playbooks/scenarios/all_in_one/deploy.yml -e @secrets.yml
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_2,openop_3
```

## Main scenarios
## Working scenarios

- `playbooks/scenarios/all_in_one/deploy.yml`: one host, easiest test setup
- `playbooks/scenarios/full_oop/deploy.yml`: one complete Operator Platform
- `playbooks/scenarios/dual_oop/deploy.yml`: two hosts, real federation
- `playbooks/scenarios/dual_oop/deploy.yml`: two complete Operator Platforms on two hosts
- `playbooks/scenarios/full_oop/deploy.yml`: one complete Operator Platform on one host

## Kubeconfigs

@@ -45,6 +45,12 @@ Examples:
- `~/kind-cluster-configs/openop_3/op1-kubeconfig.yaml`
- `~/kind-cluster-configs/openop_2/op2-kubeconfig.yaml`

## Notes

- Run playbooks from `ansible/`.
- `ansible/secrets.yml` is local-only and gitignored.
- `dual_oop` and `full_oop` load `secrets.yml` automatically.

## Docs preview

```bash
+8 −16
@@ -4,20 +4,12 @@ This directory contains the main end-to-end playbooks.

## Scenarios

### `all_in_one`

One host, easiest way to test the stack.

```bash
ansible-playbook playbooks/scenarios/all_in_one/deploy.yml -e @secrets.yml
```

### `full_oop`

One complete Operator Platform on one host.

```bash
ansible-playbook playbooks/scenarios/full_oop/deploy.yml -e @secrets.yml
ansible-playbook playbooks/scenarios/full_oop/deploy.yml --limit openop_3
```

### `dual_oop`
@@ -25,14 +17,14 @@ ansible-playbook playbooks/scenarios/full_oop/deploy.yml -e @secrets.yml
Two complete Operator Platforms on two hosts.

```bash
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml -e @secrets.yml
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_2,openop_3
```

Use tags to deploy only one side:
Deploy one side only:

```bash
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml -e @secrets.yml --tags op1
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml -e @secrets.yml --tags op2
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_3
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_2
```

## Undeploy
@@ -50,8 +42,8 @@ Graceful undeploy removes components before removing the cluster.
Examples:

```bash
ansible-playbook playbooks/scenarios/all_in_one/quick_undeploy.yml
ansible-playbook playbooks/scenarios/dual_oop/graceful_undeploy.yml
ansible-playbook playbooks/scenarios/full_oop/quick_undeploy.yml --limit openop_3
ansible-playbook playbooks/scenarios/dual_oop/quick_undeploy.yml --limit openop_2,openop_3
```

## Inventory expectations
@@ -63,5 +55,5 @@ ansible-playbook playbooks/scenarios/dual_oop/graceful_undeploy.yml
## Notes

- Run from the `ansible/` directory.
- Pass `-e @secrets.yml` for private images.
- `secrets.yml` is loaded by `dual_oop` and `full_oop`.
- For remote deployments, kubeconfigs are fetched to `~/kind-cluster-configs/`.
+44 −6
@@ -18,16 +18,35 @@
        cluster_name: "op1"
        kind_cluster_delete_name: "op1"

    - name: Delete OP1 Kind Cluster
    - name: Get existing OP1 Kind clusters
      ansible.builtin.command:
        cmd: kind delete cluster --name {{ kind_cluster_delete_name }}
        cmd: kind get clusters
      register: op1_kind_clusters
      changed_when: false
      failed_when: false

    - name: Select OP1 clusters to delete
      ansible.builtin.set_fact:
        op1_clusters_to_delete: >-
          {{
            [kind_cluster_delete_name, 'operator-platform']
            | select('in', op1_kind_clusters.stdout_lines | default([]))
            | unique
            | list
          }}

    - name: Delete OP1 Kind clusters
      ansible.builtin.command:
        cmd: kind delete cluster --name {{ item }}
      loop: "{{ op1_clusters_to_delete }}"
      register: delete_op1_result
      changed_when: delete_op1_result.rc == 0
      failed_when: false

    - name: Display OP1 deletion result
      ansible.builtin.debug:
        msg: "OP1 cluster deletion ({{ kind_cluster_delete_name }}) {{ 'successful' if delete_op1_result.rc == 0 else 'failed or cluster not found' }}"
        msg: >-
          OP1 deleted clusters: {{ op1_clusters_to_delete | join(', ') if op1_clusters_to_delete | length > 0 else 'none found' }}

    - name: Clean up OP1 kubeconfig directory
      ansible.builtin.file:
@@ -56,16 +75,35 @@
        cluster_name: "op2"
        kind_cluster_delete_name: "op2"

    - name: Delete OP2 Kind Cluster
    - name: Get existing OP2 Kind clusters
      ansible.builtin.command:
        cmd: kind get clusters
      register: op2_kind_clusters
      changed_when: false
      failed_when: false

    - name: Select OP2 clusters to delete
      ansible.builtin.set_fact:
        op2_clusters_to_delete: >-
          {{
            [kind_cluster_delete_name, 'operator-platform']
            | select('in', op2_kind_clusters.stdout_lines | default([]))
            | unique
            | list
          }}

    - name: Delete OP2 Kind clusters
      ansible.builtin.command:
        cmd: kind delete cluster --name {{ kind_cluster_delete_name }}
        cmd: kind delete cluster --name {{ item }}
      loop: "{{ op2_clusters_to_delete }}"
      register: delete_op2_result
      changed_when: delete_op2_result.rc == 0
      failed_when: false

    - name: Display OP2 deletion result
      ansible.builtin.debug:
        msg: "OP2 cluster deletion ({{ kind_cluster_delete_name }}) {{ 'successful' if delete_op2_result.rc == 0 else 'failed or cluster not found' }}"
        msg: >-
          OP2 deleted clusters: {{ op2_clusters_to_delete | join(', ') if op2_clusters_to_delete | length > 0 else 'none found' }}

    - name: Clean up OP2 kubeconfig directory
      ansible.builtin.file:
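
Review note: the new "Select ... clusters to delete" tasks keep only those candidate names (`kind_cluster_delete_name` and the legacy `operator-platform`) that actually appear in the `kind get clusters` output. A minimal shell sketch of the same selection follows. The function name `select_clusters_to_delete` is hypothetical and not part of this repository, and the cluster list is canned input rather than a live `kind` call.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the playbook's select('in', ...) | unique logic.
# select_clusters_to_delete is a hypothetical helper, not repository code.
# stdin: existing cluster names, one per line (the format `kind get clusters` prints).
# args:  candidate cluster names to consider for deletion.
select_clusters_to_delete() {
  local existing candidate seen=""
  existing="$(cat)"
  for candidate in "$@"; do
    # Skip a candidate that was already emitted (the `unique` step).
    case " $seen " in *" $candidate "*) continue ;; esac
    # Keep the candidate only when it matches a whole line of the listing.
    if printf '%s\n' "$existing" | grep -Fxq -- "$candidate"; then
      printf '%s\n' "$candidate"
      seen="$seen $candidate"
    fi
  done
}

# Canned listing standing in for `kind get clusters` on a host that still
# carries a legacy cluster next to the current one.
printf 'op1\noperator-platform\nother\n' | select_clusters_to_delete op1 operator-platform
```

On a real host the canned `printf` would be replaced by `kind get clusters`, and each printed name would be passed to `kind delete cluster --name <name>`, which is what the looped delete task does.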
+2 −0
@@ -17,6 +17,8 @@
- name: Deploy Full OOP (Single Federation Manager)
  hosts: k8s_clusters
  gather_facts: true
  vars_files:
    - ../../../secrets.yml
  
  vars:
    # Scenario-specific configuration
+60 −40
# Deployment Guide

## Choose a scenario
## Working scenarios

### `all_in_one`
### `dual_oop`

Use this when you want the simplest setup on one host.
Use this for real federation across two hosts.

```bash
ansible-playbook playbooks/scenarios/all_in_one/deploy.yml -e @secrets.yml
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_2,openop_3
```

Good for:

- first deployment
- local validation
- quick federation-related testing on one VM

### `full_oop`

Use this when you want one complete Operator Platform on one host.

```bash
ansible-playbook playbooks/scenarios/full_oop/deploy.yml -e @secrets.yml
```
This deploys:

### `dual_oop`
- OP1 on `op1_nodes`
- OP2 on `op2_nodes`

Use this when you want two separate Operator Platforms on two hosts.
You can deploy one side only if needed:

```bash
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml -e @secrets.yml
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_3
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml --limit openop_2
```

This deploys:

- OP1 on `op1_nodes`
- OP2 on `op2_nodes`
### `full_oop`

You can also deploy one side only:
Use this for one complete Operator Platform on one host.

```bash
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml -e @secrets.yml --tags op1
ansible-playbook playbooks/scenarios/dual_oop/deploy.yml -e @secrets.yml --tags op2
ansible-playbook playbooks/scenarios/full_oop/deploy.yml --limit openop_3
```

## Where kubeconfigs go

For remote deployments, the playbooks fetch kubeconfigs to your local machine:
For remote deployments, playbooks fetch kubeconfigs locally to:

```text
~/kind-cluster-configs/<inventory-host>/
@@ -66,41 +52,75 @@ kubectl get pods -A

## Access points

Default service URLs:
Useful service URLs:

- OEG northbound: `http://<host-ip>:32263/oeg/1.0.0`
- Federation Manager: `http://<host-ip>:30989`
- Homer: `http://<host-ip>:30088`
- Zot: `http://<host-ip>:30050`
- Federation Manager: `http://<host-ip>:30989`
- Prometheus: `http://<host-ip>:30090`
- Grafana: `http://<host-ip>:30091`

## End-to-end smoke test

After `dual_oop` is green, run smoke from `local-docker-deployment/` in repository root.

Command:

```bash
bash smoke-test.sh \
  --image "docker.io/nginx:latest" \
  --app-name "test-app" \
  --local-url "http://192.168.123.155:32263/oeg/1.0.0" \
  --remote-url "http://192.168.123.178:32263/oeg/1.0.0" \
  --local-provider "Local Operator" \
  --federated-provider "Remote Operator"
```

Notes:

- use OEG northbound URL on port `32263`, not Federation Manager on `30989`
- current smoke script auto-resolves local and remote zone IDs
- current smoke script auto-normalizes invalid app names like `test-app` to `test_app`
- script deploys and then cleans up app instances in one run

Expected success signals:

- local app instance reaches `ready`
- federated app instance reaches `ready`
- cleanup leaves `Remaining matching app instances: []`

## Undeploy

Quick removal deletes the cluster directly:
Quick removal deletes clusters directly:

```bash
ansible-playbook playbooks/scenarios/all_in_one/quick_undeploy.yml
ansible-playbook playbooks/scenarios/full_oop/quick_undeploy.yml
ansible-playbook playbooks/scenarios/dual_oop/quick_undeploy.yml
ansible-playbook playbooks/scenarios/dual_oop/quick_undeploy.yml --limit openop_2,openop_3
ansible-playbook playbooks/scenarios/full_oop/quick_undeploy.yml --limit openop_3
```

Graceful removal undeploys components first:

```bash
ansible-playbook playbooks/scenarios/all_in_one/graceful_undeploy.yml
ansible-playbook playbooks/scenarios/full_oop/graceful_undeploy.yml
ansible-playbook playbooks/scenarios/dual_oop/graceful_undeploy.yml
ansible-playbook playbooks/scenarios/dual_oop/graceful_undeploy.yml --limit openop_2,openop_3
ansible-playbook playbooks/scenarios/full_oop/graceful_undeploy.yml --limit openop_3
```

## Troubleshooting

### Port already in use

The Kind role now checks port conflicts before cluster creation. If a deployment fails early, free the reported ports or remove old containers/clusters on the target host.
If deployment fails before Kind cluster creation, old clusters or containers still own NodePort mappings. Remove stale Kind clusters on target hosts, then retry.

### SRM or another component stays in `FAILED - RETRYING`

### Deployment succeeds but a component is slow
Read pod status with fetched kubeconfig:

`FAILED - RETRYING` during readiness checks is normal if the task later ends with `ok`.
```bash
export KUBECONFIG=~/kind-cluster-configs/openop_3/op1-kubeconfig.yaml
kubectl get pods -A
kubectl describe pod -n oop <pod-name>
```

### Verify cluster health
