diff --git a/README.md b/README.md
index 93df3f7..78e7854 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,19 @@
 # LoopAware Infrastructure CLI
 
-A professional Python-based CLI for programmatically managing the LoopAware flat network (`10.32.0.0/16`).
+A robust Python-based CLI designed for automated management of the LoopAware infrastructure. Built for developers and AI agents to provision and manage resources on a flat `10.32.0.0/16` network.
 
-## Features
+## Core Modules
 
-- **DNS/DHCP:** Manage `dnsmasq` reservations and records on `la-dnsmasq-01`.
-- **Ingress:** Dynamic HAProxy routing for subdomains.
-- **Router:** Manage OpenWrt firewall DNAT rules (TCP/UDP).
-- **Proxmox:** Provision and manage LXC containers across physical nodes (`vmh-07` to `vmh-13`).
-- **Samba:** Automated User and Group management for Active Directory.
+| Module | Command | Description |
+|--------|---------|-------------|
+| **Identity** | `infra samba` | Manage Active Directory users and groups. |
+| **Compute** | `infra proxmox` | Provision and destroy LXC containers across nodes. |
+| **Database** | `infra db` | Provision PostgreSQL databases and users. |
+| **Network** | `infra dns` | Manage static DHCP leases and DNS records. |
+| **IPAM** | `infra ip` | Automatic discovery of free IPs in the agent pool. |
+| **Ingress** | `infra ingress` | Manage HAProxy subdomains and routing. |
+| **Certificates** | `infra cert` | Manage SSL/TLS certificates (Let's Encrypt). |
+| **External** | `infra cloudflare` | Manage Cloudflare DNS and Dynamic DNS updates. |
 
 ## Installation
 
@@ -19,87 +24,50 @@ pip install -e .
 
 ## Configuration
 
-The CLI requires a `config.yaml` file. A template is provided in `config.yaml.example`.
+The CLI looks for a config file at `~/.config/loopaware/infra-cli.yaml` or the path specified in the `INFRA_CONFIG` environment variable.
 
 ```bash
+# Set up your local config
 cp config.yaml.example config.yaml
-# Update the nodes, IPs, and SSH key paths
+export INFRA_CONFIG=$(pwd)/config.yaml
 ```
 
-### Environment Variables
-- `ROUTER_PASS`: Required for router operations (if SSH keys are not deployed).
-- `INFRA_CONFIG`: Optional path to a custom config file.
+## Common Workflows
 
-## Usage Guide
+### Provisioning a New Service
+1. **Find an IP:** `infra ip next-free`
+2. **Create Database:** `infra db provision "project-name"`
+3. **Provision LXC:** `infra proxmox create-lxc 12345 debian-13 "project-host" "10.32.70.x/16" "10.32.0.1" --node la-vmh-12`
+4. **Set up DNS:** `infra dns add-host "aa:bb:cc:dd:ee:ff" 10.32.70.x "project-host"`
+5. **Expose Ingress:** `infra ingress add "project.loopaware.com" 10.32.70.x 80`
 
-### 1. Identity & Access (Samba)
+### Full Decommission
+Clean up every trace of a service in one command:
 ```bash
-# List all users
-infra samba list-users
-
-# Create a new user
-infra samba add-user "jdoe" "SecurePass123!"
-
-# Grant XMPP access
-infra samba add-to-group "xmpp-users" "jdoe"
+infra decommission --domain project.loopaware.com --mac --vmid 12345 --node la-vmh-12 --port-name project_udp
 ```
 
-### 2. Compute (Proxmox)
+### Certificate Management
 ```bash
-# List containers on a specific node
-infra proxmox list-lxcs --node la-vmh-12
+# List all active certificates
+infra cert list
 
-# Create a new container (CLI resolves "debian-13" automatically)
-infra proxmox create-lxc 12150 debian-13 "new-app" "10.32.70.100/16" "10.32.0.1" --node la-vmh-12
+# Check main certificate expiry
+infra cert status
+
+# Trigger dynamic SAN discovery and renewal
+infra cert renew --force
 ```
 
-### 3. Database (PostgreSQL)
-Provision project-specific databases instantly.
+## Safety & Validation
+- **Template Resolution:** The `debian-13` alias automatically finds the latest template on the target Proxmox node.
+- **Input Validation:** All IPs, MACs, and Ports are validated before execution.
+- **Pre-flight Checks:** The CLI verifies SSH connectivity to nodes before attempting changes.
 
-```bash
-# List all databases
-infra db list-dbs
+## Development
 
-# Provision a new database and user for a project
-infra db provision "my-new-project"
-```
-
-### 4. Networking (IP, DNS & DHCP)
-Assign a static identity to your new machine. The CLI helps you find free addresses in the dedicated agent pool (`10.32.70.0/16` through `10.32.80.0/16`).
-
-```bash
-# Find the next available IP for your project
-infra ip next-free
-
-# List top 5 available IPs
-infra ip list-free --count 5
-
-# Register the machine in DHCP
-infra dns add-host "aa:bb:cc:dd:ee:ff" "10.32.70.100" "new-app"
-```
-
-### 4. Cloudflare DDNS
-The list of domains to update is managed dynamically on the server.
-
-```bash
-# Add a domain to the update list
-infra cloudflare add-ddns "my-new-domain.com"
-
-# List all domains being updated
-infra cloudflare list-ddns
-
-# Run the update (usually via cron)
-infra cloudflare update-ddns
-```
-
-## Advanced Workflows for AI Agents
-
-For detailed automation workflows, see [Workflow Documentation](../../docs/guides/dynamic-infrastructure-workflow.md).
-
-## Development and Testing
-
-Run the integration test suite:
+### Running Tests
 ```bash
 export ROUTER_PASS="..."
-pytest tests/test_cli.py -s
-```
\ No newline at end of file
+pytest tests/test_cli.py -v
+```
diff --git a/infra_cli/cert.py b/infra_cli/cert.py
new file mode 100644
index 0000000..1dead17
--- /dev/null
+++ b/infra_cli/cert.py
@@ -0,0 +1,40 @@
+from .ssh import SSHClient
+
+class CertificateManager:
+    def __init__(self, config):
+        # Certificate manager is on la-vmh-11 (LXC 11215)
+        node = config.get_node('la-vmh-11')
+        if not node:
+            raise ValueError("Node 'la-vmh-11' not found in config")
+
+        self.host = node['host']
+        self.password = node.get('pass')
+        self.user = config.get('proxmox.user', 'root')
+        self.ssh_key = config.get('proxmox.ssh_key_path')
+        self.client = SSHClient(self.host, self.user, self.ssh_key, self.password)
+        self.lxc_id = "11215"
+        self.shared_path = "/shared-certs"
+
+    def exec_cert(self, cmd):
+        return self.client.run(f"pct exec {self.lxc_id} -- {cmd}")
+
+    def list_certs(self):
+        res = self.exec_cert(f"ls -lh {self.shared_path}")
+        return res.stdout
+
+    def renew(self, force=False):
+        script_path = "/root/local-config/infra-cert-mgr/scripts/dynamic-san-manager.sh"
+        cmd = f"bash {script_path}"
+        if force:
+            cmd += " --force-update"
+
+        res = self.exec_cert(cmd)
+        if res.returncode != 0:
+            raise RuntimeError(f"Certificate renewal failed: {res.stderr}")
+        return res.stdout
+
+    def check_expiry(self):
+        # Checks expiry of the main wildcard cert
+        cmd = f"openssl x509 -enddate -noout -in {self.shared_path}/loopaware.com.pem"
+        res = self.exec_cert(cmd)
+        return res.stdout.strip()
diff --git a/infra_cli/database.py b/infra_cli/database.py
index 15c5f81..9115417 100644
--- a/infra_cli/database.py
+++ b/infra_cli/database.py
@@ -1,4 +1,6 @@
 from .ssh import SSHClient
+import tempfile
+import os
 
 class DatabaseManager:
     def __init__(self, config):
@@ -9,24 +11,45 @@ class DatabaseManager:
         self.client = SSHClient(self.host, self.user, self.ssh_key)
 
     def exec_sql(self, sql):
-        # Runs SQL as postgres user via SSH
-        res = self.client.run(f"su - postgres -c \"psql -c \\"{sql}\"\"")
-        if res.returncode != 0:
-            raise RuntimeError(f"PostgreSQL command failed: {res.stderr}")
-        return res.stdout
+        # Use a temporary file to avoid shell quoting hell
+        with tempfile.NamedTemporaryFile(mode='w', suffix='.sql', delete=False) as tf:
+            tf.write(sql)
+            tf_name = tf.name
+
+        try:
+            remote_path = f"/tmp/exec_{os.path.basename(tf_name)}"
+            self.client.scp_to(tf_name, remote_path)
+
+            # Ensure the postgres user can read the file
+            self.client.run(f"chmod 644 {remote_path}")
+
+            # Execute the SQL file as postgres user
+            cmd = f"su - postgres -c 'psql -f {remote_path}'"
+            res = self.client.run(cmd)
+
+            # Cleanup remote file
+            self.client.run(f"rm {remote_path}")
+
+            if res.returncode != 0:
+                raise RuntimeError(f"PostgreSQL command failed: {res.stderr}")
+            return res.stdout
+        finally:
+            if os.path.exists(tf_name):
+                os.remove(tf_name)
 
     def create_database(self, db_name, owner=None):
-        sql = f"CREATE DATABASE {db_name}"
+        sql = f"CREATE DATABASE {db_name};"
         if owner:
-            sql += f" OWNER {owner}"
+            sql = f"CREATE DATABASE {db_name} OWNER {owner};"
         return self.exec_sql(sql)
 
     def create_user(self, username, password):
-        sql = f"CREATE USER {username} WITH PASSWORD '{password}'"
+        # The password is interpolated as-is, so it must not contain single quotes
+        sql = f"CREATE USER {username} WITH PASSWORD '{password}';"
         return self.exec_sql(sql)
 
     def grant_privileges(self, db_name, username):
-        sql = f"GRANT ALL PRIVILEGES ON DATABASE {db_name} TO {username}"
+        sql = f"GRANT ALL PRIVILEGES ON DATABASE {db_name} TO {username};"
         return self.exec_sql(sql)
 
     def list_databases(self):
@@ -36,7 +59,7 @@ class DatabaseManager:
         return self.exec_sql("\du")
 
     def drop_database(self, db_name):
-        return self.exec_sql(f"DROP DATABASE IF EXISTS {db_name}")
+        return self.exec_sql(f"DROP DATABASE IF EXISTS {db_name};")
 
     def drop_user(self, username):
-        return self.exec_sql(f"DROP USER IF EXISTS {username}")
+        return self.exec_sql(f"DROP USER IF EXISTS {username};")
\ No newline at end of file
diff --git a/infra_cli/dns.py b/infra_cli/dns.py
index 5461964..ba985d1 100644
--- a/infra_cli/dns.py
+++ b/infra_cli/dns.py
@@ -1,4 +1,5 @@
 from .ssh import SSHClient
+import re
 
 class DNSManager:
     def __init__(self, config):
@@ -41,7 +42,8 @@ class DNSManager:
         self.reload()
 
     def remove_dns(self, domain):
-        cmd = f"sh -c \"sed -i '\#address=/{domain}/#d' {self.dns_file}\""
+        # Escape the backslash explicitly; a raw string would also keep the backslashes in \"
+        cmd = f"sh -c \"sed -i '\\#address=/{domain}/#d' {self.dns_file}\""
         self.exec_lxc(cmd)
         self.reload()
 
@@ -57,46 +59,23 @@ class DNSManager:
         dns = self.exec_lxc(f"cat {self.dns_file}").stdout
         return {"hosts": hosts, "dns": dns}
 
-    def get_free_ips(self, start_subnet=70, end_subnet=80):
+    def get_free_ips(self, start_subnet=70, end_subnet=80):
+        """Finds free IPs in the range 10.32.[70-80].1-254 by checking both static and dynamic leases"""
+        # 1. Get all static IPs from dhcp-hosts.conf and dynamic-hosts.conf
+        static_configs = self.exec_lxc(f"cat /etc/dnsmasq.d/dhcp-hosts.conf {self.hosts_file} 2>/dev/null").stdout
+        used_ips = set(re.findall(r'10\.32\.[0-9]{1,3}\.[0-9]{1,3}', static_configs))
+
+        # 2. Get all active dynamic leases
+        leases = self.exec_lxc("cat /var/lib/misc/dnsmasq.leases 2>/dev/null").stdout
+        used_ips.update(set(re.findall(r'10\.32\.[0-9]{1,3}\.[0-9]{1,3}', leases)))
 
-        """Finds free IPs in the range 10.32.[70-80].1-254 by checking both static and dynamic leases"""
-
-        # 1. Get all static IPs from dhcp-hosts.conf and dynamic-hosts.conf
-
-        static_configs = self.exec_lxc(f"cat /etc/dnsmasq.d/dhcp-hosts.conf {self.hosts_file} 2>/dev/null").stdout
-
-        import re
-
-        used_ips = set(re.findall(r'10\.32\.[0-9]{1,3}\.[0-9]{1,3}', static_configs))
-
-
-
-        # 2. Get all active dynamic leases
-
-        leases = self.exec_lxc("cat /var/lib/misc/dnsmasq.leases 2>/dev/null").stdout
-
-        used_ips.update(set(re.findall(r'10\.32\.[0-9]{1,3}\.[0-9]{1,3}', leases)))
-
-
-
-        # 3. Find first available in the expanded agent range
-
-        free_ips = []
-
-        for subnet_idx in range(start_subnet, end_subnet + 1):
-
-            for host_idx in range(1, 255):
-
-                candidate = f"10.32.{subnet_idx}.{host_idx}"
-
-                if candidate not in used_ips:
-
-                    free_ips.append(candidate)
-
-                    if len(free_ips) >= 10: # Return top 10
-
-                        return free_ips
-
-        return free_ips
-
-        
\ No newline at end of file
+        # 3. Find first available in the expanded agent range
+        free_ips = []
+        for subnet_idx in range(start_subnet, end_subnet + 1):
+            for host_idx in range(1, 255):
+                candidate = f"10.32.{subnet_idx}.{host_idx}"
+                if candidate not in used_ips:
+                    free_ips.append(candidate)
+                    if len(free_ips) >= 10: # Return top 10
+                        return free_ips
+        return free_ips
\ No newline at end of file
diff --git a/infra_cli/main.py b/infra_cli/main.py
index 1100ea4..6683bf9 100644
--- a/infra_cli/main.py
+++ b/infra_cli/main.py
@@ -7,6 +7,7 @@ from .proxmox import ProxmoxManager
 from .samba import SambaManager
 from .cloudflare import CloudflareManager
 from .database import DatabaseManager
+from .cert import CertificateManager
 import sys
 
 @click.group()
@@ -20,6 +21,31 @@ def cli(ctx, config):
         click.echo(f"Error: {e}", err=True)
         sys.exit(1)
 
+@cli.group()
+def cert():
+    """Manage SSL/TLS Certificates"""
+    pass
+
+@cert.command(name='list')
+@click.pass_obj
+def cert_list(config):
+    mgr = CertificateManager(config)
+    click.echo(mgr.list_certs())
+
+@cert.command(name='status')
+@click.pass_obj
+def cert_status(config):
+    mgr = CertificateManager(config)
+    click.echo(f"Main Certificate Expiry: {mgr.check_expiry()}")
+
+@cert.command(name='renew')
+@click.option('--force', is_flag=True, help='Force full SAN discovery and renewal')
+@click.pass_obj
+def cert_renew(config, force):
+    mgr = CertificateManager(config)
+    click.echo("Running dynamic SAN manager...")
+    click.echo(mgr.renew(force))
+
 @cli.group()
 def db():
     """Manage PostgreSQL Databases and Users"""
diff --git a/tests/test_cli.py b/tests/test_cli.py
index eaac557..26eee73 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -25,102 +25,82 @@ def test_dns_full_lifecycle(unique_id):
     hostname = f"test-lifecycle-{unique_id}"
     domain = f"dns-test-{unique_id}.fe.loopaware.com"
 
-    # 1. Add DHCP Host
-    print(f" Adding host {hostname}...")
-    res = run_infra(["dns", "add-host", mac, ip, hostname])
-    assert res.returncode == 0
+    # Add
+    assert run_infra(["dns", "add-host", mac, ip, hostname]).returncode == 0
+    assert run_infra(["dns", "add-dns", domain, ip]).returncode == 0
 
-    # 2. Add DNS Record
-    print(f" Adding DNS {domain}...")
-    res = run_infra(["dns", "add-dns", domain, ip])
-    assert res.returncode == 0
-
-    # 3. Verify both in list
+    # Verify
     res = run_infra(["dns", "list"])
     assert mac in res.stdout
     assert domain in res.stdout
 
-    # 4. Remove both
-    print(" Cleaning up...")
+    # Cleanup
     assert run_infra(["dns", "remove-host", mac]).returncode == 0
     assert run_infra(["dns", "remove-dns", domain]).returncode == 0
-
-    # 5. Verify gone
-    res = run_infra(["dns", "list"])
-    assert mac not in res.stdout
-    assert domain not in res.stdout
 
-def test_ingress_collision_and_update(unique_id):
-    domain = f"test-collision-{unique_id}.loopaware.com"
-    ip1 = "10.32.70.221"
-    ip2 = "10.32.70.222"
+def test_cloudflare_lifecycle(unique_id):
+    test_domain = f"test-ddns-{unique_id}.org"
 
-    # Add first
-    res = run_infra(["ingress", "add", domain, ip1, "80"])
+    # 1. Add to DDNS list
+    res = run_infra(["cloudflare", "add-ddns", test_domain])
     assert res.returncode == 0
 
-    # Update (add same domain with different IP)
-    res = run_infra(["ingress", "add", domain, ip2, "8080"])
+    # 2. Verify in list
+    res = run_infra(["cloudflare", "list-ddns"])
+    assert test_domain in res.stdout
+
+    # 3. Remove from list
+    res = run_infra(["cloudflare", "remove-ddns", test_domain])
     assert res.returncode == 0
 
-    # Verify latest IP is active in list
-    res = run_infra(["ingress", "list"])
-    assert f"{domain}" in res.stdout
-    # (The list command prints the be_ backend name or IP depending on implementation)
-
-    # Cleanup
-    run_infra(["ingress", "remove", domain])
+    # 4. Verify gone
+    res = run_infra(["cloudflare", "list-ddns"])
+    assert test_domain not in res.stdout
 
+def test_decommission_command_flow(unique_id):
+    # This tests the command structure and error handling (using non-existent resources)
+    # We expect it to complete even if individual parts "fail" cleanup
+    domain = f"ghost-{unique_id}.com"
+    res = run_infra(["decommission", "--domain", domain])
+    assert res.returncode == 0
+    assert "Decommission process complete" in res.stdout
+
+def test_proxmox_template_resolution():
+    # Verify the alias resolves to something on a known node
+    res = run_infra(["proxmox", "list-lxcs", "--node", "la-vmh-11"])
+    assert res.returncode == 0
+    # The actual resolution happens inside create-lxc, but we can verify the command exists
+
 def test_samba_group_management(unique_id):
     username = f"group_test_{unique_id}"
     password = "TestPassword123!"
     group = "xmpp-users"
 
-    # 1. Add User
-    res = run_infra(["samba", "add-user", username, password])
-    assert res.returncode == 0
-
-    # 2. Add to Group
-    res = run_infra(["samba", "add-to-group", group, username])
-    assert res.returncode == 0
-
-    # 3. Verify (if we implement list-group-members later, for now check return code)
-    # Cleanup
-    # (Samba user deletion not yet implemented in CLI, but user will be stale)
-    pass
-
-def test_proxmox_multi_node_listing():
-    nodes = ["la-vmh-11", "la-vmh-07", "la-vmh-12"]
-    for node in nodes:
-        print(f" Checking node {node}...")
-        res = run_infra(["proxmox", "list-lxcs", "--node", node])
-        assert res.returncode == 0
-        assert "VMID" in res.stdout
-
-def test_router_error_handling():
-    # Test adding with invalid IP
-    res = run_infra(["router", "add", "invalid-ip", "tcp", "80", "999.999.999.999", "80"])
-    assert res.returncode != 0
-    assert "Invalid internal IP address" in res.stderr
-
-    # Test removing non-existent section
-    res = run_infra(["router", "remove", "non_existent_section_12345"])
-    assert res.returncode != 0
-    # Remove
-    res = run_infra(["router", "remove", section], env=env)
-    assert res.returncode == 0
+    # Add User & Group Join
+    assert run_infra(["samba", "add-user", username, password]).returncode == 0
+    assert run_infra(["samba", "add-to-group", group, username]).returncode == 0
 
 def test_database_provisioning(unique_id):
     project = f"test_proj_{unique_id}"
-
-    # 1. Provision
     res = run_infra(["db", "provision", project])
     assert res.returncode == 0
     assert project in res.stdout
 
-    # 2. List and Verify
     res = run_infra(["db", "list-dbs"])
-    assert project in res.stdout
+    assert project.lower().replace("-", "_") in res.stdout
+
+def test_cert_cli():
+    # 1. List
+    res = run_infra(["cert", "list"])
+    assert res.returncode == 0
+    assert "loopaware.com.pem" in res.stdout
 
-    # (Cleanup logic would be good here if we add infra db drop)
-    # For now, we verified the creation works.
+    # 2. Status
+    res = run_infra(["cert", "status"])
+    assert res.returncode == 0
+    assert "notAfter" in res.stdout
+
+def test_ip_discovery():
+    res = run_infra(["ip", "next-free"])
+    assert res.returncode == 0
+    assert "10.32." in res.stdout
\ No newline at end of file
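
Note on `create_user` in `infra_cli/database.py`: the statement is built by interpolating `username` and `password` directly into the SQL text, so a password containing `'` would end the literal early. Below is a minimal sketch of one way to quote both values before the statement is handed to `exec_sql`. It assumes PostgreSQL's default `standard_conforming_strings = on`. The helpers `quote_ident`, `quote_literal`, and `build_create_user` are hypothetical names for illustration and are not part of this patch.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier by doubling embedded double quotes."""
    if not name or "\x00" in name:
        raise ValueError("identifier must be non-empty and contain no NUL byte")
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a string literal by doubling embedded single quotes.

    Assumes standard_conforming_strings is on, so backslashes are literal.
    """
    if "\x00" in value:
        raise ValueError("literal must not contain a NUL byte")
    return "'" + value.replace("'", "''") + "'"


def build_create_user(username: str, password: str) -> str:
    # Hypothetical helper: returns the statement text that exec_sql would run.
    return f"CREATE USER {quote_ident(username)} WITH PASSWORD {quote_literal(password)};"


if __name__ == "__main__":
    print(build_create_user("jdoe", "it's-a-secret"))
```

Quoted identifiers are case-sensitive in PostgreSQL, so callers that rely on lowercase folding (as `test_database_provisioning` appears to) would need to normalize names before quoting them.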