feat: Add professional hierarchical documentation
- Created comprehensive README.md with Mermaid diagrams, badges, and TOC
- Added docs/ directory with 7 sections and 14 markdown files
- Included architecture diagrams, flowcharts, and sequence diagrams
- All documentation is fully interlinked with cross-references
- Added ISO storage location on Proxmox development server
- Included troubleshooting guide and evaluation management docs
- All config files (Packer, Terraform, Ansible, Forgejo) documented
- Added icons and visual elements throughout documentation
parent faf04d69f8
commit e4f03427b7
24 changed files with 3844 additions and 2 deletions
273
docs/07-advanced/troubleshooting.md
Normal file
@@ -0,0 +1,273 @@
# 🔧 Troubleshooting Guide

## Overview

This guide covers common issues and their solutions for the Windows automation pipeline.

---

## Quick Fix Index

| Issue | Phase | Quick Fix |
|-------|-------|-----------|
| Packer timeout | Build | Check Autounattend.xml WinRM config |
| VM won't boot | Provision | Verify ISO paths in Packer |
| Ansible connection | Test | Disable Windows firewall |
| Code signing fails | Build | Verify PFX password |
| Template expired | All | Rebuild with Packer |

---

## Phase 1: Packer Issues

### Timeout Waiting for WinRM

**Symptom:**
```
==> proxmox-iso.windows-11: Timeout waiting for WinRM.
```

**Cause:** Windows not fully booted or WinRM not configured.

**Solution:**
1. Verify `Autounattend.xml` has WinRM configuration
2. Check boot command timing
3. Increase `boot_wait` duration

```hcl
# Increase boot wait
boot_wait = "30s"

# Check boot command
boot_command = [
  "<wait><wait><wait><wait><wait>",
  "<enter><wait><wait>",
  "<enter>"
]
```

---

### ISO Not Found

**Symptom:**
```
==> proxmox-iso.windows-iso: ISO file not found: local:iso/...
```

**Cause:** Wrong ISO path or storage.

**Solution:**
1. Verify ISO location on Proxmox
2. Check storage name (local vs local-lvm)

```bash
# On Proxmox host: list the ISO directory
ls -la /mnt/pve-07-iso-nvme/template/iso/

# List configured storages and their status
pvesm status
```

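
When the storage name is in doubt, it can help to split the ISO reference from the Packer error into its storage and path parts and check each one separately. The following is a minimal sketch, not part of the pipeline: `parse_volid` is a hypothetical helper and `local:iso/example.iso` is a placeholder volume ID. On the Proxmox host, `pvesm list <storage>` then shows what that storage actually contains.

```bash
# Hypothetical helper: split a Proxmox volume ID of the form
# "<storage>:<path>" into its two parts.
parse_volid() {
  volid="$1"
  storage="${volid%%:*}"   # text before the first colon
  path="${volid#*:}"       # text after the first colon
  printf 'storage=%s path=%s\n' "$storage" "$path"
}

# Placeholder volume ID, not taken from the pipeline.
parse_volid "local:iso/example.iso"
# prints: storage=local path=iso/example.iso
```
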
---

## Phase 2: OpenTofu Issues

### Clone Failed

**Symptom:**
```
Error: resource is not a cloneable VM
```

**Cause:** Wrong VM ID or template not found.

**Solution:**
1. Verify template VM exists
2. Check VM ID is correct

```bash
# On Proxmox host (this only matches VMs whose name contains "template")
qm list | grep template
```

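
Proxmox marks a template with a `template: 1` line in its configuration, so checking for that line confirms whether a given VM ID really is a template, regardless of its name. A minimal sketch, assuming `qm config` prints one `key: value` pair per line; `is_template` is a hypothetical helper and `9000` is a placeholder VM ID.

```bash
# Hypothetical helper: read `qm config <VM_ID>` output on stdin and
# succeed only if the VM is flagged as a template.
is_template() {
  grep -q '^template: 1$'
}

# On the Proxmox host (9000 is a placeholder VM ID):
#   qm config 9000 | is_template && echo "9000 is a template"

# Offline demonstration with sample config text:
printf 'cores: 4\ntemplate: 1\n' | is_template && echo "template"
printf 'cores: 4\n' | is_template || echo "not a template"
```
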
---

### Permission Denied

**Symptom:**
```
Error: permission denied (400)
```

**Cause:** Proxmox API token lacks privileges.

**Solution:**
1. Grant the token a role with the required VM privileges, for example the built-in `PVEVMAdmin` role. If privilege separation is enabled, the token needs its own ACL entry.
2. Verify token is not expired

---

## Phase 3: Ansible Issues

### WinRM Connection Timeout

**Symptom:**
```
fatal: [10.0.0.5]: UNREACHABLE! => {"msg": "Connection timeout"}
```

**Cause:** Firewall blocking WinRM or WinRM not configured.

**Solution:**

In `Autounattend.xml`, disable the firewall for the Private profile:
```xml
<SynchronousCommand wcm:action="add">
  <CommandLine>powershell -Command "Set-NetFirewallProfile -Profile Private -Enabled False"</CommandLine>
  <Order>1</Order>
</SynchronousCommand>
```

In the inventory, ignore certificate validation:
```ini
[windows_vm]
10.0.0.5 ansible_winrm_server_cert_validation=ignore
```

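
Before changing firewall settings, it is worth confirming whether the WinRM port is reachable at all. WinRM listens on TCP 5985 (HTTP) and 5986 (HTTPS) by default. The sketch below assumes a Linux control node with `bash` and the coreutils `timeout` command; `port_open` is a hypothetical helper, and `10.0.0.5` is the example address from the symptom above. A failed probe points at the firewall or network path, while a successful one suggests looking at WinRM configuration or credentials instead.

```bash
# Hypothetical helper: succeed if a TCP connection to host:port
# opens within 3 seconds. Relies on bash's /dev/tcp redirection.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage (not executed here):
#   port_open 10.0.0.5 5985 && echo "WinRM HTTP reachable"
#   port_open 10.0.0.5 5986 && echo "WinRM HTTPS reachable"
```
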
---

### Invalid Credentials

**Symptom:**
```
fatal: [10.0.0.5]: UNREACHABLE! => {"msg": "Basic auth failed"}
```

**Cause:** Wrong username or password.

**Solution:**
```bash
# Verify the secret is set (without printing its value)
[ -n "$WIN_ADMIN_PASS" ] && echo "WIN_ADMIN_PASS is set" || echo "WIN_ADMIN_PASS is empty"

# Test manually (winrs is a Windows tool; run this from a bash shell on a Windows host)
winrs -r:10.0.0.5 -u:Administrator -p:"$WIN_ADMIN_PASS" "hostname"
```

---

## Phase 4: Code Signing Issues

### Invalid Certificate

**Symptom:**
```
Error: PKCS12_parse failed
```

**Cause:** Wrong password or corrupted PFX file.

**Solution:**
```bash
# Verify the certificate; a non-zero exit status means the password is wrong or the file is damaged
openssl pkcs12 -in cert.pfx -info -noout -passin "pass:$PFX_PASS"
```

---

## Diagnostic Commands

### Proxmox Diagnostics

```bash
# List VMs
qm list

# Check VM status
qm status <VM_ID>

# View VM config
qm config <VM_ID>

# Check storage
pvesm status
```

### Windows Diagnostics

```powershell
# Check WinRM status (quickconfig also offers to enable WinRM if it is off)
winrm quickconfig
Get-Service WinRM

# Check firewall
Get-NetFirewallProfile | Select-Object Name, Enabled

# Check activation
slmgr /dlv
```

### Ansible Diagnostics

```bash
# Test WinRM connection
ansible windows_vm -m win_ping -i inventory.ini

# Verbose output
ansible-playbook -i inventory.ini pipeline.yml -vvvv
```

---

## Log Locations

| Component | Log Location |
|-----------|--------------|
| Packer | Console output + `packer.log` (written when `PACKER_LOG=1` and `PACKER_LOG_PATH=packer.log` are set) |
| OpenTofu | Console output + the file named by `TF_LOG_PATH` (when `TF_LOG` is set) |
| Ansible | Console output + `/var/log/ansible.log` (written when `log_path` points there) |
| Windows | Event Viewer → System |

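
None of the three command-line tools writes a log file by default, so persistent logs have to be switched on through environment variables before a run. A minimal sketch, assuming a shared `logs/` directory and the file names shown (both hypothetical choices); `PACKER_LOG`, `PACKER_LOG_PATH`, `TF_LOG`, `TF_LOG_PATH`, and `ANSIBLE_LOG_PATH` are the variables the tools themselves read.

```bash
# Assumed log directory; adjust to the pipeline's layout.
LOG_DIR="${LOG_DIR:-logs}"
mkdir -p "$LOG_DIR"

# Packer: enable logging and send it to a file.
export PACKER_LOG=1
export PACKER_LOG_PATH="$LOG_DIR/packer.log"

# OpenTofu: log level plus target file.
export TF_LOG=DEBUG
export TF_LOG_PATH="$LOG_DIR/tofu.log"

# Ansible: equivalent to log_path in ansible.cfg.
export ANSIBLE_LOG_PATH="$LOG_DIR/ansible.log"
```
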
---

## FAQ

### Q: Can I use a different Windows edition?

**A:** Yes, but you need to:
1. Update ISO in `packer/windows.pkr.hcl`
2. Modify `Autounattend.xml` for that edition
3. Update product key settings

### Q: How do I add more software to the template?

**A:** Add PowerShell provisioners:

```hcl
provisioner "powershell" {
  inline = [
    "choco install -y 7zip git vscode",
    "& 'C:\\ProgramData\\Chocolatey\\bin\\choco.exe' install -y dotnetfx"
  ]
}
```

### Q: The VM has no IP after provisioning

**Cause:** DHCP not working or VirtIO drivers missing.

**Solution:**
1. Ensure VirtIO drivers are installed in template
2. Verify network bridge is correct

---

## Next Steps

| Goal | Next Document |
|------|---------------|
| Manage evaluations | [Evaluation Management](evaluation.md) |
| View pipeline | [Forgejo Workflows](../06-ci-cd/forgejo-workflows.md) |
| Full documentation | [Documentation Index](../index.md) |

---

[← Documentation Index](../index.md) | [← Evaluation Management](evaluation.md) | [← Home](../index.md)