
# Playbook: Docker Image Security Audit

Use this playbook to audit Docker images across your environment for vulnerabilities, misconfigurations, embedded secrets, and any other issue that could compromise the host or network.

When invoked, read `containers.local.txt` in the current working directory and work through each section for every image listed.

---

## How to Use

Tell the AI: _"Run this playbook: https://git.chns.tech/CHNS/AI/raw/branch/main/playbooks/docker-security-audit.md"_

The AI will:
1. Read `containers.local.txt` to get the list of images to audit
2. For each image — pull it, run all checks below, and evaluate results
3. For each finding — report CRITICAL, HIGH, MEDIUM, LOW, or INFO
4. Generate a dated report folder under `reports/` containing an overview report and one detail report per image
5. Remove the image from the local host after scanning **unless** it is currently active (`docker ps`)
6. At the end, give an overall environment summary

Do not skip or abbreviate checks. Run every tool listed for every image.

---

## Pre-Flight

Before scanning any images:

- [ ] Confirm `trivy` is installed: `trivy --version`
- [ ] Confirm `grype` is installed: `grype version`
- [ ] Confirm `hadolint` is installed: `hadolint --version`
- [ ] Confirm Docker daemon is running: `docker info`
- [ ] Confirm a `reports/` directory exists in the working directory

**AI Action:** Run the above checks. If any tool is missing, stop and tell the user what needs to be installed before continuing. Do not proceed with a partial toolset.
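
The pre-flight checks above could be scripted roughly as follows. This is a minimal sketch in POSIX shell; the function names `missing_tools` and `preflight` are illustrative and not part of any of the tools.

```sh
# Print the name of every required tool that is not on PATH, one per line.
# Prints nothing when all tools are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Pre-flight gate: returns non-zero (and says why) if anything is missing.
preflight() {
  missing=$(missing_tools trivy grype hadolint docker)
  if [ -n "$missing" ]; then
    printf 'Missing tools, install before continuing:\n%s\n' "$missing" >&2
    return 1
  fi
  docker info >/dev/null 2>&1 || { echo 'Docker daemon is not reachable' >&2; return 1; }
  [ -d reports ] || { echo 'reports/ directory not found' >&2; return 1; }
}
```

Calling `preflight || exit 1` at the top of a wrapper script stops the audit before any image is pulled, which matches the "no partial toolset" rule.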

### containers.local.txt Bootstrap

Check if `containers.local.txt` exists in the current working directory.

- **If it does not exist:** Create it with the following content, then stop and instruct the user to populate it before running the audit:

```
# Local Docker image inventory
# Format: image:tag — one per line
# This file is excluded from git. Keep it updated per host.
# Images without a tag will be treated as :latest (flagged as WARN in audit).

```

  Tell the user: _"`containers.local.txt` was not found and has been created. Add your image:tag entries (one per line) and re-run the playbook."_

- **If it exists but is empty or contains only comments:** Same as above — stop and prompt the user to add images.
- **If it exists and has valid entries:** Continue to scanning.
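
One way to read the inventory is sketched below. It assumes one reference per line, that `#` starts a comment, and that an untagged entry should be normalized to `:latest` with a warning. The function name `list_images` is hypothetical.

```sh
# Print normalized image references from an inventory file, one per line.
# Skips blank lines and comments; appends :latest to untagged entries and
# warns on stderr so the audit can record the unpinned tag.
list_images() {
  while IFS= read -r line || [ -n "$line" ]; do
    # Drop comments and all whitespace (image references contain neither).
    ref=$(printf '%s' "${line%%#*}" | tr -d '[:space:]')
    [ -n "$ref" ] || continue
    # A tag or digest can only appear in the last path component, so a
    # registry port such as registry:5000/app is not mistaken for a tag.
    case "${ref##*/}" in
      *:*|*@*) ;;
      *) printf 'WARN: %s has no tag, assuming :latest\n' "$ref" >&2
         ref="$ref:latest" ;;
    esac
    printf '%s\n' "$ref"
  done < "$1"
}
```

If `list_images containers.local.txt` prints nothing, the file is empty or contains only comments, which is the "stop and prompt the user" case.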

---

## Per-Image Procedure

Repeat this entire procedure for **every** `image:tag` entry in `containers.local.txt`.

### Step 1 — Check if Already Active

Run: `docker ps --format '{{.Image}}'`

- If the image is in the active list → mark as **ACTIVE**, skip Step 8 (Image Cleanup) and do not remove the image
- If not active → mark as **NOT ACTIVE**, will be removed after scan
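
The active check can be sketched as an exact-match lookup against the `docker ps` output. The helper names `image_in_list` and `image_status` are illustrative.

```sh
# Succeeds if the given image reference appears verbatim in the list on stdin.
# -F: fixed string, -x: whole line, -q: quiet; "--" guards odd leading characters.
image_in_list() {
  grep -Fxq -- "$1"
}

# Classify an image against the running containers.
image_status() {
  if docker ps --format '{{.Image}}' | image_in_list "$1"; then
    echo 'ACTIVE'
  else
    echo 'NOT ACTIVE'
  fi
}
```

`docker ps` prints the image reference as it was given when the container was created, so a container started from an image ID or an untagged name will not match `image:tag` exactly. When in doubt, treat the image as ACTIVE rather than risk removing it.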

### Step 2 — Pull the Image

Run: `docker pull <image:tag>`

- If pull fails → mark as WARN (image may be private, unavailable, or tag deleted) and skip to next image
- Note the image digest and creation date
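
A possible way to capture the digest and creation date after the pull is shown below. The function name `pull_and_describe` is hypothetical; `RepoDigests` and `Created` are fields of `docker image inspect` output.

```sh
# Pull an image and print its repo digests and creation timestamp.
# Returns non-zero if the pull fails so the caller can record a WARN
# and move on to the next image.
pull_and_describe() {
  docker pull "$1" >/dev/null || return 1
  docker image inspect --format '{{join .RepoDigests ","}} {{.Created}}' "$1"
}
```

A caller might use `pull_and_describe "$image" || echo "WARN: pull failed for $image"` and continue with the next entry.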

### Step 3 — CVE Scan (Trivy)

Run: `trivy image --severity UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL --format json <image:tag>`

Report findings grouped by severity:

| Severity | Count | Notable CVEs |
|---|---|---|
| CRITICAL | | |
| HIGH | | |
| MEDIUM | | |
| LOW | | |
| UNKNOWN | | |

- List every CRITICAL and HIGH CVE individually with: CVE ID, affected package, fixed version (if available), and a one-line description of the risk
- Summarise MEDIUM/LOW as counts only unless the user asks for detail
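- Flag any CVE with a known public exploit. Trivy does not flag exploit availability on its own (`--ignore-unfixed` only hides vulnerabilities that have no fix), so cross-check CRITICAL and HIGH IDs against an external source such as the CISA Known Exploited Vulnerabilities catalog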
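
The severity table can be filled from the JSON report with a small helper. This sketch assumes `jq` is installed (it is not part of the pre-flight list) and that the Trivy output was saved to a file, for example with `--output trivy.json`. The function names are illustrative.

```sh
# Count findings per severity in a saved Trivy JSON report.
# Output: one "SEVERITY COUNT" line per severity that has findings.
severity_counts() {
  jq -r '[.Results[]?.Vulnerabilities[]?.Severity]
         | group_by(.) | map("\(.[0]) \(length)") | .[]' "$1"
}

# List each CRITICAL and HIGH finding as tab-separated fields:
# CVE ID, package, installed version, fixed version (or "none").
critical_high() {
  jq -r '.Results[]?.Vulnerabilities[]?
         | select(.Severity == "CRITICAL" or .Severity == "HIGH")
         | [.VulnerabilityID, .PkgName, .InstalledVersion, (.FixedVersion // "none")]
         | @tsv' "$1"
}
```

The `?` operators keep the filters from failing on result entries that carry no `Vulnerabilities` key, which is how a clean target appears in the report.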

### Step 4 — CVE Scan (Grype)

Run: `grype <image:tag> --output json`

- Cross-reference with Trivy results — note any findings unique to Grype that Trivy missed
- Report new findings in the same severity table format as Step 3
- If Grype and Trivy agree, note "Confirmed by both scanners"
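
The cross-reference amounts to a set difference over vulnerability IDs. A minimal sketch, assuming the IDs from each scanner have already been written to plain text files, one ID per line (the function name `only_in_second` is hypothetical):

```sh
# Print IDs present in the second list but not the first, de-duplicated.
# With Trivy IDs first and Grype IDs second, the output is "unique to Grype".
only_in_second() {
  grep -Fxv -f "$1" "$2" | sort -u
}
```

With `jq` available, the ID lists could be extracted using `.Results[]?.Vulnerabilities[]?.VulnerabilityID` for Trivy and `.matches[].vulnerability.id` for Grype. Grype sometimes reports a GHSA identifier where Trivy reports the CVE, so an apparent "unique" finding may be the same issue under another name; check the related vulnerability IDs in the Grype record before counting it as new.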

### Step 5 — Dockerfile Static Analysis (Hadolint)

Locate the Dockerfile for this image if available in the current working directory or a subdirectory.

Run: `hadolint <path/to/Dockerfile>`

- Report every rule violation with: rule ID, severity, line number, and description
- Flag `DL` (Dockerfile) and `SC` (ShellCheck) rule violations separately
- If no Dockerfile is found locally, note it as INFO and skip this step — do not fail the audit
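
Separating `DL` and `SC` findings can be done on the rule ID prefix. The sketch below assumes findings arrive as tab-separated lines (code, level, line, message); the function names are illustrative.

```sh
# Map a Hadolint rule ID to the family the playbook reports it under.
rule_family() {
  case "$1" in
    DL*) echo 'Dockerfile' ;;
    SC*) echo 'ShellCheck' ;;
    *)   echo 'Other' ;;
  esac
}

# Read tab-separated findings (code, level, line, message) on stdin and
# prefix each line with its rule family.
tag_findings() {
  while IFS="$(printf '\t')" read -r code level line message; do
    printf '%s\t%s\t%s\t%s\t%s\n' \
      "$(rule_family "$code")" "$code" "$level" "$line" "$message"
  done
}
```

If `jq` is installed, one way to feed it is `hadolint --format json Dockerfile | jq -r '.[] | [.code, .level, .line, .message] | @tsv' | tag_findings`. Hadolint exits non-zero when it reports violations, so a non-zero exit alone should not be treated as a failed scan.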

### Step 7 — Extended Checks (Trivy)

Run each of the following and report all findings:

**Secrets scan:**
`trivy image --scanners secret <image:tag>`
- Report any detected secrets: type, file path inside image, line number if available
- Trivy assigns each secret rule a severity rather than a confidence score; report every match, including LOW, and clearly flag LOW matches as low confidence

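**Misconfiguration scan:**
`trivy image --scanners misconfig <image:tag>`
- Dockerfile best practice violations
- CIS Docker Benchmark failures
- Report each finding with: check ID, title, severity, description, and remediation
- Note: `--scanners misconfig` inspects configuration files inside the image filesystem. Checks against the image's own build configuration use `--image-config-scanners misconfig`, and CIS Docker Benchmark results come from Trivy's `--compliance` option. Confirm both against the installed Trivy version and record which scans were actually run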

**License scan:**
`trivy image --scanners license --severity UNKNOWN,HIGH,CRITICAL <image:tag>`
- Flag any unknown or restrictive licenses (GPL in a closed deployment, etc.)

**SBOM generation:**
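`trivy image --format cyclonedx --output reports/audit-<YYYY-MM-DD>/<image-name>/sbom-<image-name>-<tag>.json <image:tag>`
- Create the `reports/audit-<YYYY-MM-DD>/<image-name>/` folder first if it does not exist
- Generate and save an SBOM for the image. There is no pass/fail; this is for records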

### Step 8 — Image Cleanup

After all scans are complete for this image:

- If **NOT ACTIVE**: run `docker rmi <image:tag>`
  - Confirm removal succeeded
  - If removal fails (e.g. stopped container referencing it), note the reason and flag for manual cleanup
- If **ACTIVE**: skip removal, note it in the report
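
The cleanup decision could be wrapped as below. This is a sketch; `cleanup_image` is a hypothetical name, and its second argument is the ACTIVE / NOT ACTIVE status recorded in Step 1.

```sh
# Remove an image unless it is ACTIVE. Prints the value for the report's
# "Removed" column: YES, NO (skipped because active), or FAILED.
cleanup_image() {
  image=$1 status=$2
  if [ "$status" = 'ACTIVE' ]; then
    echo 'NO'
  elif docker rmi "$image" >/dev/null 2>&1; then
    echo 'YES'
  else
    echo 'FAILED'
  fi
}
```

The sketch discards the error text from `docker rmi`. To record the reason for a failed removal, capture stderr instead of redirecting it to `/dev/null`.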

---

## Report Generation

After all images are scanned, generate the following files. All output goes into a dated folder:

**Folder structure:**

```
reports/audit-<YYYY-MM-DD>/
  audit-<YYYY-MM-DD>.md          ← overview report
  <image-name>/                  ← one folder per image (name only, no tag, slashes replaced with -)
    <image-name>-<tag>.md        ← per-image detail report
    sbom-<image-name>-<tag>.json ← SBOM
```

Create `reports/audit-<YYYY-MM-DD>/` and each per-image subfolder before writing any files. If folders already exist, continue writing into them.

---

### File 1 — Overview Report (replaces previous audit file)

**Filename:** `reports/audit-<YYYY-MM-DD>/audit-<YYYY-MM-DD>.md`

This is the top-level summary. It must link to each per-image detail file.

#### Summary Table

| Image | CRITICAL | HIGH | MEDIUM | LOW | Secrets | Misconfigs | Hadolint | Status | Removed | Detail File |
|---|---|---|---|---|---|---|---|---|---|---|
| image:tag | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ACTIVE/NOT ACTIVE | YES/NO/FAILED | [image-name/image-name-tag.md](image-name/image-name-tag.md) |

#### Critical & High Findings (All Images)

List every CRITICAL or HIGH CVE across all images — abbreviated to one line each:

```
[image:tag] CVE-XXXX-XXXXX — <package> — <one-line risk> — Fixed in: <version or "none">
```

#### Secrets Detected (All Images)

List every secret finding across all images — one line each:

```
[image:tag] <type> — <file path> — Confidence: HIGH/MEDIUM/LOW — Action: INVESTIGATE/ROTATE IMMEDIATELY
```

#### Cleanup Log

List every image and whether it was removed, skipped (active), or failed to remove.

#### Overall Risk Rating

After all findings:

- **CRITICAL findings present** → Environment status: **AT RISK** — immediate action required
- **HIGH findings only** → Environment status: **ELEVATED** — remediate within 7 days
- **MEDIUM/LOW only** → Environment status: **ACCEPTABLE** — remediate in next maintenance window
- **No findings** → Environment status: **CLEAN**
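
The rating rules above reduce to a first-match cascade over the total CVE counts. A minimal sketch, assuming the four totals across all images are already known and that secrets and misconfigurations are rated separately (the function name `risk_rating` is hypothetical):

```sh
# Map total CRITICAL, HIGH, MEDIUM, and LOW counts to an environment status.
# The first matching rule wins, so HIGH is only considered when CRITICAL is 0.
risk_rating() {
  critical=$1 high=$2 medium=$3 low=$4
  if [ "$critical" -gt 0 ]; then echo 'AT RISK'
  elif [ "$high" -gt 0 ]; then echo 'ELEVATED'
  elif [ $((medium + low)) -gt 0 ]; then echo 'ACCEPTABLE'
  else echo 'CLEAN'
  fi
}
```

For example, `risk_rating 0 2 14 30` prints `ELEVATED`, since the sketch reads "HIGH findings only" as "no CRITICAL findings".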

---

### File 2 — Per-Image Detail Reports

**Filename per image:** `reports/audit-<YYYY-MM-DD>/<image-name>/<image-name>-<tag>.md`

- `<image-name>` is the image name only — no tag, slashes replaced with `-` (e.g. `nginx`, `myrepo-myapp`)
- `<tag>` is the image tag (e.g. `latest`, `1.2.3`)
- Example: `nginx:latest` → `reports/audit-2026-03-22/nginx/nginx-latest.md`
- Example: `myrepo/myapp:1.2.3` → `reports/audit-2026-03-22/myrepo-myapp/myrepo-myapp-1.2.3.md`
- Generate one file per image — do not combine images into a single detail file
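
The naming rules above can be expressed as a small helper. This sketch assumes every reference carries a tag (untagged entries having been normalized to `:latest` beforehand); the function name `report_path` is hypothetical.

```sh
# Derive the per-image detail report path from an image:tag reference
# and an audit date in YYYY-MM-DD form.
report_path() {
  ref=$1 day=$2
  tag=${ref##*:}                                      # text after the last colon
  name=$(printf '%s' "${ref%:*}" | tr '/' '-')        # drop tag, replace slashes
  printf 'reports/audit-%s/%s/%s-%s.md\n' "$day" "$name" "$name" "$tag"
}
```

`report_path myrepo/myapp:1.2.3 2026-03-22` prints `reports/audit-2026-03-22/myrepo-myapp/myrepo-myapp-1.2.3.md`, matching the example above. A registry host with a port (for example `registry:5000/app:1.0`) would keep its colon in the folder name, which the playbook does not address.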

Each per-image file must include:

#### Header

```
# Image Audit: <image:tag>
Date:   <YYYY-MM-DD>
Digest: <image digest>
Status: ACTIVE / NOT ACTIVE
```

#### CVE Findings (Trivy + Grype)

Full detail for every CRITICAL and HIGH CVE:

```
CVE:      CVE-XXXX-XXXXX
Package:  <package name and version>
Fixed in: <version> (or "No fix available")
Scanner:  Trivy / Grype / Both
Risk:     <one-line plain-English description>
```

Summarise MEDIUM and LOW as counts with a table — do not list individually unless they have a known public exploit.

#### Secrets

Full detail for every secret finding:

```
Type:       <e.g. AWS key, generic password, private key>
Confidence: HIGH / MEDIUM / LOW
Path:       <file path inside image>
Action:     INVESTIGATE / ROTATE IMMEDIATELY
```

#### Hadolint Findings

Full detail for every Hadolint rule violation:

```
Rule:     <DL or SC rule ID>
Line:     <line number in Dockerfile>
Severity: <error / warning / info>
Detail:   <description>
```

If no Dockerfile was found, note: `Hadolint: Skipped — no Dockerfile located`

#### Misconfigurations

Full detail for every misconfiguration finding:

```
Check:    <CIS ID or Trivy check ID>
Title:    <short title>
Severity: <severity>
Detail:   <description>
Fix:      <remediation>
```

#### License Findings

List any unknown or restrictive licenses flagged by Trivy.

#### SBOM

Note the SBOM filename saved alongside this report:

`SBOM saved: reports/audit-<YYYY-MM-DD>/<image-name>/sbom-<image-name>-<tag>.json`

Generate the SBOM directly into this folder:
`trivy image --format cyclonedx --output reports/audit-<YYYY-MM-DD>/<image-name>/sbom-<image-name>-<tag>.json <image:tag>`

#### Cleanup

Note whether the image was removed, skipped (active), or failed to remove — and the reason if failed.

---

## Severity Reference

| Severity | Action Required |
|---|---|
| CRITICAL | Stop and flag immediately. Do not defer. Notify user before continuing. |
| HIGH | Report in full. User must acknowledge before moving to next image. |
| MEDIUM | Report and continue. Include in final report. |
| LOW | Count only unless user requests detail. |
| INFO | Include in SBOM/records only. |

---

## Notes for the AI

- Never skip an image without documenting why
- Never abbreviate CVE lists for CRITICAL or HIGH findings
- If a scan command fails (not just returns no findings), report the error and the command that failed
- Do not remove an image that appears in `docker ps` output under any circumstances
- If `containers.local.txt` references an image with no tag, assume `:latest` and note it as a WARN (unpinned tag)
- SBOM files are supplemental — save them but do not report their contents unless the user asks