---
stepsCompleted: ["step-01-validate-prerequisites", "step-02-design-epics", "step-03-create-stories", "step-04-final-validation"]
inputDocuments:
  - "/var/www/html/ai_cats/_bmad-output/planning-artifacts/prd.md"
  - "/var/www/html/ai_cats/_bmad-output/planning-artifacts/architecture.md"
  - "/var/www/html/ai_cats/_bmad-output/planning-artifacts/ux-design-specification.md"
---

# AI-Assisted Product Categorization System - Epic Breakdown

## Overview

This document provides the complete epic and story breakdown for the AI-Assisted Product Categorization System, decomposing the requirements from the PRD, UX Design, and Architecture into implementable stories.

## Requirements Inventory

### Functional Requirements

FR1: Data stewards can submit products from Akeneo for AI-powered categorization
FR2: System can categorize products using multiple data sources (part number patterns, manufacturer categories, enrichment data, product descriptions, specifications)
FR3: System can generate confidence scores for each categorization prediction
FR4: System can route predictions below 90% confidence to Akeneo Collaborative Workflow for human review
FR5: System can auto-approve predictions at or above 90% confidence threshold
FR6: Data stewards can review categorization predictions with explainable evidence
FR7: Data stewards can approve or reject AI-suggested categories in Akeneo workflow
FR8: Data stewards can adjust categories for low-confidence predictions
FR9: System can process categorization in batches with status tracking
FR10: System can generate explainable evidence showing reasoning behind each prediction
FR11: System can display evidence trails linking predictions to source data
FR12: System can track confidence score distribution across categorized products
FR13: System can maintain audit trail of all categorization decisions and evidence
FR14: System can receive product data from Akeneo via API
FR15: System can submit categorization results back to Akeneo via API
FR16: System can integrate with Akeneo Collaborative Workflow for human review
FR17: System can track Akeneo workflow approval and rejection outcomes
FR18: System can handle Akeneo API rate limiting and retry logic
FR19: System can import supplier enrichment data via CSV files
FR20: System can validate and normalize imported enrichment data
FR21: System can store product and attribute data in PostgreSQL
FR22: System can preserve original supplier values for traceability
FR23: System can handle SKU/MPN/part numbers as strings without numeric coercion
FR24: System administrators can configure Akeneo API credentials
FR25: System administrators can configure confidence threshold settings
FR26: System can provide configuration dashboard for system settings
FR27: System can manage user authentication and session management
FR28: System can schedule automated backups of system data
FR29: System can log all categorization decisions with timestamps
FR30: System can log errors and failures with detailed context
FR31: System can provide health monitoring dashboard with system metrics
FR32: System can retry failed categorization batches automatically
FR33: System can provide troubleshooting tools for error investigation
FR34: System can send alerts for integration failures and system issues
FR35: System can display categorization status and progress to users

### NonFunctional Requirements

NFR1: Dashboard pages load within 2 seconds on modern browsers
NFR2: Batch categorization processes 5-10 seconds per SKU (100 SKUs = 500-1000 seconds total)
NFR3: Status polling updates display within 5-10 seconds during batch processing
NFR4: CSV enrichment data import completes within 30 seconds for files up to 10,000 rows
NFR5: Akeneo API credentials stored in environment variables, never in code or version control
NFR6: Admin dashboard requires authentication for access
NFR7: All data transmitted to/from Akeneo via HTTPS
NFR8: PostgreSQL database access restricted to application credentials only
NFR9: System logs do not expose sensitive data (API credentials, personal information)
NFR10: Akeneo API calls timeout after 30 seconds with retry logic
NFR11: System handles Akeneo API rate limiting with exponential backoff
NFR12: API data format compatibility maintained with Akeneo version changes
NFR13: Failed API integrations trigger alerts within 5 minutes
NFR14: System maintains 99% uptime during business hours
NFR15: Automated backups complete daily with 1-hour recovery time objective
NFR16: Failed categorization batches automatically retry up to 3 times
NFR17: System maintains audit trail integrity with no data loss

### Additional Requirements

- Project built from scratch with no CLI starter — first story initializes the full project structure
- Composer init + Phinx migration setup as the foundational build tooling
- `.env` / `.env.example` environment configuration pattern must be in place before any credential-dependent stories
- `public/index.php` as application entry point with PHP server-side routing
- Phinx (PHP) as the sole database migration framework — all schema changes go through Phinx migrations
- Session-based authentication with admin/user roles — must be implemented before any protected routes
- CSRF protection + input validation middleware required for all form submissions
- Hybrid PostgreSQL data model: normalized core tables (products, predictions, import_jobs, users) + JSONB columns for flexible data (enrichment data, evidence storage)
- Consistent JSON API response envelope: `{"success": true/false, "data/error": {...}}`
- Application-level file-based logging to `logs/app.log` and `logs/error.log` — no external logging dependency
- Manual deployment via SSH + git pull — no CI/CD pipeline required
- PHP 8.1+ minimum version — agents must not use features unavailable in PHP 8.1
- Python directory structure must be defined and established before the categorization epic begins (architecture gap — Python/PHP boundary not fully specified)
- Queue/worker table structure must be defined before any long-running background job stories (architecture gap — schema not yet specified)
- HTTP client for Akeneo API calls must be selected: Guzzle or native cURL (not yet specified in architecture)
- PHPUnit as test framework for PHP unit and integration tests
- Browser caching headers + asset minification for static assets to meet NFR1 (2-second page load)
- Bootstrap 5 vendored into `public/assets/` — no CDN dependency in production

### UX Design Requirements

UX-DR1: Implement Bootstrap 5 design system — vendor Bootstrap 5 CSS and JS into `public/assets/`; use light mode as default (no `data-bs-theme` attribute, or `data-bs-theme="light"` on `<html>`); override Bootstrap CSS variables in `public/assets/css/styles.css` to match the light palette (body-bg `#f6f8fa`, body-color `#1f2328`, border-color `#d0d7de`, primary `#0969da`)
UX-DR2: Implement confidence badge system — three custom badge variants (`badge-confidence-high` green `#3fb950` for ≥90%, `badge-confidence-medium` amber `#d29922` for 70–89%, `badge-confidence-low` red `#f85149` for <70%) using Bootstrap CSS variable overrides
UX-DR3: Implement semantic status label system — pill badges with fixed, non-reused color-to-status mapping for all 8 product statuses: Predicted (blue), Needs Review (amber), Submitted to Akeneo Workflow (blue), Awaiting Akeneo Approval (amber), Approved in Akeneo (green), Rejected in Akeneo (red), Auto-Approved (muted gray), Import Failed (red)
UX-DR4: Implement fixed left sidebar navigation (220px width) with queue-first primary nav — default navigation destination is the Predictions review queue, not a summary dashboard
UX-DR5: Implement review queue page — full-width Bootstrap `.table-sm` table with per-row: product name/SKU (monospace for identifiers), suggested category, confidence badge, top 2 evidence signals inline; row count display ("X items awaiting review"); default filter shows only `needs_review` status items
UX-DR6: Implement inline expandable evidence rows — clicking a row expands an inline evidence trail showing full reasoning and missing attributes without opening a modal or leaving the list view
UX-DR7: Implement bulk select + floating action bar — per-row checkboxes; floating action bar appears on selection with "Submit X items to Akeneo workflow" and "Adjust category" actions
UX-DR8: Implement AJAX status polling for batch and import jobs — auto-refresh status at 5–10 second intervals without manual page reload; status badges update in-place via fetch
UX-DR9: Implement inline category adjustment — clicking "Adjust category" opens an inline category picker within the row; save confirms with immediate status badge update; no full-page reload or modal
UX-DR10: Implement empty state completion screen — when review queue empties, show "All predictions reviewed. Nothing waiting on you." with a summary count of auto-approved, submitted, and adjusted items
UX-DR11: Implement typography system — system font stack (`-apple-system, BlinkMacSystemFont, "Segoe UI", ...`) for all body text; monospace stack (`"SFMono-Regular", Consolas, ...`) specifically for SKU/MPN/part number fields
UX-DR12: Implement page layout structure — fixed sidebar (220px) + main content area (max-width 1200px centered); dashboard: 3-column stat card row + full-width table below; admin settings: 2-column form layout
UX-DR13: Implement human-readable error display — errors shown as banners below the affected row, never raw SQL/PHP errors; every UI error message must include a suggested next action written for data stewards
UX-DR14: Implement visible queue counter — counter displaying remaining items that decrements in real time as items are acted on during a review session
UX-DR15: Implement persistent Akeneo submission confirmation — submission triggers an immediate, persistent status badge update (not a disappearing toast); status must reflect actual Akeneo state, not just "submitted"

### FR Coverage Map

```
FR1  → Epic 5 — Submit products from Akeneo for AI categorization (batch initiation)
FR2  → Epic 5 — Categorize using multiple data sources (categorization engine)
FR3  → Epic 5 — Generate confidence scores per prediction
FR4  → Epic 5 — Route predictions <90% to Akeneo Collaborative Workflow
FR5  → Epic 5 — Auto-approve predictions ≥90%
FR6  → Epic 6 — Review predictions with explainable evidence (review queue UI)
FR7  → Epic 6 — Approve/reject AI-suggested categories via Akeneo workflow
FR8  → Epic 6 — Adjust categories for low-confidence predictions (inline adjustment)
FR9  → Epic 5 — Batch processing with status tracking
FR10 → Epic 5 — Generate explainable evidence for each prediction
FR11 → Epic 5 — Display evidence trails linking predictions to source data
FR12 → Epic 5 — Track confidence score distribution
FR13 → Epic 5 — Maintain full audit trail of categorization decisions
FR14 → Epic 4 — Receive product data from Akeneo via API
FR15 → Epic 4 — Submit categorization results back to Akeneo via API
FR16 → Epic 4 — Integrate with Akeneo Collaborative Workflow for human review
FR17 → Epic 4 — Track Akeneo workflow approval and rejection outcomes
FR18 → Epic 4 — Handle Akeneo API rate limiting and retry logic
FR19 → Epic 3 — Import supplier enrichment data via CSV files
FR20 → Epic 3 — Validate and normalize imported enrichment data
FR21 → Epic 3 — Store product and attribute data in PostgreSQL
FR22 → Epic 3 — Preserve original supplier values for traceability
FR23 → Epic 3 — Handle SKU/MPN/part numbers as strings (no numeric coercion)
FR24 → Epic 2 — Configure Akeneo API credentials
FR25 → Epic 2 — Configure confidence threshold settings
FR26 → Epic 2 — Configuration dashboard for system settings
FR27 → Epic 1 — User authentication and session management
FR28 → Epic 2 — Schedule automated backups
FR29 → Epic 5 — Log all categorization decisions with timestamps
FR30 → Epic 2 — Log errors and failures with detailed context
FR31 → Epic 2 — Health monitoring dashboard with system metrics
FR32 → Epic 5 — Retry failed categorization batches automatically
FR33 → Epic 2 — Troubleshooting tools for error investigation
FR34 → Epic 4 — Send alerts for integration failures
FR35 → Epic 5 — Display categorization status and progress to users
```

## Epic List

### Epic 1: Project Foundation & Authenticated Access
A working, deployable application skeleton that admins can log into. Establishes the full project structure (Composer, Phinx, `.env`, Bootstrap 5 light-mode design system, sidebar navigation, routing) and session-based authentication with admin/user roles.

**User outcome:** An admin can navigate to the application, log in, and see the main layout — the foundation all other epics build on.
**FRs covered:** FR27
**Additional reqs:** All project initialization requirements, PHP 8.1+ pin, Phinx, `.env`, Bootstrap 5 light-mode design system, sidebar layout, typography
**UX-DRs:** UX-DR1 (Bootstrap 5 light mode), UX-DR4 (sidebar navigation), UX-DR11 (typography), UX-DR12 (page layout)

### Epic 2: System Administration & Monitoring
Admins can configure the system (Akeneo credentials, confidence thresholds), monitor system health, review error logs, access troubleshooting tools, and schedule automated backups — all from the admin dashboard.

**User outcome:** An admin has full operational control and visibility into the system without needing server log access.
**FRs covered:** FR24, FR25, FR26, FR28, FR30, FR31, FR33
**UX-DRs:** UX-DR3 (status label system), UX-DR13 (human-readable error display)

### Epic 3: Supplier Enrichment Data Import
Admins and data stewards can upload supplier enrichment CSV files, see validation results immediately, track import job status, and diagnose row-level failures from the UI — with full string-safe part number handling and original value preservation.

**User outcome:** Enrichment data is in the database and available for categorization, with full visibility into what was imported and what failed.
**FRs covered:** FR19, FR20, FR21, FR22, FR23
**UX-DRs:** UX-DR8 (AJAX job status polling), UX-DR13 (human-readable error banners)

### Epic 4: Akeneo Integration & Workflow Connectivity
The system can pull products from Akeneo via API, submit categorization results back, participate in Akeneo Collaborative Workflow for human review, track approval and rejection outcomes, and handle rate limiting and integration failures with alerts.

**User outcome:** Products flow in from Akeneo, decisions flow back, and the system stays in sync with Akeneo's approval state automatically.
**FRs covered:** FR14, FR15, FR16, FR17, FR18, FR34
**UX-DRs:** UX-DR15 (persistent submission confirmation status)

### Epic 5: AI Categorization Engine
The system can run batch categorization against enriched product data, generate confidence scores and explainable evidence for each prediction, auto-approve at ≥90%, route lower-confidence predictions to Akeneo Collaborative Workflow, maintain a full audit trail of all decisions, and automatically retry failed batches.

**User outcome:** Products pulled from Akeneo are categorized with confidence scores and evidence, decisions are correctly routed, and everything is auditable.
**FRs covered:** FR1, FR2, FR3, FR4, FR5, FR9, FR10, FR11, FR12, FR13, FR29, FR32, FR35
**UX-DRs:** UX-DR2 (confidence badge system), UX-DR8 (batch progress polling)
**Note:** First story resolves the Python/PHP boundary architecture gap before AI implementation begins.

---

## Epic 1: Project Foundation & Authenticated Access

A working, deployable application skeleton that admins can log into. Establishes the full project structure (Composer, Phinx, `.env`, Bootstrap 5 light-mode design system, sidebar navigation, routing) and session-based authentication with admin/user roles.

### Story 1.1: Project Scaffold & Design System

As a developer,
I want a fully scaffolded project with a running application skeleton and Bootstrap 5 design system in place,
So that all future stories have a consistent, runnable foundation to build on.

**Acceptance Criteria:**

**Given** the repository is cloned to a PHP 8.1+ server with PostgreSQL available
**When** `composer install` is run and the web server is pointed to `public/`
**Then** the application responds to HTTP requests without errors
**And** a `.env.example` file exists at the project root documenting all required environment variables
**And** a `.env` file (not committed) can be created from `.env.example` to configure the local environment

**Given** the application is running with a valid `.env`
**When** any page is requested
**Then** the response uses the Bootstrap 5 layout with the light-mode palette applied via CSS variable overrides in `public/assets/css/styles.css`
**And** `--bs-body-bg` is `#f6f8fa`, `--bs-body-color` is `#1f2328`, `--bs-border-color` is `#d0d7de`, and `--bs-primary` is `#0969da`
**And** Bootstrap 5 CSS and JS are served from `public/assets/` (no CDN dependency)
**And** the `<html>` element does not carry `data-bs-theme="dark"`

**Given** the application is running
**When** any page is rendered
**Then** the page uses a fixed left sidebar (220px) and a main content area with max-width 1200px centered
**And** the sidebar contains stub navigation links for: Dashboard, Predictions, Import Jobs, Admin — with placeholder `#` hrefs
**And** body text uses the system font stack: `-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif`
**And** any element displaying a SKU, MPN, or part number uses the monospace stack: `"SFMono-Regular", Consolas, "Liberation Mono", Menlo, monospace`

**Given** the project structure is in place
**When** `vendor/bin/phinx status` is run
**Then** Phinx reports it is connected to the configured PostgreSQL database and shows migration status

**Given** the project is reviewed for structure
**When** the `src/` directory is inspected
**Then** the following directories exist: `Controllers/`, `Services/`, `Models/`, `Utils/`, `Middleware/`, `Views/`
**And** `src/Utils/Database.php` exists as the PDO connection utility
**And** `src/Utils/Logger.php` exists as the application logger writing to `logs/app.log` and `logs/error.log`
**And** `config/database.php`, `config/app.php`, and `config/akeneo.php` exist as PHP configuration files loaded at bootstrap
**And** `db/migrations/` exists as the Phinx migrations directory
**And** `logs/` exists and is writable by the web server user
**And** `public/uploads/csv/` exists and is writable by the web server user

**Given** a request is made to an undefined route
**When** `public/index.php` handles it
**Then** a 404 response is returned with a simple error page (not a PHP error or blank page)

### Story 1.2: User Authentication

As a system administrator,
I want to log in with a username and password and have all application routes protected behind authentication,
So that the system is only accessible to authorized users.

**Acceptance Criteria:**

**Given** the application is running
**When** a user navigates to `/login`
**Then** a login form is displayed with username and password fields and a submit button
**And** the form includes a hidden CSRF token field validated server-side on submission
**And** the page uses the established Bootstrap 5 layout (sidebar hidden or minimal on login page is acceptable)

**Given** valid credentials are submitted to `/login`
**When** `AuthService` validates the username and password against the `users` table (bcrypt hash comparison)
**Then** a server-side session is created storing the user's id and role
**And** the user is redirected to the main dashboard stub
**And** the session cookie is `HttpOnly` and `SameSite=Strict`

**Given** invalid credentials are submitted to `/login`
**When** `AuthService` fails to authenticate
**Then** the login page is re-rendered with a human-readable error message: "Invalid username or password."
**And** no session is created
**And** the failed attempt is logged to `logs/app.log` with timestamp (username attempted, not the password)

**Given** an unauthenticated user attempts to access any route other than `/login`
**When** `AuthMiddleware` inspects the session
**Then** the user is redirected to `/login`
**And** the originally requested URL is preserved so the user is redirected back after successful login

**Given** an authenticated user clicks "Log out"
**When** the logout action is triggered (POST to `/logout`)
**Then** the session is destroyed server-side
**And** the session cookie is cleared
**And** the user is redirected to `/login`

**Given** the `users` table migration is run via Phinx
**When** the schema is inspected
**Then** the `users` table exists with columns: `id` (serial PK), `username` (varchar, unique, not null), `password_hash` (varchar, not null), `role` (varchar, not null, values: `admin` or `user`), `created_at` (timestamp, default now()), `updated_at` (timestamp, default now())
**And** an index exists on `username` (`idx_users_username`)

**Given** the migration has run
**When** a database seed or manual insert creates a default admin user
**Then** the admin user can log in using the credentials defined in `.env` (e.g., `ADMIN_USERNAME`, `ADMIN_PASSWORD`)
**And** the stored password is a bcrypt hash, never plaintext

**Given** a CSRF token mismatch on the login form submission
**When** `CsrfMiddleware` validates the token
**Then** the request is rejected with a 403 response
**And** a human-readable error is displayed: "Invalid form submission. Please try again."

---

## Epic 2: System Administration & Monitoring

Admins can configure the system (Akeneo credentials, confidence thresholds), monitor system health, review error logs, access troubleshooting tools, and schedule automated backups — all from the admin dashboard.

### Story 2.1: Admin Configuration Dashboard

As a system administrator,
I want to configure Akeneo API credentials and the confidence threshold from a settings page,
So that I can control how the system connects to Akeneo and routes categorization decisions without touching server files.

**Acceptance Criteria:**

**Given** an authenticated admin navigates to `/admin/settings`
**When** the page loads
**Then** a settings form is displayed with fields for: Akeneo Base URL, Akeneo Client ID, Akeneo Client Secret, Akeneo Username, Akeneo Password, and Confidence Threshold (%)
**And** current saved values are pre-populated in the form (credentials shown as masked placeholders if set, not exposed in plaintext)
**And** the Confidence Threshold field shows its current value (default: 90) and accepts integer input between 1 and 100
**And** a "Test Akeneo Connection" button is visible alongside the Akeneo credential fields

**Given** an admin submits the settings form with valid values
**When** the form is POST'd to `/admin/settings`
**Then** all values are saved to the `system_settings` table (key/value rows, one per setting)
**And** Akeneo credentials are stored as individual rows with keys: `akeneo_base_url`, `akeneo_client_id`, `akeneo_client_secret`, `akeneo_username`, `akeneo_password`
**And** the confidence threshold is stored with key `confidence_threshold`
**And** a success message is displayed: "Settings saved."
**And** the CSRF token is validated on submission; a mismatch returns a 403 with a human-readable error

**Given** an admin submits the settings form with an invalid confidence threshold (e.g., 0, 101, or non-numeric)
**When** `AdminController` validates the input
**Then** the form is re-rendered with a field-specific error: "Confidence threshold must be a whole number between 1 and 100."
**And** no settings are saved

**Given** the `system_settings` table migration is run via Phinx
**When** the schema is inspected
**Then** the `system_settings` table exists with columns: `id` (serial PK), `key` (varchar, unique, not null), `value` (text), `updated_at` (timestamp, default now()), `updated_by` (integer, FK to `users.id`, nullable)
**And** an index exists on `key` (`idx_system_settings_key`)

**Given** an admin clicks "Test Akeneo Connection"
**When** the AJAX request is sent to `/api/admin/test-akeneo-connection`
**Then** `AkeneoService` attempts to authenticate with the currently saved credentials
**And** a success result displays: "Connection successful." with a green status indicator
**And** a failure result displays a human-readable message: "Connection failed: [reason]" (e.g., "Invalid credentials", "Host unreachable") with a red status indicator
**And** the raw API error is logged to `logs/app.log` but not exposed in the UI

**Given** CSS classes for the status label system are needed across the application
**When** `public/assets/css/styles.css` is reviewed
**Then** the following pill badge CSS classes exist with their defined colors against the light background: `.badge-status-predicted` (blue `#0969da`), `.badge-status-needs-review` (amber `#d29922`), `.badge-status-submitted` (blue `#0969da`), `.badge-status-awaiting-akeneo` (amber `#d29922`), `.badge-status-approved` (green `#3fb950`), `.badge-status-rejected` (red `#f85149`), `.badge-status-auto-approved` (muted gray `#7d8590`), `.badge-status-import-failed` (red `#f85149`)
**And** the three confidence badge classes also exist: `.badge-confidence-high` (green), `.badge-confidence-medium` (amber), `.badge-confidence-low` (red)
**And** no two distinct statuses share the same color class

### Story 2.2: Error Logging & Troubleshooting Tools

As a system administrator,
I want to view application error logs and filter by severity from the admin UI,
So that I can diagnose failures without needing direct server access to log files.

**Acceptance Criteria:**

**Given** an authenticated admin navigates to `/admin/logs`
**When** the page loads
**Then** a paginated table of log entries is displayed, ordered by most recent first, showing 50 entries per page
**And** each row shows: timestamp, level (ERROR / WARNING / INFO), message, and a truncated context summary
**And** a filter control is visible allowing selection of: All Levels, ERROR only, WARNING only, INFO only

**Given** an admin selects a log level filter and submits
**When** the page reloads with the filter applied
**Then** only log entries matching the selected level are displayed
**And** the active filter is visually indicated in the filter control
**And** the URL reflects the filter state (e.g., `/admin/logs?level=error`) so it can be bookmarked or shared

**Given** an admin clicks a log entry row
**When** the row expands inline
**Then** the full log context is displayed (request data, stack trace if present, relevant IDs such as import job ID or product SKU)
**And** no raw database credentials or API secrets are visible in the expanded context

**Given** the log file (`logs/app.log`) contains more than 1,000 entries
**When** the admin views `/admin/logs`
**Then** only the most recent 1,000 entries are available for browsing via pagination
**And** a note indicates the total entry count and the display limit

**Given** an admin clicks "Clear Log"
**When** the POST request is sent to `/admin/logs/clear` with a valid CSRF token
**Then** the active log file is truncated (not deleted)
**And** a new INFO entry is written: "Log cleared by admin [username] at [timestamp]"
**And** the admin is redirected back to `/admin/logs` with a success message: "Log cleared."

**Given** `src/Utils/Logger.php` is reviewed
**When** a log entry is written from any service or controller
**Then** the entry is written as a single JSON line to `logs/app.log` with fields: `timestamp` (ISO 8601), `level`, `message`, `context` (associative array)
**And** `logs/error.log` receives only ERROR-level entries (as a separate write)
**And** no entry ever contains values from `system_settings` for credential keys (`akeneo_client_secret`, `akeneo_password`) or PHP session tokens

### Story 2.3: Health Monitoring Dashboard

As a system administrator,
I want a health dashboard showing system status, recent job statistics, and Akeneo connectivity at a glance,
So that I can confirm the system is operating normally and catch problems before they affect data stewards.

**Acceptance Criteria:**

**Given** an authenticated admin navigates to `/admin` (or `/admin/health`)
**When** the page loads
**Then** three stat cards are displayed in a row using the Bootstrap 3-column grid layout: "Akeneo Connection", "Import Jobs (Last 7 Days)", "Categorization Batches (Last 7 Days)"
**And** below the stat cards, a full-width table shows the 10 most recent import jobs with columns: Job ID, filename, status, row count, created at
**And** the page auto-refreshes its stat values every 30 seconds via AJAX without a full page reload

**Given** the Akeneo Connection stat card is rendered
**When** the last connection test result is retrieved from `system_settings` (key: `akeneo_last_connection_status`)
**Then** the card shows: status label ("Connected" in green or "Disconnected" / "Untested" in red/gray), and the timestamp of the last successful connection
**And** a "Test Now" button triggers the same connection test as Story 2.1 and updates the card in-place

**Given** the Import Jobs stat card is rendered
**When** import job counts are queried from the `import_jobs` table for the last 7 days
**Then** the card shows: total jobs, completed count (green), failed count (red), and in-progress count (blue)
**And** if no `import_jobs` table exists yet, the card shows "No data yet" gracefully without an error

**Given** the Categorization Batches stat card is rendered
**When** batch counts are queried from the database for the last 7 days
**Then** the card shows: total batches, auto-approved count, sent-to-workflow count, failed count
**And** if no categorization data exists yet, the card shows "No data yet" gracefully without an error

**Given** the recent jobs table is rendered
**When** fewer than 10 import jobs exist
**Then** only the available rows are shown with no empty rows or errors
**And** when no jobs exist at all, an empty state message is displayed: "No import jobs yet."

**Given** an authenticated non-admin user (role: `user`) navigates to any `/admin/*` route
**When** `AuthMiddleware` checks the user's role
**Then** the user is redirected to the main dashboard with an error message: "You do not have permission to access that page."

### Story 2.4: Automated Backup Scheduling

As a system administrator,
I want to configure, trigger, and monitor automated PostgreSQL database backups from the admin UI,
So that I can meet the daily backup requirement without manual intervention and verify backups are completing.

**Acceptance Criteria:**

**Given** an authenticated admin navigates to `/admin/backups`
**When** the page loads
**Then** the page shows: the configured backup destination path, the configured backup schedule (cron expression or simple daily/weekly selector), the timestamp and status of the last backup, and a "Run Backup Now" button
**And** a table of the last 10 backup attempts is displayed with columns: timestamp, status (Success / Failed), filename, file size (if successful), and duration

**Given** an admin configures the backup destination path and schedule and saves
**When** the form is POST'd to `/admin/backups/settings` with a valid CSRF token
**Then** the destination path is validated to be an absolute path on the server that the web server user can write to
**And** the path and schedule are saved to `system_settings` (keys: `backup_destination_path`, `backup_schedule`)
**And** a success message is displayed: "Backup settings saved."
**And** if the path is not writable, an error is displayed: "The specified path is not writable. Please check server permissions."

**Given** an admin clicks "Run Backup Now"
**When** the POST request is sent to `/admin/backups/run` with a valid CSRF token
**Then** a `pg_dump` of the configured database is executed by a PHP CLI script (`scripts/backup.php`)
**And** the output file is saved to the configured destination path with filename format: `backup_YYYY-MM-DD_HHMMSS.sql`
**And** the backup attempt is recorded in the `backup_log` table with: timestamp, status, filename, file size, duration in seconds, and any error message
**And** on success, the page refreshes and shows the new backup in the log table with status "Success"
**And** on failure, a human-readable error is displayed: "Backup failed: [reason]" and the failure is recorded in the log and written to `logs/app.log`

**Given** the `backup_log` table migration is run via Phinx
**When** the schema is inspected
**Then** the `backup_log` table exists with columns: `id` (serial PK), `started_at` (timestamp, not null), `completed_at` (timestamp, nullable), `status` (varchar: `success` or `failed`), `filename` (varchar, nullable), `file_size_bytes` (integer, nullable), `duration_seconds` (integer, nullable), `error_message` (text, nullable)

**Given** the configured backup schedule is set to "Daily"
**When** a server cron job runs `php scripts/backup.php` at the scheduled time
**Then** the backup executes identically to a manual "Run Backup Now" trigger
**And** the result is recorded in `backup_log`
**And** `scripts/backup.php` documents the required cron entry in its header comment (e.g., `# Add to crontab: 0 2 * * * php /path/to/scripts/backup.php`)

**Given** the backup log table has more than 30 entries
**When** `scripts/backup.php` completes a run
**Then** backup log entries older than 30 days are pruned automatically

---

## Epic 3: Supplier Enrichment Data Import

Admins and data stewards can upload supplier enrichment CSV files, see validation results immediately, track import job status, and diagnose row-level failures from the UI — with full string-safe part number handling and original value preservation.

### Story 3.1: CSV Upload & Import Job Creation

As a system administrator,
I want to upload a supplier enrichment CSV file and immediately see a new import job created with its status,
So that I know the file was accepted and processing is queued.

**Acceptance Criteria:**

**Given** an authenticated user navigates to `/import-jobs/create`
**When** the page loads
**Then** a file upload form is displayed with a file input accepting `.csv` files only, and a "Upload & Import" submit button
**And** the form includes a CSRF token validated on submission
**And** a link to `/import-jobs` is visible to return to the jobs list

**Given** a user submits a valid `.csv` file under 50MB
**When** the file is uploaded and the form is POST'd to `/import-jobs`
**Then** the file is stored in `public/uploads/csv/` with a sanitized filename: `{import_job_id}_{original_filename}`
**And** an `import_jobs` record is created with status `pending`, the original filename, the stored filepath, the uploading user's ID, and `created_at` timestamp
**And** the user is redirected to `/import-jobs/{id}` (the job detail page) with a success message: "Import job created. Processing will begin shortly."
**And** the import job detail page shows status: "Pending"

**Given** a user submits a file that is not a `.csv` (e.g., `.xlsx`, `.txt`)
**When** `ImportJobController` validates the uploaded file
**Then** the upload is rejected and the form is re-rendered with an error: "Only CSV files are accepted. Please upload a file with a .csv extension."
**And** no file is saved and no import job is created

**Given** a user submits a CSV file larger than 50MB
**When** `ImportJobController` validates the file size
**Then** the upload is rejected with an error: "File too large. Maximum upload size is 50MB."
**And** no file is saved and no import job is created

**Given** the `import_jobs` table migration is run via Phinx
**When** the schema is inspected
**Then** the `import_jobs` table exists with columns: `id` (serial PK), `original_filename` (varchar, not null), `stored_filepath` (varchar, not null), `status` (varchar, not null: `pending`, `processing`, `completed`, `completed_with_errors`, `failed`), `total_rows` (integer, nullable), `processed_rows` (integer, default 0), `success_rows` (integer, default 0), `error_rows` (integer, default 0), `created_by` (integer, FK `users.id`), `created_at` (timestamp, default now()), `started_at` (timestamp, nullable), `completed_at` (timestamp, nullable), `error_message` (text, nullable)
**And** indexes exist on `status` (`idx_import_jobs_status`) and `created_by` (`idx_import_jobs_created_by`)

**Given** a user navigates to `/import-jobs`
**When** the page loads
**Then** a table of all import jobs is displayed, ordered by most recent first, with columns: ID, filename, status (as a `.badge-status-*` pill), total rows, success rows, error rows, created at
**And** each row links to `/import-jobs/{id}` for the detail view
**And** a "New Import" button links to `/import-jobs/create`
**And** if no jobs exist, an empty state message is shown: "No import jobs yet. Upload a CSV file to get started."

### Story 3.2: Import Processing & Data Normalization

As a system administrator,
I want uploaded CSV files to be parsed, validated, and normalized into the database automatically,
So that supplier enrichment data is available for categorization without manual data entry.

**Acceptance Criteria:**

**Given** an import job exists with status `pending`
**When** `scripts/process_import.php {import_job_id}` is run (manually or via cron)
**Then** the job status is updated to `processing` and `started_at` is recorded
**And** the CSV file at `stored_filepath` is opened and parsed row by row using PHP's CSV functions
**And** all column header names are trimmed of leading/trailing whitespace before use
**And** all cell values are trimmed of leading/trailing whitespace before processing

**Given** the CSV is being parsed
**When** a row is read
**Then** at least one of the following columns must be present and non-empty: `sku`, `mpn`, or `part_number` (case-insensitive column matching after trim)
**And** if none of the identifier columns are present in the header row, the entire job fails immediately with error: "CSV must contain at least one identifier column: sku, mpn, or part_number"
**And** if an individual row is missing all identifier values, that row is recorded as an error with reason: "Row {n}: missing product identifier (sku/mpn/part_number)" and processing continues to the next row

**Given** a valid row is being processed
**When** a product identifier is found
**Then** the identifier value is stored as a string — no `intval()`, `floatval()`, or numeric casting is applied under any circumstances
**And** leading zeros, hyphens, and special characters in SKU/MPN/part number values are preserved exactly as they appear in the source CSV

**Given** a valid row is being processed
**When** the product record is created in the `products` table
**Then** a new row is inserted with: `sku` (from `sku` column if present, else null), `mpn` (from `mpn` or `part_number` column if present, else null), `name` (from `name` or `product_name` column if present, else null), `manufacturer` (from `manufacturer` column if present, else null), `manufacturer_category` (from `manufacturer_category` or `category` column if present, else null), `import_job_id` (FK to the current job), `status` = `imported`, `original_data` (JSONB storing the full raw row as-is before any normalization), `created_at` timestamp
**And** all remaining columns not mapped to named product fields are stored as rows in `product_attributes` with `attribute_name` (the column header, trimmed), `attribute_value` (the cell value, trimmed), `original_value` (the raw untrimmed cell value), and `product_id` FK

**Given** a row causes a database error during insert
**When** the error is caught
**Then** the row is skipped and recorded in `import_job_errors` with: `import_job_id`, `row_number`, `field_name` (if determinable), `error_reason` (human-readable, not a raw SQL error), `original_value`
**And** processing continues to the next row
**And** the error is written to `logs/app.log`

**Given** all rows have been processed
**When** the script completes
**Then** `import_jobs` is updated with: `completed_at` timestamp, final `total_rows`, `success_rows`, `error_rows`
**And** if `error_rows` > 0 and `success_rows` > 0, status is set to `completed_with_errors`
**And** if `error_rows` > 0 and `success_rows` = 0, status is set to `failed`
**And** if `error_rows` = 0, status is set to `completed`
**And** a summary is written to `logs/app.log`: "Import job {id} completed: {success} rows imported, {errors} errors"

**Given** the `products` table migration is run via Phinx
**When** the schema is inspected
**Then** the `products` table exists with columns: `id` (serial PK), `import_job_id` (integer, FK `import_jobs.id`), `sku` (varchar, nullable), `mpn` (varchar, nullable), `name` (varchar, nullable), `manufacturer` (varchar, nullable), `manufacturer_category` (varchar, nullable), `status` (varchar, not null, default `imported`), `original_data` (JSONB, not null), `created_at` (timestamp, default now()), `updated_at` (timestamp, default now())
**And** indexes exist on `sku` (`idx_products_sku`), `mpn` (`idx_products_mpn`), `import_job_id` (`idx_products_import_job_id`), and `status` (`idx_products_status`)

**Given** the `product_attributes` table migration is run via Phinx
**When** the schema is inspected
**Then** the `product_attributes` table exists with columns: `id` (serial PK), `product_id` (integer, FK `products.id`), `attribute_name` (varchar, not null), `attribute_value` (text), `original_value` (text), `created_at` (timestamp, default now())
**And** an index exists on `product_id` (`idx_product_attributes_product_id`)

**Given** the `import_job_errors` table migration is run via Phinx
**When** the schema is inspected
**Then** the `import_job_errors` table exists with columns: `id` (serial PK), `import_job_id` (integer, FK `import_jobs.id`), `row_number` (integer, not null), `field_name` (varchar, nullable), `error_reason` (text, not null), `original_value` (text, nullable), `created_at` (timestamp, default now())
**And** an index exists on `import_job_id` (`idx_import_job_errors_import_job_id`)

**Given** a CSV file with 10,000 rows is processed
**When** the script runs
**Then** processing completes within 30 seconds (NFR4)
**And** memory usage does not spike by loading the entire file into memory — rows are streamed/read one at a time

### Story 3.3: Import Job Status & Error Review UI

As a system administrator,
I want to see the live status of an import job and review any row-level errors in detail,
So that I can confirm what was imported successfully and diagnose exactly what failed without needing log file access.

**Acceptance Criteria:**

**Given** an import job is in `pending` or `processing` status and the user is on `/import-jobs/{id}`
**When** the page has loaded
**Then** the status badge auto-updates every 5 seconds via an AJAX fetch to `/api/import-jobs/{id}/status`
**And** the `processed_rows` count and progress indicator update in-place with each poll
**And** polling stops automatically once the status reaches `completed`, `completed_with_errors`, or `failed`
**And** no full page reload is required at any point during polling

**Given** the `/api/import-jobs/{id}/status` endpoint is called
**When** the job record is fetched
**Then** the response is JSON: `{"success": true, "data": {"status": "...", "total_rows": N, "processed_rows": N, "success_rows": N, "error_rows": N}}`
**And** if the job ID does not exist, the response is: `{"success": false, "error": {"code": "NOT_FOUND", "message": "Import job not found."}}`

**Given** an import job has status `completed` or `completed_with_errors`
**When** the user views `/import-jobs/{id}`
**Then** the detail page shows a summary section with: job ID, original filename, final status badge, total rows, success rows (green), error rows (red), started at, completed at, and duration
**And** if `success_rows` > 0, a message is shown: "{N} products imported successfully."
**And** if `error_rows` > 0, an amber warning banner is shown: "{N} rows could not be imported. Review errors below."

**Given** an import job has row-level errors
**When** the errors section of `/import-jobs/{id}` is rendered
**Then** a paginated table of errors is shown (25 per page), ordered by `row_number` ascending, with columns: Row #, Field, Reason, Original Value
**And** the "Reason" column shows the human-readable error message (not a raw SQL or PHP error)
**And** the "Original Value" column shows the raw cell value that caused the error, truncated to 100 characters with a tooltip showing the full value
**And** if there are no errors, the errors section is not shown

**Given** an import job has status `failed` (all rows errored or job-level failure)
**When** the user views `/import-jobs/{id}`
**Then** a red error banner is shown at the top with the job-level error message if one exists
**And** the error rows table is still shown if any row-level errors were recorded before the failure

**Given** a user is on the `/import-jobs` list page
**When** any job in the list has status `pending` or `processing`
**Then** the status badges for those jobs update every 10 seconds via AJAX without a full page reload
**And** jobs that reach a terminal status stop polling and their badge remains static

**Given** a user navigates to `/import-jobs/{id}` for a job they did not create
**When** `AuthMiddleware` and `ImportJobController` check access
**Then** any authenticated user (admin or user role) can view any import job
**And** unauthenticated users are redirected to `/login`

---

## Epic 4: Akeneo Integration & Workflow Connectivity

The system can pull products from Akeneo via API, submit categorization results back, participate in Akeneo Collaborative Workflow for human review, track approval and rejection outcomes, and handle rate limiting and integration failures with alerts.

### Story 4.1: Akeneo Authentication & Product Sync (Inbound)

As a system administrator,
I want to sync products from Akeneo into the local database,
So that the categorization engine has Akeneo product records to work against.

**Acceptance Criteria:**

**Given** Akeneo credentials are saved in `system_settings`
**When** `AkeneoService::authenticate()` is called
**Then** an OAuth2 password grant request is made to the Akeneo token endpoint (`{akeneo_base_url}/api/oauth/v1/token`)
**And** the returned access token is cached in-memory for the duration of the request or script run
**And** if authentication fails (invalid credentials, unreachable host, timeout), an exception is thrown with a human-readable message and the failure is written to `logs/app.log`
**And** `system_settings` key `akeneo_last_connection_status` is updated to `connected` with timestamp on success, or `failed: [reason]` on failure

**Given** `AkeneoService` is authenticated
**When** `AkeneoService::fetchProducts(int $page, int $limit)` is called
**Then** a GET request is made to `{akeneo_base_url}/api/rest/v1/products` with the access token in the `Authorization: Bearer` header
**And** the request includes pagination parameters and a 30-second timeout (NFR10)
**And** on a 429 rate limit response, the service waits and retries with exponential backoff (1s, 2s, 4s) up to 3 attempts before throwing an exception (NFR11)
**And** on a 5xx response, the same retry logic applies
**And** on any unrecoverable error, the exception is logged to `logs/app.log` with the HTTP status code and response body (credentials redacted)

**Given** `scripts/sync_akeneo.php` is run from the CLI
**When** the script executes
**Then** a new `akeneo_sync_log` record is created with status `running` and `started_at`
**And** `AkeneoService` fetches all products from Akeneo using paginated requests until no further pages exist
**And** for each fetched product, the script attempts to find an existing `products` record matching on `sku` or `mpn` (string comparison, case-insensitive)
**And** if a match is found, the existing record's `akeneo_identifier` and `akeneo_synced_at` are updated
**And** if no match is found, a new `products` record is created with `status = 'imported'`, the Akeneo identifier, product identifier fields populated from the Akeneo response, and `original_data` (JSONB) storing the full raw Akeneo product payload
**And** on completion, `akeneo_sync_log` is updated with `completed_at`, final counts, and status `completed`
**And** on failure, status is set to `failed` with `error_message`

**Given** the `akeneo_identifier` column does not yet exist on the `products` table
**When** the Phinx migration for this story is run
**Then** `akeneo_identifier` (varchar, nullable) and `akeneo_synced_at` (timestamp, nullable) are added to the `products` table via `ALTER TABLE`
**And** an index exists on `akeneo_identifier` (`idx_products_akeneo_identifier`)

**Given** the `akeneo_sync_log` table migration is run via Phinx
**When** the schema is inspected
**Then** the table exists with columns: `id` (serial PK), `started_at` (timestamp, not null), `completed_at` (timestamp, nullable), `status` (varchar: `running`, `completed`, `failed`), `products_fetched` (integer, default 0), `products_created` (integer, default 0), `products_updated` (integer, default 0), `error_message` (text, nullable)

**Given** an authenticated admin is on the health dashboard (`/admin`)
**When** they click "Sync Products from Akeneo"
**Then** a POST request is made to `/api/admin/sync-akeneo` which runs `scripts/sync_akeneo.php` as a background process
**And** the button is disabled and replaced with "Sync in progress…" while the sync runs
**And** the last sync status and timestamp (from `akeneo_sync_log`) are visible on the dashboard, refreshing every 10 seconds via AJAX polling until the sync reaches a terminal status

### Story 4.2: Outbound Category Submission & Rate Limiting

As a system administrator,
I want the system to submit approved categorization results back to Akeneo reliably,
So that approved category assignments appear in Akeneo without manual data entry, even when the Akeneo API is slow or rate-limiting.

**Acceptance Criteria:**

**Given** a product with an `akeneo_identifier` has an approved category assignment
**When** `AkeneoService::submitCategoryUpdate(string $akeneIdentifier, string $categoryCode)` is called
**Then** a PATCH request is made to `{akeneo_base_url}/api/rest/v1/products/{akeneIdentifier}` with the category code in the request body and the access token in the `Authorization: Bearer` header
**And** the request has a 30-second timeout (NFR10)
**And** on a successful 2xx response, the method returns `true`

**Given** the Akeneo API responds with HTTP 429 (rate limited)
**When** `AkeneoService::submitCategoryUpdate` handles the response
**Then** the request is retried with exponential backoff: 1 second before retry 1, 2 seconds before retry 2, 4 seconds before retry 3 (NFR11)
**And** if all 3 retries are exhausted, an exception is thrown: "Akeneo API rate limit exceeded after 3 retries."
**And** the failure is logged to `logs/app.log` with the product identifier and attempted category code

**Given** the Akeneo API responds with HTTP 5xx
**When** `AkeneoService::submitCategoryUpdate` handles the response
**Then** the same 3-attempt retry logic applies as for rate limiting
**And** on exhausted retries, an exception is thrown and logged

**Given** `AkeneoService::submitCategoryUpdate` throws an unrecoverable exception
**When** the calling code catches the exception
**Then** `system_settings` key `akeneo_last_integration_failure` is updated with timestamp and error summary
**And** an ERROR entry is written to `logs/app.log` with full context (product identifier, category code, HTTP status, response body with credentials redacted)
**And** the admin health dashboard displays the integration failure in the Akeneo Connection stat card on next load or poll (FR34)

**Given** `AkeneoService::submitCategoryUpdate` is called with an `akeneIdentifier` that does not exist in Akeneo
**When** Akeneo responds with HTTP 404
**Then** no retry is attempted (404 is not a transient error)
**And** an exception is thrown: "Akeneo product not found: {identifier}"
**And** the failure is logged

**Given** the Akeneo access token has expired mid-session
**When** any `AkeneoService` method receives a 401 Unauthorized response
**Then** the service re-authenticates once using stored credentials and retries the original request
**And** if re-authentication also fails, an exception is thrown and logged

### Story 4.3: Akeneo Collaborative Workflow Submission & Outcome Tracking

As a data steward,
I want low-confidence predictions to be submitted to Akeneo Collaborative Workflow and their approval or rejection outcomes to be reflected automatically in the system,
So that I can manage category review entirely through Akeneo's standard approval process.

**Acceptance Criteria:**

**Given** a product prediction with confidence below the configured threshold has been marked for workflow submission
**When** `AkeneoService::submitToCollaborativeWorkflow(string $akeneIdentifier, string $categoryCode, string $evidenceSummary)` is called
**Then** a request is made to the Akeneo Collaborative Workflow API to create a draft or proposal for the product's category assignment
**And** the same 30-second timeout, 3-retry, and exponential backoff rules from Story 4.2 apply
**And** on success, the Akeneo proposal identifier returned by the API is stored in the `akeneo_workflow` record's `akeneo_proposal_code` field
**And** the `akeneo_workflow` record's `workflow_status` is updated to `submitted` and `submitted_at` is recorded

**Given** the `akeneo_workflow` table migration is run via Phinx
**When** the schema is inspected
**Then** the table exists with columns: `id` (serial PK), `product_id` (integer, FK `products.id`, not null), `prediction_id` (integer, nullable — FK to `predictions.id`, to be populated in Epic 5), `akeneo_product_identifier` (varchar, not null), `suggested_category_code` (varchar, not null), `workflow_status` (varchar, not null: `pending`, `submitted`, `approved`, `rejected`, `failed`), `akeneo_proposal_code` (varchar, nullable), `submitted_at` (timestamp, nullable), `outcome_received_at` (timestamp, nullable), `error_message` (text, nullable), `created_at` (timestamp, default now()), `updated_at` (timestamp, default now())
**And** indexes exist on `product_id` (`idx_akeneo_workflow_product_id`), `workflow_status` (`idx_akeneo_workflow_status`), and `akeneo_proposal_code` (`idx_akeneo_workflow_proposal_code`)

**Given** one or more `akeneo_workflow` records exist with `workflow_status = 'submitted'`
**When** `scripts/poll_akeneo_workflow.php` is run (manually or via cron)
**Then** for each submitted record, `AkeneoService::getProposalStatus(string $proposalCode)` checks the current status in Akeneo
**And** if Akeneo returns an approved status, `workflow_status` is updated to `approved` and `outcome_received_at` is set
**And** if Akeneo returns a rejected status, `workflow_status` is updated to `rejected` and `outcome_received_at` is set
**And** if the proposal is still pending in Akeneo, the record is unchanged and checked again on the next poll run
**And** the poll script logs a summary to `logs/app.log`: "Workflow poll complete: {approved} approved, {rejected} rejected, {pending} still pending"

**Given** `AkeneoService::submitToCollaborativeWorkflow` fails after exhausting retries
**When** the exception is caught
**Then** the `akeneo_workflow` record's `workflow_status` is set to `failed` with the error stored in `error_message`
**And** `system_settings` key `akeneo_last_integration_failure` is updated
**And** an ERROR entry is written to `logs/app.log` (FR34)
**And** the admin health dashboard reflects the integration failure on next poll

**Given** an `akeneo_workflow` record's `workflow_status` transitions
**When** the status change is saved to the database
**Then** `products.status` for the associated product is updated in the same transaction: `submitted` → `products.status = 'submitted'`; `approved` → `products.status = 'approved'`; `rejected` → `products.status = 'rejected'`
**And** `products.updated_at` is updated

**Given** a products status API endpoint exists (implemented in Epic 5)
**When** a product has an associated `akeneo_workflow` record
**Then** the workflow status is included in the response so the UI can reflect the actual Akeneo approval state persistently — not just "submitted" (UX-DR15)

**Given** `scripts/poll_akeneo_workflow.php` is intended to run on a schedule
**When** the script file is reviewed
**Then** its header comment documents the recommended cron entry: `# Add to crontab: */5 * * * * php /path/to/scripts/poll_akeneo_workflow.php`

---

## Epic 5: AI Categorization Engine

The system can run batch categorization against enriched product data, generate confidence scores and explainable evidence for each prediction, auto-approve at ≥90%, route lower-confidence predictions to Akeneo Collaborative Workflow, maintain a full audit trail of all decisions, and automatically retry failed batches.

### Story 5.1: Python/PHP Boundary & Categorization Infrastructure

As a developer,
I want a clearly defined Python directory structure and database schema for predictions and evidence,
So that the AI categorization work has a consistent home and all future categorization stories build on the same foundation.

**Acceptance Criteria:**

**Given** the project root is inspected
**When** the Python directory structure is reviewed
**Then** the following structure exists:
```
python/
├── requirements.txt
├── .gitignore          (excludes .venv/, __pycache__/, *.pyc)
├── config.py           (reads DB connection from environment variables)
├── categorize.py       (entry point: accepts batch_id as CLI argument)
├── services/
│   ├── __init__.py
│   ├── categorizer.py
│   ├── evidence_builder.py
│   └── confidence_scorer.py
└── models/
    ├── __init__.py
    └── database.py     (psycopg2 connection using env vars from .env)
```
**And** `python/requirements.txt` lists at minimum: `psycopg2-binary` and `python-dotenv`
**And** `docs/python-setup.md` documents how to create the virtual environment and install dependencies: `python3 -m venv python/.venv && python/.venv/bin/pip install -r python/requirements.txt`

**Given** `python/config.py` is reviewed
**When** the database connection is initialized
**Then** it reads `DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER`, `DB_PASSWORD` from environment variables loaded from `.env` via `python-dotenv`
**And** no credentials are hardcoded anywhere in the Python source files

**Given** the PHP/Python communication contract is reviewed
**When** a categorization batch is submitted
**Then** PHP writes a `categorization_batches` record with status `pending` and product IDs as a JSONB array
**And** PHP spawns `python3 python/categorize.py {batch_id}` as a background process with output redirected to `logs/categorize_{batch_id}.log`
**And** Python reads the batch record, writes results to `predictions` and `evidence` tables, and updates `categorization_batches` status as it progresses
**And** PHP polls the `categorization_batches` status via the API endpoint — it does not parse Python stdout or stderr directly

**Given** the `categorization_batches` table migration is run via Phinx
**When** the schema is inspected
**Then** the table exists with columns: `id` (serial PK), `status` (varchar, not null: `pending`, `processing`, `completed`, `completed_with_errors`, `failed`), `product_ids` (JSONB, not null), `total_products` (integer, nullable), `processed_products` (integer, default 0), `success_products` (integer, default 0), `error_products` (integer, default 0), `retry_count` (integer, default 0), `created_by` (integer, FK `users.id`), `created_at` (timestamp, default now()), `started_at` (timestamp, nullable), `completed_at` (timestamp, nullable), `error_message` (text, nullable)
**And** an index exists on `status` (`idx_categorization_batches_status`)

**Given** the `predictions` table migration is run via Phinx
**When** the schema is inspected
**Then** the table exists with columns: `id` (serial PK), `product_id` (integer, FK `products.id`, not null), `batch_id` (integer, FK `categorization_batches.id`, not null), `suggested_category_code` (varchar, not null), `suggested_category_label` (varchar, not null), `confidence_score` (numeric(5,2), not null), `confidence_level` (varchar, not null: `high`, `medium`, `low`), `status` (varchar, not null: `predicted`, `needs_review`, `auto_approved`, `submitted`, `approved`, `rejected`), `created_at` (timestamp, default now()), `updated_at` (timestamp, default now())
**And** indexes exist on `product_id` (`idx_predictions_product_id`), `batch_id` (`idx_predictions_batch_id`), and `status` (`idx_predictions_status`)

**Given** the `evidence` table migration is run via Phinx
**When** the schema is inspected
**Then** the table exists with columns: `id` (serial PK), `prediction_id` (integer, FK `predictions.id`, not null), `source_type` (varchar, not null: `part_number_pattern`, `manufacturer_category`, `enrichment_attribute`, `product_description`, `category_rule`), `source_label` (varchar, not null), `evidence_value` (text, not null), `weight` (numeric(4,3), not null — 0.000 to 1.000), `created_at` (timestamp, default now())
**And** an index exists on `prediction_id` (`idx_evidence_prediction_id`)

**Given** the `audit_log` table migration is run via Phinx
**When** the schema is inspected
**Then** the table exists with columns: `id` (serial PK), `entity_type` (varchar, not null), `entity_id` (integer, not null), `action` (varchar, not null), `actor_type` (varchar, not null: `system`, `user`), `actor_id` (integer, nullable), `metadata` (JSONB, nullable), `created_at` (timestamp, default now())
**And** indexes exist on `entity_type` + `entity_id` (`idx_audit_log_entity`) and `created_at` (`idx_audit_log_created_at`)

**Given** the `akeneo_workflow` table exists from Epic 4 with a nullable `prediction_id` column
**When** the Phinx migration for this story is run
**Then** a FK constraint is added: `akeneo_workflow.prediction_id` references `predictions.id` ON DELETE SET NULL

### Story 5.2: Batch Categorization Job Submission

As a data steward,
I want to select products and submit them for AI categorization, then watch the batch progress in real time,
So that I can start the categorization process and know when results are ready to review.

**Acceptance Criteria:**

**Given** an authenticated user navigates to `/products`
**When** the page loads
**Then** a table of imported products is displayed with columns: SKU (monospace), MPN (monospace), Name, Manufacturer, Status badge, Created At
**And** each row has a checkbox for selection, with a "Select All" checkbox at the top
**And** a "Submit for Categorization" button is disabled when no rows are selected and shows "Submit X selected products" when one or more rows are checked

**Given** a user selects one or more products and clicks "Submit for Categorization"
**When** the form is POST'd to `/categorization-batches`
**Then** a `categorization_batches` record is created with status `pending`, `product_ids` as a JSONB array, and `created_by` set to the current user's ID
**And** PHP spawns `python3 python/categorize.py {batch_id}` as a background process with output to `logs/categorize_{batch_id}.log`
**And** the user is redirected to `/categorization-batches/{id}` with message: "Categorization batch submitted. Processing {N} products."

**Given** a user selects products that already have existing predictions
**When** the batch is submitted
**Then** a warning is displayed before submission: "{N} of your selected products already have predictions. Re-categorizing will replace their existing predictions. Continue?"
**And** the user must confirm before the batch is created

**Given** `python/categorize.py {batch_id}` is running
**When** the script processes each product
**Then** `categorization_batches.processed_products` is incremented in the DB after each product so PHP polling shows live progress
**And** on full completion: `status = 'completed'` or `completed_with_errors'`, `completed_at` = now()
**And** on script-level failure: `status = 'failed'`, `error_message` is set

**Given** a user is on `/categorization-batches/{id}` with a batch in `pending` or `processing` status
**When** the page has loaded
**Then** status badge and progress counter auto-update every 5 seconds via AJAX to `/api/categorization-batches/{id}/status` (UX-DR8)
**And** polling stops once status reaches a terminal state
**And** on completion, the page shows: total, success count, error count, and a link to the Predictions queue filtered to this batch

**Given** the `/api/categorization-batches/{id}/status` endpoint is called
**When** the batch record is fetched
**Then** the response is: `{"success": true, "data": {"status": "...", "total_products": N, "processed_products": N, "success_products": N, "error_products": N}}`

**Given** a user navigates to `/categorization-batches`
**When** the page loads
**Then** all batches are shown ordered by most recent first, with columns: ID, status badge, total products, success, errors, created by, created at, duration
**And** batches in `pending` or `processing` status have status badges live-updating every 10 seconds via AJAX

### Story 5.3: Confidence Scoring & Threshold Routing

As a system administrator,
I want the categorization engine to automatically approve high-confidence predictions and flag low-confidence ones for human review,
So that products above the confidence threshold flow directly to Akeneo without manual intervention.

**Acceptance Criteria:**

**Given** `python/services/categorizer.py` processes a product
**When** a category prediction is generated
**Then** `python/services/confidence_scorer.py` computes a `confidence_score` (0.00–100.00) based on the weighted combination of available evidence signals
**And** `confidence_level` is derived from the score: `high` if score ≥ configured threshold (default 90), `medium` if score ≥ 70 and < threshold, `low` if score < 70
**And** the confidence score, level, suggested category code and label are written to `predictions` with `status = 'predicted'`
**And** `products.status` is updated to `predicted` in the same transaction

**Given** `python/categorize.py` has completed writing predictions for a batch
**When** `CategorizationService::applyThresholdRouting(int $batchId)` is called by PHP
**Then** for each prediction with `confidence_score >= confidence_threshold` (read fresh from `system_settings`):
- `predictions.status` → `auto_approved`
- `products.status` → `auto_approved`
- `AkeneoService::submitCategoryUpdate` is called if `akeneo_identifier` is set
- An `audit_log` entry is written: action `auto_approved`, actor_type `system`, metadata includes `confidence_score` and `category_code`
**And** for each prediction with `confidence_score < confidence_threshold`:
- `predictions.status` → `needs_review`
- `products.status` → `needs_review`
- An `audit_log` entry is written: action `flagged_for_review`, actor_type `system`

**Given** `AkeneoService::submitCategoryUpdate` fails for an auto-approved product
**When** the exception is caught in `CategorizationService::applyThresholdRouting`
**Then** the product's prediction status is reverted to `needs_review`
**And** an ERROR entry is written to `logs/app.log` with product ID, category, and error reason
**And** processing continues to the next prediction

**Given** a batch has completed routing
**When** `CategorizationService::getConfidenceDistribution(int $batchId)` is called
**Then** it returns: count of high/medium/low confidence predictions, total products, average confidence score
**And** this summary is displayed on the batch detail page (FR12)

**Given** confidence badges are rendered anywhere in the UI
**When** a prediction's `confidence_level` is displayed
**Then** the correct CSS class is applied: `badge-confidence-high` (≥ threshold), `badge-confidence-medium` (70–89), `badge-confidence-low` (< 70) (UX-DR2)
**And** the badge always shows both the color treatment and the numeric percentage (e.g., "94%")

### Story 5.4: Explainable Evidence Generation & Audit Trail

As a data steward,
I want to see exactly what data drove each AI category prediction,
So that I can make an informed decision about whether to trust, adjust, or reject it.

**Acceptance Criteria:**

**Given** `python/services/evidence_builder.py` processes a product during categorization
**When** a prediction is generated
**Then** one or more evidence records are written to the `evidence` table, each with: `source_type`, `source_label` (human-readable), `evidence_value` (specific data used), and `weight` (0.000–1.000)
**And** the categorizer attempts to generate evidence from every available source type — records are only omitted when that source has no usable data for the product

**Given** a product has a `part_number_pattern` evidence source
**When** the evidence record is stored
**Then** `source_label` is "Part Number Pattern" and `evidence_value` describes the match in plain English (e.g., "SKU '0782-ABC' matches the pattern for category 'Industrial Fasteners'")

**Given** a product has a `manufacturer_category` evidence source
**When** the evidence record is stored
**Then** `source_label` is "Manufacturer Category" and `evidence_value` includes the raw manufacturer category value and its mapping to the suggested category

**Given** a product has enrichment attributes from a CSV import
**When** `enrichment_attribute` evidence records are stored
**Then** one evidence record is created per contributing attribute, with `source_label` = the attribute name and `evidence_value` = the attribute value

**Given** a product has no usable evidence from any source
**When** the confidence score is computed
**Then** a minimum confidence floor of 10.00 is applied
**And** one fallback evidence record is written: `source_type = 'category_rule'`, `source_label = 'Fallback'`, `evidence_value = 'No specific evidence found; prediction based on default category rules'`, `weight = 0.001`

**Given** any prediction is created, auto-approved, submitted to workflow, approved, or rejected
**When** the state change occurs
**Then** an `audit_log` record is written with entity, action, actor, and metadata (confidence score, category code, product ID)
**And** existing `audit_log` records are never modified or deleted — the table is append-only (NFR17)

**Given** a prediction detail is requested
**When** `EvidenceService::getEvidenceForPrediction(int $predictionId)` is called
**Then** it returns all evidence records ordered by `weight` descending
**And** each item includes: source type, source label, evidence value, and weight as a percentage of that prediction's total weight
**And** this data is available via `/api/predictions/{id}/evidence` returning the standard JSON envelope

### Story 5.5: Batch Retry & Failure Recovery

As a data steward,
I want failed categorization batches and individual predictions to be retriable,
So that transient errors do not require re-submitting an entire set of products from scratch.

**Acceptance Criteria:**

**Given** a `categorization_batches` record has `status = 'failed'` or `completed_with_errors'`
**When** `python/categorize.py {batch_id}` is run again
**Then** if `retry_count < 3`, the batch is re-processed: `retry_count` is incremented, `status` is set to `processing`, and only products without a successful prediction are re-categorized (already-successful predictions are skipped)
**And** if `retry_count >= 3`, the script exits immediately and logs: "Batch {id} has reached the maximum retry limit (3). Manual intervention required."
**And** an `audit_log` entry is written for each retry: action `batch_retry_started`, metadata includes `retry_count`

**Given** a batch has `status = 'failed'` and `retry_count < 3` on `/categorization-batches/{id}`
**When** an authenticated admin views the page
**Then** a "Retry Batch" button is visible showing remaining retries: "Retry Batch (2 retries remaining)"
**And** clicking it sends a POST to `/categorization-batches/{id}/retry` with a CSRF token

**Given** a POST is made to `/categorization-batches/{id}/retry`
**When** `CategorizationService::retryBatch(int $batchId)` is called
**Then** if `retry_count < 3`, the Python script is spawned in the background and the user is redirected to the batch detail page with: "Retry started."
**And** if `retry_count >= 3`, the request is rejected with a 400 response: "This batch has reached the maximum retry limit. Please create a new batch."

**Given** `python/categorize.py` encounters an unhandled exception for a single product
**When** the exception is caught within the product processing loop
**Then** an error is recorded in `error_products` and an `audit_log` entry is written for that product's prediction failure
**And** the script continues processing remaining products in the batch
**And** after all products are attempted, if any errors occurred, `status = 'completed_with_errors'`

**Given** the Python process is killed mid-batch leaving `status = 'processing'`
**When** an admin views the batch detail page and `started_at` is more than 30 minutes ago
**Then** a warning banner is shown: "This batch may be stalled. The process has been running for over 30 minutes."
**And** the "Retry Batch" button is available to restart it (counts as a retry)

**Given** a batch has `status = 'completed_with_errors'`
**When** the admin views `/categorization-batches/{id}`
**Then** a list of failed product IDs is shown with the error reason for each
**And** a "Retry Failed Products Only" button creates a new `categorization_batches` record containing only the failed product IDs
**And** clicking it sends a POST to `/categorization-batches/{id}/retry-failed`

---

## Epic 6: Data Steward Review Interface

Data stewards see only items that need their attention, can read confidence scores and inline evidence without clicking through, bulk-submit items to Akeneo Collaborative Workflow, adjust categories inline, and watch their queue counter drop to zero.

### Story 6.1: Predictions Review Queue

As a data steward,
I want a review queue that shows only the predictions requiring my attention, with confidence scores and top evidence signals visible per row,
So that I can scan and act on items efficiently without opening individual detail views.

**Acceptance Criteria:**

**Given** an authenticated user navigates to `/predictions`
**When** the page loads
**Then** the default view shows only predictions with `status = 'needs_review'`, ordered by `confidence_score` ascending (lowest confidence first)
**And** a queue counter is prominently displayed: "X items awaiting review"
**And** if zero `needs_review` predictions exist, the empty state from Story 6.5 is shown instead of an empty table

**Given** the predictions table is rendered
**When** rows are displayed
**Then** each row contains: checkbox (left), SKU in monospace font, MPN in monospace font, Product Name, Suggested Category, Confidence Badge (color-coded per UX-DR2 + numeric %, e.g., "74%"), and the top 2 evidence signals inline as compact text showing only `source_label`
**And** evidence signals are the 2 highest-weight evidence records for that prediction
**And** rows use Bootstrap `.table-sm` for compact density
**And** SKU and MPN columns use the monospace font stack

**Given** the predictions page is loaded
**When** the status filter tabs are rendered
**Then** the following tabs are visible: "Needs Review" (default, active), "Submitted", "Awaiting Akeneo Approval", "Approved", "Rejected", "Auto-Approved", "All"
**And** each tab shows a count badge with the number of predictions in that status
**And** clicking a tab reloads the table filtered to that status via AJAX without a full page reload
**And** the URL updates to reflect the active filter (e.g., `/predictions?status=approved`)

**Given** the "Needs Review" tab is active
**When** a prediction is acted on (submitted, adjusted, or approved)
**Then** that row is removed from the table immediately without a full page reload
**And** the queue counter decrements by 1 (UX-DR14)
**And** if the queue counter reaches 0, the empty state from Story 6.5 replaces the table

**Given** a user wants to filter the review queue further
**When** the filter controls are rendered above the table
**Then** a text search input filters by SKU, MPN, or product name (client-side filtering on current page rows)
**And** a confidence level filter allows narrowing by: All, High (≥ threshold), Medium (70 – threshold), Low (< 70)

**Given** an authenticated user navigates to `/predictions` as a sidebar nav item
**When** the sidebar is rendered
**Then** the "Predictions" nav link is active and routes to `/predictions` as the primary navigation destination (UX-DR4)
**And** the sidebar nav item shows the current `needs_review` count as a badge (e.g., "Predictions 14") updated on each page load

### Story 6.2: Inline Evidence Expansion

As a data steward,
I want to expand a prediction row to see the full evidence trail without leaving the list view,
So that I can understand exactly why a category was suggested and decide in seconds whether to trust it.

**Acceptance Criteria:**

**Given** a user clicks anywhere on a prediction row (excluding the checkbox)
**When** the row is expanded
**Then** an inline evidence panel appears directly below the row within the table — no modal, no page navigation, no scroll jump (UX-DR6)
**And** the expanded panel shows all evidence records for that prediction, ordered by `weight` descending
**And** each evidence item displays: `source_label` (bold), `evidence_value` (plain text below), and a weight indicator (percentage of total weight for that prediction)
**And** clicking the row again collapses the panel

**Given** the evidence panel is expanded
**When** the evidence records are rendered
**Then** a heading "Why this category was suggested" appears above the evidence list
**And** evidence items with `source_label = 'Fallback'` are visually de-emphasized with a note: "Limited data available — prediction confidence is low"
**And** the total is shown: "Based on X evidence signals"

**Given** the evidence panel is expanded
**When** the product has attributes in `product_attributes` with empty or null `attribute_value`
**Then** a "Missing Data" section appears below the evidence list, listing attribute names with empty values
**And** the section is labeled: "These attributes are missing and could improve confidence if provided"
**And** if no attributes are missing, this section is not shown

**Given** the evidence panel is expanded
**When** the prediction has an associated `akeneo_workflow` record
**Then** the panel shows the current Akeneo workflow status: "Akeneo Workflow: [status badge]"
**And** this status is fetched from the `akeneo_workflow` table, not just `products.status`

**Given** the evidence data is fetched for an expanded row
**When** the AJAX request is made to `/api/predictions/{id}/evidence`
**Then** the response includes all evidence records plus the product's `product_attributes`
**And** the panel renders within 1 second for typical record counts
**And** if the request fails, an inline error shows within the panel: "Could not load evidence. Please try again." — the row remains expanded

**Given** multiple rows are expanded simultaneously
**When** a second row is clicked
**Then** the previously expanded row remains open — multiple rows can be expanded at once

### Story 6.3: Bulk Submit to Akeneo Collaborative Workflow

As a data steward,
I want to select multiple predictions and submit them to Akeneo Collaborative Workflow in one action,
So that I can process a batch of low-confidence items without submitting each one individually.

**Acceptance Criteria:**

**Given** a user is on `/predictions` with `needs_review` items visible
**When** one or more row checkboxes are checked
**Then** a floating action bar appears at the bottom of the viewport (UX-DR7)
**And** the action bar shows: "X items selected", a "Submit to Akeneo Workflow" primary button, and a "Clear selection" link
**And** the action bar persists as the user scrolls
**And** unchecking all rows hides the action bar

**Given** a user clicks the header "Select All" checkbox
**When** all visible rows are selected
**Then** the floating action bar appears with the count of all currently visible rows
**And** a notice appears: "X items on this page selected. Select all Y items in queue?" with a link to extend selection to all `needs_review` predictions

**Given** a user clicks "Submit to Akeneo Workflow" in the floating action bar
**When** the POST is sent to `/predictions/submit-to-workflow` with selected IDs and a CSRF token
**Then** for each prediction ID, `AkeneoService::submitToCollaborativeWorkflow` is called
**And** an `akeneo_workflow` record is created for each successful submission
**And** `predictions.status` and `products.status` are updated to `submitted` for each success
**And** an `audit_log` entry is written for each: action `submitted_to_workflow`, actor_type `user`, actor_id = current user

**Given** the submission request completes
**When** the JSON response is processed by the frontend
**Then** successfully submitted rows have their status badge updated in-place to "Submitted to Akeneo Workflow" — no page reload (UX-DR15)
**And** submitted rows are removed from the "Needs Review" table and the queue counter decrements

**Given** one or more submissions fail
**When** the response is processed
**Then** failed rows remain in the queue with status unchanged
**And** an error banner appears: "{M} items could not be submitted. The remaining {N} were submitted successfully. [View errors]"
**And** expanding [View errors] lists each failed SKU with a human-readable reason (e.g., "Akeneo product not found", "API rate limit exceeded")

**Given** a user submits a single prediction via a per-row "Submit" action button
**When** the POST is made to `/predictions/{id}/submit-to-workflow`
**Then** the same workflow applies for that single prediction and the row's status badge updates in-place on success

### Story 6.4: Inline Category Adjustment

As a data steward,
I want to change the suggested category for a prediction directly in the review queue row,
So that I can correct an AI suggestion without navigating away from the list.

**Acceptance Criteria:**

**Given** a user clicks the "Adjust" action on a prediction row
**When** inline adjustment mode activates
**Then** the "Suggested Category" cell is replaced in-place with a text input with autocomplete dropdown, pre-populated with the current suggested category (UX-DR9)
**And** a "Save" button and "Cancel" button appear inline within the row
**And** the rest of the row remains visible and the table does not shift layout

**Given** the category selector input is active
**When** the user types into the input
**Then** matching categories from the `categories` table are shown in a dropdown (up to 10 results), filtered against `akeneo_code` and `label`
**And** results show both label and code: "Industrial Fasteners (cat_123)"
**And** if no matches are found, the dropdown shows: "No matching categories found."

**Given** the `categories` table does not yet exist
**When** the Phinx migration for this story is run
**Then** the `categories` table is created with columns: `id` (serial PK), `akeneo_code` (varchar, unique, not null), `label` (varchar, not null), `parent_code` (varchar, nullable), `synced_at` (timestamp, default now())
**And** an index exists on `akeneo_code` (`idx_categories_akeneo_code`)
**And** `AkeneoService::fetchCategories()` is implemented to populate this table from Akeneo's category API, added to `scripts/sync_akeneo.php` alongside the existing product sync

**Given** a user selects a category from the dropdown and clicks "Save"
**When** the POST is made to `/predictions/{id}/adjust`
**Then** `predictions.suggested_category_code` and `predictions.suggested_category_label` are updated
**And** `predictions.status` remains `needs_review` (adjusted predictions still need workflow submission)
**And** an `audit_log` entry is written: action `category_adjusted`, actor_type `user`, actor_id = current user, metadata includes `old_category_code`, `new_category_code`, `confidence_score`
**And** the row's category cell updates in-place — no page reload
**And** the confidence badge is unchanged (reflects the original AI prediction score)

**Given** a user clicks "Save" with text that does not match any category
**When** `PredictionController` validates the input
**Then** an inline error appears within the row: "Please select a valid category from the list."
**And** the prediction is not updated

**Given** a user clicks "Cancel"
**When** the cancel action is triggered
**Then** the row reverts to its original display state with no changes saved

**Given** a prediction has been manually adjusted
**When** the row is displayed
**Then** a small "Adjusted" indicator (e.g., a muted label or pencil icon) is visible next to the category name
**And** the evidence expansion panel still shows the original AI evidence — it does not change on adjustment

**Given** the `/api/categories/search` endpoint is called with a query parameter
**When** the category search runs
**Then** the response is: `{"success": true, "data": [{"code": "...", "label": "..."}, ...]}`
**And** the endpoint requires authentication and returns 401 for unauthenticated requests

### Story 6.5: Empty State & Review Completion

As a data steward,
I want to see a clear completion state when my review queue is empty,
So that I know I have processed all pending predictions and can see a summary of what was done.

**Acceptance Criteria:**

**Given** a user is on `/predictions` with the "Needs Review" filter active
**When** the last `needs_review` prediction is acted on
**Then** the table is replaced immediately with a positive empty state panel — no page reload (UX-DR10)
**And** the panel displays: "All predictions reviewed. Nothing waiting on you."
**And** the queue counter updates to "0 items awaiting review"

**Given** the empty state panel is shown
**When** the session review summary is rendered
**Then** the panel shows session activity counts: "Auto-approved: X · Submitted to Akeneo: Y · Adjusted: Z"
**And** session counts are tracked in JavaScript state from actions taken during this page session
**And** if the user arrived at an already-empty queue, only the completion message is shown — no zero-count summary

**Given** the empty state panel is shown
**When** the available actions are rendered
**Then** two secondary links are displayed: "View submitted predictions" (→ `/predictions?status=submitted`) and "Start a new categorization batch" (→ `/products`)
**And** the panel uses positive visual treatment — green accent or checkmark — not a blank placeholder

**Given** a user navigates directly to `/predictions` when no `needs_review` predictions exist
**When** the page loads server-side
**Then** the empty state panel is rendered server-side so no flash of an empty table occurs
**And** the sidebar nav badge shows "0"

**Given** a user switches to a non-empty status tab from the empty state view
**When** the tab is clicked
**Then** the table renders with the filtered results for that status
**And** switching back to "Needs Review" shows the empty state again

**Given** new `needs_review` predictions arrive while the user views the empty state
**When** the user is on the empty state screen
**Then** the page does not automatically detect new items (no background polling on the empty state)
**And** a manual page refresh will show the updated queue
Data stewards see only items that need their attention, can read confidence scores and inline evidence without clicking through, bulk-submit items to Akeneo Collaborative Workflow, adjust categories inline, and watch their queue counter drop to zero.

**User outcome:** A data steward can process an entire review queue efficiently from a single list view, with full evidence and no tool-switching.
**FRs covered:** FR6, FR7, FR8
**UX-DRs:** UX-DR5 (review queue page), UX-DR6 (inline evidence rows), UX-DR7 (bulk action bar), UX-DR9 (inline category adjustment), UX-DR10 (empty state), UX-DR14 (queue counter)
