---
stepsCompleted: [1, 2, 3, 4, 5, 6, 7, 8]
lastStep: 8
status: 'complete'
completedAt: '2026-05-19'
inputDocuments: ["/var/www/html/ai_cats/_bmad-output/planning-artifacts/prd.md", "/var/www/html/ai_cats/_bmad-output/project-context.md"]
workflowType: 'architecture'
project_name: 'ai_cats'
user_name: 'Sayre'
date: '2026-05-19'
---

# Architecture Decision Document

_This document is built collaboratively through step-by-step discovery. Sections are appended as each architectural decision is worked through._

## Project Context Analysis

### Requirements Overview

**Functional Requirements:**

The system requires 35 functional capabilities across 6 major domains:

- **Product Categorization (FR1-FR9):** AI-powered categorization using multiple data sources (part number patterns, manufacturer categories, enrichment data, product descriptions), confidence scoring, batch processing with status tracking, and confidence-based routing to Akeneo Collaborative Workflow for predictions below the 90% threshold
- **Evidence & Confidence (FR10-FR13):** Explainable evidence generation showing reasoning behind each prediction, evidence trails linking predictions to source data, confidence score distribution tracking, and comprehensive audit trail maintenance
- **Akeneo Integration (FR14-FR18):** Bidirectional API integration for receiving product data and submitting categorization results, Collaborative Workflow integration for human review, API rate limiting and retry logic handling
- **Data Management (FR19-FR23):** CSV import for supplier enrichment data with validation and normalization, PostgreSQL storage with string handling for SKU/MPN/part numbers, preservation of original supplier values for traceability
- **System Administration (FR24-FR28):** Configuration dashboard for API credentials and confidence thresholds, user authentication and session management, automated backup scheduling
- **Monitoring & Troubleshooting (FR29-FR35):** Comprehensive logging of decisions and errors, health monitoring dashboard with system metrics, automatic batch retry mechanisms, troubleshooting tools, alert system, and status display

**Non-Functional Requirements:**

The system must meet 17 non-functional requirements across 4 categories:

- **Performance (NFR1-NFR4):** Dashboard pages load within 2 seconds, batch categorization at 5-10 seconds per SKU, status polling updates within 5-10 seconds, CSV import completes within 30 seconds for files up to 10,000 rows
- **Security (NFR5-NFR9):** API credentials stored in environment variables, admin dashboard authentication, HTTPS-only Akeneo communication, restricted PostgreSQL access, logs that do not expose credentials or other sensitive data
- **Integration (NFR10-NFR13):** 30-second API timeouts with retry logic, exponential backoff for rate limiting, API format compatibility maintenance, 5-minute alert triggers for integration failures
- **Reliability (NFR14-NFR17):** 99% uptime during business hours, daily automated backups with 1-hour RTO, automatic batch retry up to 3 times, audit trail integrity with no data loss

**Scale & Complexity:**

- Primary domain: Web application with data processing and ML components
- Complexity level: Medium-high (involves ML components, data governance, external PIM system integration)
- Estimated architectural components: 8-10 major components (web frontend, API layer, categorization engine, database, Akeneo integration, CSV processing, monitoring/logging, admin configuration)

### Technical Constraints & Dependencies

**Technology Stack Constraints:**
- PHP-first architecture for web application
- PostgreSQL for data storage
- JavaScript for frontend interactivity
- Python approved for data processing, AI/ML, embeddings, batch jobs
- No React, Next.js, Laravel, Node.js, Express unless explicitly justified
- Self-hosted deployment with no unapproved cloud dependencies
- SKU/MPN/part numbers must be handled as strings (never numeric coercion)
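
The string rule exists because numeric coercion silently changes identifiers. A minimal JavaScript sketch of the failure mode follows; the sample values and the `normalizeIdentifier` helper are hypothetical and do not come from the project context.

```javascript
// Hypothetical sample identifiers; real SKU/MPN formats come from supplier data.
const samples = ['00123', '1E5', '12-34'];

// What numeric coercion would do to each identifier (the anti-pattern).
const coerced = samples.map((value) => String(Number(value)));
// '00123' loses its leading zeros, '1E5' is read as scientific notation,
// and '12-34' is not a valid number at all.

// Assumed normalization: trim surrounding whitespace only, never change the type.
function normalizeIdentifier(rawValue) {
    return String(rawValue).trim();
}
```

Comparing `samples` with `coerced` shows that none of the three identifiers survives a round trip through `Number`, which is why every layer should pass these values along as strings.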

**Integration Dependencies:**
- Akeneo PIM system for bidirectional data flow
- Akeneo Collaborative Workflow for human review process
- CSV file upload for supplier enrichment data

**Operational Constraints:**
- Confidence threshold starts at 90% for automatic categorization
- Human approval occurs inside Akeneo, not in this application
- System must preserve auditability for all decisions
- Imports must be repeatable and diagnosable through job/error tables
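
The threshold rule can be sketched as a single comparison. In this architecture the check would most likely live in the server-side categorization code; it is shown in JavaScript only to keep the examples in one language. The route labels, the 0-to-1 confidence scale, and the function name are assumptions for illustration.

```javascript
// Assumed representation: confidence as a fraction between 0 and 1.
const CONFIDENCE_THRESHOLD = 0.9;

// Route labels are illustrative, not identifiers from the PRD.
function routePrediction(confidence, threshold = CONFIDENCE_THRESHOLD) {
    if (typeof confidence !== 'number' || Number.isNaN(confidence)) {
        throw new TypeError('confidence must be a number');
    }
    // Predictions below the threshold go to human review in Akeneo;
    // everything at or above it is categorized automatically.
    return confidence < threshold ? 'akeneo_workflow' : 'auto_categorize';
}
```

Keeping the threshold as a parameter reflects that 90% is a starting value that administrators may later change.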

### Cross-Cutting Concerns Identified

- **Auditability:** Every import, prediction, workflow submission, approval outcome, and assignment must be traceable
- **Confidence Scoring:** Confidence calculations must be explainable and stored for display in UI
- **API Integration Resilience:** Rate limiting, retry logic, timeout handling, and error recovery for Akeneo API
- **Data Normalization:** Consistent processing of imported data while preserving original values for traceability
- **Error Handling & Logging:** Row-level import error capture, comprehensive logging for troubleshooting, diagnosable failures
- **Security:** Credential management, input validation, least-privilege database access, logs that do not expose credentials or other sensitive data
- **Performance:** Batch processing efficiency, dashboard responsiveness, status update polling strategy

## Starter Template Evaluation

### Primary Technology Domain

Custom PHP web application with PostgreSQL database and plain JavaScript frontend, based on project requirements analysis and technical constraints documented in project context.

### Starter Options Considered

No starter template evaluation was performed. The project context explicitly defines a custom PHP-first architecture with specific constraints:
- PHP as primary backend language
- PostgreSQL for data storage
- Plain JavaScript for frontend (no frameworks unless explicitly justified)
- Self-hosted deployment with no cloud dependencies
- No React, Next.js, Laravel, Node.js, or Express unless story-justified

These constraints make modern framework-based starter templates inappropriate for this project.

### Selected Starter: Custom Build from Scratch

**Rationale for Selection:**

The project context provides comprehensive technical architecture rules that establish a clear foundation:
- PHP-first web application with PDO database access
- PostgreSQL database with specific indexing and data handling requirements
- Plain JavaScript for frontend interactivity
- Python for data processing/ML tasks where appropriate
- Self-hosted deployment avoiding cloud dependencies

Using a framework-based starter would conflict with these established technical preferences and introduce unnecessary complexity.

**Initialization Approach:**

No CLI starter command. Project will be initialized with:
- Standard PHP project structure
- PostgreSQL database schema setup
- Basic HTML/CSS/JavaScript frontend
- Composer for PHP dependency management
- Environment configuration (.env files)

**Architectural Decisions Established by Project Context:**

**Language & Runtime:**
- PHP for primary backend (version compatible with target deployment environment)
- JavaScript for frontend (ES6+ compatible with target browsers)
- Python for data processing/ML tasks (isolated scripts/jobs)

**Database Layer:**
- PostgreSQL as primary database
- PDO for database access with prepared statements
- Explicit column names in SQL queries
- Indexes on high-volume lookup paths (SKU, MPN, category, import job, prediction status)

**Frontend Architecture:**
- Server-side rendering with PHP for core functionality
- Plain JavaScript for interactive elements (AJAX for status updates)
- Progressive enhancement (core functionality works without JavaScript)
- No frontend frameworks unless explicitly story-justified

**Code Organization:**
- Thin route/controller code
- Business logic in reusable services
- Long-running jobs via PHP CLI scripts, queue tables, or cron jobs
- Environment variables for configuration

**Development Experience:**
- Standard PHP development workflow
- Manual testing initially, automated tests as stories define requirements
- Server-side session management for authentication

**Note:** Project initialization will be the first implementation story, establishing the basic project structure following project context guidelines.

## Core Architectural Decisions

### Decision Priority Analysis

**Critical Decisions (Block Implementation):**
- Data modeling approach (hybrid normalized + JSONB)
- Database migration strategy (Phinx)
- Authentication method (session-based)
- API design patterns (simple PHP endpoints)
- Environment configuration (.env files)

**Important Decisions (Shape Architecture):**
- Data validation strategy (database + application)
- Authorization patterns (admin/user roles)
- Error handling standards (JSON responses)
- Frontend state management (simple JavaScript objects)
- Logging approach (application-level file logging)

**Deferred Decisions (Post-MVP):**
- Caching strategy (can add if performance NFRs not met)
- Rate limiting (can add if abuse detected)
- Scaling strategy (can add if single server insufficient)
- Advanced monitoring (can add if basic logging insufficient)

### Data Architecture

**Data Modeling Approach:**
- Decision: Hybrid approach (normalized core tables + JSONB for flexible data)
- Rationale: Normalized core tables (products, predictions, import jobs) ensure data integrity and query performance, while JSONB columns provide flexibility for enrichment data and evidence storage
- Affects: Database schema design, query patterns, data access layer
- Provided by Starter: No

**Data Validation Strategy:**
- Decision: Database constraints + application validation
- Rationale: Validating at both layers protects data integrity: database constraints act as the last line of defense, and application validation produces user-friendly error messages
- Affects: Database schema, service layer code, error handling
- Provided by Starter: No

**Migration Approach:**
- Decision: PHP migration framework (Phinx)
- Rationale: Provides version control for schema changes, rollback capability, and team collaboration support
- Version: Latest stable Phinx version
- Affects: Database schema management, deployment process
- Provided by Starter: No

**Caching Strategy:**
- Decision: No caching initially
- Rationale: Simplest approach, can add caching later if performance NFRs (2-second dashboard load, 5-10 second per SKU categorization) are not met
- Affects: Application architecture, performance optimization
- Provided by Starter: No

### Authentication & Security

**Authentication Method:**
- Decision: Session-based authentication
- Rationale: Already specified in project context, simple and effective for internal tools, no need for token-based complexity
- Affects: User session management, middleware implementation
- Provided by Starter: No

**Authorization Patterns:**
- Decision: Simple admin/user roles
- Rationale: Data stewards need review access, admins need configuration access, simple RBAC sufficient for internal tool
- Affects: User management, access control implementation
- Provided by Starter: No

**Security Middleware:**
- Decision: CSRF protection + input validation
- Rationale: Essential security measures for web applications, rate limiting can be added later if needed
- Affects: Middleware implementation, form handling
- Provided by Starter: No

**Data Encryption:**
- Decision: HTTPS only (encryption in transit)
- Rationale: NFR7 requires HTTPS for Akeneo communication, database encryption not needed for self-hosted trusted environment
- Affects: SSL/TLS configuration, data transmission
- Provided by Starter: No

**API Security Strategy:**
- Decision: API key authentication for Akeneo
- Rationale: Simple and secure for service-to-service communication, credentials stored in environment variables (NFR5 requirement)
- Affects: Akeneo integration implementation, credential management
- Provided by Starter: No

### API & Communication Patterns

**API Design Patterns:**
- Decision: Simple PHP endpoints returning JSON
- Rationale: No need for formal REST framework, PHP endpoints returning JSON for AJAX calls sufficient for internal tool
- Affects: API layer implementation, frontend-backend communication
- Provided by Starter: No

**API Documentation:**
- Decision: Inline code comments + simple documentation file
- Rationale: Internal tool, comprehensive documentation not required, simple approach sufficient
- Affects: Code documentation practices
- Provided by Starter: No

**Error Handling Standards:**
- Decision: JSON error responses with consistent structure
- Rationale: A consistent JSON error structure lets the frontend handle errors uniformly and show useful feedback to users
- Affects: Error handling middleware, API response structure
- Provided by Starter: No

**Rate Limiting Strategy:**
- Decision: No rate limiting initially
- Rationale: Internal tool with trusted users, can add session-based limiting later if needed
- Affects: API middleware, performance considerations
- Provided by Starter: No

### Frontend Architecture

**State Management Approach:**
- Decision: Simple JavaScript object state
- Rationale: Minimal state needed (dashboard status, form data), no complex state management required
- Affects: Frontend code organization, data flow
- Provided by Starter: No

**Component Architecture:**
- Decision: Function-based components
- Rationale: Reusable JavaScript functions for UI elements (status displays, form handlers), simple and maintainable
- Affects: Frontend code structure, reusability
- Provided by Starter: No
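
A function-based component in this sense can be as small as a function that returns markup. The sketch below is one possible shape, not a prescribed one; `renderStatusBadge`, `escapeHtml`, and the `status-badge` class name are hypothetical.

```javascript
// Escape text before placing it in HTML to avoid markup injection.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Hypothetical status-display component: takes data, returns an HTML string.
function renderStatusBadge(status) {
    const safeStatus = escapeHtml(status);
    return `<span class="status-badge" data-status="${safeStatus}">${safeStatus}</span>`;
}
```

A caller would assign the returned string to an element's `innerHTML`, which keeps the component itself free of DOM lookups and easy to reuse.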

**Routing Strategy:**
- Decision: PHP server-side routing only
- Rationale: MPA approach specified in PRD, no need for client-side routing complexity
- Affects: Navigation implementation, URL structure
- Provided by Starter: No

**Performance Optimization:**
- Decision: Browser caching headers + asset minification
- Rationale: Basic optimization to meet NFR1 (2-second page load), simple to implement
- Affects: Asset management, server configuration
- Provided by Starter: No

### Infrastructure & Deployment

**Hosting Strategy:**
- Decision: Traditional VPS/dedicated server
- Rationale: Self-hosted requirement, Docker adds complexity for PHP application, traditional hosting simpler
- Affects: Deployment process, server configuration
- Provided by Starter: No

**CI/CD Pipeline:**
- Decision: Manual deployment (SSH + git pull)
- Rationale: Single developer, simple deployment process sufficient, no need for CI/CD complexity
- Affects: Deployment workflow, version control
- Provided by Starter: No

**Environment Configuration:**
- Decision: .env files per environment
- Rationale: Standard practice, keeps credentials out of code (NFR5 requirement), easy to manage multiple environments
- Affects: Configuration management, security practices
- Provided by Starter: No

**Monitoring and Logging:**
- Decision: Application-level logging to files
- Rationale: The Monitoring & Troubleshooting requirements (FR29-FR35) call for detailed error context; file logging is simple and effective for troubleshooting
- Affects: Logging implementation, error tracking
- Provided by Starter: No

**Scaling Strategy:**
- Decision: No scaling strategy (single server)
- Rationale: Internal tool, NFR14 (99% uptime) achievable on single server, can add scaling later if needed
- Affects: Infrastructure planning, capacity management
- Provided by Starter: No

### Decision Impact Analysis

**Implementation Sequence:**
1. Environment configuration setup (.env files)
2. Database schema with Phinx migrations
3. Authentication and authorization implementation
4. Core API endpoints (PHP + JSON responses)
5. Frontend components and state management
6. Security middleware (CSRF, input validation)
7. Logging implementation
8. Akeneo API integration with authentication
9. Performance optimization (caching headers, minification)
10. Backup automation setup

**Cross-Component Dependencies:**
- Database schema must be established before API implementation
- Authentication must be implemented before authorization checks
- Logging infrastructure should be early in sequence for troubleshooting
- Akeneo integration depends on environment configuration and API patterns
- Frontend components depend on API endpoints being available
- Security middleware applies across all API endpoints

## Implementation Patterns & Consistency Rules

### Pattern Categories Defined

**Critical Conflict Points Identified:**
25 areas where AI agents could make different choices across naming, structure, format, and process patterns

### Naming Patterns

**Database Naming Conventions:**
- Tables: snake_case, plural (e.g., `import_jobs`, `products`, `predictions`)
- Columns: snake_case (e.g., `import_job_id`, `created_at`)
- Foreign keys: singular name of the referenced table plus `_id` (e.g., `import_job_id`, `product_id`)
- Indexes: idx_table_column (e.g., `idx_import_jobs_status`, `idx_products_sku`)
- Primary keys: `id` (integer, auto-increment)

**API Naming Conventions:**
- Endpoints: kebab-case, plural (e.g., `/api/import-jobs`, `/api/products`)
- Route parameters: colon prefix (e.g., `:id`, `:import_job_id`)
- Query parameters: snake_case (e.g., `import_job_id`, `status`)
- Custom headers: X- prefix (e.g., `X-Request-ID`, `X-API-Key`)
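
On the frontend, these conventions can be applied in one place by building every URL through a small helper. This is a sketch under the conventions listed above; `buildApiUrl` and its option names are hypothetical.

```javascript
// Build a URL for an internal endpoint.
// resource: kebab-case plural name, e.g. 'import-jobs'
// id:       optional route parameter (the `:id` segment)
// query:    optional object whose keys are snake_case query parameters
function buildApiUrl(resource, { id = null, query = {} } = {}) {
    let path = `/api/${resource}`;
    if (id !== null) {
        path += `/${encodeURIComponent(id)}`;
    }
    const queryString = new URLSearchParams(query).toString();
    return queryString ? `${path}?${queryString}` : path;
}
```

For example, `buildApiUrl('products', { query: { import_job_id: '42', status: 'pending' } })` yields `/api/products?import_job_id=42&status=pending`.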

**Code Naming Conventions:**
- PHP classes: PascalCase (e.g., `ImportJobService`, `ProductController`)
- PHP functions: camelCase (e.g., `getImportJobData`, `createProduct`)
- PHP variables: camelCase with $ prefix (e.g., `$importJobId`, `$productName`)
- JavaScript functions: camelCase (e.g., `getImportJobData`, `submitForm`)
- JavaScript variables: camelCase (e.g., `importJobId`, `formData`)
- Constants: UPPER_SNAKE_CASE (e.g., `CONFIDENCE_THRESHOLD`, `API_TIMEOUT`)

### Structure Patterns

**Project Organization:**
- Tests: `tests/` directory at root, mirroring `src/` structure
- Features organized by domain: `src/Features/ImportJobs/`, `src/Features/Categorization/`
- Shared utilities: `src/Utils/` for reusable helper functions
- Services: `src/Services/` for business logic
- Controllers: `src/Controllers/` for request handling
- Models: `src/Models/` for database entities
- Database migrations: `db/migrations/` for Phinx migration files
- Views: `src/Views/` for PHP templates

**File Structure Patterns:**
- Config: `config/` directory for configuration files
- Static assets: `public/assets/` (css/, js/, images/)
- Documentation: `docs/` directory for project documentation
- Environment files: `.env` (local), `.env.example` (template)
- Logs: `logs/` directory for application logs

### Format Patterns

**API Response Formats:**
- Success response: `{"success": true, "data": {...}}`
- Error response: `{"success": false, "error": {"code": "ERROR_CODE", "message": "Human readable message"}}`
- Date format in JSON: ISO 8601 strings (YYYY-MM-DDTHH:mm:ssZ)
- Pagination response: `{"success": true, "data": [...], "pagination": {"total": 100, "page": 1, "per_page": 25}}`
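
Frontend code can rely on this envelope by unwrapping it in a single helper. The sketch below assumes the response body has already been parsed from JSON; `ApiError`, `unwrapResponse`, and the `UNKNOWN_ERROR` fallback code are hypothetical names.

```javascript
// Error type carrying the machine-readable code from the error envelope.
class ApiError extends Error {
    constructor(code, message) {
        super(message);
        this.name = 'ApiError';
        this.code = code;
    }
}

// Return the payload of a success envelope, or throw for anything else.
function unwrapResponse(body) {
    if (body && body.success === true) {
        // pagination is only present on list responses.
        return { data: body.data, pagination: body.pagination ?? null };
    }
    const error = (body && body.error) || {};
    throw new ApiError(
        error.code ?? 'UNKNOWN_ERROR',
        error.message ?? 'Unexpected response'
    );
}
```

Throwing on the error envelope lets callers use ordinary `try`/`catch` and read `error.code` when they need to branch on a specific failure.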

**Data Exchange Formats:**
- JSON field naming: snake_case for database/API, camelCase for JavaScript
- Boolean representations: true/false in JSON, native `BOOLEAN` columns in PostgreSQL (not 1/0 integers)
- Null handling: null for missing values in JSON, NULL in database
- Array vs object: Use arrays for lists, objects for single items
- Timestamps: Unix timestamps or ISO 8601 strings (consistent per context)
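
Because the API uses snake_case and JavaScript uses camelCase, key names need converting at the boundary. A minimal sketch, assuming the input is JSON-parsed data (plain objects, arrays, and primitives only); both function names are hypothetical.

```javascript
// Convert one snake_case key to camelCase, e.g. import_job_id -> importJobId.
function snakeToCamel(key) {
    return key.replace(/_([a-z0-9])/g, (_, char) => char.toUpperCase());
}

// Recursively convert the keys of objects and arrays; values are left untouched,
// so string identifiers such as SKUs stay strings.
function camelizeKeys(value) {
    if (Array.isArray(value)) {
        return value.map((item) => camelizeKeys(item));
    }
    if (value !== null && typeof value === 'object') {
        return Object.fromEntries(
            Object.entries(value).map(([key, inner]) => [snakeToCamel(key), camelizeKeys(inner)])
        );
    }
    return value;
}
```

Doing the conversion once, right after a response is parsed, keeps the rest of the frontend code consistently camelCase.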

### Communication Patterns

**State Management Patterns:**
- State updates: Direct mutation acceptable for simple JavaScript objects
- State organization: Group by feature (e.g., `importJobs.state`, `categorization.state`)
- Loading states: Boolean flags (e.g., `isLoading`, `isSubmitting`)
- Error states: Store error messages (e.g., `errorMessage`)

### Process Patterns

**Error Handling Patterns:**
- Global error handler: Catches unhandled exceptions, returns JSON error response
- Error logging: Log full error context (stack trace, request data) to file
- User-facing errors: Human-readable messages without technical details
- HTTP status codes: 200 (success), 400 (client error), 401 (unauthorized), 403 (forbidden), 404 (not found), 500 (server error)
- Validation errors: Return 400 with field-specific error messages

**Loading State Patterns:**
- Loading state naming: `isLoading` (general), `isSubmitting` (form submission)
- Local loading states: Per operation (e.g., `isImporting`, `isCategorizing`)
- Loading UI: Spinner or progress indicator during async operations
- Loading persistence: Clear loading state on success or error
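
The loading rules above amount to three small state transitions. The sketch below uses direct mutation of a feature slice, as the state management patterns allow; the helper names are hypothetical.

```javascript
// Feature-grouped state with a loading flag and an error message per slice.
const state = {
    importJobs: { isLoading: false, data: [], errorMessage: null }
};

// Mark a slice as loading and clear any previous error.
function beginLoading(slice) {
    slice.isLoading = true;
    slice.errorMessage = null;
}

// Store the result and clear the loading flag.
function finishLoading(slice, data) {
    slice.data = data;
    slice.isLoading = false;
}

// Record a user-facing message and clear the loading flag.
function failLoading(slice, message) {
    slice.errorMessage = message;
    slice.isLoading = false;
}
```

A typical caller would run `beginLoading(state.importJobs)` before a request, then exactly one of `finishLoading` or `failLoading` afterwards, so the flag is always cleared on both success and error.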

**Validation Patterns:**
- Validation timing: Validate before database operations
- Validation location: Application layer (services) + database constraints
- Validation error format: Field-specific messages in JSON response
- Required fields: Return 400 with missing field names

### Enforcement Guidelines

**All AI Agents MUST:**

- Follow naming conventions exactly as specified
- Organize code according to project structure patterns
- Use consistent API response formats
- Implement error handling with logging and user-friendly messages
- Write tests in `tests/` directory mirroring `src/` structure
- Use environment variables for configuration (never hardcode credentials)
- Handle SKU/MPN/part numbers as strings (never numeric coercion)

**Pattern Enforcement:**

- Code review should check naming convention compliance
- Automated tests should verify API response format consistency
- Documentation should be updated if patterns change
- Pattern violations should be documented in code comments with rationale
- Pattern updates require team discussion and documentation update

### Pattern Examples

**Good Examples:**

```php
// Database naming (SQL, shown here for comparison)
CREATE TABLE import_jobs (
    id SERIAL PRIMARY KEY,
    status VARCHAR(50) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

// PHP class naming
class ImportJobService {
    public function getImportJobData(int $importJobId): array {
        // Implementation goes here; a value must be returned to satisfy the array type
        return [];
    }
}

// API endpoint naming (HTTP route, not PHP code)
GET /api/import-jobs/:id
```

```javascript
// JavaScript naming
function getImportJobData(importJobId) {
    return fetch(`/api/import-jobs/${importJobId}`);
}

// State management
const state = {
    importJobs: {
        isLoading: false,
        data: [],
        errorMessage: null
    }
};
```

**Anti-Patterns:**

```php
// WRONG: Inconsistent naming
class import_job_service {  // Should be ImportJobService
    public function get_import_job_data($import_job_id) {  // Should be camelCase
        // Implementation
    }
}

// WRONG: Hardcoded credentials
$apiKey = 'hardcoded-api-key';  // Should use environment variable
```

```javascript
// WRONG: Inconsistent naming
function get_import_job_data(import_job_id) {  // Should be camelCase
    // Implementation
}

// WRONG: Numeric coercion for part numbers
const partNumber = parseInt(sku);  // Should keep as string
```

### Technology-Specific Patterns

**PHP Patterns:**
- Use PDO prepared statements for all database queries
- Use type declarations in function signatures
- Return consistent array structures from services
- Use exceptions for error handling, catch at controller level

**JavaScript Patterns:**
- Use const/let, avoid var
- Use arrow functions for callbacks
- Handle fetch errors with try/catch
- Use template literals for string interpolation
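
Taken together, the JavaScript patterns above might look like the following sketch. The endpoint path follows the naming conventions in this document; `loadImportJob`, `toUserMessage`, and the message texts are hypothetical.

```javascript
// Arrow function that turns a thrown value into a user-facing message.
const toUserMessage = (error) =>
    error instanceof TypeError
        ? 'Network error. Please check your connection and try again.'
        : 'The request failed. Please try again.';

// fetchImpl is injectable so the function can be exercised without a browser.
async function loadImportJob(importJobId, fetchImpl = fetch) {
    try {
        const response = await fetchImpl(`/api/import-jobs/${encodeURIComponent(importJobId)}`);
        if (!response.ok) {
            // fetch rejects only on network failure, so HTTP errors need an explicit check.
            throw new Error(`HTTP ${response.status}`);
        }
        return { ok: true, body: await response.json() };
    } catch (error) {
        return { ok: false, errorMessage: toUserMessage(error) };
    }
}
```

Returning a result object instead of rethrowing is one design choice among several; it keeps technical details out of the user-facing message, in line with the error handling patterns.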

**SQL Patterns:**
- Use explicit column names in SELECT statements
- Use parameterized queries (never concatenate strings)
- Add indexes on foreign keys and frequently queried columns
- Use transactions for multi-step operations

## Project Structure & Boundaries

### Complete Project Directory Structure

```
ai_cats/
├── README.md
├── composer.json
├── .env
├── .env.example
├── .gitignore
├── config/
│   ├── database.php
│   ├── app.php
│   └── akeneo.php
├── db/
│   ├── phinx.php
│   └── migrations/
│       ├── 20240519000001_create_users_table.php
│       ├── 20240519000002_create_import_jobs_table.php
│       ├── 20240519000003_create_products_table.php
│       ├── 20240519000004_create_product_attributes_table.php
│       ├── 20240519000005_create_predictions_table.php
│       ├── 20240519000006_create_evidence_table.php
│       └── 20240519000007_create_akeneo_workflow_table.php
├── public/
│   ├── index.php
│   ├── assets/
│   │   ├── css/
│   │   │   └── styles.css
│   │   └── js/
│   │       ├── app.js
│   │       ├── import-jobs.js
│   │       ├── categorization.js
│   │       └── admin.js
│   └── uploads/
│       └── csv/
├── src/
│   ├── Controllers/
│   │   ├── AuthController.php
│   │   ├── ImportJobController.php
│   │   ├── ProductController.php
│   │   ├── PredictionController.php
│   │   └── AdminController.php
│   ├── Services/
│   │   ├── AuthService.php
│   │   ├── ImportJobService.php
│   │   ├── ProductService.php
│   │   ├── CategorizationService.php
│   │   ├── AkeneoService.php
│   │   └── EvidenceService.php
│   ├── Models/
│   │   ├── User.php
│   │   ├── ImportJob.php
│   │   ├── Product.php
│   │   ├── ProductAttribute.php
│   │   ├── Prediction.php
│   │   ├── Evidence.php
│   │   └── AkeneoWorkflow.php
│   ├── Utils/
│   │   ├── Database.php
│   │   ├── Logger.php
│   │   ├── Validator.php
│   │   └── CsvParser.php
│   ├── Middleware/
│   │   ├── AuthMiddleware.php
│   │   ├── CsrfMiddleware.php
│   │   └── ErrorMiddleware.php
│   └── Views/
│       ├── layouts/
│       │   └── main.php
│       ├── auth/
│       │   └── login.php
│       ├── import-jobs/
│       │   ├── index.php
│       │   ├── create.php
│       │   └── show.php
│       ├── products/
│       │   ├── index.php
│       │   └── show.php
│       ├── predictions/
│       │   ├── index.php
│       │   └── show.php
│       └── admin/
│           ├── index.php
│           └── settings.php
├── scripts/
│   ├── categorize.php
│   └── process_import.php
├── logs/
│   ├── app.log
│   └── error.log
├── tests/
│   ├── Controllers/
│   │   ├── AuthControllerTest.php
│   │   └── ImportJobControllerTest.php
│   ├── Services/
│   │   ├── ImportJobServiceTest.php
│   │   └── CategorizationServiceTest.php
│   └── Utils/
│       └── CsvParserTest.php
└── docs/
    ├── api.md
    └── deployment.md
```

### Architectural Boundaries

**API Boundaries:**
- External API: Akeneo PIM API (authenticated via API key)
- Internal API: Simple PHP endpoints under `/api/` prefix
- Authentication boundary: Session-based auth required for all endpoints except login
- Data access boundary: All database access through Models with PDO

**Component Boundaries:**
- Frontend-backend communication: AJAX fetch calls to PHP endpoints
- Service layer: Business logic isolated in Services, controllers handle HTTP only
- View layer: PHP templates for server-side rendering, JavaScript for interactivity
- State management: Simple JavaScript objects, no complex state management

**Service Boundaries:**
- ImportJobService: Handles CSV import, validation, and job tracking
- CategorizationService: AI categorization logic and confidence scoring
- AkeneoService: All Akeneo API communication and workflow integration
- EvidenceService: Evidence generation and storage
- AuthService: Session management and authorization

**Data Boundaries:**
- Database access: All queries through Model classes with PDO prepared statements
- Schema changes: Managed through Phinx migrations only
- Data validation: Application layer (Services) + database constraints
- External data: Akeneo API integration through AkeneoService only

### Requirements to Structure Mapping

**Feature Mapping:**

**Product Categorization (FR1-FR9):**
- Services: src/Services/CategorizationService.php, src/Services/ProductService.php
- Controllers: src/Controllers/ProductController.php, src/Controllers/PredictionController.php
- Models: src/Models/Product.php, src/Models/Prediction.php, src/Models/Evidence.php
- Views: src/Views/predictions/
- Scripts: scripts/categorize.php
- Database: db/migrations/*products*, *predictions*, *evidence*

**Evidence & Confidence (FR10-FR13):**
- Services: src/Services/EvidenceService.php
- Models: src/Models/Evidence.php
- Database: db/migrations/*evidence*

**Akeneo Integration (FR14-FR18):**
- Services: src/Services/AkeneoService.php
- Models: src/Models/AkeneoWorkflow.php
- Config: config/akeneo.php
- Database: db/migrations/*akeneo_workflow*

**Data Management (FR19-FR23):**
- Services: src/Services/ImportJobService.php, src/Services/ProductService.php
- Controllers: src/Controllers/ImportJobController.php
- Models: src/Models/ImportJob.php, src/Models/Product.php, src/Models/ProductAttribute.php
- Utils: src/Utils/CsvParser.php
- Views: src/Views/import-jobs/
- Scripts: scripts/process_import.php
- Database: db/migrations/*import_jobs*, *products*, *product_attributes*

**System Administration (FR24-FR28):**
- Controllers: src/Controllers/AdminController.php
- Services: src/Services/AuthService.php
- Views: src/Views/admin/
- Config: config/app.php, config/database.php
- Database: db/migrations/*users*

**Monitoring & Troubleshooting (FR29-FR35):**
- Utils: src/Utils/Logger.php
- Middleware: src/Middleware/ErrorMiddleware.php
- Directory: logs/
- All services include logging calls

**Cross-Cutting Concerns:**

**Authentication System:**
- Services: src/Services/AuthService.php
- Controllers: src/Controllers/AuthController.php
- Middleware: src/Middleware/AuthMiddleware.php
- Models: src/Models/User.php
- Views: src/Views/auth/
- Database: db/migrations/*users*

**Error Handling:**
- Middleware: src/Middleware/ErrorMiddleware.php
- Utils: src/Utils/Logger.php
- All Controllers: Implement consistent error response format
- Directory: logs/

**CSRF Protection:**
- Middleware: src/Middleware/CsrfMiddleware.php
- All form submissions: Include CSRF token validation
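
For AJAX submissions, the token has to travel with each state-changing request. The sketch below shows one way to attach it; the `X-CSRF-Token` header name, `buildJsonHeaders`, and `postJson` are assumptions, since the actual contract is defined by `CsrfMiddleware`.

```javascript
// Assumed header name; the server-side CsrfMiddleware defines the real one.
const CSRF_HEADER = 'X-CSRF-Token';

// Build the headers for a JSON request that changes state.
function buildJsonHeaders(csrfToken) {
    if (!csrfToken) {
        throw new Error('Missing CSRF token');
    }
    return {
        'Content-Type': 'application/json',
        [CSRF_HEADER]: csrfToken
    };
}

// Send a POST with the CSRF token attached. fetchImpl is injectable for testing.
function postJson(url, payload, csrfToken, fetchImpl = fetch) {
    return fetchImpl(url, {
        method: 'POST',
        credentials: 'same-origin', // include the session cookie
        headers: buildJsonHeaders(csrfToken),
        body: JSON.stringify(payload)
    });
}
```

In the browser the token could be read from a value the PHP layout renders into the page, for example a meta tag; that mechanism is also an assumption and is not specified here.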

**Database Access:**
- Utils: src/Utils/Database.php
- Models: All src/Models/ extend base Model class
- Migrations: db/migrations/
- Config: config/database.php

### Integration Points

**Internal Communication:**
- Controllers → Services: Direct method calls
- Services → Models: Direct method calls for data access
- Services → Utils: Reusable helper functions
- Frontend → Backend: AJAX fetch to PHP endpoints
- Controllers → Views: Server-side rendering via PHP includes

**External Integrations:**
- Akeneo PIM API: Through AkeneoService only
- Credentials: Stored in .env file, loaded via config/akeneo.php
- API authentication: API key in headers
- Rate limiting: Handled in AkeneoService with retry logic
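
The retry logic implies a delay schedule that grows with each attempt. In this architecture it would be implemented inside `AkeneoService` in PHP; the arithmetic is sketched in JavaScript only to keep the examples in one language. The base delay, the cap, and both function names are assumptions, as the document does not fix them.

```javascript
// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ...
// capped at maxDelayMs so a long outage does not produce unbounded waits.
function backoffDelayMs(attempt, baseDelayMs = 1000, maxDelayMs = 30000) {
    return Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
}

// Full schedule for a given number of retries.
function backoffSchedule(maxRetries, baseDelayMs = 1000, maxDelayMs = 30000) {
    return Array.from({ length: maxRetries }, (_, index) =>
        backoffDelayMs(index + 1, baseDelayMs, maxDelayMs)
    );
}
```

With the assumed defaults, three retries wait 1, 2, and 4 seconds. Production implementations often add random jitter to each delay so that concurrent jobs do not retry in lockstep.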

**Data Flow:**
1. CSV upload → ImportJobController → ImportJobService → CsvParser
2. ImportJobService → Database (via Models) → ImportJob record created
3. Process import script → ProductService → Product normalization
4. CategorizationService → AI processing → Prediction + Evidence
5. AkeneoService → API submission → Workflow tracking
6. Frontend polls → API endpoints → Status updates
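
Step 6 of the flow above can be sketched as a polling loop that stops once a job finishes. The status names, the 5-second default interval, and the `pollJobStatus` signature are assumptions; `fetchStatus` stands in for whatever call retrieves the current status string.

```javascript
// Assumed status names; the real values come from the job status column.
const TERMINAL_STATUSES = ['completed', 'failed'];

const isTerminalStatus = (status) => TERMINAL_STATUSES.includes(status);

// Poll until the job reaches a terminal status.
// fetchStatus: () => Promise<string>; onUpdate receives each observed status.
function pollJobStatus(fetchStatus, onUpdate, intervalMs = 5000) {
    let timerId = null;
    let stopped = false;

    const tick = async () => {
        let status = null;
        try {
            status = await fetchStatus();
        } catch {
            // Transient failure: skip this update and try again on the next tick.
        }
        if (stopped) {
            return;
        }
        if (status !== null) {
            onUpdate(status);
            if (isTerminalStatus(status)) {
                return; // Job finished: stop polling.
            }
        }
        timerId = setTimeout(tick, intervalMs);
    };

    tick();

    // The returned function cancels polling, e.g. when the user leaves the page.
    return () => {
        stopped = true;
        clearTimeout(timerId);
    };
}
```

Scheduling the next request with `setTimeout` only after the previous one settles avoids overlapping requests when the server is slow, which a fixed `setInterval` would not guarantee.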

### File Organization Patterns

**Configuration Files:**
- Root: .env (environment variables), .env.example (template)
- config/: PHP configuration files loaded at bootstrap
- db/phinx.php: Phinx migration configuration

**Source Organization:**
- src/Controllers/: HTTP request handling
- src/Services/: Business logic layer
- src/Models/: Database entity models
- src/Utils/: Reusable helper functions
- src/Middleware/: Request/response processing
- src/Views/: PHP templates for server-side rendering

**Test Organization:**
- tests/: Mirrors src/ structure
- tests/Controllers/: Controller tests
- tests/Services/: Service layer tests
- tests/Utils/: Utility function tests

**Asset Organization:**
- public/assets/css/: Stylesheets
- public/assets/js/: JavaScript modules
- public/uploads/csv/: Temporary CSV upload storage
- public/index.php: Application entry point

### Development Workflow Integration

**Development Server Structure:**
- PHP built-in server or Apache/Nginx pointing to public/
- .env file for local development configuration
- logs/ directory for local debugging
- public/uploads/csv/ for testing CSV uploads

**Build Process Structure:**
- Composer for PHP dependencies
- No JavaScript build process (plain JS)
- Asset minification can be added via simple script if needed
- Phinx for database migrations: `vendor/bin/phinx migrate`

**Deployment Structure:**
- Manual deployment via SSH + git pull
- .env file on server (not in version control)
- Database migrations run via Phinx on deployment
- logs/ directory writable by web server user
- public/uploads/csv/ writable by web server user

## Architecture Validation Results

### Coherence Validation ✅

**Decision Compatibility:**
All technology choices are internally consistent. PHP + PDO + PostgreSQL form a coherent data access stack. Phinx (PHP) integrates cleanly with PostgreSQL for migration management. Session-based auth is natural for a PHP server-rendered MPA. Python scripts are isolated from the PHP web app and communicate through PostgreSQL tables — no conflicts. No contradictory decisions found.

**Pattern Consistency:**
Naming conventions are consistent: snake_case for DB, PascalCase for PHP classes, camelCase for functions/JS variables. API response format (success/data envelope) is referenced consistently across error handling and pagination patterns. Structure patterns (thin controllers → services → models) align with the chosen PHP stack.

**Structure Alignment:**
The directory structure supports all decisions: service layer boundaries are respected, credentials are in `.env` + `config/`, and logs are isolated to `logs/`. Python work is only implied by the CLI scripts and has no directory of its own; this structural concern is covered in the Gap Analysis below.

### Requirements Coverage Validation ✅

**Functional Requirements (35 FRs):**

| Domain | FRs | Architectural Support |
|---|---|---|
| Product Categorization | FR1–FR9 | CategorizationService, PredictionController, scripts/categorize.php |
| Evidence & Confidence | FR10–FR13 | EvidenceService, Evidence model, JSONB storage |
| Akeneo Integration | FR14–FR18 | AkeneoService, config/akeneo.php, retry/backoff logic |
| Data Management | FR19–FR23 | ImportJobService, CsvParser, string handling rule |
| System Administration | FR24–FR28 | AdminController, AuthService, config/, backup (mentioned only) |
| Monitoring & Troubleshooting | FR29–FR35 | Logger, ErrorMiddleware, logs/, alerts (mentioned only) |

All 35 FRs have architectural homes.

**Non-Functional Requirements (17 NFRs):**
- Performance (NFR1–4): Browser caching headers + asset minification cover dashboard load time; batch via background scripts covers per-SKU timing
- Security (NFR5–9): .env credentials, session auth, HTTPS-only Akeneo comms, PDO prepared statements, log sanitization
- Integration (NFR10–13): AkeneoService handles timeouts (30s), exponential backoff, retry logic, alert triggers
- Reliability (NFR14–17): 99% uptime on single server, backup automation (FR28), 3x retry, audit trail via DB
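
The retry behavior named in NFR10–13 and NFR14–17 (exponential backoff, 3x retry) can be sketched as follows. This is an illustration in Python, not the `AkeneoService` implementation, which is a PHP class. `TransientError`, `call_with_backoff`, and the 1-second base delay are hypothetical, and "3x retry" is read here as up to three retries after the first attempt.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. timeout, rate limit)."""


def call_with_backoff(
    request: Callable[[], T],
    max_retries: int = 3,        # assumption: "3x retry" = 3 retries after the first try
    base_delay: float = 1.0,     # assumption: delays of 1s, 2s, 4s
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying transient failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_retries must be >= 0")
```

Injecting `sleep` lets tests record the delays instead of waiting. The point where the final error is re-raised is where an alert trigger would naturally hook in.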

### Implementation Readiness Validation ✅

**Decision Completeness:**
Critical decisions are documented, but two details are left open. The PHP version is listed only as "compatible with target deployment environment" and is not pinned. Composer dependencies are referenced but not enumerated (Phinx version, HTTP client for the Akeneo API, test framework).

**Structure Completeness:**
Project structure is well-defined with specific file paths. Notable gap: no Python directory or script structure defined. The categorization engine almost certainly requires Python for ML/embeddings (per project context), but `scripts/categorize.php` is listed — not a `.py` file. The Python/PHP boundary is acknowledged conceptually but not structurally specified.

**Pattern Completeness:**
25 conflict points are addressed, and all major patterns have code examples. Error handling, loading states, validation, and security patterns are all covered.

### Gap Analysis Results

**Critical Gaps:** None that block implementation of initial stories — the PHP web application foundation is fully specified.

**Important Gaps:**

1. **Python directory structure undefined** — The project context explicitly approves Python for AI/ML/categorization/embeddings. The current structure only defines `scripts/categorize.php`. If categorization uses Python (likely for embedding-based similarity or ML models), there is no defined home for Python scripts, virtual environments, requirements files, or the PHP→Python communication boundary. This needs to be defined before the categorization epic.

2. **PHP version unpinned** — "Compatible with target deployment environment" leaves agents free to make incompatible choices. Should specify a minimum (e.g., PHP 8.1+).

3. **Queue/worker table structure not defined** — Long-running jobs are approved to use queue tables, but no queue table is listed in the migration plan. Agents will invent schemas independently without guidance.
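
To make gap 3 concrete, here is one possible starting point for the queue table and the worker's claim step. Every name in it (`job_queue`, its columns, `enqueue`, `claim_next`) is hypothetical and not a decision of this document. SQLite is used only so the sketch runs without a database server; on PostgreSQL, concurrent workers would typically claim rows with `SELECT ... FOR UPDATE SKIP LOCKED` instead of the guarded `UPDATE` shown here.

```python
import sqlite3

# Hypothetical schema; column names and status values are illustrative.
SCHEMA = """
CREATE TABLE job_queue (
    id       INTEGER PRIMARY KEY,
    job_type TEXT    NOT NULL,
    payload  TEXT    NOT NULL,
    status   TEXT    NOT NULL DEFAULT 'pending',
    attempts INTEGER NOT NULL DEFAULT 0
);
"""


def enqueue(conn: sqlite3.Connection, job_type: str, payload: str) -> int:
    """Insert a pending job and return its id."""
    cur = conn.execute(
        "INSERT INTO job_queue (job_type, payload) VALUES (?, ?)",
        (job_type, payload),
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection, job_type: str):
    """Mark the oldest pending job as running; return (id, payload) or None."""
    with conn:  # commits on success, rolls back on exception
        row = conn.execute(
            "SELECT id, payload FROM job_queue "
            "WHERE job_type = ? AND status = 'pending' ORDER BY id LIMIT 1",
            (job_type,),
        ).fetchone()
        if row is None:
            return None
        updated = conn.execute(
            "UPDATE job_queue SET status = 'running', attempts = attempts + 1 "
            "WHERE id = ? AND status = 'pending'",  # guard against a lost race
            (row[0],),
        )
        return row if updated.rowcount == 1 else None
```

A table of this kind would also serve as the PHP→Python boundary named in gap 1: the PHP side enqueues, a Python worker claims and processes.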

**Nice-to-Have Gaps:**

4. **Composer package list absent** — Phinx version, any HTTP client for Akeneo API calls (Guzzle? native cURL?), test framework (PHPUnit?) not specified.
5. **Backup mechanism unspecified** — FR28 is architecturally housed in AdminController/cron but the backup mechanism (pg_dump + cron? third-party?) isn't defined.

### Validation Issues Addressed

No critical issues found. Important gaps (Python structure, PHP version, queue table schema) are documented for resolution before their respective epics are implemented.

### Architecture Completeness Checklist

**Requirements Analysis**
- [x] Project context thoroughly analyzed
- [x] Scale and complexity assessed
- [x] Technical constraints identified
- [x] Cross-cutting concerns mapped

**Architectural Decisions**
- [ ] Critical decisions documented with versions *(PHP version and Composer package versions not pinned)*
- [x] Technology stack fully specified
- [ ] Integration patterns defined *(Python/PHP boundary not fully specified)*
- [x] Performance considerations addressed

**Implementation Patterns**
- [x] Naming conventions established
- [x] Structure patterns defined
- [x] Communication patterns specified
- [x] Process patterns documented

**Project Structure**
- [x] Complete directory structure defined
- [x] Component boundaries established
- [x] Integration points mapped
- [ ] Requirements to structure mapping complete *(Python categorization structure missing)*

### Architecture Readiness Assessment

**Overall Status:** READY WITH MINOR GAPS

**Confidence Level:** High. The PHP web application foundation is solid and fully specified. The main gap is scoped to the Python/ML layer and can be addressed before the categorization epic; the remaining gaps (PHP version pin, queue table schema) are small.

**Key Strengths:**
- Clean service boundary separation (controllers thin, logic in services)
- Comprehensive naming convention coverage prevents agent drift
- All 35 FRs explicitly mapped to files and services
- Strong security posture (credentials, CSRF, prepared statements)
- Auditability baked into data flow at every step

**Areas for Future Enhancement:**
- Python categorization layer structure (define before categorization epic)
- PHP version pin + Composer dependency list
- Queue worker table schema
- Backup mechanism specification

### Implementation Handoff

**AI Agent Guidelines:**
- Follow all architectural decisions exactly as documented
- Use implementation patterns consistently across all components
- Respect project structure and boundaries
- Refer to this document for all architectural questions
- Do not introduce frameworks, Node, React, or cloud dependencies unless a story explicitly justifies it

**First Implementation Priority:**
No CLI starter template is used; the project is scaffolded by hand. Begin with:
1. Composer init + Phinx setup
2. `.env` / `.env.example` scaffolding
3. `public/index.php` entry point with routing
4. First Phinx migration (users table)
5. AuthService + session-based login
