ADR 006: Testing Strategy

What is this?

The finalized testing architecture for ChatKcal, prioritizing behavior-driven component tests and mocked network integration for reliability.

Metadata

Status: Accepted & Implemented
Date: 2026-01-01
Type: Architecture Decision Record (ADR)

ChatKcal adopts a Layered Testing Strategy (The Testing Trophy) to ensure reliability without slowing down development.

Static Analysis (Base): Catch typos and syntax errors instantly.
Unit Tests: Verify pure logic and edge cases.
Integration Tests (The Bulk): Verify that components work together.
End-to-End (E2E): Verify the critical user journeys work on a real browser.

2. The Layers

Layer 1: Static Analysis

Goal: Catch typos, syntax errors, and type mismatches before the code runs.
Tools: ESLint, Prettier, TypeScript (or JSDoc typing).
Cost: Free (Runs in editor).

Layer 2: Unit Tests (Business Logic)

Goal: Verify complex algorithms, data transformers, or custom hooks in isolation.
Tools: Vitest (fast, native Vite support).
What to test: dateUtils.js, calculateMacros(), complex Reducers.
What NOT to test: React Components (usually).
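
As an illustration of what "pure logic and edge cases" can look like at this layer, here is a minimal sketch. The signature of calculateMacros() is an assumption (the real function may take different arguments), and the 4/4/9 kcal-per-gram factors are the standard general values rather than anything taken from the ChatKcal code. In a Vitest file, the checks at the bottom would become expect() calls.

```javascript
// Hypothetical signature: the real calculateMacros() in ChatKcal may differ.
const KCAL_PER_GRAM = { protein: 4, carbs: 4, fat: 9 };

// Converts a daily calorie target and a fractional split into grams per macro.
function calculateMacros(calories, split) {
    if (!Number.isFinite(calories) || calories < 0) {
        throw new RangeError('calories must be a non-negative number');
    }
    const total = split.protein + split.carbs + split.fat;
    // Tolerate floating-point noise when checking that the split sums to 1
    if (Math.abs(total - 1) > 1e-9) {
        throw new RangeError('split fractions must sum to 1');
    }
    const grams = {};
    for (const macro of Object.keys(KCAL_PER_GRAM)) {
        grams[macro] = Math.round((calories * split[macro]) / KCAL_PER_GRAM[macro]);
    }
    return grams;
}

// Happy path
const macros = calculateMacros(2000, { protein: 0.3, carbs: 0.4, fat: 0.3 });
console.log(macros); // { protein: 150, carbs: 200, fat: 67 }

// Edge case: an invalid split is rejected instead of silently producing numbers
try {
    calculateMacros(2000, { protein: 0.5, carbs: 0.5, fat: 0.5 });
} catch (err) {
    console.log(err instanceof RangeError); // true
}
```

Because the function has no dependencies on React, the network, or the clock, checks like these need no mocks and run in milliseconds.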

Layer 3: Component Integration (DOM)

Goal: Verify that a Component Tree renders correctly and handles user interaction.
Tools: React Testing Library (RTL).
Philosophy: Test Behavior, not Implementation.
- ❌ Bad: expect(component.state.isOpen).toBe(true)
- ✅ Good: expect(screen.getByText('Settings')).toBeVisible()

Layer 4: E2E & Network Integration (Browser)

Goal: Verify the app works in a real browser environment.
Tools: Playwright / Cypress.
Two Flavors:
1. Full E2E (Smoke Tests): Hits the real backend/database. Tests the whole stack. Expensive, slow, flaky. Run sparingly (e.g., Critical Paths only).
2. Network Mocking (Integration): Runs in a real browser but mocks the API.
- Why: Fast, reliable, deterministic.
- Use Case: Verifying frontend logic like Race Conditions, Form Submissions, and Error States without needing a seed database.

3. Practical Example: The "Settings Race Condition"

We recently encountered a bug where a single save of the settings form fired two requests (a race condition).

Method	Result	Why?
Unit Test	Missed	Mocks the `mutation` function directly. Doesn't see the race.
RTL Test	Maybe	Might catch it if we check call counts, but doesn't test the actual network payload serialization.
Playwright (Mocked)	Caught	Intercepts the actual HTTP requests leaving the browser. We can assert: "Did the browser send exactly ONE POST request with `calories: 1500`?"
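
To make the comparison concrete, here is a small self-contained sketch of why a check at the network boundary sees the duplicate. Everything in it is hypothetical: the recording transport stands in for Playwright's request interception, and the "save on blur plus save on Done" form is an invented cause, not a description of the actual ChatKcal bug.

```javascript
// Hypothetical transport: records each outgoing request instead of sending it.
function createRecordingTransport() {
    const requests = [];
    function send(operationName, variables) {
        // Round-trip through JSON so the record matches the serialized payload
        requests.push(JSON.parse(JSON.stringify({ operationName, variables })));
    }
    return { send, requests };
}

// Hypothetical buggy form: the input saves on blur AND the Done button saves.
function createSettingsForm(send) {
    let calories = null;
    const save = () => send('UpdateUserTargets', { calories });
    return {
        fill(value) { calories = Number(value); },
        blur() { save(); },
        clickDone() {
            this.blur(); // clicking a button moves focus away from the input
            save();
        },
    };
}

const transport = createRecordingTransport();
const form = createSettingsForm(transport.send);
form.fill('1500');
form.clickDone();

const updates = transport.requests.filter(r => r.operationName === 'UpdateUserTargets');
console.log(updates.length); // 2, so an "exactly one request" assertion fails
console.log(updates[0].variables.calories); // 1500
```

A test that replaces the save function with a mock only sees whichever call path it exercises; counting what actually reaches the transport catches both.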

The "Modern" Recommendation

For a project like ChatKcal:

Linting: Enforce strict rules.
Vitest: For utility functions (e.g., date logic).
Playwright (Mocked): For almost everything else.
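- It tests the UI (Rendering).
- It tests the Network Logic (Payloads).
- It exercises Accessibility (Interactions): role- and label-based locators only find elements that expose the expected role and accessible name.
- It is faster and more reliable than hitting the real AppSync API.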

4. Implementation Details

How to Mock in Playwright

Instead of setting up a test user in Cognito, simply tell the browser what to return:

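// intercept.spec.js
import { test, expect } from '@playwright/test';

test('should update settings without race condition', async ({ page }) => {
    // Records every UpdateUserTargets mutation that leaves the browser
    const updateRequests = [];

    // 1. Mock the Network
    await page.route('**/graphql', async route => {
        // postDataJSON() returns null for requests without a body
        const body = route.request().postDataJSON();
        if (body?.operationName === 'UpdateUserTargets') {
            updateRequests.push(body);
        }
        // Return a fake success
        await route.fulfill({
            status: 200,
            contentType: 'application/json',
            body: JSON.stringify({ data: { success: true } })
        });
    });

    // 2. Perform Action
    await page.goto('/dashboard');
    await page.getByRole('button', { name: 'Settings' }).click();
    await page.getByLabel('Calories').fill('2000');

    // 3. Spy on the Request (start waiting before the click that triggers it)
    const requestPromise = page.waitForRequest(req =>
        req.url().includes('graphql') &&
        req.postDataJSON()?.operationName === 'UpdateUserTargets'
    );

    await page.getByRole('button', { name: 'Done' }).click();

    // 4. Assert the payload
    const request = await requestPromise;
    expect(request.postDataJSON().variables.calories).toBe(2000);

    // 5. Assert exactly one mutation was sent. The short pause gives a
    //    duplicate request time to appear; tune or replace it as needed.
    await page.waitForTimeout(500);
    expect(updateRequests).toHaveLength(1);
});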
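
One detail worth guarding against when matching on the request body: Playwright's postDataJSON() returns null for requests that carry no body, so reading .operationName from it directly can throw. A possible way to handle this is a small pure helper that works on the raw body string from request.postData(). The name matchesOperation is hypothetical and not part of Playwright or the ChatKcal codebase.

```javascript
// Hypothetical helper: true only if rawBody is a JSON object whose
// operationName equals the expected GraphQL operation.
function matchesOperation(rawBody, operationName) {
    if (typeof rawBody !== 'string' || rawBody.length === 0) {
        return false; // GET requests and empty bodies never match
    }
    let parsed;
    try {
        parsed = JSON.parse(rawBody);
    } catch {
        return false; // non-JSON bodies (e.g. form uploads) never match
    }
    return parsed !== null &&
        typeof parsed === 'object' &&
        parsed.operationName === operationName;
}

const body = JSON.stringify({
    operationName: 'UpdateUserTargets',
    variables: { calories: 2000 }
});
console.log(matchesOperation(body, 'UpdateUserTargets')); // true
console.log(matchesOperation(null, 'UpdateUserTargets')); // false
console.log(matchesOperation('not json', 'UpdateUserTargets')); // false
```

In a spec file it could be used as page.waitForRequest(req => matchesOperation(req.postData(), 'UpdateUserTargets')), and because it is a plain function it can also be covered by a Layer 2 unit test.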

5. Current Implementation Status (Jan 2026)

Layer 1 (Static): ESLint is configured.
Layer 2 (Unit): No Vitest setup yet.
- Frontend: Logic is mostly in hooks.
- Backend: AppSync resolvers are untested locally.
Layer 3 (Integration): No RTL setup.
Layer 4 (E2E): Partially implemented.
- We have Visual Regression Tests (frontend/tests/visual.spec.js) using Playwright.
- Gap: We are missing the "Network Mocking" tests described above. Adding these is the next step to prevent regressions like the Settings Race Condition.

6. Backend Logic Testing (AppSync)

Our AppSync resolvers are pure JavaScript files (appsync/*.js). We don't need to deploy them to test them.

Strategy: Unit Test the JS

Since resolvers just export request(ctx) and response(ctx), we can test them like any other function.

Tool: Vitest.
Method: Import the resolver, pass a mock ctx object, and assert the output.

// appsync/updateUserTargets.test.js
import { test, expect } from 'vitest';
import { request } from './updateUserTargets';

// If the resolver imports `util` from '@aws-appsync/utils', you might need to
// mock that module (e.g. with vi.mock) so it can run outside AppSync.

test('should generate correct UpdateItem expression', () => {
    // Minimal mock of the AppSync context: only the fields the resolver reads
    const ctx = {
        args: { calories: 2000 },
        identity: { sub: 'user-123' }
    };

    const result = request(ctx);

    expect(result.operation).toBe('UpdateItem');
    expect(result.key.PK.S).toBe('USER#user-123');
    // Check that dynamic expression generation works
    expect(result.update.expression).toContain('#Calories = :Calories');
});
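
For context, here is a sketch of the kind of resolver such a test would exercise. This is not the actual appsync/updateUserTargets.js: the attribute naming, the assumption that every argument is numeric, and the single PK key are all inferred from the assertions above, and the real resolver would use export instead of a plain function declaration.

```javascript
// Hypothetical resolver body, written without '@aws-appsync/utils' so it runs anywhere.
function request(ctx) {
    const expressionNames = {};
    const expressionValues = {};
    const assignments = [];

    // Build one "#Attr = :Attr" assignment per argument that was supplied
    for (const [field, value] of Object.entries(ctx.args)) {
        if (value === undefined || value === null) continue;
        const attr = field.charAt(0).toUpperCase() + field.slice(1);
        expressionNames[`#${attr}`] = attr;
        // Assumption: all targets are numbers, stored as DynamoDB N values
        expressionValues[`:${attr}`] = { N: String(value) };
        assignments.push(`#${attr} = :${attr}`);
    }

    return {
        operation: 'UpdateItem',
        key: { PK: { S: `USER#${ctx.identity.sub}` } },
        update: {
            expression: `SET ${assignments.join(', ')}`,
            expressionNames,
            expressionValues,
        },
    };
}

const sample = request({
    args: { calories: 2000, protein: 150 },
    identity: { sub: 'user-123' },
});
console.log(sample.update.expression);
// SET #Calories = :Calories, #Protein = :Protein
```

The sketch does not handle an empty args object, which would produce an invalid "SET " expression; that is exactly the kind of edge case a local unit test can pin down before a deploy.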

Status: Not Implemented. We rely on manual deployment (task deploy:test) and checking CloudWatch logs. Adding this layer would drastically reduce our feedback loop from ~2 minutes (deploy) to ~2 milliseconds (local test).