ADR 006: Testing Strategy
Metadata
- Status: Accepted & Implemented
- Date: 2026-01-01
- Type: Architecture Decision Record (ADR)
ChatKcal adopts a Layered Testing Strategy (The Testing Trophy) to ensure reliability without slowing down development.
- Static Analysis (Base): Catch typos and syntax errors instantly.
- Unit Tests: Verify pure logic and edge cases.
- Integration Tests (The Bulk): Verify that components work together.
- End-to-End (E2E): Verify the critical user journeys work on a real browser.
2. The Layers
Layer 1: Static Analysis
- Goal: Catch stupid mistakes before code runs.
- Tools: ESLint, Prettier, TypeScript (or JSDoc typing).
- Cost: Free (Runs in editor).
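As a minimal sketch of the "JSDoc typing" option: a `// @ts-check` comment at the top of a plain JavaScript file asks the TypeScript language service to check the file against its JSDoc annotations. The helper name `toCalories` is hypothetical and is not taken from the ChatKcal codebase.

```javascript
// @ts-check

/**
 * Parse a calorie target typed into a form field.
 * Hypothetical helper, used only to illustrate JSDoc typing.
 * @param {string} input Raw text from the input element.
 * @returns {number | null} A positive whole number, or null if the text is invalid.
 */
function toCalories(input) {
  const value = Number(input.trim());
  return Number.isInteger(value) && value > 0 ? value : null;
}

// With @ts-check active, the editor flags the call below without running it,
// because a number is passed where the annotation declares a string:
// toCalories(2000);
```

The point of the layer is that the mistake surfaces while typing, before any test runner starts.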
Layer 2: Unit Tests (Business Logic)
- Goal: Verify complex algorithms, data transformers, or custom hooks in isolation.
- Tools: Vitest (fast, native Vite support).
- What to test: dateUtils.js, calculateMacros(), complex Reducers.
- What NOT to test: React Components (usually).
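To illustrate what "pure logic and edge cases" could look like at this layer, here is a sketch of a pure function with one normal case and one edge case. The signature of `calculateMacros()` shown here (calories plus a percentage split) is an assumption for illustration; the real function may look different. The 4/4/9 kcal-per-gram factors are the standard values for protein, carbohydrate, and fat.

```javascript
// Hypothetical implementation: the real calculateMacros() may have a different signature.
const KCAL_PER_GRAM = { protein: 4, carbs: 4, fat: 9 };

/**
 * Convert a calorie target and a percentage split into grams per macro.
 * @param {number} calories Daily calorie target.
 * @param {{protein: number, carbs: number, fat: number}} split Percentages that must sum to 100.
 */
function calculateMacros(calories, split) {
  if (split.protein + split.carbs + split.fat !== 100) {
    throw new RangeError('Macro split must sum to 100');
  }
  const grams = (macro) =>
    Math.round((calories * split[macro]) / 100 / KCAL_PER_GRAM[macro]);
  return { protein: grams('protein'), carbs: grams('carbs'), fat: grams('fat') };
}

// Normal case: 2000 kcal at 30/40/30.
const macros = calculateMacros(2000, { protein: 30, carbs: 40, fat: 30 });
// macros is { protein: 150, carbs: 200, fat: 67 }

// Edge case: a split that does not sum to 100 is rejected.
let rejected = false;
try {
  calculateMacros(2000, { protein: 50, carbs: 50, fat: 50 });
} catch (err) {
  rejected = err instanceof RangeError;
}
```

In a Vitest file the same two checks would typically be written with `expect(...).toEqual(...)` and `expect(() => ...).toThrow(RangeError)`; no DOM or network is involved, which is what keeps this layer fast.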
Layer 3: Component Integration (DOM)
- Goal: Verify that a Component Tree renders correctly and handles user interaction.
- Tools: React Testing Library (RTL).
- Philosophy: Test Behavior, not Implementation.
  - ❌ Bad: expect(component.state.isOpen).toBe(true)
  - ✅ Good: expect(screen.getByText('Settings')).toBeVisible()
Layer 4: E2E & Network Integration (Browser)
- Goal: Verify the app works in a real browser environment.
- Tools: Playwright / Cypress.
- Two Flavors:
  - Full E2E (Smoke Tests): Hits the real backend/database. Tests the whole stack. Expensive, slow, flaky. Run sparingly (e.g., Critical Paths only).
  - Network Mocking (Integration): Runs in a real browser but mocks the API.
    - Why: Fast, reliable, deterministic.
    - Use Case: Verifying frontend logic like Race Conditions, Form Submissions, and Error States without needing a seed database.
3. Practical Example: The "Settings Race Condition"
We recently encountered a bug where saving settings fired two requests (Race Condition).
| Method | Result | Why? |
|---|---|---|
| Unit Test | Misses it | Mocks the mutation function directly, so it never sees the race. |
| RTL Test | Might catch it | Can detect the duplicate if we check call counts, but doesn't test the actual network payload serialization. |
| Playwright (Mocked) | Catches it | Intercepts the actual HTTP requests leaving the browser. We can assert: "Did the browser send exactly ONE POST request with calories: 1500?" |
The "Modern" Recommendation
For a project like ChatKcal:
- Linting: Enforce strict rules.
- Vitest: For utility functions (e.g., date logic).
- Playwright (Mocked): For almost everything else.
  - It tests the UI (Rendering).
  - It tests the Network Logic (Payloads).
  - It tests Accessibility (Interactions).
  - It is faster and more reliable than hitting the real AppSync API.
4. Implementation Details
How to Mock in Playwright
Instead of setting up a test user in Cognito, simply tell the browser what to return:

    // intercept.spec.js
    import { test, expect } from '@playwright/test';

    test('should update settings without race condition', async ({ page }) => {
      // 1. Mock the Network: answer every GraphQL call with a fake success.
      await page.route('**/graphql', async (route) => {
        await route.fulfill({
          status: 200,
          contentType: 'application/json',
          body: JSON.stringify({ data: { success: true } }),
        });
      });

      // 2. Perform Action
      await page.goto('/dashboard');
      await page.getByRole('button', { name: 'Settings' }).click();
      await page.getByLabel('Calories').fill('2000');

      // 3. Spy on the Request (start waiting before the click that triggers it)
      const requestPromise = page.waitForRequest((req) =>
        req.url().includes('graphql') &&
        req.postDataJSON()?.operationName === 'UpdateUserTargets'
      );
      await page.getByRole('button', { name: 'Done' }).click();

      // 4. Assert on the payload that actually left the browser
      const request = await requestPromise;
      expect(request.postDataJSON().variables.calories).toBe(2000);
    });
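The example above checks the payload of the first matching request, but not that only one was sent, which is the actual symptom of the Settings Race Condition. One way to close that gap, sketched here under assumptions: record every outgoing request during the test (Playwright exposes this through `page.on('request', ...)`, with `request.method()` and `request.postData()`), then count the ones for a given operation. The helper name `countOperation` and the record shape are hypothetical.

```javascript
// Hypothetical helper: counts recorded GraphQL POSTs for one operation.
// `requests` is an array of { method, postData } records. In a Playwright test
// they could be collected with something like:
//   page.on('request', (r) => requests.push({ method: r.method(), postData: r.postData() }));
function countOperation(requests, operationName) {
  return requests.filter((req) => {
    if (req.method !== 'POST' || !req.postData) return false;
    try {
      return JSON.parse(req.postData).operationName === operationName;
    } catch {
      return false; // Body is not JSON, so it is not a GraphQL call we care about.
    }
  }).length;
}

// Example: what a recording of the buggy behaviour might look like.
const body = JSON.stringify({
  operationName: 'UpdateUserTargets',
  variables: { calories: 1500 },
});
const recorded = [
  { method: 'GET', postData: null },
  { method: 'POST', postData: body },
  { method: 'POST', postData: body }, // the duplicate fired by the race
];
const sent = countOperation(recorded, 'UpdateUserTargets');
// sent is 2 here; a regression test would expect exactly 1.
```

Because the count has to be read after the UI has settled, a real test would need to wait for the save to finish (for example, for the settings panel to close) before asserting on it.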
5. Current Implementation Status (Jan 2026)
Layer 1 (Static): ESLint is configured.
Layer 2 (Unit): No Vitest setup yet.
- Frontend: Logic is mostly in hooks.
- Backend: AppSync resolvers are untested locally.
Layer 3 (Integration): No RTL setup.
Layer 4 (E2E): Partially implemented.
- We have Visual Regression Tests (frontend/tests/visual.spec.js) using Playwright.
- Gap: We are missing the "Network Mocking" tests described above. Adding these is the next step to prevent regressions like the Settings Race Condition.
6. Backend Logic Testing (AppSync)
Our AppSync resolvers are pure JavaScript files (appsync/*.js). We don't need to deploy them to test them.
Strategy: Unit Test the JS
Since resolvers just export request(ctx) and response(ctx), we can test them like any other function.
- Tool: Vitest.
- Method: Import the resolver, pass a mock ctx object, and assert the output.

    // appsync/updateUserTargets.test.js
    import { test, expect } from 'vitest';
    import { request } from './updateUserTargets';
    // The resolver itself imports `util` from '@aws-appsync/utils';
    // you might need a mock for it when running outside AppSync.

    test('should generate correct UpdateItem expression', () => {
      // Mock AppSync context with only the fields this test needs.
      const ctx = {
        args: { calories: 2000 },
        identity: { sub: 'user-123' },
      };

      const result = request(ctx);

      expect(result.operation).toBe('UpdateItem');
      expect(result.key.PK.S).toBe('USER#user-123');
      // Check that dynamic expression generation works
      expect(result.update.expression).toContain('#Calories = :Calories');
    });
Status: Not Implemented. We rely on manual deployment (task deploy:test) and checking CloudWatch logs. Adding this layer would drastically reduce our feedback loop from ~2 minutes (deploy) to ~2 milliseconds (local test).
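For readers unfamiliar with what such a test exercises, here is a sketch of a resolver `request` handler that would satisfy the assertions in the test above. It is a hypothetical stand-in, not the contents of appsync/updateUserTargets.js: the real file would `export` the function, would probably build the key and values with `util.dynamodb` helpers from '@aws-appsync/utils', and may use a sort key as well. The returned shape (`operation`, `key`, `update.expression`, `update.expressionNames`, `update.expressionValues`) follows the AppSync DynamoDB UpdateItem request format.

```javascript
// Hypothetical resolver sketch; the real updateUserTargets resolver may differ.
// DynamoDB attribute values are written by hand here so the sketch has no dependencies.
function request(ctx) {
  const names = {};
  const values = {};
  const assignments = [];

  for (const [field, value] of Object.entries(ctx.args)) {
    if (typeof value !== 'number') continue; // this sketch only handles numeric targets
    const attr = field.charAt(0).toUpperCase() + field.slice(1); // calories -> Calories
    names[`#${attr}`] = attr;
    values[`:${attr}`] = { N: String(value) }; // DynamoDB encodes numbers as strings
    assignments.push(`#${attr} = :${attr}`);
  }

  if (assignments.length === 0) {
    throw new Error('No fields to update');
  }

  return {
    operation: 'UpdateItem',
    key: { PK: { S: `USER#${ctx.identity.sub}` } },
    update: {
      expression: `SET ${assignments.join(', ')}`,
      expressionNames: names,
      expressionValues: values,
    },
  };
}

// Same mock context as the test above.
const result = request({ args: { calories: 2000 }, identity: { sub: 'user-123' } });
// result.update.expression is 'SET #Calories = :Calories'
```

Because the handler is a plain function of `ctx`, nothing here needs a deployed API, which is what makes the local feedback loop possible.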