The Coder-in-the-Loop Checklist
How to work with AI tools without accumulating technical debt or shipping vulnerabilities.
What is Coder-in-the-Loop?
Coder-in-the-loop means using AI as a tool, not a replacement. The human remains accountable for architecture, security, and maintainability.
The Problem with “Vibe Coding”
AI can generate working code fast. But:
- It doesn’t validate edge cases.
- It doesn’t consider security implications.
- It doesn’t design for maintainability.
- It doesn’t understand your specific constraints.
If you ship AI-generated code without review, you’re accumulating invisible debt.
The Checklist
Use this checklist every time you merge AI-generated (or AI-assisted) code.
1. Security Review
- Input validation on all user-facing endpoints
- Authentication and authorization implemented correctly
- No secrets or API keys hardcoded
- SQL injection protection (parameterized queries)
- XSS protection (escape or encode output for its context; sanitize any user-supplied HTML)
- CSRF tokens on state-changing requests
Red flag: If AI generated auth code, assume it’s wrong until proven otherwise.
Reference: OWASP Foundation. (2021). OWASP Top Ten. https://owasp.org/www-project-top-ten/
See especially: A01:2021-Broken Access Control, A03:2021-Injection, A05:2021-Security Misconfiguration
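As a rough illustration of two items on this list (parameterized queries and output escaping), the sketch below keeps user input out of the SQL text and escapes it before it reaches HTML. The names findUserQuery and escapeHtml are hypothetical, and the { text, values } shape is an assumption modeled on the query objects some database drivers accept; check your own driver's documentation before relying on it.

```typescript
// Hypothetical query object: SQL text with a placeholder, values passed separately.
interface ParameterizedQuery {
  text: string;
  values: string[];
}

// The user-supplied email never becomes part of the SQL string itself.
function findUserQuery(email: string): ParameterizedQuery {
  return {
    text: "SELECT id, email FROM users WHERE email = $1",
    values: [email],
  };
}

// Escape the characters that matter when inserting text into HTML
// element content or quoted attribute values.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const hostileInput = "x' OR '1'='1";
const userQuery = findUserQuery(hostileInput);
console.log(userQuery.text); // the SQL text is the same for every input
console.log(escapeHtml("<script>alert(1)</script>"));
```

This only covers the two mechanical checks. Authentication and authorization still need a human to trace every code path.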
2. Error Handling
- All async operations have error handling
- User-facing errors don’t leak internal details
- Errors are logged with sufficient context
- Edge cases (network failure, timeout, invalid state) are handled
Red flag: Code that assumes “happy path” only.
Best Practice: According to Node.js best practices (Goldberg, Y., Node.js Best Practices Repository, 2024), never return errors to clients containing stack traces or internal system details. Log full errors server-side; return generic user-friendly messages to clients.
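A minimal sketch of that split between server-side logs and client-facing messages follows. Everything here is hypothetical (toClientError, handleRequest, the serverLog array standing in for a real logger, and the simulated failure in loadProfile); it only shows the shape of the pattern, not a production error handler.

```typescript
// Shape of what the client is allowed to see (hypothetical).
interface ClientErrorBody {
  error: string;
  requestId: string;
}

// Stand-in for a real server-side logger.
const serverLog: string[] = [];

// Log the full error (message and stack) server-side; return only a generic message.
function toClientError(err: unknown, requestId: string): ClientErrorBody {
  const detail =
    err instanceof Error
      ? { message: err.message, stack: err.stack }
      : { message: String(err) };
  serverLog.push(JSON.stringify({ level: "error", requestId, ...detail }));
  return { error: "Something went wrong. Please try again.", requestId };
}

// Simulated async operation that can fail with an internal detail in its message.
async function loadProfile(fail: boolean): Promise<string> {
  if (fail) throw new Error("connect ECONNREFUSED 10.0.0.5:5432");
  return "profile";
}

// Every async call sits inside try/catch, so the failure path is always handled.
async function handleRequest(
  requestId: string,
  fail: boolean
): Promise<ClientErrorBody | { data: string }> {
  try {
    return { data: await loadProfile(fail) };
  } catch (err) {
    return toClientError(err, requestId);
  }
}

handleRequest("req-1", true).then((result) => console.log(result));
```

The request ID in the response lets support staff find the full server-side entry without exposing it to the user.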
3. Type Safety
- TypeScript strict mode enabled (if applicable)
- Zod (or similar) validation on external inputs
- No “any” types (or each one explicitly justified)
- Database schema matches application types
Red flag: Liberal use of “as unknown as T” casts or “@ts-ignore” comments.
Reference: Microsoft TypeScript Team. (2026). TypeScript Handbook: Strict Type-Checking Options. https://www.typescriptlang.org/docs/handbook/compiler-options.html
Zod Validation: Colinhacks. (2026). Zod: TypeScript-first schema validation with static type inference. https://github.com/colinhacks/zod
Zod provides runtime validation that matches TypeScript types, catching type mismatches at API boundaries.
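To make the idea of validating at the boundary concrete without pulling in a dependency, here is a hand-written sketch. The SignupInput type and parseSignup function are hypothetical; in a real project a Zod schema (or similar) would replace the manual checks, and this is not Zod's API.

```typescript
// The type the rest of the application relies on.
interface SignupInput {
  email: string;
  age: number;
}

type ParseResult =
  | { ok: true; value: SignupInput }
  | { ok: false; issues: string[] };

// Check an untrusted value at the boundary instead of casting it with `as`.
function parseSignup(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, issues: ["expected an object"] };
  }
  const record = input as Record<string, unknown>;
  const email = record["email"];
  const age = record["age"];
  const issues: string[] = [];
  if (typeof email !== "string" || email.indexOf("@") < 1) {
    issues.push("email must be a string containing '@'");
  }
  if (typeof age !== "number" || !Number.isInteger(age) || age < 0) {
    issues.push("age must be a non-negative integer");
  }
  if (issues.length > 0 || typeof email !== "string" || typeof age !== "number") {
    return { ok: false, issues };
  }
  return { ok: true, value: { email, age } };
}

// JSON.parse returns untyped data, exactly like a request body would.
const parsedBody = parseSignup(JSON.parse('{"email":"a@example.com","age":"30"}'));
console.log(parsedBody.ok ? "accepted" : parsedBody.issues.join("; "));
```

The compiler would have accepted a plain cast of the same payload; only the runtime check notices that age arrived as a string.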
4. Performance
- No N+1 queries
- Database indexes on queried fields
- Pagination on list endpoints
- Rate limiting on public APIs
Red flag: Queries inside loops, or list endpoints that return every row. AI-generated code is rarely optimized for scale.
N+1 Query Problem: The N+1 query problem occurs when code executes one query to fetch a list, then executes N additional queries (one per item) to fetch related data. Solution: Use JOIN queries or eager loading.
Reference: Winand, M. (2012). SQL Performance Explained. https://sql-performance-explained.com/
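The difference is easy to see with a counter. The sketch below simulates a database with in-memory arrays (authorsTable, postsTable, and the select helpers are all hypothetical stand-ins) and counts round trips for the N+1 version versus a batched version.

```typescript
interface AuthorRow { id: number; name: string }
interface PostRow { id: number; authorId: number; title: string }

const authorsTable: AuthorRow[] = [{ id: 1, name: "Ada" }, { id: 2, name: "Lin" }];
const postsTable: PostRow[] = [
  { id: 10, authorId: 1, title: "A" },
  { id: 11, authorId: 2, title: "B" },
  { id: 12, authorId: 1, title: "C" },
];

// Counts simulated round trips to the database.
let queryCount = 0;

function selectAllPosts(): PostRow[] {
  queryCount++;
  return postsTable.slice();
}
function selectAuthorById(id: number): AuthorRow | undefined {
  queryCount++;
  return authorsTable.filter((a) => a.id === id)[0];
}
function selectAuthorsByIds(ids: number[]): AuthorRow[] {
  queryCount++;
  return authorsTable.filter((a) => ids.indexOf(a.id) !== -1);
}

// N+1: one query for the list, then one query per post.
function titlesWithAuthorsNPlusOne(): string[] {
  return selectAllPosts().map((post) => {
    const author = selectAuthorById(post.authorId);
    return post.title + " by " + (author ? author.name : "unknown");
  });
}

// Batched: one query for the list, one query for all referenced authors.
function titlesWithAuthorsBatched(): string[] {
  const posts = selectAllPosts();
  const ids = posts.map((p) => p.authorId).filter((id, i, all) => all.indexOf(id) === i);
  const byId = new Map<number, AuthorRow>();
  selectAuthorsByIds(ids).forEach((a) => byId.set(a.id, a));
  return posts.map((post) => {
    const author = byId.get(post.authorId);
    return post.title + " by " + (author ? author.name : "unknown");
  });
}

queryCount = 0;
const slowTitles = titlesWithAuthorsNPlusOne();
const slowQueries = queryCount; // 4: one for the list plus one per post
queryCount = 0;
const fastTitles = titlesWithAuthorsBatched();
const fastQueries = queryCount; // 2: one for the list plus one for all authors
console.log(slowQueries, fastQueries);
```

With three posts the gap is small, but the N+1 version grows by one query per row while the batched version stays at two.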
5. Testing
- Critical paths have test coverage
- Auth flows are tested
- Error cases are tested (not just success paths)
Red flag: AI generates code, not tests. You write those.
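Testing Pyramid: Martin Fowler’s Test Pyramid (Fowler, M., 2012) recommends many fast, low-level tests and far fewer broad end-to-end tests. A commonly cited rule of thumb for the split (the percentages are not from Fowler’s article) is: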
- Unit tests (70%): Fast, isolated, test business logic
- Integration tests (20%): Test component interactions
- E2E tests (10%): Test critical user flows
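As a sketch of what "error cases are tested" can look like at the unit level, the example below checks one success path per role and every failure path of a small authorization rule. The rule itself (assertCanEdit) and the accounts and documents are hypothetical, and the table-driven runner is a stand-in for whatever test framework your project already uses.

```typescript
interface EditorAccount { id: string; role: "admin" | "member" }
interface DocRecord { ownerId: string; locked: boolean }

// Hypothetical business rule under test.
function assertCanEdit(account: EditorAccount | null, doc: DocRecord): void {
  if (account === null) throw new Error("UNAUTHENTICATED");
  if (doc.locked) throw new Error("LOCKED");
  if (account.role !== "admin" && account.id !== doc.ownerId) throw new Error("FORBIDDEN");
}

// Returns the error message thrown by fn, or null when it succeeds.
function errorOf(fn: () => void): string | null {
  try {
    fn();
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

const ownerAccount: EditorAccount = { id: "u1", role: "member" };
const strangerAccount: EditorAccount = { id: "u2", role: "member" };
const adminAccount: EditorAccount = { id: "u3", role: "admin" };
const openDoc: DocRecord = { ownerId: "u1", locked: false };
const lockedDoc: DocRecord = { ownerId: "u1", locked: true };

// [description, actual error, expected error]; null means "no error".
const testCases: Array<[string, string | null, string | null]> = [
  ["owner can edit", errorOf(() => assertCanEdit(ownerAccount, openDoc)), null],
  ["admin can edit", errorOf(() => assertCanEdit(adminAccount, openDoc)), null],
  ["stranger is forbidden", errorOf(() => assertCanEdit(strangerAccount, openDoc)), "FORBIDDEN"],
  ["anonymous is rejected", errorOf(() => assertCanEdit(null, openDoc)), "UNAUTHENTICATED"],
  ["locked doc is rejected", errorOf(() => assertCanEdit(ownerAccount, lockedDoc)), "LOCKED"],
];

const failedCases = testCases.filter((c) => c[1] !== c[2]).map((c) => c[0]);
console.log(failedCases.length === 0 ? "all cases pass" : "failed: " + failedCases.join(", "));
```

Three of the five cases exercise failure paths, which is roughly the ratio to aim for on auth code.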
6. Observability
- Structured logging (JSON log lines, not free-form console.log strings)
- Key operations are instrumented (auth, payments, errors)
- Alerts configured for critical failures
Red flag: No visibility into production behavior.
Structured Logging Standard: OpenTelemetry. (2026). OpenTelemetry Logging Specification. https://opentelemetry.io/docs/specs/otel/logs/
Use JSON-formatted logs with consistent field names (timestamp, level, message, trace_id, user_id, etc.) for machine-parseable observability.
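A small sketch of such a log line, using the field names listed above. The formatLogLine helper and the sample values are hypothetical, and this does not implement the OpenTelemetry data model; it only shows one JSON object per line with stable keys.

```typescript
type LogLevel = "info" | "warn" | "error";

interface LogFields {
  trace_id?: string;
  user_id?: string;
  [key: string]: string | number | boolean | undefined;
}

// Build one JSON log line with consistent field names.
// Core fields are written last so caller-supplied fields cannot overwrite them.
function formatLogLine(level: LogLevel, message: string, fields: LogFields, now: Date): string {
  return JSON.stringify({ ...fields, timestamp: now.toISOString(), level, message });
}

const logLine = formatLogLine(
  "error",
  "payment failed",
  { trace_id: "abc123", user_id: "u_42", amount_cents: 1999 },
  new Date()
);
console.log(logLine);
```

Passing the clock in as a parameter keeps the function deterministic in tests; a real logger would usually supply the timestamp itself.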
7. Deployment
- Environment variables documented
- Rollback plan exists
- Database migrations are reversible
- Secrets are rotated post-deployment (if exposed during development)
Database Migration Best Practices:
- In production, fixing forward with a new migration is often safer than running a down migration, especially once data has changed
- Always test rollback procedures in staging
- Use feature flags to decouple code deployment from feature activation
Reference: Sadalage, P. J., & Fowler, M. (2012). NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley. (Chapter on schema migrations)
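One way to rehearse a rollback before staging is to check that a migration's down step really undoes its up step. The sketch below does this against an in-memory stand-in for a schema; the Migration interface, the migration ID, and the isReversible helper are all hypothetical and not tied to any migration tool.

```typescript
// In-memory stand-in for a database schema: table name -> column names.
type Schema = Map<string, string[]>;

interface Migration {
  id: string;
  up(schema: Schema): void;
  down(schema: Schema): void;
}

// Hypothetical migration: adds a column and knows how to remove it again.
const addLastLoginColumn: Migration = {
  id: "0002_add_users_last_login",
  up(schema) {
    const cols = schema.get("users");
    if (cols && cols.indexOf("last_login") === -1) {
      schema.set("users", cols.concat("last_login"));
    }
  },
  down(schema) {
    const cols = schema.get("users");
    if (cols) schema.set("users", cols.filter((c) => c !== "last_login"));
  },
};

// Stable text form of a schema, used to compare before and after.
function snapshot(schema: Schema): string {
  const parts: string[] = [];
  schema.forEach((cols, table) => parts.push(table + ":" + cols.join(",")));
  return parts.sort().join(";");
}

// Rollback rehearsal on a copy: up must change the schema, down must restore it.
function isReversible(migration: Migration, start: Schema): boolean {
  const copy: Schema = new Map<string, string[]>();
  start.forEach((cols, table) => copy.set(table, cols.slice()));
  const before = snapshot(copy);
  migration.up(copy);
  const changed = snapshot(copy) !== before;
  migration.down(copy);
  return changed && snapshot(copy) === before;
}

const startSchema: Schema = new Map<string, string[]>([["users", ["id", "email"]]]);
console.log(addLastLoginColumn.id, isReversible(addLastLoginColumn, startSchema));
```

This checks structure only. A migration that drops or rewrites data can pass a schema comparison and still be impossible to undo, which is why the staging rehearsal matters.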
When to Get Human Review
Use this heuristic:
- Ship without review: UI tweaks, copy changes, non-critical refactors
- Peer review: New features, architectural changes, database migrations
- Expert review: Auth, payments, compliance-sensitive code
Agent Xero sessions are designed for that “expert review” layer.
Tools We Recommend
- Zod: Runtime validation (zod.dev)
- Sentry: Error tracking (sentry.io)
- Playwright: End-to-end testing (playwright.dev)
- OpenTelemetry: Observability (opentelemetry.io)
- Dependabot: Dependency updates (github.com/dependabot)
References
- OWASP Foundation. (2021). OWASP Top Ten Application Security Risks. https://owasp.org/Top10/2021
- Goldberg, Y. (2024). Node.js Best Practices. GitHub Repository. https://github.com/goldbergyoni/nodebestpractices
- Microsoft TypeScript Team. (2026). TypeScript Handbook: Strict Type-Checking Options. https://www.typescriptlang.org/docs/handbook/compiler-options.html
- Colinhacks. (2026). Zod: TypeScript-first schema validation. https://github.com/colinhacks/zod
- Winand, M. (2012). SQL Performance Explained. https://sql-performance-explained.com
- Fowler, M. (2012). TestPyramid. https://martinfowler.com/bliki/TestPyramid.html
- OpenTelemetry. (2026). OpenTelemetry Logging Specification. https://opentelemetry.io/docs/specs/otel/logs/
- Sadalage, P. J., & Fowler, M. (2012). NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley.
Last verified: February 2026