7 min read · Agent Xero

The Coder-in-the-Loop Checklist

How to work with AI tools without accumulating technical debt or shipping vulnerabilities.

What is Coder-in-the-Loop?

Coder-in-the-loop means using AI as a tool—not a replacement. The human remains accountable for architecture, security, and maintainability.

The Problem with “Vibe Coding”

AI can generate working code fast. But:

  • It doesn’t validate edge cases.
  • It doesn’t consider security implications.
  • It doesn’t design for maintainability.
  • It doesn’t understand your specific constraints.

If you ship AI-generated code without review, you’re accumulating invisible debt.

The Checklist

Use this checklist every time you merge AI-generated (or AI-assisted) code.

1. Security Review

  • Input validation on all user-facing endpoints
  • Authentication and authorization implemented correctly
  • No secrets or API keys hardcoded
  • SQL injection protection (parameterized queries)
  • XSS protection (escape output for the context it is rendered in; sanitize any user-supplied HTML)
  • CSRF tokens on state-changing requests

Red flag: If AI generated auth code, assume it’s wrong until proven otherwise.

Reference: OWASP Foundation. (2021). OWASP Top Ten. https://owasp.org/www-project-top-ten/
See especially: A01:2021-Broken Access Control, A03:2021-Injection, A05:2021-Security Misconfiguration
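As an illustration of two of the checks above, here is a minimal sketch of output escaping and a parameterized query. It assumes an HTML text or quoted-attribute context and a driver that accepts SQL text with `$1`-style placeholders plus a separate values array. The names `escapeHtml`, `ParameterizedQuery`, and `findUserByEmail` are hypothetical, not from any specific library.

```typescript
// Escape the five characters that matter in HTML text and quoted-attribute contexts.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Hypothetical query shape: SQL text with placeholders, values passed separately.
interface ParameterizedQuery {
  text: string;
  values: readonly unknown[];
}

function findUserByEmail(email: string): ParameterizedQuery {
  // The input never becomes part of the SQL string; the driver binds it to $1.
  return { text: "SELECT id, email FROM users WHERE email = $1", values: [email] };
}
```

In review, look for the opposite pattern: user input concatenated into SQL text or written into markup unescaped.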

2. Error Handling

  • All async operations have error handling
  • User-facing errors don’t leak internal details
  • Errors are logged with sufficient context
  • Edge cases (network failure, timeout, invalid state) are handled

Red flag: Code that assumes “happy path” only.

Best Practice: Node.js Best Practices (Goldberg, Y., 2024) advises never returning stack traces or internal system details to clients. Log the full error server-side and return a generic, user-friendly message.
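One way to apply this is a single function that every handler routes failures through. This is a sketch under assumptions: `toClientError`, the `ClientError` shape, and the injected `logError` callback are hypothetical names chosen for illustration, and a real service would plug in its own logger and status-code mapping.

```typescript
// Hypothetical response shape returned to the HTTP layer.
interface ClientError {
  status: number;
  body: { error: string; requestId: string };
}

// `logError` is injected so the sketch stays independent of any logging library.
function toClientError(
  err: unknown,
  requestId: string,
  logError: (entry: Record<string, unknown>) => void,
): ClientError {
  const detail = err instanceof Error
    ? { message: err.message, stack: err.stack }
    : { message: String(err) };

  // Full detail stays server-side, tagged with an ID the user can quote to support.
  logError({ level: "error", requestId, ...detail });

  // The client sees a generic message: no stack trace, no internal identifiers.
  return {
    status: 500,
    body: { error: "Something went wrong. Please try again.", requestId },
  };
}
```

The request ID links the generic client message to the detailed server log entry.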

3. Type Safety

  • TypeScript strict mode enabled (if applicable)
  • Zod (or similar) validation on external inputs
  • No any types (or explicitly justified)
  • Database schema matches application types

Red flag: Liberal use of as unknown as T or @ts-ignore.

Reference: Microsoft TypeScript Team. (2026). TypeScript Handbook: Strict Type-Checking Options. https://www.typescriptlang.org/docs/handbook/compiler-options.html

Zod Validation: Colinhacks. (2026). Zod: TypeScript-first schema validation with static type inference. https://github.com/colinhacks/zod
Zod provides runtime validation that matches TypeScript types, catching type mismatches at API boundaries.
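To show what validating at the boundary means, here is a dependency-free sketch that takes `unknown` and returns either a typed value or a reason for rejection. It is a hand-rolled stand-in, not Zod’s API; a schema library would express the same checks declaratively. `CreateUserInput`, `ParseResult`, and `parseCreateUserInput` are hypothetical names.

```typescript
interface CreateUserInput {
  email: string;
  age: number;
}

type ParseResult<T> =
  | { success: true; value: T }
  | { success: false; issue: string };

// Accepts anything (for example a parsed JSON body) and narrows it step by step.
function parseCreateUserInput(input: unknown): ParseResult<CreateUserInput> {
  if (typeof input !== "object" || input === null) {
    return { success: false, issue: "body must be an object" };
  }
  const record = input as Record<string, unknown>; // safe: checked to be a non-null object

  const email = record["email"];
  if (typeof email !== "string" || email.indexOf("@") === -1) {
    return { success: false, issue: "email must be a string containing '@'" };
  }

  const age = record["age"];
  if (typeof age !== "number" || !Number.isInteger(age) || age < 0) {
    return { success: false, issue: "age must be a non-negative integer" };
  }

  // Both fields are narrowed by the early returns above, so no further casts are needed.
  return { success: true, value: { email, age } };
}
```

Without a check like this, a request body typed as `CreateUserInput` is only a promise the compiler cannot verify; `"30"` would pass as a number until something downstream breaks.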

4. Performance

  • No N+1 queries
  • Database indexes on queried fields
  • Pagination on list endpoints
  • Rate limiting on public APIs

Red flag: Queries inside loops and list endpoints with no limit. AI-generated code is rarely optimized for scale.

N+1 Query Problem: The N+1 query problem occurs when code executes one query to fetch a list, then executes N additional queries (one per item) to fetch related data. Solution: Use JOIN queries or eager loading.
Reference: Winand, M. (2012). SQL Performance Explained. https://sql-performance-explained.com/
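The difference is easy to see with a query counter. The sketch below uses an in-memory stand-in for a database (`FakeDb` and its methods are hypothetical, not a real driver) and compares a per-item lookup with a single batched lookup, which is what a JOIN or an ORM’s eager loading achieves.

```typescript
interface Author { id: number; name: string }
interface Post { id: number; authorId: number; title: string }

// In-memory stand-in for a database that counts round trips (hypothetical API).
class FakeDb {
  queryCount = 0;
  private authors: Author[] = [{ id: 1, name: "Ada" }, { id: 2, name: "Lin" }];
  private posts: Post[] = [
    { id: 10, authorId: 1, title: "A" },
    { id: 11, authorId: 2, title: "B" },
    { id: 12, authorId: 1, title: "C" },
  ];

  listPosts(): Post[] {
    this.queryCount++;
    return this.posts.slice();
  }
  authorById(id: number): Author | undefined {
    this.queryCount++;
    return this.authors.filter((a) => a.id === id)[0];
  }
  authorsByIds(ids: number[]): Author[] {
    this.queryCount++; // stands in for one "WHERE id IN (...)" query
    return this.authors.filter((a) => ids.indexOf(a.id) !== -1);
  }
}

// N+1: one query for the list, then one more per post.
function titlesWithAuthorsNaive(db: FakeDb): string[] {
  return db.listPosts().map(
    (p) => `${p.title} by ${db.authorById(p.authorId)?.name ?? "unknown"}`,
  );
}

// Batched: one query for the list, one for all referenced authors.
function titlesWithAuthorsBatched(db: FakeDb): string[] {
  const posts = db.listPosts();
  const ids = posts.map((p) => p.authorId).filter((id, i, all) => all.indexOf(id) === i);
  const nameById: Record<number, string> = {};
  for (const a of db.authorsByIds(ids)) nameById[a.id] = a.name;
  return posts.map((p) => `${p.title} by ${nameById[p.authorId] ?? "unknown"}`);
}
```

With three posts the naive version issues four queries and the batched version two; the gap grows linearly with the list size.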

5. Testing

  • Critical paths have test coverage
  • Auth flows are tested
  • Error cases are tested (not just success paths)

Red flag: AI-generated code that arrives without tests, or with tests that only restate what the code already does. Write or review the tests yourself.

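Testing Pyramid: The Test Pyramid (Fowler, M., 2012) recommends many fast, low-level tests and few slow, broad ones. A common rule-of-thumb split (the percentages are a heuristic, not figures from Fowler’s article):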

  1. Unit tests (70%): Fast, isolated, test business logic
  2. Integration tests (20%): Test component interactions
  3. E2E tests (10%): Test critical user flows
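At the base of the pyramid, a unit test for business logic should cover the failure cases as well as the success path. Here is a minimal sketch, assuming a hypothetical pricing rule `applyDiscount` and a hand-rolled `throws` helper so the example needs no test runner; in a real project these checks would live in your test framework.

```typescript
// Business rule under test (hypothetical): percent is 0 to 100, price is non-negative cents.
function applyDiscount(priceCents: number, percent: number): number {
  if (!Number.isFinite(priceCents) || priceCents < 0) {
    throw new RangeError("price must be >= 0");
  }
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError("percent must be within 0..100");
  }
  return Math.round(priceCents * (1 - percent / 100));
}

// Tiny helper: returns true when `fn` throws.
function throws(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

// Each entry is one check; error cases outnumber the happy path on purpose.
const results = {
  happyPath: applyDiscount(1000, 25) === 750,
  zeroDiscount: applyDiscount(1000, 0) === 1000,
  rejectsNegativePrice: throws(() => applyDiscount(-1, 10)),
  rejectsOver100: throws(() => applyDiscount(1000, 101)),
  rejectsNaN: throws(() => applyDiscount(1000, NaN)),
};
```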

6. Observability

  • Structured logging (JSON, not console.log)
  • Key operations are instrumented (auth, payments, errors)
  • Alerts configured for critical failures

Red flag: No visibility into production behavior.

Structured Logging Standard: OpenTelemetry. (2026). OpenTelemetry Logging Specification. https://opentelemetry.io/docs/specs/otel/logs/
Use JSON-formatted logs with consistent field names (timestamp, level, message, trace_id, user_id, etc.) for machine-parseable observability.
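A log line in that shape can be produced with a few lines of code. This is a sketch, not the OpenTelemetry SDK: `formatLogLine`, `LogLevel`, and `LogContext` are hypothetical names, and the clock is injectable only to keep the output deterministic.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogContext {
  trace_id?: string;
  user_id?: string;
  [key: string]: unknown;
}

// Builds one JSON log line; write the returned string to stdout or your log shipper.
function formatLogLine(
  level: LogLevel,
  message: string,
  context: LogContext = {},
  now: Date = new Date(),
): string {
  // Fixed fields are spread last so caller-supplied context cannot overwrite them.
  return JSON.stringify({
    ...context,
    timestamp: now.toISOString(),
    level,
    message,
  });
}
```

Keeping field names identical across services is what makes queries like “all errors for this trace_id” possible.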

7. Deployment

  • Environment variables documented
  • Rollback plan exists
  • Database migrations are reversible
  • Secrets exposed during development (for example, committed to a repo or pasted into a prompt) are rotated before release

Database Migration Best Practices:

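  • In production, fixing forward with a new migration is usually safer than running a down migration against live data; keep migrations reversible anyway so rollback remains an option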
  • Always test rollback procedures in staging
  • Use feature flags to decouple code deployment from feature activation

Reference: Sadalage, P. J., & Fowler, M. (2012). NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley. (Chapter on schema migrations)
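A reversibility check can be rehearsed before a migration touches staging. The sketch below models a schema as an in-memory object and tests that applying `up` then `down` restores the starting state. `Schema`, `Migration`, `addLastLoginColumn`, and `isReversible` are hypothetical names, and a real rehearsal would run against a copy of the actual database, not a model.

```typescript
// Minimal in-memory model of a schema: table name -> column names (hypothetical).
type Schema = Record<string, string[]>;

interface Migration {
  id: string;
  up(schema: Schema): Schema;
  down(schema: Schema): Schema;
}

const addLastLoginColumn: Migration = {
  id: "001_add_users_last_login",
  // Both directions return a new object and leave the input untouched.
  up: (s) => ({ ...s, users: [...(s["users"] ?? []), "last_login"] }),
  down: (s) => ({ ...s, users: (s["users"] ?? []).filter((c) => c !== "last_login") }),
};

// Rollback rehearsal: applying `up` then `down` should restore the starting schema.
function isReversible(m: Migration, start: Schema): boolean {
  return JSON.stringify(m.down(m.up(start))) === JSON.stringify(start);
}
```

A migration whose `down` is empty or lossy fails this check, which is the signal to document a fix-forward plan for it instead.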

When to Get Human Review

Use this heuristic:

  • Self-review is enough: UI tweaks, copy changes, non-critical refactors
  • Peer review: New features, architectural changes, database migrations
  • Expert review: Auth, payments, compliance-sensitive code

Agent Xero sessions are designed for that “expert review” layer.


References

  1. OWASP Foundation. (2021). OWASP Top Ten Application Security Risks. https://owasp.org/Top10/2021
  2. Goldberg, Y. (2024). Node.js Best Practices. GitHub Repository. https://github.com/goldbergyoni/nodebestpractices
  3. Microsoft TypeScript Team. (2026). TypeScript Handbook: Strict Type-Checking Options. https://www.typescriptlang.org/docs/handbook/compiler-options.html
  4. Colinhacks. (2026). Zod: TypeScript-first schema validation. https://github.com/colinhacks/zod
  5. Winand, M. (2012). SQL Performance Explained. https://sql-performance-explained.com
  6. Fowler, M. (2012). TestPyramid. https://martinfowler.com/bliki/TestPyramid.html
  7. OpenTelemetry. (2026). OpenTelemetry Logging Specification. https://opentelemetry.io/docs/specs/otel/logs/
  8. Sadalage, P. J., & Fowler, M. (2012). NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley.

Last verified: February 2026