Why Single-Request API Tests Miss Real Bugs

DevTools Team

Your API tests all pass. Every endpoint returns the right status code, the right shape, the right error messages. Coverage looks solid.

Then a user logs in, creates a resource, and gets a 500 because the auth token your login endpoint issued uses a claim format your resource service does not understand. No single-endpoint test could have caught that, because no single-endpoint test exercises the handoff.

This is not an edge case. It is one of the most common categories of API bug in production: failures in the spaces between endpoints, where data flows from one service to another and assumptions break silently.

The false confidence of passing endpoint tests

A single-request API test validates one thing: "Given this exact input, does this endpoint return the expected output?" That is useful. It is also incomplete.

Here is what a typical isolated test setup looks like: two separate flows, each testing one endpoint independently with no data passing between them.

workspace_name: Isolated Endpoint Tests

run:
  - flow: TestLogin
  - flow: TestProjects

flows:
  # Test 1: POST /auth/login (standalone)
  - name: TestLogin
    steps:
      - request:
          name: Login
          method: POST
          url: https://api.example.com/auth/login
          headers:
            Content-Type: application/json
          body:
            email: 'test@example.com'
            password: 'password123'
      - js:
          name: CheckLogin
          code: |
            export default function(ctx) {
              if (ctx.Login?.response?.status !== 200) throw new Error("Login failed");
              if (!ctx.Login?.response?.body?.access_token) throw new Error("No token");
              return { passed: true };
            }
          depends_on: Login

  # Test 2: GET /api/projects (hardcoded token, no chaining)
  - name: TestProjects
    steps:
      - request:
          name: GetProjects
          method: GET
          url: https://api.example.com/api/projects
          headers:
            Authorization: 'Bearer eyJhbGciOiJIUzI1NiIs...'
      - js:
          name: CheckProjects
          code: |
            export default function(ctx) {
              if (ctx.GetProjects?.response?.status !== 200) throw new Error("Get projects failed");
              if (!Array.isArray(ctx.GetProjects?.response?.body?.projects)) throw new Error("Not an array");
              return { passed: true };
            }
          depends_on: GetProjects

Both tests pass. But the token in the second flow is hardcoded. It was copied from a previous manual run. It does not come from the login endpoint. If the login endpoint changes its token format, the projects test still passes with the stale token, and you ship a broken auth flow.

This pattern is everywhere. Hardcoded IDs, pre-seeded database state, manually copied tokens. Each one is a gap between "this endpoint works" and "this workflow works."

Five failure modes that only appear in multi-step tests

1. Token format mismatches

What happens: Service A issues tokens. Service B validates them. Both teams update their JWT libraries independently. Service A starts including a new claim. Service B's validation rejects the new format.

Why single-request tests miss it: Each service's tests use their own test tokens. Service A's tests prove it issues valid tokens. Service B's tests prove it accepts valid tokens. Neither test uses a token from the other service.

What catches it: An end-to-end test that logs in through Service A and makes an authenticated request to Service B with the actual token.
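Here is a minimal sketch of that flow in the same YAML format as above. The service URLs are made up, and the {{ }} references for passing the token between steps are assumed syntax; substitute whatever chaining mechanism your tool provides.

flows:
  - name: TokenHandoff
    steps:
      - request:
          name: LoginServiceA
          method: POST
          url: https://auth.example.com/auth/login
          headers:
            Content-Type: application/json
          body:
            email: 'test@example.com'
            password: 'password123'
      - request:
          name: CallServiceB
          method: GET
          url: https://resources.example.com/api/projects
          headers:
            # The actual token Service A just issued, not a hardcoded one
            Authorization: 'Bearer {{ LoginServiceA.response.body.access_token }}'
          depends_on: LoginServiceA
      - js:
          name: CheckHandoff
          code: |
            export default function(ctx) {
              if (ctx.CallServiceB?.response?.status !== 200) throw new Error("Service B rejected Service A's token");
              return { passed: true };
            }
          depends_on: CallServiceB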

2. Create-then-read inconsistency

What happens: The create endpoint writes to a primary database. The read endpoint hits a replica. Replication lag means freshly created resources return 404 for a few hundred milliseconds.

Why single-request tests miss it: The create test validates 201. The read test uses a pre-existing ID and validates 200. Both pass. The timing gap between create and read never gets tested.

What catches it: A chained flow that creates a resource and immediately reads it. If your system has replication lag, this test will fail intermittently, which is exactly the signal you need.
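In the same format, a minimal create-then-read chain; the {{ }} step references are again assumed syntax, and auth headers are omitted for brevity.

flows:
  - name: CreateThenRead
    steps:
      - request:
          name: CreateProject
          method: POST
          url: https://api.example.com/api/projects
          headers:
            Content-Type: application/json
          body:
            name: 'replication-lag-probe'
      - request:
          name: ReadBack
          method: GET
          # Read the resource immediately, using the ID the create call just returned
          url: https://api.example.com/api/projects/{{ CreateProject.response.body.id }}
          depends_on: CreateProject
      - js:
          name: CheckReadBack
          code: |
            export default function(ctx) {
              // A 404 here is exactly the replication-lag signal this test exists to catch
              if (ctx.ReadBack?.response?.status !== 200) throw new Error("Freshly created resource not readable");
              return { passed: true };
            }
          depends_on: ReadBack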

3. Side effects that never fire

What happens: The payment endpoint charges the card and publishes an event. The order service listens for that event and updates the order status. But the event schema changed, and the order service silently drops the message.

Why single-request tests miss it: The payment endpoint returns 200 (the charge succeeded). The order endpoint returns the correct status when queried with a pre-set state. Neither test verifies that the event triggers the state transition.

What catches it: An end-to-end flow that creates an order, processes the payment, then polls the order status until it transitions to "paid" (or times out).
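A sketch of that flow. The retry block is a hypothetical placeholder for your tool's polling or loop construct, and the endpoints are made up.

flows:
  - name: PaymentUpdatesOrder
    steps:
      - request:
          name: CreateOrder
          method: POST
          url: https://api.example.com/api/orders
          headers:
            Content-Type: application/json
          body:
            items:
              - sku: 'SKU-1'
                qty: 1
      - request:
          name: ProcessPayment
          method: POST
          url: https://api.example.com/api/payments
          headers:
            Content-Type: application/json
          body:
            order_id: '{{ CreateOrder.response.body.id }}'
          depends_on: CreateOrder
      - request:
          name: PollOrder
          method: GET
          url: https://api.example.com/api/orders/{{ CreateOrder.response.body.id }}
          # Hypothetical polling construct: re-issue the request until the
          # async event lands or the attempts run out
          retry:
            until: 'response.body.status == "paid"'
            max_attempts: 10
            interval_ms: 500
          depends_on: ProcessPayment
      - js:
          name: CheckPaid
          code: |
            export default function(ctx) {
              if (ctx.PollOrder?.response?.body?.status !== 'paid') throw new Error("Payment event never updated the order");
              return { passed: true };
            }
          depends_on: PollOrder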

4. Cascading permission failures

What happens: An admin creates a team, invites a user, and the user accesses team resources. The invite endpoint grants the right permissions, but the resource endpoint checks a different permission scope that the invite does not set.

Why single-request tests miss it: The invite test validates 200. The resource access test uses a pre-authorized user and validates 200. The permission gap between "invited" and "authorized" is never tested.

What catches it: A multi-step flow that follows the actual user journey: admin creates team, admin invites user, user accepts invite, user accesses team resource.
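Sketched in the same format, with admin_token and member_token as assumed workspace-level variables and the endpoints made up:

flows:
  - name: InviteThenAccess
    steps:
      - request:
          name: CreateTeam
          method: POST
          url: https://api.example.com/api/teams
          headers:
            Authorization: 'Bearer {{ admin_token }}'
          body:
            name: 'perms-probe'
      - request:
          name: InviteUser
          method: POST
          url: https://api.example.com/api/teams/{{ CreateTeam.response.body.id }}/invites
          headers:
            Authorization: 'Bearer {{ admin_token }}'
          body:
            email: 'member@example.com'
          depends_on: CreateTeam
      - request:
          name: AcceptInvite
          method: POST
          url: https://api.example.com/api/invites/{{ InviteUser.response.body.invite_id }}/accept
          headers:
            Authorization: 'Bearer {{ member_token }}'
          depends_on: InviteUser
      - request:
          name: AccessResource
          method: GET
          url: https://api.example.com/api/teams/{{ CreateTeam.response.body.id }}/resources
          headers:
            Authorization: 'Bearer {{ member_token }}'
          depends_on: AcceptInvite
      - js:
          name: CheckAccess
          code: |
            export default function(ctx) {
              // This is the step that exposes the gap between "invited" and "authorized"
              if (ctx.AccessResource?.response?.status !== 200) throw new Error("Invited user cannot access team resources");
              return { passed: true };
            }
          depends_on: AccessResource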

5. State machine violations

What happens: An order can be in states: draft, submitted, approved, shipped. The submit endpoint moves draft to submitted. The approve endpoint moves submitted to approved. But due to a race condition, two concurrent approve calls can move the same order to "shipped" without going through the shipping validation.

Why single-request tests miss it: Each state transition endpoint is tested individually with the correct precondition. The test for "approve" starts with a submitted order and validates it becomes approved. The invalid transition path is never exercised.

What catches it: An end-to-end flow that walks through the full state machine: create draft, submit, approve, ship. Bonus: a negative test that tries to ship directly from submitted and validates it fails.
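Here is the negative half as a sketch: submit an order, then try to ship it without approval and assert the API refuses. The 409/422 expectation is an assumption about how the API signals invalid transitions.

flows:
  - name: InvalidTransition
    steps:
      - request:
          name: CreateDraft
          method: POST
          url: https://api.example.com/api/orders
          headers:
            Content-Type: application/json
          body:
            items:
              - sku: 'SKU-1'
                qty: 1
      - request:
          name: Submit
          method: POST
          url: https://api.example.com/api/orders/{{ CreateDraft.response.body.id }}/submit
          depends_on: CreateDraft
      - request:
          name: ShipWithoutApproval
          method: POST
          url: https://api.example.com/api/orders/{{ CreateDraft.response.body.id }}/ship
          depends_on: Submit
      - js:
          name: ExpectRejection
          code: |
            export default function(ctx) {
              // Shipping straight from "submitted" must be rejected
              const status = ctx.ShipWithoutApproval?.response?.status;
              if (status !== 409 && status !== 422) throw new Error("Invalid transition was allowed: " + status);
              return { passed: true };
            }
          depends_on: ShipWithoutApproval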

Three real-world scenarios

These are not hypothetical. They are patterns that show up in production postmortems.

Scenario 1: Payment flow

A checkout flow involves five API calls:

  1. Create cart
  2. Add items
  3. Apply discount code
  4. Process payment
  5. Confirm order

The discount code endpoint returns a discount_id. The payment endpoint expects discount_id as a field in the request body. During a refactor, the discount endpoint renames the field to promo_id. The payment endpoint still expects discount_id. Every individual test passes because they use hardcoded values. The integration breaks.

An end-to-end test that chains all five steps, passing real data between them, catches this on the first CI run after the rename.
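The fix is to make the payment step consume the discount endpoint's actual response. An excerpt of steps 3 and 4, again using the assumed {{ }} syntax:

# Excerpt: steps 3 and 4 of the checkout flow
- request:
    name: ApplyDiscount
    method: POST
    url: https://api.example.com/api/carts/{{ CreateCart.response.body.id }}/discounts
    body:
      code: 'SAVE10'
- request:
    name: ProcessPayment
    method: POST
    url: https://api.example.com/api/payments
    body:
      cart_id: '{{ CreateCart.response.body.id }}'
      # Reads the field the discount endpoint actually returns today.
      # If it starts returning promo_id instead, this reference resolves
      # to nothing and the payment call fails loudly in CI.
      discount_id: '{{ ApplyDiscount.response.body.discount_id }}'
    depends_on: ApplyDiscount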

Scenario 2: User onboarding

A new user flow involves:

  1. Register account
  2. Verify email (or simulate verification via API)
  3. Complete profile
  4. Get personalized dashboard

The dashboard endpoint returns different content for users who have completed their profile versus those who have not. The profile completion endpoint sets a flag in the database. But a recent migration changed the flag column name from profile_complete to onboarding_finished, and the dashboard query still checks the old column.

Individual tests: register returns 201, profile update returns 200, dashboard returns 200 with mock data. All pass. The end-to-end test that registers, completes the profile, and checks the dashboard content catches the disconnect.
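A sketch of that journey. The onboarding_state field in the final check is hypothetical; assert on whatever your dashboard response actually contains that distinguishes a completed profile.

flows:
  - name: OnboardingJourney
    steps:
      - request:
          name: Register
          method: POST
          url: https://api.example.com/auth/register
          headers:
            Content-Type: application/json
          body:
            email: 'new-user@example.com'
            password: 'password123'
      - request:
          name: CompleteProfile
          method: PUT
          url: https://api.example.com/api/profile
          headers:
            Authorization: 'Bearer {{ Register.response.body.access_token }}'
          body:
            display_name: 'New User'
          depends_on: Register
      - request:
          name: Dashboard
          method: GET
          url: https://api.example.com/api/dashboard
          headers:
            Authorization: 'Bearer {{ Register.response.body.access_token }}'
          depends_on: CompleteProfile
      - js:
          name: CheckPersonalized
          code: |
            export default function(ctx) {
              // 200 is not enough: verify the dashboard reflects the completed profile
              if (ctx.Dashboard?.response?.body?.onboarding_state !== 'complete') throw new Error("Dashboard still treats user as incomplete");
              return { passed: true };
            }
          depends_on: Dashboard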

Scenario 3: Data pipeline

An analytics API involves:

  1. Ingest events (POST /events)
  2. Trigger aggregation (POST /aggregate)
  3. Query results (GET /reports/:id)

The aggregation endpoint is asynchronous. It returns 202 and processes in the background. The report is not available until aggregation completes. Single-request tests mock the timing. An end-to-end test that ingests, triggers, polls until ready, and then queries catches timing regressions, schema mismatches between ingest and query, and aggregation errors that only surface with real event data.
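Sketched as a flow, with the same hypothetical retry block standing in for your tool's polling construct:

flows:
  - name: AnalyticsPipeline
    steps:
      - request:
          name: Ingest
          method: POST
          url: https://api.example.com/events
          headers:
            Content-Type: application/json
          body:
            events:
              - type: 'page_view'
                ts: '2024-01-01T00:00:00Z'
      - request:
          name: Aggregate
          method: POST
          url: https://api.example.com/aggregate
          depends_on: Ingest
      - request:
          name: FetchReport
          method: GET
          url: https://api.example.com/reports/{{ Aggregate.response.body.report_id }}
          # Hypothetical polling construct: the report 404s until aggregation completes
          retry:
            until: 'response.status == 200'
            max_attempts: 20
            interval_ms: 500
          depends_on: Aggregate
      - js:
          name: CheckReport
          code: |
            export default function(ctx) {
              // Schema check: the ingested event type should appear in the aggregates
              const body = ctx.FetchReport?.response?.body;
              if (!body?.aggregates?.page_view) throw new Error("Ingested events missing from report");
              return { passed: true };
            }
          depends_on: FetchReport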

What "end-to-end" means for APIs

End-to-end API testing is not the same as browser-based end-to-end testing (Cypress, Playwright, Selenium). The distinction matters:

  • Browser e2e: Drives a UI, validates the full stack including rendering and JavaScript
  • API e2e: Drives HTTP calls, validates the backend workflow without a browser

API end-to-end tests are faster (no browser overhead), more stable (no CSS selector fragility), and more targeted (testing the business logic layer specifically). They complement browser e2e tests but do not replace them.

The scope of an API end-to-end test is: one complete user journey, expressed as a chain of API calls with real data flowing between them.

How to start testing workflows instead of endpoints

You do not need to rewrite your test suite. Start with the three workflows that matter most:

Step 1: Identify your critical paths

Every product has a handful of multi-step workflows that, if broken, cause immediate user impact. Common candidates:

  • Authentication and authorization flow
  • The primary CRUD lifecycle for your core resource
  • Payment or billing flow (if applicable)
  • Onboarding or setup flow

Step 2: Map the requests

For each critical path, list the API calls in order and mark the data that flows between them:

POST /auth/login
  → produces: access_token
POST /api/projects (uses: access_token)
  → produces: project_id
GET /api/projects/:id (uses: access_token, project_id)
  → validate: body matches expected

Step 3: Build the flow

You have two options:

Option A: Write YAML by hand. Define a flow with request steps, JS extraction nodes, and depends_on chains. Good for simple chains (3-5 steps).

Option B: Build visually with Flows. Drag request nodes onto a canvas, connect them, and let auto-variable-mapping handle the wiring. Better for complex workflows (5+ steps, conditions, loops).
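For Option A, here is a hand-written sketch of the chain mapped in Step 2, in the same format as the earlier example (the {{ }} references remain assumed syntax):

flows:
  - name: ProjectLifecycle
    steps:
      - request:
          name: Login
          method: POST
          url: https://api.example.com/auth/login
          headers:
            Content-Type: application/json
          body:
            email: 'test@example.com'
            password: 'password123'
      - request:
          name: CreateProject
          method: POST
          url: https://api.example.com/api/projects
          headers:
            Authorization: 'Bearer {{ Login.response.body.access_token }}'
          body:
            name: 'e2e-smoke'
          depends_on: Login
      - request:
          name: GetProject
          method: GET
          url: https://api.example.com/api/projects/{{ CreateProject.response.body.id }}
          headers:
            Authorization: 'Bearer {{ Login.response.body.access_token }}'
          depends_on: CreateProject
      - js:
          name: VerifyProject
          code: |
            export default function(ctx) {
              const body = ctx.GetProject?.response?.body;
              if (body?.name !== 'e2e-smoke') throw new Error("Read-back does not match what was created");
              return { passed: true };
            }
          depends_on: GetProject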

DevTools Flows gives you the visual approach with YAML export. You build the graph, verify it locally, and commit the YAML to Git for review and CI execution.

For a step-by-step tutorial, see: How to Build an End-to-End API Test: Login, Create, Verify, Delete.

Step 4: Run in CI

Add the flows to your CI pipeline. Run them on PRs to catch regressions before merge. Use JUnit reports so failures show up directly in your PR checks.
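As a sketch, a GitHub Actions job for this; the dev-tools CLI command and its flags are hypothetical, so substitute your actual runner invocation:

# .github/workflows/api-e2e.yml
name: API end-to-end flows
on: pull_request

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical CLI and flags; replace with your real runner command
      - run: dev-tools run flows/ --report junit --out results.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: junit-results
          path: results.xml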

For CI setup details, see: API Test Automation: End-to-End Flows That Run on Every PR.

FAQ

How is this different from integration testing? Integration tests typically verify that two components talk to each other correctly, often with mocks for anything outside the pair. End-to-end API tests verify a complete user workflow across all the real components. The boundary is not always sharp, but the intent is different: integration tests check connections, e2e tests check journeys.

Do I need a special environment for end-to-end tests? You need a backend that behaves like production. A dedicated staging environment, an ephemeral per-PR environment, or even a local Docker Compose setup all work. The key requirement is that the services are real (not mocked) and the data flow is realistic.

How do I handle test data cleanup? Add teardown steps at the end of each flow that delete the resources created during the test. Use unique identifiers (run IDs, timestamps) so tests do not collide when running in parallel. If cleanup fails, it should not block the test result, but it should log a warning.
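As a sketch, a teardown tail for the ProjectLifecycle flow above; the {{ run_id }} variable and the availability of console.warn in check steps are assumptions.

# Teardown excerpt: delete what the flow created. Naming the resource
# uniquely at create time (e.g. 'e2e-smoke-{{ run_id }}', assumed syntax)
# keeps parallel runs from colliding.
- request:
    name: Cleanup
    method: DELETE
    url: https://api.example.com/api/projects/{{ CreateProject.response.body.id }}
    depends_on: VerifyProject
- js:
    name: WarnIfCleanupFailed
    code: |
      export default function(ctx) {
        // Warn instead of throwing so a cleanup failure does not mask the real result
        if (ctx.Cleanup?.response?.status !== 204) console.warn("Cleanup failed; orphaned test resource");
        return { passed: true };
      }
    depends_on: Cleanup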

Won't end-to-end tests be slow? Slower than unit tests, yes. But a well-designed e2e flow with 5-10 steps typically runs in 2-10 seconds, depending on your API's response times. That is fast enough to run on every PR. The cost of not catching workflow bugs is much higher than the cost of a few seconds of CI time.

Stop trusting isolated tests

If you want confidence that your API works for real users, you need to test the way real users interact with it: as a sequence of dependent calls where each step builds on the last.

Start with your three most critical workflows. Chain the requests. Pass real data. Validate at every step. Run in CI.

DevTools Flows makes this practical: visual workflow builder, auto-mapped variables, YAML export, CI-native execution. Build your first end-to-end flow at dev.tools.

For the full guide to end-to-end API testing, including CI setup and common pitfalls, see: End-to-End API Testing: The Complete Guide.