AI API Latency Test

TTFT test for streaming APIs

Measure how fast the first token appears.

TTFT stands for time to first token. It shows how long users wait before a streaming LLM response starts, which often shapes perceived speed more than the total completion time does.

Metrics reported: first token, total latency, streaming total, and tokens per second.

TTFT is only the beginning

A low first-token delay makes an app feel responsive, but users still care whether the rest of the answer streams smoothly. The report therefore separates first-token latency from total response time.

Figure: TTFT test result showing a 97 percent score, 183 ms TTFT, 3.8 s streaming total, and 21.6 tokens per second.
A response with a fast first token can still be held back by output speed, which is why the report explains the score.
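
As a rough illustration, the example numbers above can be broken down with a little arithmetic. This is a sketch under an assumption: it treats tokens per second as the rate after the first token arrives, which the report may or may not do. The function name `breakdown` is hypothetical.

```python
def breakdown(ttft_s: float, streaming_total_s: float, tokens_per_s: float) -> dict:
    """Split a streaming result into time spent waiting and time spent generating."""
    # Assumption: generation time is whatever remains after the first token arrives.
    generation_s = streaming_total_s - ttft_s
    return {
        "generation_s": generation_s,
        "ttft_share": ttft_s / streaming_total_s,       # fraction of the total spent waiting
        "approx_tokens": tokens_per_s * generation_s,   # rough output length estimate
    }


if __name__ == "__main__":
    # Values taken from the example result: 183 ms TTFT, 3.8 s total, 21.6 tokens/sec.
    result = breakdown(0.183, 3.8, 21.6)
    print(f"generation time: {result['generation_s']:.3f} s")
    print(f"TTFT share of total: {result['ttft_share'] * 100:.1f} %")
    print(f"approximate output tokens: {result['approx_tokens']:.0f}")
```

Under that assumption, the first token accounts for only about 5 percent of the total time in this example, so most of the wait comes from output speed.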

What TTFT Measures

TTFT is the delay between sending a chat completion request and receiving the first streamed content token. It includes network delay, routing, queueing, and the time the upstream model needs to start up and emit its first token.
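
A minimal sketch of that measurement is shown below. It is not the test's actual implementation. It assumes the streaming call can be wrapped in a function that sends the request and yields text chunks. The names `measure_ttft`, `open_stream`, and `fake_stream` are hypothetical, and `fake_stream` only simulates an API with a short sleep.

```python
import time
from typing import Callable, Iterable, Optional


def measure_ttft(open_stream: Callable[[], Iterable[str]],
                 clock: Callable[[], float] = time.perf_counter) -> Optional[float]:
    """Return seconds from just before the request until the first non-empty chunk."""
    start = clock()                 # start timing before the request is sent
    for chunk in open_stream():     # open_stream() sends the request and yields text chunks
        if chunk:                   # ignore empty chunks that carry no content
            return clock() - start
    return None                     # the stream ended without any content


def fake_stream():
    """Hypothetical stand-in for a streaming API call."""
    time.sleep(0.05)                # simulated network, routing, and queueing delay
    yield ""                        # an empty chunk that should not count as content
    yield "Hello"
    yield " world"


if __name__ == "__main__":
    ttft = measure_ttft(fake_stream)
    print(f"TTFT: {ttft * 1000:.0f} ms")
```

The timer starts before the request is opened, so connection setup and queueing count toward the result. Empty chunks are skipped because the definition above refers to the first content token.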

TTFT vs Latency vs Tokens per Second

TTFT is about when the answer begins. Total latency is about when the answer finishes. Tokens per second is about how quickly the answer continues after it starts.
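
The three metrics can be derived from the same set of timestamps. The sketch below assumes you have recorded when the request was sent and when each token arrived. `StreamMetrics` and `summarize` are hypothetical names, and the output-speed formula (tokens after the first, divided by the time between the first and last token) is one reasonable definition, not necessarily the one the report uses.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamMetrics:
    ttft_s: float         # when the answer begins
    total_s: float        # when the answer finishes
    tokens_per_s: float   # how fast it continues after the first token


def summarize(request_sent: float, token_times: Sequence[float]) -> StreamMetrics:
    """Derive the three metrics from a request timestamp and token arrival times."""
    if not token_times:
        raise ValueError("no tokens were received")
    first, last = token_times[0], token_times[-1]
    generation_s = last - first
    # Assumption: output speed counts only the tokens after the first one.
    rate = (len(token_times) - 1) / generation_s if generation_s > 0 else 0.0
    return StreamMetrics(ttft_s=first - request_sent,
                         total_s=last - request_sent,
                         tokens_per_s=rate)


if __name__ == "__main__":
    # Made-up timestamps in seconds: request at 0.0, then five token arrivals.
    print(summarize(0.0, [0.5, 1.0, 1.5, 2.0, 2.5]))
```

Keeping the first token out of the rate means a slow start does not drag down the output-speed figure, so each number answers a separate question.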

Why First Token Latency Matters

A low TTFT makes an API feel responsive. A high TTFT can make users think a chat app is stuck, even if the full answer eventually arrives in a reasonable time.

FAQ

Is TTFT the same as full response time?

No. TTFT measures the time until the first token arrives. Full response time measures the time until the whole completion has arrived.

Does TTFT only matter for streaming?

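It matters most for streaming, because users can see the response begin before the full answer is done. Without streaming, nothing appears until the whole completion arrives, so total response time is what users feel.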

Can an API have good TTFT but poor output speed?

Yes. An endpoint can start quickly but generate the rest of the answer slowly, so both metrics matter.

Should I compare TTFT across models?

Yes, but compare models of similar size on comparable providers. Larger reasoning models often have inherently higher first-token latency.
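
One way to make such a comparison less noisy is to collect several TTFT samples per model and compare summary statistics rather than single runs. The sketch below is an illustration only. The model names and sample values are made up, and the choice of median and nearest-rank 95th percentile is an assumption, not something the report prescribes.

```python
import math
import statistics
from typing import Dict, List


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))   # 1-based rank into the sorted values
    return ordered[rank - 1]


def summarize_ttft(samples_ms: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Return the median and p95 TTFT per model, skipping models with no samples."""
    return {name: {"median_ms": statistics.median(values), "p95_ms": p95(values)}
            for name, values in samples_ms.items() if values}


if __name__ == "__main__":
    # Made-up TTFT samples in milliseconds for two hypothetical models.
    samples = {
        "small-model": [180, 190, 175, 210, 600],
        "reasoning-model": [900, 950, 1100, 980, 1020],
    }
    for name, stats in summarize_ttft(samples).items():
        print(name, stats)
```

The median shows the typical wait, while the 95th percentile exposes occasional slow starts that a single run could miss.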

Check first-token speed before users notice it.

Measure TTFT alongside streaming total time and output speed in one quick test.
