google places api · pagination · developer · next_page_token

Google Places API next_page_token Delay: Why It Exists and How to Handle It

Why Google Places API requires a delay before next_page_token works, the actual seconds, and how MapsLeads removes the pagination headache.

MapsLeads Team · 2026-05-02 · 11 min read

Every developer who integrates the Google Places API hits the same wall during their first afternoon of coding. You run a Text Search, you get twenty results back along with a shiny next_page_token, and you immediately fire a second request to grab the next page. The API responds with INVALID_REQUEST. You read the docs again, you double-check the token, you compare strings character by character, and everything looks correct. The Google Places API next_page_token delay is the reason your perfectly valid request is being rejected, and understanding why it exists is the first step to building a pagination loop that does not silently lose data or hammer your error logs at three in the morning.

This article walks through what the token actually is, how long you really need to wait before reusing it, the patterns that work in production, the hard cap on how many results pagination can ever return, and what you can do when this throughput limit starts blocking real work. If you are extracting business leads at any kind of scale, the delay is not a quirk you tune around. It is a structural ceiling on how fast you can move.

What next_page_token is

The next_page_token is an opaque string Google returns alongside Text Search and Nearby Search results when more results exist beyond the current page. It is not a cursor in the traditional database sense and it is not a stable offset. Think of it as a server-side handle to a prepared result set that Google is building for you in the background. You pass the token back as the pagetoken query parameter on a follow-up request to the same endpoint, with nothing else required beyond your API key, and Google returns the next page of up to twenty results.
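
As a minimal sketch, here is what the two calls look like against the legacy Text Search endpoint using Python's requests library. The query string and YOUR_API_KEY are placeholders:

```python
import requests

# Legacy Text Search endpoint; Nearby Search paginates the same way.
SEARCH_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json"
API_KEY = "YOUR_API_KEY"  # placeholder

first = requests.get(SEARCH_URL, params={"query": "coffee in Austin", "key": API_KEY}).json()
token = first.get("next_page_token")

# The follow-up needs only the token and the key; the original query
# parameters are baked into the token itself.
if token:
    second = requests.get(SEARCH_URL, params={"pagetoken": token, "key": API_KEY}).json()
    # Fired immediately like this, second["status"] will usually come back
    # INVALID_REQUEST -- the delay this article is about.
```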

Tokens are short-lived. They are tied to the original query parameters, the location bias, the radius, and the API key context. You cannot mix and match. You cannot store them for later. You certainly cannot build a queue of tokens to fan out to workers, because by the time the second worker picks up the job the token is either invalid or gone.

Why the delay exists (server-side processing)

The delay is not a rate-limiting trick. Google does not invalidate the token to slow you down for billing reasons. The actual mechanism is that when the first response is sent, the server is still assembling the next page of results in the background. It needs to expand the search, score additional candidates against your query, dedupe them against the page you already received, and stage them in a cache keyed to the token.

If you fire the follow-up request before that staging is complete, the token exists but does not yet point to anything. Google returns INVALID_REQUEST because, from the server side, the token has not been promoted from pending to ready. Once it is ready, the same exact request will succeed. Nothing about your code changed. The only variable is wall-clock time.

This explains why retry usually works. It also explains why a fast network and a fast machine make the problem worse, not better. The closer you are to Google's servers, the faster you fire the second request, and the more reliably you trip the not-yet-ready window.

How long to wait

In practice, two seconds is the right floor for a first attempt. Most of the time the token is ready inside that window. Plenty of teams use a flat two-second sleep and never investigate further because their pagination loops just work.

Under load, during regional spikes, or when the underlying query is expensive, the staging can take longer. Five seconds is a safe ceiling for a single retry. If you still get INVALID_REQUEST after five seconds with the same token, you have one of two situations: either you are in a rare slow path and need to wait another beat, or something else is wrong with the request such as the token being malformed in transit or paired with mismatched parameters.

The signal Google gives you is binary. INVALID_REQUEST on a token-only follow-up almost always means the token is not yet ready. ZERO_RESULTS on the same call means the token resolved but the next page legitimately had nothing in it.

Implementation patterns

The simplest pattern is sleep and retry. Wait two seconds, send the request, and if you get INVALID_REQUEST sleep another two seconds and try again. Cap the retries at three or four attempts and move on. This works for low-volume scripts and one-off extractions.
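
A sketch of that loop, assuming the legacy Text Search endpoint and treating ZERO_RESULTS as a successful but empty page:

```python
import time
import requests

SEARCH_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json"

def fetch_next_page(token: str, api_key: str, max_attempts: int = 4) -> dict:
    """Sleep-and-retry: wait out the token's not-yet-ready window."""
    resp = {}
    for _ in range(max_attempts):
        time.sleep(2)  # two-second floor before every attempt
        resp = requests.get(SEARCH_URL, params={"pagetoken": token, "key": api_key}).json()
        if resp["status"] in ("OK", "ZERO_RESULTS"):
            return resp  # ready, or resolved to a legitimately empty page
        if resp["status"] != "INVALID_REQUEST":
            break  # quota, denial, or unknown error: not a token-not-ready signal
    raise RuntimeError(f"next page never became ready: {resp.get('status')}")
```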

For production code, exponential backoff is more disciplined. Start at one and a half seconds, double on failure, cap at eight seconds, and treat repeated failures as a real error rather than a token-not-ready signal. Wrap the whole flow in a circuit breaker so a regional outage does not cause your worker pool to spin uselessly on dead tokens.
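
A sketch of the backoff half of that; the circuit breaker is left to whatever your worker framework already provides:

```python
import time
import requests

SEARCH_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json"

def fetch_with_backoff(token: str, api_key: str) -> dict:
    """Exponential backoff: start at 1.5s, double on failure, cap at 8s."""
    delay, cap = 1.5, 8.0
    while True:
        time.sleep(delay)
        resp = requests.get(SEARCH_URL, params={"pagetoken": token, "key": api_key}).json()
        if resp["status"] != "INVALID_REQUEST":
            return resp  # OK, ZERO_RESULTS, or a genuine error for the caller
        if delay >= cap:
            # Past the cap this is a real error, not a token-not-ready signal.
            raise TimeoutError("token still pending after backoff cap")
        delay = min(delay * 2, cap)
```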

A subtler pattern is to overlap useful work with the wait. If you are processing the first page while waiting for the second, the two-second delay disappears into your pipeline. Persist the page-one results, normalize them, enqueue them for enrichment, and by the time you are done the token is ready. This is the only way to make the delay invisible without changing the API.
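
One way to sketch the overlap, assuming a process_page callback that does the persist/normalize/enqueue work:

```python
import time
import requests

SEARCH_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json"

def drain_query(query: str, api_key: str, process_page) -> None:
    """Do page processing inside the mandatory wait so the delay disappears."""
    resp = requests.get(SEARCH_URL, params={"query": query, "key": api_key}).json()
    while True:
        token = resp.get("next_page_token")
        started = time.monotonic()
        process_page(resp.get("results", []))  # persist, normalize, enqueue
        if not token:
            return
        # Sleep only for whatever part of the two-second window the
        # processing did not already cover.
        remaining = 2.0 - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
        resp = requests.get(SEARCH_URL, params={"pagetoken": token, "key": api_key}).json()
        while resp["status"] == "INVALID_REQUEST":
            time.sleep(2)  # fallback when processing finished inside the window
            resp = requests.get(SEARCH_URL, params={"pagetoken": token, "key": api_key}).json()
```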

Pagination cap

Here is the rule that turns the delay from a nuisance into a real constraint: Text Search and Nearby Search return at most sixty results per query, paginated as three pages of twenty. There is no fourth page. There is no parameter you can pass to lift the cap. After the third page you will not see another next_page_token, and if you somehow construct one yourself the API will reject it.

Sixty is the hard ceiling. If a city has three hundred coffee shops, a single query gives you at most sixty of them. The other two hundred and forty exist in Google's index but are unreachable through this endpoint without changing the search parameters. To get them, you have to slice the geography, narrow the keyword, or shrink the radius and stitch overlapping queries together while deduping.
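
The stitching step reduces to deduping on place_id across cells. A sketch, where cells and run_query are hypothetical stand-ins for your grid definition and your per-cell pagination loop:

```python
def stitch_cells(cells, run_query):
    """Merge overlapping per-cell queries, deduping on place_id.

    cells describes the grid (e.g. center-plus-radius dicts) and
    run_query fully paginates one cell and returns its result dicts.
    """
    seen = set()
    merged = []
    for cell in cells:
        for place in run_query(cell):
            pid = place["place_id"]  # stable identifier across queries
            if pid not in seen:
                seen.add(pid)
                merged.append(place)
    return merged
```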

This cap is documented but easy to miss. We covered the broader picture in our Google Maps API limits explained guide, and the sixty-result ceiling is one of the most common surprises teams report when they move from prototype to production.

Common errors

INVALID_REQUEST on a token request almost always means the token is not yet ready. Wait and retry. If it persists past five seconds, log the original parameters and inspect the token for transport corruption.

OVER_QUERY_LIMIT means you have crossed a per-second or per-day quota for your API key. The next_page_token does not have its own separate budget. The follow-up request counts against the same Text Search or Nearby Search quota as the original. Three pages equal three billable calls.

REQUEST_DENIED means your key does not have the right API enabled, the referrer restriction is wrong, or the billing account is not in good standing. Tokens do not bypass key configuration.

UNKNOWN_ERROR is the rare case where Google itself failed mid-flight. Retry once, and if it persists treat the page as lost rather than blocking the whole job.

Implications for bulk extraction

Multiply the delay by the page count and the picture sharpens. Fully exhausting one query takes at minimum two seconds for the second page and another two for the third, plus the latency of the calls themselves. Call it five to seven seconds end to end for sixty results. That is roughly ten queries per minute, per worker, in the best case.

If your goal is to pull every restaurant in a metro area, and the metro splits into eighty grid cells once you respect the sixty-result cap, you are looking at eight to ten minutes of wall time per worker before you even consider quota windows or retry overhead. We unpacked the economics of this approach in Google Maps API pricing vs scraping, and the rate-times-delay product is the number that ends up in every realistic capacity plan.
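
The arithmetic is worth making concrete. The figures below are the assumed midpoints from the paragraphs above, not measurements:

```python
# Back-of-envelope capacity plan for one worker.
seconds_per_query = 6.0        # ~5-7s to drain all three pages of one query
cells = 80                     # grid cells after respecting the 60-result cap
wall_minutes = cells * seconds_per_query / 60
raw_ceiling = cells * 60       # best case: every cell returns a full sixty
print(f"{wall_minutes:.0f} min wall time, at most {raw_ceiling} raw results")
# -> "8 min wall time, at most 4800 raw results" (before cross-cell dedupe,
#    quota windows, and retry overhead)
```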

When the throughput limit becomes a blocker

For a one-off prototype or a hobby project, the delay is fine. You write the sleep, you walk away for a coffee, and the data is there when you get back.

For a sales team that needs five thousand qualified prospects in a region by Monday morning, the delay is no longer a quirk. It is the gating factor. You cannot parallelize past the per-key quota without provisioning more keys, and even then each key still pays the two-second toll on every page boundary. The infrastructure to manage keys, rotate them, handle billing, and dedupe across grid cells is real work, and it is work that has nothing to do with the leads you actually want.

This is the inflection point where teams either build a serious extraction pipeline in-house or hand the problem to a tool that has already solved it. We described the trade-offs of the in-house path in Bulk Google Maps data extraction.

How MapsLeads handles pagination behind the scenes

MapsLeads removes the pagination problem from your workflow entirely. There is no token to manage, no sleep to tune, no retry policy to write, and no sixty-result cap pressing down on your queries. You type a search, the platform handles the geography slicing, the page stitching, the deduping, and the rate management on the backend, and the results show up in a single list ready to export.

When you scale a query that would have required dozens of grid cells and hundreds of paginated calls through the raw API, MapsLeads runs the same expansion logic that disciplined engineering teams build internally, except you do not have to build it. The platform tracks which cells have been covered, which businesses have already been seen across overlapping cells, and which results need a second pass to collect contact information that the first call did not surface.

Pricing is credit-based and transparent so you can model costs before you run anything. Every business in a search costs one credit on the Base tier. Adding the Contact Pro layer, which surfaces verified emails and direct phone numbers when available, costs an extra credit per record. Adding the Reputation layer, which pulls review counts, ratings, and recent review signals, is another credit. Adding the Photos layer, which extracts up to a configurable number of business photos for use in your CRM or outreach, costs two credits per record. You only pay for the layers you turn on, and you only pay for records you actually export. Browsing search results is free, so you can validate that a query returns what you expect before spending anything. Full tier details are on the Pricing page.
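
As a worked example of that credit math, with a made-up record count and the per-layer costs from the tier description above:

```python
# Illustrative only: 500 records exported with two add-on layers enabled.
records = 500
base = 1            # every exported business
contact_pro = 1     # verified emails and direct phones, when available
reputation = 1      # review counts, ratings, recent review signals
total = records * (base + contact_pro + reputation)  # Photos layer left off
print(total)  # 1500 credits
```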

The practical effect is that the question of how long to wait between paginated calls disappears from your day. So does the question of how to break a city into cells, how to dedupe across them, and how to rotate keys when one of them hits a daily cap.

FAQ

How long is the next_page_token delay?
Two seconds is the practical minimum. Five seconds is a safe ceiling for a single retry. Most tokens are ready in well under three seconds.

Why am I getting INVALID_REQUEST?
On a token-only follow-up call, this almost always means the token has not finished staging on Google's side. Wait and retry. If it persists past five seconds, inspect the token for corruption and the original request for parameter mismatch.

How many results can next_page_token return?
At most sixty per query, delivered as three pages of twenty. There is no fourth page. To get more results you must slice the query into smaller geographic or keyword segments.

Is there a next_page_token alternative?
Not within the same Text Search or Nearby Search call. The alternatives are to use the newer Places API endpoints with their own pagination semantics, or to bypass pagination entirely by using a tool like MapsLeads that handles the slicing and stitching for you.

Does the delay count against my quota?
The delay itself does not. The follow-up request does. Each paginated call is billed as a normal Places API call against your key.

Can I cache next_page_token for later use?
No. Tokens are short-lived and tied to the original query context. Treat them as single-use and consume them within the same workflow that produced them.

Conclusion

The Google Places API next_page_token delay exists because the server is still assembling your next page when the first response is sent. Two seconds is the floor, five seconds is the ceiling, and the sixty-result cap means pagination alone will never give you a complete dataset for any non-trivial geography. Sleep and retry works for prototypes. Exponential backoff and overlapped processing works for production. Neither approach lifts the cap, and neither makes bulk extraction fast.

If you have spent more than an afternoon writing pagination logic, you have spent more than that logic is worth. Get started with MapsLeads and let the platform handle the tokens, the delays, and the geography while you focus on the leads.