Amazon S3 Vectors now returns up to 10,000 similarity-search results per query
AWS raised the per-query result cap for Amazon S3 Vectors to 10,000, a 100x jump from the previous limit. The change targets multi-stage retrieval pipelines, which are increasingly common in RAG and agentic systems: an initial vector search returns a large candidate pool, and downstream rerankers and aggregators then refine it. A small top-k cap forces awkward pagination or multiple round-trips, and any candidate that misses the first-stage cut never reaches the reranker. The higher limit lets pipelines pull a rich candidate set in a single call.
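The two-stage pattern can be sketched in a few lines. This is a minimal illustration, not the S3 Vectors API: an in-memory brute-force cosine search stands in for the first-stage vector query, and the names retrieve_candidates and rerank, the sample corpus, and the rerank scores are all hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_candidates(
    query: Vector, corpus: dict[str, Vector], top_k: int
) -> list[tuple[str, float]]:
    """Stage 1 (stand-in for the vector store): the top_k most similar keys."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def rerank(
    candidates: list[tuple[str, float]],
    score_fn: Callable[[str], float],
    final_k: int,
) -> list[tuple[str, float]]:
    """Stage 2: rescore only the candidates that survived stage 1."""
    rescored = [(key, score_fn(key)) for key, _ in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:final_k]


# Hypothetical data: doc-c is far from the query vector but scores best
# under the (made-up) reranker.
corpus = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.9, 0.1],
    "doc-c": [0.0, 1.0],
    "doc-d": [0.7, 0.7],
}
rerank_scores = {"doc-a": 0.2, "doc-b": 0.9, "doc-c": 0.95, "doc-d": 0.5}
query = [1.0, 0.0]

narrow = rerank(retrieve_candidates(query, corpus, top_k=3), rerank_scores.__getitem__, 2)
wide = rerank(retrieve_candidates(query, corpus, top_k=4), rerank_scores.__getitem__, 2)

print([key for key, _ in narrow])  # ['doc-b', 'doc-d']
print([key for key, _ in wide])    # ['doc-c', 'doc-b']
```

With the narrow pool, doc-c is cut before the reranker ever sees it. Widening the first-stage pool changes the final answer, which is the case a larger per-query result cap is meant to serve.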
The improvement is part of the same Summit NYC infrastructure push that included S3 annotations, Bedrock Guardrails' InvokeGuardrailChecks, and SageMaker container caching, all aimed at making S3 and Bedrock the default data and retrieval substrate for AI agents. By building vector search directly into S3 rather than into a separate vector database, AWS leans on its storage gravity to keep retrieval workloads on its own platform.
Competitively, the move pressures dedicated vector databases like Pinecone, Weaviate, and Milvus, whose pitch partly rests on richer query semantics and higher result limits. The trade-off for S3 Vectors remains latency and feature depth versus specialized engines; a 100x higher result cap narrows one gap but doesn't address everything purpose-built vector DBs offer. For teams already on AWS, though, the convenience and cost of co-locating vectors with object storage are compelling. Two things to watch: query latency at the 10,000-result tier, and whether reranking throughput holds up against candidate pools that large.