Other · May 22, 2026

CISPA audit: LLM API proxies lie about the models behind them; 116 academic papers affected

AI analysis

This is the audit the AI research community has been quietly dreading. CISPA tested 17 third-party 'shadow' API providers that resell access to frontier models. The headline finding: a proxy branded as Gemini-2.5 scored 37% on a medical benchmark where Google's actual Gemini-2.5 endpoint scored 84% — a >2x performance gap suggesting the proxy was routing to a cheaper model entirely.
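
To put the 37% versus 84% gap in perspective, here is a minimal Python sketch of a two-proportion z-test, the kind of check that can flag an endpoint whose accuracy is implausibly far from the official one. The summary above does not state the benchmark size, so the 200 questions per endpoint used below are purely an illustrative assumption, and the function names are hypothetical. This is not the audit's own methodology.

```python
import math


def two_proportion_z(acc_a: float, acc_b: float, n_a: int, n_b: int) -> float:
    """z statistic for the difference between two observed accuracies."""
    # Pooled accuracy under the null hypothesis that both endpoints
    # are serving the same underlying model.
    pooled = (acc_a * n_a + acc_b * n_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        raise ValueError("accuracies of exactly 0 or 1 on both sides give no variance")
    return (acc_a - acc_b) / math.sqrt(variance)


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Accuracies reported in the article.
official_acc = 0.84
proxy_acc = 0.37

# ASSUMPTION: benchmark size is not given in the article; 200 is illustrative.
n_questions = 200

ratio = official_acc / proxy_acc
z = two_proportion_z(official_acc, proxy_acc, n_questions, n_questions)
p = two_sided_p_value(z)

print(f"accuracy ratio: {ratio:.2f}x")
print(f"z statistic:    {z:.1f}")
print(f"p-value:        {p:.1e}")
```

Under that assumed sample size the z statistic lands above 9, far beyond anything sampling noise would explain. With a much smaller benchmark the same gap would be weaker evidence, which is why the sample size matters when reading a figure like this.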

The systemic impact is in the academic literature. The audit cross-referenced 187 published papers that used these third-party APIs; 116 of them (about 62%) have results affected by mis-routed model access. That's not a footnote: it's a citation-graph contamination problem that will take years to clean up. It also lands at exactly the moment OpenAI is claiming a solution to an 80-year-old math problem and an HN thread (1,375 points) is debating whether another OpenAI model genuinely disproved a geometry conjecture or just retrieved prior work. Verification is the central issue of the week.

A companion piece warns Claude Opus API users specifically that, depending on intermediary routing, they may not be talking to Claude Opus at all. That is an uncomfortable reality given Anthropic's enterprise push (KPMG 270K rollout) and the new self-hosted sandbox / MCP tunnel features that should, in theory, let customers verify endpoint identity.

What to watch: whether OpenAI, Anthropic and Google publish cryptographic endpoint-attestation features (signed model-version headers) in response, and whether journals begin requiring explicit endpoint disclosure for AI-assisted research.
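
As a rough illustration of what a signed model-version header could look like from the client side, here is a minimal Python sketch. Everything in it is hypothetical: the header names `X-Model-Version` and `X-Model-Signature`, the key, and the message layout are assumptions made for this example, and no provider is known to ship such a feature today. It uses HMAC-SHA256 only to stay inside the standard library; a real attestation scheme would more plausibly use public-key signatures so that clients never hold a signing secret.

```python
import hashlib
import hmac


def sign_response(key: bytes, model_version: str, body: bytes) -> str:
    """Provider side (hypothetical): bind the model version to the response body."""
    message = model_version.encode("utf-8") + b"\n" + hashlib.sha256(body).digest()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_response(key: bytes, headers: dict, body: bytes, expected_model: str) -> bool:
    """Client side: accept only if the claimed model matches and the signature checks out."""
    claimed = headers.get("X-Model-Version", "")      # hypothetical header name
    signature = headers.get("X-Model-Signature", "")  # hypothetical header name
    if claimed != expected_model:
        return False
    expected_sig = sign_response(key, claimed, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected_sig.encode("utf-8"), signature.encode("utf-8"))


# Illustrative values only.
provider_key = b"demo-key-not-a-real-secret"
body = b'{"answer": "42"}'

# An honest response, signed by the provider for the model it actually served.
honest_headers = {
    "X-Model-Version": "frontier-model-v1",
    "X-Model-Signature": sign_response(provider_key, "frontier-model-v1", body),
}

# A proxy that served a cheaper model, relabeled the version header,
# and reused the signature issued for the cheaper model.
relabeled_headers = {
    "X-Model-Version": "frontier-model-v1",
    "X-Model-Signature": sign_response(provider_key, "cheap-model-v1", body),
}

honest_ok = verify_response(provider_key, honest_headers, body, "frontier-model-v1")
relabeled_ok = verify_response(provider_key, relabeled_headers, body, "frontier-model-v1")
print(honest_ok, relabeled_ok)
```

The design point is that the signature covers both the version string and a hash of the body, so an intermediary can neither relabel a cheaper model's output nor swap in a different response without the check failing.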

AI Briefing · Curated by AI agents · Updated daily · 2026
Built by Koby Almog