← All vendors
Hugging Face logo
Vendor

Hugging Face AI News

Every AI news story AI Briefing has published about Hugging Face — 43 articles spanning Apr 7, 2026 – Jun 20, 2026. Track Hugging Face's model releases, research papers, product launches, funding rounds, and partnerships across the AI industry, updated daily.

43 articles · Apr 7, 2026 – Jun 20, 2026

Liquid AI releases LFM2.5 retrieval models for fast multilingual search

Liquid AI released two new 350M-parameter retrieval models on Hugging Face — LFM2.5-ColBERT-350M (late-interaction) and LFM2.5-Embedding-350M (dense bi-encoder) — designed for fast multilingual and cross-lingual search across 11 languages. They are the first bidirectional members of the LFM family, available under the LFM Open License v1.0.

2026-06-20

GLM-5.2 tops Artificial Analysis intelligence index, electrifying open-weights scene

Z.AI's GLM-5.2, released on Hugging Face under an MIT license, topped Artificial Analysis's intelligence index (877 points) and offers a 1M-token context for long-horizon agentic tasks. It positions competitively against Claude Opus 4.8 and Gemini 3.1 Pro, with r/LocalLLaMA hailing it as 'a win for local AI.'

2026-06-19

Hugging Face launches enterprise service accounts and secretless trusted publishing

Hugging Face introduced service accounts — dedicated organization-owned identities for programmatic access in CI/CD and automation — plus secretless trusted publishing from GitHub, GitLab, and CI with gated repo access. Transformers v5.12.0 added MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT.

2026-06-19

GLM-5.2 tops open-weights leaderboard with 1M-token context at ~1/6 the cost

Z.ai's GLM-5.2 became the new leading open-weights model on the Artificial Analysis intelligence index, built for long-horizon tasks with a stable 1M-token context. Forbes reports it scores 62.1 on SWE-bench Pro, edging GPT-5.5, at roughly one-sixth the cost of leading US closed models, with architectural gains like IndexShare and improved speculative decoding.

2026-06-18

'Your agent doesn't have a trust problem — it has an authority problem'

A widely-shared dev.to essay argues that when one agent acts on behalf of another — accepting tasks, calling tools, spending balances — the instinctive 'trust' framing has no good answer, since inspection can't establish trust. It reframes the real challenge as designing scoped authority and delegation, echoing an emerging industry consensus that the harness, not the prompt, is now the product.

2026-06-07

Researchers build self-spreading enterprise AI worm powered by open-weight LLM

Researchers demonstrated a self-replicating AI worm that uses a bring-your-own open-weight LLM to propagate across enterprise environments, reigniting open-vs-closed safety debates. The prototype shows how freely downloadable model weights can be weaponized into autonomous, potentially unstoppable malware.

2026-06-07

Hugging Face Transformers RCE flaw enables stealthy compromise via model configs

A high-severity remote code execution vulnerability (CVE-2026-4372) in Hugging Face Transformers lets attackers compromise systems via malicious model configuration files, bypassing the trust_remote_code=false flag using an underscore-prefixed _attn_implementation_internal parameter. The library is downloaded 146M+ times monthly, making the blast radius huge.

2026-06-06

Ideogram 4 releases as open-weight text-to-image model

Ideogram released Ideogram 4, an open-weight text-to-image foundation model trained from scratch with a single-stream Diffusion Transformer architecture. It introduces a structured JSON prompting interface and offers best-in-class multilingual text rendering, deep language understanding and native 2K-resolution images. Hugging Face highlighted the weights as state-of-the-art and open.

2026-06-05

Ideogram 4 released as open-weights text-to-image model on Hugging Face

Ideogram 4, a new open-weight text-to-image foundation model, was released on Hugging Face with inference code and weights publicly available. Trained from scratch, it features a single-stream Diffusion Transformer architecture, best-in-class multilingual text rendering, and supports structured JSON prompting.

2026-06-04

MiniMax releases open-weights M3 model, plans China IPO

MiniMax released MiniMax M3, described as the first open-weights model combining three frontier capabilities including coding and agentic use with 1M context, and reported strong growth plus plans for a Mainland China listing. It also failed to dismiss a US lawsuit from Disney, Universal and Warner Bros Discovery over alleged IP theft.

2026-06-03

Liquid AI ships LFM2.5-8B-A1B on-device MoE with 128K context

Liquid AI released LFM2.5-8B-A1B on Hugging Face, an improved on-device Mixture-of-Experts model with a 128K context window, pretraining scaled from 12T to 38T tokens, and large-scale RL. Its vocabulary doubled to improve non-Latin tokenization, enabling tool-call chaining and efficient operation on entry-level laptops.

2026-05-31

Hugging Face open-sources $2,500 3D-printable humanoid robot

Hugging Face open-sourced complete plans for a bipedal humanoid robot platform costing roughly $2,500 in parts. The design uses 75 3D-printed files, common actuators and electronics, and a public Onshape CAD model for inspection and modification — extending the LeRobot project into humanoid form factor.

2026-05-30

Hugging Face Cuts Async RL Weight Sync Bandwidth ~100x

Clement Delangue said the HF science team made async RL weight sync ~100x cheaper on bandwidth and removed the shared-cluster requirement — a meaningful unlock for distributed RL on frontier-scale models.

2026-05-29

Human Archive raises funding for worker-generated AI training datasets

Human Archive closed new funding to build curated datasets sourced directly from workers, joining a wave of labor-data startups feeding frontier model training with consented, domain-specific human data — a category Hugging Face's ecosystem has been actively amplifying as the public web saturates as a training source.

2026-05-27

ByteDance Seed releases Cola DLM: 2B non-autoregressive diffusion language model

ByteDance Seed released Cola DLM (Continuous Latent Diffusion Language Model) on May 7, 2026 — a 2B-parameter non-autoregressive model that plans entire passages in continuous latent space before decoding to tokens in a single pass. It's the first openly released non-autoregressive recipe of its kind, distributed via Hugging Face.

2026-05-23

NanoClaw rejects $20M buyout, raises $12M seed with Hugging Face's Clem Delangue backing

NanoClaw creator Gavriel Cohen turned down a $20M acquisition offer and instead raised a $12M seed led by Valley Capital Partners, with Docker, Vercel, Monday.com, Slow Ventures, and angels including Hugging Face CEO Clem Delangue participating. The project went viral within weeks, picking up endorsements from Andrej Karpathy and Singapore's foreign minister.

2026-05-23

Cohere Drops Command A+ on Hugging Face; NVIDIA Kimi-K2.6-NVFP4 Quantized Release

Cohere released Command A+ on Hugging Face, its most powerful LLM yet, optimized to run on minimal hardware. NVIDIA separately released Kimi-K2.6-NVFP4, a quantized version of Moonshot AI's Kimi-K2.6 with a 256K context window and text/image/video input support, packaged for ready-to-deploy NVIDIA GPU inference.

2026-05-21

MiniCPM-V 4.6 API and OlmoEarth v1.1 ship on Hugging Face

MiniCPM-V 4.6, a pocket-sized multimodal LLM built on SigLIP2-400M and Qwen3.5-0.8B for ultra-efficient image and video understanding on mobile, is now available as a public free API on Hugging Face. Separately, AI2 released OlmoEarth v1.1 — a more efficient family of open Earth-observation foundation models — continuing its open-science geospatial push.

2026-05-20

Open-model bonanza: Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, GLM-5.1 land in one week

Hugging Face's 'latest open artifacts' report highlights a wave of new open-source releases — DeepSeek-V4-Flash (284B total / 13B active) and V4-Pro, NVIDIA's quantized Kimi-K2.6-NVFP4 (1T params / 32B activated), Gemma 4, MiMo 2.5, and GLM-5.1. Evaluations suggest they still lag American frontier labs in aggregate but offer cost-effective alternatives.

2026-05-19

Fake 'OpenAI Privacy Filter' repo hits #1 trending, pushes Rust infostealer to 244K downloads

A malicious Hugging Face repository impersonating OpenAI's Privacy Filter reached #1 trending and amassed ~244,000 downloads in 18 hours before takedown. It dropped a Rust infostealer targeting Windows, Chromium browsers, Discord, and crypto wallets; six related malicious repos were also identified.

2026-05-17

Hugging Face releases Granite Embedding multilingual R2 with 32K context

Hugging Face released two new multilingual embedding models in collaboration with IBM Granite: 311M and 97M parameter variants supporting 200+ languages with enhanced retrieval for 52 languages and code. Context length jumps to 32,768 tokens — a 64x increase over predecessors. Both models ship Apache 2.0.

2026-05-16

Tokenizer-hijack and OpenAI typosquat malware hit Hugging Face

Researchers showed a single-file tweak to tokenizer libraries inside Hugging Face models can hijack outputs and exfiltrate data, while a top-ranking repo was found typosquatting OpenAI to deliver infostealer malware. IBM separately released Granite Embedding Multilingual R2 (32K context, Apache 2.0) on the hub.

2026-05-15

Malicious model posing as OpenAI release hits 244K downloads, #1 trending in 18 hours

A malicious Hugging Face repository disguised as an official OpenAI release delivered Windows infostealer malware and racked up 244,000 downloads before removal, reaching #1 trending in 18 hours. The incident, paired with a fresh out-of-bounds-read CVE in Ollama, reframes public model hubs as a first-class software supply-chain attack surface.

2026-05-12

Hugging Face releases EMO MoE with emergent modularity from optimization

Hugging Face released EMO (Emergent Modularity from Optimization), a Mixture-of-Experts model pretrained to develop modular structure directly from data without human-defined priors. Users can run a 12.5% expert subset and retain near-full-model performance, enabling flexible memory-accuracy tradeoffs.

2026-05-11

Hugging Face launches agentic toolkit for Reachy Mini open-source robot

Hugging Face released an agentic toolkit for Reachy Mini, the desktop robot from its 2024 Pollen Robotics acquisition, letting developers build and share agent behaviors via the platform. The move extends Hugging Face's 'GitHub of AI' model from datasets and weights into open-source robotics.

2026-05-10

State of Open Source Spring 2026: 13M users, 2M models, robotics datasets surge 23×

Hugging Face hit 13M users, 2M public models and 500K+ datasets, with robotics the fastest-growing community — datasets jumped from 1,145 in 2024 to 26,991 in 2025. The company still has $200M of its $400M raise in the bank, and Mistral Medium 3.5's open weights landed on the Hub the same week.

2026-05-07

Hugging Face launches open-source Reachy Mini App Store with 200+ apps

Hugging Face debuted an app store for its $299 Reachy Mini desktop robot with over 200 community-built open-source apps. CEO Clément Delangue framed it as removing the 'roboticist barrier' and predicted AI model builders will release on Reachy Mini to test new models' robotic capability. About 10,000 Reachy Mini units have shipped since the July 2025 launch.

2026-05-07

Cisco open-sources AI model provenance scanner

Cisco released an open-source tool for AI model lineage tracking with 'compare' (shared lineage between two models) and 'scan' (closest match against a fingerprint database) modes. It targets the lineage explosion from continuous fine-tuning, distillation, and merging.

2026-05-03

Hugging Face hosts Talkie 13B model trained only on pre-1931 text

The Talkie project published talkie-1930-13b-base — a 13B open-weight LLM trained exclusively on 260B tokens of English text published before December 31, 1930 — as a research artifact for studying language and reasoning without modern data contamination.

2026-05-01

OpenAI releases Privacy Filter on Hugging Face — 1.5B-parameter open PII redaction model

OpenAI quietly released Privacy Filter on Hugging Face under Apache 2.0: a 1.5B-parameter bidirectional token-classification model with 50M active parameters, purpose-built for detecting and redacting personally identifiable information at the token level.

2026-04-30

Vellum LLM Leaderboard April update: Opus 4.7, Gemini 3.1 Pro, GPT-5.4 tied at the top

Vellum's April 2026 leaderboard shows three models tied at the top of the Artificial Analysis Intelligence Index at 57: Claude Opus 4.7, Gemini 3.1 Pro Preview, and GPT-5.4. Anthropic holds four of the top five spots; Opus 4.7 leads coding at 82% SWE-bench Verified and 1504 Elo on LM Arena.

2026-04-26

ML Intern open-source agent beats Claude Code on scientific reasoning

Hugging Face released 'ML Intern,' an open-source AI agent built on smolagents that automates the entire LLM post-training workflow — literature reviews, dataset discovery, training, and evaluation. Early benchmarks show it outperforms Claude Code on scientific reasoning.

2026-04-24

Hugging Face Launches ML Intern AI Agent for Automated Research Workflows

Hugging Face launched ML Intern, an open-source AI agent designed to automate end-to-end machine learning research workflows including paper research, dataset creation, and model training. Early benchmarks indicate ML Intern outperforms Anthropic's Claude Code and OpenAI's Codex in scientific reasoning and healthcare evaluations, with Hugging Face offering $1,000 in GPU and Anthropic credits to early users.

2026-04-23

Hugging Face Releases ml-intern: Open-Source AI Agent for Automated LLM Post-Training

Hugging Face launched ml-intern, an open-source AI agent built on its smolagents framework that automates the entire post-training workflow for large language models. The tool autonomously performs literature review, dataset discovery, training script execution, and iterative evaluation, demonstrating significant improvements by pushing a Qwen3-1.7B model's GPQA benchmark score from 8.5% to 32% in under 10 hours.

2026-04-22

FineSteer Framework Enables Fine-Grained Inference-Time Model Steering

Hugging Face introduced FineSteer, a unified framework enabling fine-grained inference-time steering to address undesirable behaviors like safety violations and hallucinations without parameter updates. The framework balances effectiveness, utility preservation, and training efficiency through flexible steering mechanisms.

2026-04-20

Hugging Face Launches Inference Providers Marketplace with Pay-Per-Token Billing Across 10 Vendors

Hugging Face released an Inference Providers marketplace that aggregates access to over 10 compute providers—including Together AI, Replicate, and Fireworks AI—under a single API key and unified pay-per-token billing. Developers can switch providers per request for cost or latency optimization without changing integration code. Over 500 open models are available at launch through the unified endpoint.

2026-04-16

Hugging Face Launches Inference Providers Hub Aggregating 10 External APIs

Hugging Face introduced Inference Providers, a unified hub feature that lets developers call models hosted on Together AI, Fireworks, Replicate, Cerebras, and six other inference providers through a single standardized API key and SDK. Switching providers requires changing one parameter in the InferenceClient, enabling cost comparison and fallback routing. Over 10,000 models are immediately supported across providers at launch.

2026-04-14

MiniMax Open-Sources M2.7 Model Achieving SOTA Performance on Multiple Benchmarks

MiniMax announced that its M2.7 model is officially open-sourced, achieving state-of-the-art performance with a score of 56.22% on the SWE-Pro benchmark and demonstrating competitive results across multiple evaluation metrics. The model release expands the ecosystem of high-performance open-source language models available through Hugging Face, providing developers with additional options for fine-tuning and deployment. MiniMax's decision to open-source the M2.7 model reflects the continuing trend toward making advanced AI capabilities accessible to the broader developer community through permissive licensing.

2026-04-13

MiniMax Open-Sources M2.7 Self-Evolving Agent Model Scoring 56% on SWE-Pro Benchmark

MiniMax released MiniMax M2.7 as an open-source self-evolving agent model, achieving 56.22% on the SWE-Pro coding benchmark and 57.0% on Terminal Bench 2, competitive scores for an openly available model. The model is notable as the first to actively participate in its own development cycle, representing a paradigm shift in LLM construction where models contribute to their own improvement. Model weights are publicly available on Hugging Face, continuing the platform's role as the primary distribution channel for major open-weight releases.

2026-04-12

Hugging Face Updates Sentence Transformers with Multimodal Embedding and Reranker Support

Hugging Face announced an update to its Sentence Transformers Python library (v5.4), which now supports multimodal embedding and reranker models. This update allows users to encode and compare texts, images, audio, and videos within a shared embedding space, facilitating cross-modal search, visual document retrieval, and multimodal RAG pipelines. The library provides capabilities for both embedding models, which map inputs into a common space, and reranker models, which score the relevance of mixed-modality pairs.

2026-04-10

Hugging Face Launches Inference Providers Hub and Multimodal Sentence Transformers v5.4; Arcee Releases 400B Trinity Large Thinking Model

Hugging Face introduced the Inference Providers Hub, aggregating endpoints from Together AI, Fireworks, Replicate, and others into a single model card interface covering 5,000+ models, allowing developers to benchmark latency and cost and switch providers with a single API parameter change — drawing praise as 'the missing abstraction layer' from ML engineers. Separately, Hugging Face released Sentence Transformers v5.4 with multimodal embedding and reranker models supporting text, image, audio, and video in a unified API for cross-modal search and multimodal RAG pipelines. Also debuting on Hugging Face: Arcee's Trinity Large Thinking, a 400-billion parameter open-weight reasoning model built by a 26-person team on a $20 million budget that CEO Mark McQuade claims is the most capable open-weight model from a non-Chinese company, directly challenging Chinese open-source dominance.

2026-04-09

Hugging Face Publishes Cluster of Agent Research: Paper Circle, Echo Memory Framework, Claw-Eval, and MedGemma 1.5

Hugging Face released a significant cluster of research on April 5–6 covering agent infrastructure, evaluation, and medical AI. Highlights include Paper Circle (multi-agent LLM framework for automated research discovery), Echo (transfer-oriented memory for multimodal agents in Minecraft), Claw-Eval (trustworthy evaluation framework addressing trajectory-opaque grading and safety gaps in agent benchmarking), and an in-place Test-Time Training approach allowing dynamic LLM weight adaptation at inference without retraining. MedGemma 1.5 4B was also released, expanding medical AI with high-dimensional imaging (CT/MRI volumes, histopathology), anatomical localization, and improved clinical document understanding. Additional papers addressed tool-integrated reasoning inefficiencies (KV-Cache eviction from long tool responses) and robot policy learning via multiview video generation.

2026-04-08

Hugging Face Papers: TriAttention, OpenWorldLib, Vero VLMs, and SkillX Agent Learning Released

Hugging Face published a cluster of significant research papers this week. TriAttention addresses KV cache memory bottlenecks in long-context LLM reasoning using pre-RoPE trigonometric compression, enabling longer reasoning sequences without quality loss. OpenWorldLib provides a standardized unified inference framework and definitions for advanced world models, improving research reproducibility. Vero releases a fully open family of Vision-Language Models achieving broad visual reasoning across charts, science, and spatial tasks — previously locked behind proprietary RL pipelines. SkillX proposes automated collaborative skill knowledge base construction enabling LLM agents to share and reuse learned behaviors rather than rediscovering them independently. MinerU2.5-Pro additionally demonstrates that document parsing SOTA improvements stem from systematic training data engineering rather than architecture innovation.

2026-04-07

More vendors

AnthropicOpenAIGoogleAWSAzureMetaxAINVIDIAMistralAppleAlibabaDeepSeekSamsung

← Browse all AI stories