
Perceptron Mk1, Thinking Machines Models Lead AI Releases


Synthesized from 5 sources

Two AI startups unveiled new models this week targeting gaps in video understanding and real-time multimodal interaction. Separately, a wave of supply-chain incidents hitting OpenAI, Anthropic, and Meta exposed a category of infrastructure vulnerability that standard model evaluations don’t measure.

Perceptron Mk1: Video Reasoning at 80–90% Lower Cost

Perceptron Inc., a two-year-old startup, released its flagship video analysis reasoning model, Mk1, on May 13, 2026. According to Perceptron’s announcement, the model is priced at $0.15 per million input tokens and $1.50 per million output tokens via API — roughly 80–90% cheaper than comparable proprietary models including Anthropic’s Claude Sonnet 4.5, OpenAI’s GPT-5, and Google’s Gemini 3.1 Pro.

The company was co-founded by CEO Armen Aghajanyan, formerly of Meta FAIR and Microsoft. VentureBeat reported that the team spent 16 months building what it describes as a “multi-modal recipe” from scratch, designed to handle the physical-world complexity of live video feeds rather than adapting a text-first architecture.

The pricing gap is significant for enterprises with high-volume video workloads. At current market rates, processing millions of hours of security footage, marketing content, or behavioral research video through GPT-5 or Claude Sonnet 4.5 carries costs that most mid-market organizations can’t absorb at scale. Mk1’s API pricing changes that arithmetic directly.
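To make that arithmetic concrete, the sketch below prices a workload at Mk1’s published rates and back-solves what a comparator would have to charge for the 80–90% claim to hold. The token volumes are hypothetical placeholders, not figures from Perceptron, and the implied comparator rates follow only from the stated discount, not from any vendor’s price list.

```python
# Rates from Perceptron's announcement (USD per million tokens).
MK1_INPUT_RATE = 0.15
MK1_OUTPUT_RATE = 1.50


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_rate: float = MK1_INPUT_RATE,
                 output_rate: float = MK1_OUTPUT_RATE) -> float:
    """Total API cost for a workload billed per million tokens."""
    input_cost = (input_tokens / 1_000_000) * input_rate
    output_cost = (output_tokens / 1_000_000) * output_rate
    return input_cost + output_cost


def implied_comparator_rate(rate: float, discount: float) -> float:
    """Rate a comparator would charge if `rate` is `discount` cheaper than it."""
    return rate / (1.0 - discount)


# ASSUMPTION: hypothetical monthly workload, chosen only for illustration.
INPUT_TOKENS = 2_000_000_000
OUTPUT_TOKENS = 50_000_000

mk1_cost = api_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS)
low = implied_comparator_rate(MK1_INPUT_RATE, 0.80)
high = implied_comparator_rate(MK1_INPUT_RATE, 0.90)

print(f"Mk1 cost: ${mk1_cost:,.2f}")
print(f"Implied comparator input rate: ${low:.2f} to ${high:.2f} per million tokens")
```

Under these assumed volumes the Mk1 bill comes to $375, and the 80–90% claim implies comparator input pricing of roughly $0.75 to $1.50 per million tokens. Real workloads would need measured token counts per hour of video, which the announcement as summarized here does not provide.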

What Mk1 Can Do

The model is built for grounded spatial and temporal understanding — not just recognizing objects in a frame, but tracking cause-and-effect relationships, object dynamics, and physical interactions across time. According to Perceptron’s announcement, benchmark results cover:

  • Spatial reasoning across standard video understanding evaluations
  • Object dynamics and physics-consistent prediction
  • Temporal coherence in live-feed analysis scenarios
  • Clip extraction and anomaly flagging for marketing and compliance use cases

A public demo is available for prospective users and enterprise customers to test the model directly.

Thinking Machines Previews Real-Time Interaction Models

Thinking Machines, the startup founded by former OpenAI CTO Mira Murati and OpenAI co-founder and former researcher John Schulman, announced a research preview of what it calls “interaction models” on May 13, 2026. The framing is architectural: rather than bolting real-time responsiveness onto a standard turn-based model via software wrappers, Thinking Machines claims to have built interactivity into the model architecture itself.

According to the company’s announcement blog post, the approach treats interactivity as “a first-class citizen of model architecture” — meaning the model can begin responding while still processing incoming input, whether that input is text, audio, or video. The company reported gains on third-party benchmarks and reduced latency compared to standard sequential architectures.
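The difference between turn-based and interleaved behavior can be illustrated with a toy event log. This is only a scheduling sketch under simple assumptions: it says nothing about how Thinking Machines implements its models, and every function name here is hypothetical.

```python
from typing import Iterable, Iterator, List


def incoming(words: List[str], log: List[str]) -> Iterator[str]:
    """Simulated input stream that records when each chunk arrives."""
    for word in words:
        log.append(f"in:{word}")
        yield word


def turn_based(stream: Iterable[str], log: List[str]) -> None:
    """Consume the whole input first, then respond once."""
    received = list(stream)
    log.append(f"out:reply({len(received)} chunks)")


def interleaved(stream: Iterable[str], log: List[str]) -> None:
    """Emit a partial response after every chunk, before the input ends."""
    seen = 0
    for _ in stream:
        seen += 1
        log.append(f"out:partial({seen})")


WORDS = ["book", "a", "table"]

turn_log: List[str] = []
turn_based(incoming(WORDS, turn_log), turn_log)

live_log: List[str] = []
interleaved(incoming(WORDS, live_log), live_log)

print(turn_log)  # all "in:" events, then a single "out:" event
print(live_log)  # "in:" and "out:" events alternate
```

In the turn-based log the only output event follows the last input event, while in the interleaved log outputs appear between inputs. That ordering, not the content of the replies, is the property the company’s framing emphasizes.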

Availability and Access

The models are not yet publicly available. Thinking Machines stated it will open a limited research preview in the coming months to collect feedback before a wider release. No pricing or API details have been disclosed.

The practical target for this architecture is clear: AI systems that need to participate in natural human conversation — customer service, real-time translation, live meeting assistance — rather than waiting for a complete prompt before generating output. Current turn-based latency, even at sub-second response times, creates friction that limits how naturally users can interact with AI in voice and video contexts.

Supply-Chain Attacks Expose Release Pipeline Gaps

Separate from model launches, a cluster of security incidents over the past 50 days has highlighted a structural vulnerability in how AI companies ship software — one that model safety evaluations, system cards, and red-team exercises don’t currently address.

VentureBeat reported that four supply-chain incidents hit OpenAI, Anthropic, and Meta between late March and mid-May 2026. Three were adversary-driven; one was a self-inflicted packaging failure. None targeted the AI models themselves.

The TanStack Compromise

On May 11, 2026, a self-propagating worm called Mini Shai-Hulud published 84 malicious package versions across 42 `@tanstack/*` npm packages in six minutes. The worm exploited a chain of vulnerabilities: a `pull_request_target` misconfiguration in `release.yml`, GitHub Actions cache poisoning, and OIDC token extraction from runner memory.
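The reporting summarized here does not reproduce the affected `release.yml`, so the exact misconfiguration is unknown. As a general illustration, `pull_request_target` workflows run in the context of the base repository, and a widely documented risky pattern is combining that trigger with a checkout of the pull request’s own head commit. The heuristic below flags that pattern in workflow text; both sample workflows are hypothetical, and a text scan like this can miss variants or raise false alarms.

```python
import re

# Expression that resolves to the pull request's own (untrusted) commit or branch.
PR_HEAD_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")
PRIVILEGED_TRIGGER = re.compile(r"\bpull_request_target\b")


def risky_pull_request_target(workflow_text: str) -> bool:
    """Heuristic: privileged trigger combined with a reference to PR head code."""
    uses_trigger = PRIVILEGED_TRIGGER.search(workflow_text) is not None
    touches_pr_head = PR_HEAD_REF.search(workflow_text) is not None
    return uses_trigger and touches_pr_head


# Hypothetical workflow: privileged trigger plus checkout of untrusted PR code.
RISKY_WORKFLOW = """
on:
  pull_request_target:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: npm ci && npm run build
"""

# Hypothetical workflow: unprivileged trigger with the default checkout.
SAFER_WORKFLOW = """
on:
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm run build
"""

print(risky_pull_request_target(RISKY_WORKFLOW))
print(risky_pull_request_target(SAFER_WORKFLOW))
```

The first workflow is flagged and the second is not. A production audit would parse the YAML and trace which steps execute checked-out code, rather than relying on string matching.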

The resulting packages carried valid SLSA Build Level 3 provenance — because they were published from the correct repository, by the correct workflow, using a legitimately minted OIDC token. No maintainer credentials were phished. The trust model functioned as designed and still produced 84 malicious artifacts.
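Why valid provenance did not help can be shown with a deliberately simplified model. The classes and field names below are hypothetical stand-ins, not the real SLSA or npm attestation schema: the point is only that a check on build origin has no view into what the build actually produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    source_repo: str        # repository the build claims to come from
    workflow_path: str      # workflow file that ran the build
    oidc_token_valid: bool  # whether the identity token was legitimately minted


@dataclass(frozen=True)
class Artifact:
    name: str
    provenance: Provenance
    malicious: bool  # ground truth that a provenance check never sees


def provenance_ok(artifact: Artifact, repo: str, workflow: str) -> bool:
    """Accept an artifact when its build origin matches the expected pipeline."""
    p = artifact.provenance
    return (p.source_repo == repo
            and p.workflow_path == workflow
            and p.oidc_token_valid)


# ASSUMPTION: placeholder repository name, not the real project.
EXPECTED_REPO = "example-org/example-lib"
EXPECTED_WORKFLOW = ".github/workflows/release.yml"

trusted_build = Provenance(EXPECTED_REPO, EXPECTED_WORKFLOW, True)
clean = Artifact("example-lib@1.0.0", trusted_build, malicious=False)
poisoned = Artifact("example-lib@1.0.1", trusted_build, malicious=True)
forged = Artifact(
    "example-lib@1.0.2",
    Provenance("attacker/fork", EXPECTED_WORKFLOW, True),
    malicious=True,
)

for artifact in (clean, poisoned, forged):
    print(artifact.name, provenance_ok(artifact, EXPECTED_REPO, EXPECTED_WORKFLOW))
```

The forged artifact from another repository is rejected, but the poisoned one passes because it really was built by the expected workflow in the expected repository. That matches the incident as reported: the check answers "where was this built?" and not "was the build itself subverted?".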

OpenAI’s Response

Two days after the TanStack incident, OpenAI confirmed that two employee devices were compromised and credential material was exfiltrated from internal code repositories. OpenAI is revoking its macOS security certificates and requiring all desktop users to update by June 12, 2026.

OpenAI noted it had already been hardening its CI/CD pipeline following an earlier supply-chain incident, but the two affected devices had not yet received the updated configurations at the time of compromise.

Security researcher @EnTr0pY_88 noted on X that the certificate rotation — not the exfiltrated code itself — was the more significant signal, suggesting the blast radius had reached signing trust rather than just source access.

What This Means

The Perceptron and Thinking Machines releases together point toward a maturing second tier of AI model development — companies building purpose-specific architectures rather than general-purpose models, competing on cost and latency rather than benchmark breadth.

Perceptron’s pricing is the more immediately disruptive data point. An 80–90% cost reduction for video analysis doesn’t just make existing use cases cheaper — it makes previously uneconomical use cases viable. Continuous live-feed monitoring at scale, for example, shifts from a budget line item reserved for large enterprises to something a mid-market logistics company or regional hospital network could deploy.

Thinking Machines’ interaction model preview is earlier-stage but architecturally more ambitious. If the claim that interactivity is built into the model rather than layered on top holds up under real-world testing, it would address one of the more persistent friction points in voice and video AI deployment. The limited research preview timeline means enterprise validation is still months away.

The supply-chain incidents are a separate category of concern entirely. The core problem — that SLSA provenance, OIDC tokens, and trusted CI pipelines can all be subverted without compromising credentials — isn’t solved by better model evaluations or red-teaming. It requires hardening at the release infrastructure layer, which sits outside the scope of every major AI safety framework currently in use.

FAQ

How does Perceptron Mk1 compare to GPT-5 and Claude Sonnet 4.5 on price?

Perceptron Mk1 is priced at $0.15 per million input tokens and $1.50 per million output tokens via API, which the company says is 80–90% cheaper than GPT-5, Claude Sonnet 4.5, and Gemini 3.1 Pro. The price gap is specific to video analysis workloads; Mk1 is not a general-purpose model competing across all task types.

What are Thinking Machines’ interaction models, and when can I access them?

Thinking Machines describes interaction models as a new class of multimodal systems where real-time responsiveness is built into the model architecture rather than added via software wrappers. A limited research preview is planned for the coming months, with no public release date or pricing announced yet.

How did the Mini Shai-Hulud worm bypass SLSA provenance protections?

The worm compromised a CI/CD runner and extracted a legitimately minted OIDC token, then used it to publish malicious packages through the correct repository and workflow. Because the token and pipeline were authentic, the resulting packages carried valid SLSA Build Level 3 provenance — the attack bypassed the trust model by operating entirely within it.


Digital Mind News

Digital Mind News is an AI-operated newsroom. Every article here is synthesized from multiple trusted external sources by our automated pipeline, then checked before publication. We disclose our AI authorship openly because transparency is part of the product.