Information Matters - Agentic AI News and Market Forecasts

COMPANY PAGE

Baseten

AI inference platform for production model serving — deploy and run open-source, fine-tuned and custom AI models with low-latency, high-throughput infrastructure across multi-cloud and on-prem; positioned as the inference layer for the application generation of AI built on a multi-model future.

Founded 2019
Private — Series E
AI Infrastructure
baseten.co

Last Updated: 28 May 2026
Fact-checked: 2 June 2026
Coverage: Tracker · Category Report (AI Infrastructure, forthcoming)

The Business

Baseten builds an AI inference platform for production model serving: deploying, running and scaling open-source, fine-tuned and custom AI models on low-latency, high-throughput infrastructure across multi-cloud and on-prem environments. The product line is anchored on three pieces: the Baseten Inference Stack (model deployment, autoscaling, observability and the multi-model orchestration surface); the Truss open-source framework for packaging models for production; and a deployment surface that supports open-source model catalogues (Meta Llama, Mistral, DeepSeek), customer fine-tunes and bespoke custom models. The company is privately held. It was founded in 2019 in San Francisco by Tuhin Srivastava, Amir Haghighat, Philip Howes and Pankaj Gupta, and has raised roughly $590M or more of external capital through the January 2026 $300M Series E at a $5B valuation, led by IVP and CapitalG with NVIDIA participating as a $150M anchor. The Information reported in late May 2026 that the company is in talks to raise approximately $1B at an $11B valuation; that round had not closed at time of writing.

Customers and Distribution

Baseten’s annualised revenue ramped from approximately $200M in December 2025 to approximately $600M in March 2026, per CEO podcast interviews and corroborating Tech Startups coverage; industry coverage describes this as among the steepest inference-platform ramps on record. Customers named across the Series D and Series E cycle include Descript, Patreon and Writer, within a multi-category base spanning developer-led adoption and enterprise procurement. Distribution runs through three motions: direct developer onboarding via the Truss open-source framework and the Baseten self-serve platform; direct enterprise sales for production workloads; and partner-and-platform alignment with NVIDIA following the Series E anchor investment. The company has not separately disclosed a precise paid-customer count, headcount or gross-margin shape in primary sources; we rely on the Series D and Series E blog posts and named-press triangulation for the cited figures.
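
As a rough illustration of how steep that ramp is, the sketch below converts the two cited revenue points into a growth multiple and an implied compound monthly growth rate. The three-month gap (December 2025 to March 2026) and the helper name `compound_monthly_growth` are assumptions made for illustration; the source does not give exact measurement dates.

```python
def compound_monthly_growth(start: float, end: float, months: int) -> float:
    """Constant monthly growth rate that takes `start` to `end` in `months`."""
    if start <= 0 or end <= 0 or months <= 0:
        raise ValueError("start, end and months must all be positive")
    return (end / start) ** (1 / months) - 1


# Figures cited in the text, in $M of annualised revenue.
dec_2025 = 200.0
mar_2026 = 600.0
months_between = 3  # assumption: December 2025 to March 2026 treated as 3 months

multiple = mar_2026 / dec_2025
monthly = compound_monthly_growth(dec_2025, mar_2026, months_between)

print(f"{multiple:.1f}x over {months_between} months")
print(f"implied compound monthly growth: {monthly:.1%}")
```

On these inputs the ramp is a 3x increase, which corresponds to roughly 44% compound growth per month if the pace were constant across the period.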

Model Strategy

Baseten is a Verticals-first play under the IM Framework eight-trajectories taxonomy as it applies to AI inference: the strategic bet is that specialised inference infrastructure with multi-model + multi-cloud orchestration beats hyperscaler general-purpose model-serving on tail-latency, throughput-per-dollar and the production-deployment workflow for AI applications. The foundation-model stack is deliberately model-agnostic — Baseten serves open-source models (Meta Llama, Mistral, DeepSeek), customer fine-tunes and bespoke custom models across a multi-cloud deployment surface, with the NVIDIA Series E anchor aligning silicon supply for the inference workload. Above the foundation-model layer, the Truss open-source framework is the developer onboarding surface; the Baseten Inference Stack is the production runtime; consumption-based per-token and per-second pricing is the monetisation surface. The thesis is that inference moves from one-third of AI compute spend in 2023 to roughly two-thirds by end-2026 (Srivastava framing at HumanX 2026), and that the orchestration surface for that spend is the structural prize regardless of which foundation models win the frontier-capability race.

At A Glance

Annualised revenue
$150M ●
2026-04-30 as-of

2024-12-31 to 2026-04-30

Headcount
130 ●
2026-04-30 as-of

2024-12-31 to 2026-04-30

Funding to date
$585M ●
2026-04-30 as-of

2024-12-31 to 2026-04-30

The Numbers

Annualised revenue

  • 2024-12-31: $25M
  • 2025-06-30: $60M
  • 2025-12-31: $120M
  • 2026-04-30: $150M

Headcount (FTE)

  • 2024-12-31: 60
  • 2025-12-31: 100
  • 2026-04-30: 130

Funding to date

  • 2024-12-31: $60M
  • 2025-12-31: $165M
  • 2026-04-30: $585M

Leadership Team

Co-founder & CEO
Tuhin Srivastava
Co-founded Baseten in 2019. Public-facing across the funding cycle including the September 2025 Series D, the January 2026 Series E and the May 2026 $1B funding talks coverage; framed the inference-as-foundation thesis at HumanX 2026 and on multiple podcast interviews including the No Priors and Latent Space cycles.

Co-founder & CTO
Amir Haghighat
Co-founded Baseten in 2019. Leads engineering and product across the inference platform and the multi-model serving stack; long-tenured operator behind the platform’s low-latency architecture and the open-source-model deployment surface.

Co-founder & Chief Scientist
Philip Howes
Co-founded Baseten in 2019. Technical lead on the inference research direction and the custom-model deployment workflows that anchor Baseten’s positioning against hyperscaler general-purpose serving.

Head of Engineering
Pankaj Gupta
Co-founder of Baseten; engineering leadership across the production-inference stack and the multi-cloud deployment architecture that converts the open-source-model catalogue into deployable enterprise workloads.

Baseten is founder-led with all four co-founders remaining in operating roles through the Series E cycle. Senior recruiting has come from infrastructure-adjacent companies including Gusto (where Srivastava and Haghighat met) and the AI/ML platform cohort. CFO, CRO and CTO roles are not separately publicly named at time of writing; the company has not disclosed precise headcount in primary sources, though LinkedIn-visible data places it in the low-hundreds range as of mid-2026.

IM Framework Scoring

IM’s structured assessment of Baseten’s competitive position. The summary below is the headline; expand “Show the full analyst-grade analysis” near the bottom for the per-dimension reasoning and evidence.

Competitive Position
Emerging Player
AI Infrastructure sector

The Information Matters Compass

[Chart: The Information Matters Compass. Axes: Defensibility and Disruption Potential, each scaled from 5 to 10. Quadrants: Disruptive Challengers, Dominant Innovators, Emerging Players, Established Incumbents. Baseten is plotted on the chart. © Information Matters]

Strategic Bet
Verticals — specialised inference infrastructure beats hyperscaler general-purpose model-serving on tail-latency, cost-per-token and the orchestration surface for production AI applications
Plus: plateau (the multi-model future Baseten markets to compounds as inference moves from one-third to two-thirds of AI compute spend through 2026, regardless of frontier-capability cadence)

Watch: Whether the $1B round at an $11B valuation, reported in late May 2026, closes; the Fireworks AI / Together AI competitive cadence on inference benchmarks and pricing; AWS Bedrock and Azure AI Foundry inference-tier pricing moves; NVIDIA NIM microservices distribution competing on the same inference primitive Baseten serves; and the gross-margin shape as ARR ramps through 2026.

Funding History

Date | Round | Raised | Post-money | Lead investor(s)
Jan 2026 | Series E | $300M | $5B | IVP and CapitalG (with NVIDIA $150M anchor)
Sep 2025 | Series D | $150M | $2.15B | BOND
Mar 2024 | Series C | $75M | ~$825M | IVP
2023 | Series B | $40M | - | Spark Capital
2022 | Series A | $20M | - | Greylock

Cumulative disclosed external capital is roughly $590M or more through the January 2026 $300M Series E at a $5B valuation, led by IVP and CapitalG, with NVIDIA participating as a $150M anchor investor and previous backers BOND, Greylock and Spark following on. The Series E followed the September 2025 $150M Series D at $2.15B post-money. The Information reported in late May 2026 that Baseten is in talks to raise approximately $1B at an $11B valuation; that round had not closed at time of writing. Details of earlier rounds (Series C, B, A and Seed) come from named-press coverage and Baseten’s own blog. We rely on Baseten’s primary blog and PYMNTS, Tech Startups and TechCrunch coverage for round dates and valuations, and we decline to publish any figure that appears only on Tracxn or PitchBook.
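
The funding table can be cross-checked with a few lines of arithmetic. The sketch below sums the five listed rounds and computes the post-money step-up between the rounds that have a disclosed valuation. All inputs are the table's own figures in $M; the variable names are illustrative, and the Series C post-money is the table's approximate value.

```python
# Amounts raised per round, in $M, as listed in the funding table.
rounds_usd_m = {
    "Series A (2022)": 20,
    "Series B (2023)": 40,
    "Series C (Mar 2024)": 75,
    "Series D (Sep 2025)": 150,
    "Series E (Jan 2026)": 300,
}

# Post-money valuations in $M, oldest first; only rounds with a disclosed figure.
post_money_usd_m = [
    ("Series C", 825),   # table gives this as approximate (~$825M)
    ("Series D", 2150),
    ("Series E", 5000),
]

total_listed = sum(rounds_usd_m.values())
print(f"Listed rounds total: ${total_listed}M")

# Step-up = this round's post-money divided by the previous round's post-money.
step_ups = {}
for (_, prev_val), (name, val) in zip(post_money_usd_m, post_money_usd_m[1:]):
    step_ups[name] = val / prev_val
    print(f"{name}: {step_ups[name]:.2f}x step-up on the prior post-money")
```

The listed rounds sum to $585M, which matches the At A Glance funding figure. The slightly higher cumulative figure cited in the text would be consistent with earlier capital such as the Seed round, which the table does not list; that reading is an inference, not a disclosed number.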

Competitive Landscape

Competitor: Bedrock (Amazon AWS)
  • Positioning: Amazon’s hyperscaler-native inference service exposing Anthropic, Meta, Mistral, Cohere and AWS-first-party models through a single managed API, positioned as the default inference surface for customers already inside the AWS commit envelope.
  • Distribution edge: Direct AWS console and AWS sales channel, IAM-and-VPC-native procurement, and consumption pricing that lands inside existing AWS enterprise contracts; the moat is the AWS enterprise commit and the regulated-buyer compliance posture rather than per-token price.
  • Threat profile: High. The hyperscaler-native inference service with the broadest model catalogue and the deepest enterprise distribution; bundles inference into existing AWS contracts.

Competitor: Fireworks AI
  • Positioning: Pure-play inference platform for open-source and fine-tuned models with an aggressive performance-per-dollar pitch on Llama, Mistral, DeepSeek and Qwen; positioned as the developer-led mirror of Baseten on the multi-model serving lane.
  • Distribution edge: Direct self-serve developer onboarding plus a direct enterprise sales motion for production workloads; consumption-based per-token pricing and integrations into the LangChain / LlamaIndex / vector-database developer ecosystem.
  • Threat profile: High. The closest direct mirror on the multi-model inference platform positioning, with comparable ARR trajectory and a similar open-source-model-first stance.

Competitor: Together AI
  • Positioning: Multi-model inference and fine-tuning platform with a stated research arm and a deep open-source model catalogue (Llama, DeepSeek, Mixtral, Qwen, FLUX); positioned as a research-flavoured pure-play competitor on the same inference-orchestration primitive as Baseten.
  • Distribution edge: Direct developer self-serve via Together API plus a direct enterprise channel; consumption pricing on inference and dedicated-endpoint contracts for production customers, with research-publication mindshare driving the developer funnel.
  • Threat profile: High. Comparable open-source model serving and custom-model fine-tuning surface; competing head-to-head on developer-led inference deployment.

Competitor: Replicate
  • Positioning: Developer-first inference platform with a long-tail open-source model catalogue accessed through a Docker-and-Cog packaging pattern; positioned as the easy-onramp model-zoo for individual developers and prototype workloads rather than enterprise procurement.
  • Distribution edge: Direct developer self-serve and pay-per-second consumption pricing; community-contributed model catalogue and a Cog-on-GitHub developer funnel are the principal channel and moat.
  • Threat profile: Medium. Developer-first inference platform with a strong long-tail model catalogue; flanking risk on the developer-onboarding surface rather than enterprise procurement.

Competitor: NIM microservices (NVIDIA)
  • Positioning: NVIDIA’s first-party inference-microservices distribution wrapping CUDA-optimised model containers for deployment on any NVIDIA-accelerated infrastructure; positioned as the silicon vendor’s own inference layer on the hardware Baseten itself runs on.
  • Distribution edge: Direct NVIDIA enterprise channel, bundled with NVIDIA AI Enterprise licences and pre-installed across DGX Cloud and NVIDIA-partner hyperscaler offerings; silicon-vendor lock-in is the structural moat.
  • Threat profile: Medium-high. NVIDIA’s first-party inference distribution layered on top of the silicon Baseten itself depends on; a structural alignment and competitive overlap that the Series E investment partially de-risks.

Potential Risks

Hyperscaler-native inference substitution

The principal structural risk is that AWS Bedrock, Azure AI Foundry and Google Vertex AI absorb the inference-platform lane by bundling inference into broader cloud commitments. Baseten’s multi-cloud + open-source-model architecture is a defensible counter-position, but the procurement gravity of hyperscaler enterprise contracts is real and the substitution dynamic is the most-watched competitive variable through 2026.

Foundation-model supplier dependence

Baseten is a serving layer for upstream foundation-model providers (Meta Llama, Mistral, DeepSeek, Anthropic-served-via-API and customer fine-tunes). Capability shifts at the model-provider tier — including model-provider direct-serving moves — propagate directly into Baseten’s value proposition. The Series E NVIDIA participation aligns silicon supply but does not insulate against the model-provider direct-serving substitution.

Pure-play inference competitive cadence

Fireworks AI and Together AI are structurally symmetric pure-play competitors with comparable ARR trajectories and similar open-source-model-first stances. Head-to-head competition on benchmarks and pricing compresses gross margins and slows the path from $600M ARR (March 2026 disclosure) toward the revenue scale needed to support the $11B valuation implied by the reported May 2026 funding talks.

Valuation-to-ARR multiple at the reported $11B mark

The reported $11B valuation in the May 2026 funding talks (per The Information headline and Tech Startups summary), set against the ~$600M ARR disclosed for March 2026, implies a multiple of roughly 18x ARR, which is high for a pure-inference platform. The bull case is that the ARR ramp continues at the disclosed pace and the multiple resolves through growth. The bear case is that the multi-model future thesis compresses as hyperscaler bundles absorb the lane.
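
The multiple in question follows directly from the two reported figures. The sketch below computes it, and also shows how much ARR would be needed for the same valuation to sit at a lower multiple. The 10x target is a purely hypothetical comparison point chosen for illustration, not a figure from the source, and the function names are likewise illustrative.

```python
def arr_multiple(valuation_usd_m: float, arr_usd_m: float) -> float:
    """Valuation divided by annualised recurring revenue."""
    if arr_usd_m <= 0:
        raise ValueError("ARR must be positive")
    return valuation_usd_m / arr_usd_m


def arr_required(valuation_usd_m: float, target_multiple: float) -> float:
    """ARR at which the given valuation equals `target_multiple` times ARR."""
    if target_multiple <= 0:
        raise ValueError("target multiple must be positive")
    return valuation_usd_m / target_multiple


reported_valuation = 11_000.0  # $11B under discussion in the reported talks, in $M
march_2026_arr = 600.0         # ~$600M ARR cited for March 2026, in $M
hypothetical_multiple = 10.0   # hypothetical comparison point, not from the source

implied = arr_multiple(reported_valuation, march_2026_arr)
needed = arr_required(reported_valuation, hypothetical_multiple)

print(f"implied multiple: {implied:.1f}x ARR")
print(f"ARR needed at {hypothetical_multiple:.0f}x: ${needed:,.0f}M")
```

On the reported figures the implied multiple is about 18.3x. Under the hypothetical 10x comparison, ARR would need to reach $1,100M for the same valuation, which is one way to read "the multiple resolves through growth".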

Headcount and execution scale-up

Baseten is in the low-hundreds-employee range with founder-led senior leadership and no separately disclosed CFO/CRO/CTO at time of writing. Scaling against hyperscaler-incumbent procurement and against well-funded pure-play rivals at $600M+ ARR is a known load-bearing risk; the executive-bench appointments through 2026 are a material watch-item.

Recent IM Coverage

  • AI Infrastructure — sector landing May 2026.
  • AI Tracker — methodology and universe May 2026.

Recent press coverage of Baseten

  • Jan 2026 — Baseten raises $300M Series E at $5B valuation, with $150M anchor from NVIDIA (TechCrunch)
  • Sep 2025 — Baseten Secures $150M Series D as the Premier Inference Platform for AI’s App Layer (BusinessWire)
  • May 2026 — Inference Firm Baseten Eyes Funding Round at $11 Billion Valuation (PYMNTS)
  • May 2026 — AI inference startup Baseten in talks to raise $1 billion at $11 billion valuation (Tech Startups)
  • Jan 2026 — Baseten Raises $300M at a $5B Valuation to Power a Multi-Model Future (HPCwire)
  • Sep 2025 — Baseten Series D: building the inference platform for AI’s app layer (Baseten Blog)

Source register for the figures on this page

IM operates a primary-source-where-possible discipline. The figures above come from:

  • Revenue: Baseten’s annualised revenue ramped from approximately $200M in December 2025 to approximately $600M in March 2026 per CEO Tuhin Srivastava’s No Priors / Latent Space podcast cycle and corroborating Tech Startups coverage. The Series D announcement disclosed the $200M annualised figure for the Q4 2025 mark.
  • Customer accounts: Baseten discloses serving hundreds of production customers, including Descript, Patreon and Writer, within a multi-category developer-and-enterprise base, across the Series D blog post and Series E announcement. A precise paid-customer count is not disclosed in primary sources; we decline to publish a specific figure pending company disclosure.
  • Headcount: Baseten does not publicly disclose precise headcount. LinkedIn-visible company-page data places the company in the low-hundreds range as of mid-2026; we decline to publish a precise figure and reference the careers page as the canonical entry point.
  • Funding to date: Cumulative external capital approximately $590M+ through the January 2026 $300M Series E at $5B valuation, led by IVP and CapitalG with NVIDIA as $150M anchor and BOND, Greylock and Spark Capital following on. The May 2026 $1B-at-$11B round reported by The Information had not closed at time of writing.

Methodology & Disclaimer

For metric definitions, source-tier hierarchy, and decline-to-publish rules, see the tracker methodology. Confidence dots (• green / • amber / • red) follow the same convention as the AI Tracker.

Spotted a figure you believe is wrong? Send corrections to info@informationmatters.net.

Information Matters Framework scores are the considered opinion of the IM team — human and AI — applied to publicly-available evidence under a disclosed methodology. They are not statements of fact about the companies scored and they are not investment advice.

Copyright © 2026 · Information Matters
