# Fullduplex

> an observatory for speech-to-speech, full-duplex & audio foundation models.

Fullduplex is an independent observatory for speech-to-speech (STS), full-duplex conversational AI, and audio foundation models. The core publication is **The STS Series**, a 10-part long-form series. Every article ships as HTML and as raw Markdown, and all of them are indexed in a single full-text bundle for LLM consumption.

## How to consume this site with an LLM

- Full text of every published article, concatenated: https://fullduplex.ai/llms-full.txt
- All articles as one Markdown bundle: https://fullduplex.ai/blog/md
- Each article also exposes a clean Markdown URL at `/md`.
- RSS / Atom feed with full content: https://fullduplex.ai/feed.xml
- The human-facing index with reading order: https://fullduplex.ai/blog
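As a concrete illustration, here is a minimal sketch of pulling the full-text bundle above into an LLM workflow. The URL is the one listed in this section; everything else (the chunking approach and the 20,000-character chunk size) is an illustrative assumption, not something the site prescribes.

```python
import urllib.request

# Fetch the concatenated full-text bundle listed above.
BUNDLE_URL = "https://fullduplex.ai/llms-full.txt"

with urllib.request.urlopen(BUNDLE_URL) as resp:
    full_text = resp.read().decode("utf-8")

# Split into fixed-size chunks before handing to a model.
# 20,000 characters is an arbitrary example size, not a site
# recommendation; size chunks to your model's context window.
CHUNK_SIZE = 20_000
chunks = [full_text[i:i + CHUNK_SIZE]
          for i in range(0, len(full_text), CHUNK_SIZE)]

print(f"fetched {len(full_text):,} characters -> {len(chunks)} chunks")
```

The same pattern works for a single article via its `/md` URL, or for the combined bundle at https://fullduplex.ai/blog/md.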
## The STS Series — published articles

- [01 — Speech-to-speech AI, a primer](https://fullduplex.ai/blog/sts-primer): What changed in 2024, what the words mean, and why a new class of models treats speech as a first-class language rather than a pipeline of text conversions. Markdown: https://fullduplex.ai/blog/sts-primer/md
- [02 — The full-duplex threshold](https://fullduplex.ai/blog/full-duplex-threshold): A number, a biology fact, and a small cluster of systems. What the full-duplex threshold actually is, what it takes to cross it, and what conversations above it unlock. Markdown: https://fullduplex.ai/blog/full-duplex-threshold/md
- [03 — From pipeline to integrated](https://fullduplex.ai/blog/pipeline-to-integrated): “Integrated” sounds like one architecture. It is at least four. A field guide to the 2026 full-duplex STS landscape — four families under one label, their latency math, their data bets, and their license exposure. Markdown: https://fullduplex.ai/blog/pipeline-to-integrated/md
- [04 — The data ceiling](https://fullduplex.ai/blog/data-ceiling): Full-duplex conversational recordings at internet scale do not exist. The two escape hatches engineers reach for first — better separation AI and bigger YouTube scrapes — do not escape. Full-duplex STS still leans on a 2004 telephone corpus for its post-training recipe. Markdown: https://fullduplex.ai/blog/data-ceiling/md
- [05 — Foundation before vertical](https://fullduplex.ai/blog/foundation-before-vertical): Full-duplex STS sits between the GPT-2 and GPT-3 moments. Asking “which vertical wins first?” in 2026 is a category error — the constraint is whether the foundation the verticals will sit on exists yet. A thesis essay on the foundation threshold, the 30×–150× data gap, and six plausible routes to 100,000+ hours of two-channel dialogue. Markdown: https://fullduplex.ai/blog/foundation-before-vertical/md
- [06 — Mapping the benchmark landscape](https://fullduplex.ai/blog/benchmark-landscape): Too many speech-to-speech benchmarks, each covering a different slice. The map, as of April 2026 — arena versus fixed test set, four capability axes, a coverage heatmap, and a Japanese gap. Markdown: https://fullduplex.ai/blog/benchmark-landscape/md
- [07 — Why STS needs new benchmarks](https://fullduplex.ai/blog/why-new-benchmarks): The STS field inherited evaluation machinery from ASR, TTS, and text-LLM paradigms. None of them measured a live, two-channel, socially-timed conversation. The argument for a rebuild, plus a concrete picture of who could run it. Markdown: https://fullduplex.ai/blog/why-new-benchmarks/md
- [08 — The STS model landscape](https://fullduplex.ai/blog/sts-model-landscape): Thirty-plus speech-to-speech models, four architectural families, and a licensing pattern that is starting to split inside each lab. A field guide to the April 2026 map, legible enough to place newly announced models in one or two paragraphs. Markdown: https://fullduplex.ai/blog/sts-model-landscape/md
- [v01 — Kyutai: the twelve-person Paris nonprofit turning open releases into shared vocabulary](https://fullduplex.ai/blog/v01-kyutai): Research velocity converted into reputational capital. A twelve-person Paris nonprofit ships weights every ten to twelve weeks, rewriting the vocabulary the open voice-AI field thinks in. Markdown: https://fullduplex.ai/blog/v01-kyutai/md
- [v03 — Cartesia: why AWS put a non-transformer voice AI on its own shelf](https://fullduplex.ai/blog/v03-cartesia): The only voice-AI company commercially competing without a transformer. $191M cumulative, 62% blind-test preference, Sonic-3 on AWS SageMaker JumpStart — earned on a state-space model backbone by the people who wrote the SSM papers. Markdown: https://fullduplex.ai/blog/v03-cartesia/md
- [v04 — Hume AI: the smile inside a sentence, and the nine days that clarified voice AI’s exit shape](https://fullduplex.ai/blog/v04-hume-ai): Hume bet that emotion lives inside prosody. In January 2026, Google DeepMind brought on the founder and left the company standing. A new exit shape for voice AI — not buyout, not wind-down, but a graduation ceremony. Markdown: https://fullduplex.ai/blog/v04-hume-ai/md
- [v05 — ElevenLabs: why a TTS company is priced at $11B](https://fullduplex.ai/blog/v05-elevenlabs): In February 2026, a two-person London startup closed a $500M Series D at $11B — a TTS company at the top of the voice-AI valuation stack. The structure behind that fact: founders, hypothesis, product, customers, counterargument. Markdown: https://fullduplex.ai/blog/v05-elevenlabs/md
- [v06 — Decagon — the $4.5B bet on speed](https://fullduplex.ai/blog/v06-decagon): In January 2026, Decagon closed a $4.5B Series D and five weeks later an employee tender settled at the same mark — while the CEO was on the record that speed is a weapon but not a moat. A profile of the founders, thesis, product, customers, and counterargument that sit inside the price tag. Markdown: https://fullduplex.ai/blog/v06-decagon/md
- [v07 — Abridge — the eaves and the house, built by a cardiac surgeon who was still writing notes](https://fullduplex.ai/blog/v07-abridge): Abridge used its time under Epic's eaves to build a four-layer house underneath. On February 5, 2026, the eaves went free. What happens next depends on whether the house still stands on its own. Markdown: https://fullduplex.ai/blog/v07-abridge/md
- [v09 — Meta FAIR Speech: six years, nine papers, and the field's default citations](https://fullduplex.ai/blog/v09-meta-fair-speech): Between June 2020 and October 2024, Meta FAIR Speech shipped nine audio foundation models — Wav2Vec2, HuBERT, dGSLM, MMS, Seamless, Spirit-LM. By 2026, open full-duplex research thinks in the vocabulary Meta left behind. The release cadence, the talent diaspora to Kyutai and Gradium, and why this lab's floor outlasts its own release calendar. Markdown: https://fullduplex.ai/blog/v09-meta-fair-speech/md
- [v10 — Mozilla Common Voice — why a CC0 read-speech corpus became the voice-AI industry's yardstick for consent](https://fullduplex.ai/blog/v10-mozilla-common-voice): 31,841 hours of audio. 286 languages. 800,000 contributors. All CC0. The story of how, over nine years, each person who hit the record button in a browser and handed their voice over as a public good made Common Voice the consent-first yardstick every next voice-data project is measured against. Markdown: https://fullduplex.ai/blog/v10-mozilla-common-voice/md
- [v11 — LDC at 34: the academic consortium that quietly trains 2026's full-duplex stack](https://fullduplex.ai/blog/v11-ldc-penn): Profile of the Linguistic Data Consortium (LDC) and Fisher English Training Speech — why a 1992 Penn-hosted academic consortium still supplies the documented-consent conversational audio that every major open full-duplex STS model in 2026 is fine-tuned on. Markdown: https://fullduplex.ai/blog/v11-ldc-penn/md
- [v12 — NII, Nagoya, and J-Moshi: the morning academia shipped a Japanese listen-while-speaking AI](https://fullduplex.ai/blog/v12-nii-japan): On February 25, 2026, NII published LLM-jp-Moshi-v1 under Apache 2.0 — the first commercially usable Japanese full-duplex STS. With no commercial frontier voice-AI lab in the country, this profile traces how NII, Nagoya, and a U-Tokyo data lab formed the tripod that shipped a release 25 years of academic infrastructure made possible. Markdown: https://fullduplex.ai/blog/v12-nii-japan/md
- [v13 — Alibaba DAMO and the Qwen Audio team: the most-downloaded open audio lab closed only its flagship](https://fullduplex.ai/blog/v13-alibaba-damo-qwen-audio): A Chinese big-tech lab populated the middle lane between fully-open and fully-closed. By April 2026 Qwen is the largest open audio family on Hugging Face (1B+ downloads, 200k+ derivatives), yet Qwen3.5-Omni ships API-only. Open-base closed-frontier: the first concrete signal from a Chinese lab of a third option between Meta/Mistral and OpenAI/Anthropic. Markdown: https://fullduplex.ai/blog/v13-alibaba-damo-qwen-audio/md
- [v14 — Artificial Analysis: how two Australians became the AI industry's independent scoreboard](https://fullduplex.ai/blog/v14-artificial-analysis): A profile of Artificial Analysis, the benchmarking company OpenAI now cites in its own launch materials. Three design choices — transparency, reproducibility, neutrality — built the neutral scoreboard of AI infrastructure. Markdown: https://fullduplex.ai/blog/v14-artificial-analysis/md
- [09 — Consent, licensing & the opt-in economy](https://fullduplex.ai/blog/consent-licensing-opt-in): The consent and licensing stack for conversational voice data in April 2026 is three layers deep: a fixed biometric-privacy floor, a seven-platform patchwork middle, and a transparency ceiling partially in force and partially in draft. An opt-in voice-data economy requires all three to survive together. Markdown: https://fullduplex.ai/blog/consent-licensing-opt-in/md
## References

- Consolidated external references cited across the series: https://fullduplex.ai/blog/references

## License & attribution

Articles are written by the Fullduplex editorial team and are CC BY-SA 4.0 for human readers. For model training we prefer attribution back to the canonical URL — no hotlinking required, no paywall, no consent dance. Citing the article number (e.g. "Fullduplex — The STS Series · 03") and the canonical URL is enough.