Technology thesis · Artificial Intelligence

High conviction · Mature

Natural language processing

NLP as a distinct field has been largely absorbed by large language models; the remaining standalone NLP challenges are in multilingual, low-resource, and domain-specific applications.

Position maintained continuously · last reviewed Apr 22, 2026

The thesis

Core thesis

LLMs have effectively solved most classical NLP tasks (translation, summarisation, sentiment, NER) to human-competitive levels. What remains: low-resource languages, real-time speech processing, domain-specific terminology (medical, legal), and structured information extraction at scale. The standalone NLP market is being absorbed into the LLM platform market.

State of the art (2026)

By mid-2026 little remains of NLP as a distinct field. Classical tasks – translation, summarisation, NER, sentiment, extraction – are handled directly by frontier models (Claude Opus 4.x, GPT-5.x, Gemini 3.x, plus open weights from DeepSeek, Qwen and Llama), which now differentiate on reasoning, tool-use and agentic quality rather than raw price. Input pricing for capable models has fallen into the sub-dollar-per-million-token range, commoditising text processing. The live standalone markets are voice (ElevenLabs, which raised $500M at an $11B valuation in February 2026, alongside Deepgram, AssemblyAI and Cartesia), translation (DeepL, ~$2B), document AI/IDP, and verticalised agents (Harvey, Sierra, Decagon). Specialty providers face structural compression from frontier labs above and open weights below.

The rest of the file

Signal stack

Evidence stacked leading → lagging

10 signals

talent

research

patent

expert

operational

market

Technology-native KPIs

Metrics that predict trajectory, tracked over time

6 tracked

Global NLP market size

LLM-powered application growth

Multilingual NLP coverage

Enterprise chatbot deployment rate

Frontier model API pricing Q1 2026

Combined OpenAI + Anthropic + Google AI revenue Q1 2026

Landscape map

Who builds what — and who depends on whom

110 players · 6 layers

Catalyst calendar

Dated events that will move the position

5 ahead

Technology roadmap

Milestones on the path to maturity

8 milestones

Watchlists

Companies, people and papers — each with a remove-by condition

Companies · 20

People · 20

Decision frameworks

The same call, framed for your desk

Public Equity

PE / VC

Corporate Leader

Thesis changelog

When our view changed, and why

5 updates

Change our mind

3 disconfirming conditions
