Timor Media Monitor

A four-language news intelligence platform for Timor-Leste: 100+ sources, AI-translated briefings, and entity-level search, built for embassies, NGOs, journalists, and the diplomatic community.

Why this exists

Timor-Leste publishes news in four languages, and almost nobody outside the country can read all of them. Tetun is the hardest target: commercial MT engines (Google Translate, DeepL) don't cover it well. Diplomatic missions, donor agencies, foreign press, and large NGOs are left with three poor options: hiring local readers, missing context entirely, or relying on translated PR feeds that lag by 24–48 hours and miss the vernacular debate. TMM closes that gap.

What we shipped

Multi-source ingest
Tetun ↔ English ↔ PT ↔ ID
AI-grade summaries
Entity-level search

The hardest part: Tetun MT

Tetun is a low-resource language. It has no Wikipedia corpus comparable in size to Portuguese's and no parallel corpus on the scale of those available for Indonesian, so off-the-shelf MT models fall back to Latin transliteration or guesswork. We solved this by building our own pipeline at tetumdili.com: a glossary-grounded translation system that combines Claude reasoning with a 17k+ entry Tetun corpus and a lint pass that catches the most common machine-translation mistakes (Portuguese false cognates, dropped clitics, and English calque constructions). The pipeline runs inside TMM and is also available independently as a free public translator.
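To give a feel for what a glossary-backed lint pass might look like, here is a minimal sketch. It is not the tetumdili.com implementation: the `lint_tetun` function, the `LintIssue` type, and the three-entry glossary are all hypothetical, and the single rule shown (flagging Portuguese spellings where the glossary prefers a Tetun form) is only a stand-in for the richer checks described above.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Illustrative entries only (not taken from the real corpus):
# Portuguese spellings mapped to the Tetun form a glossary would prefer.
PT_TO_TETUN = {
    "governo": "governu",
    "escola": "eskola",
    "ministério": "ministériu",
}


@dataclass(frozen=True)
class LintIssue:
    rule: str        # name of the rule that fired
    found: str       # the word as it appeared in the draft
    suggestion: str  # glossary-preferred replacement
    position: int    # character offset of the word in the draft


def lint_tetun(text: str, glossary: dict[str, str] = PT_TO_TETUN) -> list[LintIssue]:
    """Flag words in a draft translation that the glossary says to replace."""
    issues = []
    for match in re.finditer(r"\w+", text):
        word = match.group(0)
        preferred = glossary.get(word.lower())
        if preferred is not None:
            issues.append(
                LintIssue("unadapted-portuguese", word, preferred, match.start())
            )
    return issues
```

Keeping the rules data-driven, as assumed here, means a larger glossary can be dropped in without touching the checking code, and each issue carries enough context to be fed back into a second translation pass or shown to a human reviewer.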

Architecture, in one paragraph

A scheduled crawler polls each ingest source on its own cadence (via RSS, sitemap, or HTML scraping) and pushes new items into a Postgres queue. A worker pool translates each item through tetumdili.com into the three other working languages, runs a Claude summarisation pass with a domain-aware system prompt, and indexes entities (people, ministries, places, companies) into a searchable graph. The frontend is a Next.js dashboard with entity feeds, language toggles, and saved searches. Everything runs on the same Hetzner fleet box as the rest of OniT.
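The worker stage of that flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the production code: an in-memory `deque` stands in for the Postgres queue, a plain dictionary stands in for the entity graph, and `Item`, `process_queue`, and the injected `translate`, `summarise`, and `extract_entities` callables are hypothetical names for the translation, summarisation, and entity-indexing steps.

```python
from __future__ import annotations

from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Callable, Iterable

# The four working languages: Tetun, English, Portuguese, Indonesian.
LANGUAGES = ("tet", "en", "pt", "id")


@dataclass
class Item:
    item_id: str
    source_lang: str
    text: str
    translations: dict[str, str] = field(default_factory=dict)
    summary: str = ""


def process_queue(
    queue: deque[Item],
    translate: Callable[[str, str, str], str],           # (text, source, target) -> text
    summarise: Callable[[str], str],                     # text -> summary
    extract_entities: Callable[[str], Iterable[str]],    # text -> entity names
    index: dict[str, list[str]],                         # entity -> item ids
) -> list[Item]:
    """Drain the queue: translate, summarise, and index each item."""
    processed = []
    while queue:
        item = queue.popleft()
        # Translate into every working language except the source one.
        for lang in LANGUAGES:
            if lang != item.source_lang:
                item.translations[lang] = translate(item.text, item.source_lang, lang)
        item.summary = summarise(item.text)
        # Record which items mention each entity so feeds can be built per entity.
        for entity in extract_entities(item.text):
            index[entity].append(item.item_id)
        processed.append(item)
    return processed


# Usage with trivial stand-ins for the real services.
queue = deque([Item("a1", "tet", "Governu loke eskola foun")])
index: dict[str, list[str]] = defaultdict(list)
done = process_queue(
    queue,
    translate=lambda text, src, dst: f"[{dst}] {text}",
    summarise=lambda text: text[:7],
    extract_entities=lambda text: ["Governu"],
    index=index,
)
```

Passing the three steps in as callables keeps the sketch runnable without network access and makes each stage easy to swap or test in isolation.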

Who it's for

Diplomatic missions tracking the political discourse around energy, fisheries, and ASEAN accession.
Development agencies (UN, DFAT, MDF, Asia Foundation, EU) tracking their sector indicators in real time.
Foreign press who need first-language source material translated quickly and reliably.
NGOs and civil society tracking accountability and human-rights coverage across Tetun-language outlets.

What's next

Public beta access for vetted institutional users in 2026. A paid tier for embassies and corporate compliance teams in 2027. We also plan to open the tetumdili.com translator further to developers: a free public version already exists, and the next step is a documented API for batch jobs.
