OniT Enterprises · AI & News Intelligence
Timor Media Monitor
A 4-language news intelligence platform for Timor-Leste — 100+ sources, AI-translated briefings, and entity-level search, built for embassies, NGOs, journalists, and the diplomatic community.

Why this exists
Timor-Leste publishes news in four languages and almost nobody outside the country can read all of them. Tetun is the hardest target — commercial MT engines (Google Translate, DeepL) don't cover it well. Diplomatic missions, donor agencies, foreign press, and large NGOs are stuck either hiring local readers, missing context entirely, or relying on translated PR feeds that lag by 24–48 hours and miss the vernacular debate. TMM closes that gap.
What we shipped
Multi-source ingest
100+ Timorese and regional news sources — Tatoli, Independente, GMN, Tempo Semanal, TLNA, Diariu Nasional, plus regional outlets in PT, EN, and ID. RSS where available, scraped where not.
Tetun ↔ English ↔ PT ↔ ID
Real-time translation across all four working languages of Timor-Leste, powered by our own Tetun MT pipeline at tetumdili.com. No vendor lock-in, no per-character pricing.
AI-grade summaries
Each story is summarised by Claude with a domain-aware prompt covering geopolitics, energy, fisheries, justice, public health, and education. The output is fact-checked against the quoted source text.
Entity-level search
Track a Minister, a Ministry, a company, or a topic across every TL outlet at once. Built for embassies, donor agencies, and journalists who need signal, not noise.
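As a rough illustration of what entity-level search across outlets involves, the sketch below keeps an in-memory map from a normalised entity name to every article that mentions it. The `Article` and `EntityIndex` names, the normalisation rule, and the example URLs are assumptions made for this sketch. They are not taken from the TMM codebase, and a production index would need alias handling and persistent storage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    outlet: str    # e.g. "Tatoli"
    url: str
    language: str  # "tet", "pt", "en", or "id"
    title: str


class EntityIndex:
    """Maps a normalised entity name to every article that mentions it."""

    def __init__(self) -> None:
        self._by_entity = defaultdict(list)

    @staticmethod
    def _key(name: str) -> str:
        # Assumed normalisation: lowercase and collapse runs of whitespace.
        return " ".join(name.lower().split())

    def add(self, article: Article, entities: list[str]) -> None:
        """Register an article under each entity it mentions."""
        for name in entities:
            bucket = self._by_entity[self._key(name)]
            if article not in bucket:
                bucket.append(article)

    def search(self, name: str) -> list[Article]:
        """Return every indexed article that mentions the entity."""
        return list(self._by_entity.get(self._key(name), []))

    def outlets_for(self, name: str) -> list[str]:
        """Return the sorted outlet names that covered the entity."""
        return sorted({a.outlet for a in self.search(name)})


index = EntityIndex()
index.add(
    Article("Tatoli", "https://example.org/a1", "tet", "Placeholder headline A"),
    ["Ministry of Finance"],
)
index.add(
    Article("GMN", "https://example.org/a2", "pt", "Placeholder headline B"),
    ["ministry of  finance", "Example Co"],
)
print(index.outlets_for("Ministry of Finance"))  # ['GMN', 'Tatoli']
```

Keying on a normalised name is the simplest way to merge mentions across outlets. It does not merge the same entity written in different languages, which would need an alias table on top.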
The hardest part: Tetun MT
Tetun is a low-resource language. It has no Wikipedia corpus on the scale of Portuguese's and no parallel-corpus dataset on the scale of Indonesian's. Off-the-shelf MT models fall back to Latin transliteration or guesswork. We solved this by building our own pipeline at tetumdili.com: a glossary-grounded translation system that combines Claude reasoning with a 17k+ entry Tetun corpus and a lint pass that catches the most common machine-translation mistakes (false-cognate Portuguese, dropped clitics, English calque constructions). It is used inside TMM and is also available on its own as a free public translator.
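The two ideas in that paragraph, grounding the prompt in a glossary and linting the draft afterwards, can be sketched in a few lines. Everything here is an assumption for illustration: the two glossary entries, the single lint rule (a Portuguese spelling left unadapted, standing in for the false-cognate and calque checks), the prompt wording, and the function names. None of it reproduces the tetumdili.com corpus, rules, or prompts.

```python
import re
from dataclasses import dataclass

# Illustrative entries only; the real corpus is not reproduced here.
GLOSSARY = {
    "government": "governu",
    "minister": "ministru",
}


@dataclass(frozen=True)
class LintRule:
    name: str
    pattern: re.Pattern
    hint: str


# Hypothetical rule: a Portuguese spelling left unadapted in Tetun output.
LINT_RULES = [
    LintRule(
        name="unadapted-portuguese",
        pattern=re.compile(r"\bgoverno\b", re.IGNORECASE),
        hint="use 'governu'",
    ),
]


def glossary_hits(source_text: str) -> dict[str, str]:
    """Return glossary entries whose English term appears in the source."""
    words = set(re.findall(r"[a-z]+", source_text.lower()))
    return {en: tet for en, tet in GLOSSARY.items() if en in words}


def build_prompt(source_text: str) -> str:
    """Compose a translation prompt that pins terminology to the glossary."""
    hits = sorted(glossary_hits(source_text).items())
    lines = [f"- {en} => {tet}" for en, tet in hits]
    terms = "\n".join(lines) if lines else "- (no glossary matches)"
    return (
        "Translate the text from English to Tetun.\n"
        "Use these glossary terms exactly:\n"
        f"{terms}\n\n"
        f"Text: {source_text}"
    )


def lint(translation: str) -> list[str]:
    """Return one message per rule that matches the draft translation."""
    return [
        f"{rule.name}: {rule.hint}"
        for rule in LINT_RULES
        if rule.pattern.search(translation)
    ]


print(build_prompt("The government announced a budget."))
print(lint("governo"))
```

In this sketch the prompt string would go to the model and `lint` would run on whatever comes back. An empty list means no rule fired, not that the translation is correct.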
Architecture, in one paragraph
A scheduled crawler hits each ingest source on its own cadence (via RSS, sitemap, or scraping) and pushes new items into a Postgres queue. A worker pool translates each item through tetumdili.com into the three other working languages, runs a Claude summarisation pass with a domain-aware system prompt, and indexes entities (people, ministries, places, companies) into a searchable graph. The frontend is a Next.js dashboard with entity feeds, language toggles, and saved searches. Everything runs on the same Hetzner fleet box as the rest of OniT.
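The flow described above can be sketched as a per-source cadence check plus a worker that runs each queued item through three stages. This is a minimal sketch under stated assumptions: a `deque` stands in for the Postgres queue, the `translate`, `summarise`, and `extract_entities` callables are hypothetical placeholders for the tetumdili.com and Claude calls, and the `Source` and `Item` shapes are invented for illustration. Which language version feeds the summary is not stated in the description, so the sketch simply uses the original text.

```python
from collections import deque
from dataclasses import dataclass, field

LANGUAGES = ("tet", "en", "pt", "id")


@dataclass
class Source:
    name: str
    interval_s: int          # crawl cadence for this source, in seconds
    last_crawled: float = 0  # epoch seconds of the last crawl

    def is_due(self, now: float) -> bool:
        """True once the source's own interval has elapsed."""
        return now - self.last_crawled >= self.interval_s


@dataclass
class Item:
    source: str
    language: str
    text: str
    translations: dict[str, str] = field(default_factory=dict)
    summary: str = ""
    entities: list[str] = field(default_factory=list)


def process(item, translate, summarise, extract_entities):
    """Run one item through translation, summarisation, and entity tagging."""
    for target in LANGUAGES:
        if target != item.language:
            item.translations[target] = translate(item.text, item.language, target)
    # Assumption: summarise and tag the original text.
    item.summary = summarise(item.text)
    item.entities = extract_entities(item.text)
    return item


def drain(queue, translate, summarise, extract_entities):
    """Process queued items in arrival order until the queue is empty."""
    done = []
    while queue:
        done.append(process(queue.popleft(), translate, summarise, extract_entities))
    return done


# Stub stages so the sketch runs without any external service.
queue = deque([Item("Tatoli", "tet", "placeholder text")])
results = drain(
    queue,
    translate=lambda text, src, dst: f"[{src}->{dst}] {text}",
    summarise=lambda text: text[:11],
    extract_entities=lambda text: ["Example Ministry"],
)
print(sorted(results[0].translations))  # ['en', 'id', 'pt']
```

Passing the stages in as callables keeps the worker testable without network access. A real worker pool would also need retries and a way to claim queue rows without two workers taking the same item.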
Who it's for
- Diplomatic missions tracking the political discourse around energy, fisheries, and ASEAN accession.
- Development agencies (UN, DFAT, MDF, Asia Foundation, EU) tracking their sector indicators in real time.
- Foreign press who need first-language source material translated quickly and reliably.
- NGOs and civil society tracking accountability and human-rights coverage across Tetun-language outlets.
What's next
- Public beta access for vetted institutional users in 2026.
- A paid tier for embassies and corporate compliance teams in 2027.
- Opening the tetumdili.com translator further to developers. A free public version already exists; the next step is a documented API for batch jobs.
Stack