ChindaMT — open.iapp

What it is

ChindaMT is an open Thai-English translator that follows instructions in your prompt.

Most translators just give you a fluent target sentence. Real translation work is more demanding — you need a translation that respects your terminology, your format, your length budget, your register (formal, casual, legal, marketing). ChindaMT does that.

It comes in three sizes — 4B, 2B, 0.8B parameters — so you can pick the one that fits your hardware.

Why it matters

Today’s open Thai-language tools force you to choose:

Thai LLMs (like Typhoon) follow instructions but translate clumsily.
MT specialists (like Hunyuan-MT, NLLB) translate well but ignore prompt rules.

ChindaMT is the first open model that does both for Thai-English. You can ask it:

“Translate this product description into Thai. Use formal language. Keep all numbers and product codes exactly. Do not include any disclaimers.”

…and it actually follows all four rules at once.
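A prompt like the one above can also be assembled programmatically. The sketch below is only an illustration: the `build_prompt` helper, the bullet layout, and the sample product sentence are assumptions, not a template prescribed by ChindaMT, so check the model card for the recommended prompt format.

```python
def build_prompt(source_text, task, rules):
    """Join a task sentence, explicit rules, and the source text into one prompt.

    Hypothetical helper: the layout is an assumption, not ChindaMT's official template.
    """
    rule_lines = [f"- {rule}" for rule in rules]
    # Task first, then one rule per line, a blank line, then the text to translate.
    return "\n".join([task, *rule_lines, "", source_text])


prompt = build_prompt(
    source_text="The AX-200 blender ships with a 1.5 L jar and a 2-year warranty.",
    task="Translate this product description into Thai.",
    rules=[
        "Use formal language.",
        "Keep all numbers and product codes exactly.",
        "Do not include any disclaimers.",
    ],
)
print(prompt)
```

Keeping the rules as a list makes it easy to add or drop one constraint at a time and see how the translation changes.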

How we built it

Two stages, both built on top of an existing open base model (we use Qwen3.5).

Stage 1: pick the data that’s worth learning from. We started with 17.85 million Thai-English sentence pairs. Most are too easy or too noisy to teach the model anything. We scored each pair by how much an instruction helps the model predict the output, then kept roughly the hardest 10% that is still learnable: 1.75M cherry-picked pairs.
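One way to picture this selection step is sketched below. It is a guess at the shape of the procedure, not the actual pipeline: it assumes the score is the drop in per-token negative log-likelihood (NLL) of the reference when the instruction is added, treats "still learnable" as an NLL ceiling, and uses hypothetical names and thresholds (`ScoredPair`, `keep_fraction`, `max_nll`).

```python
from dataclasses import dataclass


@dataclass
class ScoredPair:
    pair_id: str
    nll_plain: float       # NLL of the reference given the source only (assumed precomputed)
    nll_instructed: float  # NLL of the reference given source plus instruction


def instruction_gain(pair):
    """How much the instruction helps: a larger drop in NLL means a larger gain."""
    return pair.nll_plain - pair.nll_instructed


def select_pairs(pairs, keep_fraction=0.10, max_nll=4.0):
    """Keep the top fraction by instruction gain among pairs under the NLL ceiling.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    learnable = [p for p in pairs if p.nll_instructed <= max_nll]
    ranked = sorted(learnable, key=instruction_gain, reverse=True)
    keep = max(1, round(len(pairs) * keep_fraction))
    return ranked[:keep]


pairs = [
    ScoredPair("easy", 1.0, 0.9),    # instruction barely helps
    ScoredPair("useful", 3.0, 1.5),  # instruction helps a lot, still predictable
    ScoredPair("noisy", 9.0, 6.0),   # large gain but above the ceiling
    ScoredPair("mild", 2.0, 1.6),
]
print([p.pair_id for p in select_pairs(pairs, keep_fraction=0.25)])  # ['useful']
```

The ceiling is what stops very noisy pairs from dominating the ranking, since they can show a large gain while remaining hard to predict.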

Stage 2: turn each pair into a rules-based training example. For each cherry-picked pair, we extracted every rule the existing reference translation already satisfies (“uses formal Thai,” “preserves all numbers,” “is exactly one sentence,” and so on). Because the rules come from a real human translation, every rule is achievable by definition. We then regenerated the translation under those rules and kept only candidates that pass every single rule. Final: 1.97M rule-following training examples.
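The extract-then-filter idea described above can be sketched with a few toy rule checkers. Everything here is illustrative: the rule catalog, the regex heuristics, and the function names are assumptions, and the real pipeline presumably uses a much richer rule set. The one-sentence check in particular only makes sense for English output.

```python
import re


def numbers_in(text):
    """Return the sorted list of numbers (digits with optional decimals) in text."""
    return sorted(re.findall(r"\d+(?:\.\d+)?", text))


def preserves_numbers(source, output):
    return numbers_in(source) == numbers_in(output)


def is_one_sentence(source, output):
    # Count terminal punctuation followed by whitespace or end of string.
    return len(re.findall(r"[.!?]+(?=\s|$)", output.strip())) == 1


def no_exclamation(source, output):
    # Crude stand-in for a register rule.
    return "!" not in output


# Hypothetical rule catalog: rule name -> checker(source, output).
RULE_CATALOG = {
    "preserves all numbers": preserves_numbers,
    "is exactly one sentence": is_one_sentence,
    "has no exclamation marks": no_exclamation,
}


def extract_rules(source, reference):
    """Keep only the rules the reference translation already satisfies."""
    return [name for name, check in RULE_CATALOG.items() if check(source, reference)]


def filter_candidates(source, candidates, rules):
    """Keep candidates that pass every extracted rule."""
    return [c for c in candidates if all(RULE_CATALOG[r](source, c) for r in rules)]


source = "รุ่น X-200 ราคา 1200 บาท"
reference = "The X-200 model costs 1200 baht."
rules = extract_rules(source, reference)
candidates = [
    "The X-200 model costs 1200 baht.",
    "The X-200 costs 1,200 baht!",
    "The X-200 model is priced at 1200 baht. Great value.",
]
kept = filter_candidates(source, candidates, rules)
print(rules)
print(kept)
```

Because the rules are derived from the reference itself, at least one valid answer always exists, which is what makes the strict all-rules filter reasonable.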

We fully fine-tuned each base model on this dataset for one pass. No reinforcement learning. No preference tuning.

Main results

The recipe has to clear two bars:

Beat its own base model. Does the ChindaMT-IF training actually add Thai translation skill, or is the base good enough already?
Beat the best existing open Thai translator. Is ChindaMT better than what people can already use today?

Both bars are cleared. The win is bigger where it matters most: when the prompt has rules the translation needs to follow.

We measure with length-controlled win-rate (LC%). For every test prompt, both translators produce an output, an LLM judge picks the better one, and we correct for the judge’s known bias toward longer answers. 0 LC% means a tie. +10 LC% means ChindaMT is preferred 10 points more often than the opponent. Higher = better.
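To make the scale concrete, here is a minimal sketch of the uncorrected preference margin, read as the share of wins minus the share of losses in percentage points. That reading, the `preference_margin` name, and the verdict labels are assumptions. The length correction is not specified in this description and is not reproduced, so the function returns the raw margin rather than LC%.

```python
def preference_margin(verdicts):
    """Return wins minus losses as percentage points over all judged prompts.

    verdicts: list of "win", "loss", or "tie" from ChindaMT's side (hypothetical labels).
    No length correction is applied, so this is the raw margin, not LC%.
    """
    if not verdicts:
        raise ValueError("need at least one verdict")
    wins = verdicts.count("win")
    losses = verdicts.count("loss")
    return 100.0 * (wins - losses) / len(verdicts)


# Illustrative counts only (not reported results): 220 wins, 180 losses on 400 prompts.
example = ["win"] * 220 + ["loss"] * 180
print(preference_margin(example))  # 10.0
```

An even split gives 0, matching the "0 means a tie" convention.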

ChindaMT-4B head-to-head

Compared against	Plain translation	Following prompt rules
Its own base model — Qwen3.5-4B (open)	ChindaMT wins	ChindaMT wins by a larger margin
Top open Thai translation specialist at 4B	+11.8 LC% for ChindaMT	+18.4 LC% for ChindaMT

The pattern holds at smaller sizes:

ChindaMT-2B beats both Qwen3.5-2B (its base) and the top open Thai 2B translation specialist.
ChindaMT-0.8B beats both Qwen3.5-0.8B (its base) and the top open Thai 0.8B specialist.

Same recipe at every size — we did not retune hyperparameters per model. The same recipe also transfers to a different base-model generation: applied unchanged to Qwen3-4B (a generation older than Qwen3.5), it still produces consistent gains.

Cross-checks

We checked the wins three more ways so they don’t depend on a single judge:

Three automatic translation-quality metrics: CometKiwi, GEMBA-DA, GEMBA-MQM. ChindaMT wins on all three.
A different judge family: we re-ran the headline 4B comparison with Gemini 3 Flash as the judge instead of the open 35B judge model. Same verdict.
A second benchmark — WMT24++ en-th. Same pattern.

What you can use

Models: full open weights on Hugging Face, all under Apache-2.0.

Evaluation suites — paired test prompts for reproducing or extending the comparison:

ChindaMT-CoreEval — 5 deployment-grade domains, 400 paired samples
ChindaMT-BroadEval — 10 domains, 400 paired samples (cross-domain generalization)

The full training recipe is specified in the paper. Anyone with the source corpora can rebuild it.

The fastest way to feel what “translation that follows the rules” actually does: open the ChindaMT-4B model card and try a few prompts with rules in them.

ChindaMT: Open-weight Thai-English translation that follows the rules