
Translation Services and the Benefit of Term Extraction

Apr 04, 2016

Extracting terminology from a text prior to translation allows translators to create a language-specific and even job-specific glossary or translation memory database. This greatly increases the speed of translation and reduces the cost of your projects.

Types of Term Extraction

The two main types of term extraction are manual and automatic. Manual term extraction is just what it sounds like: a translator goes through the text, catalogues words, and prepares translations. In automatic term extraction, a computer scans the document and, based on preset parameters, extracts words or phrases quickly and efficiently.

As you may expect, there are pros and cons to both approaches. Automatic term extraction seems like the obvious choice, but it has some serious limitations. For example, the computer does not distinguish between lexical forms (nouns, verbs, adjectives, etc.) or contextual variations. If you were looking at the word “run,” every occurrence would be extracted regardless of how it was used.
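To make this limitation concrete, here is a minimal sketch of a frequency-based extractor in Python. The function name `extract_terms`, the tiny stop-word list, and the `min_count` threshold are all assumptions made for illustration, not the behavior of any particular commercial tool.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "in", "for", "she"}

def extract_terms(text, min_count=2):
    """Return (word, count) pairs for words seen at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [(w, n) for w, n in counts.most_common() if n >= min_count]

sample = (
    "The tests run every night. She went for a run. "
    "A run of bad luck followed the first production run."
)

# "run" appears as a verb and as a noun with several senses,
# but the extractor only sees one string and one count.
terms = extract_terms(sample)
print(terms)
```

The single entry for “run” hides the fact that the word is used as a verb once and as a noun in three different senses, which is exactly the information a translator needs.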

Automatic term extraction is often combined with the “find and replace” feature of computer translation. When this is done with a word like “run,” however, your text will likely be filled with errors. The word has so many forms and senses, none of which the extractor distinguishes, that an automated extraction would not even be helpful in creating a termbase or glossary.
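The following Python sketch shows how a blind find and replace goes wrong. The one-entry termbase, which maps English “run” to the Spanish infinitive “correr,” and the function name `naive_replace` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical one-entry termbase: English "run" -> Spanish verb "correr".
TERMBASE = {"run": "correr"}

def naive_replace(text, termbase):
    """Replace every whole-word match, ignoring part of speech and sense."""
    for source, target in termbase.items():
        text = re.sub(rf"\b{re.escape(source)}\b", target, text)
    return text

sample = "I run daily. The play had a long run."

# Both occurrences receive the same infinitive, even though the first
# needs a conjugated verb and the second is a noun.
result = naive_replace(sample, TERMBASE)
print(result)
```

Neither replacement is usable as it stands, so a human still has to review every occurrence.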

A variation on the automatic term extractor is the concordance. Concordance software extracts every usage of a particular term and shows each one in relation to the text around it. This allows translators to better determine the word form and context. They can then update the translation memory database prior to running the machine translation.
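A concordance of this kind is often displayed as a keyword-in-context listing. The Python sketch below is one simple way to build such a listing. The function name `concordance` and the `width` parameter (the number of context words shown on each side) are assumptions for this example, and real concordance tools offer far more.

```python
import re

def concordance(text, term, width=2):
    """Return each occurrence of term with `width` words of context per side."""
    words = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == term.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Mark the keyword with brackets so it stands out in the listing.
            lines.append(f"{left} [{word}] {right}".strip())
    return lines

sample = (
    "The tests run every night. She went for a run. "
    "The first production run failed."
)

hits = concordance(sample, "run")
for line in hits:
    print(line)
```

This sketch ignores sentence boundaries, so the context window can spill into the next sentence. Even so, seeing the neighboring words is usually enough for a translator to tell the verb from the noun.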

Note that both automatic extraction and the concordance require a human to sort through the terms in order to separate the many variations and forms of words and phrases.

The advantage of manual term extraction is obvious. The person extracting the terms sees each word in context and can assign the proper translation on the spot. The drawback is that having a human go through each and every word of the document is much the same as having a human translate it by hand. While there may be some time savings, they are unlikely to be large.

The tradeoff between the two forms of term extraction depends on the length of the text. For shorter texts, a hands-on approach generally comes out ahead. For longer texts, the computer is often the more efficient choice, even with a human post-edit.

Translation Benefits of Term Extraction

If the benefits of term extraction do not seem clear, consider the following:

Today, most commercial translation and localization projects are carried out without a comprehensive, project-specific, up-to-date glossary in place. Important items that may appear in a project but not in standard glossaries include:

client’s business name
product names
trademarks
idiomatic expressions
neologisms
buzzwords

Further, consider the number of new genes, chemical compounds, drugs, and so forth that may be found in a medical or scientific journal, paper, or book. New information, and the new words coined to explain it, are appearing faster than commercial databases can keep up. A concordance will help extract these words and prepare them for translation.

Some terms may be so important that getting them wrong could lead not only to great embarrassment but also to the failure of a project, the loss of a contract, or even a loss of trust and reputation. Seeing them in context helps ensure they are properly translated.

Term extraction prior to the start of translation isolates high-use words and phrases, new words and phrases, and essential terms. Term extraction also allows the entire translation team to take a common approach and use a unified vocabulary. This not only improves the overall flow of the document or project but also cuts down on the time needed for post-translation review and correction.
