TTC: Terminology Extraction, Translation Tools and Comparable...

About the TTC: Terminology Extraction, Translation Tools and Comparable Corpora Group

Computational applications in the translation field suffer from the terminology bottleneck. This applies to both CAT (computer-assisted translation) and MT (machine translation). Bilingual term-related resources are lacking, especially for new or emerging domains. Moreover, current automatic approaches are hampered by the scarcity of parallel corpora.

The TTC project explores term extraction from comparable corpora, i.e. from texts of the same domain (and possibly genre) in different languages which are not translations of each other. TTC develops techniques for extracting monolingual term candidates and their contexts for DE, EN, ES, FR, LV, RU and ZH. In a second step, the monolingual data are aligned to identify equivalence candidates; TTC explores different symbolic and statistical procedures for this purpose. The outputs of the TTC tool chain are single-word terms, multiword terms and their equivalents, as well as contextual data.

The tools will be provided as a standalone package and as a web service. They will include tools for corpus crawling and corpus management, for monolingual term candidate extraction and for term alignment. Integration with EuroTermBank and with selected CAT tools and MT systems will be provided.

About this Group

  • Created: December 8, 2010
  • Type: Networking Group
  • Members: 315