TranSphere Multilingual Media Speech Machine Translation MT and TM

TranSphere offers Machine Translation (MT) for more than a dozen language pairs; Multilingual Information Retrieval with Query and Topic Search capabilities; Name-Finding applications; and Integrated Suites providing Speech Recognition and Translation (e.g. for Broadcast News). APIs and SDKs are also available for integration and development. All TranSphere products are on GSA Advantage!®. The products cover Arabic, Korean, Japanese, Chinese, Persian/Dari, Pashto, and Turkish. AppTek also provides a line of European language pairs with English: German, Spanish, French, Polish, Italian, Russian, Portuguese, Ukrainian, Hebrew, and Dutch. Urdu/English translation capabilities are currently under development.



SpeechTrans

The challenge of understanding spoken language was the main incentive behind the development of SpeechTrans, in which AppTek has integrated its own Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) with Machine Translation (MT). SpeechTrans, which allows for real-time dynamic speech-to-speech machine translation, is deployed on computers, wearable machines, and telephony servers.

SpeechTrans is designed for:

1. Telephone-to-Telephone Machine Translation
2. PC-based Dictation Systems
3. Handheld Devices

SpeechTrans recognizes spoken utterances, e.g. in Arabic dialects, and translates them into English text, which is then synthesized and output. The input is recorded through a telephone channel or microphone and recognized by one of several ASR systems, each capable of recognizing either the source language or English and tuned to either microphone-quality or telephone-quality speech. The recognized utterances are normalized using statistical MT based on finite-state automata. The normalized output is then translated by a hybrid MT system that combines statistical and rule-based features. This hybrid Interlingua approach provides better results for speech input than direct statistical MT.
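The flow described above (recognition, normalization, translation, synthesis) can be pictured as a chain of four stages. The following is only a minimal sketch under stated assumptions: the class name, the stage names, and the toy stand-in functions are hypothetical and are not AppTek's API; real ASR, MT, and TTS engines would take the place of the stubs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechPipeline:
    """Hypothetical container for the four stages named in the text."""
    recognize: Callable[[bytes], str]   # ASR: audio -> raw transcript
    normalize: Callable[[str], str]     # cleanup of the recognized utterance
    translate: Callable[[str], str]     # MT: source text -> English text
    synthesize: Callable[[str], bytes]  # TTS: English text -> audio

    def run(self, audio: bytes) -> bytes:
        transcript = self.recognize(audio)
        cleaned = self.normalize(transcript)
        english = self.translate(cleaned)
        return self.synthesize(english)


# Toy stand-ins so the sketch runs; none of these reflect a real engine.
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_normalize(text: str) -> str:
    fillers = {"uh", "um"}  # assumed disfluencies to strip
    return " ".join(w for w in text.split() if w not in fillers)


def toy_translate(text: str) -> str:
    lexicon = {"marhaba": "hello", "shukran": "thank you"}  # illustrative only
    return " ".join(lexicon.get(w, w) for w in text.split())


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechPipeline(toy_asr, toy_normalize, toy_translate, toy_tts)
result = pipeline.run(b"uh marhaba um shukran")
```

Keeping each stage behind a plain callable makes it straightforward to swap in a recognizer tuned for telephone or microphone input without touching the rest of the chain.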

The Hybrid Approach

Compared with written language, speech (especially when spontaneous) poses additional difficulties for automatic MT. Typically, these difficulties are caused by errors in the recognition process, which is carried out before translation. As a result, the sentence to be translated is not necessarily well-formed from a syntactic point of view. Even without recognition errors, speech translation has to cope with a lack of conventional syntactic structures, because the structures of spontaneous speech differ from those of written language.

A prime motivation for a hybrid MT system is to take advantage of the strengths of both rule-based and statistical approaches while mitigating their weaknesses. For example, we want a rule that covers a rare word combination or construction to take precedence over statistics that were derived from sparse data (and are thus not very reliable). Additionally, rules covering long-distance dependencies and embedded structures should be weighted favorably, since these constructions are more difficult to process in statistical MT.

Figure 1: Concept of Speech-to-Speech Translation

Conversely, we would like the statistical approach to take precedence where large numbers of relevant dependencies are available, where novel input is encountered, or where high-frequency word combinations occur. One aspect that is extremely important with regard to the distillation engine is the weakness that statistical MT sometimes shows in informativeness (the accurate translation of information), owing to the influence of the target-language model. For example, single words that make a disproportionately heavy contribution to informativeness, such as terms indicating negation or important content words, may be missing from the output.
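The precedence idea described above can be sketched as a simple arbitration step. This is an illustration only, not the actual hybrid mechanism: the function name, the `evidence_count` input, and the `min_evidence` threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    text: str
    source: str  # "rule" or "statistical"


def choose_translation(rule_output: Optional[str],
                       stat_output: str,
                       evidence_count: int,
                       min_evidence: int = 5) -> Candidate:
    """Pick between a rule-based and a statistical translation.

    rule_output    -- translation from a matching rule, or None if no rule fired
    stat_output    -- translation proposed by the statistical module
    evidence_count -- assumed count of training examples behind stat_output
    min_evidence   -- hypothetical threshold below which statistics count as sparse
    """
    # A rule that covers the input wins when the statistics rest on sparse data.
    if rule_output is not None and evidence_count < min_evidence:
        return Candidate(rule_output, "rule")
    # Otherwise (plenty of data, or novel input no rule covers) use statistics.
    return Candidate(stat_output, "statistical")


rare = choose_translation("do not enter", "enter", evidence_count=1)
frequent = choose_translation("good morning to you", "good morning", evidence_count=500)
novel = choose_translation(None, "hello there", evidence_count=0)
```

A production system would more likely blend the two modules through weights than through a hard switch; the hard threshold here only makes the precedence rule easy to see.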

Statistical MT Module

Our statistical MT is a finite-state transducer using alignment templates. Compared to traditional statistical MT systems, this method has the advantage of learning translations of phrases, not just individual words, which permits the MT to encompass the functionality of example-based approaches and translation memories. A further advantage is that it allows many knowledge sources to be combined, by framing them as feature functions within a Maximum Entropy framework.
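In a Maximum Entropy (log-linear) setup of this kind, each knowledge source is a feature function that scores a source/target pair, and a candidate's overall score is the weighted sum of those feature values. The sketch below shows that combination in general terms; the two features, the toy phrase table, and the weights are invented for illustration and do not describe AppTek's models.

```python
import math
from typing import Callable, Dict, List

Feature = Callable[[str, str], float]  # (source, target) -> feature value

# Toy phrase-table log-probabilities (invented numbers).
PHRASE_LOGPROB = {
    ("bonjour", "hello"): math.log(0.8),
    ("bonjour", "good day"): math.log(0.2),
}


def phrase_feature(source: str, target: str) -> float:
    # Unseen pairs get a small floor probability.
    return PHRASE_LOGPROB.get((source, target), math.log(1e-6))


def length_feature(source: str, target: str) -> float:
    # Penalize differences in word count between source and target.
    return -abs(len(target.split()) - len(source.split()))


def loglinear_score(source: str, target: str,
                    features: Dict[str, Feature],
                    weights: Dict[str, float]) -> float:
    """Weighted sum of feature values: sum_i weight_i * h_i(source, target)."""
    return sum(weights[name] * h(source, target) for name, h in features.items())


def best_translation(source: str, candidates: List[str],
                     features: Dict[str, Feature],
                     weights: Dict[str, float]) -> str:
    return max(candidates,
               key=lambda t: loglinear_score(source, t, features, weights))


features = {"phrase": phrase_feature, "length": length_feature}
weights = {"phrase": 1.0, "length": 0.5}  # assumed; normally tuned on held-out data
best = best_translation("bonjour", ["hello", "good day"], features, weights)
```

Adding another knowledge source means adding one more entry to `features` and `weights`, which is the practical appeal of framing everything as feature functions.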

Rule-Based MT Module

Our rule-based module employs a Lexical Functional Grammar (LFG) system. The LFG system has a richly annotated lexicon containing functional and semantic information. It also produces richly annotated intermediate outputs that may interact with the statistical MT module:

• Source language c-structure (or "constituent structure") – a phrase-structure tree
• Source language f-structure (or "functional structure") – a directed acyclic graph containing attribute-value pairs specifying grammatical information such as grammatical relation (subject, object, etc.), case, argument structure (predicate, argument, adjunct), semantic disambiguation information (human, animate, concrete, etc.), gender, number, mood, tense, aspect, polarity, speech act, etc.
• Target language f-structure – an f-structure that has been modified and restructured to enable generation of target-language text
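To make the notion of an f-structure more concrete, the attribute-value pairs listed above can be written as nested mappings. The sketch below is illustrative only: the sentence, the attribute names (PRED, SUBJ, OBJ, and so on, following common LFG notation), and the helper function are assumptions, not the system's internal representation.

```python
from typing import Any, Optional

# Hypothetical f-structure for "The girl did not see the dog".
f_structure = {
    "PRED": "see<SUBJ, OBJ>",   # predicate with its argument structure
    "TENSE": "past",
    "POLARITY": "negative",     # negation is kept as an explicit attribute
    "SUBJ": {"PRED": "girl", "NUM": "sg", "GEND": "fem", "HUMAN": True},
    "OBJ": {"PRED": "dog", "NUM": "sg", "ANIMATE": True},
}


def get_path(fs: dict, *attrs: str) -> Optional[Any]:
    """Follow a chain of attributes, e.g. get_path(fs, "SUBJ", "NUM")."""
    node: Any = fs
    for attr in attrs:
        if not isinstance(node, dict) or attr not in node:
            return None  # attribute is not specified in this f-structure
        node = node[attr]
    return node


subject_number = get_path(f_structure, "SUBJ", "NUM")
object_case = get_path(f_structure, "OBJ", "CASE")
polarity = get_path(f_structure, "POLARITY")
```

Because polarity and grammatical relations are explicit attributes here, a downstream step could check that information such as negation survives into the target-language structure.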


Figure 2: Concept of Hybrid Machine Translation


AramediA
61 Adams Street, Braintree, MA 02184 USA
1-781-849-0021   Fax 1-781-849-2922



Copyright © 1995 - 2008 - AramediA . All rights reserved.