Translation

Big Data analysis for language translation

Translating between languages is a tough problem, and there are many players in the market. However, Google seems to dominate with its translation service, while other vendors have been reduced to domain-specific niches. So why has Google been successful? Is it just a question of resources?

Translating a language requires two broad steps: 1) understand what is being communicated in one language, and 2) express that meaning in another language. Written this way, it seems almost impossible that computers could ever understand meaning and then communicate it. To make the problem tractable, machine translation has proceeded along simpler lines.

Word substitution: If a bilingual dictionary exists, each word can be substituted with its counterpart. However, word order and idioms are ignored, so much of the meaning is often lost and this does not work very well.
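To illustrate, here is a minimal Python sketch of naive word-for-word substitution; the tiny Spanish-to-English dictionary is invented for the example:

```python
# Naive word-for-word substitution using a tiny, hypothetical
# Spanish-to-English dictionary. Word order and idioms are ignored,
# which is exactly why this approach loses meaning.
DICTIONARY = {
    "el": "the", "gato": "cat", "negro": "black",
    "come": "eats", "pescado": "fish",
}

def word_substitute(sentence: str) -> str:
    # Look each word up; keep it unchanged if it is not in the dictionary.
    return " ".join(DICTIONARY.get(w, w) for w in sentence.lower().split())

print(word_substitute("El gato negro come pescado"))
# -> "the cat black eats fish"
```

Every word is translated, yet the output "the cat black eats fish" has the wrong English word order, showing how meaning degrades even on a simple sentence.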

Grammar and word substitution: Here, grammar rules are applied on top of word substitution, for example to reorder words into the target language's word order. This is better than word substitution alone, but if parts of speech such as nouns are not correctly identified, the rules misfire and the process breaks down.
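A sketch of layering one grammar rule on top of substitution, again with invented dictionary and part-of-speech entries: Spanish places adjectives after nouns, so a NOUN ADJ pair is swapped before substituting. If a noun were not tagged correctly, the rule would not fire, which is the failure mode described above.

```python
# One grammar rule on top of word substitution: swap NOUN ADJ pairs
# into English adjective-first order before translating each word.
DICTIONARY = {"el": "the", "gato": "cat", "negro": "black"}
POS = {"el": "DET", "gato": "NOUN", "negro": "ADJ"}  # hand-supplied tags

def reorder_noun_adj(words):
    out, i = [], 0
    while i < len(words):
        if (i + 1 < len(words)
                and POS.get(words[i]) == "NOUN"
                and POS.get(words[i + 1]) == "ADJ"):
            out += [words[i + 1], words[i]]  # swap to English order
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

words = reorder_noun_adj("el gato negro".split())
print(" ".join(DICTIONARY.get(w, w) for w in words))  # -> "the black cat"
```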

Statistical probability based: Once you have a vocabulary and a statistical model of which words tend to follow which others, you can combine grammar with probability-based context and use that to pick the most likely translation. This works well and is what most companies use.
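One common way to capture "what words generally follow the other" is a bigram language model. The sketch below scores candidate translations against a toy corpus; the corpus and smoothing choice are illustrative assumptions, since a production system would estimate probabilities from millions of sentences.

```python
# Score candidate translations with a bigram language model: the
# candidate whose word pairs occur most often in the corpus wins.
from collections import Counter

corpus = "the black cat eats fish . the cat sleeps . the black dog eats".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def score(sentence: str) -> float:
    # Product of P(w2 | w1) over the sentence; add-one smoothing keeps
    # unseen bigrams from zeroing out the score.
    words = sentence.split()
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= (bigrams[(w1, w2)] + 1) / (unigrams[w1] + len(unigrams))
    return p

candidates = ["the cat black eats fish", "the black cat eats fish"]
print(max(candidates, key=score))  # -> "the black cat eats fish"
```

The fluent ordering wins because all of its bigrams appear in the corpus, which is how statistical context corrects the word-order errors of plain substitution.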

Comparison of translated work: This is where Google shines. It has a large collection of documents that have already been translated by humans, which can serve as reference examples. Combine that with some basic grammar rules and you have a reasonably accurate engine. This approach requires servers that can look up large databases of written work, but that is exactly the infrastructure Google has available.
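In the simplest form of this idea, the engine finds the previously translated sentence closest to the input and reuses its translation. The two-entry parallel corpus and the word-overlap similarity below are stand-ins for the enormous collections and ranking machinery the text describes:

```python
# Example-based lookup against a parallel corpus: pick the stored
# source sentence with the highest word overlap and reuse its
# human-made translation.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

PARALLEL_CORPUS = [
    ("el gato negro come pescado", "the black cat eats fish"),
    ("el perro duerme en la casa", "the dog sleeps in the house"),
]

def translate_by_example(sentence: str) -> str:
    words = set(sentence.lower().split())
    src, tgt = max(PARALLEL_CORPUS,
                   key=lambda pair: jaccard(words, set(pair[0].split())))
    return tgt

print(translate_by_example("el gato negro come pescado"))
# -> "the black cat eats fish"
```

The quality of this approach scales with the size of the translated collection, which is why a company with Google's document holdings and server capacity has an advantage here.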

More information on machine translation is available in the KMWorld article linked below and in Wikipedia's Machine translation entry.

https://www.kmworld.com/Articles/News/News-Analysis/The-modern-organization-Translation-challenges-86642.aspx

