This morning I visited the Harbin Institute of Technology. It is a city of its own within Harbin. HIT financed the expressway to the airport, and there are two four-star hotels on the university grounds...
I was a guest of the Laboratory of Natural Language Processing and Speech Recognition. Nearly 50 scientists work in the lab under the leadership of Prof. Tiejun Zhao. They are working on machine translation using statistical methods. They started with word-based methods, now use phrase-based translation as the state-of-the-art methodology, and are working on translation based on structure recognition. They proudly told me that they ranked in the top half of the groups evaluated in the USA last year for the quality of Chinese-English-Chinese machine translation.
We ran a test with some excerpts from FAO's Best Practices wiki:
The translation was not too bad, but it did not give an immediately usable result; obviously, the machine had not been trained on similar material. We also discovered a debatable translation that the human translators had made when they initially translated the English original into Chinese. :-)
Another team in the group is doing NLP on web pages: auto-categorization, summarization and similar tasks. This could be of interest for OpenCalais-like services that want to work with Chinese.