IT fights against "The Devastation of Spelling"
(27/10/2011)

Vietnamese is arguably in an era of "spelling chaos": the spelling error rate is unacceptably high, and errors are common even in the government sector, research institutes and the press. Vietnam's IT sector should therefore tackle Vietnamese misspelling before it tackles information and knowledge resources in Vietnamese.

According to Dr. Nguyen Ai Viet, Deputy Director of the Institute of Information Technology, National University of Hanoi, "One of the core issues of Vietnam's IT is processing Vietnamese, because all of the Vietnamese people's knowledge is recorded in Vietnamese. IT application means processing information and knowledge, so Vietnamese should be processed first." In his opinion, the first thing basic IT applications need to do is improve the quality of information systems. Specifically, text quality can be improved with spelling assistance and spell-checking software. Moreover, Vietnam's own sources of knowledge are still limited, so the best approach is to draw on humanity's knowledge through the computer, using automatic translation software and search engines (search by keyword, by semantics, by characteristics).

Dr. Nguyen Ai Viet.

From spell checking…

It can be said that although spelling is a very basic issue, its impact is huge. In legal documents, a misspelling or a word with two meanings may complicate a trial; in business contracts, a misspelling can cause economic damage and lead partners to underestimate a company's competence and product quality; on the websites of administrative agencies, misspellings may cost the confidence of the public.

Recently, the Institute of Information Technology (National University of Hanoi), in collaboration with Viegrid Company, has researched and developed this group of products. The spell-checking software was applied in the project "Ranking Vietnamese text", which was hosted by the Institute. By December 6th, 2010, the project had evaluated 177 units and ranked 132 units in seven sectors (Ministry and Central Office; People's Committees of the provinces and cities directly under the Central Government; Government and Ministry agencies; Universities and Research Institutes; Press, Publishers and Media Agencies; Vietnam Business; Foreign Organizations and Agencies in Vietnam). It is worth mentioning that all seven sectors have spelling error rates of 4 to 10% (the acceptable rate according to linguists is only 1%). Detailed results can be found on the website www.xephangvanban.com.

A spelling error in an electronic newspaper. The Vietnamese Dictionary of the Institute of Linguistics contains only "xán lạn"; there is no "sáng lạn".

Setting aside the accuracy of this ranking, the notable point is that, for the first time and thanks to IT tools, we have a quantitative measure of Vietnamese misspelling. For a long time, we had only impressionistic assessments of the state of misspelling in newspapers or in society. The project has also succeeded in reaching difficult-to-reach places such as government and ministry agencies and central and local offices.
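A quantitative figure of this kind can be illustrated with a minimal sketch. The code below is not the project's actual method, which the article does not describe. It assumes a hypothetical lookup table of known misspellings (seeded only with the "sáng lạn" example above) and assumes the error rate is defined as flagged phrases divided by syllable tokens. The names KNOWN_MISSPELLINGS, find_misspellings and error_rate are invented for illustration.

```python
import re
import unicodedata

# Hypothetical lookup table of misspelling -> accepted form. The single entry
# is the example from the article; a real tool would need a full dictionary.
KNOWN_MISSPELLINGS = {"sáng lạn": "xán lạn"}

# The 1% rate that linguists consider acceptable, per the article.
ACCEPTABLE_RATE = 0.01


def nfc(text):
    """Lowercase and normalise to NFC so composed and decomposed accents compare equal."""
    return unicodedata.normalize("NFC", text.lower())


def find_misspellings(text, table=KNOWN_MISSPELLINGS):
    """Return a (wrong, suggested) pair for each listed misspelling found in text."""
    haystack = nfc(text)
    hits = []
    for wrong, right in table.items():
        # Match the phrase only when it is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(nfc(wrong)) + r"(?!\w)"
        hits.extend((nfc(wrong), nfc(right)) for _ in re.finditer(pattern, haystack))
    return hits


def error_rate(text, table=KNOWN_MISSPELLINGS):
    """Flagged phrases divided by syllable tokens (an assumed definition of the rate)."""
    tokens = re.findall(r"[^\W\d_]+", nfc(text))
    if not tokens:
        return 0.0
    return len(find_misspellings(text, table)) / len(tokens)


sample = "Tương lai sáng lạn của chúng ta"
print(find_misspellings(sample))
print(error_rate(sample) > ACCEPTABLE_RATE)
```

A phrase-level table is used here because "sáng" and "lạn" are each valid syllables on their own, so a syllable-by-syllable dictionary check would likely miss this error.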

Why is misspelling so widespread, even in "sensitive" places? According to Dr. Nguyen Ai Viet, misspelling shows that public servants and businesses are not taking responsibility for the quality of their daily work. For example, books of several hundred pages published before 1990 had only a few errors, whereas he found 304 spelling errors in one recently published book.

More importantly, according to Dr. Nguyen Ai Viet, although IT can take part in every stage, from error checking and error fixing to assisting with text preparation, the real issue is that there is no spelling standard. Currently, no agency is assigned to this task. The Institute of Languages mainly does research. The Ministry of Education & Training is responsible only for managing teaching and learning in schools, and the Ministry of Interior only has the authority to issue regulations on the presentation of administrative documents. The remaining areas fall outside the management of these two ministries. Some spelling rules, such as the rule for using the semicolon, cannot be found in any widely available book, and writers apply them rather freely. In his opinion, there should be a lead agency in charge of publishing spelling dictionaries annually or simple spelling textbooks for people to follow.

…to the cooperation between IT and Linguistics

According to Dr. Nguyen Ai Viet, IT companies in the field of Vietnamese processing, such as Tinh Van, Lac Viet and Viegrid, have cooperated with linguists. However, by his observation, the cooperation is only superficial and lacks depth. So far, there is still a gap between IT and linguistics. Meanwhile, elsewhere in the world, computational linguistics has come a long way. People in that field are very good at linguistics and know how to use IT to process language. In Vietnam, linguists and IT specialists do not yet really understand each other's problems well enough to set standards.

"There are many language issues that must be dealt with, such as building automatic translation systems, developing Vietnamese search engines, and speech processing and recognition engines," Mr. Viet said. Personally, he hopes for a national project on Vietnamese processing so that the two fields of IT and linguistics have the opportunity to cooperate closely. Linguists would raise the questions, and IT specialists would provide the tools to handle them.

It is time the issue of Vietnamese was handled properly. As Dr. Nguyen Ai Viet said: "Misspelling a word may not do immediate harm, but the consequences will soon follow. Once the language is devastated, the consequences are no less severe than those of a famine."

Vu Nga

According to Computer World Online

http://www.pcworld.com.vn/articles/tin-tuc/binh-luan/2010/09/1220755/cntt-chong-lai-su-tan-pha-chinh-ta