Report on the state of misspelling in Vietnamese text
(27/12/2011)

The Institute of Information Technology - Hanoi National University and VIEGRID Communication & Technology JSC, in collaboration with the e-newspaper VietNamNet, held a press conference to announce the report on the state of spelling in Vietnamese text and to launch the website www.xephangvanban.com.

On 28 July 2010, in Hanoi, the Institute of Information Technology - Hanoi National University and VIEGRID Communication & Technology JSC, in collaboration with the e-newspaper VietNamNet, held a press conference to announce the report on the state of spelling in Vietnamese text.

At the current stage of development, Vietnamese society is facing many new challenges. Language is a precious cultural asset that is associated with the historical development of each nation. Preserving the clarity of Vietnamese and making it ever more beautiful and rich is the necessary work of every Vietnamese person.

Hoping to mobilize the Vietnamese community, especially the young generation, to work together to preserve Vietnamese, the Institute of Information Technology - Hanoi National University, in coordination with VIEGRID Company, conducted surveys, scientific studies and evaluations of the quality of spelling in Vietnamese text, with the passionate call "Join with us in preserving Vietnamese".

The Director General of VIEGRID, Mrs. Le Ngoc Hong, said: "We always want to contribute to and share with the community through practical and meaningful activities. For years, we have developed Vietnamese processing software. Therefore, together with the Institute of IT and linguists, we compiled this report to ring the first bell and to declare war on the widespread misspelling in Vietnamese text."

The e-newspaper VietNamNet is expected to be one of the first press agencies to volunteer in this campaign, to take the issue of spelling seriously, and to join with society in preserving Vietnamese.

Before assessing the quality of Vietnamese spelling in text, the authors conducted a small survey of two groups, language professionals and IT professionals. The language professionals required that the spelling error rate in Vietnamese text be under 1%. The IT professionals accepted a rate of about 2.5 - 5%. Both groups agreed that the press and media sectors bear the most responsibility for the state of Vietnamese spelling. The majority of professionals agreed that a rate of 10% is the alarming threshold for spelling errors, and that 30% is the threshold at which a misspelling has become accepted as a new correct spelling.
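The thresholds named in the survey can be read as bands on an error rate. The following is only an illustrative sketch: the function name and the band labels are assumptions made here, not part of the report, and the report applies the 30% threshold to an individual misspelling rather than to a whole unit.

```python
def classify_error_rate(rate_percent: float) -> str:
    """Place a spelling error rate (in percent) into the survey's bands.

    Band boundaries come from the survey figures (1%, 5%, 10%, 30%);
    the labels are hypothetical wording chosen for this sketch.
    """
    if not 0.0 <= rate_percent <= 100.0:
        raise ValueError("rate_percent must be between 0 and 100")
    if rate_percent >= 30.0:
        return "misspelling accepted as new spelling"
    if rate_percent >= 10.0:
        return "alarming"
    if rate_percent > 5.0:
        return "above IT professionals' tolerance"
    if rate_percent >= 1.0:
        return "tolerated by IT professionals only"
    return "meets language professionals' target"
```

For example, `classify_error_rate(0.5)` falls in the band that meets the language professionals' target, while any rate of 10% or more is at least "alarming".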

In the June 2010 ranking, 177 units were evaluated for spelling errors, and 132 units in seven sectors were ranked (on the web page xephangvanban.com):

1) Ministries and central offices;

2) People's Committees of the provinces and cities directly under the Central Government;

3) Government and Ministry agencies;

4) Universities and Research Institutes;

5) Press, publishers and media agencies;

6) Vietnamese enterprises;

7) Foreign Organizations and Agencies in Vietnam.

Statistics were compiled from 67,000 samples. The statistical method, based on a file of typical errors, suits the limited resources available. The statistics showed that the average misspelling rate in Vietnamese text was 7.79%, higher than the minimum required level.

The results showed that the words with the highest error rates were "soi mói" with 74.33%, "sáng lạn" with 41.66%, "cọ sát" with 28.38% and "thăm quan" with 20.61%.
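The report does not spell out how the per-word rates are computed. One plausible reading, sketched below under stated assumptions, is that the typical error file pairs each misspelling with its standard form, and that the rate is the share of misspelled occurrences among all occurrences of the pair. The dictionary layout and function name are hypothetical, and the standard forms listed are the commonly cited ones rather than values taken from the report.

```python
import unicodedata

# Hypothetical "typical error file": misspelled form -> standard form.
TYPICAL_ERRORS = {
    "soi mói": "xoi mói",
    "sáng lạn": "xán lạn",
    "cọ sát": "cọ xát",
    "thăm quan": "tham quan",
}


def _norm(text: str) -> str:
    # Normalize to NFC so composed and decomposed diacritics compare equal.
    return unicodedata.normalize("NFC", text).lower()


def error_rates(texts, typical_errors=TYPICAL_ERRORS):
    """Return {misspelling: percent misspelled} for pairs seen in texts.

    Assumed formula: 100 * wrong / (wrong + right). Pairs that never
    occur in either form are left out of the result.
    """
    normalized = [_norm(t) for t in texts]
    rates = {}
    for bad, good in typical_errors.items():
        wrong = sum(t.count(_norm(bad)) for t in normalized)
        right = sum(t.count(_norm(good)) for t in normalized)
        if wrong + right:
            rates[bad] = 100.0 * wrong / (wrong + right)
    return rates
```

Simple substring counting is enough for a sketch; a real evaluation would need word-boundary handling and a much larger error file.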

The press and communication sector had the highest spelling error rate, close to the alarming rate of 10%.

The error rate in the universities and research institutes sector was approximately the average rate for society, which does not match its role as a model and pioneer in correct word use.

In particular, both sectors included units with error rates over 30%.

The local government sector and the agencies under ministries also had relatively high spelling error rates. In particular, some units had error rates of nearly 40%.

Even the better-performing sectors, enterprises and ministries, should improve further in order to reach the standard rate of 1%.

The detailed evaluation results are published on the website www.xephangvanban.com.

The results above reflect an alarming state of Vietnamese spelling. Through this work, the group of authors hopes to make the whole of society and the ranked units understand the importance of the Vietnamese spelling issue. The evaluation will be conducted once every three months and will continue to be expanded in scale to support a public campaign to detect misspellings.

Introducing Vietnamese spell-checking software on the website www.xephangvanban.com is necessary for the campaign. Based on the statistical analysis, the group of authors estimated that the ratio of non-word errors to substantive errors was 31.69% : 68.31%. Contrary to the conception of some IT professionals, and unlike in English, substantive errors are the major kind in Vietnamese. That may explain why software that cannot detect substantive errors does not get a strong response from users.
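The report does not define the two error types. The sketch below assumes the usual distinction: a non-word error contains a syllable that is not valid Vietnamese, while a substantive error is built entirely from valid syllables and is wrong only in context. The tiny lexicon, the function names and the example "nghành" (a common misspelling of "ngành") are illustrative assumptions, not data from the report.

```python
# Tiny hypothetical syllable lexicon; a real checker would load a full list.
VALID_SYLLABLES = {"cọ", "sát", "xát", "ngành", "tham", "thăm", "quan"}


def error_kind(misspelling: str, valid_syllables=VALID_SYLLABLES) -> str:
    """Classify a known misspelling as 'non-word' or 'substantive'.

    Assumption: the input is already known to be an error; this only
    decides which kind it is by looking up each syllable.
    """
    syllables = misspelling.lower().split()
    if all(s in valid_syllables for s in syllables):
        return "substantive"
    return "non-word"


def split_percent(non_word: int, substantive: int):
    """Return the (non-word %, substantive %) shares of all errors."""
    total = non_word + substantive
    if total == 0:
        raise ValueError("no errors to split")
    return (100.0 * non_word / total, 100.0 * substantive / total)
```

Under this reading, "cọ sát" is a substantive error because both syllables are valid on their own, which is why a checker based only on dictionary lookup would miss it.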

For an objective evaluation, the authors used criteria such as degree of recognition, degree of accuracy and the ability to give suggestions to evaluate the performance of spell-checking software. The group of experts also used the VIE measure, which takes into account the above factors and the ratio between non-word errors and substantive errors.

In the future, businesses, professionals and users can introduce new products for improving Vietnamese to the community on the website www.xephangvanban.com.

No solo attempt, however great the effort, will bring the desired results. Through this work, the group of authors hopes for the participation of managers, linguists and cultural activists in this campaign, and hopes that more software products will serve the community. JOIN US IN PRESERVING THE VIETNAMESE LANGUAGE.