Jing-Shin Chang


2012

2011

2010

2009

2007

2006

2005

2004

1999

This paper first outlines the major problems of current machine translation systems. A new direction, which emphasizes making the system customizable and self-learning, is then proposed for attacking these problems, which result mainly from the highly complex characteristics of natural languages. The proposed solution adopts an unsupervised two-way training mechanism and a parameterized architecture to acquire the required statistical knowledge, so that the system can be easily adapted to different domains and to the preferences of individual users.

1997

This paper gives a brief introduction to MT research projects in Taiwan. Special attention is given to the increasingly popular corpus-based statistics-oriented (CBSO) approaches in MT research. In particular, the paper introduces the parameterized two-way training philosophy behind the design of the second-generation BehaviorTran, the first and largest operational system in this area.

1996

1995

1994

1993

1992

1991

The ArchTran English-Chinese Machine Translation System is among the first commercialized English-Chinese machine translation systems in the world. A prototype system was released in 1989 and currently serves as the kernel of a value-added network-based translation service. The main design features of the ArchTran system are a mixed parsing strategy (bottom-up parsing with top-down filtering), a scored parsing mechanism, and the corpus-based, statistics-oriented paradigm for linguistic knowledge acquisition. Under this framework, research is directed toward designing systematic and automatic methods for acquiring language model parameters, and toward using a preference measure with a uniform probabilistic score function for ambiguity resolution. This paper presents the probabilistic models underlying the ArchTran design philosophy.

1990

1989

In a natural language processing system, a large amount of ambiguity and a large branching factor make it hard to obtain the desired analysis of a given sentence in a short time. In this paper, we propose a sequential truncation parsing algorithm that reduces the search space and thus lowers the parsing time. The algorithm is based on a score function that takes advantage of the probabilistic characteristics of the syntactic information in sentences. A preliminary test of this algorithm was conducted with a special version of our machine translation system, ARCHTRAN, and encouraging results were observed.

1988