Twenty More Years of Computational Linguistics

by Aravind Joshi

December 7, 2002

The title can be read in two ways. Either I should comment on the past twenty years of ACL (which coincide with the period between ACL 1982, held at Penn, and ACL 2002, also held at Penn) or on the twenty more years from ACL 2002. As you will see, many of my comments will be relevant to both these interpretations.

How did I get into computational linguistics (CL)? I did not really get into it. It sort of happened to me, much like Voltaire realizing one day that he was indeed writing prose. This is simply due to the fact that I started doing what is now called CL (or natural language processing) well before there was such a recognized field or a professional society with that name. As a matter of fact, for the first couple of years of ACL (until about early 64) I did not officially belong to ACL. This was not because I disliked something about ACL or the people connected with it. Far from it. I knew the people very well and respected them. In retrospect, I am sure part of my reason for not rushing into it was the initial name of the current ACL. More importantly, I guess I was not sure myself that what I was doing was the same as the proclaimed goals of the organization at the start. I was also not sure whether what I was doing at that time was part of linguistics, part of computer science, or a combination of the two, and perhaps, therefore, not in need of any special name. Of course, once I joined ACL I threw myself into it in every possible way. However, I believe my initial experience described above may be reflected in some way in my remarks below.

CL from the very beginning was always rooted in some applications. In my own experience, the very first project I worked on (1958-59), directed by Zellig Harris, had to do with analyses of scientific papers (in the biomedical domain, strangely enough) with a view to automatically producing abstracts. Little did anyone working on the project realize how hard the task would be; even the first step of doing the so-called shallow parsing (as it is called at present) was very hard. Although the task was well executed (by the use of finite state transducers, the first such application as far as I know), the problem of scaling up was already evident, given the status of the hardware at that time. At this point many computational linguists (including me, of course) turned to more theoretical work, hoping to find a way of attacking the scaling problem. From the early 60’s to almost the mid 80’s, CL research flourished together with highly productive linguistic research. There was good interaction between computational linguists and linguists, although most of the linguists thought, during this period, that they had more to say to the computational linguists than the other way round. During this period a few computational linguists (and I certainly include myself in this group) believed that CL had something new to say to linguists. In other words, the computational paradigm was not just concerned with implementing the then existing linguistic theories. However, it was only in the late 80’s and early 90’s that there was a real blossoming of the computational paradigm as a way of doing linguistics itself. Many linguists began to be more and more open to CL. Now, from the late 90’s up to the present, as we all can see, CL is really flourishing. Linguists are even more open to CL than before. However, curiously enough, many computational linguists are moving away from linguistics.
This is mostly due to the enormous success of statistical and machine learning techniques applied to corpora annotated with very little linguistic information or, especially, to unlabeled data. So we have a strange situation now. Just when linguists are seriously getting interested in CL, CL seems to be turning away from linguistics. However, I believe this situation is very likely to change again, as more and more richly annotated corpora will be created (for example, with predicate-argument and adjunct information, word sense information, discourse annotations for connectives and their argument structure, and dialogue structures, among others). These efforts necessarily involve more and more linguistic information. Statistical and machine learning techniques will be developed for these richly annotated corpora, integrating structural and statistical information. So in the next two decades we will see CL and linguistics coming closer to each other, with CL affecting the very methodology of linguistics (also psycholinguistics and language aspects of cognitive neuroscience) and not just providing the tools – a dream of many of us for a long time.

— Aravind Joshi