Rod Holland


2010

In spite of low literacy levels in Afghanistan and the Tribal Areas of Pakistan, the Pashto and Dari regions of the World Wide Web manifest diverse content from authors with a broad range of viewpoints. We have used cross-language information retrieval (CLIR) with machine translation to explore this content, and present an informal study of the principal genres that we have encountered. We also discuss the suitability and limitations of existing machine translation packages for these languages in exploiting this content.

2008

Syndicated feeds in RSS, Atom, and related formats have emerged as ubiquitous information sources in World Wide Web language communities including Arabic, Farsi, Chinese, and others, providing subscribers with timely updates on topics of particular interest. We have modified an existing Open Source RSS reader, Sage, for cross-language use, permitting English-speakers to discover, subscribe to, update, and browse RSS feeds in ten languages. This early prototype, called ClipperRSS, has been integrated with the Clipper cross-language information retrieval tool. The integrated system provides English-speakers with an effective means of exploring the potential of foreign-language syndicated feeds in their domains of interest.

1998