Computational lexicons are among the most important resources for natural language processing (NLP). Their importance is even greater in languages with rich morphology, where the lexicon is expected to provide morphological analyzers with enough information to enable themto correctly process intricately inflected forms. We describe the Haifa Lexicon of Contemporary Hebrew, the broadest-coverage publicly available lexicon of Modern Hebrew, currently consisting of over 20,000 entries. While other lexical resources of Modern Hebrew have been developed in the past, this is the first publicly available large-scale lexicon of the language. In addition to supporting morphological processors (analyzers and generators), which was our primary objective, thelexicon is used as a research tool in Hebrew lexicography and lexical semantics. It is open for browsing on the web and several search tools and interfaces were developed which facilitate on-line access to its information. The lexicon is currently used for a variety of NLP applications.
We describe work in progress whose main objective is to create a collection of resources and tools for processing Hebrew. These resources include corpora of written texts, some of them annotated in various degrees of detail; tools for collecting, expanding and maintaining corpora; tools for annotation; lexicons, both monolingual and bilingual; a rule-based, linguistically motivated morphological analyzer and generator; and a WordNet for Hebrew. We emphasize the methodological issue of well-defined standards for the resources to be developed. The design of the resources guarantees their reusability, such that the output of one system can naturally be the input to another.