Workshop Topics      





Using Toolsets and Architectures To Build NLP Systems

a workshop to be held at Coling 2000
the 18th International Conference on Computational Linguistics
Luxembourg, 5 August 2000

see also this call at http://crl.nmsu.edu/Events/COLING00


Background

The purpose of the workshop is to present the state of the art in NLP toolsets and workbenches that can be used to develop multilingual and/or multi-application NLP components and systems. Although technical presentations of particular toolsets are of interest, we would like to emphasize methodologies and practical experiences in building components or full applications using an NLP toolset. Combined demonstrations and paper presentations are strongly encouraged.

Many toolsets have been developed to support the implementation of single NLP components (taggers, parsers, generators, dictionaries) or complete Natural Language Processing applications (Information Extraction systems, Machine Translation systems). These tools aim to make NLP systems easier and cheaper to build. Since the tools themselves are often complex pieces of software, they require significant effort to develop and maintain in the first place. Is this effort worth the trouble? Note that NLP toolsets have often originally been developed to implement a single component or application. In that case, why not build the NLP system in a general programming language such as Lisp or Prolog? There are at least two answers. First, for reasons of pure efficiency (speed and space), it is often preferable to build a parameterized algorithm operating on a uniform data structure (e.g., a phrase-structure parser). Second, it is harder, and often impossible, to develop, debug and maintain a large NLP system written directly in a general programming language.

Many users have found that a given toolset is often unusable outside its original environment: the toolset can be too restricted in its purpose (e.g. an MT toolset that cannot be used for building a grammar checker), too complex to use, or even too difficult to install. There have been efforts, particularly in the US under the Tipster program, to promote instead common architectures for a given set of applications (primarily IR and IE in Tipster; see also the Galaxy architecture of the DARPA Communicator project). Several software environments have been built around this flexible concept, which is closer to current trends in mainstream software engineering.

The workshop aims at providing a picture of the current problems faced by developers and users of toolsets, and future directions for the development and use of NLP toolsets. We encourage reports of actual experiences in the use of toolsets (complexity, training, learning curve, cost, benefits, user profiles) as well as presentation of toolsets concentrating on user issues (GUIs, methodologies, on-line help, etc.) and application development. Demonstrations are also welcome.


Audience
Researchers and practitioners in Language Engineering, users and developers of tools and toolsets.


Issues
Although individual tools (such as a POS tagger) have their use, they typically need to be integrated into a complete application (e.g. an IR system). Language Engineering issues in toolsets and architectures include (in no particular order):

  • Practical experience in the use of a toolset;
  • Methodological issues associated with the use of a toolset;
  • Benefits and deficiencies of toolsets;
  • User (linguist/programmer) training and support;
  • Adaptation of a tool (or toolset) to a new kind of application;
  • Adaptation of a tool to a new language;
  • Integration of a tool in an application;
  • Architectures and support software;
  • Reuse of data resources vs. processing components;
  • NLP algorithmic libraries.
