Coling 2000 Tutorial Announcement

Jacques Vergne, Emmanuel Giguet: Trends in Robust Parsing

The aim of this tutorial is to outline the fundamental trends in the evolution of robust parsing across the variety of concepts, techniques, and parsing processes, and to give a synthetic view of the topic.

Today, robust parsing is moving rapidly from tagging to chunking and clause bracketing. Partial parsing is becoming less and less partial, while keeping computational properties that allow good integration into industrial contexts, where linear complexity is a prerequisite: robust parsers are able to process raw linguistic material at a constant and foreseeable rate, with foreseeable results.

This tutorial is designed for PhD students and researchers in NLP. The expected prerequisite is a basic knowledge of parsing and tagging.

Course outline:

Session 1: Course (3h)

  • presentation and introduction
  • part-of-speech tagging
  • chunking
  • chunking after tagging, or tagging and chunking together?
  • clause bracketing and computing chunk main functions inside a clause
  • linking chunks
  • non-recursive representations of constituent structures
  • architectures and implementations
  • conclusion of the course, and introduction of the practical

Session 2: Practical (3h)

  The aim of the practical is to illustrate the course and to give participants the opportunity to practice with the "GREYC parser", a general platform for designing and building parsers.
  It is described (in French) at: http://www.info.unicaen.fr/~jvergne/analyseur_GREYC/analyseur_du_GREYC.html

  The practical will consist of:
  • testing ready-written rules on English or French corpora brought by participants,
  • writing more rules for tagging, chunking, clause bracketing, and chunk linking in English or French, and testing these rules on corpora,
  • and, for volunteers, writing rules for tagging and chunking in another language (this will require a little preparation).
  For more details (in English): http://www.info.unicaen.fr/~jvergne/tutorialColing2000.html

Tutorial speakers:

  Jacques Vergne (Jacques.Vergne@info.unicaen.fr) is a lecturer and researcher in computer science and NLP at GREYC, the computer science laboratory of the University of Caen (France). His research domain is robust and accurate parsing. He built the 1998 parser that obtained the best results in the GRACE contest (http://limsi.fr/TLP/grace/), an international evaluation that aimed to compare taggers for French under a single protocol.
  Emmanuel Giguet acted as project manager of the team that built the "GREYC parser". His PhD thesis gave the 1998 parser a more general design, which is now implemented in the "GREYC parser".

Some references for a preliminary insight into the topic:

Abney S. (1991). "Parsing By Chunks". In: Robert Berwick, Steven Abney and Carol Tenny (eds.), Principle-Based Parsing. Kluwer Academic Publishers, Dordrecht. http://www.sfs.nphil.uni-tuebingen.de/~abney/Abney_90e.ps.gz

Abney S. (1995). "Chunks and Dependencies: Bringing Processing Evidence to Bear on Syntax". In: Computational Linguistics and the Foundations of Linguistic Theory. CSLI. pp. 145-164.

Abney S. (1996b). "Partial Parsing via Finite-State Cascades". In Proceedings of the ESSLLI '96 Robust Parsing Workshop.

Aït-Mokhtar S. and Chanod J.-P. (1997). "Incremental Finite-State Parsing". In Proceedings of ANLP'97, Washington, pp. 72-79.

Brill E. (1992). "A simple rule-based part-of-speech tagger". In Proceedings of the Third Conference on Applied Natural Language Processing, Trento. ACL.

Church K. and Mercer R. (1993). "Introduction to the Special Issue on Computational Linguistics Using Large Corpora". Computational Linguistics, volume 19, number 1, pp. 1-24.

Computational Linguistics (1993). "Special issue on Using large corpora". Volume 19, numbers 1 and 2.

Ejerhed E. (1996). "Finite state segmentation of discourse into clauses". In Proceedings of ECAI'96 Workshop Extended finite state models of language, A. Kornai (Ed.), pp. 24-33. http://www.kornai.com/ECAI/ejerhed.html

Giguet E., Vergne J. (1997). "From Part-of-Speech Tagging to Memory-based Deep Syntactic Analysis". In Proceedings of the International Workshop on Parsing Technologies (IWPT'97), MIT, Boston, Massachusetts.

Giguet E. (1998). "Méthode pour l'analyse automatique de structures formelles sur documents multilingues". PhD thesis, Université de Caen.

Grefenstette G. (1996). "Light Parsing as Finite-State Filtering". ECAI'96 workshop on "Extended finite state models of language". Aug. 11-12, Budapest. http://www.xrce.xerox.com/publis/mltt/mltt-96-12.ps

Vergne J. and Giguet E. (1998). "Regards Théoriques sur le Tagging". Cinquième conférence annuelle : Le Traitement Automatique des Langues Naturelles, TALN'98, Paris, pp. 22-31.

