ALTSS 2004 Program
Full courses will have four sessions of 1.5 hours each, usually
spread over two days. Half courses will have two sessions of 1.5
hours each. The program is scheduled so that introductory courses run
in parallel with advanced courses.
Course Details Introductory Courses
- VoiceXML
Rolf Schwitter - Macquarie University, Sydney
[Course Notes]
This course provides an introduction to VoiceXML for telephony-based
spoken language dialog systems. VoiceXML is a markup language designed
for creating audio dialogs between a user and a machine. It uses
speech recognition and touch-tone for input and text-to-speech
synthesis and pre-recorded audio for output. Any telephone can be used
to access a VoiceXML application via a VoiceXML browser that is
running on a voice server. VoiceXML relies on other markup languages
for describing recognition grammars, speech synthesis, and call
control constructs. This entire suite of markup languages is known as
the W3C Speech Interface Framework. Apart from VoiceXML, we will
briefly touch on the main components of this framework and introduce a
number of freely available VoiceXML development tools so that the
students can start building their own VoiceXML applications by the end
of this course.
The course will cover the following topics:
- Background: Introduction to Spoken Language Dialog Systems;
- VoiceXML and the W3C Speech Interface Framework;
- VoiceXML:
  - Dialogs, Forms, and Fields;
  - Development Tools;
  - Control Flow;
  - Grammars;
  - Scripting;
  - Mixed Initiative.
Note that this course will not focus on the speech recognition and
speech synthesis processes themselves, but we will discuss related
issues involved in building spoken language dialog systems, such as
dialog and prompt design.
Bio Rolf Schwitter is a Senior Lecturer in the
Computing Department at Macquarie University and a member of the
Centre for Language Technology at Macquarie. Rolf received his
doctorate in Computational Linguistics from the University of
Zurich in 1998. His teaching areas include Web Technology,
Spoken Language Dialog Systems, and Advanced Topics in Natural
Language Processing. In the context of a Spoken Language Dialog
Systems unit, he recently gave introductory lectures on VoiceXML at
the University of Zurich, at the University of Stockholm, and at
Macquarie University. His research interests focus on controlled
natural language processing, question answering, knowledge
representation and automatic reasoning. Personal page (external
link)
- Speech Annotation with EMU
Steve Cassidy - Macquarie University, Sydney
This course will provide a general introduction to collecting,
annotating and working with speech data using the EMU Speech Database System and
related tools. The course assumes a general familiarity with speech
and the desire to make speech databases work in some context. No
particular knowledge of phonetics, prosody, or any annotation system
is required.
The course will cover the following topics:
- Recording speech signals.
- Making data available to EMU.
- Developing an annotation framework, writing a database template.
- Automating annotation tasks.
- EMU signal processing tools.
- Querying your annotations.
- Strategies for data analysis with R and other tools.
- Building on EMU -- what to do when EMU doesn't work for you.
Students will carry out practical work as part of the course, which
will involve recording some data, annotating it from scratch, and
performing some simple analysis. If you have your own data, please
bring it along on CD-ROM and we will try to help you work with it.
Bio Steve Cassidy is a Computer Scientist who has
worked in various areas relating to language and cognition over the
last 15 years. He completed a PhD in Wellington, New Zealand on
computer models of reading development and then moved to Macquarie
University, Sydney to work in the Speech Hearing and Language Research
Centre (SHLRC). At SHLRC he worked on applying statistical models to
acoustic phonetics problems and on the development of the EMU Speech
Database System. His work on EMU has led to an involvement with groups
in the US and Europe who are aiming to define standards for linguistic
annotation. Steve is now working in the Computing Department at
Macquarie, where he is pursuing research in meeting room speech
processing and linguistic annotation. Steve is currently a member of
the Executive of the Australian Speech Science and Technology
Association. Personal page (external link)
- Grammar Formalisms
Ash Asudeh - University of Canterbury, Christchurch
[Course Notes]
This course is an introduction to three grammar formalisms developed in
theoretical linguistics that have clear applications to computational
linguistics and natural language processing. The formalisms considered are
Categorial Grammar (CG), Head-Driven Phrase Structure Grammar (HPSG), and
Lexical Functional Grammar (LFG). The course presupposes no background in
linguistic theory.
We will begin with an introduction to linguistics that gives students some
background on modern perspectives in theoretical linguistics, goals of the
field and resulting issues. The grammatical architectures of CG, HPSG, and
LFG will then be introduced in relation to this background. We will then
proceed to look at how these three theories address the following topics:
- Syntactic categories and basic combinatorics
- The role of the lexicon: heads, agreement, and complementation
- Modifiers
We will only have time to touch on each topic briefly. In each case we
will concentrate on understanding the intuitions that the formalisms seek
to capture, rather than details of analysis. The aim is to give students
enough background that they can confidently further explore the formalisms
on their own. The course will end with a quick demo of a grammar
engineering environment for large-scale grammar development.
Bio Ash Asudeh is a Fellow in the School of
Classics and Linguistics at the University of Canterbury. He received
an M.Phil. in cognitive science from the University of Edinburgh and a
Ph.D. in linguistics from Stanford University. While at Stanford, he
also worked on the constraint-based semantics project at Xerox
PARC. His primary research interests are syntax, its relationship to
semantics, and the implications of this relationship for linguistic
theory and grammatical architecture. He has also worked on
computational linguistics, psycholinguistics, and the
syntax-phonology interface. Asudeh's current work focuses on
applications to linguistic theory of resource logics developed in
formal logic and theoretical computer science.
- Speech Processing
David Grayden - The Bionic Ear Institute, Melbourne
[Course Notes]
This course is an introduction to the speech signal and how it is
processed by humans and by machines. We begin with the production of
speech, the properties of the acoustic signal and how it is perceived
by humans. Then we look at the methods of analysing the speech
signal. Speech signal analysis and human perception are tied together
by looking at speech coding, in particular perceptual coding of sound
using MPEG-1 psychoacoustic models, as in MP3. We touch on data
embedding and watermarking and then look at automatic speech
recognition in some detail. Finally, there is an introduction to speech
synthesis and areas of ongoing speech processing research.
Bio Dr David Grayden has been working as a
Research Fellow at the Bionic Ear Institute in Melbourne since
1997. His main research involves examination of phoneme confusions
made by people using cochlear implants with a view to designing
strategies that will improve perception by the users. He is currently
developing and evaluating a number of advanced sound processing
strategies. He is also involved in other research areas, including
automatic speech recognition and speech enhancement using auditory
models, auditory physiology, integration of auditory and visual input,
and models of spike-timing dependent plasticity for adaptive learning
of spatiotemporal patterns. Personal page (external link)
Course Details Advanced Courses
- Multiword Expressions
Timothy Baldwin - University of Melbourne, Melbourne
Multiword expressions (MWEs) are word amalgams which are semantically,
syntactically and/or statistically idiosyncratic in some way, and occur in a
wide range of configurations including verbal idioms (e.g. "kick the bucket"),
verb particle constructions (e.g. "throw up") and coordinate structures
(e.g. "dull and boring"). In recent years, there has been increasing awareness
in computational linguistics of the need for specialised methods to detect and
capture the syntactic flexibility, semantic generalities and productivity of
MWEs. In this course, I will document some of the difficulties posed by MWEs
for real-world NLP applications, and outline a range of methods which have
been proposed to tackle these issues. I will also describe crosslingual
commonalities and divergences of MWEs, and devote some time to discussing
their multilingual implications.
Bio Timothy Baldwin is a Senior Lecturer in the
Department of Computer Science and Software Engineering at the
University of Melbourne (effective September 2004), and also a Senior
Researcher in the CSLI LinGO Laboratory, Stanford University. His
research interests are in the extraction and syntactico-semantic
classification of multiword expressions, and also machine translation,
computational lexical semantics, the interface between theoretical and
computational linguistics, and computer-assisted language learning
applications for computational linguistics. Personal page (external
link)
- Information Retrieval
Mark Sanderson - University of Sheffield, Sheffield, UK
[Course Notes]
Across the four sessions, the field of Information Retrieval will
be introduced. The workings of a traditional IR system as well as an
overview of Web search will be covered. In addition, the evaluation of
IR systems will be described. One session will focus on the
retrieval of documents written in different languages, covering the
user needs for such technology, the use of translation resources, and
interface design for such cross-language retrieval systems. Finally,
retrieval of speech documents will be covered, featuring a
demonstration of a working system and a discussion of why speech
recognition systems are now operating at a sufficient level of
accuracy to allow almost perfect document retrieval to take place.
Bio Mark Sanderson has been a senior lecturer in the
Information Studies department at the University of Sheffield since
1999, where he has taught Information Retrieval and advanced
Information Retrieval. He ran the introduction to IR tutorial at
ACM SIGIR 2000 and 2001. He is on the editorial boards of ACM TOIS
(Transactions on Information Systems), JASIST (Journal of the American
Society for Information Science and Technology), IP&M (Information
Processing and Management), and IR (Information Retrieval). He is also
on the TREC advisory panel. Prior to his current post in Sheffield, he
was a research assistant for four years, working first at the
University of Glasgow and then at the Center for Intelligent
Information Retrieval at the University of Massachusetts. His PhD on
Word Sense Disambiguation was carried out at the University of
Glasgow. Currently he is co-investigator on two EU-funded projects:
SPIRIT, researching geographic-based retrieval, and BRICKS, a 7 million
euro Integrated Project exploring digital library provision to
Europe's cultural heritage community. Personal page (external link)
- Maximum Entropy Modelling
James Curran - University of Sydney, Sydney
[Course Notes]
This course will provide a detailed introduction to Maximum Entropy
(maxent) modelling for Natural Language Processing. The course assumes
only familiarity with basic probability and statistics, but will
include a quick refresher of the necessary background. It aims to
give a strong intuitive understanding of maxent modelling which will
allow students to use maxent models effectively, but will also cover
some of the deeper mathematics.
The course will cover:
- Necessary probability and statistics refresher
- Statistical modelling
- Naive Bayes models
- Information theory concepts: information, entropy
- Maximum entropy models
- Features and constraints
- Training algorithms (GIS, IIS, conjugate gradient, ...)
- Sequence modelling with maxent
- Recent advances in maxent models:
  - Smoothing techniques
  - Conditional random fields
- Applications of maximum entropy models in NLP:
  - Classification tasks: pp-attachment, question classification, ...
  - Tagging tasks: POS tagging, chunking, named entity recognition, ...
  - Parsing: C&C CCG parser
Bio James Curran is an ARC Postdoctoral Fellow in
the Language Technology Research Group in the School of Information
Technologies at the University of Sydney. He has just returned to
Australia after completing his Ph.D. in computational lexical
semantics at the University of Edinburgh.
His ARC-funded project, 'Ask the Net: Intelligent Natural Language
Learning', involves automatically asking contributors simple questions via
email which will be collected to create annotated data for standard NLP
problems, e.g. Named Entity Recognition. An interesting challenge is
finding ways of eliciting linguistic knowledge from those without
linguistic training. His other research interests range from standard
statistical NLP problems such as tagging and parsing, through to system
building such as question answering systems.
- Text Categorisation (half course)
Prof. Jon David Patrick - University of Sydney, Sydney
[Course Notes]
This course will cover the principal topics important to creating a
working text categorisation system. It will focus on the components of
such a system and processes required to create it based on the
practical experiences of the Scamseek project. The role of
computational linguistics will be the centre of the discussion but the
surrounding tasks of language modelling, machine learning and software
engineering will all be discussed to varying degrees.
The course will be grounded in the experience of implementing the
Scamseek system. Scamseek automates the
identification of financial scams over a wide range of Internet texts
to a high level of accuracy. The first system for detecting scams on
web pages has been operational since September 2003, and in
its first days of operation it discovered cases that have
gone to prosecution. The complete system to cover all Internet traffic
has been operational since June 2004. These systems are unique in that
they use a linguistic model for computing meaning via Systemic
Functional Grammar. This model of language meaning also solves some of
the problems of identifying very small target sample sizes in very
large corpora, that is, texts with a footprint of less than 1% in a
corpus. Discussion of some aspects of the Scamseek project is
restricted under secrecy agreements with ASIC.
Bio Professor Jon Patrick currently holds the
Chair of Language Technology at the University of Sydney. He has five
degrees and is also a registered psychologist. His early research was
in developing information systems for the real-time capture of
language descriptions of human behaviour. In this work he created the
first systems for recording human behaviour by real-time verbal
descriptions. These systems were applied to many sports such as rugby,
AFL, water polo, and surfing. In the late 1980s he created the first
systems for the automatic capture and on-screen presentation of player
statistics in real time for television broadcasts. In later systems
research he particularly concentrated on the use of subliminal
language and its effectiveness at influencing personal and group
behaviour. He continues this work in research on the identification of
tacit knowledge in IS development through language usage analysis and
the nature of language in psychotherapy. He collaborates with
computational linguists at the University of the Basque Country and
has published the first substantial student grammar of Basque. He is
currently acting as the Director of the Scamseek project, a scam
detection system developed for ASIC by the Capital Markets
Co-operative Research Centre (CMCRC). Personal page (external link)
- Prosody and Intonation in Australian English (half course)
Janet Fletcher - University of Melbourne, Melbourne
[Course Notes]
This course provides a practical introduction to the study of
prosody and intonation in English. The first part of the course will
provide a brief introduction to current prosodic theory. The second
and major part of the course will focus on practical aspects of the
widely used model of intonational analysis, E-ToBI (English Tones and
Break Indices). The primary objective is to teach participants how to
interpret acoustic properties of speech relevant to the understanding
of the higher levels of prosodic structure, as well as to provide a
hands-on approach to ToBI transcription.
Bio Janet Fletcher is an Associate Professor of
Linguistics in the School of Languages at the University of
Melbourne. She completed her PhD in experimental phonetics at the
University of Reading, and was a research associate in the Centre for
Speech Technology Research at the University of Edinburgh from
1986 to 1988. After two years of postdoctoral study with Mary Beckman at
the Ohio State University, she worked in the Speech Hearing and
Language Research Centre, Macquarie University on an industry-funded
project on speech synthesis. She has been at the University of
Melbourne since 1993. Her research interests include intonation and
prosody in Australian English, and the segmental/prosody interface in
Northern Australian Indigenous languages.