|
|
Line 1: |
Line 1: |
| {{Event | | {{Event |
− | |Acronym=Corpus Profiling 2008
| + | |Title=2. Fachkonferenz Kinder- und Jugendbibliotheken |
− | |Title=Corpus Profiling for Information Retrieval and Natural Language Processing Workshop | + | |Ordinal=2 |
− | |Type=Workshop
| + | |In Event Series=Event Series:Dbc7071a-aa41-495a-8e00-29efb6b47934 |
− | |Homepage=kmi.open.ac.uk/events/corpus-profiling/index.php
| |
− | |City=London | |
− | |Country=Country:GB | |
− | |wikicfpId=3474
| |
− | |pageCreator=127.0.0.1
| |
− | |contributionType=1
| |
| |Single Day Event=no | | |Single Day Event=no |
− | |Start Date=Oct 18, 2008 | + | |Start Date=2017/02/15 |
− | |End Date=Oct 18, 2008 | + | |End Date=2017/02/18 |
− | |Academic Field=Information Retrieval
| |
| |Event Status=as scheduled | | |Event Status=as scheduled |
| |Event Mode=on site | | |Event Mode=on site |
− | }}
| + | |City=Remscheid |
− | {{Event Deadline
| + | |Region=Nordrhein-Westfalen |
− | |Notification Deadline=Sep 12, 2008 | + | |Country=Country:DE |
− | |Camera-Ready Deadline=Sep 26, 2008 | |
− | |Submission Deadline=Aug 15, 2008 | |
| }} | | }} |
| {{Event Deadline}} | | {{Event Deadline}} |
| {{S Event}} | | {{S Event}} |
− |
| |
− | <pre>
| |
− | -----------------------------------
| |
− | PURPOSE
| |
− | -----------------------------------
| |
− |
| |
− | We aim to bring together people from different research communities
| |
− | interested in exploring how corpus characteristics affect the behaviour
| |
− | of techniques in information retrieval and natural language processing,
| |
− | and to set out a roadmap for a shared research agenda.
| |
− |
| |
− | It is well known in NLP and IR that the effectiveness of a technique
| |
− | depends on both the data on which it is deployed and its match with the
| |
− | task at hand. In 1973, Spärck-Jones attributed differing degrees of
| |
− | success at automatic classification to differences in dataset
| |
− | characteristics. Since Croft and Harper (1979), IR performance has
| |
− | repeatedly been related to collection size and other features, though no
| |
− | upper bound has been found.
| |
− |
| |
− | The importance of data and task dependencies has been highlighted in IR,
| |
− | anaphora resolution, automatic summarization and, recently, word sense
| |
− | disambiguation. Many web/enterprise web retrieval systems rely on URL
| |
− | properties, link graph properties, click streams, and so on, with
| |
− | performance dependent on the degree to which this evidence is present
| |
− | and meaningful in a particular corpus.
| |
− |
| |
− | Systematically exploring features that can be used effectively to
| |
− | characterise corpora has been missing from IR/NLP research. This
| |
− | creates problems with replicability of experimental results and the
| |
− | development of applications.
| |
− |
| |
− | The time is right to pursue this dependence systematically to address
| |
− | topics in tracking the effect of dataset profile on technique
| |
− | performance. Over the past 15 years, the approaches of several subject
| |
− | areas have converged with IR, as large corpora and test collections
| |
− | assume central importance in research methodologies. These areas have
| |
− | highlighted issues surrounding the role of data.
| |
− |
| |
− |
| |
− | -----------------------------------
| |
− | WORKSHOP FORMAT
| |
− | -----------------------------------
| |
− |
| |
− | The workshop will be a day long, held in conjunction with the Information
| |
− | Interaction in Context symposium (IIiX'2008, http://irsg.bcs.org/iiix2008/). The
| |
− | workshop will have three components:
| |
− |
| |
− | (1) invited talks in the morning, introducing the background from
| |
− | different perspectives
| |
− |
| |
− | (2) two afternoon sessions, presenting peer-reviewed papers
| |
− |
| |
− | (3) a panel discussion (panel composed of presenters and the organizers).
| |
− |
| |
− |
| |
− | -----------------------------------
| |
− | TOPICS OF INTEREST
| |
− | -----------------------------------
| |
− |
| |
− | We welcome original research or position papers. We particularly
| |
− | encourage postgraduate students or postdoctoral researchers to submit
| |
− | papers. Topics of interest include, but are NOT LIMITED to, the
| |
− | following areas:
| |
− |
| |
− | * Suitable features to characterise text/language variety,
| |
− | capturing known effects on technique performance with respect to a task;
| |
− |
| |
− | * Tasks that depend on aspects of corpus profiles (e.g., the
| |
− | positive correlation of QA performance with fact frequency in a corpus);
| |
− |
| |
− | * Limitations of context-independent frequency-based measures, and
| |
− | exploration of measures that highlight complex dependencies;
| |
− |
| |
− | * Tools/techniques for characterising a feature or the extent to
| |
− | which it is manifested in a corpus;
| |
− |
| |
− | * Evaluation methodologies for testing feature candidates relative
| |
− | to task/technique;
| |
− |
| |
− | * Learnability of features (cf. meta-level learning for
| |
− | classification algorithms).
| |
− |
| |
− |
| |
− | -----------------------------------
| |
− | IMPORTANT DATES
| |
− | -----------------------------------
| |
− |
| |
− | 15 August 2008: Paper submission due
| |
− |
| |
− | 12 September 2008: Notification of acceptance/rejection
| |
− |
| |
− | 26 September 2008: Camera-ready due
| |
− |
| |
− | 18 October 2008: Workshop
| |
− |
| |
− |
| |
− | -----------------------------------
| |
− | SUBMISSION GUIDELINES
| |
− | -----------------------------------
| |
− |
| |
− | Original technical papers, short papers and position papers are all
| |
− | welcome. Please ensure that your submission does not exceed 5,000 words
| |
− | in length. Use 10 point font size, double column for body text, and 12
| |
− | point bold for headings. Please send your submission in PDF to all
| |
− | three organizers (A.Deroeck@open.ac.uk; d.song@open.ac.uk;
| |
− | udo@essex.ac.uk) with subject "Corpus Profiling workshop submission".
| |
− |
| |
− | We will publish the accepted papers electronically through BCS's
| |
− | Electronic Workshops in Computing (eWiC), together with the extended
| |
− | abstracts of invited talks and a summary of the panel discussion. We will
| |
− | seek to pursue the research thread through further workshops at relevant
| |
− | conferences. We plan to organize a post-workshop special issue in a
| |
− | suitable IR- or NLP-related journal.
| |
− |
| |
− | -----------------------------------
| |
− | PROGRAMME COMMITTEE
| |
− | -----------------------------------
| |
− |
| |
− | Anne De Roeck (The Open University)
| |
− | Udo Kruschwitz (University of Essex)
| |
− | Ruslan Mitkov (University of Wolverhampton)
| |
− | Nikolaos Nanas (CERETETH, Greece)
| |
− | Michael Oakes (University of Sunderland)
| |
− | Ian Ruthven (University of Strathclyde)
| |
− | Dawei Song (KMi, The Open University)
| |
− | Tomek Strzalkowski (SUNY Albany)
| |
− | Alistair Willis (The Open University)
| |
− |
| |
− | For further information please visit
| |
− | http://kmi.open.ac.uk/events/corpus-profiling/index.php
| |
− | </pre>This CfP was obtained from [http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=3474&copyownerid=2 WikiCFP][[Category:Natural language processing]]
| |
− | [[Category:Information retrieval]]
| |