|
Gabrielatos, C. (2007). Selecting query terms to build a specialised corpus
from a restricted-access database. ICAME Journal 31, 5–43. |
|
|
|
|
|
Abstract / Introduction:
This paper proposes an accessible measure of the relevance
of additional terms to a given query, describes and comments on the steps
leading to its development, and discusses its utility. The measure, termed
relative query term relevance (RQTR), draws on techniques used in information
retrieval, and can be combined with a technique used in creating corpora from
the world wide web, namely keyword analysis. It
is independent of reference corpora, and does not require knowledge of the
number of (relevant) documents in the database. Although it does not make use
of user/expert judgements of document relevance, it does allow for subjective
decisions. However, subjective decisions are triangulated against two
objective indicators: keyness and, mainly, RQTR. |
|
|
|
|
|
Key words: Corpus, query terms, database, keywords, keyness, term relevance |
|
|
|
|
|
Relevant details: An earlier version was presented at the Lancaster
University Corpus Research Group on 30 October 2006. |
|
|
|
|
|
Articles on the same topic |
|
|
Baroni, Marco and Silvia Bernardini. 2003. The BootCaT
toolkit: Simple utilities for bootstrapping corpora and terms from the web,
version 0.1.2. |
|
|
Baroni, Marco and Serge Sharoff. 2005. Creating specialized and general corpora using
automated search engine queries. Paper presented at Corpus Linguistics
2005, Birmingham University, 14–17 July 2005. |
|
|
Baroni, Marco, Adam Kilgarriff,
Jan Pomikálek and Pavel Rychlý. 2006. Web-BootCaT:
Instant domain-specific corpora to support human translators. Proceedings
of EAMT 2006, 247–252. |
|
|
Sinclair, John. 2004. Appendix:
How to build a corpus. In M. Wynne (ed.). Developing linguistic corpora: A
guide to good practice, 79–83. Oxford: Oxbow Books. |
|
|
Boughanem, Mohand, Yannick Loiseau and Henri Prade. 2006.
Rank-ordering documents according to
their relevance in information retrieval using refinements of
ordered-weighted aggregations. In M. Detyniecki,
J.M. Jose, A. Nürnberger and C.J. van Rijsbergen (eds.). Adaptive multimedia retrieval:
User, context, and feedback. Third International Workshop, AMR 2005,
Glasgow, UK, July 28–29, 2005: Revised selected papers, 44–54. Berlin:
Springer. |
|
|
Xu, Jinxi and W. Bruce Croft. 1996. Query expansion using local and global
document analysis. In Proceedings of the 19th Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval (SIGIR
'96), Zurich, Switzerland (August 18–22, 1996). ACM Press, New York, NY,
4–11. |
|
|
|
|
|
If you know of any related publications or discussions
freely available online, please contact me. |
|