Information retrieval and I Guide - Resources and Review

Document discussing the difficulties in Web-based information retrieval due to the diversity and largely textual nature of Internet resources. A retrieval method employing statistical correlation analysis techniques is introduced, and its ability to locate parts and services on the Web is evaluated.
This article describes a number of empirical studies investigating the use of the data mining approach for the analysis of health information. The examples described highlight the factors perceived as influencing the success or otherwise of the data mining approach in each case, and illustrate the generic difficulties that may be encountered during the process, and how these difficulties may be overcome.
Provides guidelines for the content, organisation, and presentation of indexes used for the retrieval of parts of, and entire, documents. It deals with the principles of indexing, regardless of the type of material, the method used (intellectual analysis, machine algorithm, or both), the medium of the index, or the method of presentation for searching. It emphasises three processes essential for all indexes: comprehensive design, vocabulary management, and the provision of syntax. Definitions of indexes are included, together with recommendations regarding their design. Available in PDF only.
Paper examining the selection of information within a digital library environment at three levels: selecting which library (repository) to look in, selecting which document(s) within a library to look at, and selecting fragments of data (text, numeric data, images) from within a document. Similarities and differences between these three levels are discussed.
A selective collection of resources compiled to facilitate information research and retrieval. It is organised along three major divisions. Section one is dedicated to public services and has pointers to search engines, topical guides, ready reference and databases. Section two is devoted to professional development, knowledge management and library literature on the web. Section three covers technical services and provides guides for cataloguing, Internet and intranet development tools and acquisitions sourcing.
Articles by Tefko Saracevic, Professor at the School of Communication, Information and Library Studies at Rutgers. Articles cover a range of topics including education, evaluation, research, and practice in the area of digital libraries, as well as feedback, interactivity, modelling, and relevance in information retrieval. Articles are available in Word or PDF format.
Aims to support large-scale resource discovery across the Yorkshire and Humberside region by using the Z39.50 protocol to create a distributed union catalogue.
Details of a study into the relevancy of information retrieval using a visual user interface as opposed to the more common keyword interface. Spatial relationships between documents are plotted on a matrix, enabling the relevancy amongst papers to be assessed, using information based on the distance between points and the direction of connecting axes.
CIQM acts as a clearing house to which database users may report problems relating to the quality of a database being used.
Online resources relevant for evaluation, development and administration of high quality factual and scholarly networked information systems.
A guide to Internet, library and commercial information resources. Describes how to find Web pages, books, articles, patents, statistics, theses and dissertations, from sources such as discussion groups, commercial databases, journals and government publications.
Article discussing the notion that different people use different terms or phrases to access the same information. An approach is presented which recognises that one concept may be represented by several different terms, many concepts may be represented by a single term, and that concepts may overlap.
Attempts to measure the annual production of information content by storage media type. Coverage includes print resources, film, optical and magnetic storage, the Internet, and broadcast media. Estimates total production at around 250 megabytes for every human on earth, and argues that digital information accounts for the vast majority, with print documents only making up 0.003 percent of the total quantity produced. Concludes that better understanding and tools are required to make use of the ever-increasing supply of information, and to avoid total information overload.
Web resource about the TREC conference series which aims to provide the infrastructure necessary for large-scale evaluation of information retrieval methodologies. Includes proceedings of each conference in PS or PDF formats and a comprehensive data section to enable organisations to evaluate their own retrieval systems at any time. Sections for active participants and past results require a password which can be obtained by emailing the program manager.
Aims to establish a pilot virtual clump to provide single search access to the library catalogues of six members of the M25 Consortium of Higher Education Libraries. The project will consist of a seamless search tool to the library OPACs of the six pilot partners and will provide monograph and serial information using the Z39.50 protocol.
Group of the International Federation of Library Associations and Institutions which aims to stimulate, develop and promote concepts and tools for the improvement of quality of modern library services, involving advanced technologies for creation, storage, retrieval, manipulation and display of information.
Set of text search tools which can be downloaded free of charge. Includes Callable Personal Librarian, PLWeb Turbo, PLWeb-CD and Personal Librarian. An overview of each tool is provided, together with information about applications, features, benefits and technical requirements.
Full text book on information retrieval, related concepts, and system design. Chapters are offered in PDF and html formats.
Discussion of technical approaches to cross-language information retrieval looking at user needs and highlighting some worldwide initiatives.
User group for those wanting to get the most from online and CD-ROM products and services.
A free electronic newsletter on data mining and knowledge discovery research and applications, together with a guide to commercial and public-domain tools for data mining and knowledge discovery, links to companies providing software, consulting, and other data mining services solutions, a directory of data-mining-based solutions to industry-specific problems, web sites, research projects, reference materials, journals and meetings relevant to data mining and knowledge discovery.
An overview of the Z39.50 standard explaining what it is and how it works. Identifies the benefits of using Z39.50 implementations, offers a brief history, describes some key technical features, provides examples of some Z39.50 applications, and forecasts the future direction of the standard. Available in PDF only.
Links to databases of terms covering science and technology, leisure and entertainment, law, administration, and commerce. Each glossary comprises terms of value for information retrieval in a particular subject field.
Collection of Welsh language resources catering for learners of all abilities. Information includes a bibliography, software packages, magazines, grammar guides, and spelling and vocabulary checkers.
An article describing a semantic information retrieval system using concepts as opposed to keywords or phrases. Describes methods by which word and concept distribution is determined.
Project aiming to integrate 25 Z39.50 compliant catalogues or information services of CAIRNS sites across Scotland into a functional and user-adaptive test-bed service.
Collection of pointers to conferences, software, contacts and other resources in the field of artificial intelligence and knowledge discovery.
Project aiming to produce a sustainable approach to making searching, finding, browsing, accessing and retrieving multimedia resources as easy as possible, tools for the maintenance and security of information resources and mechanisms for monitoring the quality of information.
Full text of a report which reviewed the literature on information-seeking behaviour in fields outside information science, and proposed a new model of information behaviour.
VTLS develops, markets, and supports solutions for managing library collections and accessing information via computer networks.
Provides the widely used collection description schema as proposed by the UK Office of Library and Information Networking and derived from Michael Heaney's related report, 'An Analytical Model of Collections and their Catalogues'. Provides a definitive set of metadata attributes to ensure the effective description of library, archive and museum collections.
Peer-reviewed journal covering research and application issues in many fields, including statistics, databases, pattern recognition and learning, data visualisation, uncertainty modelling, data warehousing, optimisation, and high performance computing.
Directory including detailed sections on print or back-of-the-book indexing, archival material, databases, software, special formats, manual indexers, information retrieval, the World Wide Web, and professional associations.
A project aimed at dealing with the problems of large and complex retrieval sets, possibly from multiple databases, obtained via the Z39.50 protocol. The trial service allows cross-searching of at least 12 university library catalogues, using Z39.50 from within a Web browser.
An interactive tutorial comprising three modules designed to introduce the user to online research methods, with a view to improving their information retrieval skills. The first section focuses on selecting the most appropriate sources depending on the user's research interests, the second concentrates on the search process itself, and the final module considers how best to assess the credibility and usefulness of material retrieved.
Draft profile identifying a subset of specifications from the Z39.50 Information Retrieval Protocol for use in Z39.50 client and server software. An explanation is given as to how this will improve search and retrieval among library catalogues, union catalogues, and other electronic resource discovery services worldwide, together with information about scope, functional requirements, and conformance issues.
The main focus of standardisation in the library arena has moved from that of supporting efficiency to allowing library users to access external resources and facilitating remote access to library resources. The potential of full interoperability in the field is examined along with its likely impact. Some of the gaps in current standards are examined, with a focus on information retrieval.
Article focusing on aspects of information retrieval including automatic extraction of key concepts or names, collaborative filtering, visualisation techniques, and classification and clustering. Topics are discussed in relation to information retrieval history and the development of the Web.
Article describing Korean document retrieval techniques and assessing a novel approach for retrieving SGML / HyTime documents from large document databases.