ED523081 - Long-Term Information Preservation and Access

Record Details

ERIC #:ED523081
Title:Long-Term Information Preservation and Access
Authors:Song, Sang Chul
Descriptors:Information Needs; Information Sources; Indexing; Access to Information; Preservation; Archives; Cataloging; Information Retrieval; Metadata; Search Strategies; Replication (Evaluation); Program Validation; Computer System Design; Navigation (Information Systems); Usability; Online Searching; Database Design; Database Management Systems; Electronic Publishing; Information Storage; Computer Storage Devices; Information Technology; Information Theory; Information Management
Source:ProQuest LLC, Ph.D. Dissertation, University of Maryland, College Park
Peer Reviewed:
Publisher:ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Publication Date:2010
Pages:159
Pub Types:Dissertations/Theses - Doctoral Dissertations
Abstract:An unprecedented amount of information encompassing almost every facet of human activity across the world is generated daily in the form of zeros and ones, and that is often the only form in which such information is recorded. A good fraction of this information needs to be preserved for periods ranging from a few years to centuries. Consequently, the problem of preserving digital information over the long term has attracted the attention of many organizations, including libraries, government agencies, scientific communities, and individual researchers. In this dissertation, we address three issues that are critical to ensuring long-term information preservation and access. The first concerns the core requirement of guaranteeing the integrity of preserved contents. Digital information is in general very fragile because errors can be introduced in many ways, such as hardware and media degradation, hardware and software malfunction, operational errors, security breaches, and malicious alterations. To address this problem, we develop a new approach based on efficient and rigorous cryptographic techniques, which guarantees the integrity of preserved contents with extremely high probability even in the presence of malicious attacks. Our prototype implementation of this approach has been deployed and actively used in the past years in several organizations, including the San Diego Supercomputer Center, the Chronopolis Consortium, North Carolina State University, and more recently the Government Printing Office. Second, we consider another crucial component of any preservation system: searching and locating information. The ever-growing size of a long-term archive and the temporality of each preserved item introduce a new set of challenges to providing fast retrieval of content based on a temporal query. The widely used cataloguing scheme has serious scalability problems. The standard full-text search approach has serious limitations since it does not deal appropriately with the temporal dimension and, in particular, is incapable of performing relevancy scoring according to the temporal context. To address these problems, we introduce two types of indexing schemes: a location indexing scheme and a full-text search indexing scheme. Our location indexing scheme provides optimal operations for inserting and locating a specific version of a preserved item given an item ID and a time point, and our full-text search indexing scheme efficiently handles the scalability problem while also supporting relevancy scoring within the temporal context. Finally, we address the problem of organizing inter-related data so that future accesses and data exploration can be performed quickly. In particular, we consider web contents, where we combine a link-analysis scheme with a graph partitioning scheme to place closely related contents in the same standard web archive container. We conduct experiments that simulate random browsing of preserved contents, and show that our data organization scheme greatly reduces the number of containers that must be accessed during a random browsing session. Our schemes have been tested against real-world data of significant scale and validated through extensive empirical evaluations. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Abstractor:As Provided
Reference Count:0

Note:N/A
Identifiers:N/A
Record Type:Non-Journal
Level:N/A
Institutions:N/A
Sponsors:N/A
ISBN:978-1-1242-7017-3
ISSN:N/A
Audiences:N/A
Languages:English
Education Level:Adult Education
Direct Link:http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3426394
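
Illustrative note on the location indexing described in the abstract: the record does not say which data structure the dissertation uses, so the following is only a minimal sketch of the stated operation (locating a specific version of an item given an item ID and a time point). It assumes an in-memory map from item ID to a time-sorted version list searched by binary search; the class name `VersionIndex`, its methods, the integer timestamps, and the container names are all hypothetical.

```python
import bisect
from collections import defaultdict


class VersionIndex:
    """Hypothetical in-memory index: item ID -> versions sorted by time."""

    def __init__(self):
        self._times = defaultdict(list)      # item_id -> sorted timestamps
        self._locations = defaultdict(list)  # item_id -> locations, same order

    def insert(self, item_id, timestamp, location):
        """Record that a version of item_id captured at timestamp lives at location."""
        times = self._times[item_id]
        pos = bisect.bisect_right(times, timestamp)
        times.insert(pos, timestamp)
        self._locations[item_id].insert(pos, location)

    def locate(self, item_id, timestamp):
        """Return the location of the latest version at or before timestamp, else None."""
        times = self._times.get(item_id)
        if not times:
            return None
        pos = bisect.bisect_right(times, timestamp)
        if pos == 0:
            return None  # the item had not been captured yet at this time point
        return self._locations[item_id][pos - 1]


index = VersionIndex()
index.insert("page-1", 20090101, "container-a")
index.insert("page-1", 20100101, "container-b")
print(index.locate("page-1", 20090615))  # container-a
```

Lookup here costs one binary search per query; insertion into a Python list is linear in the number of versions of that item, so this sketch makes no claim to the optimality the abstract reports for the author's scheme.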
 
