Nearline Web Archiving
Edward Fox
Abstract
Based on the acquisition method, web archiving may be categorized into client-side, transactional, and server-side archiving [1]. Transactional web archiving happens at the gateway of the origin server. As the name suggests, it archives the HTTP responses to user requests, typically in real time, as the HTTP transactions occur [2, 3]. Despite its distinctive temporal coverage, transactional web archiving suffers from an inherent technical disadvantage. Like server-side archiving, it requires the cooperation of the website owner and/or operator to install server-side add-ons. Like client-side archiving, it also hinges on the HTTP protocol and its “inability to provide bulk copy of server’s content” [1]. Archiving web documents one by one inevitably places extra workload directly on the origin server, making it rather difficult to secure the owner's cooperation.
Publication Details
- Date of publication: 2017
- Journal: Joint Conference on Digital Libraries
- Publication note: Zhiwu Xie, Krati Nayyar, Edward A. Fox: Nearline Web Archiving. Bull. IEEE Tech. Comm. Digit. Libr. 13(1) (2017)