The DPLA is launching an open-source tool for fast, large-scale data harvests from OAI repositories. The tool uses the Apache Spark distributed processing engine to speed up and scale up harvesting, and to perform complex analysis of the harvested data. It is helping us improve our internal workflows and provide better service to our hubs.
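For readers unfamiliar with OAI harvesting, the following is a minimal sequential sketch of the underlying OAI-PMH paging protocol, not the DPLA tool itself and not its Spark-based design. It assumes a `ListRecords` request with the `oai_dc` metadata prefix; the function names and the `fetch` callable (anything that maps a URL to response text) are hypothetical and introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_request_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords URL.

    When resuming, OAI-PMH expects only the verb and the token.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    # Only header identifiers live in the OAI namespace.
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    # An absent or empty token marks the last page.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


def harvest(base_url, fetch):
    """Yield identifiers page by page; `fetch` is a hypothetical URL -> text callable."""
    token = None
    while True:
        ids, token = parse_page(fetch(build_request_url(base_url, token)))
        yield from ids
        if token is None:
            break
```

Because each page depends on the token returned by the previous one, a single harvest of this kind is inherently sequential; a distributed engine would presumably gain its speed by running many such harvests, or the downstream analysis, in parallel.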
AtoM stands for Access to Memory. It is a web-based, open source application for standards-based archival description and access in a multilingual, multi-repository environment.
Key features:
VIPS is a free image processing system. It includes a range of filters, arithmetic operations, colour processing, histograms, and geometric transforms. It supports ten pixel formats, from 8-bit unsigned int to 128-bit complex.
ePADD is a software package developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives.
The software comprises four modules:
FixityBerry is software for the Raspberry Pi that runs fixity scans on all hard drives connected via USB. The Pi can read a wide variety of drive formats because Linux packages are available for them. Once scanning is complete, it sends an email and shuts down the device.
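As a rough illustration of what a fixity scan involves, the sketch below checksums every file under a directory and compares the result with an earlier scan. It is not FixityBerry's code: the choice of SHA-256, the function names, and the report layout are all assumptions made for the example, and the email and shutdown steps are left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(previous, current):
    """Report files whose checksum changed, that disappeared, or that are new."""
    return {
        "changed": sorted(
            name for name in previous.keys() & current.keys()
            if previous[name] != current[name]
        ),
        "missing": sorted(previous.keys() - current.keys()),
        "new": sorted(current.keys() - previous.keys()),
    }
```

A scan of a mounted drive could be stored (for example as JSON) and passed as `previous` on the next run; any entry under `changed` or `missing` signals a file that needs attention.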
Siegfried is a PRONOM-based file format identification tool.
Key features are:
Kri-Kri is a Rails engine for metadata aggregation, enhancement, and quality control. The Digital Public Library of America uses Kri-Kri as part of Heiðrún, its metadata ingestion system.
WebArchivePlayer is a new desktop tool that provides a point-and-click wrapper for viewing web archive files (WARC and ARC). The player lets users browse web archive files, such as those created via https://webrecorder.io, locally on their desktop. Once the files are downloaded, no internet connection is needed to browse the archive.
The PERICLES Extraction Tool (PET) is open source (Apache 2 licensed) Java software for extracting significant information from the environment in which digital objects are created and modified. This information supports object use and reuse, e.g. for better long-term preservation of data.