pdf-extract

Last updated October 31, 2012. Created by Peter Murray on October 31, 2012.

Pdf-extract is an open source set of tools and libraries for identifying and extracting semantically significant regions of a scholarly journal article (or conference proceeding) PDF. In particular, the pdf-extract tools can identify and extract the individual references from an article. Extracted references can, in turn, be resolved to the appropriate CrossRef DOI using CrossRef's citation resolution tools: Simple Text Query and the experimental CrossRef Metadata Search.
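
The resolution step can be illustrated with a minimal Python sketch. It makes several assumptions: it uses CrossRef's public REST API (the `works` endpoint with the `query.bibliographic` parameter) rather than Simple Text Query or CrossRef Metadata Search, it takes a reference that has already been extracted as plain text, and the function names are hypothetical and not part of pdf-extract. The top-ranked match is only a candidate and should be checked before it is treated as the correct DOI.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public CrossRef REST API, not one of the tools named above.
CROSSREF_WORKS_URL = "https://api.crossref.org/works"


def build_query_url(reference_text, rows=1):
    """Build a URL that searches CrossRef for a free-text citation string."""
    params = {"query.bibliographic": reference_text, "rows": rows}
    return CROSSREF_WORKS_URL + "?" + urllib.parse.urlencode(params)


def best_doi(response):
    """Return the DOI of the first item in a decoded response, or None."""
    items = response.get("message", {}).get("items", [])
    return items[0].get("DOI") if items else None


def resolve_reference(reference_text, timeout=30):
    """Send one extracted reference to CrossRef (requires network access)."""
    url = build_query_url(reference_text)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return best_doi(json.load(resp))
```

Calling `resolve_reference` with the text of a single extracted reference returns the DOI of the highest-ranked candidate, or `None` when CrossRef returns no matches.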
