
pdf-gold-digger
====

PDF information extraction library based on [pdf.js](https://mozilla.github.io/pdf.js/)
and [Node.js](https://nodejs.org).

## Work in progress

### Usage
``git clone https://github.com/vane/pdf-gold-digger``  
``gd -f some.pdf``

### Supports:
- extract text
  - separated by page
  - separated by line
  - font information reported separately
  - bounding box position
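
To illustrate what per-line text, font and bounding box output could look like, here is a minimal sketch. It is not this library's actual code. It assumes text items shaped like those returned by pdf.js `page.getTextContent()` (`str`, `transform`, `width`, `height`, `fontName`). The `groupIntoLines` function, its `tolerance` parameter and the output shape are hypothetical names chosen for this example.

```javascript
// Sketch only: not pdf-gold-digger's implementation.
// Each item mirrors a pdf.js text item:
// { str, transform: [a, b, c, d, x, y], width, height, fontName }

/**
 * Group the text items of one page into lines by baseline y-coordinate.
 * `tolerance` (hypothetical) is the largest baseline difference, in PDF
 * units, for which an item still joins the current line.
 */
function groupIntoLines(items, tolerance = 2) {
  // The PDF origin is bottom-left, so descending y means top to bottom.
  const sorted = [...items].sort((a, b) => b.transform[5] - a.transform[5]);

  const groups = [];
  for (const item of sorted) {
    const y = item.transform[5];
    const last = groups[groups.length - 1];
    if (last && Math.abs(last.y - y) <= tolerance) {
      last.items.push(item);
    } else {
      groups.push({ y, items: [item] });
    }
  }

  return groups.map(({ y, items: lineItems }) => {
    // Within a line, order items left to right.
    lineItems.sort((a, b) => a.transform[4] - b.transform[4]);
    const left = lineItems[0].transform[4];
    const right = Math.max(...lineItems.map((i) => i.transform[4] + i.width));
    const height = Math.max(...lineItems.map((i) => i.height));
    return {
      // Assumes spacing is already part of each item's `str`.
      text: lineItems.map((i) => i.str).join(''),
      fonts: [...new Set(lineItems.map((i) => i.fontName))],
      bbox: { x: left, y, width: right - left, height },
    };
  });
}

// Made-up sample items for a single page.
const page = [
  { str: 'world', transform: [1, 0, 0, 1, 110, 700], width: 30, height: 12, fontName: 'g_d0_f2' },
  { str: 'Hello ', transform: [1, 0, 0, 1, 72, 700.5], width: 38, height: 12, fontName: 'g_d0_f1' },
  { str: 'Second line', transform: [1, 0, 0, 1, 72, 680], width: 60, height: 12, fontName: 'g_d0_f1' },
];

const lines = groupIntoLines(page);
console.log(lines.map((l) => l.text)); // [ 'Hello world', 'Second line' ]
```

Grouping by baseline with a small tolerance is one simple heuristic. Rotated text or multi-column layouts would need more than this sketch handles.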

### TODO:
- specify output format and output directory    
- output to xml format
- output to json format
- extract images to files
- extract font
- extract tables
- advanced font information
- extract forms
- extract drawings