Update README.md

2019-07-24 21:21:37 +02:00 · b75b5c73c6
commit b75b5c73c6
parent 2211abb068
1 changed file with 23 additions and 21 deletions
--- a/README.md
+++ b/README.md
@@ -4,15 +4,18 @@ pdf-gold-digger
 Pdf information extraction library based on [pdf.js](https://mozilla.github.io/pdf.js/)
 and [node.js](https://nodejs.org).
-### Install
+## Install
 ```npm install -g pdf-gold-digger```
-
+## Usage
 ### Usage
 ```pdfdig -i some_file.pdf```  
 for help use :  
 ```pdfdig -h```
 ```bash
 pdfdig -i some_file.pdf
 ```  
 ## Avaliable commands
 ```bash
 pdfdig -h
 ex. pdfdig -i input-file -o output_directory -f json
  --input  or  -i   pdf file location (required)
@@ -22,36 +25,35 @@ ex. pdfdig -i input-file -o output_directory -f json
  --help   or  -h   display this help message
 ```
-
+## Advanced usage
-#### or test by clonning repository
+```bash
-```git clone https://github.com/vane/pdf-gold-digger```    
+git clone https://github.com/vane/pdf-gold-digger
-then run   
+sh demo.sh
-```sh demo.sh```  
+```
 and see results in ```out``` directory 
-
+## Documentation
 ### Documentation url
 [pdf-gold-digger](https://vane.pl/pdf-gold-digger/)
-
+## Features:
 ## Work in progress
 ### Supports:
 - extract text
  - separate each page
  - separate each line
  - separate font information
  - bounding box position (probably buggy now)
 - extract images
- output to text ```-f text (default)```
+- output formats
- output to json ```-f json```
+  - text ```-f text (default)```
  - json ```-f json```
 - specify output directory
-### TODO:
+## TODO:
 - extract text
  - bounding box position
 - load pdf from remote location
  - from url    
 - output to xml format
 - output to html format
 - output to markdown format
 - output to zip
 - extract font
 - extract tables