Commit Graph

36 Commits

Author SHA1 Message Date
Michal Szczepanski cc2002078b Extract images with VisitorImage #2
- output to file based on format data.{format}
- more documentation
2019-07-23 04:53:17 +02:00
Michal Szczepanski 1b6bfbbe13 Documentation FormatterJSON 2019-07-23 04:35:41 +02:00
Michal Szczepanski 94c05cf064 Commandline change -i for file input -f for format 2019-07-23 04:05:48 +02:00
Michal Szczepanski 63710c2f8b Add demo.sh and test.sh for automating stuff 2019-07-23 01:59:55 +02:00
Michal Szczepanski 23c4586d28 VisitorBase for common visitors constructor 2019-07-23 01:43:57 +02:00
Michal Szczepanski bf4698df59 dummy 2019-07-23 01:22:54 +02:00
Michal Szczepanski ab4e0ffdae Update README.md 2019-07-23 01:22:16 +02:00
Michal Szczepanski 81f4de1c29 Documentation generate scripts 2019-07-23 01:19:17 +02:00
Michal Szczepanski 5cff7ee4c0 Fix after move lib to src 2019-07-23 01:05:11 +02:00
Michal Szczepanski 5b2c8eded3 Move lib to src 2019-07-23 01:04:28 +02:00
Michal Szczepanski 4a01a382cf Documentation 2019-07-23 00:58:58 +02:00
Michal Szczepanski 12536bbc21 Text elements move to separate files 2019-07-23 00:44:49 +02:00
Michal Szczepanski ac54beceba Documentation for Visitor 2019-07-23 00:37:13 +02:00
Michal Szczepanski 4aea804772 dummy GoldDigger comment 2019-07-23 00:34:20 +02:00
Michal Szczepanski 263fe310f2 Update README.md with simplified usage information 2019-07-23 00:31:09 +02:00
Michal Szczepanski cefca38fa9 Fix missing pdfdig shebang 2019-07-23 00:26:04 +02:00
Michal Szczepanski b797a2f192 Fix package.json 2019-07-23 00:22:42 +02:00
Michal Szczepanski c141efb7a6 Formatter move each formatter to separate file 2019-07-23 00:17:48 +02:00
Michal Szczepanski b638f75bca FormatterText return empty output as print is handled elsewhere 2019-07-23 00:14:17 +02:00
Michal Szczepanski a2b9bcb1d7 Fix Visitor dependency paths 2019-07-23 00:09:16 +02:00
Michal Szczepanski e3d25d2e77 Visitor classes move to separate files make universal visit method 2019-07-23 00:07:23 +02:00
Michal Szczepanski 9f9126e2c0 Documentation for Visitor 2019-07-22 23:34:41 +02:00
Michal Szczepanski 2f4fa0c474 Rename Executor to Visitor, fix GoldDiger digPage 2019-07-22 23:32:51 +02:00
Michal Szczepanski cbfdc29fde Documentation for Constraints, Extract, FontObject 2019-07-22 23:27:52 +02:00
Michal Szczepanski f7fc939832 Add FONT_IDENTITY_MATRIX for calculating glyph x position 2019-07-22 23:20:37 +02:00
Michal Szczepanski 99bd258317 Documentation add to GoldDigger 2019-07-22 23:16:36 +02:00
Michal Szczepanski 53c053542d Formatter JSON refactoring move nested forEach to separate methods 2019-07-22 23:08:46 +02:00
Michal Szczepanski 0cf9707d86 Update package.json description 2019-07-22 22:51:49 +02:00
Michal Szczepanski fc54756c50 Add option to run from command line using 'pdfdig' command 2019-07-22 22:50:26 +02:00
Michal Szczepanski 0fa89f0c04 Valid formatting options fix 2019-07-22 22:48:58 +02:00
Michal Szczepanski b3d9b10317 Add output formatter and json output 2019-07-22 22:46:05 +02:00
Michal Szczepanski 54af1b3753 Move config to gd.js and move feature extraction to Executor 2019-07-22 21:26:01 +02:00
Michal Szczepanski 738aecf068 Add proper command line interface for extraction 2019-07-22 20:58:24 +02:00
Michal Szczepanski 195ecdad63 Add LICENSE, README update package.json with valid repository url 2019-07-22 20:20:37 +02:00
Michal Szczepanski c8d15caca3 Use proper example file inside package.json 2019-07-22 20:03:59 +02:00
Michal Szczepanski 64a7df2f75 Start of pdf extracting nodejs library based on pdfjs
- add ability to extract text with:
  * font information
  * each line separated
  * each text have stored positions
- skipping text inside diagrams
2019-07-22 19:59:22 +02:00