50 Commits

SHA1 Message Date
9b314bda4a Update README.md 2019-07-23 08:56:13 +02:00
b58f364fa0 Update README.md 2019-07-23 08:55:39 +02:00
57e594402e Update README.md with new todo / done 2019-07-23 06:21:52 +02:00
068f56db5c Version 0.0.4 (tag: 0.0.4) 2019-07-23 06:18:00 +02:00
c2ebe7f526 Add todo to Extract 2019-07-23 06:12:55 +02:00
3810d2fcc0 Fix text extraction based on pdf.js samples 2019-07-23 06:04:55 +02:00
263b318029 Update README.md TODO list 2019-07-23 05:47:58 +02:00
30d673a455 Add missing OPS beginAnnotations, endAnnotations 2019-07-23 05:45:16 +02:00
9c2baab2a6 Fix Unimplemented operator message 2019-07-23 05:44:02 +02:00
d0a2e44cdf Update README.md with correct package url 2019-07-23 05:34:59 +02:00
10443b009b Update package.json repository url (tag: 0.0.3) 2019-07-23 05:28:39 +02:00
44e59dfe8f Version 0.0.2 (tag: 0.0.2) 2019-07-23 05:16:41 +02:00
c2f597d899 Update README.md with documentation location and package.json keywords 2019-07-23 05:16:28 +02:00
2dee872d40 Change output directory structure / Closes #2 (tag: 0.0.1) 2019-07-23 05:00:53 +02:00
cc2002078b Extract images with VisitorImage #2 2019-07-23 04:53:17 +02:00
- output to file based on format data.{format}
- more documentation
1b6bfbbe13 Documentation FormatterJSON 2019-07-23 04:35:41 +02:00
94c05cf064 Commandline change -i for file input -f for format 2019-07-23 04:05:48 +02:00
63710c2f8b Add demo.sh and test.sh for automating stuff 2019-07-23 01:59:55 +02:00
23c4586d28 VisitorBase for common visitors constructor 2019-07-23 01:43:57 +02:00
bf4698df59 dummy 2019-07-23 01:22:54 +02:00
ab4e0ffdae Update README.md 2019-07-23 01:22:16 +02:00
81f4de1c29 Documentation generate scripts 2019-07-23 01:19:17 +02:00
5cff7ee4c0 Fix after move lib to src 2019-07-23 01:05:11 +02:00
5b2c8eded3 Move lib to src 2019-07-23 01:04:28 +02:00
4a01a382cf Documentation 2019-07-23 00:58:58 +02:00
12536bbc21 Text elements move to separate files 2019-07-23 00:44:49 +02:00
ac54beceba Documentation for Visitor 2019-07-23 00:37:13 +02:00
4aea804772 dummy GoldDigger comment 2019-07-23 00:34:20 +02:00
263fe310f2 Update README.md with simplified usage information 2019-07-23 00:31:09 +02:00
cefca38fa9 Fix missing pdfdig shebang 2019-07-23 00:26:04 +02:00
b797a2f192 Fix package.json 2019-07-23 00:22:42 +02:00
c141efb7a6 Formatter move each formatter to separate file 2019-07-23 00:17:48 +02:00
b638f75bca FormatterText return empty output as print is handled elsewhere 2019-07-23 00:14:17 +02:00
a2b9bcb1d7 Fix Visitor dependency paths 2019-07-23 00:09:16 +02:00
e3d25d2e77 Visitor classes move to separate files make universal visit method 2019-07-23 00:07:23 +02:00
9f9126e2c0 Documentation for Visitor 2019-07-22 23:34:41 +02:00
2f4fa0c474 Rename Executor to Visitor, fix GoldDigger digPage 2019-07-22 23:32:51 +02:00
cbfdc29fde Documentation for Constraints, Extract, FontObject 2019-07-22 23:27:52 +02:00
f7fc939832 Add FONT_IDENTITY_MATRIX for calculating glyph x position 2019-07-22 23:20:37 +02:00
99bd258317 Documentation add to GoldDigger 2019-07-22 23:16:36 +02:00
53c053542d Formatter JSON refactoring move nested forEach to separate methods 2019-07-22 23:08:46 +02:00
0cf9707d86 Update package.json description 2019-07-22 22:51:49 +02:00
fc54756c50 Add option to run from command line using 'pdfdig' command 2019-07-22 22:50:26 +02:00
0fa89f0c04 Valid formatting options fix 2019-07-22 22:48:58 +02:00
b3d9b10317 Add output formatter and json output 2019-07-22 22:46:05 +02:00
54af1b3753 Move config to gd.js and move feature extraction to Executor 2019-07-22 21:26:01 +02:00
738aecf068 Add proper command line interface for extraction 2019-07-22 20:58:24 +02:00
195ecdad63 Add LICENSE, README update package.json with valid repository url 2019-07-22 20:20:37 +02:00
c8d15caca3 Use proper example file inside package.json 2019-07-22 20:03:59 +02:00
64a7df2f75 Start of pdf extracting nodejs library based on pdfjs 2019-07-22 19:59:22 +02:00
- add ability to extract text with:
  * font information
  * each line separated
  * each text have stored positions
- skipping text inside diagrams