{"id":1667,"date":"2014-12-25T20:13:28","date_gmt":"2014-12-25T22:13:28","guid":{"rendered":"http:\/\/www.feltex.com.br\/felix\/?p=1667"},"modified":"2015-02-12T15:47:08","modified_gmt":"2015-02-12T17:47:08","slug":"leitura-de-pdf-com-pdfbox","status":"publish","type":"post","link":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/","title":{"rendered":"Quick Tip: Reading PDFs with PDFBox"},"content":{"rendered":"

Quick Tip: Reading PDFs with PDFBox<\/H1>
\nHello friends,
\n Today's tip is PDFBox, another Apache project. As the name suggests, this framework helps us manipulate files in PDF format. We will first show code that reads a simple file and then modify the program to add new functionality.<\/p>\n

\"PDFBox\"<\/p>\n

<\/p>\n

1. Reading the PDF<\/H2>
\n Create a new project and add the class shown in the code below:<\/p>\n
\r\npackage br.com.feltex.lerpdf;\r\n\r\nimport java.io.File;\r\nimport java.io.FileInputStream;\r\nimport java.io.IOException;\r\nimport java.util.Scanner;\r\n\r\nimport org.apache.pdfbox.cos.COSDocument;\r\nimport org.apache.pdfbox.pdfparser.PDFParser;\r\nimport org.apache.pdfbox.pdmodel.PDDocument;\r\nimport org.apache.pdfbox.util.PDFTextStripper;\r\n\r\npublic class LeituraPDFBox {\r\n\r\n\tpublic static void main(String[] args) {\r\n\t\tSystem.out.println(\"Inicio\");\r\n\t\tPDFTextStripper pdfStripper = null;\r\n\t\tPDDocument pdDoc = null;\r\n\t\tCOSDocument cosDoc = null;\r\n\t\tFile file = new File(\"MeuArquivo.pdf\");\r\n\t\ttry {\r\n\t\t\tPDFParser parser = new PDFParser(new FileInputStream(file));\r\n\t\t\tparser.parse();\r\n\t\t\tcosDoc = parser.getDocument();\r\n\t\t\tpdfStripper = new PDFTextStripper();\r\n\t\t\tpdDoc = new PDDocument(cosDoc);\r\n\t\t\t\/\/ Start reading at the given page; in this example, page 1\r\n\t\t\tpdfStripper.setStartPage(1);\r\n\t\t\t\/\/ Keep the stripper's current end page (by default, the whole document)\r\n\t\t\tpdfStripper.setEndPage(pdfStripper.getEndPage());\r\n\t\t\tString parsedText = pdfStripper.getText(pdDoc);\r\n\r\n\t\t\t\/\/ Print the extracted text line by line\r\n\t\t\tScanner s = new Scanner(parsedText);\r\n\t\t\ts.useDelimiter(\"\\n\");\r\n\r\n\t\t\tString linha = \"\";\r\n\t\t\twhile (s.hasNext()) {\r\n\t\t\t\tlinha = s.next();\r\n\t\t\t\tSystem.out.println(linha);\r\n\t\t\t}\r\n\t\t\ts.close();\r\n\t\t} catch (IOException e) {\r\n\t\t\te.printStackTrace();\r\n\t\t} finally {\r\n\t\t\t\/\/ Release the document even if reading failed\r\n\t\t\tif (pdDoc != null) {\r\n\t\t\t\ttry {\r\n\t\t\t\t\tpdDoc.close();\r\n\t\t\t\t} catch (IOException e) {\r\n\t\t\t\t\te.printStackTrace();\r\n\t\t\t\t}\r\n\t\t\t}\r\n\t\t}\r\n\t\tSystem.out.println(\"Fim\");\r\n\t}\r\n}\r\n<\/pre>\n
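The loop above splits the extracted text with a `Scanner` whose delimiter is a bare line feed. If the text happens to contain Windows-style `\r\n` line endings, each token would keep a trailing carriage return. As a minimal sketch, using only the standard library, the example below splits on the regex `\R` (any line break) instead. The names `LineSplitSketch` and `splitLines` are hypothetical and not part of PDFBox, and the sample string is only a stand-in for the text returned by `getText`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of PDFBox.
class LineSplitSketch {

    // Splits text into lines. "\\R" matches \n, \r\n and \r as a single
    // line break, so no stray '\r' is left at the end of a line.
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\\R")) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the extracted text, with mixed line endings (assumption).
        String parsedText = "first line\r\nsecond line\nthird line";
        for (String line : splitLines(parsedText)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

The brackets in the printed output make it easy to see whether any invisible character is still attached to a line.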

2. Creating a file from the PDF contents<\/H2>
\n Now let's improve our code by generating a TXT file from the PDF that was read.<\/p>\n
\r\npackage br.com.feltex.lerpdf;\r\n\r\nimport java.io.File;\r\nimport java.io.FileInputStream;\r\nimport java.io.IOException;\r\nimport java.io.PrintWriter;\r\nimport java.util.Scanner;\r\n\r\nimport org.apache.pdfbox.cos.COSDocument;\r\nimport org.apache.pdfbox.pdfparser.PDFParser;\r\nimport org.apache.pdfbox.pdmodel.PDDocument;\r\nimport org.apache.pdfbox.util.PDFTextStripper;\r\n\r\npublic class LeituraJava2 {\r\n\r\n\tpublic static void main(String[] args) {\r\n\t\tSystem.out.println(\"Inicio\");\r\n\t\tPDFTextStripper pdfStripper = null;\r\n\t\tPDDocument pdDoc = null;\r\n\t\tCOSDocument cosDoc = null;\r\n\t\tFile file = new File(\r\n\t\t\t\t\"MeuArquivo.pdf\");\r\n\t\ttry {\r\n\t\t\tPDFParser parser = new PDFParser(new FileInputStream(file));\r\n\t\t\tparser.parse();\r\n\t\t\tcosDoc = parser.getDocument();\r\n\t\t\tpdfStripper = new PDFTextStripper();\r\n\t\t\tpdDoc = new PDDocument(cosDoc);\r\n\t\t\tpdfStripper.setStartPage(1);\r\n\t\t\tpdfStripper.setEndPage(pdfStripper.getEndPage());\r\n\t\t\tString parsedText = pdfStripper.getText(pdDoc);\r\n\r\n\t\t\t\/\/ Output file that receives the extracted text\r\n\t\t\tPrintWriter saida = new PrintWriter(\r\n\t\t\t\t\tnew File(\"D:\/Temp\/saidapdf.txt\"));\r\n\r\n\t\t\t\/\/ Accept both LF and CRLF line endings so no stray carriage return is written\r\n\t\t\tScanner s = new Scanner(parsedText);\r\n\t\t\ts.useDelimiter(\"\\r?\\n\");\r\n\r\n\t\t\tString linha = \"\";\r\n\t\t\twhile (s.hasNext()) {\r\n\t\t\t\tlinha = s.next();\r\n\t\t\t\t\/\/ println restores the line break that the Scanner delimiter removed\r\n\t\t\t\tsaida.println(linha);\r\n\t\t\t}\r\n\t\t\tsaida.close();\r\n\t\t\ts.close();\r\n\t\t} catch (IOException e) {\r\n\t\t\te.printStackTrace();\r\n\t\t} finally {\r\n\t\t\t\/\/ Release the document even if reading or writing failed\r\n\t\t\tif (pdDoc != null) {\r\n\t\t\t\ttry {\r\n\t\t\t\t\tpdDoc.close();\r\n\t\t\t\t} catch (IOException e) {\r\n\t\t\t\t\te.printStackTrace();\r\n\t\t\t\t}\r\n\t\t\t}\r\n\t\t}\r\n\t\tSystem.out.println(\"Fim\");\r\n\t}\r\n}\r\n<\/pre>\n
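The file-writing step above can also be tried in isolation, without PDFBox. The following is a minimal sketch under stated assumptions: `TextFileSketch` and `writeLines` are hypothetical names, the input string stands in for the extracted PDF text, and a temporary file replaces the fixed `D:/Temp` path. It uses try-with-resources so the writer is closed even if an exception is thrown, and writes one line per `println` call.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; not part of PDFBox.
class TextFileSketch {

    // Writes each line of the given text to the target file, one per line.
    static void writeLines(String text, Path target) throws IOException {
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            // "\\R" matches any line break, so CRLF input leaves no stray '\r'.
            for (String line : text.split("\\R")) {
                out.println(line);
            }
        } // the writer is flushed and closed here automatically
    }

    public static void main(String[] args) throws IOException {
        // A temporary file is used instead of a fixed path (assumption).
        Path target = Files.createTempFile("saidapdf", ".txt");
        writeLines("first line\r\nsecond line", target);
        List<String> written = Files.readAllLines(target, StandardCharsets.UTF_8);
        System.out.println(written);
    }
}
```

Passing the target as a `Path` keeps the sketch independent of any particular drive or directory layout.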

3. Conclusion<\/H2>
\n Using PDFBox, we read a PDF file and created a new “txt” file with the content found in the source file. This framework can also generate PDF files and manipulate images, among many other PDF operations.
\n In the Related links section, visit the project's official page and see the many examples available.<\/p>\n

Best regards and happy studying. Other than that, life goes on!!<\/p>\n

Related links


\nOfficial PDFBox site<\/a>
\n
Creating PDF files in Java with iText<\/a>
\n
Complete Java Projects \u2013 learn by doing<\/a><\/p>\n

<\/H2><\/p>\n","protected":false},"excerpt":{"rendered":"

Quick Tip: Reading PDFs with PDFBox Hello friends, Today's tip is PDFBox, another Apache project. As the name suggests, this framework helps us manipulate files in PDF format. To do so, we will show a …<\/p>\n

Quick Tip: Reading PDFs with PDFBox<\/span> Read More »<\/a><\/p>\n

<\/p>\n","protected":false},"author":1,"featured_media":1724,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0},"categories":[1],"tags":[],"yoast_head":"\nDica R\u00e1pida: Leitura de PDF com PDFBox<\/title>\n<meta name=\"description\" content=\"A dica de hoje \u00e9 o PDFBOX mais um projeto Apache. Ele como o nome j\u00e1 sugere \u00e9 nos ajuda a manipular arquivos no formado PDF.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/\" \/>\n<meta property=\"og:locale\" content=\"pt_BR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Dica R\u00e1pida: Leitura de PDF com PDFBox\" \/>\n<meta property=\"og:description\" content=\"A dica de hoje \u00e9 o PDFBOX mais um projeto Apache. Ele como o nome j\u00e1 sugere \u00e9 nos ajuda a manipular arquivos no formado PDF.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/\" \/>\n<meta property=\"og:site_name\" content=\"Aprenda Java\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/feltex.br\" \/>\n<meta property=\"article:published_time\" content=\"2014-12-25T22:13:28+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2015-02-12T17:47:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.feltex.com.br\/felix\/wp-content\/uploads\/2014\/12\/PDFBox2.png\" \/>\n\t<meta property=\"og:image:width\" content=\"512\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:label1\" content=\"Escrito por\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andr\u00e9 F\u00e9lix\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
tempo de leitura\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.feltex.com.br\/felix\/#website\",\"url\":\"https:\/\/www.feltex.com.br\/felix\/\",\"name\":\"Aprenda Java\",\"description\":\"Cursos de java, SQL e Engenharia de Software\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.feltex.com.br\/felix\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"pt-BR\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#primaryimage\",\"inLanguage\":\"pt-BR\",\"url\":\"https:\/\/www.feltex.com.br\/felix\/wp-content\/uploads\/2014\/12\/PDFBox2.png\",\"contentUrl\":\"https:\/\/www.feltex.com.br\/felix\/wp-content\/uploads\/2014\/12\/PDFBox2.png\",\"width\":512,\"height\":512},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#webpage\",\"url\":\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/\",\"name\":\"Dica R\\u00e1pida: Leitura de PDF com PDFBox\",\"isPartOf\":{\"@id\":\"https:\/\/www.feltex.com.br\/felix\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#primaryimage\"},\"datePublished\":\"2014-12-25T22:13:28+00:00\",\"dateModified\":\"2015-02-12T17:47:08+00:00\",\"author\":{\"@id\":\"https:\/\/www.feltex.com.br\/felix\/#\/schema\/person\/1e49f842c6254b4561b66ccf573c2069\"},\"description\":\"A dica de hoje \\u00e9 o PDFBOX mais um projeto Apache. 
Ele como o nome j\\u00e1 sugere \\u00e9 nos ajuda a manipular arquivos no formado PDF.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#breadcrumb\"},\"inLanguage\":\"pt-BR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Dica R\\u00e1pida: Leitura de PDF com PDFBox\"}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.feltex.com.br\/felix\/#\/schema\/person\/1e49f842c6254b4561b66ccf573c2069\",\"name\":\"Andr\\u00e9 F\\u00e9lix\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/www.feltex.com.br\/felix\/#personlogo\",\"inLanguage\":\"pt-BR\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/d2d9cc82cab40245e6f803982b1448e6?s=96&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/d2d9cc82cab40245e6f803982b1448e6?s=96&r=g\",\"caption\":\"Andr\\u00e9 F\\u00e9lix\"},\"sameAs\":[\"http:\/\/www.feltex.com.br\"],\"url\":\"https:\/\/www.feltex.com.br\/felix\/author\/andre.felix\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Dica R\u00e1pida: Leitura de PDF com PDFBox","description":"A dica de hoje \u00e9 o PDFBOX mais um projeto Apache. Ele como o nome j\u00e1 sugere \u00e9 nos ajuda a manipular arquivos no formado PDF.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/","og_locale":"pt_BR","og_type":"article","og_title":"Dica R\u00e1pida: Leitura de PDF com PDFBox","og_description":"A dica de hoje \u00e9 o PDFBOX mais um projeto Apache. 
Ele como o nome j\u00e1 sugere \u00e9 nos ajuda a manipular arquivos no formado PDF.","og_url":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/","og_site_name":"Aprenda Java","article_publisher":"https:\/\/www.facebook.com\/feltex.br","article_published_time":"2014-12-25T22:13:28+00:00","article_modified_time":"2015-02-12T17:47:08+00:00","og_image":[{"width":512,"height":512,"url":"https:\/\/www.feltex.com.br\/felix\/wp-content\/uploads\/2014\/12\/PDFBox2.png","type":"image\/png"}],"twitter_misc":{"Escrito por":"Andr\u00e9 F\u00e9lix","Est. tempo de leitura":"3 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebSite","@id":"https:\/\/www.feltex.com.br\/felix\/#website","url":"https:\/\/www.feltex.com.br\/felix\/","name":"Aprenda Java","description":"Cursos de java, SQL e Engenharia de Software","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.feltex.com.br\/felix\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"pt-BR"},{"@type":"ImageObject","@id":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#primaryimage","inLanguage":"pt-BR","url":"https:\/\/www.feltex.com.br\/felix\/wp-content\/uploads\/2014\/12\/PDFBox2.png","contentUrl":"https:\/\/www.feltex.com.br\/felix\/wp-content\/uploads\/2014\/12\/PDFBox2.png","width":512,"height":512},{"@type":"WebPage","@id":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#webpage","url":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/","name":"Dica R\u00e1pida: Leitura de PDF com 
PDFBox","isPartOf":{"@id":"https:\/\/www.feltex.com.br\/felix\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#primaryimage"},"datePublished":"2014-12-25T22:13:28+00:00","dateModified":"2015-02-12T17:47:08+00:00","author":{"@id":"https:\/\/www.feltex.com.br\/felix\/#\/schema\/person\/1e49f842c6254b4561b66ccf573c2069"},"description":"A dica de hoje \u00e9 o PDFBOX mais um projeto Apache. Ele como o nome j\u00e1 sugere \u00e9 nos ajuda a manipular arquivos no formado PDF.","breadcrumb":{"@id":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#breadcrumb"},"inLanguage":"pt-BR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.feltex.com.br\/felix\/leitura-de-pdf-com-pdfbox\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Dica R\u00e1pida: Leitura de PDF com PDFBox"}]},{"@type":"Person","@id":"https:\/\/www.feltex.com.br\/felix\/#\/schema\/person\/1e49f842c6254b4561b66ccf573c2069","name":"Andr\u00e9 F\u00e9lix","image":{"@type":"ImageObject","@id":"https:\/\/www.feltex.com.br\/felix\/#personlogo","inLanguage":"pt-BR","url":"https:\/\/secure.gravatar.com\/avatar\/d2d9cc82cab40245e6f803982b1448e6?s=96&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d2d9cc82cab40245e6f803982b1448e6?s=96&r=g","caption":"Andr\u00e9 
F\u00e9lix"},"sameAs":["http:\/\/www.feltex.com.br"],"url":"https:\/\/www.feltex.com.br\/felix\/author\/andre.felix\/"}]}},"_links":{"self":[{"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/posts\/1667"}],"collection":[{"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/comments?post=1667"}],"version-history":[{"count":11,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/posts\/1667\/revisions"}],"predecessor-version":[{"id":1889,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/posts\/1667\/revisions\/1889"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/media\/1724"}],"wp:attachment":[{"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/media?parent=1667"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/categories?post=1667"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.feltex.com.br\/felix\/wp-json\/wp\/v2\/tags?post=1667"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}