Monday, December 13, 2010

PDF Text Extractor Using Java

We can extract text from a PDF file using iText 2.1.6:

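    import java.io.FileOutputStream;
    import java.io.OutputStreamWriter;
    import com.lowagie.text.pdf.PdfReader;
    import com.lowagie.text.pdf.parser.PdfTextExtractor;

    PdfReader readerN = new PdfReader("pdfFilename");
    OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream("textFileName"), "Cp1252");
    try {
        PdfTextExtractor parse = new PdfTextExtractor(readerN);
        int nrPages = readerN.getNumberOfPages();
        // Pages are numbered from 1. This range skips page 1 and the last three pages;
        // use i = 1; i <= nrPages to extract every page.
        for (int i = 2; i < nrPages - 2; i++) {
            String page = parse.getTextFromPage(i);
            if (page != null) {
                // Replace the literal text "null" with a "??" placeholder.
                page = page.replace("null", "??");
                out.write(page);
            }
        }
    } finally {
        out.close();      // flush and release the output file
        readerN.close();
    }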
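The filtering and writing step inside the loop can be tried out without a PDF or iText on the classpath. The sketch below is only an illustration: `PageTextWriter` and `writePages` are hypothetical names that are not part of iText, and the page texts are passed in as plain strings instead of coming from `getTextFromPage`.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class PageTextWriter {

    /**
     * Hypothetical helper (not an iText API): writes every non-null page
     * to the given writer, replacing the literal text "null" with "??".
     * Returns the number of pages actually written.
     */
    public static int writePages(List<String> pages, Writer out) throws IOException {
        int written = 0;
        for (String page : pages) {
            if (page == null) {
                continue; // skip pages for which no text was extracted
            }
            out.write(page.replace("null", "??"));
            written++;
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in page texts; in the real program these would come from
        // PdfTextExtractor.getTextFromPage(i).
        List<String> pages = Arrays.asList("first null page\n", null, "last page\n");
        StringWriter out = new StringWriter();
        int count = writePages(pages, out);
        System.out.println("pages written: " + count);
        System.out.print(out);
    }
}
```

To write to a file instead, pass an `OutputStreamWriter` wrapping a `FileOutputStream` with the "Cp1252" encoding, as in the snippet above.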

