December 16, 2009

Converting HTML to PDF in Java


The iText library is a Java API for creating PDF files. The following example shows how a developer can use iText to convert HTML to PDF.

Add the iText Maven dependency:
    <dependency>
      <groupId>com.lowagie</groupId>
      <artifactId>itext</artifactId>
      <version>2.1.7</version>
    </dependency>
Example Java code for creating the PDF:
package org.developers.blog.html2pdf;

import com.lowagie.text.Document;
import com.lowagie.text.Element;
import com.lowagie.text.html.simpleparser.HTMLWorker;
import com.lowagie.text.html.simpleparser.StyleSheet;
import com.lowagie.text.pdf.PdfWriter;
import com.lowagie.text.pdf.codec.Base64;
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.ArrayList;

public class Html2Pdf {

    public static void main(String[] args) throws Exception {
        Document pdfDocument = new Document();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // The writer streams everything added to the document into baos.
        PdfWriter.getInstance(pdfDocument, baos);
        pdfDocument.open();

        StyleSheet styles = new StyleSheet();
        styles.loadTagStyle("body", "font", "Bitstream Vera Sans");

        // Parse the HTML file into a list of iText elements and add them to the
        // document. The reader uses the platform default charset and is closed
        // automatically when the try block ends (requires Java 7 or later).
        try (Reader htmlreader = new BufferedReader(new InputStreamReader(
                new FileInputStream("/home/rafsob/OpenSourceProjekte/simple-monitor-plugin/target/site/index.html")))) {
            ArrayList arrayElementList = HTMLWorker.parseToList(htmlreader, styles);
            for (int i = 0; i < arrayElementList.size(); ++i) {
                Element e = (Element) arrayElementList.get(i);
                pdfDocument.add(e);
            }
        }
        // Closing the document finishes the PDF; only then is baos complete.
        pdfDocument.close();

        byte[] bs = baos.toByteArray();
        String pdfBase64 = Base64.encodeBytes(bs); // output as a Base64 string

        // Output as a file.
        File pdfFile = new File("/tmp/pdfExample.pdf");
        try (FileOutputStream out = new FileOutputStream(pdfFile)) {
            out.write(bs);
        }
    }
}
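
The example produces the PDF both as a byte array and as a Base64 string. A quick sanity check on that output can be sketched as follows, using only the JDK (Java 8 or later). The class and method names here (PdfOutputCheck, looksLikePdf, toBase64, fromBase64) are made up for illustration and are not part of iText, and java.util.Base64 is used as a stand-in for iText's own Base64 class. The check relies only on the fact that a PDF file starts with the marker "%PDF-".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class PdfOutputCheck {

    // A PDF file starts with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the byte array begins with the PDF header marker. */
    public static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, PDF_MAGIC.length), PDF_MAGIC);
    }

    /** Encodes bytes with the JDK encoder (hypothetical stand-in for iText's Base64 class). */
    public static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Decodes a Base64 string back into the original bytes. */
    public static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in bytes; in the example above this would be baos.toByteArray().
        byte[] bs = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        String pdfBase64 = toBase64(bs);
        System.out.println("bytes look like a PDF: " + looksLikePdf(bs));
        System.out.println("round trip intact: " + Arrays.equals(fromBase64(pdfBase64), bs));
    }
}
```

Calling looksLikePdf(bs) right before writing the file is a cheap way to notice that the document was never closed or that nothing was written to the stream.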

Regards
Rafael Sobek

Posted by rafael.sobek at 8:16 AM in Java

 


Comment: Noman at Wed, 9 Jun 8:36 AM

Hello,
thanks for your post about HTML to PDF conversion.

Comment: Raghu at Fri, 17 Sep 5:36 AM

I was just googling and found this useful link. I would like to add images and styles to the HTML string. I have a Java program that generates an HTML string. Does the iText API allow reading the images and styles specified in the HTML string? I would appreciate it if you could provide some sample code.

Comment: Bora at Fri, 18 Mar 5:49 PM

Hi,
I tried this example on my computer and it worked. After that I wrote it as a Java stored procedure. But when I compiled it, it gave me an error at the line "Document pdfDocument = new Document();", without saying anything about the problem. When I remove this line and the other lines that use "pdfDocument", it works.

Has anyone got any idea about that?

Thanks.

Comment: John at Tue, 19 Apr 2:23 PM

Hi,
I tried this code and it works. However, if the HTML page is complex (header, footer, mixed tables and lots of images), the resulting PDF is badly formatted and totally unacceptable. If you could say a few words about this kind of issue, it would be a huge help for me.

thanks..

Comment: John at Tue, 19 Apr 2:35 PM

It's working fine, but if the HTML page is complex, with a style sheet and so on, the formatting gets worse.
Can anybody address this issue?
I'm waiting for a reply.
Thanks.

Comment: JasmineSH at Thu, 28 Apr 9:13 AM

Thanks !!!!

This post helped me a lot.

Comment: asd at Tue, 24 May 1:03 PM

This program converts HTML to PDF, but it does not work properly when the HTML contains images.

Comment: Nicolas at Mon, 6 Jun 11:55 PM

Hello, I want to use iText with HTML image tags, but I have had problems. Have you used iText with images?
Thanks!

Comment: Kalimaha at Tue, 14 Jun 9:48 AM

Thanks for this tutorial, it's very useful! But, I've tried to convert an HTML containing an tag, and I don't see it in the final PDF, do I need to add something to the code?

Thanks a lot!

Comment: pavan at Tue, 2 Aug 1:15 PM

How do I convert an MS Word doc file to PDF in Java?
Do you have any code for that?

Comment: Mubasshar Ahmad at Fri, 12 Aug 11:35 AM

Great. Thanks a lot :)

Comment: edwini at Tue, 4 Oct 5:46 PM

hi, how can iText do this?

http://pdfmyurl.com/

Comment: tazman at Wed, 12 Oct 6:13 AM

See wkhtmltopdf -- it does a very good job of preserving styles, formatting, images, links. Tested on MacOSX. Also available for Linux and Windows.

Comment: vinay at Wed, 23 Nov 10:59 AM

This is good, but wkhtmltopdf, the tool mentioned by tazman, is really awesome. It can be used for automated command-line PDF generation.
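
For readers who want to try the command-line route mentioned in the last two comments from Java, here is a minimal sketch using only the JDK. It assumes that the wkhtmltopdf binary is installed and on the PATH, and it uses only the basic invocation form "wkhtmltopdf <input> <output>". The class and method names (WkhtmltopdfRunner, buildCommand, convert) are hypothetical, not part of any library.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class WkhtmltopdfRunner {

    /** Builds the command line: wkhtmltopdf <input> <output>. */
    public static List<String> buildCommand(String input, String output) {
        return Arrays.asList("wkhtmltopdf", input, output);
    }

    /**
     * Runs wkhtmltopdf and returns its exit code.
     * Assumption: the wkhtmltopdf binary is installed and on the PATH.
     */
    public static int convert(String input, String output)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, output))
                .inheritIO() // show the tool's messages on this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: WkhtmltopdfRunner <input.html or URL> <output.pdf>");
            return;
        }
        int exitCode = convert(args[0], args[1]);
        System.out.println("wkhtmltopdf exited with code " + exitCode);
    }
}
```

A non-zero exit code usually means the conversion failed, so a caller should check the return value of convert before using the output file.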
