In the earlier article “Generate PDF using Java from scratch without any library”, we generated a very basic, minimal PDF using pure Java without any library. In this article we will improve that code further.
PDF improvements in this article:
- Add more pages to the PDF.
- Add more text to a page.
- Add graphics like rectangles, lines and curves. We will also use colors to fill the graphics.
As we are going to create a bigger PDF now, it's better to improve our code design so that the PDF is easier to build.
Design & code improvements
- Make the code more object oriented so that we can build bigger PDFs easily.
- Automate PDF object numbering so that we can keep adding more PDF objects easily.
- Convert PDF keywords to readable constants so that the code is easy to follow.
The diagram below shows the design of the classes for PDF objects that we will follow in this article, along with their hierarchy and their inter-linking with each other.
As per the PDF specification, this is the general structure of a PDF file.
[PDF version header]
[Catalog object]  --- <Links to Pages Object>
[Pages object]    --- <Links to all Page objects>
[Page object]     --- <Links to stream object>
[Stream object]   --- <Contains text, graphics or other streams>
.
.
.
[PDF Object n]
[Cross Reference]
[Trailer]
[End of file]
The Java objects in this article will be linked in the same way, mirroring the PDF structure above.
Abstract PDF Object
This is the abstract class which all PDF objects will extend. Each PDF object needs an object number and a generation number, so we keep them separately in the PDFObjectReference class as shown below. We won't set these numbers manually; they will be populated programmatically. There is also a method to add custom attributes, which can be a String or a reference to another PDF object.
import java.util.HashMap;
import java.util.Map;

/**
 * Abstract representation of PDF objects. All objects in PDF must extend this.
 */
abstract class PDFObject {

    private PDFObjectReference reference = new PDFObjectReference();

    private Map<String, Object> attributes = new HashMap<>();

    public PDFObject(String type) {
        // A null type is allowed; it is skipped when the object is built.
        this.attributes.put("Type", type);
    }

    public void addAttribute(String key, Object value) {
        this.attributes.put(key, value);
    }

    public abstract void addSpecificAttributes();

    public void setObjectNumber(int objectNumber) {
        this.reference.setObjectNumber(objectNumber);
    }

    PDFObjectReference getReference() {
        return reference;
    }
}

/**
 * Representation of a reference to any PDF object.
 */
class PDFObjectReference {

    private int objectNumber;

    private int generation = 0; // Hardcoded, as it always stays the same here

    int getObjectNumber() {
        return objectNumber;
    }

    int getGeneration() {
        return generation;
    }

    void setObjectNumber(int objectNumber) {
        this.objectNumber = objectNumber;
    }
}
Based on what we learned in the earlier article, we have added build methods to this abstract class. Each attribute value is appended differently depending on its type.
// The following methods are part of the PDFObject class.

public String build() {
    addSpecificAttributes();
    StringBuilder pdfObject = new StringBuilder();
    pdfObject.append(reference.getObjectNumber()).append(" ").append(reference.getGeneration()).append(" obj\n ")
            .append(buildObject()).append("\nendobj\n\n");
    return pdfObject.toString();
}

public StringBuilder buildObject() {
    StringBuilder pdfObject = new StringBuilder();
    pdfObject.append("<< \n");

    for (Map.Entry<String, Object> entry : attributes.entrySet()) {
        String key = entry.getKey();
        Object value = entry.getValue();
        if (value instanceof String) {
            // Strings containing "[" are arrays and are written as-is; other strings become names.
            pdfObject.append("\n /").append(key).append(" ").append(((String) value).contains("[") ? "" : "/")
                    .append(value);
        } else if (value instanceof Integer) {
            pdfObject.append("\n /").append(key).append(" ").append(value);
        } else if (value instanceof PDFObject) {
            // Nested dictionary, written inline
            pdfObject.append("\n /").append(key).append(" \n").append(((PDFObject) value).buildObject());
        } else if (value instanceof PDFObjectReference[]) {
            // Array of indirect references, e.g. [3 0 R 5 0 R ]
            pdfObject.append("\n /").append(key).append(" [");
            for (PDFObjectReference ref : (PDFObjectReference[]) value) {
                pdfObject.append(ref.getObjectNumber()).append(" ").append(ref.getGeneration()).append(" R ");
            }
            pdfObject.append("]");
        } else if (value instanceof PDFObjectReference) {
            // Single indirect reference, e.g. 2 0 R
            PDFObjectReference ref = (PDFObjectReference) value;
            pdfObject.append("\n /").append(key).append(" ")
                    .append(ref.getObjectNumber()).append(" ").append(ref.getGeneration()).append(" R ");
        }
    }
    pdfObject.append(" >>");

    return pdfObject;
}
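To make the type-dependent serialization easier to see in isolation, here is a minimal standalone sketch. `DictionarySketch` and its `render` method are hypothetical names that are not part of the article's classes, and an `int[] {objectNumber, generation}` pair stands in for `PDFObjectReference`. It uses a `LinkedHashMap` because a plain `HashMap` does not guarantee iteration order, so keys come out in insertion order here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the article's classes. */
final class DictionarySketch {

    /**
     * Renders a flat dictionary: String values become names, Integer values
     * become numbers, and int[] {objectNumber, generation} becomes "n g R".
     */
    static String render(Map<String, Object> attributes) {
        StringBuilder sb = new StringBuilder("<<");
        for (Map.Entry<String, Object> entry : attributes.entrySet()) {
            sb.append(" /").append(entry.getKey()).append(' ');
            Object value = entry.getValue();
            if (value instanceof String) {
                sb.append('/').append(value);
            } else if (value instanceof Integer) {
                sb.append(value);
            } else if (value instanceof int[]) {
                int[] ref = (int[]) value;
                sb.append(ref[0]).append(' ').append(ref[1]).append(" R");
            }
        }
        return sb.append(" >>").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> catalog = new LinkedHashMap<>();
        catalog.put("Type", "Catalog");
        catalog.put("Pages", new int[] {2, 0});
        System.out.println(render(catalog));
    }
}
```

With insertion order preserved, `/Type` is always written first, which makes the generated file easier to compare between runs.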
Catalog Object
This is the first object in the PDF. It links to the pages object.
/**
 * Representation of catalog object
 */
class CatalogObject extends PDFObject {

    private PageCollectionObject pages;

    public CatalogObject(PageCollectionObject pageCollectionObject) {
        super("Catalog");
        this.pages = pageCollectionObject;
    }

    @Override
    public void addSpecificAttributes() {
        addAttribute("Pages", pages.getReference());
    }

    PageCollectionObject getPages() {
        return pages;
    }
}
Pages & Page objects
These objects represent the list of pages and an individual page. A page object can link to a content stream of a given type, such as text or graphics.
import java.util.ArrayList;
import java.util.List;

/**
 * Representation of pages object
 */
class PageCollectionObject extends PDFObject {

    private List<PageObject> pages = new ArrayList<>();

    public PageCollectionObject() {
        super("Pages");
    }

    public void addPages(PageObject... pageObjects) {
        for (PageObject pageObject : pageObjects) {
            addPage(pageObject);
        }
    }

    public void addPage(PageObject pageObject) {
        this.pages.add(pageObject);
        // Every page points back to this pages object as its parent
        pageObject.addAttribute("Parent", getReference());
    }

    @Override
    public void addSpecificAttributes() {
        addAttribute("Count", Integer.valueOf(pages.size()));
        PDFObjectReference[] refArr = new PDFObjectReference[pages.size()];
        for (int i = 0; i < pages.size(); i++) {
            refArr[i] = pages.get(i).getReference();
        }
        addAttribute("Kids", refArr);
    }

    List<PageObject> getPages() {
        return pages;
    }
}

/**
 * Representation of page object.
 */
class PageObject extends PDFObject {

    private StreamObject content;

    public PageObject() {
        super("Page");
    }

    public void addContent(StreamObject streamObject) {
        content = streamObject;
    }

    @Override
    public void addSpecificAttributes() {
        // Only link a stream when one is set; this avoids a NullPointerException for empty pages
        if (content != null) {
            addAttribute("Contents", content.getReference());
        }
    }

    StreamObject getContent() {
        return content;
    }
}
Font Object
/**
 * Representation of font object
 */
class FontObject extends PDFObject {

    public FontObject(String fontAliasName, String fontName) {
        super(null);

        // Innermost dictionary: the font definition itself
        PDFObject fontDef = new PDFObject("Font") {
            @Override
            public void addSpecificAttributes() {
                addAttribute("Subtype", "Type1");
                addAttribute("BaseFont", fontName);
            }
        };
        fontDef.addSpecificAttributes();

        // Maps the alias (for example F1) to the font definition
        PDFObject fontAlias = new PDFObject(null) {
            @Override
            public void addSpecificAttributes() {
                addAttribute(fontAliasName, fontDef);
            }
        };
        fontAlias.addSpecificAttributes();

        addAttribute("Font", fontAlias);
    }

    @Override
    public void addSpecificAttributes() {
        // Nothing to add; all attributes are set in the constructor
    }
}
Abstract stream object
This is the abstract representation of a stream object, which all stream objects will inherit.
/**
 * Abstract representation of stream object
 */
abstract class StreamObject extends PDFObject {

    public StreamObject() {
        super(null);
    }

    public abstract String buildStream();

    @Override
    public void addSpecificAttributes() {
        // Placeholder value: a strict PDF should declare the actual byte length of the stream data
        addAttribute("Length", Integer.valueOf(100));
    }

    @Override
    public StringBuilder buildObject() {
        StringBuilder sb = super.buildObject();
        sb.append("\nstream").append(buildStream()).append("\nendstream");
        return sb;
    }
}
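The `/Length` value above is a fixed placeholder. The PDF specification defines `/Length` as the number of bytes between the end-of-line after the `stream` keyword and the `endstream` keyword, not counting the end-of-line marker placed just before `endstream`. The sketch below shows one way to compute it. `StreamLengthSketch` and `wrap` are hypothetical names, and the code assumes the content only uses Latin-1 characters and that the file is written with a single-byte encoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of the article's classes. */
final class StreamLengthSketch {

    /**
     * Wraps stream content in a dictionary plus stream/endstream keywords,
     * declaring the real byte length instead of a fixed value.
     */
    static String wrap(String content) {
        // Assumption: content is ASCII/Latin-1, so one character maps to one byte
        int length = content.getBytes(StandardCharsets.ISO_8859_1).length;
        return "<< /Length " + length + " >>\n"
                + "stream\n"        // the data starts right after this line break
                + content
                + "\nendstream";    // this line break is not counted in /Length
    }

    public static void main(String[] args) {
        System.out.println(wrap("BT /F1 18 Tf 30 100 Td (Hello World) Tj ET"));
    }
}
```

Applying the same idea to `StreamObject` would mean building the stream text once, measuring it, and only then writing the dictionary.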
Text stream object
A text stream uses several operators from the PDF specification. We have extracted them into String constants with readable names. This object also produces the stream content in the format the specification requires.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Representation of text stream object
 */
class TextStreamObject extends StreamObject {

    private static final String BEGIN_TEXT = "BT";
    private static final String END_TEXT = "ET";
    private static final String TEXT_FONT = "Tf";
    private static final String TEXT_OFFSET = "Td";
    private static final String SHOW_TEXT = "Tj";

    private List<String> texts = new ArrayList<>();

    public TextStreamObject(String fontAlias, int fontSize, int xPos, int yPos, String text) {
        add(fontAlias, fontSize, xPos, yPos, text);
    }

    public void add(String fontAlias, int fontSize, int xPos, int yPos, String text) {
        // Each text entry is its own BT ... ET block with font, position and content
        this.texts.add(" \n " + BEGIN_TEXT + " \n /" + fontAlias + " " + fontSize + " " + TEXT_FONT + " \n " + xPos + " "
                + yPos + " " + TEXT_OFFSET + "\n (" + text + ") " + SHOW_TEXT + "\n" + END_TEXT + "\n");
    }

    @Override
    public String buildStream() {
        return texts.stream().collect(Collectors.joining());
    }
}
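One detail the `add` method does not handle: text is written as a PDF literal string between `(` and `)`, so a backslash or an unbalanced parenthesis inside the text would corrupt the stream. A small escaping step, sketched below, avoids that. `PdfTextEscaper` and `escape` are hypothetical names introduced for this example. The specification allows balanced parentheses to stay unescaped, but escaping all of them is simpler and still valid.

```java
/** Hypothetical helper for illustration only; not part of the article's classes. */
final class PdfTextEscaper {

    /** Prefixes backslashes and parentheses with a backslash for use in a PDF literal string. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '(' || c == ')') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shows how a text operand could be wrapped before the Tj operator
        System.out.println("(" + escape("Price (approx.)") + ") Tj");
    }
}
```

In `TextStreamObject.add`, the call would wrap the `text` argument before it is placed between the parentheses.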
Graphics stream object
Graphics stream objects are a bit more complicated. They deal with a lot of coordinates to draw graphics. For more details, refer to the PDF specification. We have extracted the PDF operators into String constants to make the code more readable.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Representation of graphics stream object
 */
class GraphicStreamObject extends StreamObject {

    private static final String MOVE_POINTER = "m";
    private static final String LINE = "l";
    private static final String LINE_WIDTH = "w";
    private static final String RECTANGLE = "re";
    private static final String FILL = "f";
    private static final String BEZIER_CURVE = "c";
    // Note: in the PDF specification "RG" sets the stroking (border) color and "rg" sets the
    // non-stroking (fill) color, so these two names are swapped relative to the specification.
    // The values are kept as written so that the program output shown later stays the same.
    private static final String BORDER_COLOR = "rg";
    private static final String FILL_COLOR = "RG";
    private static final String STROKE = "S";
    private static final String CLOSE_FILL_STROKE = "b";

    private List<String> graphics = new ArrayList<>();

    public void addLine(int xFrom, int yFrom, int xTo, int yTo) {
        this.graphics.add(
                "\n " + xFrom + " " + yFrom + " " + MOVE_POINTER + " " + xTo + " " + yTo + " " + LINE + " " + STROKE);
    }

    public void addRectangle(int x, int y, int width, int height) {
        this.graphics.add("\n " + x + " " + y + " " + width + " " + height + " " + RECTANGLE + " " + STROKE);
    }

    public void addFilledRectangle(int x, int y, int width, int height, String color) {
        // color is a complete color operation, for example "0.75 g" for a gray fill
        this.graphics.add("\n" + color);
        this.graphics.add("\n " + x + " " + y + " " + width + " " + height + " " + RECTANGLE + " " + FILL + " " + STROKE);
    }

    /**
     * Draws a curve from (moveX, moveY) to (x3, y3) using (x1, y1) and (x2, y2) as control points.
     */
    public void addBezierCurve(int moveX, int moveY, int x1, int y1, int x2, int y2, int x3, int y3,
            String borderColor, int borderWidth, String fillColor) {
        this.graphics.add("\n" + borderWidth + " " + LINE_WIDTH);
        this.graphics.add("\n" + fillColor + " " + FILL_COLOR);
        this.graphics.add("\n" + borderColor + " " + BORDER_COLOR);
        this.graphics.add("\n" + moveX + " " + moveY + " " + MOVE_POINTER);
        this.graphics.add("\n " + x1 + " " + y1 + " " + x2 + " " + y2 + " " + x3 + " " + y3 + " " + BEZIER_CURVE + " \n "
                + CLOSE_FILL_STROKE);
    }

    @Override
    public String buildStream() {
        return graphics.stream().collect(Collectors.joining());
    }
}
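The operators are easy to mix up, so here is a minimal sketch of a filled and stroked rectangle as the PDF specification describes the operators: `RG` sets the stroking (border) color, `rg` sets the non-stroking (fill) color, `re` takes `x y width height`, and `B` fills and then strokes the path. `RectangleSketch` and `filledRectangle` are hypothetical names, and colors are passed as RGB components in the range 0 to 1.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the article's classes. */
final class RectangleSketch {

    /** Builds the content-stream operations for a rectangle with border and fill colors. */
    static String filledRectangle(int x, int y, int width, int height,
                                  double[] strokeRgb, double[] fillRgb) {
        return rgb(strokeRgb) + " RG\n"                                 // stroking (border) color
                + rgb(fillRgb) + " rg\n"                                // non-stroking (fill) color
                + x + " " + y + " " + width + " " + height + " re\n"    // rectangle path
                + "B";                                                  // fill, then stroke
    }

    /** Formats three color components with a fixed, locale-independent notation. */
    private static String rgb(double[] c) {
        return String.format(Locale.ROOT, "%.2f %.2f %.2f", c[0], c[1], c[2]);
    }

    public static void main(String[] args) {
        System.out.println(filledRectangle(100, 600, 50, 75,
                new double[] {0.0, 0.0, 0.5}, new double[] {0.5, 0.1, 0.2}));
    }
}
```

Using a single `B` operator matters because `f` on its own ends the path, which leaves nothing for a following `S` to stroke.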
PDF object
This is the final, encapsulating PDF object. This class takes care of numbering all the PDF objects so that we don't have to do it manually. As the code below shows, it also adds the PDF version header, the trailer and the end-of-file marker.
/**
 * Representation of entire PDF file.
 */
class PDF {

    private CatalogObject catalogObject;

    private int objectCount = 0;

    public PDF(CatalogObject catalogObject) {
        this.catalogObject = catalogObject;
    }

    public String build() {
        populateObjectNumbers();
        StringBuilder pdf = new StringBuilder();
        pdf.append("%PDF-1.1\n\n");

        pdf.append(catalogObject.build());
        pdf.append(catalogObject.getPages().build());

        for (PageObject page : catalogObject.getPages().getPages()) {
            pdf.append(page.build());
            if (page.getContent() != null) {
                pdf.append(page.getContent().build());
            }
        }

        // Note: no cross-reference table is written here, only the trailer
        pdf.append("trailer\n << /Root " + catalogObject.getReference().getObjectNumber() + " "
                + catalogObject.getReference().getGeneration() + " R" + "\n /Size " + (objectCount + 1) + "\n >>\n"
                + "%%EOF");

        return pdf.toString();
    }

    /**
     * Populate object numbers to avoid manual numbering.
     */
    private void populateObjectNumbers() {
        catalogObject.setObjectNumber(++objectCount);
        catalogObject.getPages().setObjectNumber(++objectCount);

        for (PageObject page : catalogObject.getPages().getPages()) {
            page.setObjectNumber(++objectCount);
            if (page.getContent() != null) {
                page.getContent().setObjectNumber(++objectCount);
            }
        }
    }
}
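The general structure listed earlier includes a cross-reference section, which the `build` method above does not write. The sketch below shows how such a table could be produced by recording the byte offset of each object while the file is assembled. `XrefSketch` and `assemble` are hypothetical names. The code assumes the objects are already serialized, are numbered 1..n in list order with generation 0, and contain only Latin-1 characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the article's classes. */
final class XrefSketch {

    /** Joins serialized objects and appends xref table, trailer and startxref. */
    static String assemble(List<String> objects, int rootObjectNumber) {
        StringBuilder pdf = new StringBuilder("%PDF-1.1\n");
        int[] offsets = new int[objects.size()];
        for (int i = 0; i < objects.size(); i++) {
            offsets[i] = byteLength(pdf);   // byte position where this object starts
            pdf.append(objects.get(i));
        }
        int xrefOffset = byteLength(pdf);
        int size = objects.size() + 1;      // +1 for the mandatory free entry 0

        pdf.append("xref\n0 ").append(size).append('\n');
        pdf.append("0000000000 65535 f \n");   // every entry is exactly 20 bytes long
        for (int offset : offsets) {
            pdf.append(String.format(Locale.ROOT, "%010d 00000 n \n", offset));
        }
        pdf.append("trailer\n<< /Root ").append(rootObjectNumber)
                .append(" 0 R /Size ").append(size).append(" >>\n");
        pdf.append("startxref\n").append(xrefOffset).append("\n%%EOF\n");
        return pdf.toString();
    }

    // Assumption: Latin-1 content, so the byte count equals what is written to disk
    private static int byteLength(CharSequence s) {
        return s.toString().getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        System.out.print(assemble(
                Arrays.asList("1 0 obj\n<< /Type /Catalog >>\nendobj\n"), 1));
    }
}
```

The same bookkeeping could be folded into the `PDF` class by noting the current length of the output before each `build()` result is appended.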
Now let's build the PDF
We will generate a PDF like this:
- Page 1 – Header text, then 2 lines of normal text.
- Page 2 – Rectangle & line
- Page 3 – Curve with colors
import java.io.FileWriter;
import java.io.IOException;

public class PDFWithTextAndGraphics {

    public static void main(String[] args) throws IOException {

        /*
         * Create text stream with a few lines
         */
        TextStreamObject textStreamObject = new TextStreamObject("F1", 18, 30, 100, "Hello World");
        textStreamObject.add("F1", 11, 30, 40, "Hope you all are enjoying Its All Binary articles!");
        textStreamObject.add("F1", 11, 30, 30, "Now let's create PDF with 3 pages, texts & graphics.");

        /*
         * First page with above text stream
         */
        PageObject page1 = new PageObject();
        page1.addAttribute("Resources", new FontObject("F1", "Times-Roman"));
        page1.addContent(textStreamObject);
        page1.addAttribute("MediaBox", "[0 0 300 200]");

        /*
         * Create graphic stream with a few graphics.
         */
        GraphicStreamObject graphicStreamObject = new GraphicStreamObject();
        graphicStreamObject.addFilledRectangle(100, 600, 50, 75, "0.75 g");
        graphicStreamObject.addLine(100, 100, 400, 500);

        /*
         * Second page with above graphics
         */
        PageObject page2 = new PageObject();
        page2.addContent(graphicStreamObject);

        /*
         * Create curve & color graphics.
         */
        GraphicStreamObject graphicCurveStreamObject = new GraphicStreamObject();
        graphicCurveStreamObject.addBezierCurve(300, 300, 300, 400, 400, 400, 400, 300, "0.0 0.0 0.5", 10,
                "0.5 0.1 0.2");

        /*
         * Third page with above curve & color graphics.
         */
        PageObject page3 = new PageObject();
        page3.addContent(graphicCurveStreamObject);

        /*
         * Prepare pages & catalog objects.
         */
        PageCollectionObject pageCollectionObject = new PageCollectionObject();
        pageCollectionObject.addPages(page1, page2, page3);
        CatalogObject catalogObject = new CatalogObject(pageCollectionObject);

        /*
         * Build final PDF.
         */
        PDF pdf = new PDF(catalogObject);

        /*
         * Write PDF to a file. try-with-resources closes the writer even if writing fails.
         */
        try (FileWriter fileWriter = new FileWriter("generatedPDFWithGraphics.pdf")) {
            fileWriter.write(pdf.build());
        }
    }
}
Here is the complete code with all PDF object classes and PDFWithTextAndGraphics.java.
Output:
Here is the text output of this program.
%PDF-1.1

1 0 obj
 <<
 /Pages 2 0 R
 /Type /Catalog >>
endobj

2 0 obj
 <<
 /Type /Pages
 /Count 3
 /Kids [3 0 R 5 0 R 7 0 R ] >>
endobj

3 0 obj
 <<
 /Type /Page
 /Contents 4 0 R
 /Parent 2 0 R
 /Resources
<<
 /Font
<<
 /F1
<<
 /Type /Font
 /BaseFont /Times-Roman
 /Subtype /Type1 >> >> >>
 /MediaBox [0 0 300 200] >>
endobj

4 0 obj
 <<
 /Length 100 >>
stream
 BT
 /F1 18 Tf
 30 100 Td
 (Hello World) Tj
ET

 BT
 /F1 11 Tf
 30 40 Td
 (Hope you all are enjoying Its All Binary articles!) Tj
ET

 BT
 /F1 11 Tf
 30 30 Td
 (Now let's create PDF with 3 pages, texts & graphics.) Tj
ET

endstream
endobj

5 0 obj
 <<
 /Type /Page
 /Contents 6 0 R
 /Parent 2 0 R >>
endobj

6 0 obj
 <<
 /Length 100 >>
stream
0.75 g
 100 600 50 75 re f S
 100 100 m 400 500 l S
endstream
endobj

7 0 obj
 <<
 /Type /Page
 /Contents 8 0 R
 /Parent 2 0 R >>
endobj

8 0 obj
 <<
 /Length 100 >>
stream
10 w
0.5 0.1 0.2 RG
0.0 0.0 0.5 rg
300 300 m
 300 400 400 400 400 300 c
 b
endstream
endobj

trailer
 << /Root 1 0 R
 /Size 9
 >>
%%EOF
PDF opened in PDF reader