Thursday, January 23, 2020

Poor Image Quality in Generated PDF

Happy New Year!

We recently had the opportunity to upgrade some code related to PDF generation. The old code positioned elements manually (e.g. 100px from the left, 20px from the top) in the output. The revised strategy was to adopt a Word document as the template, with placeholders prepared, and to bind an XML document to the template at runtime. This is trivial only for text content, though. Images were slightly more complicated, but we were able to handle them with slight adjustments to Docx4J.

The problem arose, however, when we noticed that an image with fine detail looked very poor in quality when viewed in the PDF. Yet if the image was copied out of the PDF and pasted into a separate image viewer, it looked clean and crisp.

My initial assessments all led to dead ends:
  1. It was not due to AffineTransform producing poor output.
  2. It was not due to the process of converting the DOCX to PDF format.
  3. The image format provided (PNG/JPEG/BMP) made no difference.
  4. It was not due to a difference in colour spaces (e.g. BufferedImage.TYPE_INT_ARGB).
  5. Setting the DPI in the PNG metadata made no difference.

After a bit of investigation, stepping into the Docx4J classes at runtime, I started to suspect the way the DPI was being used. Diving into the library sources, I noticed that the preprocessing stage for generating the PDF attempts to determine the dimensions of each image. From there, I surmised that I could adjust those dimensions in the code on my end.

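import org.docx4j.dml.CTPositiveSize2D;
import org.docx4j.dml.wordprocessingDrawing.Inline;
import org.docx4j.openpackaging.parts.WordprocessingML.BinaryPartAbstractImage;

// create the image part from the raw image bytes
BinaryPartAbstractImage imagePart = BinaryPartAbstractImage.createImagePart(wordMLPackage, imageBytes);
// derive the inline element that places the image in the document
Inline inline = imagePart.createImageInline(null, "image alt", 0, 1, true);
// the fix: shrink the extent Docx4J computed to 75% of its size
CTPositiveSize2D ext = inline.getExtent();
ext.setCx((long) (ext.getCx() * 0.75));
ext.setCy((long) (ext.getCy() * 0.75));
inline.setExtent(ext);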

After adding the fix (the lines that scale the extent by 0.75), the image appeared much cleaner. When copied into an external image editor, the image looked much closer in scale to its counterpart viewed in the PDF viewer at 100% zoom. Of course, the source image's pixel dimensions then have to be made larger to compensate.
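To see why shrinking the extent sharpens the image, it may help to work through the arithmetic. The sketch below is not Docx4J code: the class and method names are hypothetical, and both the 96 DPI starting point and the 800 px width are assumptions picked for illustration, not values confirmed by the library. The extent itself is stored in EMU (914,400 per inch). The same pixels squeezed into 75% of the length end up about 1.33 times as dense.

```java
// Hypothetical helper, for illustration only; not part of Docx4J.
final class ExtentScaling {
    // English Metric Units per inch, the unit used by the inline extent.
    static final long EMU_PER_INCH = 914_400L;

    // Physical length (in EMU) of a run of pixels at a given DPI.
    static long pixelsToEmu(long pixels, int dpi) {
        return pixels * EMU_PER_INCH / dpi;
    }

    // Same truncating cast as the fix: (long) (value * factor).
    static long scaleExtent(long emu, double factor) {
        return (long) (emu * factor);
    }

    // Pixel density that results from fitting the pixels into the extent.
    static double effectiveDpi(long pixels, long emu) {
        return pixels * (double) EMU_PER_INCH / emu;
    }

    public static void main(String[] args) {
        long widthPx = 800;   // assumed image width
        int assumedDpi = 96;  // assumed DPI behind the original extent

        long original = pixelsToEmu(widthPx, assumedDpi);
        long scaled = scaleExtent(original, 0.75);

        System.out.println(original);                      // 7620000
        System.out.println(scaled);                        // 5715000
        System.out.println(effectiveDpi(widthPx, scaled)); // 128.0
    }
}
```

Under these assumptions the image occupies three quarters of the length it did before, which is why the source image has to carry more pixels to cover the same area on the page.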

As an aside, I also learnt about two new units of measurement:
  1. "mpt" - millipoints
  2. "twip" - twentieth of a point
Not that they are useful in any way outside of this situation.
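For anyone who does bump into these units, the relationships are simple enough to sketch. The class and method names below are made up for illustration; only the ratios are standard (72 points per inch, 1,000 millipoints and 20 twips per point, 12,700 EMU per point).

```java
// Hypothetical helper, for illustration only; not part of Docx4J.
final class PrintUnits {
    static final long MILLIPOINTS_PER_POINT = 1_000L;
    static final long TWIPS_PER_POINT = 20L;
    static final long EMU_PER_POINT = 12_700L;
    static final long POINTS_PER_INCH = 72L;

    // millipoints -> twips (integer division, so sub-twip remainders are dropped)
    static long millipointsToTwips(long mpt) {
        return mpt * TWIPS_PER_POINT / MILLIPOINTS_PER_POINT;
    }

    // twips -> EMU (one twip is exactly 635 EMU)
    static long twipsToEmu(long twips) {
        return twips * EMU_PER_POINT / TWIPS_PER_POINT;
    }

    // whole inches -> millipoints
    static long inchesToMillipoints(long inches) {
        return inches * POINTS_PER_INCH * MILLIPOINTS_PER_POINT;
    }

    public static void main(String[] args) {
        long oneInchMpt = inchesToMillipoints(1);
        long oneInchTwips = millipointsToTwips(oneInchMpt);

        System.out.println(oneInchMpt);               // 72000
        System.out.println(oneInchTwips);             // 1440
        System.out.println(twipsToEmu(oneInchTwips)); // 914400
    }
}
```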