CentraDoc 7.2.4 - Summary of Changes since 7.1.3

Introduction

These are the changes to CentraDoc since the Version 7.1.3 release of 2022-03-01.

Highlights

  • Alternate multithreaded RIP - thread per page vs. thread per band
  • GYP - Generate Your Projects - improved cross platform builds
  • Cython - added support for building Python extensions
  • Doxygen rebuilds - doxygen html in separate packaging
  • Third party library updates
  • #2444 RIP changes for PostScript page handling
  • #2432 more RIP examples & distribution includes more code examples
  • #2412 Generic Logger mechanism

GYP

Generate-Your-Projects is a build system originally used for Chromium and V8 (which has moved on to GN - Generate Ninja). GYP is still in use by the node.js project, as gyp-next: https://github.com/nodejs/gyp-next

GYP can be used to generate Visual Studio projects for Windows and Xcode projects for Mac, as well as Makefiles for Linux. GYP can also generate CMake and Ninja build files, although this doesn't work on all platforms. FYI, the Ninja build on Linux is very fast.

This makes it easier to produce the build you need, e.g.:

etc.

Prebuilt projects are available in cdoc/build. The older builds in cdoc/ps will go away. The scripts that use GYP to build are in cdoc/src:

as are most of the .gyp files. The gyp files use a Python program (cdoc/src/tools/exp.py) to expand wildcards, because wildcard expansion isn't built in to GYP. GYP is not used to compile the .lidl IDL files, which are shipped precompiled. The various lidl tools are in cdoc/src/tools, including the Liberty cross-platform build engine (build.py), which can be used to run lidl. The cdoc/src/tools folder wasn't previously included in the drop.

PDF

PDF Signature support improvements

  • #2433 C# OcspSignature.IsTrusted property & SSL reference count fixes
  • #2375 Now requires OpenSSL 1.1.1m and later - 1.0.2 no longer supported
  • #2394 Improved support for signed-and-modified PDFs. Includes:
    • Better support for change detection and Signature permissions
    • Fixed memory leaks and object ownership issues with SSL-related objects, including a crash when parsing the OCSP trust chain.
    • PDF reader supports explicitly numbered revisions of a file (these are not the same revision numbers that Acrobat displays in the Signature panel).
    • No longer autoloads cacert.pem into the default trust store; all trust stores must be explicit.
    • some changes and additions to the Signature API for clarity
    • Certificate_GetFingerprint method for tracking multiple references

Signatures are a complex topic and will require additional reading. The PdfSignatures.pdf document provides a basic overview of the feature and the available API. Sample programs for validating signatures are available for C and C#.

OpenSSL builds

The CentraDoc build typically uses a static link of OpenSSL, built with the same compiler and runtime library options. DLL linkage requires use of the openssl applink.c stub to work around some compatibility issues.

The relevant OpenSSL 1.1.1 library is libcrypto. Due to its size and complexity, OpenSSL source isn't provided in vendor, but ideally OpenSSL is installed in vendor/openssl111, with libraries in vendor/openssl111/lib or vendor/openssl111/lib64 and includes in vendor/openssl111/include. Support for OpenSSL 3 is still pending.
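
The version requirement (1.1.1m or later, with OpenSSL 3 support still pending) can be checked at build or start-up time. The sketch below is illustrative only and is not part of the CentraDoc API: the helper name cdoc_openssl_version_ok and the SKETCH_ macros are hypothetical, and it assumes the 0xMNNFFPPS encoding that OpenSSL 1.1.1 uses for OPENSSL_VERSION_NUMBER, in which release 1.1.1m is 0x101010dfL.

```c
/* Sketch only: the helper name is hypothetical, not a CentraDoc function.
 * Assumes the OpenSSL 1.1.1 encoding 0xMNNFFPPS
 * (major, minor, fix, patch, status). */
#define SKETCH_OPENSSL_MIN 0x101010dfUL /* 1.1.1m release */
#define SKETCH_OPENSSL_3   0x30000000UL /* first OpenSSL 3 version number */

/* Returns 1 if the version is 1.1.1m or later but still below OpenSSL 3,
 * otherwise 0. */
int cdoc_openssl_version_ok(unsigned long version)
{
    return version >= SKETCH_OPENSSL_MIN && version < SKETCH_OPENSSL_3;
}
```

In a real build the argument would be OPENSSL_VERSION_NUMBER from <openssl/opensslv.h>, or the run-time value returned by OpenSSL_version_num().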

Depending on the options used when building OpenSSL, linking against WS2_32.lib may be required. CentraDoc does not require OpenSSL's socket features at this time, so OpenSSL may be built with no-sock to avoid this linkage.

PDF Reader

  • #2473 PDF Structure reader bombed when unloading MCID entries
  • #2459 PDF with degenerate JBIG2 image caused a 0 byte alloc
  • #2456 PDFs with nested softmasks losing intermediate softmask state
  • #2450 PDF transparency group clip checking glitch
  • #2445 PDF driver may need to know if the JPEG2000 image is indexed, because Acrobat doesn't handle indexed JPEG2000 images correctly. This required a hack of the openjp2 API via openjp2_hack.h.
  • #2443 PDF utlayer example didn't handle AI layers correctly (crashing).
  • #2421 PDF pathological pattern trap didn't handle some common image formats
  • #2419 Make cap->multichar logic slightly less obscure
  • #2418 Handle bogus encoding in Type 3 font
  • #2414 For colorspace Separation /None, don't draw anything.
    • Also fixes a bug in the Type 4 function roll operator
  • #2395 ignore abandoned layers and layer structure in optional content
  • #2391 handle bogus object number <= 0 in a reference without choking
  • #2385 handle rotation (MK/R) when generating annotation appearances; also, text widget annotations should default to a single line unless multiline is explicitly set.
  • #2377 DvImgGrab didn't provide information about JBIG2 global stream, which is needed to prevent grabbing the JBIG2 without the globals.
  • #2359 pass the correct page extents to PDI BeginPage
  • #2267 Transfer function affects soft mask scope
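
For #2391, the PDF specification defines object numbers as positive integers, so a reference such as "0 0 R" or "-3 0 R" can never name a real object. A minimal sketch of that check follows; the function name and the sscanf-based parsing are illustrative assumptions, not the CentraDoc reader's actual code.

```c
#include <stdio.h>

/* Sketch only: hypothetical helper, not the CentraDoc reader's code.
 * Parses an indirect reference of the form "N G R".
 * Returns 1 and fills objNum/genNum for a usable reference, 0 otherwise. */
int parse_indirect_ref(const char *s, long *objNum, long *genNum)
{
    long n, g;
    char keyword;

    /* Expect two integers followed by the keyword R. */
    if (sscanf(s, "%ld %ld %c", &n, &g, &keyword) != 3)
        return 0;
    if (keyword != 'R')
        return 0;

    /* Object numbers must be positive; generation numbers non-negative. */
    if (n <= 0 || g < 0)
        return 0;

    *objNum = n;
    *genNum = g;
    return 1;
}
```

A caller can treat a rejected reference as the null object, which is how the PDF specification treats references to undefined objects.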

PDF Writer

  • #2480 PDF marks not handling extended characters in annotation names correctly
  • #2477 PDF marks merge annotations now handles field name collision. This helps for the case of merging multiple copies of the same form, but would be problematic for PDFs that actually use duplicate names (rare).
  • (no ticket) When embedding or referencing fonts, the PostScript version of the font name is now used in the PDF file by default. This makes Acrobat happier.
  • #2398 noAppearance flag for PdfFreeTextParms (PdfAnnot_CreateFreeText)

PDF Marks

  • #2376 PDF export unrotate didn't handle annotation origins correctly

Distiller

  • #2485 Character capture issues with PDF Type 3 fonts
  • #2468 improved font glyph capture for Distiller (ps2pdf) (qa/FontTest2.ps)
  • #2390 Distiller glyph capture didn't track matrix correctly, causing OVERSIZE glyphs when the font size (but not font) changes.
  • #2389 Distiller didn't preserve character encoding when capturing glyphs. When capturing glyphs, don't render one character at a time, batch up into strings.
  • #2377 improved driver DvImgGrab example in distiller
  • #2376 distiller set pdfCropped for correct origin reading PDFs

PDF Enhancements

  • #2378 PDF-on-PDF placement and JSON stamp definitions
  • #2400 Improved PDF merging API, additional support pending

XFA

  • #2483 XFA issue with numeric picture format containing spaces (credit card)
  • #2481 XFA not cleaned up correctly causing crash in pvsdk usage
  • #2479 XFA handle right to left subform layout
  • #2478 XFAF file confused by embedded Calibri fragment, causing crash
    • No longer loads Type 0 fonts from the PDF
    • No longer loads page fonts from the PDF, unless XFAEmbedPageFonts=Y
    • Fixed object leak if XFAEmbedFonts=N
  • #2458 Fixed case of infinite loop in HTML layout
  • #2452 changes from #2378 broke gray JPEGs
  • #2449 V8 glue updated for 10.8.168.22 for security enhancements
  • #2447 "Missing font" crashing on Linux. Font issues need revisiting.
  • #2442 V8 build enhancements
    • build and test V8 on Linux, both embedded and as jsvm.so
    • build is based on monolithic V8 library
    • FFS search path is used to load jsvm.dll and jsvm.so
    • V8 libraries are not distributed by default, contact Liberty for binaries
  • #2424 XFA/XFAF fixes
    • XFAF=N disables XFAF processing, for testing
    • CurrentDate() handles time zone
    • some locale support for date and number parsing, German ('de') picture
  • #2409 issues with AcroForm/DR usage on export/import of annotation blob
  • #2404 fixed some XFA regressions
  • #2399 some XFA fixes
    • fix for #2384 accidentally centered text fields (PdfXfaToAcroform=Y)
    • enhance #1268 to use specified justification of text fields (PdfXfaToAcroform=Y)
    • fix for #2303 caused missing data in #2129

C#

C# wrappers updated for signatures and the PDF-on-PDF / JSON stamp logic

Preview SDK

RIP enhancements

Knowing the PostScript page size requires executing the PostScript code. PostScript can set the page size while executing, and change it for later pages. This led to some issues when using the RIP. There are some new workarounds for this.

First:

// Query if source is PostScript, if we need special handling
// returns TRUE or FALSE
int CDocRIP_IsPostScript(CDocRIPPtr self);

And:

// get the page size for the next page
// returns ESimpleError
int CDocRIP_GetNextPageBBox(CDocRIPPtr self, double oBBox[4]);

CDocRIP_GetNextPageBBox will run the PostScript from its current location until it reaches a setpagedevice specifying the actual page size (which may or may not be there). PDF sources have direct access to this information.

Also:

// turn on auto targeting
void CDocRIP_AutoTarget(CDocRIPPtr self, const char *name, const RIPTargetSpec *spec);

The RIP can automatically create the appropriate sized page buffer based on the input page size and requested PPI. This is especially helpful when running CDocRIP_MTAutoRun, the thread-per-page RIP. See the threadrip.cpp example.
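
The sizing that auto targeting performs can be pictured as simple arithmetic: page dimensions in PostScript points (1/72 inch) scaled by the requested PPI. The sketch below is an illustration under stated assumptions, not the RIP's implementation: the helper names are hypothetical, and it assumes the bounding box is ordered {x0, y0, x1, y1} in points, as one might receive from CDocRIP_GetNextPageBBox.

```c
/* Sketch only: hypothetical helpers, not part of the CentraDoc RIP API.
 * Assumes bbox = {x0, y0, x1, y1} in points (1/72 inch). */

/* Round up to the next whole pixel without needing <math.h>. */
static long sketch_ceil(double v)
{
    long n = (long)v;
    return ((double)n < v) ? n + 1 : n;
}

/* Computes the pixel size of a page buffer for the given PPI.
 * Returns 0 on success, -1 if the box is empty or the PPI is not positive. */
int sketch_target_pixels(const double bbox[4], double ppi,
                         long *width, long *height)
{
    double w = bbox[2] - bbox[0];
    double h = bbox[3] - bbox[1];

    if (w <= 0.0 || h <= 0.0 || ppi <= 0.0)
        return -1;

    *width  = sketch_ceil(w * ppi / 72.0);
    *height = sketch_ceil(h * ppi / 72.0);
    return 0;
}
```

For a US Letter page (612 x 792 points) at 300 PPI this gives a 2550 x 3300 pixel buffer.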

RIP multiple threads and examples

Multithreading & Stability

#if defined(_MSC_VER) || defined(__BORLANDC__)
    #define USE_SEH     /* Structured Exception Handling */
#endif

Miscellaneous

#include "ffsstd.h"

int main() {
    // Write to standard output through the FFS stream interface.
    IFFsStdFactory *ffsStdFactory = GetIFFsStdFactory();
    IFFs *out = ffsStdFactory->NewOut();
    out->WriteCStr("Hello world\n");
    out->Release();

    // Write the same text to a file through the OS file factory.
    IFFsOSFactory *ffsOSFactory = GetIFFsOSFactory();
    out = ffsOSFactory->New("TESTFILE.TXT", FFS_WRITE);
    out->WriteCStr("Hello world\n");
    out->Release();
    return 0;
}

Third-party libraries

References