CentraDoc 7.2.4 - Summary of Changes since 7.1.3

Introduction

These are the changes to CentraDoc since the version 7.1.3 release of 2022-03-01.

Highlights

Alternate multithreaded RIP - thread per page vs. thread per band
GYP - Generate Your Projects - improved cross platform builds
Cython - added support for building Python extensions
Doxygen rebuilds - doxygen html in separate packaging
Third party library updates
#2444 RIP changes for PostScript page handling
#2432 more RIP examples & distribution includes more code examples
#2412 Generic Logger mechanism

GYP (Generate Your Projects) is a build system originally used for Chromium and V8 (which have since moved on to GN - Generate Ninja). GYP is still in use by the node.js project, as gyp-next: https://github.com/nodejs/gyp-next

GYP can be used to build Visual Studio projects for Windows and Xcode projects for Mac, as well as Makefiles for Linux. GYP can also generate CMake and Ninja build files, although this doesn't work on all platforms. FYI, the Ninja build on Linux is very fast.

This makes it easier to produce the build you need, e.g.:

Which version of Visual Studio?
Static or Dynamic Runtime library?
OpenSSL included or not?

etc.

Prebuilt projects are available in cdoc/build. The older builds in cdoc/ps will go away. The scripts that use GYP to build are in cdoc/src:

gyp-linux
gyp-macos
gyp-win.cmd

as are most of the .gyp files. GYP has no built-in wildcard expansion, so the .gyp files use a Python program (cdoc/src/tools/exp.py) to expand wildcards. GYP is not used to compile the .lidl IDL files, which are shipped precompiled. The various lidl tools are in cdoc/src/tools, including the Liberty cross-platform build engine (build.py), which can be used to run lidl. The cdoc/src/tools folder wasn't previously included in the drop.

PDF

PDF Signature support improvements

#2433 C# OcspSignature.IsTrusted property & SSL reference count fixes
#2375 Now requires OpenSSL 1.1.1m or later - 1.0.2 no longer supported
#2394 Improved support for signed-and-modified PDFs. Includes:
- Better support for change detection and Signature permissions
- Fixed memory leaks and object ownership issues with SSL-related objects, including a crash when parsing an OCSP trust chain
- PDF reader supports explicitly numbered revisions of a file (these are not the same revision numbers that Acrobat displays in the Signature panel)
- cacert.pem is no longer autoloaded into the default trust store; all trust stores must be explicit
- Some changes and additions to the Signature API for clarity
- Certificate_GetFingerprint method for tracking multiple references

Signatures are a complex topic and will require additional reading. The PdfSignatures.pdf document provides a basic overview of the feature and the available API. Sample programs for validating signatures are available for C and C#.

OpenSSL builds

The CentraDoc build typically uses a static link of OpenSSL, built with the same compiler and runtime library options. DLL linkage requires use of the OpenSSL applink.c stub to work around some compatibility issues.

The relevant OpenSSL 1.1.1 library is libcrypto. Due to its size and complexity, OpenSSL source isn't provided in vendor, but ideally OpenSSL is installed in vendor/openssl111 with libraries in vendor/openssl111/lib or vendor/openssl111/lib64, and includes in vendor/openssl111/include. Support for OpenSSL 3 is still pending.

Depending on the options used when building OpenSSL, WS2_32.lib may be required. CentraDoc does not require the socket features at this time, so OpenSSL may be built with no-sock to avoid this linkage.
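
The version constraints described in this section (1.1.1m or later, OpenSSL 3 not yet supported) can be expressed as a small range check. This is only a sketch: openssl_version_supported and the two macro names are hypothetical and not part of CentraDoc. The constants follow OpenSSL's own version number encoding.

```c
#include <stdbool.h>

/* OpenSSL 1.1.x packs its version as 0xMNNFFPPS:
   major, minor, fix, patch letter (a = 1 ... m = 13), status (0xf = release). */
#define MIN_OPENSSL_111M 0x101010dfUL   /* 1.1.1m release (hypothetical macro name) */
#define OPENSSL_3_0_0    0x30000000UL   /* first OpenSSL 3 version number */

/* Hypothetical helper: true for 1.1.1m or later, false for older
   releases and for OpenSSL 3, whose support is still pending. */
bool openssl_version_supported(unsigned long version)
{
    return version >= MIN_OPENSSL_111M && version < OPENSSL_3_0_0;
}
```

In a real build the argument would typically be OPENSSL_VERSION_NUMBER from openssl/opensslv.h, so the check could also be done at compile time with #if.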

PDF Reader

#2473 PDF Structure reader bombed when unloading MCID entries
#2459 PDF with degenerate JBIG2 image caused a 0 byte alloc
#2456 PDFs with nested softmasks losing intermediate softmask state
#2450 PDF transparency group clip checking glitch
#2445 PDF driver may need to know if the JPEG2000 image is indexed, because Acrobat doesn't handle indexed JPEG2000 images correctly. This required a hack of the openjp2 API via openjp2_hack.h.
#2443 PDF utlayer example didn't handle AI layers correctly (crashing).
#2421 PDF pathological pattern trap didn't handle some common image formats
#2419 Make cap->multichar logic slightly less obscure
#2418 Handle bogus encoding in Type 3 font
#2414 For colorspace Separation /None, don't draw anything. Also fixed a bug in the Type 4 function roll operator.
#2395 ignore abandoned layers and layer structure in optional content
#2391 handle bogus object number <= 0 in a reference without choking
#2385 handle rotation (MK/R) when generating annotation appearances; also, text widget annotations should default to a single line unless multiline is explicitly set.
#2377 DvImgGrab didn't provide information about JBIG2 global stream, which is needed to prevent grabbing the JBIG2 without the globals.
#2359 pass the correct page extents to PDI BeginPage
#2267 Transfer function affects soft mask scope
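
For readers unfamiliar with the roll operator mentioned under #2414, the sketch below shows its standard PostScript semantics: the top n stack entries are shifted circularly by j positions, with positive j moving entries toward the top. It is not the CentraDoc implementation; ps_roll, the array-based stack, and the 100-entry window limit are assumptions made for illustration.

```c
#include <stddef.h>

#define ROLL_MAX 100   /* sketch limit on the rolled window (assumption) */

/* Circularly shift the top n entries of a stack of numbers by j positions.
   stack[depth - 1] is the top of the stack.
   Returns 0 on success, -1 if the operands are out of range. */
int ps_roll(double *stack, size_t depth, int n, int j)
{
    double tmp[ROLL_MAX];
    if (n < 0 || n > ROLL_MAX || (size_t)n > depth)
        return -1;
    if (n == 0)
        return 0;                       /* nothing to roll; also avoids j % 0 */

    int shift = j % n;                  /* C's % may be negative, so normalize */
    if (shift < 0)
        shift += n;

    double *win = stack + (depth - (size_t)n);
    for (int i = 0; i < n; i++)
        tmp[(i + shift) % n] = win[i];  /* entry i moves up by shift, wrapping */
    for (int i = 0; i < n; i++)
        win[i] = tmp[i];
    return 0;
}
```

With the stack a b c (c on top), "3 1 roll" gives c a b and "3 -1 roll" gives b c a, which matches the examples in the PostScript Language Reference.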

PDF Writer

#2480 PDF marks not handling extended characters in annotation names correctly
#2477 PDF marks merge annotations now handles field name collision. This helps for the case of merging multiple copies of the same form, but would be problematic for PDFs that actually use duplicate names (rare).
no ticket When embedding or referencing fonts, using the PostScript version of the font name in the PDF file is now the default. This makes Acrobat happier.
#2398 noAppearance flag for PdfFreeTextParms (PdfAnnot_CreateFreeText)

PDF Marks

#2376 PDF export unrotate didn't handle annotation origins correctly

Distiller

#2485 Character capture issues with PDF Type 3 fonts
#2468 improved font glyph capture for Distiller (ps2pdf) (qa/FontTest2.ps)
#2390 Distiller glyph capture didn't track matrix correctly, causing OVERSIZE glyphs when the font size (but not font) changes.
#2389 Distiller didn't preserve character encoding when capturing glyphs. When capturing glyphs, don't render one character at a time, batch up into strings.
#2377 improved driver DvImgGrab example in distiller
#2376 distiller set pdfCropped for correct origin reading PDFs

PDF Enhancements

#2378 PDF-on-PDF placement and JSON stamp definitions
#2400 Improved PDF merging API, additional support pending

XFA

#2483 XFA issue with numeric picture format containing spaces (credit card)
#2481 XFA not cleaned up correctly causing crash in pvsdk usage
#2479 XFA handle right to left subform layout
#2478 XFAF file confused by embedded Calibri fragment, causing crash
- No longer loads Type 0 fonts from the PDF
- No longer loads page fonts from the PDF, unless XFAEmbedPageFonts=Y
- Fixed object leak if XFAEmbedFonts=N
#2458 Fixed case of infinite loop in HTML layout
#2452 changes from #2378 broke gray JPEGs
#2449 V8 glue updated for 10.8.168.22 for security enhancements
#2447 "Missing font" crashing on Linux. Font issues need revisiting.
#2442 V8 build enhancements
- build and test V8 on Linux, both embedded and as jsvm.so
- build is based on monolithic V8 library
- FFS search path is used to load jsvm.dll and jsvm.so
- V8 libraries are not distributed by default, contact Liberty for binaries
#2424 XFA/XFAF fixes
- XFAF=N disables XFAF processing, for testing
- CurrentDate() handles time zone
- some locale support for date and number parsing, German ('de') picture
#2409 issues with AcroForm/DR usage on export/import of annotation blob
#2404 fixed some XFA regressions
#2399 some XFA fixes
- fix for #2384 accidentally centered text fields (PdfXfaToAcroform=Y)
- enhance #1268 to use specified justification of text fields (PdfXfaToAcroform=Y)
- fix for #2303 caused missing data in #2129

C#

C# wrappers updated for signatures and the PDF-on-PDF / JSON stamp logic

#2471 JSG stroke & fill objects didn't register in stamp bounds
#2455 Add GCO.IsDebugBuild & explorer version debug & assemblyinfo debug markers
#2431 change signature API for consistency and thread safe trust store
#2411 Improved out of memory handling, other miscellaneous fixes; BestPPI API
#2410 Improved exception handling, fix for FirstAndLastPage API
#2407 C# API changes
- C# API requires clarifying semantics of Close operations for Pdf, PdfDoc, and PdfUpd classes in C API
- C# API PdfStampPages::AllPages didn't work
- C# API add GCO FailureHandler, WarningHandler

Preview SDK

no ticket Added VERSIONINFO resource to Preview SDK DLL.

RIP enhancements

#2444 RIP changes for PostScript page handling

Knowing the PostScript page size requires executing the PostScript code. PostScript can set the page size while executing, and change it for later pages. This led to some issues when using the RIP. There are some new workarounds for this.

First:

// Query if source is PostScript, if we need special handling
// returns TRUE or FALSE
int CDocRIP_IsPostScript(CDocRIPPtr self);

And:

// get the page size for the next page
// returns ESimpleError
int CDocRIP_GetNextPageBBox(CDocRIPPtr self, double oBBox[4]);

CDocRIP_GetNextPageBBox will run the PostScript from its current location until it gets a setpagedevice specifying the actual page size (which may or may not be there). PDF sources have direct access to this information.

Also:

// turn on auto targeting
void CDocRIP_AutoTarget(CDocRIPPtr self, const char *name, const RIPTargetSpec *spec);

The RIP can automatically create the appropriate sized page buffer based on the input page size and requested PPI. This is especially helpful when running CDocRIP_MTAutoRun, the thread-per-page RIP. See the threadrip.cpp example.
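
To illustrate what sizing a page buffer from the input page size and requested PPI involves, here is a minimal sketch of the arithmetic. It is not the RIP's actual code: page_pixel_size and PixelSize are hypothetical names, the bounding box is assumed to be ordered llx, lly, urx, ury in points (72 per inch), and the real RIP may round differently.

```c
/* PostScript and PDF default user space uses 72 units (points) per inch. */
#define POINTS_PER_INCH 72.0

typedef struct {     /* hypothetical result type for this sketch */
    long width;
    long height;
} PixelSize;

/* bbox is assumed to be {llx, lly, urx, ury} in points.
   Returns 0 on success, -1 if the box is empty or ppi is not positive. */
int page_pixel_size(const double bbox[4], double ppi, PixelSize *out)
{
    double w_pts = bbox[2] - bbox[0];
    double h_pts = bbox[3] - bbox[1];
    if (w_pts <= 0.0 || h_pts <= 0.0 || ppi <= 0.0)
        return -1;

    /* Points -> inches -> pixels, rounded to the nearest whole pixel. */
    out->width  = (long)(w_pts / POINTS_PER_INCH * ppi + 0.5);
    out->height = (long)(h_pts / POINTS_PER_INCH * ppi + 0.5);
    return 0;
}
```

For a US Letter page (612 x 792 points) at 300 PPI this yields a 2550 x 3300 pixel buffer.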

RIP multiple threads and examples

Added the -mp # option for multithreaded RIP by page instead of by band; this only works for PDF files with multiple pages. See the threadrip.cpp example.
Added the RIP example minirip.cpp, which is much simpler than riptest.cpp. riptest.cpp has examples of many features, including the multithreaded band approach.

Multithreading & Stability

#2441 requesting too large an image buffer should fail more cleanly. Previous (intermediate) error logic for other cases was incorrect.
#2439 SSL_SetPersistence for testing. Because the OpenSSL library will not start up again once shut down, this enables repeated shutdown testing by leaving the OpenSSL library up and running.
#2434 Threads created outside of the CentraDoc library that call CentraDoc must call the Thread_Cleanup function to free thread local storage before exiting.
#2422 S_PdfRW_LoadJBIG2Into returned from inside an ERR_TRY block, causing a crash on Linux
#2412 Generic logging engine. See rt_log.lidl & updated examples.
#2408 Exceptions bypassing critical section unlocks could, in some cases, leave locks permanently locked, hanging a multithreaded program
#2402 Multithreaded RIP fixes and enhancements. Fixed some unsafe global accesses that broke multithreading safety. GCO object reference counts now use atomic increment/decrement for safety.
#2386 More safety checks in the setjmp based ERR/exception module. By default, Windows uses SEH instead of setjmp, and Linux/other builds use setjmp, which has restrictions on what's allowed inside the ERR_TRY block. It's possible to test the setjmp logic on Windows by changing rt_except.h and commenting out the logic that turns on SEH, here:

#if defined(_MSC_VER) || defined(__BORLANDC__)
    #define USE_SEH     /* Structured Exception Handling */
#endif
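
The setjmp restrictions and the lock issue described in #2386 and #2408 can be illustrated with a small standalone sketch. It does not use the CentraDoc ERR_TRY macros, whose definitions are not shown here: handler, guarded_call, risky_operation, and the atomic_flag lock are all stand-ins chosen for the example. It shows two points: a local modified between setjmp and longjmp must be volatile, and the unlock must sit on a path that both the normal and the error exit pass through.

```c
#include <setjmp.h>
#include <stdatomic.h>
#include <stdbool.h>

static jmp_buf handler;                       /* one handler; real code keeps a stack */
static atomic_flag lock = ATOMIC_FLAG_INIT;   /* stand-in for a critical section */

static void lock_acquire(void) { while (atomic_flag_test_and_set(&lock)) { } }
static void lock_release(void) { atomic_flag_clear(&lock); }

/* Stand-in for a library routine that raises an error via longjmp. */
static void risky_operation(bool fail)
{
    if (fail)
        longjmp(handler, 1);
}

/* Returns 2 if the protected region completed, 1 if an error interrupted it.
   The lock is released in both cases. */
int guarded_call(bool fail)
{
    /* Locals changed between setjmp and longjmp must be volatile,
       otherwise their value is indeterminate after the jump. */
    volatile int stage = 0;

    lock_acquire();
    if (setjmp(handler) == 0) {
        stage = 1;
        risky_operation(fail);
        stage = 2;            /* only reached when nothing was raised */
    }
    lock_release();           /* runs on the normal path and after a longjmp */
    return stage;
}

/* Test helper: reports whether the lock is currently free. */
bool lock_is_free(void)
{
    bool was_set = atomic_flag_test_and_set(&lock);
    if (!was_set)
        atomic_flag_clear(&lock);
    return !was_set;
}
```

A return statement placed inside the protected region would skip the release and leave the handler pointing at a dead stack frame, which is the kind of problem #2422 describes.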

Miscellaneous

Support for GYP builds
#2470 FFS_IsUnbounded(ffs) provides better detection of unbounded PostScript data sources for DvImgGrab
#2461 Automatically generated Cython interface definitions, provides substrate for building Python extensions.
#2453 Axial shader divide by zero on "difficult" inputs
#2451 Improved axial function digest, affects RIP rendering and custom driver implementations of gradients. Integrated Type 4 via resampling.
#2427 ImageResizerMaxWH=# to automatically downscale large images
#2316 More cleanup of compiler warnings & odd constructs
#1631 Add Windows default mapping for Helvetica-Narrow font as Arial Narrow.
GCO C++ HClass wrappers removed in preference for the GCO COM wrappers. Here is an example accessing FFs classes using the GCO COM wrappers:

#include "ffsstd.h"

int main() {
    // Write to standard output through the standard FFs factory.
    IFFsStdFactory *ffsStdFactory = GetIFFsStdFactory();
    IFFs *out = ffsStdFactory->NewOut();
    out->WriteCStr("Hello world\n");
    out->Release();

    // Write the same text to a file through the OS FFs factory.
    IFFsOSFactory *ffsOSFactory = GetIFFsOSFactory();
    out = ffsOSFactory->New("TESTFILE.TXT", FFS_WRITE);
    out->WriteCStr("Hello world\n");
    out->Release();
    return 0;
}
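
Regarding the axial shader divide by zero in #2453: the ticket does not say where the division occurred, but one place it can arise in any axial (PDF Type 2) shading is the projection of a point onto the shading axis, whose denominator is the squared axis length. The sketch below shows that projection with a guard. axial_param is a hypothetical name and this is not the CentraDoc code.

```c
/* Project (x, y) onto the axis from (x0, y0) to (x1, y1) and store the
   parametric position t (0 at the start point, 1 at the end point).
   Returns 0 on success, -1 if the axis is degenerate. */
int axial_param(double x0, double y0, double x1, double y1,
                double x, double y, double *t)
{
    double dx = x1 - x0;
    double dy = y1 - y0;
    double len2 = dx * dx + dy * dy;

    /* Written as !(len2 > 0) so that a NaN length is rejected as well
       as coincident start and end points. */
    if (!(len2 > 0.0))
        return -1;

    *t = ((x - x0) * dx + (y - y0) * dy) / len2;
    return 0;
}
```

A caller that gets -1 has nothing to project onto and must decide separately what, if anything, to paint. Very short axes can also underflow len2 to zero, so they take the same path.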

Some of the code examples added to the distribution:
- pdfdoc/ut
  - pdfobject.c - for inspecting objects in a pdf file
  - pdfstream.c - for inspecting streams in a pdf file
- pdfupd/examples - various create examples
  - pdfcreate1.c
  - pdfFormExample.c
  - pdflayers.c
  - pdftrimesh.c
  - pdftype1font.c
  - pdftype3font.c
- pdfupd/examples/docs - pretty printed versions of above
- pdfupd/ut
  - pdfcp.c - pdf file copy
  - pdfmarkscmd.c - pdf marks command line utility
  - pdfpgcpy.c - pdf page copy
  - pdfrestr.c - pdf replace stream (re: pdfstream.c above)
  - pdfupdobj.c - pdf update object (re: pdfobject.c above)
  - pdfxpc.c - generate transparency example files
- ut/
  - minirip.cpp - tiny RIP
  - riptest.cpp - extensive RIP
  - threadrip.cpp - multithreaded RIP
  - ps2pdf.c - distiller
- ut/docs - pretty printed versions of above

Third-party libraries

AntiGrain - 2.4
FreeType - 2.12.1 UPDATED
JPEG - JPEG-9e
LCMS - 2.12
libdmtx - 0.7.7 UPDATED
libpng - 1.6.38 UPDATED
OpenJPEG - 2.5.0 UPDATED
QR-Code-generator-c - ~2018
sha2 - 1.0.1
YAJL - 2.0.1
ZLib - 1.2.13 UPDATED