@imgly/pdf-importer
PDF Importer for the CE.SDK
Overview
The PDF Importer for the CE.SDK allows you to seamlessly integrate PDF files into the editor while retaining essential design attributes.
Here's an overview of the main features:
- File Format Translation: The importer converts PDF files into the CE.SDK scene file format using pdfjs-dist (Mozilla's pdf.js), in its legacy/CJS-safe build.
- Bulk Importing: The codebase is adaptable for bulk importing, streamlining large-scale projects.
- Color Translation: RGB, CMYK, and Separation spot colors from PDFs are translated into CE.SDK's native RGBAColor, CMYKColor, and SpotColor variants. CMYK values are preserved end-to-end instead of collapsed to sRGB; Separation inks are registered on the document's spot-color registry (via engine.editor.setSpotColorCMYK) with their declared alternate CMYK values. DeviceN inks degrade to their alternate-space solid for now.
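To make the Color Translation bullet more concrete, here is a minimal sketch of how a parsed PDF color might map onto the three variants. The input shape (`space`, `components`, `alternateCMYK`), the function name `translateColor`, and the `registerSpotCMYK` callback are assumptions made for illustration and are not the importer's internals; the output objects only mirror the fields CE.SDK's color types are understood to carry. DeviceN handling is left out.

```javascript
// Map a parsed PDF color (hypothetical input shape) onto plain objects that
// mirror CE.SDK's RGBAColor, CMYKColor, and SpotColor variants.
// `registerSpotCMYK` stands in for engine.editor.setSpotColorCMYK.
function translateColor(pdfColor, registerSpotCMYK) {
  switch (pdfColor.space) {
    case "DeviceRGB": {
      const [r, g, b] = pdfColor.components;
      return { r, g, b, a: 1 };
    }
    case "DeviceCMYK": {
      // Keep the CMYK components as-is instead of converting to sRGB.
      const [c, m, y, k] = pdfColor.components;
      return { c, m, y, k, tint: 1 };
    }
    case "Separation": {
      // Register the ink with its declared alternate CMYK values, then
      // reference it by name.
      const [c, m, y, k] = pdfColor.alternateCMYK;
      registerSpotCMYK(pdfColor.name, c, m, y, k);
      return { name: pdfColor.name, tint: pdfColor.tint, externalReference: "" };
    }
    default:
      throw new Error(`Unsupported color space: ${pdfColor.space}`);
  }
}

// Usage with a Map as a stand-in for the document's spot-color registry.
const registry = new Map();
const spot = translateColor(
  { space: "Separation", name: "Brand Orange", tint: 1, alternateCMYK: [0, 0.6, 1, 0] },
  (name, c, m, y, k) => registry.set(name, { c, m, y, k }),
);
console.log(spot.name, registry.get("Brand Orange"));
```

With a real engine, the callback could presumably be `(name, c, m, y, k) => engine.editor.setSpotColorCMYK(name, c, m, y, k)`; check the CE.SDK API reference for the exact signature before relying on it.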
The following PDF design elements will be preserved by the import:
- Positioning and Rotation: Elements' positioning and rotation are accurately transferred.
- Image Elements: Embedded images (JPEG, PNG) are extracted and placed as graphic blocks. Note that only images with formats that are supported by CE.SDK will be rendered.
- Text Elements: Font family continuity is maintained, with options to supply font URIs or use Google fonts. Bold, italic, and weight styles are supported.
- Vector Paths: SVG path data from the PDF is imported as vector path blocks.
- Colors and Gradients: Solid colors, linear gradients, and radial gradients are faithfully reproduced.
- Spot Color Detection: Cut/fold marks using spot colors (CutContour, Thru-cut, etc.) are detected and can be handled separately. Brand spot inks (Separation / DeviceN entries in the page's /ColorSpace resource dictionary) are preserved as SpotColor fills and registered on the CE.SDK document-level spot registry.
How It Works
The importer runs a three-stage pipeline on each PDF page:
- Extract: a pdfjs-dist operator walker emits drawable blocks (images, vector paths, text outlines) in paint order. page.getTextContent() produces one editable text run per line.
- Post-process: adjacent text runs with the same font/size and horizontal overlap are merged into multi-line paragraph blocks.
- Emit: the intermediate representation is written as CE.SDK blocks: text, image, vector, outline.
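The post-process stage can be pictured with a small sketch. It assumes a hypothetical run shape (`text`, `font`, `size`, `x`, `width`) and applies only the two criteria named above, matching font/size and horizontal overlap; the importer's actual merge logic is not shown here and may well apply further checks.

```javascript
// True when the horizontal extents [x, x + width] of two runs intersect.
function overlapsHorizontally(a, b) {
  return a.x < b.x + b.width && b.x < a.x + a.width;
}

// Merge consecutive runs (assumed to be in reading order, one per line)
// into paragraphs when font and size match and the lines overlap in x.
function mergeTextRuns(runs) {
  const paragraphs = [];
  for (const run of runs) {
    const last = paragraphs[paragraphs.length - 1];
    const prev = last && last.runs[last.runs.length - 1];
    if (
      prev &&
      prev.font === run.font &&
      prev.size === run.size &&
      overlapsHorizontally(prev, run)
    ) {
      // Same style and overlapping extent: append as a new line.
      last.runs.push(run);
      last.text += "\n" + run.text;
    } else {
      // Otherwise start a new paragraph.
      paragraphs.push({ font: run.font, size: run.size, text: run.text, runs: [run] });
    }
  }
  return paragraphs;
}

const merged = mergeTextRuns([
  { text: "Hello", font: "Inter", size: 12, x: 10, width: 40 },
  { text: "world", font: "Inter", size: 12, x: 10, width: 42 },
  { text: "Title", font: "Inter", size: 24, x: 10, width: 80 },
]);
// Two paragraphs result: "Hello\nworld" and "Title".
console.log(merged.map((p) => p.text));
```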
PDF points are converted to inches (1pt = 1/72 inch) for CE.SDK design units. Embedded images become buffer:// URIs; engine-provided fonts use bundle:// URIs.
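As a quick illustration of the unit conversion, the following sketch converts PDF point values to inches. The helper names `pointsToInches` and `rectToInches` and the rectangle shape are hypothetical and not part of the importer's API; only the 1 pt = 1/72 inch ratio comes from the text above.

```javascript
// PDF user space is measured in points; 72 points make one inch.
const POINTS_PER_INCH = 72;

// Convert a single length from PDF points to inches.
function pointsToInches(points) {
  return points / POINTS_PER_INCH;
}

// Convert a rectangle given in points (hypothetical shape) to inches.
function rectToInches({ x, y, width, height }) {
  return {
    x: pointsToInches(x),
    y: pointsToInches(y),
    width: pointsToInches(width),
    height: pointsToInches(height),
  };
}

// A US Letter page is 612 x 792 pt.
const letter = rectToInches({ x: 0, y: 0, width: 612, height: 792 });
console.log(letter.width, letter.height); // 8.5 11
```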
Installation
You can install @imgly/pdf-importer via npm or yarn. Use the following commands to install the package:
npm install @imgly/pdf-importer
yarn add @imgly/pdf-importer
Browser Quick-Start Example
import CreativeEngine from "@cesdk/engine";
import { PDFParser, addGfontsAssetLibrary } from "@imgly/pdf-importer";
const blob = await fetch("https://example.com/document.pdf").then((res) =>
  res.blob()
);
const engine = await CreativeEngine.init({
  license: "YOUR_LICENSE",
});
// Use Google Fonts to replace well-known fonts in the default font resolver.
await addGfontsAssetLibrary(engine);
const parser = await PDFParser.fromFile(engine, blob);
await parser.parse();
const image = await engine.block.export(
  engine.block.findByType("//ly.img.ubq/page")[0],
  "image/png"
);
const sceneExportUrl = window.URL.createObjectURL(image);
console.log("The imported PDF file looks like:", sceneExportUrl);
// You can now, e.g., export the scene as an archive with engine.scene.saveToArchive()
Saving Scenes with Stable URLs
By default, the PDF importer creates internal buffer:// URLs for embedded images. These are transient resources that work well when saving to an archive (engine.scene.saveToArchive()), which bundles all assets together.
However, if you want to save scenes as JSON strings (engine.scene.saveToString()) with stable, permanent URLs (e.g., for storing in a database or referencing CDN-hosted assets), you need to relocate the transient resources first.
Why Relocate?
- Scene Archives (saveToArchive): Include all assets in a single ZIP file. Transient buffer:// URLs work fine.
- Scene Strings (saveToString): Only contain references to assets. Transient URLs won't work when reloading the scene later. You need permanent URLs (e.g., https://).
How to Relocate Transient Resources
After parsing the PDF file, use CE.SDK's native APIs to find and relocate all transient resources:
// 1. Parse the PDF file
const parser = await PDFParser.fromFile(engine, blob);
await parser.parse();
// 2. Find all transient resources (embedded images from the PDF)
const transientResources = engine.editor.findAllTransientResources();
// 3. Upload each resource and relocate to permanent URL
for (const resource of transientResources) {
  const { URL: bufferUri, size } = resource;
  // Extract binary data from the buffer
  const data = engine.editor.getBufferData(bufferUri, 0, size);
  // Upload to your backend/CDN (implement your own upload logic)
  const permanentUrl = await uploadToBackend(data);
  // Relocate the resource to the permanent URL
  engine.editor.relocateResource(bufferUri, permanentUrl);
}
// 4. Now save to string - all URLs will be permanent
const sceneString = await engine.scene.saveToString();
Note on Font URLs
When using addGfontsAssetLibrary() (the default font resolver), the resulting scene string will contain Google CDN URLs for fonts. If you need fonts hosted on your own infrastructure, configure a custom font resolver instead of using the default Google Fonts integration.
Font Strategies
The importer ships with three font-handling presets that trade editability against visual fidelity. Pick one via PDFParser.fromFile(engine, blob, { fontStrategy }), or compose your own with createFontStrategy / createFontCascade.
| Preset | Behavior | When to use |
|---|---|---|
| editableFirstStrategy (default) | perfect-match → PDF-embedded subset bytes → any-match substitution | General-purpose import. Prefers asset-library typefaces for editability, falls back to the PDF's embedded subset for fidelity, substitutes when neither is available. |
| exactFidelityStrategy | perfect-match → PDF-embedded subset bytes | Print finalization. Never substitutes; falls through to vector outline when no matching typeface or embedded font is available. |
| assetLibraryStrategy | perfect-match → any-match substitution | Brand-locked tools. Skips the embedded-subset stage so only asset-library typefaces are used; non-matching fonts go through substitution or vector outline. |
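The stage order in the table can be modelled with a small, self-contained sketch. Everything in it (the stage functions, `resolveFont`, the request and context shapes) is hypothetical and only mirrors the described behavior; it is not the package's createFontStrategy / createFontCascade API, whose signatures are not covered here.

```javascript
// Each stage inspects a font request and returns a resolution or null.
// All names here are hypothetical; this is not the package's API.
const perfectMatch = (req, ctx) =>
  ctx.library.has(req.family) ? { kind: "typeface", family: req.family } : null;
const embeddedSubset = (req) =>
  req.embeddedBytes ? { kind: "embedded", bytes: req.embeddedBytes } : null;
const anyMatch = (req, ctx) =>
  ctx.fallbackFamily ? { kind: "substitute", family: ctx.fallbackFamily } : null;

// Try the stages in order; fall through to a vector outline if none apply.
function resolveFont(stages, req, ctx) {
  for (const stage of stages) {
    const result = stage(req, ctx);
    if (result) return result;
  }
  return { kind: "outline" };
}

// Stage lists that mirror the three presets in the table.
const editableFirst = [perfectMatch, embeddedSubset, anyMatch];
const exactFidelity = [perfectMatch, embeddedSubset];
const assetLibrary = [perfectMatch, anyMatch];

const ctx = { library: new Set(["Roboto"]), fallbackFamily: "Roboto" };
const unknownFont = { family: "Obscure Sans", embeddedBytes: null };
console.log(resolveFont(editableFirst, unknownFont, ctx).kind); // substitute
console.log(resolveFont(exactFidelity, unknownFont, ctx).kind); // outline
```

Modelling each preset as an ordered list of stages makes the trade-off visible: dropping the substitution stage is what turns an unmatched font into an outline instead of a lookalike typeface.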
import { PDFParser, exactFidelityStrategy } from "@imgly/pdf-importer";
const parser = await PDFParser.fromFile(engine, blob, {
  fontStrategy: exactFidelityStrategy,
});
await parser.parse();
NodeJS Quick-Start Example
Prerequisite: emoji handling. Two CE.SDK settings need attention when running the importer headlessly under @cesdk/node:
- ubq://forceSystemEmojis = false: By default the engine routes any codepoint that ICU classifies as RGI_Emoji (e.g. ♥, ★, the dingbats block) through the emoji font even when the active typeface has a glyph for it. Customer PDFs frequently embed real text fonts (ZapfDingbats, Webdings, …) that map these codepoints to actual glyphs; forcing the substitution discards the producer's intended glyph and pulls in a generic color emoji. Setting the flag to false makes the engine respect the embedded/substituted font when it covers the codepoint.
- ubq://defaultEmojiFontFileUri = <CDN URL>: @cesdk/node ships only assets/core/, not assets/emoji/NotoColorEmoji.ttf. Even with forceSystemEmojis=false, true color emoji that no embedded text font covers still need a working emoji font URI, or engine.block.export(page, "image/png") aborts with FILE_FETCH_FAILED for the engine's synthesised local-file URL. Point the engine at the IMG.LY-hosted preset, or self-host the file and supply your own URI / bundle:// path.
engine.editor.setSettingBool("ubq://forceSystemEmojis", false);
engine.editor.setSettingString(
  "ubq://defaultEmojiFontFileUri",
  "https://cdn.img.ly/assets/v4/emoji/NotoColorEmoji.ttf",
);
See the CE.SDK Emojis guide for the full set of options. Browser consumers initialised with the default IMG.LY-CDN baseURL already get the emoji font for free, and most integrations also want forceSystemEmojis=false for the same embedded-font-respect reason.
// index.mjs
import CreativeEngine from "@cesdk/node";
import { promises as fs } from "fs";
import { PDFParser, addGfontsAssetLibrary } from "@imgly/pdf-importer";
async function main() {
  const engine = await CreativeEngine.init({
    license: "YOUR_LICENSE",
  });
  // Respect embedded fonts for emoji-class codepoints (♥, ★, …) and
  // give true color emoji a working font URI (see the prerequisite
  // note above).
  engine.editor.setSettingBool("ubq://forceSystemEmojis", false);
  engine.editor.setSettingString(
    "ubq://defaultEmojiFontFileUri",
    "https://cdn.img.ly/assets/v4/emoji/NotoColorEmoji.ttf",
  );
  await addGfontsAssetLibrary(engine);
  const pdfBuffer = await fs.readFile("./document.pdf");
  const parser = await PDFParser.fromFile(engine, pdfBuffer.buffer);
  await parser.parse();
  const image = await engine.block.export(
    engine.block.findByType("//ly.img.ubq/page")[0],
    "image/png"
  );
  const imageBuffer = await image.arrayBuffer();
  await fs.writeFile("./example.png", Buffer.from(imageBuffer));
  engine.dispose();
}
main();
Issues
If you encounter any issues or have questions, please don't hesitate to contact us at support@img.ly.
Limitations and Unsupported Features
The PDF importer has some limitations and unsupported features that you should be aware of:
Linked Images
- Only embedded images are supported. External image references are not resolved.
Font Support
- Fonts not available as a typeface asset source fall back through the configured fontStrategy (see Font Strategies above): embedded subset bytes when present, then resolver substitution, then a vector-outline rendering. The default strategy substitutes; configure exactFidelityStrategy to disable substitution.
Complex Vector Paths
- Some complex clipping paths or compound shapes may experience minor distortion.
Annotations and Forms
- PDF annotations, form fields, and interactive elements are not imported.
Transparency Groups
- Advanced transparency group blending modes may not be fully reproduced.
Image SMask Compositing
- Per-pixel soft masks (image-modulated luminosity SMasks) are supported and composited into the image as an RGBA PNG. As a consequence, JPEG images carrying an SMask lose the JPEG pass-through optimization — they are decoded and re-encoded as PNG, which increases file size.
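To show why an SMask forces a re-encode, here is a minimal sketch of the compositing idea: the mask value of each pixel becomes the alpha channel of an RGBA buffer, which a JPEG stream cannot carry. The function name `compositeSoftMask` is hypothetical, and the sketch assumes 8-bit RGB data and an 8-bit grayscale mask of identical dimensions; it ignores complications such as masks at a different resolution than the image and is not the importer's implementation.

```javascript
// Combine 8-bit RGB pixels with an 8-bit grayscale soft mask of the same
// dimensions into RGBA, using the mask value as the alpha channel.
function compositeSoftMask(rgb, mask) {
  const pixelCount = mask.length;
  if (rgb.length !== pixelCount * 3) {
    throw new Error("RGB data and mask must describe the same number of pixels");
  }
  const rgba = new Uint8ClampedArray(pixelCount * 4);
  for (let i = 0; i < pixelCount; i++) {
    rgba[i * 4] = rgb[i * 3]; // red
    rgba[i * 4 + 1] = rgb[i * 3 + 1]; // green
    rgba[i * 4 + 2] = rgb[i * 3 + 2]; // blue
    rgba[i * 4 + 3] = mask[i]; // alpha taken from the soft mask
  }
  return rgba;
}

// Two red pixels: the first fully opaque, the second roughly half transparent.
const rgba = compositeSoftMask(
  new Uint8Array([255, 0, 0, 255, 0, 0]),
  new Uint8Array([255, 128]),
);
console.log(Array.from(rgba).join(",")); // 255,0,0,255,255,0,0,128
```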
Changelog
See CHANGELOG.md for release notes.
License
The software is free for use under the AGPL License.