0.2.0 • Published yesterday

@catlabtech/webcvt-epub

Read-only EPUB (EPUB 3.3 OCF + OPF) reader for webcvt — extract an ebook's metadata and reading order, and convert it to text, HTML, or JSON, entirely client-side.

It is composed from existing hardened webcvt packages rather than re-implementing any ZIP or XML parsing.

Install

npm install @catlabtech/webcvt-core \
  @catlabtech/webcvt-archive-zip \
  @catlabtech/webcvt-data-text \
  @catlabtech/webcvt-epub

Usage

import { parseEpub } from '@catlabtech/webcvt-epub';

const book = await parseEpub(epubBytes);   // async: ZIP entry reads are async
book.metadata;   // { title?, creators: string[], language?, identifier? }
book.spine;      // ordered EpubChapter[] — { href, mediaType, bytes }
book.manifest;   // EpubManifestItem[]
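As a rough illustration of consuming the spine, the sketch below decodes each XHTML chapter to a string in reading order. It works on a locally declared `ChapterLike` shape rather than the package's own types; the `bytes: Uint8Array` field type, the UTF-8 encoding, and the helper name `decodeXhtmlChapters` are assumptions, not part of the documented API.

```typescript
// Hypothetical local shape mirroring the { href, mediaType, bytes } fields
// listed above. That `bytes` is a Uint8Array is an assumption.
interface ChapterLike {
  href: string;
  mediaType: string;
  bytes: Uint8Array;
}

// Decode every XHTML spine item as UTF-8 (assumed encoding), keeping the
// original reading order and skipping non-XHTML items such as SVG.
function decodeXhtmlChapters(
  spine: ChapterLike[],
): { href: string; markup: string }[] {
  const decoder = new TextDecoder('utf-8');
  return spine
    .filter((chapter) => chapter.mediaType === 'application/xhtml+xml')
    .map((chapter) => ({
      href: chapter.href,
      markup: decoder.decode(chapter.bytes),
    }));
}
```

If the assumption about `bytes` holds, passing `book.spine` to this helper would yield one markup string per chapter.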

Convert via the backend (opt-in registration — never auto-registers):

import { convert, defaultRegistry } from '@catlabtech/webcvt-core';
import { EpubBackend } from '@catlabtech/webcvt-epub';

defaultRegistry.register(new EpubBackend());
const txt = await convert(epubBlob, { format: 'txt' });   // spine text, concatenated
const html = await convert(epubBlob, { format: 'html' });  // one HTML document
const json = await convert(epubBlob, { format: 'json' });  // metadata + structure

A conformant EPUB (a ZIP whose first entry is an uncompressed mimetype of application/epub+zip) is recognised by detectFormat; a plain ZIP stays a ZIP.
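The recognition rule above can be sketched by reading the first ZIP local file header directly. This is not the package's `detectFormat` implementation; `looksLikeEpub` is a hypothetical helper that only applies the standard ZIP local-header layout (signature at offset 0, compression method at offset 8, name and extra-field lengths at offsets 26 and 28).

```typescript
const EPUB_MIMETYPE = 'application/epub+zip';

// Hypothetical helper: true when the buffer starts with a stored
// (uncompressed) ZIP entry named "mimetype" whose content begins with
// application/epub+zip.
function looksLikeEpub(bytes: Uint8Array): boolean {
  if (bytes.length < 30) return false; // shorter than a local file header
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint32(0, true) !== 0x04034b50) return false; // "PK\x03\x04"
  if (view.getUint16(8, true) !== 0) return false; // method 0 = stored
  const nameLength = view.getUint16(26, true);
  const extraLength = view.getUint16(28, true);
  const dataStart = 30 + nameLength + extraLength;
  const dataEnd = dataStart + EPUB_MIMETYPE.length;
  if (bytes.length < dataEnd) return false;
  const decoder = new TextDecoder('utf-8');
  const name = decoder.decode(bytes.subarray(30, 30 + nameLength));
  const content = decoder.decode(bytes.subarray(dataStart, dataEnd));
  return name === 'mimetype' && content === EPUB_MIMETYPE;
}
```

A plain ZIP whose first entry has any other name, or a compressed `mimetype` entry, fails this check and would stay classified as a ZIP.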

Security

The reader enforces a 256 MiB input cap, at most 10,000 manifest items, at most 5,000 spine items, a 64 MiB cap on concatenated output, an XML walk limited to depth 64, and rejection of ../ path traversal on every manifest/spine href. ZIP-bomb, XML-bomb, and XXE protection is inherited from the two composed packages.
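To make the item limits and the traversal check concrete, here is a minimal sketch of how such validation could look. The numbers come from the paragraph above, but the constant names, `hasTraversal`, and `checkPackageHrefs` are hypothetical, and the package's actual checks may differ (for example in how percent-encoding is handled).

```typescript
// Limits taken from the Security paragraph; the names are hypothetical.
const MAX_MANIFEST_ITEMS = 10_000;
const MAX_SPINE_ITEMS = 5_000;

// True when any path segment of the (percent-decoded) href is "..".
// Malformed percent-encoding is treated as unsafe, a conservative choice
// made for this sketch.
function hasTraversal(href: string): boolean {
  let decoded: string;
  try {
    decoded = decodeURIComponent(href);
  } catch {
    return true;
  }
  return decoded.split(/[\\/]/).some((segment) => segment === '..');
}

// Throws if either list exceeds its cap or contains a traversing href.
function checkPackageHrefs(manifestHrefs: string[], spineHrefs: string[]): void {
  if (manifestHrefs.length > MAX_MANIFEST_ITEMS) {
    throw new Error('too many manifest items');
  }
  if (spineHrefs.length > MAX_SPINE_ITEMS) {
    throw new Error('too many spine items');
  }
  for (const href of [...manifestHrefs, ...spineHrefs]) {
    if (hasTraversal(href)) {
      throw new Error(`path traversal rejected: ${href}`);
    }
  }
}
```

Splitting on segments rather than searching for the substring `..` keeps legitimate names such as `a..b/c.xhtml` valid.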

Out of scope

EPUB authoring/writing, recursive nav-document / NCX table-of-contents modelling, and font/CSS rendering.

License

MIT
