File algorithms and architecture

An open-source, browser-first stack — how Ask Jeeves detects formats, routes conversions, and lazy-loads heavy libraries.


Ask Jeeves is an open-source project (MIT license) built as a static Astro site. Conversions run in your browser, not on a server. This article outlines the architecture and the libraries that power each format.

What you can do

You can convert images, spreadsheets, CSV/JSON, Word documents, PDFs, and plain text — all from one page. Pick a file, choose an output format, and download the result.

Architecture overview

Every conversion follows the same path:

  1. Detect — infer format from extension and MIME type (src/lib/conversions/detect.ts)
  2. Registry — look up valid outputs in a central conversion table (src/lib/conversions/registry.ts)
  3. Processor — run the matching handler in memory (eager or lazy-loaded)
  4. Download — return a new Blob for save; nothing is stored on our servers
File in → detect → registry → processor → Blob out
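
As an illustration of the detect step, here is a minimal sketch. The `Format` names, the lookup tables, and the `detectFormat` function are assumptions made for this example and are not copied from `src/lib/conversions/detect.ts`. The sketch checks the extension first and falls back to the MIME type; the real module may weigh the two differently.

```typescript
// Hypothetical format identifiers; the real detect.ts may use different names.
type Format =
  | "png" | "jpeg" | "webp"
  | "csv" | "json" | "txt"
  | "docx" | "pdf" | "xlsx";

// Extension lookup (lowercase, without the dot).
const BY_EXTENSION = new Map<string, Format>([
  ["png", "png"], ["jpg", "jpeg"], ["jpeg", "jpeg"], ["webp", "webp"],
  ["csv", "csv"], ["json", "json"], ["txt", "txt"],
  ["docx", "docx"], ["pdf", "pdf"], ["xlsx", "xlsx"],
]);

// MIME type lookup, used when the extension is missing or unknown.
const BY_MIME = new Map<string, Format>([
  ["image/png", "png"],
  ["image/jpeg", "jpeg"],
  ["image/webp", "webp"],
  ["text/csv", "csv"],
  ["application/json", "json"],
  ["text/plain", "txt"],
  ["application/pdf", "pdf"],
  ["application/vnd.openxmlformats-officedocument.wordprocessingml.document", "docx"],
  ["application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "xlsx"],
]);

// Returns the detected format, or null when neither signal is recognized.
function detectFormat(fileName: string, mimeType: string): Format | null {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  return BY_EXTENSION.get(ext) ?? BY_MIME.get(mimeType.toLowerCase()) ?? null;
}
```

In a browser, the two arguments would typically come from a `File` object's `name` and `type` properties.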

Lighter jobs (images, CSV, Word) load with the main page. Heavier ones (PDF and Excel) load extra code only when you need them, so the first visit stays fast.

Open source

The project is MIT-licensed. You can inspect the conversion registry, processors, and UI flow in the GitHub repository. We use well-known libraries rather than proprietary black boxes — see the table below.

Libraries we build on

| Area | Library | Role |
| --- | --- | --- |
| Images | Canvas API | Resize, encode PNG/JPEG/WebP |
| PDF edit | pdf-lib | Copy page ranges, split PDFs |
| PDF render | PDF.js | Rasterize pages to PNG |
| Word | mammoth.js | DOCX → HTML or plain text |
| Spreadsheets | SheetJS (xlsx) | XLSX → CSV/JSON |
| CSV/JSON | PapaParse | Parse and serialize tabular data |
| Archives | JSZip | Bundle multi-page PDF/PNG exports |

Each processor is a thin wrapper around these tools, wired through the registry so the UI only shows valid source → target pairs.
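
To make the registry idea concrete, the sketch below shows one way a conversion table could drive the list of valid targets. The `Conversion` shape, the `REGISTRY` entries, and `validTargets` are hypothetical and only cover a few of the supported pairs; the actual `registry.ts` may be structured differently.

```typescript
// One row per supported source → target pair (hypothetical shape).
interface Conversion {
  from: string;
  to: string;
  lazy: boolean; // true when the processor lives in a separately loaded chunk
}

// A small illustrative subset, not the full table.
const REGISTRY: Conversion[] = [
  { from: "png", to: "jpeg", lazy: false },
  { from: "png", to: "webp", lazy: false },
  { from: "csv", to: "json", lazy: false },
  { from: "json", to: "csv", lazy: false },
  { from: "docx", to: "html", lazy: false },
  { from: "docx", to: "txt", lazy: false },
  { from: "pdf", to: "png", lazy: true },
  { from: "xlsx", to: "csv", lazy: true },
  { from: "xlsx", to: "json", lazy: true },
];

// The UI can call this to populate the output-format picker.
function validTargets(from: string): string[] {
  return REGISTRY.filter((c) => c.from === from).map((c) => c.to);
}
```

Because the picker is filled from the same table the processors are wired through, an unsupported pair never reaches a processor.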

Why lazy-load PDF and Excel

PDF.js and SheetJS are powerful but large. Loading them on every page visit would slow down users who only need a quick image or CSV conversion.

Instead, the registry marks those conversions as lazy: the chunk downloads when you first run a PDF or XLSX job. After that, the browser caches it for repeat use.
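
A common way to get this behavior is a dynamic `import()` wrapped in a small memoizing helper. The `lazy` helper below is an assumption for illustration, not the project's actual code. It keeps the module promise in memory for the current page session, which is separate from the browser's HTTP cache mentioned above.

```typescript
// Wraps a loader so the underlying work starts at most once per session.
// If loading fails, the cache is cleared so a later call can retry.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = loader().catch((err) => {
        cached = undefined;
        throw err;
      });
    }
    return cached;
  };
}

// Possible usage in a bundled app (shown as comments because it needs the
// "xlsx" package and a bundler that splits dynamic imports into chunks):
//
//   const loadXlsx = lazy(() => import("xlsx"));
//   const XLSX = await loadXlsx(); // chunk is fetched on the first call only
```

Bundlers such as Vite, which Astro uses, generally emit a separate chunk for each dynamic `import()`, so the heavy library stays out of the initial page load.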

Supported conversions

Every conversion below runs entirely in your browser:

| From | To | What it does |
| --- | --- | --- |
| PNG, JPEG, WebP | Each other | Change image format; optional resize and quality |
| CSV | JSON | Turn rows into structured data |
| JSON | CSV | Turn structured data into a spreadsheet-friendly file |
| CSV | XLSX | Build an Excel workbook from comma-separated data |
| TXT | TXT | Normalize line endings (Windows ↔ Mac/Linux) |
| DOCX | HTML or plain text | Extract readable content from Word files |
| PDF | PDF (subset) | Copy a page range into a new PDF |
| PDF | ZIP of PDFs | One PDF per page, bundled |
| PDF | PNG (ZIP if many) | Rasterize pages as images |
| XLSX | CSV or JSON | Export sheet data |
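
The TXT → TXT row is the simplest conversion to show in full. The function below is a sketch of line-ending normalization, with `normalizeLineEndings` and the `LineEnding` type as illustrative names; the site's own processor may handle edge cases differently.

```typescript
// "lf" is the Mac/Linux convention (\n); "crlf" is the Windows one (\r\n).
type LineEnding = "lf" | "crlf";

function normalizeLineEndings(text: string, target: LineEnding): string {
  // Collapse CRLF and lone CR to LF first so mixed input ends up consistent.
  const lf = text.replace(/\r\n?/g, "\n");
  return target === "crlf" ? lf.replace(/\n/g, "\r\n") : lf;
}
```

Converting to LF first avoids doubling carriage returns when the input already contains some Windows-style line endings.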

Privacy and limits

  • Privacy: Processing uses your CPU and RAM locally. See Data privacy for the full guarantee.
  • File size: Default maximum is 50 MB per file (configurable when the site is built).
  • PDF jobs: Operations are capped at 50 pages per run to keep the tab responsive.
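
These limits can be enforced with a small check before a job starts. The sketch below is hypothetical: the constant names and `checkLimits` are made up for this example, 50 MB is assumed to mean 50 × 1024 × 1024 bytes, and an over-limit PDF is rejected outright, whereas the site might instead process only part of the document.

```typescript
// Defaults taken from the limits listed above; the byte interpretation of
// "50 MB" is an assumption.
const MAX_FILE_BYTES = 50 * 1024 * 1024;
const MAX_PDF_PAGES = 50;

// Returns a user-facing message when a limit is exceeded, or null when the
// job may proceed. pdfPageCount is only passed for PDF jobs.
function checkLimits(sizeBytes: number, pdfPageCount?: number): string | null {
  if (sizeBytes > MAX_FILE_BYTES) {
    return `File is larger than ${MAX_FILE_BYTES / (1024 * 1024)} MB.`;
  }
  if (pdfPageCount !== undefined && pdfPageCount > MAX_PDF_PAGES) {
    return `PDF jobs are limited to ${MAX_PDF_PAGES} pages per run.`;
  }
  return null;
}
```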

What powers the site

Ask Jeeves is built with Astro as a static site and hosted on Cloudflare Pages. Conversion logic uses browser APIs (File, Blob, Canvas) plus the libraries above — each covered in the format-specific articles below.

See also