How do I turn a collection of XHTML files into a PDF?

irelephant [he/him]🍭@lemm.ee · 9 months ago

How do I turn a collection of XHTML files into a PDF?

we_avoid_temptation@lemmy.zip · 9 months ago

Pretty sure calibre makes this easy if you don’t wanna reinvent the wheel

sirpuppy@lemmy.dbzer0.com · 9 months ago

Came here to say calibre! It works, and the converting is super simple. It takes a little while for PDF files since it's a big file, but it works.

irelephant [he/him]🍭@lemm.ee · 9 months ago

Oh, I already have that installed. I’ll try it.

deegeese@sopuli.xyz · 9 months ago

There are a ton of options depending on your tech level.

How are you with basic Python scripts?

irelephant [he/him]🍭@lemm.ee · 9 months ago

I made the script to rip them in Bash. I know Python, Lua, JS, Bash and PowerShell, so anything using these works.

Daniel Quinn@lemmy.ca · 9 months ago

I’ve used pdfkit to considerable success. It has a few system-level dependencies, but the instructions are pretty straightforward:

# apt-get install wkhtmltopdf
$ pip install pdfkit
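
For a whole folder of XHTML files, something like the sketch below might be a starting point. It relies on `pdfkit.from_file` accepting a list of input files and writing them into a single PDF. The helper names (`natural_key`, `collect_xhtml`, `build_pdf`) and the folder and output names are made up for illustration. The `enable-local-file-access` option is an assumption about your setup: newer wkhtmltopdf builds block local stylesheets and images unless it is passed.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that puts 'ch2' before 'ch10' by comparing digit runs as ints."""
    # re.split with a capture group alternates text, digits, text, ...
    # so odd positions are always digit runs.
    parts = re.split(r"(\d+)", Path(path).name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def collect_xhtml(directory):
    """Return the .xhtml files in `directory` in natural (human) order."""
    return sorted((str(p) for p in Path(directory).glob("*.xhtml")), key=natural_key)


def build_pdf(directory, output_pdf):
    """Render every .xhtml file in `directory` into one PDF via pdfkit."""
    # Imported here so the helpers above still work without pdfkit installed.
    import pdfkit

    files = collect_xhtml(directory)
    if not files:
        raise FileNotFoundError(f"no .xhtml files found in {directory}")

    # Assumption: local CSS and images are referenced by the pages, which
    # recent wkhtmltopdf versions refuse to load without this flag.
    options = {"enable-local-file-access": None}
    pdfkit.from_file(files, output_pdf, options=options)
```

Usage would be something like `build_pdf("ripped_pages", "book.pdf")`, where both names are placeholders. The natural sort is only there so that numbered pages come out in reading order; if your files are already zero-padded, a plain `sorted()` does the same job.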

deegeese@sopuli.xyz · 9 months ago

Surely you can figure out how to use existing libraries for this task, or is there something you’re stuck on?

irelephant [he/him]🍭@lemm.ee · 9 months ago

Can’t really find many good ones. Google isn’t returning much, just PDFs about Python libraries and the odd abandoned GitHub repo.

deegeese@sopuli.xyz · 9 months ago

I’d start with wkhtmltopdf/pdfkit

irelephant [he/him]🍭@lemm.ee · 6 months ago

Just coming back to this a bit later: wkhtmltopdf is abandoned, so are there any alternatives? It works fine for now, but it may not in future.

Cousin Mose@lemmy.hogru.ch · edited 9 months ago

In a production web app I use Gotenberg. It’s definitely overkill for the task at hand, but if you find yourself doing this often I would highly recommend it. It’s dead easy to convert HTML (and I imagine XHTML) to PDF.
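
As a rough sketch of what that looks like from Python using only the standard library: Gotenberg's Chromium HTML route takes a multipart form upload whose main file must be named `index.html`, and returns the PDF in the response body. The URL below assumes a Gotenberg instance running locally on its default port 3000, and the helper names (`encode_multipart`, `xhtml_to_pdf`) are made up for illustration. Uploading an XHTML file under the name `index.html` is an untested assumption here, and any linked stylesheets or images would need to be sent as extra form files, which this sketch does not do.

```python
import urllib.request
import uuid
from pathlib import Path

# Assumption: Gotenberg is running locally on its default port.
GOTENBERG_URL = "http://localhost:3000/forms/chromium/convert/html"


def encode_multipart(filename, content, boundary):
    """Build a multipart/form-data body with one file field named 'files'."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        "Content-Type: text/html\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail


def xhtml_to_pdf(source, output_pdf, url=GOTENBERG_URL):
    """POST one page to Gotenberg and save the PDF it sends back."""
    boundary = uuid.uuid4().hex
    # The route expects the main document to be called index.html.
    body = encode_multipart("index.html", Path(source).read_bytes(), boundary)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        Path(output_pdf).write_bytes(response.read())
```

This converts one page per request, so a collection would need a loop plus a separate step to join the resulting PDFs.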

rnercle@sh.itjust.works · 9 months ago

Scribus may help you with that

Moonrise2473@feddit.it · 9 months ago

If when opened with a browser they have the right stylesheet, you can pirate m0nkrus’ acrobat pro, then select all => right click => convert to pdf