I ripped a lot of XHTML files from a crappy online ebook reader. How do I combine these into a PDF?
Scribus may help you with that
If they pick up the right stylesheet when opened in a browser, you can pirate m0nkrus’ Acrobat Pro, then select all => right click => convert to PDF.
There are a ton of options depending on your tech level.
How are you with basic Python scripts?
I made the script to rip them in bash. I know python, lua, js, bash and powershell, anything using these works.
Surely you can figure out how to use existing libraries for this task, or is there something you’re stuck on?
Can’t really find many good ones. Google isn’t returning much, just PDFs about Python libraries and the odd abandoned GitHub repo.
I’d start with wkhtmltopdf/pdfkit
I’ve used pdfkit to considerable success. It has a few system-level dependencies, but the instructions are pretty straightforward:
# apt-get install wkhtmltopdf
$ pip install pdfkit
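For what it's worth, here is a minimal sketch of how the combining step might look with pdfkit. It assumes the ripped files sit in one folder and that their filenames carry the page order (page1.xhtml, page2.xhtml, ...). The helper names (natural_key, collect_pages, build_pdf) and the folder name are made up for illustration. pdfkit.from_file accepts a list of input files and writes them into a single PDF. I haven't tested how wkhtmltopdf copes with strict XHTML, so treat that part as an assumption.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that puts 'page2' before 'page10'."""
    parts = re.split(r"(\d+)", Path(path).name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def collect_pages(directory, pattern="*.xhtml"):
    """Return the matching files in reading order, as strings."""
    return [str(p) for p in sorted(Path(directory).glob(pattern), key=natural_key)]


def build_pdf(directory, output="book.pdf"):
    """Render every page in `directory` into one PDF."""
    import pdfkit  # third-party; needs the wkhtmltopdf binary on PATH

    pages = collect_pages(directory)
    if not pages:
        raise FileNotFoundError(f"no .xhtml files found in {directory}")

    # Assumption: a recent wkhtmltopdf build, which blocks local stylesheets
    # and images unless this flag is set. Drop it if your build rejects it.
    options = {"enable-local-file-access": ""}
    pdfkit.from_file(pages, output, options=options)
```

Calling build_pdf("ripped_pages") should then write book.pdf to the current directory. The natural sort is there because a plain alphabetical sort would put page10 ahead of page2.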
In a production web app I use Gotenberg. It’s definitely overkill for the task at hand, but if you find yourself doing this often I would highly recommend it. It’s dead easy to convert HTML (and I imagine XHTML) to PDF.