Search results
May 10, 2014 · I prefer converting PDF files first to HTML using pdftohtml included in the poppler-utils package, for example by means of a file manager context menu script merely consisting of this command: pdftohtml -noframes -q -p -c "$1" "${1%\.pdf}-img.html"
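The `"${1%\.pdf}-img.html"` part of that command is plain shell suffix removal: it strips a trailing `.pdf` from the input name and appends `-img.html`. A minimal sketch of how that could be wrapped, where `html_name` and `pdf_to_html` are hypothetical helper names that do not appear in the original post, and the conversion itself assumes poppler-utils is installed:

```shell
#!/bin/sh
# html_name: hypothetical helper; derives the output file name the same
# way the snippet does, by dropping a trailing ".pdf" and adding "-img.html".
html_name() {
    printf '%s\n' "${1%.pdf}-img.html"
}

# pdf_to_html: hypothetical wrapper around the pdftohtml call quoted above.
# It is only defined here, not invoked, so the sketch runs without poppler.
pdf_to_html() {
    pdftohtml -noframes -q -p -c "$1" "$(html_name "$1")"
}

html_name "report.pdf"   # prints report-img.html
```

A name that does not end in `.pdf` is left unchanged by the expansion, so `html_name notes` would give `notes-img.html`.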
Jul 16, 2020 · I am using the following libreoffice command to convert my PDF files to Word (doc): lowriter --headless --infilter='writer_pdf_import' --convert-to doc:"MS Word 2007 XML" sample.pdf After conversion, every line in the output document is surrounded by a rectangular box. If I use the same command with MS Word 97, it works perfectly.
Oct 29, 2020 · I would like to convert a PDF to a Word or ODF file. I found several programs like Okular, but I don't know how to do that. Edit: It is a text PDF, not an image PDF.
Oct 17, 2018 · I need to batch convert a set of .doc or .docx files to .pdf in terminal, not using a GUI. It would be helpful if I could batch-process multiple files. I would also like to maintain as much meta...
Sep 10, 2011 · Since here the wanted output format is PDF, we have another choice of different paths, provided by what Pandoc is calling a pdf-engine. Here is the list of currently available PDF engines (valid for Pandoc v2.7.2 and later -- previous versions may support only a smaller list):
To get a single page from gm convert, add [N] (with N the page number starting at 0) to the PDF name, i.e. gm convert foo.pdf[11] out.png to get the 12th page from the PDF. For pdftoppm, use -f N -singlefile, where N is the page number starting at 1, i.e. pdftoppm -f 12 -singlefile foo.pdf out for the same page.
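The off-by-one between the two tools is easy to get wrong, so here is a small sketch that takes a 1-based page number and prints both command lines without running them. `page_cmds` is a hypothetical helper name, and the output names `out.png` and `out` are copied from the example above:

```shell
#!/bin/sh
# page_cmds: hypothetical helper; prints (does not execute) the two
# commands from the text for a given PDF and 1-based page number.
page_cmds() {
    pdf=$1
    page=$2               # 1-based, the way a reader counts pages
    idx=$((page - 1))     # gm convert counts pages from 0
    printf 'gm convert %s[%s] out.png\n' "$pdf" "$idx"
    printf 'pdftoppm -f %s -singlefile %s out\n' "$page" "$pdf"
}

page_cmds foo.pdf 12
# prints:
#   gm convert foo.pdf[11] out.png
#   pdftoppm -f 12 -singlefile foo.pdf out
```

Printing the commands first makes it possible to check the page index before either tool touches the file.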
Aug 23, 2019 · It allows all images to have different dimensions, while keeping the output PDF page size the same (A4 in the example below). It centers the images on the PDF page. It allows you to guarantee a minimum space between image borders and PDF page borders, to allow printing the PDF without problems. It does not change the image data.
Feb 19, 2014 · This is not a direct answer to the question, but for people who want to read PDFs on an e-reader and are finding it cumbersome, one solution may be to crop the margins of the PDF, and read in landscape mode. That works acceptably well for me on my e-reader in any case. There are many PDF croppers out there. Just search for "pdf crop" or "pdf trim".
Oct 16, 2023 · Unfortunately, none of the solutions above worked :(. Then I realized that somewhere in the scanner->jpeg->pdf process the size of the page had gotten bloated by a factor of approximately 4. The documents I scanned were all Letter sized, but the PDF reported a much larger print size: identify -verbose doc_orig.pdf | grep "Print size" Print size: 35.4167x48.7222
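The "factor of about 4" can be checked by dividing the reported print size by the Letter dimensions (8.5 x 11 inches). This is a rough sketch that assumes the `identify` figures are in inches; `scale_factor` is a hypothetical helper name, not something from the original post:

```shell
#!/bin/sh
# scale_factor: hypothetical helper; divides a reported dimension by the
# expected one and prints the ratio to two decimal places.
scale_factor() {
    awk -v got="$1" -v want="$2" 'BEGIN { printf "%.2f\n", got / want }'
}

scale_factor 35.4167 8.5    # width:  prints 4.17
scale_factor 48.7222 11     # height: prints 4.43
```

Both ratios come out a little above 4, which matches the poster's estimate.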
The easiest way: use gscan2pdf to import the djvu, then OCR it with tesseract, and finally save it as a pdf. The OCR'd text in the pdf might be slightly different from the original djvu, and the conversion may take a while, but this method is a no-brainer and it works.