[Lab] PDF to webpages conversion

Andrew Plumb andrew at plumb.org
Sat Oct 3 10:23:08 EDT 2015


There’s also this pdf2htmlEX project:

https://github.com/coolwanglu/pdf2htmlEX

Seems to be under active development, and is GPL-licensed open source.
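
For the "index page plus one page per section" layout Richard asked about, here is a minimal sketch in Python. It is only an illustration, not how pdf2htmlEX or pdftohtml work internally. It assumes the section titles and text have already been pulled out of the PDF by some other tool; the names build_site and slugify, and the file naming scheme, are made up for this example.

```python
from html import escape
from pathlib import Path
import re


def slugify(title):
    """Turn a section title into a safe, lowercase file-name fragment."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"


def build_site(sections, out_dir):
    """Write one HTML page per (title, body) pair plus an index.html.

    `sections` is assumed to be a list of (title, plain_text_body) tuples
    that were extracted from the PDF beforehand. Returns the list of
    section file names in order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    filenames = []
    links = []
    for number, (title, body) in enumerate(sections, start=1):
        filename = f"{number:02d}-{slugify(title)}.html"
        safe_title = escape(title)
        # Blank lines in the extracted text are treated as paragraph breaks.
        paragraphs = "\n".join(
            f"<p>{escape(chunk.strip())}</p>"
            for chunk in body.split("\n\n")
            if chunk.strip()
        )
        page = (
            "<!DOCTYPE html>\n"
            f'<html><head><meta charset="utf-8"><title>{safe_title}</title></head>\n'
            "<body>\n"
            f"<h1>{safe_title}</h1>\n"
            f"{paragraphs}\n"
            '<p><a href="index.html">Back to index</a></p>\n'
            "</body></html>\n"
        )
        (out / filename).write_text(page, encoding="utf-8")
        filenames.append(filename)
        links.append(f'<li><a href="{filename}">{safe_title}</a></li>')

    # Plain <a href> links on the index page are what a crawler follows.
    index = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Index</title></head>\n'
        "<body>\n<h1>Index</h1>\n<ol>\n"
        + "\n".join(links)
        + "\n</ol>\n</body></html>\n"
    )
    (out / "index.html").write_text(index, encoding="utf-8")
    return filenames
```

Calling build_site([("Introduction", "Some text...")], "site") would write site/index.html and site/01-introduction.html. Because every section is an ordinary static page linked from the index, a crawler only needs to find index.html to reach the rest.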

Andrew.

> On Oct 2, 2015, at 11:54 PM, Jason Cobill <jason.cobill at gmail.com> wrote:
> 
> 
> Google crawls PDFs! You just need a page to link to it. It'll index the entire contents. I can't guarantee that the search ranking will be very high.
> 
> You could try PDFtoHTML http://pdftohtml.sourceforge.net/ or a web service like http://www.pdftohtml.net/ (be careful where you click) if you're really set on converting it.
> 
> 
> 
> On Fri, Oct 2, 2015 at 6:36 PM, Richard Sloan <rsloan at themindfactory.com> wrote:
> Is anyone able to recommend a method for taking a PDF that has an index on its first few pages and creating webpages from it: an index page, plus a separate page for each indexed section of the PDF? I would like a method that Google can easily crawl. I think (I could be wrong, however) that the online PDF-type readers would not be indexed properly by Google.
> 
> Thanks in advance!
> Richard.
> 
> 
> _______________________________________________
> Lab mailing list
> 1. subscribe http://artengine.ca/mailman/listinfo/lab
> 2. then email Lab at artengine.ca to send your message to the list
> 
> 
