Translate PDF file using Google Translate API


I want to use Google Translate in my project. I have completed the formalities with Google and have an API key. With the key I can translate words from JavaScript. How can I translate a PDF file, the way the Google Translate site does? I found one thing like this:

http://translate.google.com/translate?hl=fr&sl=auto&tl=en&u=http://www.example.com/pdf.pdf

But here I cannot use my key, and the result takes time to translate. I want to use my key to translate the PDF file. Please help me out. My approach is this:

1. I have one HTML page.
2. It has one browse button for the PDF.
3. Upload the file.
4. Translate the PDF with the Google API and show it in the HTML page.

I searched for PDF translation but did not find anything. Please help me out.

TL;DR: use a headless browser to render the PDF through Google's PDF translation service.

PDF is a complex format and can include many components other than text. To translate it, I will describe solutions from the easiest one to more advanced ones.

Translate raw text

If you only need the translation without any visual output, you can extract the text and give it to Google Translate.

Since you did not provide information on your project (language, environment, ...) I will redirect you to a thread on how to extract the text.
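Once the text is extracted, sending it to the API could look like the sketch below. It assumes the Cloud Translation v2 REST endpoint with API-key authentication. The helper names `chunkText` and `buildTranslateRequest`, and the `MAX_CHARS` chunk size, are my own assumptions and not official limits. Check the current quota documentation before relying on them.

```javascript
// Assumed conservative chunk size; NOT an official API limit.
var MAX_CHARS = 4000;

// Split text into chunks of at most maxChars, preferring paragraph boundaries.
function chunkText(text, maxChars) {
  var paragraphs = text.split(/\n\s*\n/);
  var chunks = [];
  var current = '';
  paragraphs.forEach(function (p) {
    p = p.trim();
    if (!p) return;
    // A single paragraph longer than maxChars is hard-split.
    while (p.length > maxChars) {
      if (current) { chunks.push(current); current = ''; }
      chunks.push(p.slice(0, maxChars));
      p = p.slice(maxChars);
    }
    if (current && current.length + 2 + p.length > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? current + '\n\n' + p : p;
    }
  });
  if (current) chunks.push(current);
  return chunks;
}

// Build the URL and fetch options for one v2 translate call.
function buildTranslateRequest(chunks, target, apiKey) {
  return {
    url: 'https://translation.googleapis.com/language/translate/v2?key=' +
      encodeURIComponent(apiKey),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ q: chunks, target: target, format: 'text' })
    }
  };
}

// Usage (needs network access and a valid key; pdfText is the extracted text):
// var req = buildTranslateRequest(chunkText(pdfText, MAX_CHARS), 'en', 'YOUR_API_KEY');
// fetch(req.url, req.options)
//   .then(function (res) { return res.json(); })
//   .then(function (json) {
//     json.data.translations.forEach(function (t) { console.log(t.translatedText); });
//   });
```

Keep in mind that a key embedded in client-side JavaScript is visible to anyone who opens the page, so you may prefer to make this call from your server.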

Translate text

If you need the translated text in the PDF itself, that's pretty hard. To avoid the headache (partially) you can convert the PDF to an image (using ImageMagick tools or similar) and then you have 3 options:

  • OCR the text from the image and give it to Google, again losing the original form.
  • OCR the text while saving its position (some libraries can do that; again, since you did not specify project information, see these links: #1, #2, #3, #4).

    Then translate it with the Google API and write the result back onto the image. For great results you need to take into account the text font, color and background color. It is pretty difficult, but feasible.

  • Translate the image using the Google Translate image service. Unfortunately this feature is not available in the public API, so unless you do some reverse engineering, it is not possible.
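For the second option, one piece you will likely need is grouping the OCR word boxes into lines, so that each line can be translated as a unit and drawn back at its original position. Here is a minimal sketch of that step. The `{text, x, y, w, h}` box shape and the `groupIntoLines` name are assumptions for illustration; adapt them to whatever your OCR library returns.

```javascript
// Group OCR word boxes into text lines, keeping one bounding box per line.
// ASSUMPTION: each word is {text, x, y, w, h} with (x, y) the top-left corner.
// tolerance is the maximum vertical offset (in pixels) within a single line.
function groupIntoLines(words, tolerance) {
  // Sort top-to-bottom, then left-to-right.
  var sorted = words.slice().sort(function (a, b) {
    return a.y - b.y || a.x - b.x;
  });

  var lines = [];
  sorted.forEach(function (word) {
    var line = lines[lines.length - 1];
    if (line && Math.abs(word.y - line.y) <= tolerance) {
      line.words.push(word);
    } else {
      lines.push({ y: word.y, words: [word] });
    }
  });

  return lines.map(function (line) {
    // Restore reading order inside the line.
    line.words.sort(function (a, b) { return a.x - b.x; });
    var left = Math.min.apply(null, line.words.map(function (w) { return w.x; }));
    var top = Math.min.apply(null, line.words.map(function (w) { return w.y; }));
    var right = Math.max.apply(null, line.words.map(function (w) { return w.x + w.w; }));
    var bottom = Math.max.apply(null, line.words.map(function (w) { return w.y + w.h; }));
    return {
      text: line.words.map(function (w) { return w.text; }).join(' '),
      x: left, y: top, w: right - left, h: bottom - top
    };
  });
}
```

Each returned line carries the text to send to the translation API and the rectangle to paint over and write into. Choosing the font and colors is still up to you.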

Translate using Google's PDF translation service

The solution you provide, using the translate site, can be automated quite easily. The reason it takes so long is that it is a heavy process, and you probably won't beat Google at it.

Using a headless browser, you can open the translation page for your PDF, observe that the translated content sits in an iframe, open that iframe and print it to PDF.

Here is a short example using SlimerJS (it should be compatible with PhantomJS):

    var page = require("webpage").create();

    // Here you may want to set up the page size and other options

    // Open the translation page
    page.open('https://translate.google.fr/translate?hl=fr&sl=en&u=http://example.com/pdf-sample.pdf', function(status) {
        if (status !== 'success') {
            console.log('Unable to access network');
            // Exit here too, otherwise the script never terminates
            phantom.exit(1);
        } else {
            // Find the iframe with querySelector
            var iframe_src = page.evaluate(function() {
                return document.querySelector('#contentframe').querySelector('iframe').src;
            });

            console.log('Found iframe: ' + iframe_src);

            // Render the iframe
            page.open(iframe_src, function(status) {
                // Wait a bit for the JavaScript to translate the page.
                // This can be optimized to be triggered from JavaScript when the translation is done.
                setTimeout(function() {
                    // Print the page to PDF
                    page.render('/tmp/test.pdf', { format: 'pdf' });

                    phantom.exit(0);
                }, 2000);
            });
        }
    });

Given this file: http://www.cbu.edu.zm/downloads/pdf-sample.pdf
it produces this result (translated into French): (I posted a screenshot since I cannot embed the PDF ;) )
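Two small helpers can make the script above more reusable. This is only a sketch: `buildTranslateUrl` and `waitFor` are names I made up, the URL parameters simply mirror the ones in the question (`sl`, `tl`, `u`) and are assumed to still be accepted by the site, and the readiness check in the usage comment is a hypothetical placeholder you would need to adapt to the translated page.

```javascript
// Build the translate-site URL for a publicly reachable PDF.
// ASSUMPTION: the sl/tl/u query parameters from the question are still accepted.
function buildTranslateUrl(pdfUrl, sourceLang, targetLang) {
  return 'https://translate.google.com/translate' +
    '?sl=' + encodeURIComponent(sourceLang) +
    '&tl=' + encodeURIComponent(targetLang) +
    '&u=' + encodeURIComponent(pdfUrl);
}

// Poll check() every intervalMs until it returns true, then call onReady().
// If timeoutMs elapses first, call onTimeout() instead.
function waitFor(check, onReady, onTimeout, timeoutMs, intervalMs) {
  var start = Date.now();
  var timer = setInterval(function () {
    if (check()) {
      clearInterval(timer);
      onReady();
    } else if (Date.now() - start >= timeoutMs) {
      clearInterval(timer);
      onTimeout();
    }
  }, intervalMs);
}

// Usage inside the second page.open callback, replacing the fixed 2 s delay.
// The check below is a HYPOTHETICAL placeholder for "translation is done":
// waitFor(function () {
//   return page.evaluate(function () {
//     return document.body.innerText.length > 0;
//   });
// }, function () {
//   page.render('/tmp/test.pdf', { format: 'pdf' });
//   phantom.exit(0);
// }, function () {
//   console.log('Timed out waiting for the translation');
//   phantom.exit(1);
// }, 10000, 250);
```

Polling with a timeout avoids both rendering too early on slow documents and waiting needlessly on fast ones.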

