Monday, September 19, 2022

[FIXED] How do I create a better quality image converting a .pdf to a .jpg with Imagick/PHP?

Tags: imagick, pdf, php

Issue

I currently have a single-page PDF (http://reljac.com/so_1/all.pdf) which is a basic scan of several paper receipts. If you look at the PDF, the text is clear and legible. The original is a scan of an 8.5" x 11" sheet of paper (which shouldn't matter).

I've created a very simple file to convert that PDF into a .jpg using this code:

<?php     
    $im = new imagick('all.pdf[0]');
    $im->setImageFormat('jpg');
    $im->setImageCompression(imagick::COMPRESSION_LOSSLESSJPEG); 
    $im->setImageCompressionQuality(80);
    header('Content-Type: image/jpeg');
    echo $im;
?>

When I run that (http://reljac.com/so_1/pdf_jpg.php) the resulting image is illegible.

I'm working with two servers at the moment. One reports:

Version: ImageMagick 6.2.8 10/06/10 Q16 file:/usr/share/ImageMagick-6.2.8/doc/index.html

the other:

Version: ImageMagick 6.6.0-4 2012-05-02 Q16 http://www.imagemagick.org

Both servers produce a .jpg of similar quality.

I've changed several of the settings including:

  • $im->setImageCompressionQuality(40);
  • $im->setImageCompressionQuality(100);
  • $im->setImageCompressionQuality(80);
  • $im->setImageCompression(imagick::COMPRESSION_JPEG); (various others from http://www.php.net/manual/en/imagick.constants.php)

I've tried adding $im->scaleImage(600,0);

Nothing seems to make anything more legible. I'd like the end result to be a legible .jpg of the original PDF - it does not have to fill the screen, it just needs to be legible. The original PDFs may be different sizes so I need to keep in mind that the source is not always 8.5" x 11".

Is there anything else I can do to enhance the quality of the resulting image or is this the best I should expect? Do I need to process these files in some other way to get a better image?

UPDATE: Based on @VadimR's answer, I'm now using the following:

$src = 'all.pdf';
$src_parts = pathinfo($src);

shell_exec('pdfimages ' . $src . ' ' . $src_parts['filename']);
shell_exec('convert ' . $src_parts['filename'] . '-000.pbm -resize 25% -sharpen -2 ' . $src_parts['filename'] . '.jpg');

$myImage = imagecreatefromjpeg($src_parts['filename'] . '.jpg');
header("Content-type: image/jpeg");
imagejpeg($myImage);
imagedestroy($myImage);

shell_exec('rm ' . $src_parts['filename'] . '-000.pbm');

That results in a nice, legible image.


Solution

First, ImageMagick delegates PDF rendering to Ghostscript, so when troubleshooting, report the Ghostscript version as well as the ImageMagick version. Second, I think it's better to start on the command line and move the command into PHP code only once the quality is acceptable.
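
As a rough sketch of how both versions could be collected from PHP for a bug report: the function names below are made up for illustration, and the code assumes the Imagick extension is loaded and a `gs` binary is on the PATH. `Imagick::getVersion()` returns an array containing a `versionString` entry; the helper simply pulls the version number out of a banner like the ones quoted in the question.

```php
<?php
// Extract the bare version number from an ImageMagick version banner.
// Returns null when the text does not look like an ImageMagick banner.
// (Hypothetical helper, not part of Imagick.)
function extractImageMagickVersion(string $banner): ?string
{
    if (preg_match('/ImageMagick\s+(\d+(?:\.\d+)+(?:-\d+)?)/', $banner, $m)) {
        return $m[1];
    }
    return null;
}

// Collect both versions in one array.
// Assumption: the Imagick extension is loaded and `gs` is on the PATH.
function describeRenderingStack(): array
{
    $im = Imagick::getVersion();        // array with a 'versionString' entry
    $gs = shell_exec('gs --version');   // null when gs is missing or prints nothing
    return [
        'imagemagick' => extractImageMagickVersion($im['versionString']),
        'ghostscript' => is_string($gs) ? trim($gs) : null,
    ];
}
```

Calling `describeRenderingStack()` on each server would show whether the two machines differ in Ghostscript as well as in ImageMagick.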

This command gives acceptable quality (more or less):

convert -density 300 all.pdf out.jpg

Here the rendering resolution is set to 300 dpi. Note that this is not the same as

convert all.pdf -density 300 out.jpg

because here the PDF is rendered at the default 72 dpi, and the resulting low-quality image is then merely tagged as 300 dpi, without any resampling.
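
The same ordering matters in the Imagick extension. The following is a minimal sketch, not a tested drop-in: it assumes that calling `setResolution()` on an empty `Imagick` object before `readImage()` plays the role of putting `-density` in front of the input file. The function names, the 300 dpi default and the quality value are illustrative choices. The small helper only shows how much pixel data each density yields for a letter-size page.

```php
<?php
// Pixel length of one page side when rendered at a given density.
// Example: an 8.5 in wide page is 612 px at 72 dpi and 2550 px at 300 dpi.
function pixelsAtDensity(float $inches, int $dpi): int
{
    return (int) round($inches * $dpi);
}

// Render one PDF page to JPEG bytes at the requested density.
// Assumption: the Imagick extension is loaded and Ghostscript is
// available to ImageMagick for reading PDF input.
function renderPdfPageAsJpeg(string $pdfPath, int $page = 0, int $dpi = 300, int $quality = 80): string
{
    $im = new Imagick();
    // Set the density BEFORE reading, so the page is rasterised at $dpi
    // rather than rendered at 72 dpi and relabelled afterwards.
    $im->setResolution($dpi, $dpi);
    $im->readImage(sprintf('%s[%d]', $pdfPath, $page));
    $im->setImageFormat('jpeg');
    $im->setImageCompression(Imagick::COMPRESSION_JPEG);
    $im->setImageCompressionQuality($quality);
    $bytes = $im->getImageBlob();
    $im->clear();
    return $bytes;
}
```

To serve the result directly, send `header('Content-Type: image/jpeg');` and then `echo renderPdfPageAsJpeg('all.pdf');`. Because the density is applied per inch, this also works for sources that are not 8.5" x 11".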

However, I think a better approach is to extract the scanned image as is, i.e. without any transformation:

pdfimages all.pdf all

This gives the image all-000.pbm: 1 bit per sample, 3424 x 4400 px. I definitely can't agree that the "text is clear and legible"; some digits can only be guessed.

Then use the convert command to resample it and perhaps improve it, e.g.

convert all-000.pbm -resize 25% -sharpen 2 out.jpg
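
If this pipeline is later driven from PHP, one possible shape is sketched below. The function names are hypothetical, and the code assumes that `pdfimages` and `convert` are on the PATH and that the first extracted image is a 1-bit scan written as `<prefix>-000.pbm`, as in this case (other image types would get a different extension). Every path goes through `escapeshellarg()` so that unusual file names cannot alter the command. At 25%, the 3424 x 4400 px scan comes out at 856 x 1100 px.

```php
<?php
// Build the two commands with every path shell-escaped.
// $prefix is the image root that pdfimages extends to "<prefix>-000.pbm".
function buildExtractCommands(string $pdf, string $prefix, string $jpg, int $percent = 25): array
{
    $pbm = $prefix . '-000.pbm';   // first image found in the PDF (assumed 1-bit)
    return [
        'extract' => sprintf('pdfimages %s %s', escapeshellarg($pdf), escapeshellarg($prefix)),
        'convert' => sprintf(
            'convert %s -resize %d%% -sharpen 2 %s',
            escapeshellarg($pbm),
            $percent,
            escapeshellarg($jpg)
        ),
        'pbm' => $pbm,
    ];
}

// Run both steps, then remove the intermediate bitmap.
// Returns true when the JPEG exists afterwards.
function extractScanAsJpeg(string $pdf, string $jpg): bool
{
    $prefix = pathinfo($pdf, PATHINFO_FILENAME);
    $cmd = buildExtractCommands($pdf, $prefix, $jpg);
    shell_exec($cmd['extract']);
    shell_exec($cmd['convert']);
    if (is_file($cmd['pbm'])) {
        unlink($cmd['pbm']);       // PHP-side cleanup instead of shelling out to rm
    }
    return is_file($jpg);
}
```

A call such as `extractScanAsJpeg('all.pdf', 'all.jpg')` would write the intermediate file into the current working directory, so the web server user needs write access there.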


Answered By - user2846289
Answer Checked By - Marie Seifert (PHPFixing Admin)