Issue
For example, if I upload an image to Imgur twice and once to another website, there's a fair chance that all three files will have different checksums. JPEG is lossy, so I can't simply check whether the pixels match.
How do I check whether I have the same picture encoded differently? I don't want to write an algorithm; I want to use a library or an offline app via the CLI.
Additional information: I'd prefer differently cropped versions to count as different pictures, but for my use case it won't matter (and I can simply check the width and height if I want that?)
Solution
There is an Imgur bot called "repoststatistics" that uses dHash to compare images.
How the bot works: https://www.hackerfactor.com/blog/index.php?/archives/2013/01/21.html
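To give a rough feel for the idea, here is a minimal pure-Python sketch of a row-wise difference hash: shrink the grayscale image to 9x8, compare each pixel with its right-hand neighbour, and pack the 64 comparisons into an integer. This is only an illustration under my own assumptions, not the bot's code or the API of either library linked here. The names `shrink`, `dhash` and `hamming` are made up for the example, the block-average resize is a stand-in for a proper image resize, and the input is assumed to be a grid of grayscale values that you have already decoded with some imaging library.

```python
def shrink(gray, out_w, out_h):
    """Downscale a 2D grid of grayscale values by averaging blocks.

    A crude stand-in for a real image resize (assumption of this sketch).
    """
    in_h, in_w = len(gray), len(gray[0])
    out = []
    for y in range(out_h):
        y0 = y * in_h // out_h
        y1 = max(y0 + 1, (y + 1) * in_h // out_h)
        row = []
        for x in range(out_w):
            x0 = x * in_w // out_w
            x1 = max(x0 + 1, (x + 1) * in_w // out_w)
            block = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(gray, size=8):
    """Row-wise difference hash: one bit per adjacent pixel pair."""
    small = shrink(gray, size + 1, size)  # 9 columns give 8 comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            # Bit convention (left darker than right) is a choice of this sketch.
            bits = (bits << 1) | int(left < right)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic demo: a left-to-right gradient, a brighter copy, and its mirror image.
original = [[x for x in range(90)] for _ in range(80)]
brighter = [[x + 20 for x in range(90)] for _ in range(80)]
mirrored = [row[::-1] for row in original]

print(hamming(dhash(original), dhash(brighter)))  # 0: same structure, different brightness
print(hamming(dhash(original), dhash(mirrored)))  # 64: every comparison flips
```

Two re-encoded copies of the same picture should end up with a small Hamming distance, while unrelated pictures should differ in many bits. Where to draw the line between "same" and "different" is something you would have to tune on your own images.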
Libraries you can use to do the same:
https://github.com/benhoyt/dhash
https://github.com/Rayraegah/dhash
"I've found that dhash is great for detecting near duplicates, but because of the simplicity of the algorithm, it's not great at finding similar images or duplicate-but-cropped images -- you'd need a more sophisticated image fingerprint if you want that. However, the dhash is good for finding exact duplicates and near duplicates, for example, the same image with slightly altered lighting, a few pixels of cropping, or very light photoshopping."
Answered By - luzede Answer Checked By - Pedro (PHPFixing Volunteer)