Score: 4.50 Votes: 2

How are duplicates found?

Starter: star1962 Posted: 8 years ago Views: 1.4K
#4968574
Lvl 13
I am wondering what software the moderators use to find duplicate images - can you enlighten me?
I have 100,000 or more images, and I would like to find duplicates in my own collection and also be able to find duplicates on WBW before I upload.

I have a nice piece of software on my Mac (PhotoSweeper from Overmacs) which helps me compare smaller collections and find duplicates, but it cannot just find image X in the collection. Instead it compares all images in the collection to all images in the collection. Needless to say, this takes too much time.
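
To illustrate the "find image X in the collection" case without comparing everything to everything, here is a minimal sketch. It assumes the duplicates are exact byte-for-byte copies, and the function names (`file_digest`, `build_index`, `find_copies`) are hypothetical. It is not how PhotoSweeper or the website works; it only shows the idea of hashing each file once and then looking up a single candidate.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large images are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_index(root: Path) -> dict[str, list[Path]]:
    """Hash every file under root once and map digest -> list of paths."""
    index = defaultdict(list)
    for p in sorted(root.rglob("*")):
        if p.is_file():
            index[file_digest(p)].append(p)
    return dict(index)


def find_copies(index: dict[str, list[Path]], candidate: Path) -> list[Path]:
    """Return files in the indexed collection with identical bytes."""
    return index.get(file_digest(candidate), [])
```

The index is built once over the whole collection; after that, checking one new image costs a single hash and a dictionary lookup. Any digest in the index that maps to more than one path is a set of exact duplicates inside the collection itself. A resized, re-encoded, or cropped copy has different bytes and would not be found this way.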

So, if you can tell me what you use, perhaps I can use that myself.

Thanks in advance.
#4968588
Lvl 24
I assume it's the same software that can also apparently tell when a photo has allegedly been altered or cropped. It would be good to know though, as having so many photo submissions rejected is a pain in the hoop.
#4968960
Lvl 70
We don't use any software ourselves, but the website automatically detects them. I think it compares hashes of the uploaded picture to others in the database, but I can't really explain exactly how it works.
We also use our memory of course since we see a lot of the same pics uploaded regularly.
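
Which kind of hash the website compares is not stated here. As a hedged illustration only, one common family is the perceptual "average hash", which tolerates small changes such as brightness shifts. The sketch below assumes the image has already been reduced to a small grayscale grid (that step would need an image library and is omitted), and the names `average_hash`, `hamming` and `looks_like_duplicate`, as well as the threshold, are hypothetical.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Hash a small grayscale grid (values 0..255) into an integer.

    Each pixel contributes one bit: 1 if it is brighter than the
    mean of the grid, otherwise 0.
    """
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def looks_like_duplicate(h1: int, h2: int, max_distance: int = 5) -> bool:
    """Treat two hashes as a match if they differ in only a few bits.

    max_distance is an arbitrary illustrative threshold.
    """
    return hamming(h1, h2) <= max_distance
```

Two uploads would be compared with `looks_like_duplicate(average_hash(a), average_hash(b))`. A hash like this generally survives re-encoding or mild brightness changes, but a crop shifts the whole grid, which may be one reason cropped copies are hard to catch automatically.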

As for cropped pics, there are several things that can help: Google reverse image search, picture ratio/quality/framing, and memory of course.
#4968970
Quote:
Originally posted by omuh
As for cropped pics, there are several things that can help: Google reverse image search, picture ratio/quality/framing, and memory of course.


This is of interest to me too, because many of my uploads seem to go the way of "cropped/altered" and are rejected. I understand if there is a legal element to these images being rejected, but I actually perform a Google reverse image search on EVERY photo I upload, and generally I'll select the largest/best quality version. I wish I knew how to tell if an image had been cropped/altered from the original, as it would save me uploading them in the first place. As I've mentioned before, I don't alter them myself.

Another query along the same lines - if I had an amateur beach photo of a very attractive topless woman (for example) with either a naked dude or a child in the background, is there a particular reason why a cropped photo in this instance wouldn't be acceptable?
#4969047
Lvl 70
Quote:
Originally posted by jhope1
...

This is of interest to me too, because many of my uploads seem to go the way of "cropped/altered" and are rejected. I understand if there is a legal element to these images being rejected, but I actually perform a Google reverse image search on EVERY photo I upload, and generally I'll select the largest/best quality version. I wish I knew how to tell if an image had been cropped/altered from the original, as it would save me uploading them in the first place. As I've mentioned before, I don't alter them myself.

Another query along the same lines - if I had an amateur beach photo of a very attractive topless woman (for example) with either a naked dude or a child in the background, is there a particular reason why a cropped photo in this instance wouldn't be acceptable?

I can confirm that many of your uploads have been cropped pictures that were already on the website. I still stumble upon some duplicate reports clearly showing the two pics cropped differently. There's not much you can do about it, since Google will not always find the original one.

As for the second question, if there's a dude in the background, the picture will be accepted anyway. If there's a child, I personally consider that if a picture was taken with a child in it (even if there's also a lovely lady), it wasn't meant to land on a porn website (and that's often the case, since these are mainly Facebook-style pictures of random friends in bikinis). Objectively, it could be accepted, but I'd rather delete a few more pics like that by being picky on the cropping aspect than let many cropped pics (that would end up being copyrighted or duplicates) swarm the galleries.
#4969231
Lvl 13
When I upload a zip file containing images, I get 0 in the duplicates "index" shown after the upload. Still, some pics apparently are duplicates. It is not easy...

Besides, when you tag an image as a duplicate it would be helpful if there was a link in the delete log to the duplicate - it could be that it is part of a series containing images I don't already have, and therefore could supplement my collection.
#4969259
Lvl 70
Quote:
Originally posted by star1962
When I upload a zip file containing images, I get 0 in the duplicates "index" shown after the upload. Still, some pics apparently are duplicates. It is not easy...

Besides, when you tag an image as a duplicate it would be helpful if there was a link in the delete log to the duplicate - it could be that it is part of a series containing images I don't already have, and therefore could supplement my collection.

Duplicates aren't automatically detected/deleted; the only thing that is automatically filtered is the picture size. I think it used to be different, but I can't remember when and why it changed.