Duplicate Annihilator is on the right track, and the Aperture version actually claims to use CRC and MD5, but the fact that it marks/deletes UNIQUE photos (it even admits as much) is totally unacceptable. Ideally a script would hash the data portion, then compare the metadata and pick the copy with the most fields filled out, then mark those in Aperture as #1 Best Dupe and #2 Lesser-Quality Dupe (#3-18 deleted because they are exact dupes of #2).

I have had to import about five versions of broken iPhoto libraries, so I now have up to 10 duplicates of many photos, but probably only one copy of many others. My Aperture library now contains about 200,000 photos, probably 70% of which are dupes. I have put a lot of work into the metadata that (I hope) got brought over from the iPhoto imports, so I don't relish losing that.

At this point it seems that the only way I can clean this up is to devote about 150 man-hours to going through the entire library, clicking on each visually duplicate photo, finding the one with the good metadata, and deleting the other seven. I am utterly bemused that these "computers," which were supposed to save us from this kind of horribly repetitive manual labor, are unable to do this, and that Apple doesn't seem to think its flagship photo app should be able to either. If I have somehow overlooked a workable solution, please poke me before I begin my arduous trek down the road of mind-numbing photo library editing.

More edit: I just found this, which also promises to help and should be included in our script, IMHO: separating out the Original photos from the Edited photos seems to be a good idea.

FWIW, enabling Auto-Stack appears to be broken. Aperture has been spinning away all day now and nothing seems to be happening. Is Tidy Up now Aperture 3 compatible, and the best solution we have?
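The "pick the copy with the most fields filled out" step could be sketched in Python like this. The record layout and field names below are hypothetical; the real metadata would have to come out of the Aperture library (e.g. via AppleScript):

```python
def best_copy(duplicates):
    """Given a list of metadata dicts for visually identical photos,
    return the one with the most non-empty fields filled in."""
    return max(duplicates, key=lambda d: sum(1 for v in d.values() if v))

# Hypothetical duplicate group: same image, differing metadata richness.
dupes = [
    {"caption": "", "keywords": "", "gps": ""},
    {"caption": "Beach trip", "keywords": "family, 2009", "gps": ""},
    {"caption": "Beach trip", "keywords": "", "gps": ""},
]
print(best_copy(dupes))  # the second record wins: two fields filled
```

Ranking by field count is crude (a caption may matter more than a keyword), but it matches the "most fields filled out" heuristic described above.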
A one-way hash algorithm produces a "relatively" unique output from any given input; "collisions" happen when two different inputs give the same output, but that's very, very unlikely from two files of the same size but differing content. So if a file containing "ABCDE" gives a hash output of f2342965 and a file containing "ABCDF" gives an output of 31c2485, you can be sure they're not the same file. One-way functions are often used to hash passwords, so you don't have to store the actual password anywhere: you simply store the hash, then run the user's typed-in password through the algorithm to produce a hash; if the two match, you "know" with a fair degree of certainty that the password is correct.

Say 0.1 second to do a CRC32 and 1 second to do a SHA-1 hash: you're already looking at a 10:1 rate, but if you add in the fact that, say, 95% of the time you're reading 100 bytes instead of 12,000,000 bytes, things get much more efficient. By winnowing away at the problem like that, I can do literally thousands more files in the same amount of time as it takes to read, checksum, and compare a handful of full-sized raw files with an "expensive" checksum or hash algorithm. Unfortunately, I just don't have the time to write a lot of code anymore.

Compuwar has the perfect solution (as long as it doesn't destroy Aperture's stored metadata). Can't we develop this into a useful app? Could this be implemented in an AppleScript script? Maybe an AppleScript could use compuwar's ideas and send a "delete this photo" command to Aperture. That would at least make the manual checking of the slightly different photos a lot easier (i.e. it would kill all the definite duplicates, leaving just two or three - one with good metadata, two dupes - to manually work on). Even better would be if the AppleScript could find the duplicates and near-dupes by efficient multi-pass: filename, file date, CRC of the head, MD5, SHA-1, etc.
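The mismatch property is easy to demonstrate with Python's hashlib (the short digests quoted above are illustrative, not real SHA-1 outputs):

```python
import hashlib

# Two inputs of the same length, differing in the last byte.
a = hashlib.sha1(b"ABCDE").hexdigest()
b = hashlib.sha1(b"ABCDF").hexdigest()

# Different content yields different digests, so the files
# containing these bytes cannot be identical.
print(a != b)  # True
```

The same comparison is how the password check described above works: store only `sha1(password)`, hash whatever the user types, and compare the two digests.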
I tend to be dealing with tens to hundreds of thousands of files at once. A one-way hash function like SHA-1 or MD5 takes a lot of CPU (relatively speaking), and reading 12 MB or 25 MB raw files takes a comparatively long amount of time.

My "solution" would be to sort by size first (if they aren't the same size, they can't be the same file), then to checksum a relatively "cheap" amount of data with a "cheap" checksum algorithm (CRC32); if they don't match at that point, they're not the same file.