SammaPix
Tools · 5 min read

Find and Remove Duplicate Photos Free (No Upload Required)

TwinHunt finds duplicate and near-duplicate photos in your library. It runs entirely in your browser: no cloud upload, no software to install.

Duplicate photos are quietly taking up your storage

Every photographer accumulates duplicates. They come from multiple sources: downloading the same card twice, syncing across devices, copying folders for backup with overlapping files, burst shooting where you kept more frames than intended, or importing the same shoot into Lightroom multiple times.

A 10,000-photo library with 15% duplicates carries 1,500 extra files. At typical photo sizes of 20-40MB each, that is 30-60GB of wasted space on a MacBook with a 512GB SSD, and you pay for it in hardware, cloud backup costs, and slower library performance.

TwinHunt finds them without requiring you to upload anything.

Exact duplicates vs. near-duplicates

Not all duplicates are identical files. TwinHunt handles both categories:

Exact duplicates are files with identical content — the same bytes, the same image, just stored twice (sometimes with different filenames). These are the easiest to find and the safest to delete. One copy can go without any review needed.
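Finding exact duplicates is a standard content-hashing problem. TwinHunt's own implementation is browser JavaScript and is not published, so treat this as a minimal Python sketch of the general technique; the `find_exact_duplicates` name and the in-memory `files` dict are illustrative, not TwinHunt's API:

```python
import hashlib

def find_exact_duplicates(files):
    """Group files by the SHA-256 digest of their content.

    `files` maps a filename to the file's raw bytes. Any group with more
    than one entry is a set of byte-identical duplicates, regardless of
    what the files are named.
    """
    by_digest = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    return [names for names in by_digest.values() if len(names) > 1]
```

Because the digest depends only on the bytes, a renamed copy is still caught, while two different photos never collide in practice.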

Near-duplicates are visually similar but not identical files. The most common sources:

  • Burst sequences where you kept 3 frames of the same moment instead of 1
  • The same image exported at different quality settings or file sizes
  • Slight recompositions from a bracket or panorama attempt
  • A JPEG and a RAW of the same shot (these are intentional, but TwinHunt can surface them)
  • The same photo with different edits applied (crop, color grade)

Near-duplicate detection is more nuanced. TwinHunt uses perceptual hashing — an algorithm that creates a fingerprint of each image based on its visual content rather than its exact bytes. Two photos that look nearly identical will have similar perceptual hashes, even if the files are different sizes or formats.

    How perceptual hashing works

    A perceptual hash (pHash or dHash) is a compact fingerprint of an image's visual content. Unlike a cryptographic hash (which changes completely if a single pixel changes), a perceptual hash changes gradually as the image changes — similar images produce similar hashes.
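The cryptographic side of that contrast is easy to demonstrate with a toy example (the `digest` helper and the flat 8x8 "image" below are illustrative, not anything TwinHunt exposes):

```python
import hashlib

def digest(pixels):
    """SHA-256 of raw pixel values: a cryptographic, not perceptual, hash."""
    return hashlib.sha256(bytes(pixels)).hexdigest()

flat = [10] * 64                 # toy 8x8 grayscale image, all one tone
tweaked = [10] * 63 + [11]       # identical except one pixel, one level brighter

# The two digests are unrelated strings: a cryptographic hash has no
# notion of "almost the same image", which is why it only finds
# byte-identical duplicates.
print(digest(flat))
print(digest(tweaked))
```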

    The process:

  • The image is resized to a very small thumbnail (typically 8x8 or 32x32 pixels)
  • The thumbnail is converted to grayscale for basic variants, or kept in color for color-hash variants
  • A fixed transformation is applied (DCT for pHash, pixel difference for dHash)
  • The result is a 64-bit or 128-bit integer that represents the image

    To find near-duplicates, TwinHunt compares hashes across all photos in your set and groups those whose hash distance (typically the Hamming distance, i.e. the number of differing bits) falls below a threshold. The threshold is configurable: a tighter setting finds only very similar photos, while a looser one catches images with more differences, such as different crops or different exposures of the same scene.
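The steps above can be sketched in pure Python. TwinHunt itself runs as JavaScript in the browser and its exact algorithm is not published, so this is a minimal dHash illustration: the `dhash`, `hamming`, and `group_near_duplicates` names are illustrative, and the input is assumed to be already resized to a 9x8 grayscale grid.

```python
def dhash(pixels, size=8):
    """Difference hash: compare each pixel to its right-hand neighbor.

    `pixels` is `size` rows of `size + 1` grayscale values each (a 9x8
    thumbnail for size=8), assumed already resized from the source image.
    Returns a 64-bit integer fingerprint for size=8.
    """
    bits = 0
    for row in pixels:
        for x in range(size):
            bits = (bits << 1) | (1 if row[x] > row[x + 1] else 0)
    return bits

def hamming(h1, h2):
    """Number of bits that differ between two hashes."""
    return bin(h1 ^ h2).count("1")

def group_near_duplicates(hashes, threshold=10):
    """Greedy grouping: a photo joins the first group whose representative
    hash is within `threshold` bits; otherwise it starts a new group.
    Simple O(n^2) pairwise comparison over a {name: hash} dict.
    """
    groups = []
    for name, h in hashes.items():
        for group in groups:
            if hamming(h, group[0][1]) <= threshold:
                group.append((name, h))
                break
        else:
            groups.append([(name, h)])
    return groups
```

Because the hash encodes brightness gradients rather than exact pixel values, re-encoding or resizing a photo barely moves its fingerprint, while a different scene produces a distant one.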

    All of this runs in your browser. No images are sent anywhere.

    Privacy: why browser-based detection matters

    Duplicate photo finders that work via upload send your entire photo library to a server. For personal photos — especially anything with location data, personal moments, unreleased work — this is a meaningful privacy consideration.

    TwinHunt processes everything locally. When you drop your photos onto the interface, JavaScript reads the file data and computes perceptual hashes in your browser. The comparison and grouping happen on your machine. The only network requests are for the web page itself — not your photo data.

    This also makes TwinHunt fast. There is no upload queue, no server processing wait, no download of results. A 1,000-photo library typically completes analysis in under 30 seconds on a modern machine.

    How to find duplicate photos with TwinHunt

    Step 1. Go to sammapix.com/tools/twinhunt and drop your photos or a folder onto the interface.

    Step 2. TwinHunt analyzes each photo and generates perceptual hashes. A progress bar shows how many files have been processed. Large libraries (5,000+ photos) take a minute or two.

    Step 3. Results appear as groups of similar photos. Each group shows the duplicate or near-duplicate set side by side, with file size and similarity score displayed.

    Step 4. Review each group and mark which files to keep and which to delete. TwinHunt never deletes anything automatically — you always make the final decision.

    Step 5. Download the deletion list, or select the files in your file manager and remove them there. TwinHunt produces a plain-text file with the full paths of every file you marked — compatible with any OS file manager or batch delete script.
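If you prefer to script the cleanup, a plain-text list of paths is easy to consume. A small sketch, assuming one path per line; the `delete_listed_files` helper and its `dry_run` flag are hypothetical, not something TwinHunt ships:

```python
import os

def delete_listed_files(list_path, dry_run=True):
    """Read a plain-text deletion list (one path per line) and remove
    each file. With dry_run=True, only report what would be deleted.
    Returns the paths that were read from the list.
    """
    with open(list_path, encoding="utf-8") as f:
        paths = [line.strip() for line in f if line.strip()]
    for path in paths:
        if dry_run:
            print(f"would delete: {path}")
        elif os.path.isfile(path):
            os.remove(path)
    return paths
```

Running once with `dry_run=True` and eyeballing the output before the real pass is a cheap safeguard against deleting the wrong copy.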

    What to do with the results

    The standard workflow after TwinHunt analysis:

    For exact duplicates: Delete without review. If two files are byte-for-byte identical, keeping both serves no purpose.

    For near-duplicates from bursts: Keep the sharpest, best-exposed frame. Delete the rest. If you have already culled and selected your keepers, use your cull selections as the guide — if a near-duplicate was not selected in culling, it is a delete.

    For different-format pairs (RAW + JPEG): These are intentional pairs in most workflows. TwinHunt will flag them — review and keep both if that is your intention, or delete the JPEG if you are working from RAW.

    For different-edit versions: If you have the same photo edited two different ways, keep both only if both serve a purpose. Otherwise, keep your final edit and delete the intermediate versions.

    Integrating TwinHunt into your photo workflow

    TwinHunt is most effective at two workflow points:

    Before import into Lightroom. Run TwinHunt on the raw card contents before importing. Remove exact duplicates and obvious burst extras before they enter your catalog. This keeps the catalog lean from the start.

    Quarterly library cleanup. Run TwinHunt across your full library (or a date-range subset) every few months to catch duplicates that have accumulated from syncing, backup copying, or multiple imports.

    Combining TwinHunt with SammaPix's Cull tool gives you a fast pre-processing workflow: TwinHunt finds and removes duplicates, Cull handles the selection pass, and then only your keepers go into your editing workflow.


    FAQ

    Will TwinHunt find duplicates across different folders?

    Yes. TwinHunt analyzes all photos you drop into the interface regardless of which folder they came from. If you drop an entire hard drive or multiple card contents at once, TwinHunt compares across all of them and finds matches.

    Can TwinHunt distinguish between a RAW and its JPEG version?

    Yes — perceptual hashing is format-agnostic. A RAW file and its JPEG equivalent will appear as near-duplicates with a high similarity score. TwinHunt labels these as cross-format pairs so you can review them separately from same-format duplicates.

    Is it safe to use on a large library?

    TwinHunt is designed for large libraries. It processes photos incrementally and does not load all files into memory at once — hashes are computed file by file and stored in an in-memory index. A 10,000-photo library requires roughly 200MB of browser memory for the hash index, which is well within normal browser limits on any modern device. TwinHunt never modifies or deletes your files — it only identifies duplicates and lets you decide what to do with them.



    Try SammaPix free

    Compress, convert to WebP, and AI-rename your images — no signup needed for compression. 100% client-side, images never leave your browser.

    Start optimizing