How to Use Visual Similarity Analysis
Audience: Archivists — people who want to find and group visually similar or duplicate photos in a batch.
When digitizing old photo albums, it's common to have near-duplicate shots — multiple photos from the same moment, very similar compositions, or scanned duplicates of the same print. Zeuge's visual similarity analysis automatically finds these groups so you can decide how to handle them.
What Visual Similarity Analysis Does
Zeuge analyzes every item in your batch and identifies photos that look visually similar. It then:
- Proposes groups of similar photos for your review
- Lets you confirm groups (they become linked in the archive)
- Lets you reject groups (if the algorithm made a mistake)
Confirmed groups are also used in the contributor carousel — contributors can record a single memory that covers all photos in a group, which is more natural than recording the same memory for five nearly identical shots.
Step 1 — Open the Items Tab
From your batch detail page, click the "Items" tab.
📸 [SCREENSHOT: Items tab showing the photo grid and the similarity analysis section below]
Step 2 — Run Similarity Analysis
Scroll to the Similarity section and click the "Analyze for similar photos" button.
📸 [SCREENSHOT: The "Analyze for similar photos" button in the Items tab]
Zeuge will:
- Download the thumbnail version of every item in your batch
- Compute a perceptual hash (pHash) for each image — a mathematical "fingerprint" that captures the visual content
- Compare all images pairwise to find matches within a similarity threshold
- Build groups of connected matches
- Show you the proposed groups
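This process typically takes 10–60 seconds for a batch of 100 photos. Larger batches take longer; because every image is compared against every other, the number of comparisons grows with the square of the batch size.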
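The compare-and-group steps above can be sketched roughly as follows. This is an illustrative sketch, not Zeuge's actual implementation: it assumes each image's pHash is already available as a 64-bit integer, and the function names, the item IDs, and the max_distance threshold of 8 are hypothetical values chosen for the example.

```python
from itertools import combinations
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


def group_similar(hashes: Dict[str, int], max_distance: int = 8) -> List[List[str]]:
    """Group item IDs that are connected by pairwise matches.

    max_distance is a hypothetical threshold; the value Zeuge uses
    is not stated in this guide.
    """
    # Union-find: every item starts as its own group.
    parent = {item: item for item in hashes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # Compare all images pairwise and merge the groups of matching pairs.
    for a, b in combinations(hashes, 2):
        if hamming_distance(hashes[a], hashes[b]) <= max_distance:
            parent[find(a)] = find(b)

    # Collect the members of each group; single items are not proposed.
    groups: Dict[str, List[str]] = {}
    for item in hashes:
        groups.setdefault(find(item), []).append(item)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Made-up hashes: a, b and c differ by only a few bits; d is unrelated.
example_hashes = {
    "a.jpg": 0xFFFF0000FFFF0000,
    "b.jpg": 0xFFFF0000FFFF0003,
    "c.jpg": 0xFFFF0000FFFF000F,
    "d.jpg": 0x0123456789ABCDEF,
}
print(group_similar(example_hashes, max_distance=2))
```

With max_distance=2, a.jpg and c.jpg are 4 bits apart and do not match directly, but each matches b.jpg, so all three land in one group. That is what "groups of connected matches" means in practice, and it is also one reason a proposed group can contain photos that look less alike than their neighbors.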
📸 [SCREENSHOT: The "running" state — a progress indicator showing the analysis is in progress]
Note on the AI engine: The standard similarity engine uses pHash, which is fast and works well for near-duplicate detection. Pro and Enterprise plans have access to a premium AI embedding engine (Vertex AI multimodal embeddings) which can find visually and conceptually similar images even when the composition differs. See Subscription Tiers for details.
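To give a feel for how embedding-based comparison differs from hashing, here is a generic sketch. It assumes images have already been turned into numeric vectors and compares them with cosine similarity, a common choice for embeddings. This guide does not say which measure the premium engine uses, and the three-number vectors below are made up for illustration (real embeddings are far longer).

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean very similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, not real model output.
beach_1 = [0.9, 0.1, 0.2]
beach_2 = [0.8, 0.2, 0.3]   # same scene, different composition
portrait = [0.1, 0.9, 0.1]  # unrelated subject

print(cosine_similarity(beach_1, beach_2) > cosine_similarity(beach_1, portrait))
```

Unlike a perceptual hash, which reacts mostly to the layout of light and dark areas, an embedding can place two differently framed photos of the same scene close together.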
Step 3 — Review Proposed Groups
When the analysis completes, proposed groups appear as group cards in the similarity section. Each card shows:
- Thumbnail collage of all photos in the group
- Number of items in the group
- Confidence score — how similar the algorithm thinks they are (higher = more confident)
- Action buttons: Confirm, Reject
📸 [SCREENSHOT: Several proposed group cards — showing thumbnail collages, confidence scores, and Confirm/Reject buttons]
Step 4 — Confirm or Reject Each Group
✅ Confirm a Group
If the photos genuinely belong together (e.g., three shots of the same moment, or two copies of the same photo), click "Confirm".
The group moves to the "Confirmed" section and becomes:
- Visible to contributors in their carousel (they can record one memory for the whole group)
- Linked in the archive metadata
❌ Reject a Group
If the algorithm made a mistake (e.g., grouped two photos that just happen to have a similar background but are actually different subjects), click "Reject".
Rejected groups are dismissed and their items remain independent.
📸 [SCREENSHOT: Confirming a group — the card moving from "Proposed" to "Confirmed" section]
Step 5 — View Group Details
Click "View" (or click the group card) to see all items in a group at full size before deciding.
📸 [SCREENSHOT: Expanded group detail view showing all 4 photos in a proposed group at larger size]
Creating Groups Manually
If you notice a group of related photos that the algorithm didn't catch, you can create a group manually:
- In the items grid, select multiple items (hold Shift + click).
- Click "Create Group".
- A confirmed group is created immediately (marked with source: manual).
📸 [SCREENSHOT: Selecting multiple items in the grid and the "Create Group" button appearing]
Re-Running Analysis
If you upload new items after running the analysis, click "Analyze for similar photos" again. A new analysis run will be created, and newly proposed groups will appear for review. Previously confirmed groups are preserved.
How Groups Appear to Contributors
Once you confirm groups, contributors using the voice narration carousel will:
- See the entire group highlighted as a unit
- Be able to select all photos in a group with one action
- Record one single memory for the entire group
This is especially valuable when you have 4–5 near-identical shots — the contributor only has to record one response instead of five.
📸 [SCREENSHOT: The contributor carousel showing a confirmed group — all group items highlighted together with "Record for group" affordance]
Common Questions
The algorithm grouped photos that aren't really similar. Why? pHash works well for near-duplicates but can occasionally produce false positives if photos share a dominant color, similar framing, or similar backgrounds. Just click "Reject" to dismiss incorrect groups.
The algorithm missed a group of similar photos. What should I do? You can create the group manually (see above). You can also re-run the analysis — occasionally running the analysis a second time after confirming or rejecting existing groups can surface missed matches.
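How is the confidence score calculated? For the pHash engine: 1 − (hamming_distance / 63), where hamming_distance is the number of bits that differ between the two images' hashes. A score of 1.0 means the two hashes are identical, which in practice indicates the same image (for example, an exact copy or a resized version). A score around 0.85–0.95 typically represents near-duplicate shots from the same moment.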
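As a worked example of the pHash formula above, the short sketch below converts a few Hamming distances into confidence scores. The function name is hypothetical; the formula and the divisor of 63 are taken directly from this guide.

```python
def phash_confidence(hamming_distance: int) -> float:
    """Confidence score per the formula in this guide: 1 - (hamming_distance / 63)."""
    return 1 - hamming_distance / 63


for distance in (0, 4, 6, 9):
    print(distance, round(phash_confidence(distance), 3))
# 0 1.0
# 4 0.937
# 6 0.905
# 9 0.857
```

Under this formula, the 0.85–0.95 range corresponds to hashes that differ by roughly 4 to 9 bits.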
Will confirmed groups affect how metadata is merged? Not directly — metadata is merged per-item based on responses. However, confirmed groups help contributors record shared memories, which typically results in the same metadata being applied to all items in the group.
Can I un-confirm a group? Yes. Currently, you do this by rejecting the group from the "Confirmed" section, which removes the grouping.
Next Steps
→ Full Archivist Reference Guide
→ Processing Pipeline Reference







