So, I’m self-hosting Immich. The issue is that we tend to take a lot of pictures of the same scene or subject so we can pick the best one later, and we end up with 5 to 10 photos that are near-duplicates but not quite identical.
Some duplicate-finding programs rate those images at 95% similarity or more.

I’m wondering whether there’s any way, probably at the file system level, for those similar images to be compressed together.
Maybe deduplication?
Has anyone here handled a similar situation?
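
For context, here is a rough sketch of what block-level deduplication compares, assuming fixed-size blocks and a SHA-256 hash per block. The 4096-byte block size, the helper names `block_hashes` and `shared_blocks`, and the synthetic stand-in data are all illustrative assumptions, not the behavior of any particular file system. It shows that an exact copy shares every block, while the same bytes shifted by a single byte share none, which suggests that near-duplicate photos (whose encoded bytes differ throughout) are unlikely to benefit much.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of data (the last block may be shorter)."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(a, b, block_size=BLOCK_SIZE):
    """Count the blocks of b whose content also appears as a block of a."""
    seen = set(block_hashes(a, block_size))
    return sum(1 for h in block_hashes(b, block_size) if h in seen)


# Stand-in data: 64 KiB of 4-byte counters, so every block is distinct.
original = b"".join(i.to_bytes(4, "big") for i in range(16 * 1024))
exact_copy = bytes(original)
shifted = b"\x00" + original  # same content, moved by one byte

print(shared_blocks(original, exact_copy))  # 16: every block matches
print(shared_blocks(original, shifted))     # 0: no block lines up any more
```

Two separately captured photos differ far more than this one-byte shift, so identical blocks between them would be a coincidence rather than something to rely on.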

  • just_another_person@lemmy.world
    2 months ago

    The problem is that OP is asking for something to automatically make decisions for him. Computers don’t make decisions, they follow instructions.

    If you have 10 similar images and want a script to delete the 9 you don’t want, how would it know which to delete and which to keep?

    If it doesn’t matter, or if you’ve already chosen the one you want out of the set, just go delete the rest. Easy.

    As far as identifying similar images goes, this is high-school-level programming at best with a CV model. You just run a pass through something like YOLO and have it output similarity scores, with a confidence, for a set of images. The problem is that you need a source image to compare against. If you’re running through thousands of files comprising dozens or hundreds of sets of similar images, you need a source for each comparison.

    • cizra@lemm.ee
      2 months ago

      OP didn’t want to delete anything. OP wants to compress them all together, exploiting the fact that they’re similar to gain efficiency.

        • WhyJiffie@sh.itjust.works
          2 months ago

          No, not really.

          > The problem is that OP is asking for something to automatically make decisions for him. Computers don’t make decisions, they follow instructions.

          The computer is not being asked to make decisions like “pick the best image”. It is being asked to optimize storage, as lossless compression does.
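
A minimal sketch of that kind of optimization, assuming two byte strings that are identical except for a few positions: compressing them as one stream lets the compressor encode the second as back-references to the first. The helper names, the 16,000-byte size, and the synthetic random data are illustrative assumptions. Real photos of the same scene are encoded independently, so their bytes usually do not line up like this, and the gain on actual JPEGs is likely to be much smaller.

```python
import random
import zlib


def make_near_duplicates(size=16_000, changed=20, seed=0):
    """Return two incompressible byte strings that differ in a few positions."""
    rng = random.Random(seed)
    first = bytes(rng.getrandbits(8) for _ in range(size))
    second = bytearray(first)
    for pos in rng.sample(range(size), changed):
        second[pos] ^= 0xFF  # flip every bit so the byte is guaranteed to change
    return first, bytes(second)


def separate_size(blobs):
    """Total size when each blob is compressed on its own."""
    return sum(len(zlib.compress(blob, 9)) for blob in blobs)


def solid_size(blobs):
    """Size when all blobs are compressed as a single stream."""
    return len(zlib.compress(b"".join(blobs), 9))


# The size is kept under zlib's 32 KiB window so the second blob
# can still reference the first one.
first, second = make_near_duplicates()
print("separate:", separate_size([first, second]))
print("solid:   ", solid_size([first, second]))
```

Nothing is chosen or discarded here: both inputs can be recovered exactly from the single compressed stream, which comes out well under the combined size of the two separate ones.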