• unexposedhazard
      5 months ago

      This feels like it should be a browser plugin that automatically anonymizes anything you download.

    • NeatNit
      5 months ago

      I feel like this will cause quality degradation, like repeatedly re-compressing a JPEG. Relevant xkcd

      Edit: though obviously for most use cases it shouldn’t matter

      • Passerby6497@lemmy.world
        5 months ago

        Why would it cause degradation? You’re not recompressing anything, you’re taking the visible content and writing it to a new PDF file.

        • NeatNit
          5 months ago

          You’re pushing it through one system that converts a PDF file into printer instructions, and then through another system that converts printer instructions into a PDF file. Each step probably has to make adjustments with the data it’s pushing through.

          Without looking deeply into the systems involved, I have to assume it’s not a lossless process.

          • 4am@lemm.ee
            5 months ago

            Those printer instructions are called PostScript and they’re the basis of PDF.

            You are thinking that the printing process will rasterize the PDF and then essentially OCR/vector map it back. It’s (usually) not that complicated.

          • TomSelleck@lemm.ee
            5 months ago

            You should maybe look a bit more into it. How do you think commercial printers or even hobbyists maintain fidelity in their images? Most images pass through multiple programs during the printing process and still maintain the quality. It’s not just copy/paste.

            • NeatNit
              5 months ago

              They maintain a high quality but not lossless.

              As a trivial example, if you use the wrong paper size (like Letter instead of A4) then it might crop parts of the page or add borders or resize everything. Again I’ll admit, in 99% of cases it doesn’t matter, but it might matter if, say, an embedded picture was meant to be exactly to scale.
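
              To put rough numbers on the paper-size case, here is a minimal sketch. It assumes the print dialog is set to "shrink to fit", which is only one of several possible behaviours (cropping or adding borders are others), and `fit_scale` is a hypothetical helper, not part of any printing API.

```python
# Page sizes in PostScript points (1 pt = 1/72 inch).
LETTER = (612.0, 792.0)   # 8.5 x 11 in
A4 = (595.28, 841.89)     # 210 x 297 mm, rounded to two decimals


def fit_scale(page, media):
    """Uniform scale that makes `page` fit entirely inside `media`."""
    return min(media[0] / page[0], media[1] / page[1])


def printed_length_mm(length_mm, page, media):
    """Length of a drawing feature after the page is shrunk to fit."""
    return length_mm * fit_scale(page, media)


# An A4 page sent to Letter media is limited by the height.
a4_on_letter = fit_scale(A4, LETTER)
# A Letter page sent to A4 media is limited by the width.
letter_on_a4 = fit_scale(LETTER, A4)

print(f"A4 -> Letter scale: {a4_on_letter:.4f}")
print(f"Letter -> A4 scale: {letter_on_a4:.4f}")
print(f"100 mm ruler on A4 -> Letter: {printed_length_mm(100, A4, LETTER):.2f} mm")
```

              Under that assumption a to-scale drawing ends up a few percent too small, which is invisible on screen but matters if someone measures off the page.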

              • TomSelleck@lemm.ee
                5 months ago

                My friend, I worked in commercial printing for 2 decades. You’re still making assumptions that are wrong. There are ways to transfer files that are lossless and even ways to improve and upscale artwork. Why do you care so much about this?

                • NeatNit
                  5 months ago

                  “There are ways” ≠ this is what happens by default when done by the average user

        • NeatNit
          5 months ago

          See my reply to another comment

          • Diplomjodler@lemmy.world
            5 months ago

            You’re still wrong. The only place where it could cause quality loss is if embedded bitmap images are recompressed with lower quality settings (which you can adjust). PDF is a vector format, i.e. a mathematical description of what is to be rendered on screen. It was explicitly designed to be scalable, transmittable and rendered on a wide variety of devices without quality loss.
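
            A toy illustration of the vector/bitmap difference, not of any real PDF writer: the path coordinates use exact fractions (real PDFs store decimal numbers with finite precision), and the bitmap resampling is plain nearest-neighbour. All function names here are made up for the sketch.

```python
from fractions import Fraction


def scale_path(points, factor):
    """Scale every vertex of a vector path by the same factor."""
    return [(x * factor, y * factor) for x, y in points]


def downsample(pixels, factor):
    """Keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in pixels[::factor]]


def upsample(pixels, factor):
    """Repeat every pixel `factor` times in both directions."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Vector data: shrink to a third and blow back up, nothing is lost.
triangle = [(Fraction(0), Fraction(0)), (Fraction(3), Fraction(0)), (Fraction(0), Fraction(4))]
vector_round_trip = scale_path(scale_path(triangle, Fraction(1, 3)), Fraction(3))

# Bitmap data: a 4x4 checkerboard halved and doubled loses its pattern.
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
bitmap_round_trip = upsample(downsample(checker, 2), 2)

print("vector unchanged:", vector_round_trip == triangle)
print("bitmap unchanged:", bitmap_round_trip == checker)
```

            The path survives the round trip because only its description is rescaled, while the checkerboard comes back as a flat block because the discarded pixels cannot be recovered.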

            • NeatNit
              5 months ago

              No point discussing this if neither of us is going to prove it one way or the other.

              Bitmaps are actually a key part of what I was thinking about, so you agree with me there it seems. There’s also the issue of using the wrong paper size. IIRC Windows usually defaults to Letter for printing even in places where A4 is the only common size and no one has heard of Letter, and most people don’t realise their prints are cropped/resized. This would still apply when printing to PDF.

              • Diplomjodler@lemmy.world
                5 months ago

                My point is that all these things can be controlled in the settings of your PDF printer driver. So it’s not completely straightforward but definitely doable.
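
                As one possible example of such settings, here is a sketch that assumes Ghostscript's `pdfwrite` device stands in for the PDF printer driver (the thread does not say which driver is used). `build_gs_command` is a hypothetical helper; it only assembles the command line and does not run it, and the exact effect of the flags may vary between Ghostscript versions.

```python
def build_gs_command(src, dst, paper="a4"):
    """Assemble a Ghostscript call that rewrites a PDF with explicit settings.

    Assumption: Ghostscript is installed as `gs`. Only colour-image options
    are shown; grey and mono images have their own parallel options.
    """
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dNOPAUSE",
        "-dBATCH",
        # Paper size: force the chosen media and scale pages to fit it.
        f"-sPAPERSIZE={paper}",
        "-dFIXEDMEDIA",
        "-dPDFFitPage",
        # Images: no downsampling, and lossless Flate instead of JPEG.
        "-dDownsampleColorImages=false",
        "-dAutoFilterColorImages=false",
        "-dColorImageFilter=/FlateEncode",
        f"-sOutputFile={dst}",
        src,
    ]


cmd = build_gs_command("in.pdf", "out.pdf")
print(" ".join(cmd))
```

                To actually run it you would pass `cmd` to `subprocess.run`, after checking the options against the documentation for your installed version.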

      • Turun@feddit.de
        5 months ago

        I don’t understand the “that’s not how PDFs work” criticism.

        Removing data from the original file is the whole point of the exercise! Of course unique tokens can be hidden in plain sight in images, letter spacing, etc. If we want to make sure to remove that we need to degrade the quality of the PDF so that this information is lost in said lossy conversion.
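
        A minimal sketch of the letter-spacing case, under a made-up watermark scheme: each hidden bit nudges one gap slightly wider or narrower. The scheme, the numbers and the function names are all assumptions for illustration, not how any real publisher marks files.

```python
def embed_bits(base_spacing, bits, delta=0.02):
    """Hypothetical watermark: widen the gap for a 1, narrow it for a 0."""
    return [base_spacing + (delta if b else -delta) for b in bits]


def extract_bits(spacings, base_spacing):
    """Read the watermark back by comparing each gap to the nominal one."""
    return [1 if s > base_spacing else 0 for s in spacings]


def quantize(spacings, step=0.25):
    """Lossy step: snap every gap to a coarse grid."""
    return [round(s / step) * step for s in spacings]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_bits(0.5, bits)

recovered = extract_bits(marked, 0.5)             # watermark readable
cleaned = quantize(marked)                        # all gaps collapse to 0.5
after_cleaning = extract_bits(cleaned, 0.5)       # watermark gone

print("before:", recovered)
print("after: ", after_cleaning)
```

        The cleaned text looks the same to a reader, but the per-gap differences that carried the token no longer exist. That is exactly the kind of deliberate degradation being argued for here.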