See linked posting. I’ve commented there with a link to a CLI tool in Python that allows downloading of IA collections. I’ve submitted a patch to enable specifying start and end points so that it’s easier to resume downloading a huge collection, or to allow multiple people to split up the work.

https://archive.org/details/georgeblood

https://archive.org/details/78rpm_bowling_green

F*ck the RIAA and absurdly long copyright.


EDIT: There is more than one collection of 78s on IA, so I updated the title.


The issue with these collections is that they’re absolutely HUGE. And yes, IA offers torrents for them, but as a separate torrent for every. single. album. And the torrents have all the data in them – FLAC, fixed-rate MP3, VBR MP3, PDF liner notes, etc. There may be some extremely hardcore data-hoarders out there who want everything, but IMHO, as these are scratchy old 78 records, FLAC is overkill if you just want the audio in a listenable format. The George Blood collection, just the VBR MP3s, is looking to be about 6TB. With ALL data it might be over 40TB! I can’t afford that many hard drives :)


So, my approach at the moment is to save just the VBR MP3s (they seem to be done at up to 320kbps VBR) and the JPEG album cover. If I have a chance and any storage left afterwards, I can make a separate pass to get the album liner PDFs…


Tool used: https://github.com/jjjake/internetarchive


Patch to allow setting start and end item indices for downloads: https://github.com/jjjake/internetarchive/pull/605


Example usage to grab just the VBR MP3 and record label JPG for each (note the --start-idx and --end-idx arguments):

ia download --start-idx=4001 --end-idx=8000 -a -i --format="VBR MP3" --format="JPEG" --search collection:georgeblood

I’m going to concentrate on the George Blood collection for now… I’m starting at item 1. It would be great if others started at index 50,000, 100,000, 150,000, … and others started at the end and worked backwards in similarly-sized chunks, so that it’s assured someone gets each of them.
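If it helps with coordinating, here is a rough Python sketch for carving a collection into non-overlapping index ranges and printing the matching command for each. It only builds strings and does not call the tool. Assumptions on my part: the indices are 1-based and inclusive (as in the 4001 to 8000 example above), the `--start-idx`/`--end-idx` flags are the ones from the patch, and `TOTAL_ITEMS` is a made-up placeholder, not the real size of the collection. The helper names `chunk_ranges` and `build_command` are purely illustrative.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_items: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) pairs, 1-based and inclusive, covering 1..total_items."""
    if total_items < 0 or chunk_size <= 0:
        raise ValueError("total_items must be >= 0 and chunk_size must be > 0")
    start = 1
    while start <= total_items:
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size - 1, total_items)
        yield start, end
        start = end + 1


def build_command(start: int, end: int, collection: str = "georgeblood") -> str:
    """Build the download command from the post for one index range."""
    return (
        f"ia download --start-idx={start} --end-idx={end} -a -i "
        f'--format="VBR MP3" --format="JPEG" --search collection:{collection}'
    )


if __name__ == "__main__":
    TOTAL_ITEMS = 400_000  # hypothetical placeholder, NOT the real item count
    CHUNK_SIZE = 50_000    # chunk size suggested in the post
    for start, end in chunk_ranges(TOTAL_ITEMS, CHUNK_SIZE):
        print(build_command(start, end))
```

Each volunteer would then claim one printed line. Anyone working backwards from the end can simply take the lines in reverse order.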

  • Haui · 10 months ago · 40 up, 1 down

    Probably stating the obvious, but “are in no threat of being deleted” is an absolute joke.

    A company holding the IP can just make it unavailable tomorrow. A big chunk of us are here because reddit is somehow allowed to delete our posts because the law is idiotic. At least European people are allowed to get their data, but the cooperative works of thousands of people are threatened by those laws.

    The concept of IP needs to be reformed.

    • Arghblarg@lemmy.ca (OP) · 10 months ago · 20 up, 1 down

      Yeah. And whenever anyone says “Oh the music companies would never let these old recordings die, it’s their bread and butter!” I give them this story.

      We cannot trust our cultural heritage to any one entity.

      • Haui · 10 months ago · 3 up

        Oof. I just read this. It’s pretty brutal.

    • As concrete examples, try to get a copy of Disney’s 1946 movie, “Song of the South.” It’s been removed from circulation because of its whitewashed presentation of “happy slaves.” Similarly, 6 of Dr. Seuss’ books, including “And to Think That I Saw It on Mulberry Street” were withdrawn because of racial imagery (the mentioned book had a “Chinaman” drawn with a WWII stereotype style - rice hat, sloping eyes, buck teeth).

      There’s media you simply can’t get anymore.

      • Haui · 10 months ago · 8 up

        Our culture has been copyrighted.

        • In this case, the media was withdrawn for (arguably) good reasons: the representations were deemed hurtful or harmful.

          Good reasons or bad, they still stand as stark examples of how media can disappear at the whims of a single organization.

          • Haui · 10 months ago · 3 up

            Yes and it’s horrific

          • GnuLinuxDude@lemmy.ml · 10 months ago (edited) · 2 up

            Song of the South does whitewash being black in the USA, but it is set in post-civil war America, so superficially it does not need to handle the slavery topic, which can be dismissed as having been dealt with already.

    • WarmSoda@lemm.ee · 10 months ago (edited) · 7 up, 11 down

      Did you think your posts on Reddit were protected by copyright laws or something?

      Are you seriously comparing posts on a forum to music rights?

      • Haui · 10 months ago · 5 up, 1 down

        What exactly are you trying to convey? That these “works” made by ordinary people who have only a basic understanding of copyright law should be deleted if someone feels like it? That the law is more important than justice?

        Also, do you really think you’re cool by implying things phrased as a question? Won’t you just talk like a normal person and state your opinion instead of fake-calling-out others?

        • WarmSoda@lemm.ee · 10 months ago (edited) · 1 up, 3 down

          Posts you make on a forum are not “works” that are copyrightable. Deleting a post is not an injustice.

          A sentence that ends with a question mark is asking a question. When someone asks a question, the normal response is to provide an answer to that question.

          But you’re just being an asshole. You know exactly what I’m saying, and you know you’re saying ridiculous things, so your only response is to not answer either of the two questions and then try to twist it.

          • Arghblarg@lemmy.ca (OP) · 10 months ago (edited) · 3 up, 1 down

            Posts you make on a forum are not “works” that are copyrightable.

            That may depend on the platform – Slashdot (remember that site?) once upon a time had a footer on their pages stating “All posts belong to their authors”. There were a few big debates about whether that was legally enforceable. Hmm. I wonder if there ever was a legal ruling on that.

            I notice today their site does not have such a disclaimer. Probably disappeared long ago, due to one of their many corporate buyouts.

          • Haui · 10 months ago · 3 up, 1 down

            I’m glad you saw your mistake. Have a good one.

              • Haui · 10 months ago · 1 up, 1 down

                Does this happen to you often? Maybe rethink your approach in discussions.

                  • Haui · 10 months ago · 1 up

                    Or you can double down and blame others for your behavior, sure.