Author Topic: the ethics of archiving  (Read 120 times)
brightbluebug
« on: May 11, 2025 @382.05 »

thinking about born-digital archival (or archival in general), and the conflict between the wants and wishes of the original creator and preservation.

this thread from last year: https://forum.melonland.net/index.php?topic=3586.0 got me thinking about it, but what mostly prompted this thread was this: https://huggingface.co/datasets/nyuuzyou/archiveofourown/discussions/3

basically someone scraped ao3 and made a huge dataset of fics, then posted it on huggingface (sort of a gen-ai enthusiast community, from what i can tell), so with the intent of it serving as training data for generative ai. this was done without the consent of any of the authors or ao3, & is questionably legal. it was posted about on social media (reddit..?); this angered a lot of fic writers, for obvious (and i personally think rightful) reasons.

some went to the huggingface thread and there ended up being some interesting discussion (along with, as is par for the course, death threats): the ethics of ai, whether training it on data is the same as human learning, etc. the thread itself was started by someone asking for the dataset somewhere else, as they wanted it partially as an offline archive of the site; one reply to it reads,
Quote
Hey! If you want to archive fic or artwork from AO3, do it yourself instead of supporting a disgusting thief.


which kind of encapsulates what the big question is: is it ethical to archive things, no matter the means or the consent of their creator? if something is made without the intent of archival and rather for something like profit, but fulfills the same ends, should it still be evaluated ethically the same way?

i'm not sure where i stand on the creator consent vs archival thing, but i'm leaning towards the former; art is so creator-centered, and so intertwined with its creator, that i think creators deserve that respect regarding their work. i would like to know all of your thoughts :smile:

a few other thoughts about this:
-if individual/local archives exist, is it necessary for larger/collective archives to be created of those?
-there's a pretty clear clash of internet subcultures on here. it's interesting
-is there necessity to archive archives?
-should certain forms of archival be more accepted than others?
« Last Edit: May 11, 2025 @756.30 by brightbluebug »
nlolnlolnlo
« Reply #1 on: May 11, 2025 @530.51 »

not in a state of mind to provide any sort of insight but just want to say this is something i've considered a lot lately, especially in the way that archival overlaps with piracy. intrigued to see what discussions this brings forward

nobo
« Reply #2 on: May 11, 2025 @579.44 »

My viewpoint might be a bit old fashioned, but I don't see the Internet as a marketplace but as a public information resource. Things that are put on the Internet are easy to back up and make copies of by design. It's a feature, not a bug.

For a long time now, people have realized the Internet could be transformed into a marketplace, and they now expect everyone to retroactively treat it like one. For example, uploading an image to the Internet and saying, "please don't archive this" would be like if Walmart shipped a pallet load of groceries to the park and just left it on the basketball court with a sign that said, "Please don't touch."

I do understand the frustration, because every time someone employs a scraper for public good (read: Aaron Swartz), they get the book thrown at them legally, and every time someone uses one for personal profit, nothing is done. It is a bullshit double standard. And it's completely unfair, so I empathize.

_ghost_
« Reply #3 on: May 11, 2025 @610.17 »

The thing about generative AI, in my opinion, can be summed up with two specific problems.

The first is that the vast majority of datasets, especially ones meant for widespread, public usage, just take indiscriminately from large chunks of the internet with no real care or attention to what that material is. Many kinds of AI programs, including gen-AI and algorithms meant to identify and categorize photos, are extremely biased and sometimes trained on data that constitutes an actual privacy issue. Any photo of you that has been uploaded to a server anywhere has the potential to have ended up in the specific part of the web any given AI crawler is scraping. Personal records get leaked online all the time. It's not impossible, and is in fact likely, that personal records end up in AI datasets eventually. This is an internet privacy issue generally, and also a consent issue; being able to opt in to allowing your photos or records to be used for this should be fully up to you, and the lack of transparency is gross.

The second is just the way these programs are pushed into businesses in place of actual human workers when the technology is fully just Not There Yet to support full automation, but it's there Enough that a human is only needed to fix its mistakes. In the field of translation, for example, machine translation has existed for years and is now basically just being rebranded as "AI" when there's not been much of a change in how it works, just how well it works. Someone who would have been hired to do a full translation ten years ago might now be hired to "Edit" an existing translation done by AI that is filled with so many errors they have to start from scratch, and get paid less since they aren't being hired to actually do the translating, even if they still are doing it. It's doing a lot to devalue the worker and add meaningless steps to their job. AI can only do so many things. This isn't even getting into things that become fully meaningless when done by AI (e.g. people using it to write university papers, when the entire point of writing university papers is proving that you know how to research information or understand a text and build an argument using that info).

So the long and short of it is that no, I don't find archiving unethical, and generally speaking even AI-crawling is, conceptually, neutral. The issues with the latter are because of the execution, not the idea. Some people act like archiving someone's online work is a huge deal because they didn't give permission, but by putting it on the internet, you have given other people permission to see and download it. You have actively put it out of your control. I think it's polite to remove something from a web archive if the owner/creator asks, especially if the archived page contains personal information, but archival work is always going to include things that people don't want saved or didn't even consider would be saved. Think of how many artists and writers we don't know the names of because their work was just not preserved. Surely some of them didn't want to be remembered, anyway, but what they would or wouldn't have wanted isn't really relevant anymore.

Also I just don't believe in, like, "owning" an idea or whatever; not crediting sources is unethical but I fully think a copyright system that makes it illegal for two people to write two books that are deemed too similar is stupid, and a lot of people's complaints about gen-AI or web archives are basically just regurgitating copyright law as if it's a morality thing and not a law designed to govern how businesses maintain "ownership" of an idea or image.
crazyroostereye
« Reply #4 on: May 11, 2025 @860.17 »

Similar to what the others already said: putting something on the Internet is inherently releasing that information to be taken by anyone. Respecting copyright is still important, and copyright serves a purpose, though I agree that most implementations of copyright are broken. I believe that a person who came up with an idea should have a head start to market it and to garner fame and fortune with it. But it is only a head start, not a practically permanent right to it.

Archiving and preservation in particular hold a high societal value to me anyway, one where neither law nor the artist's will has a right to prevent the preservation. That goes for physical and digital mediums alike. Distribution of the archive can be limited or even restricted, especially in the case of copyrighted work, but archives should still reserve the right to maintain a copy, even against the will of the artist and copyright holder.