Data Poisoning could be a tool we use to identify AI that has used copyrighted material

ekZepp@lemmy.world · 1 day ago

Data Poisoning could be a tool we use to identify AI that has used copyrighted material

Arthur Besse@lemmy.ml · 1 day ago

identify AI that has used copyrighted material

but, that is basically all modern “AI”.

(the only LLM i’ve heard of which actually claims that its training corpus is freely licensed is Apertus…)

youcantreadthis@quokk.au · edit-2 1 day ago

We callin it Plagiarized Information Stochastic Stupidity now. The only PISS you’ve heard of

Hackworth@piefed.ca · 1 day ago

Adobe claims to only train their image generator, Firefly, on images from their stock library.

sidefaceturdtalker@leminal.space · 20 hours ago

One way to push back for sure, but we need to refresh the tree of liberty ASAP

ZDL@lazysoci.al · 7 hours ago

Interestingly, literally zero of the people I’ve seen who word things this way ever seem to volunteer to be the ones doing the watering. Are you going to break the losing streak or are you going to continue confirming my belief that it’s only chicken hawks who say this?

very_well_lost@lemmy.world · 1 day ago

People have actually been doing this to catch plagiarism for centuries, long before LLMs were a thing.

See trap streets for one of the better known examples.

Kairos@lemmy.today · 18 hours ago

I learned about that from Doctor Who!

ExtremeDullard@piefed.social · edit-2 13 hours ago

This idea is as old as books.
