Merry Christmas 2023 everyone!

Discussion regarding the Outsider webcomic, science, technology and science fiction.

Moderator: Outsider Moderators

QuakeIV
Posts: 210
Joined: Fri Jul 24, 2020 6:49 pm

Re: Merry Christmas 2023 everyone!

Post by QuakeIV »

I'd kind of prefer people just outlaw AI or something, instead of doing this nonsense where they try to explain how this was always not allowed. No, this was always not possible, so there was no reason to think about whether it's allowed or not.

User avatar
Arioch
Site Admin
Posts: 4497
Joined: Sat Mar 05, 2011 4:19 am
Location: San Jose, CA
Contact:

Re: Merry Christmas 2023 everyone!

Post by Arioch »

I don't see how it would be possible to "outlaw AI"... I don't think it's practically possible to outlaw technology itself. The practice of using copyrighted data for AI training without the permission of the copyright holders, however, is something that could be regulated, and I think it's perfectly reasonable that artists, writers and other content creators should be allowed to control whether or not their creations are used in this way. However, the law is not clearly settled on this matter. There are currently several lawsuits in progress against OpenAI and others over copyright infringement, and what is or is not allowed with this new technology will be settled, eventually.

Although I expect that if and when rules are put in place to clarify the copyright laws regarding the use of data for AI training, there will always be foreign countries that do not respect copyright law and will continue to do whatever they want with data available on the internet, regardless of what rules are settled on.

Krulle
Posts: 1414
Joined: Wed May 20, 2015 9:14 am

Re: Merry Christmas 2023 everyone!

Post by Krulle »

Indeed.
See copyright laws, and how companies have moved around them.

But still, for "Western democracies" I do expect a solution similar to the one found for news abstracts taken from news sites by news aggregators...
So I expect that OpenAI and others will have to pay a share to the relevant collecting societies for collective rights management.
Or they will have to publish their training materials...
Which is extremely sensitive for their business cases.



BTW: still an enjoyable find. Thanks for sharing.
And a (belated) happy new year to you all as well!
Vote for Outsider on TWC: Image
charred steppes, borders of territories: page 59,
jump-map of local stars: page 121, larger map in Loroi: page 118,
System view Leido Crossroads: page 123, after the battle page 195

QuakeIV
Posts: 210
Joined: Fri Jul 24, 2020 6:49 pm

Re: Merry Christmas 2023 everyone!

Post by QuakeIV »

There isn't going to be any way to dissect a model and know what its contents are based on, so it's an unenforceable law.

User avatar
Arioch
Site Admin
Posts: 4497
Joined: Sat Mar 05, 2011 4:19 am
Location: San Jose, CA
Contact:

Re: Merry Christmas 2023 everyone!

Post by Arioch »

QuakeIV wrote:
Mon Jan 08, 2024 4:16 pm
There isn't going to be any way to dissect a model and know what its contents are based on, so it's an unenforceable law.
If your generator is able to produce recognizable facsimiles of the work of a particular content creator when that creator's name is included in a prompt, then it's kind of hard to claim that you didn't use that creator's work to train your generator.
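(A rough sketch of what such a probe could look like in Python - the generate step, the reference images, and the distance threshold are all placeholders and assumptions for illustration, not any particular vendor's API:)

    # Sketch: probe a generator with an artist's name in the prompt and
    # compare its outputs against known works using a perceptual hash.
    # 'generated_file' comes from whatever generator is under test; the
    # reference images and the distance threshold are assumptions.
    from PIL import Image
    import imagehash

    def resembles_known_work(generated_file, reference_files, max_distance=8):
        gen_hash = imagehash.phash(Image.open(generated_file))
        for ref in reference_files:
            if gen_hash - imagehash.phash(Image.open(ref)) <= max_distance:
                return ref  # near-duplicate of a known piece
        return None

    # e.g. run prompts like "barbarian warrior by Frank Frazetta" through
    # the generator, then check each output against a folder of known pieces.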

QuakeIV
Posts: 210
Joined: Fri Jul 24, 2020 6:49 pm

Re: Merry Christmas 2023 everyone!

Post by QuakeIV »

Yeah, good point. You'd need to do a fair bit of extra work to break those relationships in the model, and a court could conceivably do as you said. It might make it not worth it to use information illegally, since these businesses are de facto trusting search-engine correlations as a huge shortcut. If they then have to manually audit that data to eliminate incriminating correlations, there is a lot less benefit.

TBH it might result in a degree of lawfare, with people trying to trick an AI into producing something subjectively similar despite it not actually having used the information in question. I actually kind of side with the 'outlaw AI' crowd on this, so that might be fine insofar as it could kind of screw them over. Running the models is easy enough, but training them is a huge, cumbersome operation with a lot of server assets that would be fairly easy to sue out of existence. Existing models would still exist and be fairly easy to run on a consumer-grade GPU (except insofar as many are actually kept proprietary and secret), but it would fairly quickly become not worth it to make new ones.

User avatar
Arioch
Site Admin
Posts: 4497
Joined: Sat Mar 05, 2011 4:19 am
Location: San Jose, CA
Contact:

Re: Merry Christmas 2023 everyone!

Post by Arioch »

The point is that if a generator appears to be using infringing material, the offended party can file a lawsuit in which they could subpoena the AI company records and code used for training purposes. All this would be difficult and expensive, but it's possible... perhaps not for an individual, but for class action lawsuits and large plaintiffs like the New York Times (which is currently suing OpenAI). Some of these plaintiffs are large corporations with deeper pockets than the AI company defendants. So an AI company would be taking a substantial risk by flouting the law... if indeed that's how the legal rulings go... which at present is by no means certain. But I think it's very likely that either court rulings or new laws will place limitations on what copyrighted work can be used without permission in AI training.

Between this and the issue of personal data privacy (which is, in some cases, related), the legal system is going to be very busy over the next decade or so sorting things out. Until that happens, it's hard to really say what will be allowed and what will not.

User avatar
Urist
Posts: 50
Joined: Tue Nov 14, 2023 2:41 am

Re: Merry Christmas 2023 everyone!

Post by Urist »

My two cents say that there's eventually going to be a market for 'limited-scope' algorithmic art. As in, models that have been carefully and specifically trained *only* on certain sets of available artwork, with extensive paper trails to prove it. Anything from art that's in the public domain to art where the artist has specifically agreed to have that work added to the training pool.
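(A minimal sketch of what that paper trail could look like in code - the field names and license labels below are purely illustrative assumptions, not any real dataset schema:)

    # Sketch: a provenance-tracked training pool - only items whose license
    # allows training and which carry a documented consent record are kept.
    ALLOWED_LICENSES = {"public-domain", "cc0", "artist-opt-in"}

    def build_training_pool(records):
        pool = []
        for record in records:
            if record.get("license") in ALLOWED_LICENSES and record.get("consent_doc"):
                pool.append(record)  # consent_doc is the paper trail for this item
        return pool

    # Example record:
    # {"image": "forest.png", "license": "artist-opt-in",
    #  "consent_doc": "agreements/2024-01-some_artist.pdf"}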

Frankly, the demand for "AI" (and as a roboticist, I do so *hate* that term!) art is so strong that I don't ever see the technology being suppressed.
Barrai Arrir

Krulle
Posts: 1414
Joined: Wed May 20, 2015 9:14 am

Re: Merry Christmas 2023 everyone!

Post by Krulle »

It is impossible.

And to be honest, most "AI" models are very limited in scope anyway, and are usually specifically trained on company-owned data to do company-relevant modelling.

General AI models are more problematic, legally.

But general AI models are not relevant for most applications, as their error margins are far too large to allow "generative content creation"...
General AI models are just the most obvious demonstration of how far generative AI models have come in being able to fake their "intelligence"...


We'll likely never have "one AI" controlling most aspects of airport systems, but rather one for the heating and ventilation of the buildings, one for security (in general), one for evaluating security risks from the security scanners (possibly feeding its output into the general security AI), one for crowd-flow control, one for luggage routing, ...
They are simply easier and faster to train, as well as easier to improve within their segment.

Heck, even my employer uses at least five different AI algorithms whose output I rely on when I use the company software tools.
And each one of them is very specifically tuned for one task, and that task only.
(And one could argue that for the automatic translations, that's not one single AI but many AIs, a different one for each language-to-language pair.)
And that disregards the likelihood that our security cameras react automatically to people moving around in areas where they should not be, which is probably AI-controlled as well.
(So far, except for the language translation models, we don't use generative AI (yet) - we're too afraid of the issues of generative AI "inventing" sources, like the court cases it has invented in the US...)
Vote for Outsider on TWC: Image
charred steppes, borders of territories: page 59,
jump-map of local stars: page 121, larger map in Loroi: page 118,
System view Leido Crossroads: page 123, after the battle page 195

Demarquis
Posts: 437
Joined: Mon Aug 16, 2021 9:03 pm

Re: Merry Christmas 2023 everyone!

Post by Demarquis »

Wouldn't it be easier for AI companies to place restrictions on their LLMs' output, rather than the training input? It's not copyright infringement if the customer can't get to see it.

User avatar
Arioch
Site Admin
Posts: 4497
Joined: Sat Mar 05, 2011 4:19 am
Location: San Jose, CA
Contact:

Re: Merry Christmas 2023 everyone!

Post by Arioch »

Demarquis wrote:
Thu Jan 11, 2024 3:06 am
Wouldn't it be easier for AI companies to place restrictions on their LLMs' output, rather than the training input? It's not copyright infringement if the customer can't get to see it.
Band-aids like banning certain keywords are pretty easy to work around, and the content is going to leak out even if it's not explicitly named in the prompt. But... if you're going to ban a keyword in the output, why would you allow it in training? If you have a bunch of source images that are associated with the prompt "Frank Frazetta," that means you knew before you used them that they were Frazetta's work. It's kind of hard to claim ignorance. Aside from that, the AI companies have deals with sites like DeviantArt and ArtStation; they know perfectly well when the images they get from these sites are copyrighted.
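(For what it's worth, the input-side filter is trivial compared to policing output. A minimal sketch, assuming each training item carries a caption and tags - which is itself an assumption about how the data is labeled:)

    # Sketch: drop any training item whose caption or tags already name the
    # artist - which is exactly the association the generator would learn.
    def strip_artist(dataset, artist="frank frazetta"):
        kept = []
        for item in dataset:
            text = (item.get("caption", "") + " " + " ".join(item.get("tags", []))).lower()
            if artist not in text:
                kept.append(item)
        return kept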

User avatar
Urist
Posts: 50
Joined: Tue Nov 14, 2023 2:41 am

Re: Merry Christmas 2023 everyone!

Post by Urist »

Demarquis wrote:
Thu Jan 11, 2024 3:06 am
Wouldn't it be easier for AI companies to place restrictions on their LLMs' output, rather than the training input? It's not copyright infringement if the customer can't get to see it.
It's also a lot harder to control the output than the input. To use Arioch's example, try writing rules or a program that can "take this set of images and highlight any of them that show they were inspired by any art produced by Frank Frazetta or use his style." Turns out that's pretty difficult, compared to just "make sure that no art by Frank Frazetta is in this set of training data."

Another thing to keep in mind is that if (as seems likely) LLM companies are going to be held liable for copyright infringements, then the company needs to be able to prove that Frank Frazetta's art was in no way used to generate their images. That's a lot easier to do if you have filtered your input data, rather than hoping that your output restrictions caught *every* output that might have used some of his art/style.

(Essentially, it *is* copyright infringement if the customer *thinks* they see it and you can't prove them wrong.)
Barrai Arrir

Overkill Engine
Posts: 135
Joined: Tue May 17, 2011 9:51 pm

Re: Merry Christmas 2023 everyone!

Post by Overkill Engine »

Demarquis wrote:
Thu Jan 11, 2024 3:06 am
Wouldn't it be easier for AI companies to place restrictions on their LLMs' output, rather than the training input? It's not copyright infringement if the customer can't get to see it.
In addition to the points that others brought up, that would be neutering one of the SELLING FEATURES of the software: the ability to replicate a requested style.

They are going to want to lease this software to as many people as possible. That is only going to be possible if it already comes with as much learned input as possible, because the average consumer isn't going to want to do the training themselves. The average human is fucking lazy and can't even be bothered to get off their butt to change the channel if they can't find the remote.

Which leads to two primary outcomes:

They either need to draw out litigation to stave off the government redefining IP/copyrights/etc. to require an artist's permission as a training source, or at least drag it out long enough to make a bunch of lucre and bail for something else.

OR

They need to do what they should have done from the get-go and obtain those permissions.

Demarquis
Posts: 437
Joined: Mon Aug 16, 2021 9:03 pm

Re: Merry Christmas 2023 everyone!

Post by Demarquis »

Ok, I admit that I am a little confused. My understanding was that copying an artist's style isn't copyright infringement, regardless of who or what is doing it (provided, of course, that no one claims that the new image is an original). How would you police style anyway? What counts as close enough or not? What I was talking about was reproducing an actual original image, or an element directly copied from an original image (like Mickey Mouse), and reselling that without a license. Again, what I believe is the case is that producing an entirely different cartoon animal in the Disney style isn't copyright infringement. Am I wrong?

""...take this set of images and highlight any of them that show that they were inspired by any art produced by Frank Frazetta or use his style." Turns out that that's pretty difficult, compared to just "Make sure that no art by Frank Frazetta is in this set of training data."

Right, that's what I'm saying, it isn't clear to me that art which is inspired by Frazetta or his style is copyright infringement. If it isn't, then I'm not sure what the argument is about.

"Essentially, it *is* copyright infringement if the customer *thinks* they see it and you can't prove them wrong."

I'm trying to parse this and I can't. How do you police what a customer sees? How many customers must agree about something to make it legally official? I'm sure some customer somewhere thinks Arioch is ripping off Krazy Kat, but that doesn't mean he is.

Compare all this to a human artist, who learns by studying previous artists and their styles, and then produces their own original art based on what they have learned. Now, granted, a human artist who didn't then go on to develop their own style would be considered a bad artist, but I doubt anyone would sue them for it.

QuakeIV
Posts: 210
Joined: Fri Jul 24, 2020 6:49 pm

Re: Merry Christmas 2023 everyone!

Post by QuakeIV »

Yeah, that's kind of what I meant by 'lawfare': you could probably in many cases trick an AI into producing something subjectively similar and get the company that made it, or provided access to it, in trouble. It would be a fairly draconian thing to make it a question of whether the court thought the output looked similar to something it shouldn't have been using.

Krulle
Posts: 1414
Joined: Wed May 20, 2015 9:14 am

Re: Merry Christmas 2023 everyone!

Post by Krulle »

QuakeIV wrote:
Sun Jan 07, 2024 2:22 am
[...] You can retain a lawfully obtained copy of something for your own use as long as you don't redistribute it. [...]
That applies to non-commercial, private uses, but an AI company selling prompt results is neither non-commercial nor a private entity.
Vote for Outsider on TWC: Image
charred steppes, borders of territories: page 59,
jump-map of local stars: page 121, larger map in Loroi: page 118,
System view Leido Crossroads: page 123, after the battle page 195

QuakeIV
Posts: 210
Joined: Fri Jul 24, 2020 6:49 pm

Re: Merry Christmas 2023 everyone!

Post by QuakeIV »

The reason I take issue with that interpretation is that people look at each other's art and draw stylistic 'inspiration' all the time, which is apparently legal. In other words, analyzing the artwork, whether in public or in the privacy of your own home, and using your brain to make decisions about what aspects of it you might like to use yourself, is completely lawful behavior.

User avatar
Arioch
Site Admin
Posts: 4497
Joined: Sat Mar 05, 2011 4:19 am
Location: San Jose, CA
Contact:

Re: Merry Christmas 2023 everyone!

Post by Arioch »

Well, human artists can and do sue other human artists for 'drawing inspiration' from their work if it's noticeable and demonstrable. But generative AI is not a human artist and does not have a brain or make decisions, so even if something is legal for a human artist to do, it does not necessarily follow that it is legal for a corporation's software to do. Hence the current lawsuits.

QuakeIV
Posts: 210
Joined: Fri Jul 24, 2020 6:49 pm

Re: Merry Christmas 2023 everyone!

Post by QuakeIV »

To be honest, I think it's likely a distinction without a clear and technically definable difference, and I don't really like the implications of 'it's cool as long as neurons do it'; we have already had various animals' neurons isolated in a dish and used for various purposes.

User avatar
Arioch
Site Admin
Posts: 4497
Joined: Sat Mar 05, 2011 4:19 am
Location: San Jose, CA
Contact:

Re: Merry Christmas 2023 everyone!

Post by Arioch »

QuakeIV wrote:
Tue Jan 23, 2024 7:12 pm
To be honest, I think it's likely a distinction without a clear and technically definable difference, and I don't really like the implications of 'it's cool as long as neurons do it'; we have already had various animals' neurons isolated in a dish and used for various purposes.
Humans have some legal rights which software (and animals) do not. Whether that should be the case is open to debate, but currently that is the law.
