July 24, 2024
Nightshade, the tool that ‘poisons’ data, gives artists a fighting chance against AI

Intentionally poisoning someone else is never morally right. But if someone in the office keeps swiping your lunch, wouldn’t you resort to petty vengeance?

For artists, protecting work from being used to train AI models without consent is an uphill battle. Opt-out requests and do-not-scrape codes rely on AI companies to engage in good faith, but those motivated by profit over privacy can easily disregard such measures. Sequestering themselves offline isn’t an option for most artists, who rely on social media exposure for commissions and other work opportunities. 
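To make the good-faith problem concrete, here is a minimal sketch. It assumes the "do-not-scrape codes" work like robots.txt rules; the crawler name `ExampleAIBot` and the `artist.example` URL are hypothetical. It uses Python's standard `urllib.robotparser` to show what a well-behaved crawler would check before fetching an image.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt an artist might publish: one AI crawler is
# asked to stay out, everyone else is allowed in.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://artist.example/gallery/painting.png"

# A compliant crawler asks this question before downloading anything.
ai_allowed = parser.can_fetch("ExampleAIBot", url)
other_allowed = parser.can_fetch("SomeBrowser", url)

print("ExampleAIBot may fetch:", ai_allowed)
print("Other agents may fetch:", other_allowed)
```

The check is entirely voluntary. Nothing on the server enforces it, so a scraper that never calls `can_fetch` can download the image anyway.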

Nightshade, a project from the University of Chicago, gives artists some recourse by “poisoning” image data, rendering it useless or disruptive to AI model training. Ben Zhao, a computer science professor who led the project, compared Nightshade to “putting hot sauce in your lunch so it doesn’t get stolen from the workplace fridge.” 

“We’re showing the fact that generative models in general, no pun intended, are just models. Nightshade itself is not meant as an end-all, extremely powerful weapon to kill these companies,” Zhao said. “Nightshade shows that these models are vulnerable and there are ways to attack. What it means is that there are ways for content owners to provide harder returns than writing Congress or complaining via email or social media.” 

Zhao and his team aren’t trying to take down Big AI — they’re just trying to force tech giants to pay for licensed work, instead of training AI models on scraped images. 

“There is a right way of doing this,” he continued. “The real issue here is about consent, is about compensation. We are just giving content creators a way to push back against unauthorized training.” 

Left: The Mona Lisa, unaltered.
Middle: The Mona Lisa, after Nightshade.
Right: How AI “sees” the shaded version of the Mona Lisa.

Nightshade targets the associations between text prompts and images, subtly changing the pixels in images to trick AI models into perceiving an image completely different from the one a human viewer sees. Models will incorrectly categorize features of “shaded” images, and if they’re trained on a sufficient amount of “poisoned” data, they’ll start to generate images completely unrelated to the corresponding prompts. It can take fewer than 100 “poisoned” samples to corrupt a Stable Diffusion prompt, the researchers write in a technical paper currently under peer review.
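The idea of a "subtle" pixel change can be illustrated with a rough sketch. This is not Nightshade's algorithm: the shifts below are random placeholders, whereas a real poisoning tool would compute them deliberately to mislead a model. The function names, the tiny 4x4 grayscale image and the `epsilon` budget are all assumptions made for illustration. The sketch only shows what it means to cap every pixel change at a small value so the edit stays hard to see.

```python
import random


def perturb(image, epsilon, seed=0):
    """Return a copy of `image` with each pixel shifted by at most `epsilon`.

    `image` is a list of rows of grayscale values in 0..255. The shifts are
    random placeholders; a real tool would optimize them against a model.
    """
    rng = random.Random(seed)
    return [
        # Clamp to the valid 0..255 range after applying the bounded shift.
        [min(255, max(0, px + rng.randint(-epsilon, epsilon))) for px in row]
        for row in image
    ]


def max_pixel_change(original, altered):
    """Largest absolute per-pixel difference between two same-sized images."""
    return max(
        abs(a - b)
        for row_a, row_b in zip(original, altered)
        for a, b in zip(row_a, row_b)
    )


# A hypothetical 4x4 grayscale image with values from 0 to 150.
image = [[10 * (r * 4 + c) for c in range(4)] for r in range(4)]
shaded = perturb(image, epsilon=4)

# Every pixel stays within the budget, so the change is visually minor.
print(max_pixel_change(image, shaded) <= 4)
```

A smaller `epsilon` loosely corresponds to a lower-intensity setting: less visible change, at the likely cost of a weaker effect.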

Take, for example, a painting of a cow lounging in a meadow.

“By manipulating and effectively distorting that association, you can make the models think that cows have four round wheels and a bumper and a trunk,” Zhao told TechCrunch. “And when they are prompted to produce a cow, they will produce a large Ford truck instead of a cow.”

The Nightshade team provided other examples, too. An unaltered image of the Mona Lisa and a shaded version are virtually identical to humans, but instead of interpreting the “poisoned” sample as a portrait of a woman, AI will “see” it as a cat wearing a robe. 

Prompting an AI to generate an image of a dog, after the model was trained using shaded images that made it see cats, yields horrifying hybrids that bear no resemblance to either animal. 

AI-generated hybrid animals

It takes fewer than 100 poisoned images to start corrupting prompts.

The effects bleed through to related concepts, the technical paper noted. Shaded samples that corrupted the prompt “fantasy art” also affected prompts for “dragon” and “Michael Whelan,” who is an illustrator specializing in fantasy and sci-fi cover art. 

Zhao also led the team that created Glaze, a cloaking tool that distorts how AI models “see” and determine artistic style, preventing them from imitating artists’ unique work. As with Nightshade, a person might view a “glazed” realistic charcoal portrait, but an AI model will see it as an abstract painting, and then generate messy abstract paintings when it’s prompted to generate fine charcoal portraits.

Speaking to TechCrunch after the tool launched last year, Zhao described Glaze as a technical attack being used as a defense. While Nightshade isn’t an “outright attack,” Zhao told TechCrunch more recently, it’s still taking the offensive against predatory AI companies that disregard opt-outs. OpenAI, one of the companies facing a class action lawsuit for allegedly violating copyright law, now allows artists to opt out of having their work used to train future models.

“The problem with this [opt-out requests] is that it is the softest, squishiest type of request possible. There’s no enforcement, there’s no holding any company to their word,” Zhao said. “There are plenty of companies who are flying below the radar, that are much smaller than OpenAI, and they have no boundaries. They have absolutely no reason to abide by those opt out lists, and they can still take your content and do whatever they wish.” 

Kelly McKernan, an artist who’s part of the class action lawsuit against Stability AI, Midjourney and DeviantArt, posted an example of their shaded and glazed painting on X. The painting depicts a woman tangled in neon veins, as pixelated lookalikes feed off of her. It represents generative AI “cannibalizing the authentic voice of human creatives,” McKernan wrote.

McKernan began scrolling past images with striking similarities to their own paintings in 2022, as AI image generators launched to the public. When they found that over 50 of their pieces had been scraped and used to train AI models, they lost all interest in creating more art, they told TechCrunch. They even found their signature in AI-generated content. Using Nightshade, they said, is a protective measure until adequate regulation exists. 

“It’s like there’s a bad storm outside, and I still have to go to work, so I’m going to protect myself and use a clear umbrella to see where I’m going,” McKernan said. “It’s not convenient and I’m not going to stop the storm, but it’s going to help me get through to whatever the other side looks like. And it sends a message to these companies that just take and take and take, with no repercussions whatsoever, that we will fight back.” 

Most of the alterations that Nightshade makes should be invisible to the human eye, but the team does note that the “shading” is more visible on images with flat colors and smooth backgrounds. The tool, which is free to download, is also available in a low-intensity setting to preserve visual quality. McKernan said that although they could tell that their image was altered after using Glaze and Nightshade, because they’re the artist who painted it, it’s “almost imperceptible.”

Illustrator Christopher Bretz demonstrated Nightshade’s effect on one of his pieces, posting the results on X. Running an image through Nightshade’s lowest and default setting had little impact on the illustration, but changes were obvious at higher settings.

“I have been experimenting with Nightshade all week, and I plan to run any new work and much of my older online portfolio through it,” Bretz told TechCrunch. “I know a number of digital artists that have refrained from putting new art up for some time and I hope this tool will give them the confidence to start sharing again.”

Ideally, artists should use both Glaze and Nightshade before sharing their work online, the team wrote in a blog post. The team is still testing how Glaze and Nightshade interact on the same image, and plans to release an integrated, single tool that does both. In the meantime, they recommend using Nightshade first, and then Glaze to minimize visible effects. The team urges against posting artwork that has only been shaded, not glazed, as Nightshade doesn’t protect artists from mimicry. 

Signatures and watermarks — even those added to an image’s metadata — are “brittle” and can be removed if the image is altered. The changes that Nightshade makes will remain through cropping, compressing, screenshotting or editing, because they modify the pixels that make up an image. Even a photo of a screen displaying a shaded image will be disruptive to model training, Zhao said. 

As generative models become more sophisticated, artists face mounting pressure to protect their work and fight scraping. Steg.AI and Imatag help creators establish ownership of their images by applying watermarks that are imperceptible to the human eye, though neither promises to protect users from unscrupulous scraping. The “No AI” Watermark Generator, released last year, applies watermarks that label human-made work as AI-generated, in hopes that datasets used to train future models will filter out AI-generated images. There’s also Kudurru, a tool from Spawning.ai, which identifies and tracks scrapers’ IP addresses. Website owners can block the flagged IP addresses, or choose to send a different image back, like a middle finger. 

Nightshade’s critics claim that the program is a “virus,” or complain that using it will “hurt the open source community.” In a screenshot posted on Reddit in the months before Nightshade’s release, a Discord user accused Nightshade of “cyber warfare/terrorism.” Another Reddit user who inadvertently went viral on X questioned Nightshade’s legality, comparing it to “hacking a vulnerable computer system to disrupt its operation.”

Believing that Nightshade is illegal because it is “intentionally disrupting the intended purpose” of a generative AI model, as OP states, is absurd. Zhao asserted that Nightshade is perfectly legal. It’s not “magically hopping into model training pipelines and then killing everyone,” Zhao said — the model trainers are voluntarily scraping images, both shaded and not, and AI companies are profiting off of it. 

The ultimate goal of Glaze and Nightshade is to impose an “incremental price” on each piece of data scraped without permission, until training models on unlicensed data is no longer tenable. Ideally, companies will have to license uncorrupted images to train their models, ensuring that artists give consent and are compensated for their work.

It’s been done before; Getty Images and Nvidia recently launched a generative AI tool entirely trained using Getty’s extensive library of stock photos. Subscribing customers pay a fee determined by how many photos they want to generate, and photographers whose work was used to train the model receive a portion of the subscription revenue. Payouts are determined by how much of the photographer’s content was contributed to the training set, and the “performance of that content over time,” Wired reported.

Zhao clarified that he isn’t anti-AI, and pointed out that AI has immensely useful applications that aren’t so ethically fraught. In the world of academia and scientific research, advancements in AI are cause for celebration. While most of the marketing hype and panic around AI really refers to generative AI, traditional AI has been used to develop new medications and combat climate change, he said. 

“None of these things require generative AI. None of these things require pretty pictures, or make up facts, or have a user interface between you and the AI,” Zhao said. “It’s not a core part for most fundamental AI technologies. But it is the case that these things interface so easily with people. Big Tech has really grabbed onto this as an easy way to make profit and engage a much wider portion of the population, as compared to a more scientific AI that actually has fundamental, breakthrough capabilities and amazing applications.”

The major players in tech, whose funding and resources dwarf those of academia, are largely pro-AI. They have no incentive to fund projects that are disruptive and yield no financial gain. Zhao is staunchly opposed to monetizing Glaze and Nightshade, or ever selling the projects’ IP to a startup or corporation. Artists like McKernan are grateful to have a reprieve from subscription fees, which are nearly ubiquitous across software used in creative industries.

“Artists, myself included, are feeling just exploited at every turn,” McKernan said. “So when something is given to us freely as a resource, I know we’re appreciative.”

The team behind Nightshade, which consists of Zhao, Ph.D. student Shawn Shan, and several grad students, has been funded by the university, traditional foundations and government grants. But to sustain research, Zhao acknowledged that the team will likely have to figure out a “nonprofit structure” and work with arts foundations. He added that the team still has a “few more tricks” up their sleeves.

“For a long time research was done for the sake of research, expanding human knowledge. But I think something like this, there is an ethical line,” Zhao said. “The research for this matters … those who are most vulnerable to this, they tend to be the most creative, and they tend to have the least support in terms of resources. It’s not a fair fight. That’s why we’re doing what we can to help balance the battlefield.” 
