An OpenAI Tool Will Detect Images Generated by Its DALL-E 3 System

May 20, 2024

IBL News | New York

OpenAI announced that it is developing a tool that detects 98% of images generated by its DALL-E 3 text-to-image system. The success rate drops if the images are altered.

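Separately, OpenAI said it is building a tool called Media Manager, which it aims to have in place by 2025, that will let creators and content owners specify how their works are used in AI training. The company is currently working with creators, content owners, and regulators toward a standard.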

Media Manager appears to be OpenAI’s response to growing criticism of its approach to developing AI, which relies heavily on scraping publicly available data from the web.

“This will require cutting-edge machine learning research to build a first-ever tool of its kind to help us identify copyrighted text, images, audio, and video across multiple sources and reflect creator preferences,” OpenAI wrote in a blog post.

Recently, eight U.S. newspapers, including the Chicago Tribune, sued OpenAI for IP infringement, accusing the company of pilfering articles to train generative AI models that it then commercialized without compensating or crediting the source publications.

OpenAI last year allowed artists to opt out of and remove their work from the data sets that the company uses to train its image-generating models.

The company also lets website owners indicate via the robots.txt standard, which gives instructions about websites to web-crawling bots, whether content on their sites can be scraped to train AI models. OpenAI continues to ink licensing deals with large content owners, including news organizations, stock media libraries, and Q&A sites like Stack Overflow. Some content creators say OpenAI hasn’t gone far enough, however.
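To illustrate the robots.txt opt-out mentioned above, here is a minimal sketch in Python. The article does not name OpenAI's crawler; the example assumes the `GPTBot` user-agent token that OpenAI documents for its web crawler. The site URL, the `SomeOtherBot` name, and the `can_fetch` helper are hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site. It asks the crawler that
# identifies itself as "GPTBot" (assumed here to be OpenAI's crawler token)
# to stay away from the whole site, while leaving other bots unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/story.html"  # hypothetical page
    print("GPTBot allowed:", can_fetch("GPTBot", url))
    print("SomeOtherBot allowed:", can_fetch("SomeOtherBot", url))
```

Note that robots.txt is advisory: it states the site owner's preference, and it only has an effect when a crawler chooses to honor it.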

A number of third parties have built opt-out tools for generative AI. Startup Spawning AI, whose partners include Stability AI and Hugging Face, offers an app that identifies and tracks bots’ IP addresses to block scraping attempts. Steg.AI and Imatag help creators establish ownership of their images by applying watermarks imperceptible to the human eye. Nightshade, a project from the University of Chicago, poisons image data to render it useless or disruptive to AI model training.