Hacker News

This Image Does Not Exist

1 year ago / 135 comments / thisimagedoesnotexist.com
2 years ago by davidbarker

I scored 27/30, with around 12 being human-created.

I've been keeping a close eye on DALL-E 2, especially images posted on Twitter and Reddit[0], and the biggest sign that it's a DALL-E 2 image to me are the edges of objects not being as crisp as you'd expect in a human-created image.

The composition that DALL-E 2 creates is typically fantastic, but it isn't perfect when it comes to the edges, details, and textures (fabric, for example).

I still think it's incredible for inspiration — I don't think a typical artist/designer will use DALL-E 2's output as the final image, but rather as a great starting/mid point for their final image. I can see myself using DALL-E 2 to generate something akin to a wireframe that I'll then go in and add the details to manually.

[0] https://www.reddit.com/r/dalle2/

2 years ago by notahacker

Biggest signs for me were the edges you mentioned, compositions that weren't technically bad but were visually uninteresting (with minor logical flaws like missing details in a reflection), and subject matter of the sort that could be described in a succinct text prompt designed to challenge a machine (I actually guessed a couple of the prompts, and "tessellated Hello Kitty" and "origami car" are less likely to be human art concepts than challenges for machines). I'm browsing on a phone, so some of the fine detail is lost.

The fact that I'm using the subject matter as much as the image itself to discern AI art is a sign of how good the tech is though. Especially since the subject matter in question is typically "stuff that would be really challenging for an AI to conceptualise" and the image itself (not necessarily the AI's only attempt of course) is a sound attempt at rendering it.

I see most of the utility as the opposite: the artist still needs to come up with the original concept, the overall compositions are unlikely to be better than an artist would conceive, but even in the hands of a non-artist a bit of text can produce "good enough" images (that can be cropped or touched up to remove rough edges) instantaneously which would take a lot of time for a skilled artist to produce.

2 years ago by linux2647

> are less likely to be human art concepts than challenges for machines

Counter-example: there was one image I got that was a Game Boy rendered as a concrete brick. That seems like a prompt you'd give DALL-E 2, but it was actually human-created.

2 years ago by cowtools

Yeah, but I guessed that one correctly because that sort of art (taking symbolism like a game-boy and putting it in a "cool" medium) is pretty common on the internet already.

It makes sense that someone would think to make that, because a Game Boy is basically already a grey brick. So it's really just a change of texture: a comparison of two similar-looking things that a person would probably make but an AI might not. It has some artistic intent.

The other DALL-E prompts are just "a <random subject> doing <random thing> in <random medium>", like "a dog building a fence in watercolor".

Would it make sense for a person to paint a dog building a fence in watercolor? Probably not unless they're writing a children's book.

2 years ago by sandworm101

Biggest/best tell for me is not the image but the subject matter. The images with some interesting twist always come from the humans. For instance one image shown to me had human hands at a laptop, but they were coming through the screen to type. That is playing with human perception, twisting reality to make an artistic point (the self-typing laptop where the user comes from inside the machine). I've never seen an AI-generated image that projects any artistic depth. They are beautiful and sometimes funny, but never interesting.

Of course most human-created images are equally shallow, but if you do see depth then you know it came from a human. Put this test to an art historian/appraiser or architect and they would probably score very well.

2 years ago by hn_throwaway_99

This was my "tell" as well. I don't know a ton about DALL-E, but it seems like most of the pictures are in the style of "show X doing Y in the style of Z": "Show a bear carrying a pizza in a realistic style". The composition of the DALL-E ones is almost "random association", e.g. acid-trip type associations. For each image I'd ask myself "What purpose would a human have for creating this?" and it usually helped get me the right answer.

2 years ago by sandworm101

But what if the human is tasked with creating an image that looks like it was created by a machine? If you gave human artists the same tasks (X and Y in the style of Z), they would also create a bunch of purposeless images. That is the dataset I would create if I wanted to promote my AI project.

2 years ago by taylodl

The human-created images had something interesting to say, whereas the computer-generated images did not. I do find this to be an interesting variation of the Turing Test.

2 years ago by probably_wrong

I made it "honestly" until photo 18 (where I got a score of 17/18), then I realized that all photos were auto-generated and spammed the "robot" button. Apparently there was a single human-generated picture in my set.

> The composition that DALL-E 2 creates is typically fantastic, but it isn't perfect when it comes to the edges, details, and textures (fabric, for example).

For me the trick for discerning them is asking "what's the purpose of this detail?", namely: if a figure is writing, what are they writing? If a figure is wearing a shirt, does it look like a shirt I've ever seen? Why are they standing the way they are standing? And what material is this statue made of?

2 years ago by jorgesborges

There's me trying to evaluate intent, context, talent -- when instead I should look for blurry edges. I imagine a future when AI is so sophisticated and human-like they're granted all the privileges of living, sentient beings but they're still clumsy with a knife and fork and they fart too often in public.

2 years ago by bjackman

I'm slightly surprised to report a 100% score. However, I think I'm actually recognising DALL-E-2's _style_, and not "this picture is CG". It feels more like I'm identifying the work of an artist I'm very familiar with, rather than identifying fundamentally different categories of creator.

I think all it would take to "defeat" me here would be greater variety of high-quality image generation AIs.

I expect in a few years that will be the case, unless the "style" of DALL-E-2 is something fundamental, instead of an artifact of its specific implementation.

2 years ago by properparity

If you open the image in a new tab to see the full resolution and look at the fine details, it's very easy to spot the AI ones. AI seems to especially struggle with reflective details: it obviously doesn't know it's meant to be a reflection, so it gets details in the mirror image subtly wrong or weird.

Also, anything mechanical is just way off in AI-generated images: a car with the fuel door in the middle of the passenger door, or a missing tire with no axle visible. Nothing makes mechanical sense.

2 years ago by sva_

Also, fucked up hands/fingers/letters and such details often give it away

2 years ago by kgwgk

Interestingly, these are the kinds of things (fingers, writing, reflections) that also betray the "generative" nature of dreams.

https://lucid.fandom.com/wiki/Reality_check

2 years ago by fartcannon

But still, it's pretty miraculous.

2 years ago by kuprel

Also, DALL-E makes square images, so if it's not square it's human-made.

2 years ago by vmception

On my phone this was indiscernible

I can see DALL-E-2 stuff spreading around WhatsApp groups like wildfire, masquerading as something else

2 years ago by SiempreViernes

Even now, just selecting DALL-E prompts that output things that look like basic ray tracer tests would probably make it undecidable for anyone not bothering to recreate the scene and check that it's physically coherent.

2 years ago by Out_of_Characte

Well, there are still plenty of telltale signs of AI handiwork here: quasi-faces in places no human would put them, or where a human would articulate the detail much better. Edge coherence is also very important. Smudges of paint where they don't belong at all, lines no human would be able to make, and conversely the kinds of errors only humans make, which the AI doesn't reproduce.

2 years ago by SiempreViernes

My point is that Dalle can do things like the Cornell box pretty much perfectly, so if you want to make a test where it's impossible to decide who made it you should sample those type of prompts.

2 years ago by pvarangot

If enough humans in the pool this webpage gets its images from started copying DALL-E 2 effectively, that would also "defeat" you.

2 years ago by lelandfe

I am failing spectacularly, love it.

Also - please speed up the delay switching to a new image! I bet this would feel way better if the delay was in the 0-0.5s range instead of multiple seconds. (I bet it’s some runaway JS, investigating…)

Edit: `changeImage` is blocked by `await saveDB`, and the POST is 500ing. (e.g. uBlock makes that AWS request fail fast, which is why desktop users are less affected)

Maybe use `Promise.allSettled` to make them parallel, and prepare the data being sent beforehand to avoid state race conditions :) thanks for leaving sourcemaps on!
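To make the suggestion concrete, here's a minimal sketch of that fix. `saveDB` and `changeImage` are just the names visible in the sourcemaps; the stubs below are illustrative stand-ins, not the site's actual code:

```javascript
// Stub: the POST that 500s (or gets blocked fast by uBlock).
const saveDB = (payload) => Promise.reject(new Error("500"));

// Stub: swapping in the next image is cheap and local.
let currentImage = 0;
const changeImage = () => Promise.resolve(++currentImage);

async function handleGuess(guess) {
  // Snapshot the state before any await, to avoid the race mentioned above.
  const payload = { guess, image: currentImage };

  // Start both at once: Promise.allSettled never rejects, so a failing
  // or slow save can no longer delay the image swap.
  const [save, swap] = await Promise.allSettled([
    saveDB(payload),
    changeImage(),
  ]);
  return swap.value; // index of the new image
}
```

Whether the real bundle is structured this way is a guess; the point is just that the analytics save shouldn't gate the UI.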

2 years ago by capableweb

> Also - please speed up the delay switching to a new image!

I came here to suggest the opposite, as I didn't have time to read the full prompt which generated the image, when it was a generated picture.

So I guess the conclusion is instead: allow users to click a second time on any of the buttons/picture to proceed at their own leisure, so the ones who want to read can read, and the ones who want to speed through it, can do that too.

2 years ago by lelandfe

Nice, that’s even better :) add some keyboard controls (maybe left for artist, right for robot?, and then either key to advance) and folks can zoom through!
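Something like this key mapping would do it. This is purely a sketch of the suggested controls, not the site's actual code, and the action names are made up:

```javascript
// Map a keypress to a quiz action. Before answering, the arrows guess;
// after answering, either arrow advances to the next image.
function keyToAction(key, answered) {
  const mapped = key === "ArrowLeft" || key === "ArrowRight";
  if (!mapped) return null;       // ignore all other keys
  if (answered) return "advance"; // either arrow moves on
  return key === "ArrowLeft" ? "guess:artist" : "guess:robot";
}

// Browser wiring (assumes some `state` and `dispatch` on the page):
// document.addEventListener("keydown", (e) => {
//   const action = keyToAction(e.key, state.answered);
//   if (action) dispatch(action);
// });
```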

2 years ago by jcims

Only tangentially related: I assembled a children's book made by GPT-3 and Midjourney (a generative art tool along the lines of DALL-E).

(Edit: see JacobiX reply, looks like it robbed the concept, possibly because I didn't give it any direction.)

I just prompted GPT-3 (davinci-002 engine on openai) to write a children's book and include illustration suggestions. I didn't even suggest a topic, just said 'children's book'. I fed the illustration suggestions into Midjourney with the qualifier of 'children's book illustration of ...' and pulled the first image of each (it always generates four).

I then asked it for a cover page, last page and a background image for the text.

I assembled the story text written by GPT-3 and the images rendered by Midjourney in this Google Slides preso. The only real creative input I had was the font. I did iterate on the text background image prompt a couple of times to find something that worked:

(Note this link works in incognito if you prefer not to have your google login associated)

https://docs.google.com/presentation/d/e/2PACX-1vT4XWNx2SdEg...

I've been having a lot of fun with this prompt. The part in asterisks is where you set the topic. The last 'Title of the book:' is where you submit; it really helps keep the generative part on track.

----

    Write a four page children's book with the following:
    
    Title of the book:
    Description of the cover page with enough detail for an illustrator to draw:
    Description of the background imagery used for the pages with text on them:
    First page text in quotes:
    Description of the opposite page illustration:
    Second page text in quotes:
    Description of the opposite page illustration:
    Third page text in quotes:
    Description of the third page illustration:
    Fourth page text in quotes:
    Description of the fourth page illustration:
    Description of the final page:
    
    The subject of the children's book is *how to prevent buffer overflows in c code*.

    Title of the book:
2 years ago by JacobiX

It should be noted that there is a real children's book with the same name and very similar content: The Very Lonely Firefly by Eric Carle.

For this particular case, it seems like GPT-3 is paraphrasing the story ...

2 years ago by jcims

Interesting. I thought it sounded familiar. Guiding it with a subject is probably a better way to avoid that kind of thing.

Here's what resulted from my prompt above for a children's book about buffer overflows - https://docs.google.com/presentation/d/e/2PACX-1vSp7wc1xCgiv...

Doesn't always work hahaha.

2 years ago by jcims

Feeding it a suggested topic to avoid derivative works.

Input:

The subject of the children's book is how to autorotate a helicopter after its engines have failed.

Title of the book:

Output:

"Autorotating a Helicopter"

Description of the cover page with enough detail for an illustrator to draw: The cover will feature a large illustration of a helicopter in the air, with the text "Autorotating a Helicopter" in big, bold letters.

Description of the background imagery used for the pages with text on them: The pages will have a white background with black text.

First page text in quotes: "If you're ever in a helicopter and the engines fail, don't panic! Just autorotate."

Description of the opposite page illustration: The illustration on this page will show a helicopter with its engines on fire, falling towards the ground.

Second page text in quotes: "To autorotate, pull back on the cyclic and maintain a level attitude."

Description of the opposite page illustration: The illustration on this page will show a helicopter with its engines on fire, but the blades are still spinning.

Third page text in quotes: "To land, flare the helicopter just before touchdown."

Description of the third page illustration: The illustration on this page will show a helicopter landing safely on the ground, with the blades still spinning.

Fourth page text in quotes: "Remember, practice makes perfect! So if you ever find yourself in a helicopter with failed engines, you'll be prepared."

Description of the fourth page illustration: The illustration on this page will show a helicopter in the air, with the engines on fire.

Description of the final page: The final page will feature a large illustration of a helicopter landing safely on the ground, with the text "The End" in big, bold letters.

a few possible illustrations - https://ibb.co/c2ZYQps

2 years ago by undefined
[deleted]
2 years ago by soared

Now I'm tempted to make a dynamic book with a different story each time you open the cover.

2 years ago by jcims

This is something that I've been thinking of as well. GPT-3 can be a bit unruly in its output, but if you execute ten prompts you'll get something parsable six times.

2 years ago by TheLongLife

24/30. You can definitely recognize DALL-E's style of mashing two elements together. Most of the time it will look smudgy at the points where two images are combined. Also, if there are any eyes in the image, it will mess them up in some way almost every time.

There is also some kind of artifact in DALL-E's images which makes gradient areas look like they have very low color depth.

Hardest ones to get right are low quality images and images of abstract artwork or the ones that are made in a "smudgy" style already.

It is pretty good overall, but I would say it would be almost impossible to fool anyone if the images were high resolution and you could zoom in.

It's probably most effective in small images like thumbnails for example.

It's kind of spooky to think how good it will be in a few years and what it will be used for.

2 years ago by jphoward

Is it me or is this very depressing and scary? I always assumed the creative industries would be relatively resistant to automation, with more algorithmic jobs more easily replaceable. But with this you never need to buy stock pictures ever again, right? I saw several stock-picture-esque images of indistinguishable quality, and yet with incredible specificity to the user-supplied description.

2 years ago by Gigachad

I thought this way originally, but I have realised these tools are being used right now to massively amplify the creative output of individuals. AI text-to-voice tech is being used for a lot of hilarious content right now. I saw an image upscaler AI used in a one-week project to rebuild a PS2 game to make it look modern.

There is a long runway of human-guided AI generation. Imagine if a single human were able to produce an entire animated TV show, and all it took was good storytelling skills, without requiring 100 animators.

2 years ago by elephanlemon

This is the right way to think about this IMO. Wouldn’t it be amazing if tools like this massively increased the scope of what is possible for a single artist?

Something I think about a lot is video game creation - what if AI tools let you quickly create 3D objects and a world without a big, dedicated team of artists?

(Check this out by Nvidia! https://m.youtube.com/watch?v=5j8I7V6blqM)

2 years ago by lelandfe

Humorously, the narration on that video also sounds like an AI. The rising inflection on nearly every clause almost sounds like they're recording every sentence separately.

2 years ago by antiterra

It’s similar to how photography grants anyone the technical ability to render a photo-realistic image, but the majority of photos we see are easily discarded from memory. It did put painters out of the portrait business, but also freed them from a realist ideal.

2 years ago by andybak

I find it amazing and exciting - but then I don't earn my living as an illustrator.

I suspect there were similar wailings and gnashings of teeth when photography was invented. Humans adapt, and it's we who decide what to value. We are quite fond of other people, so tastes will shift to preserve that balance.

2 years ago by bigDinosaur

There was a clear career path for anyone out of a job due to photographs though. Become a photographer. Which was a perfectly fine career for a very long time. It still is, really, despite the high quality cameras available to many people, because it requires artistic and technical skill.

The career path for an artist trying to find something else while AI is developing at this pace is...less clear, and definitely scarier.

2 years ago by notahacker

Photography didn't replace visual art though (which is just as well for painters and sculptors really; the skill needed to operate an early camera was pretty orthogonal). A subset doing the grunt work of simply recording details for posterity had a competitive alternative, but even in the subset of image production where photorealism is considered desirable, people still commission painters for "lifelike" portraits, and Canaletto renderings of Venice are still worth a fortune despite a near infinitude of similar images on Instagram.

Same goes for this really, but perhaps even more so. It's impressive at producing eye-catching, photo-quality images of simple concepts you might use in advertising campaigns, and can do a good job of reproducing well-defined cartoon styles. But the artists working in those sorts of roles are already working digitally and using image libraries, and this is a massive productivity boost for them: dozens of "$brand on beach" concept images for no effort at all, and they'll still need all the retouching and layering and colour balancing and combining with other images. And as soon as the brief gets more precise about details of layout and colour scheme or what they want the model to look like, they're either back to their original creative flow, or designing text prompts and curating images for the brief becomes an advanced skill in itself. Non-artists can produce "good enough" images too (great for memes), but much like non-artists wielding cameras, they're mostly satisfying their own whims rather than competing with the business of art.

2 years ago by eternalban

Conceptual art remains a possibility.

2 years ago by antiterra

The quality is not indistinguishable under any deliberate observation. Plus, I think people will begin to casually notice how off it is once they become 'trained' in the quirks of these images. That is, unless the quality of the generated images continues to steadily improve.

2 years ago by Gigachad

Every time I think I have come to understand the limits of ML generated content, I see something a few months later that completely surpasses what I believed possible.

2 years ago by TaupeRanger

Creative industries are still safe from automation. This doesn't do anything except generate nice stock art or album covers. Unless you're in those industries, there's nothing to worry about. It can't even do text.

2 years ago by wubbert

>It can't even do text.

An unpaid intern adding text to AI generated images would be trivial.

2 years ago by leokeba

The only human-made images I got (2 out of 30) were also the only ones that were not square. Given that most GANs only do square images, the human-made ones should be square too if we want this to be fair. And as some others have said, DALL-E 2 really seems to have a "style" that's quite easy to spot once you've seen a few images, so even without the square hint I think I would have gotten close to 30/30. But I would like to see the results on a better-crafted dataset with "naive" people, like a kind of visual Turing test.

2 years ago by can16358p

Nice test but it would be great if it didn't wait forever to pass to the next image after answering.

2 years ago by bambax

Yes, the site is so slow it's unusable. Maybe the result of being on the front page of HN, but still; it could all work client side.

2 years ago by hk__2

Indeed; all the responses are already in the JS file: https://thisimagedoesnotexist.com/static/js/main.dddea102.ch...

2 years ago by carvking

Yup, I waited six forevers and then came here to write this.
