A cat on the pizza: what more to capture?

I wrote “An angry golden-shaded shorthair cat sitting on a pizza in a mocha green background in an abstract style”.

Craiyon took way longer to process the prompt than Dall-E2, which is now open to everyone (I tried it a month ago, when it had just opened up). Craiyon also lacked different styles of variations for people to choose from. Also, Dall-E2 captures the emotions of pictures and the definition of “abstract” better. Plus, Dall-E2 has no ads on its website.
The above pictures are generated by Craiyon, and below are the ones generated by Dall-E2. Feel the difference 🙂


  1. Thanks for the post, Gigi! This is very interesting. I wonder what causes such different outputs from seemingly similar AI models given the exact same prompt. Is it a result of differences in model training? Differences in raw data sources? We think AI models will help us solve many of humanity’s problems, but it seems one day we may be overwhelmed by the different AI model choices at our disposal.

  2. Hi Gigi!
    Thank you for the post! I think in Craiyon’s case, they’re having trouble with their image encoder, which is why faces in general come out looking very weird. Still, it would be super interesting to understand why Dall-E2 does not have the same problem, and how different their profit formula might be to support an ad-free page.

  3. I’m actually pretty shocked at how well the AI was able to capture exactly what you were going for in this post. I guess the more words you use and the more context you feed the machine, the more accurate the results will be, which makes sense. I wonder what the future applications of this will be as the technology becomes more sophisticated?

  4. Interesting comparisons! Nice to see there is competition, which drives up quality. The ads were also something I noticed while doing this exercise. It’s interesting that the two companies are monetizing their services in such different ways.

