1 Making our First Image: “A damn fine cup of coffee”
The chapter introduces Stable Diffusion through an analogy to early hip‑hop: just as DJs remix familiar records into something new, the model recombines learned visual concepts to produce novel images. It frames generative imaging as a playful way to realize ideas we’ve long doodled in notebooks, emphasizing that while first results come quickly, mastery takes experimentation, technique, and iteration. The open‑source nature of Stable Diffusion empowers users to run, customize, and extend the system on consumer hardware, supported by a fast‑evolving ecosystem of community tools and methods.
Readers are guided through creating their first text‑to‑image outputs with the prompt “A damn fine cup of coffee,” learning core controls and trade‑offs. The chapter explains iterations versus batch size (time, VRAM, and practicality), the role of seeds in both reproducibility and variety (the default seed is 42), and how image dimensions, especially aspect ratio, meaningfully shape composition and content. A key habit is to generate many candidates and curate the best, treating the model less as an all‑imaginative artist and more like a search engine over possible images, responsive to precise wording and settings.
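The settings described above can be planned before any image is generated. The following is a minimal sketch, not the chapter's own code: the function names `dimensions_for_ratio` and `plan_batches` are hypothetical, the 512×512 pixel budget and the rounding of each side to a multiple of 8 are assumptions based on common Stable Diffusion setups, and giving each image the seed `base_seed + index` is just one possible convention for making a run reproducible.

```python
import math


def dimensions_for_ratio(ratio_w, ratio_h, pixel_budget=512 * 512, multiple=8):
    """Return (width, height) close to ratio_w:ratio_h within pixel_budget pixels.

    Assumption: each side must be a multiple of `multiple` (8 by default).
    """
    # Scale so that (ratio_w * scale) * (ratio_h * scale) is about pixel_budget.
    scale = math.sqrt(pixel_budget / (ratio_w * ratio_h))
    # Round each side down to an allowed multiple, never below one step.
    width = max(multiple, int(ratio_w * scale) // multiple * multiple)
    height = max(multiple, int(ratio_h * scale) // multiple * multiple)
    return width, height


def plan_batches(total_images, batch_size, base_seed=42):
    """Split total_images into iterations of at most batch_size images.

    Each inner list holds one seed per image in that iteration.
    Assumption: image i gets seed base_seed + i, so rerunning the plan
    with the same base seed asks for the same images again.
    """
    batches = []
    for start in range(0, total_images, batch_size):
        count = min(batch_size, total_images - start)
        batches.append([base_seed + start + i for i in range(count)])
    return batches


if __name__ == "__main__":
    # The aspect ratios that appear in the chapter's figures.
    for ratio in [(1, 1), (5, 3), (3, 5), (3, 7), (4, 1)]:
        print(ratio, dimensions_for_ratio(*ratio))
    # Six images as two iterations (batch size 4) versus one batch of six.
    print(plan_batches(6, 4))
    print(plan_batches(6, 6))
```

A larger batch size means fewer iterations but more VRAM per iteration, so the batch size that works in practice depends on the hardware; the plan only makes the trade-off visible.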
Prompt engineering fundamentals follow: prefer clear, descriptive phrasing over poetic ambiguity; add scene context to steer composition; and specify stylistic cues (for example, surrealist painting, wood etching) to avoid the uncanny valley and evoke desired aesthetics. The chapter also demonstrates the power of open source by exposing the built‑in NSFW safety check that can replace outputs with a placeholder image and shows how to optionally disable it in code—underscoring user agency. It concludes by reinforcing an iterative workflow of prompt refinement, parameter tuning, and selective curation as the path to images that truly match one’s intent.
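The prompt guidance above (a clear subject, optional scene context, optional style cue) can be expressed as a small helper. This is an illustrative sketch only: `build_prompt` is a hypothetical function, the comma-separated layout and the "in the style of" phrasing are assumptions rather than a required prompt syntax, and the example wording is made up for demonstration.

```python
def build_prompt(subject, scene=None, style=None):
    """Combine a subject with optional scene and style cues into one prompt."""
    parts = [subject.strip()]
    if scene:
        # Scene context steers composition, e.g. where the object sits.
        parts.append(scene.strip())
    if style:
        # A style cue nudges the aesthetic, e.g. a painting or an etching.
        parts.append(f"in the style of {style.strip()}")
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt("a cup of black coffee"))
    print(build_prompt("a cup of black coffee", scene="on a counter"))
    print(build_prompt("a cup of black coffee",
                       scene="on a counter",
                       style="a wood etching"))
```

Keeping subject, scene, and style as separate arguments makes it easy to vary one of them at a time while refining a prompt.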
Getting my imagination on the screen
Browsing an infinite library of Pulp Sci-Fi that never was.
Who knew monks were such avid readers of sci-fi?
Envisioning ancient aliens.
The initial 6 images created by our prompt: “A damn fine cup of coffee.”
Average seconds to create an image, comparing iterations and batch.
Generating 30 images at once.
Creating 6 different images with seed 12345.
Images with a 5:3 landscape aspect ratio.
Images with a 3:5 aspect ratio using the same seed.
Images with a 3:7 aspect ratio using the same seed.
Images with a 4:1 aspect ratio using the same seed.
A poetic prompt does not always yield poetic images.
A straightforward prompt yields more cups of black coffee.
Adding a scene to an image can help provide context.
Choosing a landscape aspect ratio helps display the counter.
Creating surrealistic images.
Images in the style of a wood etching.
I would say that’s a damn fine cup of coffee!
Being “Rick-rolled” by Stable Diffusion.
Summary
- Generating with Stable Diffusion is an iterative process, in which we are constantly revising our settings and prompts.
- Despite the many ways to improve images, it’s always a good idea to generate a variety of images to see if we find a particular one that stands out to us as pleasing.
- Our prompts should be clear and descriptive. Giving some context for the object we’re prompting can change the image dramatically. Describing the style of the image can further let us change the feeling of the images we’re generating.
- The aspect ratio that we use to generate an image can have a major impact on the way the image looks. Consider whether the image you want to create would look better as a square, a landscape, or a portrait.
- Because Stable Diffusion is open source, we (as well as the entire community of users) can change and extend its behavior.