Overview

1 Making our First Image: “A damn fine cup of coffee”

This chapter welcomes readers into text-to-image creation with Stable Diffusion by pairing clear prompting with practical tooling. It frames the journey as moving from unpredictable, “wonderful and strange” results toward intentional, repeatable outcomes by learning how to communicate with the model, iterate quickly, and make informed adjustments. The focus is on using the AUTOMATIC1111 WebUI to generate images, while introducing core skills: crafting effective prompts, managing image dimensions and aspect ratios, and controlling randomness for reproducibility.

After a brief orientation to the A1111 interface and the txt2img workflow, the chapter uses a coffee-themed example to demonstrate the mechanics of generation. It shows how to produce many candidates efficiently via batch count (sequential) and batch size (parallel), explains the VRAM trade-offs of larger batch sizes, and emphasizes volume as a simple but powerful tactic for finding strong images. The role of seeds is made explicit: a seed of -1 yields new randomness each run, while fixed seeds enable exact reproduction and fair comparison of setting changes. A short primer on pseudo-random number generators clarifies why seed control is essential for consistent experimentation.
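The seed behavior described above can be illustrated with Python's standard `random` module. This is only an analogy under stated assumptions: Stable Diffusion uses its own generator rather than `random`, and `resolve_seed` and `noise_sample` are hypothetical helpers that mimic the "-1 means pick a fresh seed" convention.

```python
import random


def resolve_seed(seed: int) -> int:
    """Hypothetical helper: treat -1 as 'pick a fresh seed', like the UI convention."""
    if seed == -1:
        return random.randrange(2**32)
    return seed


def noise_sample(seed: int, n: int = 4) -> list:
    """Draw n pseudo-random numbers from a generator started at `seed`."""
    rng = random.Random(resolve_seed(seed))
    return [rng.random() for _ in range(n)]


# Same seed, same sequence: this is what makes a run reproducible.
same_seed_matches = noise_sample(1234) == noise_sample(1234)

# A different seed starts the generator elsewhere and gives a different sequence.
different_seed_differs = noise_sample(1234) != noise_sample(1235)
```

Because the generator is deterministic, fixing the seed removes chance from the comparison: any difference between two runs then comes from the setting that was changed.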

Prompt engineering is presented as an iterative, search-like process where clarity outperforms poetic phrasing. The chapter illustrates how adding concrete details (e.g., scene context) and specifying style descriptors can steer the model away from the uncanny and toward desired aesthetics. It also shows how width and height, constrained to model-friendly multiples (multiples of 8), change composition and subject framing, and warns that extreme aspect ratios often degrade results. The overarching workflow is to fix a seed, adjust aspect ratio to suit the subject, generate batches, and refine the prompt with clear, descriptive terms and stylistic guidance, culminating in a satisfying, stylistically coherent image that fits the intended mood.
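As a rough sketch of how an aspect ratio turns into Width and Height values, the helper below fixes the longer side and rounds the shorter side to a multiple of 8. The function name `dimensions_for_ratio`, the default long side, and the example sizes are assumptions for illustration, not values taken from the chapter.

```python
def dimensions_for_ratio(ratio_w: int, ratio_h: int,
                         long_side: int = 512, step: int = 8) -> tuple:
    """Return (width, height) for a ratio, keeping both sides multiples of `step`.

    Assumes `long_side` is itself a multiple of `step`.
    """
    if ratio_w <= 0 or ratio_h <= 0:
        raise ValueError("ratio parts must be positive")

    def snap(value: float) -> int:
        # Round to the nearest multiple of `step`, never below one step.
        return max(step, round(value / step) * step)

    if ratio_w >= ratio_h:  # landscape or square
        return long_side, snap(long_side * ratio_h / ratio_w)
    return snap(long_side * ratio_w / ratio_h), long_side  # portrait


landscape = dimensions_for_ratio(5, 3, long_side=640)   # (640, 384)
portrait = dimensions_for_ratio(3, 5, long_side=640)    # (384, 640)
square = dimensions_for_ratio(1, 1)                     # (512, 512)
```

Swapping the two ratio parts swaps the returned width and height, which mirrors swapping the Width and Height fields in the UI.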

Stable Diffusion sure can create strange things; let’s try to avoid going too far in that direction.
Image of the upper portion of the A1111 UI.
Entering our prompt into A1111
The initial image created by our prompt: “A damn fine cup of coffee.”
Batch count and Batch size theoretically offer different ways to increase images generated
One configuration for generating 30 images at once.
30 different answers to the prompt “A damn fine cup of coffee”
Setting the value to -1 will give us a ‘random’ seed each time we hit ‘Generate’.
The recycle button will give us the seed we used previously.
The options for setting our Width and Height in the UI.
Images with a 5:3 landscape aspect ratio.
Using the same seed but reverting to 512x512 shows us the impact of aspect ratio and image size.
We can easily swap Width and Height values in A1111.
Images with a 3:5 aspect ratio.
Images with a 3:7 aspect ratio.
Images with a 4:1 aspect ratio.
A poetic prompt does not always yield poetic images.
A straightforward prompt yields more cups of black coffee.
Adding a scene to an image can help provide context.
Choosing a landscape aspect ratio helps display the counter.
Creating surrealistic images.
Images in the style of a wood etching.
I would say that’s a damn fine cup of coffee!

Summary

  • Generating with Stable Diffusion is an iterative process, in which we are constantly revising our settings and prompts.
  • Whatever other techniques we use to improve images, it’s always a good idea to generate a variety of them and see whether a particular one stands out as pleasing.
  • Our prompts should be clear and descriptive. Giving some context for the object we’re prompting can change the image dramatically. Describing the style of the image can further let us change the feeling of the images we’re generating.
  • The aspect ratio that we use to generate an image can have a major impact on the way the image looks. Consider whether the image you want to create would look better in a square, landscape, or portrait format.

FAQ

How do I install and launch AUTOMATIC1111’s Stable Diffusion WebUI (A1111)?
Install from the project’s GitHub: https://github.com/AUTOMATIC1111/stable-diffusion-webui. After installing, run ./webui.sh (Linux/macOS) or webui.bat (Windows), then open your browser to http://127.0.0.1:7860. The default tab is txt2img.
Where are my generated images saved in A1111?
By default: A1111-root/outputs/txt2img-images/.
What is a “prompt” and why is it so important?
The prompt is the natural-language description of the image you want to generate. It guides the model’s output, so clarity, specificity, and iterative refinement are key to getting results you like.
What’s the difference between Batch Size and Batch Count?
Batch Size is how many images are generated in parallel on the GPU (uses more VRAM). Batch Count is how many iterations run sequentially. In practice, time differences can be small; choose Batch Size based on available VRAM and use Batch Count to reach your total desired images.
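The total number of images is Batch Count multiplied by Batch Size, so several configurations reach the same total. The sketch below lists them for a target of 30 images; `batch_configurations` is a hypothetical helper, and the cap of 8 on batch size is an assumption standing in for whatever your VRAM allows.

```python
def batch_configurations(total_images: int, max_batch_size: int) -> list:
    """List (batch_count, batch_size) pairs whose product equals total_images."""
    configs = []
    for batch_size in range(1, max_batch_size + 1):
        if total_images % batch_size == 0:
            configs.append((total_images // batch_size, batch_size))
    return configs


# Assumed cap of 8 images in parallel; lower it if VRAM is limited.
options = batch_configurations(30, max_batch_size=8)
# options == [(30, 1), (15, 2), (10, 3), (6, 5), (5, 6)]
```

Pairs later in the list run fewer sequential iterations but hold more images in GPU memory at once.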
How do seeds work, and how can I reproduce an image?
A1111 uses a pseudo-random seed to start generation. Set a specific positive seed to reproduce results (with the same prompt and settings). A seed of -1 means “random.” After generating with -1, click the recycle icon next to Seed to reveal the actual seed used.
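The chapter works through the UI, but the same idea (same prompt, same settings, same seed) can be sketched as a request to the WebUI's optional HTTP API. This assumes the WebUI was launched with the `--api` flag and is listening on the default address; `build_payload` and `generate` are hypothetical helper names, and the seed value is arbitrary.

```python
import json
import urllib.request

# Assumption: WebUI started with --api on the default host and port.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt: str, seed: int, width: int = 512, height: int = 512,
                  batch_size: int = 1, n_iter: int = 1) -> dict:
    """Collect everything that must stay fixed to reproduce an image."""
    return {
        "prompt": prompt,
        "seed": seed,              # -1 would ask for a new random seed
        "width": width,
        "height": height,
        "batch_size": batch_size,  # images generated in parallel
        "n_iter": n_iter,          # sequential iterations (Batch count)
    }


def generate(payload: dict, url: str = API_URL) -> list:
    """Send the request and return the base64-encoded images from the response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["images"]


# Two runs with an identical payload are the precondition for a reproduction.
run_a = build_payload("a cup of black coffee", seed=1234)
run_b = build_payload("a cup of black coffee", seed=1234)
```

Calling `generate(run_a)` requires a running WebUI, so it is left out of the example. Changing exactly one field between two payloads while keeping the seed fixed gives a fair before-and-after comparison.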
Which Width and Height values can I use?
Use dimensions that are multiples of 8 (common, reliable steps are multiples of 128: 128, 256, 384, 512, 640, 768, 896, 1024). Larger total pixels (W×H) require more VRAM. Adjust these mainly to control aspect ratio rather than to chase high resolution.
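A small check like the following can catch invalid sizes and give a feel for how pixel count grows. Treat it as a sketch: `check_dimensions` and `relative_pixels` are hypothetical helpers, and pixel count relative to 512x512 is only a rough proxy for VRAM use, not a measurement.

```python
def check_dimensions(width: int, height: int, step: int = 8) -> int:
    """Validate that both sides are multiples of `step`; return the pixel count."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width % step or height % step:
        raise ValueError(f"{width}x{height} is not a multiple of {step}")
    return width * height


def relative_pixels(width: int, height: int, base: tuple = (512, 512)) -> float:
    """Pixel count relative to a baseline size (rough proxy for memory cost)."""
    return check_dimensions(width, height) / (base[0] * base[1])


square_large = relative_pixels(1024, 1024)  # 4.0: four times the pixels of 512x512
landscape = relative_pixels(640, 384)       # 0.9375: slightly fewer pixels than 512x512
```

The second result shows why changing the aspect ratio need not cost more memory: a wider image with a shorter height can stay close to the baseline pixel count.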
Why do extreme aspect ratios produce strange results?
Very wide or very tall ratios push the model outside its comfort zone, often causing distortions or odd compositions. Moderate ratios (e.g., square, 4:3, 3:2, 5:3, 3:5) tend to yield more coherent images.
How can I get closer to “a damn fine cup of coffee” with prompt engineering?
Be clear and descriptive (“a cup of black coffee”) instead of poetic or vague. Add context to guide composition (“on a diner counter”). Iterate: generate many images, review, and refine the prompt and settings.
How do I change the style or vibe of the image?
Add style descriptors to your prompt, such as “surrealist painting” or “wood etching.” This can avoid uncanny realism and better match a mood or theme.
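The progression from a plain subject to subject plus context plus style can be sketched as a tiny prompt builder. The comma-separated layout and the `build_prompt` helper are assumptions for illustration; the chapter's exact prompt wording may differ.

```python
def build_prompt(subject: str, context: str = "", style: str = "") -> str:
    """Join the non-empty parts of a prompt with commas (an assumed convention)."""
    parts = [subject.strip(), context.strip(), style.strip()]
    return ", ".join(part for part in parts if part)


# Each step adds one kind of detail, so its effect is easy to review.
iterations = [
    build_prompt("a cup of black coffee"),
    build_prompt("a cup of black coffee", context="on a diner counter"),
    build_prompt("a cup of black coffee", context="on a diner counter",
                 style="wood etching"),
]
```

Keeping the subject, context, and style as separate pieces makes it easy to change one of them at a time while leaving the seed fixed.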
Should I generate many images or focus on perfecting one?
Generate many and curate. Chance plays a role, so producing batches (e.g., 30) helps you pick favorites. When testing changes, keep the seed fixed to see the true impact of your adjustments.
