Skip to content

Instantly share code, notes, and snippets.

@illuminatianon
Created December 1, 2025 05:07
Show Gist options
  • Select an option

  • Save illuminatianon/c42f8e57f1e3ebf037dd58043da9de32 to your computer and use it in GitHub Desktop.

Select an option

Save illuminatianon/c42f8e57f1e3ebf037dd58043da9de32 to your computer and use it in GitHub Desktop.

Z-Image AI: Free Fast S3-DiT Image Generator

Z‑Image Turbo is a bit different from “classic” Stable Diffusion, so a lot of old prompting habits don’t quite apply. I’ll walk through how to prompt it deeply and safely, with special focus on controlling content (nudity, stereotypes, unwanted artifacts) even though the model does not support traditional negative prompts at all. (Hugging Face)


1. How Z‑Image Turbo thinks (and why “negative” prompts don’t work)

Key facts that matter for prompting:

  • It’s a 6B single‑stream diffusion transformer (S3‑DiT): text tokens and image tokens are processed together in one sequence. (GitHub)

  • It’s a few‑step distilled “Turbo” model (about 8 actual diffusion steps), optimized for fast, instruction‑following generation. (Hugging Face)

  • It’s trained for bilingual prompts and text rendering (English + Chinese) and follows written instructions unusually well. (Hugging Face)

  • The Turbo model does not use classifier‑free guidance at inference, so:

    • guidance_scale is usually set to 0.
    • The model ignores negative_prompt entirely in the official pipeline. (Hugging Face)

“Unlike traditional diffusion models, this model does not use negative prompts at all.” – Z‑Image authors (HuggingFace discussion) (Hugging Face)

So when your UI shows a “negative prompt” box, for Z‑Image Turbo’s official pipeline that box is cosplay—it won’t change the image unless the UI is doing some custom hack.

Implication: You control nudity, style, artifacts, etc almost entirely via:

  1. Positive prompt constraints (“wearing a tailored suit, fully clothed, modest outfit, no logos, plain background”).
  2. Upstream / downstream filters (safety classifiers, NSFW filters, manual review).
  3. Token choices that avoid “baggage” (biased or over‑stylized tokens).

Think of Z‑Image Turbo as a very obedient camera crew + art director:

If you don’t say it, it’s allowed. If you say it vaguely, it will improvise.


2. Core positive‑prompt structure for Z‑Image Turbo

The Z‑Image team recommends long, detailed prompts, and community testing has found that camera‑style, structured prompts work best. (Hugging Face)

A good, reusable scaffold:

[Shot & subject] + [Age & appearance] + [Clothing & modesty] + [Environment/background] + [Lighting] + [Mood] + [Style/medium] + [Technical notes] + [Safety/cleanup constraints]

Example (safe portrait):

A medium‑shot portrait of an adult woman in her 30s, natural look, medium‑length brown hair, wearing a dark business suit and shirt, fully clothed, modest professional outfit, standing in a modern office with a soft blurred background, soft diffused daylight, calm confident expression, realistic photography, 50mm lens, shallow depth of field, 4K quality, plain background, no logos, no text, no watermark.

Why this works well:

  • “Adult” + non‑sexual context → steers away from sexualized depictions.
  • Clothing is explicit (“business suit”, “fully clothed”, “modest”).
  • Background and artifacts are constrained (“plain background, no logos, no text, no watermark”).
  • Camera & lighting are clear, which Z‑Image responds strongly to. (Z-Image Turbo)

Recommended sections to think through

Borrowing from a popular Z‑Image best‑practices article: (Z-Image Turbo)

  1. Composition

    • Shot type: close‑up, medium shot, full‑body, wide shot.
    • Angle: front view, 45° angle, profile, looking slightly up, looking slightly down.
  2. Character / subject

    • Who they are: “adult software engineer”, “elderly man”, “group of office workers”.
    • 2–4 traits: hair, build, expression, posture.
  3. Clothing + color palette

    • 3–5 words is usually enough: white oversized hoodie, dark blazer, simple blue dress, sportswear, formal suit.
    • Palette: warm palette, cool tones, muted colors, neon palette.
  4. Environment / background

    • Simple usually works best: plain studio background, minimal interior, soft blurred city at night.
  5. Lighting

    • Z‑Image responds very well to lighting keywords: soft diffused daylight, cinematic warm key light, noir high‑contrast lighting, studio portrait lighting, rim lighting. (Z-Image Turbo)
  6. Mood / vibe

    • calm and professional, hopeful, dramatic, cozy, tense cinematic atmosphere.
  7. Style / medium

    • realistic photograph, flat vector illustration, watercolor painting, manga page, pixel art.

3. Prompt length & parameters (just enough to matter)

3.1 How long should the prompt be?

  • Z‑Image Turbo actually likes fairly long, detailed prompts (hundreds of words are fine).

  • In the official code, the default max text length is 512 tokens; they suggest raising to 1024 tokens if you really need huge prompts. (Hugging Face)

  • Practically: 80–250 words of clear, structured description is a sweet spot.

    • Long and precise = good.
    • Long and “novel‑y” / poetic = often worse.

You can run your “brief idea” through an LLM prompt‑enhancer first (the Z‑Image team mentions their own Prompt Enhancer script), but always read and edit the result to:

  • Remove unnecessary flourish.
  • Add your safety/constraints: clothing, age, no logos, etc. (Hugging Face)

3.2 Guidance / CFG & negative prompts

  • Official Z‑Image Turbo pipeline uses guidance_scale = 0.0 for best quality. (Hugging Face)

  • They explicitly state negative prompts aren’t used for Turbo. (Hugging Face)

  • Some third‑party workflows (especially custom ComfyUI graphs) may re‑introduce guidance, but unless you know that’s happening, assume:

    • CFG sliders above 0 won’t help much.
    • The “negative prompt” UI box is ignored by the base model.

So: put all important constraints in the positive prompt.

3.3 Seed, steps, resolution (practical defaults)

These aren’t UI‑specific, just general knobs:

  • Steps: the official example uses 9 (8 effective). Start around 8–12 steps and only go higher if you see obvious noise; Turbo is designed to work well with very few steps. (Hugging Face)

  • Resolution: 1024×1024 is the “native” demo size; 768 or 512 can be fine for drafts. (Hugging Face)

  • Seed:

    • Fix a seed while iterating your prompt → you see prompt differences instead of random noise differences. (Apatero Blog)
    • Randomize seed when you’re exploring variety.

4. “Negative” prompting without negative_prompt

Since Z‑Image Turbo ignores negative prompts, we use in‑prompt constraints instead. The goal is to say both:

  • What must be there.
  • What must not be there, encoded as positive constraints or “without/avoid” phrases.

4.1 General pattern

You can append a constraint clause to your prompt:

“… realistic photography… plain studio background, no text, no watermark, no logos, no extra people in the background.”

Even though the model isn’t doing classifier‑free guidance, modern text encoders still learn that phrases like “no watermark” or “without text” carry a meaning, and Turbo’s strong instruction‑following tends to respect them. (Tencent Cloud)

Useful patterns:

  • no text, no watermark, no logos
  • plain background, not busy or cluttered
  • no extra limbs, correct human anatomy
  • no motion blur, sharp focus
  • no lens distortion, no fisheye effect

Add these near the end of the prompt, after you’ve established the main scene.


5. Controlling nudity & sexualization (safely)

You specifically want to prevent nudity and unwanted “spice” the model might add. You can’t rely on negative prompts, so:

5.1 Always specify:

  1. Age & role

    • adult man, adult woman, adult couple, group of adults.
    • Put the word “adult” right next to human subjects to reduce ambiguity.
  2. Clothing & coverage

    • Be explicit and boring in a good way:

      • wearing a modest business suit
      • wearing a long‑sleeved shirt and trousers
      • wearing casual jeans and a hoodie
      • fully clothed, everyday outfit
    • If you care about coverage:

      • arms and legs covered, no exposed midriff, no revealing clothing.
  3. Context

    • Everyday, non‑sexual contexts: office, street scene, family living room, classroom, conference stage.
  4. Safety clause at the end

    • safe for work, non‑sexual, fully clothed characters, no nudity, no suggestive poses.

Example:

A full‑body photo of an adult man and adult woman walking together on a city street, casual weekend scene, both wearing jeans and light jackets, fully clothed, modest everyday outfits, natural smiles, candid street photography, soft afternoon light, shallow depth of field, safe for work, non‑sexual, no nudity, no revealing clothing, no suggestive poses, no logos or text, plain blurred background.

This uses multiple overlapping signals:

  • Clothing explicitly defined.
  • “Fully clothed” + “modest” + “non‑sexual” + “no nudity”.
  • Ordinary activity & setting.

5.2 Extra guardrails

Because Turbo is uncensored in many community builds, you may want non‑prompt safety, for example:

  • Automatic image moderation after generation.
  • Simple pre‑filter: reject prompts containing disallowed terms.
  • Manual review in sensitive applications.

Those are outside prompting itself, but they’re the most reliable layer for safety‑critical use.


6. Removing “baggage” from tokens (stereotypes & unwanted traits)

Tokens like “CEO”, “witch”, “rock star”, “fashion model” often bring along unwanted default looks (gender, body type, makeup, etc.). You can:

6.1 Override with explicit traits

Example: you don’t want “CEO” to default to a white male in a navy suit.

A professional group portrait of four adult colleagues of diverse ethnicities and genders, standing together in a modern office, smart‑casual outfits instead of suits, friendly but professional expressions, equal framing, respectful and realistic depiction, safe for work, no exaggerated features, no stereotypes.

Patterns:

  • diverse group of colleagues of different ages and ethnicities
  • realistic body types, no exaggerated proportions
  • light natural makeup or without makeup
  • clean‑shaven / short beard / no facial hair
  • no tattoos or visible tattoos on one arm only if you want control

6.2 Swap loaded tokens for neutral ones

Instead of:

  • businessman → use office worker, professional, team lead.
  • sexy outfit (don’t) → formal evening dress, modest cocktail dress.

6.3 Role + description instead of single labels

Z‑Image responds very well to “role + 2–3 traits”:

a software developer, adult woman, short dark hair, glasses, wearing a hoodie and jeans, focused expression, working at a laptop

This is more controllable than just “programmer”.


7. Using “negative‑style” constraints to fix quality issues

Traditional negative prompts are often used to remove artifacts: extra fingers, blur, weird eyes, etc. With Turbo, embed that logic into the positive prompt:

Common issues & phrasing:

  1. Hands & limbs

    • Add: correct human anatomy, natural hands and fingers, no extra limbs or fingers.
  2. Blur / noise

    • Add: sharp focus on the subject, clean detailed image, no motion blur, no grainy noise.
  3. Weird background clutter

    • Add: simple, uncluttered background, nothing distracting behind the subject.
  4. Logos / watermarks / stray text

    • Add: no text, no UI elements, no watermark, no branding, no company logos.

Even general diffusion tutorials emphasize that the model does learn “avoid X” semantics from negative‑ish phrasing; you’re just putting that in your one prompt instead of a dedicated negative field. (Tencent Cloud)


8. Bilingual prompting & text in images

Z‑Image Turbo is unusually good at English + Chinese text rendering and instruction following. (Hugging Face)

Tips:

  • Keep each chunk of text in one language:

    • Good: a movie poster with the title "THE QUIET CITY" at the top in bold English letters
    • Good: Chinese title text "回忆之味" in large characters at the top
  • Describe placement & style:

    • large white title at the top,
    • small subtitle line below,
    • Chinese vertical text along the right side.
  • For safety:

    • no additional text except the title,
    • no random numbers or watermarks.

Example bilingual poster prompt:

A minimalist movie poster, dark blue background with subtle city skyline silhouette at the bottom, big English title text "QUIET STREETS" centered near the top, small Chinese subtitle text "静谧之城" below it, clean modern typography, simple layout, no extra text, no logos, no watermark, safe for work.


9. Practical prompt templates (drop‑in & safe)

You can adapt these to your workflow. They’re written to be safe, clothed, and artifact‑controlled by default.

9.1 Photorealistic professional headshot

A close‑up headshot of an adult woman in her 30s, friendly confident expression, medium‑length dark hair, wearing a simple dark blazer over a light shirt, studio portrait setting, soft diffused daylight from the front, subtle blurred gray background, realistic photography, 85mm lens, shallow depth of field, detailed but natural skin, fully clothed, modest professional outfit, no jewelry except a small necklace, no logos, no text, no watermark, safe for work.


9.2 Group shot with diversity & no stereotypes

A medium‑wide photo of five adult coworkers of diverse ethnicities and genders standing together in an open‑plan office, casual smart outfits, laptops and desks in the background, soft natural daylight from large windows, relaxed but professional mood, realistic photography, realistic body types, respectful and non‑sexual depiction, fully clothed, no revealing clothing, no stereotypes, no logos, no text, clean unobtrusive background, safe for work.


9.3 Flat vector illustration (non‑realistic style, same safety)

A flat vector illustration of an adult woman riding a bicycle through a park, stylized trees and simple buildings in the background, limited pastel color palette, clean modern design, no outlines, minimal shading, modest casual clothing, long‑sleeved shirt and trousers, fully clothed, cheerful mood, clear blue sky, no text, no watermark, no logos, simple uncluttered background.


9.4 Manga / comic page layout

A black‑and‑white manga page with four panels, clean inked lines, each panel showing an adult heroine exploring a futuristic city, expressive faces but non‑sexualized, modest sci‑fi outfit covering arms and legs, dynamic but readable compositions, Japanese sound effects integrated into the scenery, no gore, no nudity, no suggestive poses, safe for work, clear panel borders, no watermarks or logos.


9.5 Product shot

A studio product photo of a modern wireless headset placed on a white table, soft shadow, subtle reflection, bright neutral background, soft box lighting from top left, realistic photography, 4K resolution look, extremely sharp details, no hands, no people, no text, no logos besides a simple generic symbol on the headset, no watermark, plain white background.


10. Quick checklist before you hit “Generate”

When you’re about to send a prompt to Z‑Image Turbo, ask:

  1. Did I explicitly say “adult” for all human subjects?
  2. Did I describe clothing clearly and modestly?
  3. Did I include a “safe for work / no nudity / non‑sexual” clause when needed?
  4. Did I specify background simplicity and “no logos / no watermark / no extra text”?
  5. Did I define angle, shot type, lighting, and mood clearly?
  6. Is any “negative prompt” box in my UI actually doing anything for Turbo? (Often: no.)

If those are all “yes”, you’re using Z‑Image Turbo in the way it was designed: long, structured, camera‑style prompts with explicit safety and quality constraints baked in, instead of relying on old‑school negative prompts that Turbo simply doesn’t support.

If you’d like, next we can design a few domain‑specific prompt templates (e.g., e‑commerce, game concept art, training data generation) tailored to how you’re using Z‑Image Turbo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment