Generation¶
Generate synthetic images from real images using Stable Diffusion + ControlNet.
Condition Extractors¶
Each extractor transforms a real image into a control signal that guides generation:
| Extractor | Description | Requires |
|---|---|---|
canny |
Canny edge detection | : |
openpose |
Human pose estimation | : |
segmentation |
YOLOv8-seg instance segmentation | ultralytics |
mediapipe_face |
Face mesh landmarks | mediapipe |
Usage¶
Python API¶
from ciagen import generate
result = generate(
source="data/real/train/images/",
output="data/generated/",
extractor="canny",
sd_model="fennecinspace/sd-v15",
cn_model="lllyasviel/sd-controlnet-canny",
num_per_image=3,
prompt="a person walking",
seed=42,
quality=30,
guidance_scale=7.0,
)
CLI¶
ciagen generate \
--source data/real/train/images/ \
--output data/generated/ \
--extractor canny \
--sd-model fennecinspace/sd-v15 \
--cn-model lllyasviel/sd-controlnet-canny \
--num 3 \
--prompt "a person walking"
Hydra¶
python run.py task=gen model.cn_use=lllyasviel_canny prompt.base='["a person walking"]'
Prompts¶
Three prompt strategies:
- Fixed prompt : pass
prompt="a person in a park" - Caption-based : set
use_captions=Trueto read per-image.txtcaption files - Vocabulary-modified : set
modify_captions=Truewith a vocabulary template to generate prompt variations
Output¶
Generated images are saved as {original_name}_{index}.png in the output directory. A metadata.yaml file is created with the generation configuration.