How to Tag Images for LoRA Training: Complete Guide (Free Browser Tool)
Training a LoRA that actually works depends less on your GPU and more on the quality of your dataset tags. Most people obsess over training parameters while their tagging is inconsistent, noisy, and missing critical information. This guide fixes that.
Why Tagging Matters More Than You Think
When you train a LoRA, the model learns to associate the tags in your .txt files with the visual content of the paired images. If your tags are wrong or inconsistent, the model learns the wrong associations.
The practical consequences:
- A style LoRA that "bleeds" character features because you forgot to blacklist them
- A character LoRA that won't activate reliably because the trigger word is absent from half the files
- A model that over-fits to pose because every image has standing, looking_at_viewer in the tags
Good tagging is not optional. It is the foundation your training runs on.
What Are Training Tags?
Tags are plain text labels stored in .txt files alongside your images. Each .txt file has the same base name as its paired image — 000.png gets 000.txt, 001.jpg gets 001.txt, and so on.
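This pairing convention is easy to sanity-check before training. The helper below is an illustrative sketch, not part of any trainer:

```python
from pathlib import Path

def find_unpaired(folder: str) -> list[str]:
    """List image files in `folder` that have no matching .txt caption."""
    image_exts = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}
    images = [p for p in Path(folder).iterdir() if p.suffix.lower() in image_exts]
    # An image is paired if a .txt with the same base name exists
    return sorted(p.name for p in images if not p.with_suffix(".txt").exists())
```

Any names it returns are images whose captions are missing and need fixing before you start a run.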
A typical .txt file looks like this:
my_style_v1, watercolor, ink_wash, soft_shading, bokeh, upper_body, long_hair, looking_at_viewer, white_background
The tags come from the Danbooru tag system — a large, structured vocabulary developed by the anime image community. Tags cover everything from artistic medium (watercolor, digital_media) to composition (close_up, full_body) to subject details (blue_eyes, short_hair).
Your trainer (kohya_ss, SD-scripts, EveryDream) reads these files during training. The model learns that images tagged with watercolor tend to look a certain way, and gradually adjusts its weights to replicate that style when prompted.
Manual vs Automatic Tagging
You could write every .txt file by hand. For 10 images this is fine. For 100 images it is tedious. For 500+ images it is impractical, and manual tags introduce inconsistency — you will describe the same visual feature five different ways across your dataset.
Automatic taggers solve this. They run a computer vision model over each image and output a ranked list of Danbooru tags sorted by confidence. You review and edit the output rather than starting from scratch.
The gold standard automatic tagger in the Stable Diffusion community is WD ViT Tagger v3 by SmilingWolf — the same model used inside AUTOMATIC1111, ComfyUI, and kohya_ss's built-in captioning tool.
WD ViT Tagger v3 Explained
WD ViT Tagger v3 uses a Vision Transformer (ViT) architecture trained on millions of Danbooru images. It predicts thousands of tags simultaneously and assigns each a confidence score between 0 and 1.
You set a single confidence threshold for the whole run: tags scoring above it are kept, those below are dropped. The default of 0.35 gives a good balance for most datasets.
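Conceptually, the thresholding step is just a filter over (tag, confidence) pairs. A minimal sketch (the tag names and scores below are made up for illustration):

```python
def filter_tags(scores: dict[str, float], threshold: float = 0.35) -> list[str]:
    """Keep tags at or above the confidence threshold, highest first."""
    kept = [(tag, s) for tag, s in scores.items() if s >= threshold]
    return [tag for tag, _ in sorted(kept, key=lambda kv: kv[1], reverse=True)]

scores = {"watercolor": 0.81, "long_hair": 0.62, "smile": 0.34, "hat": 0.12}
print(filter_tags(scores))        # ['watercolor', 'long_hair']
print(filter_tags(scores, 0.3))   # ['watercolor', 'long_hair', 'smile']
```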
The model separates tags into categories:
- General tags — visual content, style, composition, colours
- Character tags — named characters from anime, games, and media
- Rating tags — content rating (safe, sensitive, explicit)
Character tags are held to a higher internal confidence bar to reduce false positives — the model is conservative about claiming it recognises a specific character.
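One way to implement that stricter bar is a per-category threshold. The 0.85 character threshold below is an illustrative assumption, not a documented value from the model:

```python
def filter_predictions(preds: list[tuple[str, str, float]],
                       general_thresh: float = 0.35,
                       character_thresh: float = 0.85) -> list[str]:
    """Filter (tag, category, confidence) triples, with a stricter bar for characters."""
    kept = []
    for tag, category, conf in preds:
        bar = character_thresh if category == "character" else general_thresh
        if conf >= bar:
            kept.append(tag)
    return kept

preds = [("watercolor", "general", 0.80),
         ("some_character", "character", 0.60),  # dropped: below the character bar
         ("long_hair", "general", 0.50)]
print(filter_predictions(preds))  # ['watercolor', 'long_hair']
```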
Step-by-Step: Using the Free Browser Tool
You do not need Python, CUDA, or any local installation. The tagger runs entirely in your browser at mohsindev369.dev/tools/dataset-tagger.
Step 1: Upload your images
Drag and drop a folder of images directly onto the upload area, or click to browse. You can also upload a .zip file containing your dataset. Supported formats: JPG, PNG, WebP, BMP. There are no file size limits.
Step 2: Set your trigger word
The trigger word is a unique term prepended to every .txt file. Your LoRA learns to associate this word with the common visual element across all images — the style, character, or concept you are training.
Choose something specific and uncommon: tide_ocean_style_v2 rather than ocean, my_character_v1 rather than girl. Generic words exist in the base model's training data already, which means your LoRA will conflict with existing associations.
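Prepending the trigger is simple enough to script if you ever need to patch a dataset by hand. A hypothetical helper:

```python
def prepend_trigger(caption: str, trigger: str) -> str:
    """Make the trigger word the first tag, without duplicating it."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    if trigger in tags:
        tags.remove(trigger)
    return ", ".join([trigger] + tags)

print(prepend_trigger("watercolor, long_hair", "tide_ocean_style_v2"))
# tide_ocean_style_v2, watercolor, long_hair
```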
Step 3: Adjust the confidence threshold
The default 0.35 works for most datasets. Go lower (0.2–0.28) to capture more style and texture tags. Go higher (0.5+) for a cleaner, stricter tag set. For style LoRAs, lower thresholds work better because subtle artistic features have lower confidence scores.
Step 4: Build your blacklist
This is where most people skip a critical step. The blacklist removes specific tags from all outputs regardless of confidence. It is essential for style LoRAs.
For a style LoRA, blacklist subject tags: 1girl, 1boy, sitting, standing, looking_at_viewer, smile, dress, shirt. If you leave these in, your model partially learns those poses and outfits as part of the style — making it less reliable as a pure style activator.
For a character LoRA, leave the blacklist empty or minimal. You want the model to learn subject details.
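In code terms the blacklist is a set-membership filter applied after thresholding. A sketch with an example style blacklist:

```python
def apply_blacklist(tags: list[str], blacklist: set[str]) -> list[str]:
    """Drop blacklisted tags no matter how confident the tagger was."""
    return [t for t in tags if t not in blacklist]

style_blacklist = {"1girl", "1boy", "sitting", "standing", "looking_at_viewer", "smile"}
tags = ["my_style_v1", "watercolor", "1girl", "smile", "soft_lighting"]
print(apply_blacklist(tags, style_blacklist))
# ['my_style_v1', 'watercolor', 'soft_lighting']
```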
Step 5: Tag the dataset
Click the auto-tag button. The tagger loads the WD ViT Tagger v3 model into your browser (about 350MB, cached after first use) and processes each image. Progress shows per image. The whole process takes a few minutes for 100 images on a modern machine.
Step 6: Review and edit per image
After tagging, flip through the image grid. Each image shows its tags as clickable chips. Click the × on any chip to remove it. Type in the add field to insert a custom tag. Common edits: removing wrong character detections, adding your trigger word if the auto-insert missed it, or correcting tags for unusual art styles.
Step 7: Download and use
Click download. You get a .zip containing sequentially named images (000.png, 001.png) and matching .txt files. Drop the extracted folder directly into kohya_ss, SD-scripts, or EveryDream as your dataset directory — no reformatting needed.
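If you assemble a dataset from other sources and want the same sequential layout, the renaming is easy to script. The helper below is a sketch, not the tool's actual code:

```python
import shutil
from pathlib import Path

def copy_sequential(src: str, dst: str) -> None:
    """Copy image/caption pairs into dst as 000.*, 001.*, and so on."""
    image_exts = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}
    Path(dst).mkdir(parents=True, exist_ok=True)
    images = sorted(p for p in Path(src).iterdir() if p.suffix.lower() in image_exts)
    for i, img in enumerate(images):
        shutil.copy(img, Path(dst) / f"{i:03d}{img.suffix.lower()}")
        caption = img.with_suffix(".txt")  # bring the paired caption along
        if caption.exists():
            shutil.copy(caption, Path(dst) / f"{i:03d}.txt")
```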
Style LoRA vs Character LoRA: Different Approaches
Style LoRA tagging strategy:
- Lower threshold (0.2–0.3) to capture subtle artistic features
- Aggressive blacklist — strip all character, pose, and clothing tags
- Focus on: medium (watercolor, digital_media), lighting (soft_lighting, backlighting), texture (rough_texture, hatching), colour treatment
Character LoRA tagging strategy:
- Standard threshold (0.35–0.45)
- Minimal or no blacklist
- Focus on: physical features, distinctive clothing, accessories, hair and eye colour
- Consider adding a descriptive trigger alongside a unique word: my_char_v1, silver_hair, red_eyes
Tip: For character LoRAs, include some images with different poses, expressions, and outfits if possible. A dataset of 20 images all in the same pose trains a model that is good at that one pose.
Common Tagging Mistakes
Threshold too low. Setting 0.1 floods every file with marginal tags. Noisy tags teach the model to ignore the signal. Stay above 0.2 and use the blacklist rather than a very low threshold.
Forgetting the trigger word. If even a few images lack the trigger, the model's association weakens. Always verify trigger word presence before downloading.
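That verification is scriptable. This sketch flags caption files that are missing the trigger:

```python
from pathlib import Path

def missing_trigger(folder: str, trigger: str) -> list[str]:
    """Return caption files whose tag list lacks the trigger word."""
    bad = []
    for txt in sorted(Path(folder).glob("*.txt")):
        tags = [t.strip() for t in txt.read_text().split(",")]
        if trigger not in tags:
            bad.append(txt.name)
    return bad
```

An empty result means every image will reinforce the trigger association during training.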
Not reviewing the output. The tagger is not perfect. Unusual art styles, cropped compositions, and mixed media confuse it. Spend 10 minutes reviewing — it is the highest-ROI step in your dataset preparation.
Inconsistent tag vocabulary. Do not manually add tags using different synonyms. If the tagger uses blonde_hair, do not add yellow_hair to some images. Inconsistency reduces the model's ability to learn a clean concept.
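A tag-frequency count across the dataset makes synonym drift easy to spot, for example blonde_hair appearing 40 times next to yellow_hair appearing 3 times. A sketch:

```python
from collections import Counter
from pathlib import Path

def tag_frequencies(folder: str) -> Counter:
    """Count how often each tag appears across all .txt caption files."""
    counts = Counter()
    for txt in Path(folder).glob("*.txt"):
        counts.update(t.strip() for t in txt.read_text().split(",") if t.strip())
    return counts
```

Sort the counter and scan the low-frequency tail for near-duplicates of your high-frequency tags.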
What to Do With Your Tagged Dataset
Once you have your tagged zip:
- Extract it to a folder named with your repeat count: 10_my_style_v1 for 10 repeats, 20_my_char_v1 for 20.
- Drop it into your kohya_ss dataset directory or point your trainer config at it.
- Run a small test before a full training run — 500 steps to check whether the trigger activates correctly before committing hours of GPU time.
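The repeat-count convention is just a naming scheme, easy to generate if you script your dataset prep:

```python
def dataset_dirname(repeats: int, name: str) -> str:
    """kohya_ss-style dataset folder name: '<repeats>_<name>'."""
    return f"{repeats}_{name}"

print(dataset_dirname(10, "my_style_v1"))  # 10_my_style_v1
```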
For the full training setup, the kohya_ss wiki and the Civitai guides are the most up-to-date resources. The tagger handles the labelling step — the rest of the pipeline is covered elsewhere.
The free browser tagger is at mohsindev369.dev/tools/dataset-tagger. It works offline after the first load and processes everything locally — your images never leave your device.