Wan-2.1 LoRA Trainer

video trainer

fal-ai/wan-trainer/t2v-14b

Motion LoRAs for Wan 2.1 text-to-video at full 14B size.

The same trainer as the 1.3B version, but running on the 14B Wan 2.1 model: slower and costlier, with noticeably higher output quality. Mix videos (for motion) and images (for appearance) in the zip. Like all Wan 2.1 trainers, its LoRAs target Wan 2.1, not the catalog's Wan 2.2 endpoint.

What goes in the zip

A zip with at least 10 videos and/or images. Each file can have an optional caption in a .txt file with the same base name as the media file.
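The zip rules above can be checked before uploading. The following is a minimal sketch, not part of the fal client: checkDataset and splitName are hypothetical helpers, and the extension list is an assumption, since the schema only says "images and/or videos".

```typescript
// Hypothetical extension list: the schema does not name accepted formats.
const MEDIA_EXTENSIONS = [".mp4", ".mov", ".webm", ".png", ".jpg", ".jpeg", ".webp"];

interface DatasetReport {
  mediaCount: number;
  captioned: string[];   // media files that have a matching .txt caption
  uncaptioned: string[]; // media files without a caption
  enoughMedia: boolean;  // true when there are at least 10 media files
}

// Split "clip01.mp4" into base "clip01" and lowercase extension ".mp4".
function splitName(file: string): { base: string; ext: string } {
  const dot = file.lastIndexOf(".");
  if (dot <= 0) return { base: file, ext: "" };
  return { base: file.slice(0, dot), ext: file.slice(dot).toLowerCase() };
}

// Count media files and pair each one with a caption of the same base name.
function checkDataset(files: string[]): DatasetReport {
  const captionBases = new Set(
    files.map(splitName).filter((f) => f.ext === ".txt").map((f) => f.base),
  );
  const media = files.filter((f) => MEDIA_EXTENSIONS.includes(splitName(f).ext));
  const captioned = media.filter((f) => captionBases.has(splitName(f).base));
  const uncaptioned = media.filter((f) => !captionBases.has(splitName(f).base));
  return {
    mediaCount: media.length,
    captioned,
    uncaptioned,
    enoughMedia: media.length >= 10,
  };
}
```

Passing the file names from the archive listing gives a quick view of which clips still lack captions. A .txt file with no matching media file is simply ignored.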

Good starting point

number_of_steps: 400
learning_rate: 0.0002

Parameters

Schema facts come straight from the fal API; the notes are ours.

Required

training_data_url (string, required)

URL to a zip archive of your training videos and/or images, optionally with matching .txt caption files.

In the atelier: The album you hand the painter. It is the single biggest factor in what the LoRA becomes.

Tip: 15 to 30 sharp, varied images beat 200 sloppy ones. Vary angle, lighting and background; keep the subject consistent.

Watch out: Duplicate or near-duplicate images push the LoRA toward memorizing instead of learning.

Raw schema description

URL to zip archive with images of a consistent style. Try to use at least 10 images and/or videos, although more is better. In addition to images the archive can contain text files with captions. Each text file should have the same name as the image/video file it corresponds to.

Optional

number_of_steps (integer, default: 400, range: 100 – 20000)

How many training iterations the model runs on your dataset. More steps means the LoRA sees your images more times.

In the atelier: Practice repetitions. Too few and the painter never picks up the skill. Too many and he stops learning and starts memorizing your exact photos.

Tip: The 400-step default is a reasonable place to start; around 1000 is a solid target for a 15 to 30 item subject dataset. Small datasets need fewer steps, not more.

Watch out: If outputs start reproducing your training photos almost exactly (same pose, same background), you overtrained. Go back down.

Raw schema description

The number of steps to train for.
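To make "sees your images more times" concrete, here is a rough estimate of how often each item is revisited. It assumes one training item per step (batch size 1), which the schema does not state, so treat the numbers as an order of magnitude only. passesPerItem is a hypothetical helper.

```typescript
// Rough estimate only: assumes one item per step (batch size 1),
// which is an assumption and not part of the published schema.
function passesPerItem(numberOfSteps: number, datasetSize: number): number {
  if (datasetSize <= 0) throw new RangeError("datasetSize must be positive");
  return numberOfSteps / datasetSize;
}

// The default 400 steps over a 20-item dataset: 20 passes per item.
console.log(passesPerItem(400, 20));

// 1000 steps over the same 20 items: 50 passes per item.
console.log(passesPerItem(1000, 20));
```

Under that assumption, the same step count on a smaller dataset means more passes per item, which is why small datasets overtrain sooner.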

learning_rate (number, default: 2e-4, range: 0.000001 – 1)

How big each learning update is. Controls how aggressively the model changes per step.

In the atelier: The painter's eagerness. A high rate is frantic practice: fast but sloppy, and it can wreck habits he already had. A low rate is careful practice: slow, but precise.

Tip: Stay near the trainer's default unless you have a reason. If results look fried or oversaturated, lower it. If the subject barely shows after many steps, raise it slightly or add steps.

Watch out: Learning rate and steps trade off against each other. Doubling both at once is how datasets get burned.

Raw schema description

The rate at which the model learns. Higher values can lead to faster training, but over-fitting.
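The ranges listed for number_of_steps and learning_rate can be checked locally before a job is submitted. This is a sketch under the assumption that the limits shown on this page are current; validateOptions and TrainerOptions are hypothetical names, not part of the fal client.

```typescript
// Limits copied from the parameter headers on this page.
const LIMITS = {
  number_of_steps: { min: 100, max: 20000 },
  learning_rate: { min: 0.000001, max: 1 },
} as const;

interface TrainerOptions {
  number_of_steps?: number;
  learning_rate?: number;
}

// Return a list of human-readable problems; an empty list means the values fit.
function validateOptions(opts: TrainerOptions): string[] {
  const problems: string[] = [];

  const steps = opts.number_of_steps;
  if (steps !== undefined) {
    if (!Number.isInteger(steps)) {
      problems.push("number_of_steps must be an integer");
    } else if (steps < LIMITS.number_of_steps.min || steps > LIMITS.number_of_steps.max) {
      problems.push("number_of_steps must be between 100 and 20000");
    }
  }

  const lr = opts.learning_rate;
  if (lr !== undefined) {
    if (!Number.isFinite(lr) || lr < LIMITS.learning_rate.min || lr > LIMITS.learning_rate.max) {
      problems.push("learning_rate must be between 0.000001 and 1");
    }
  }

  return problems;
}
```

Omitted fields are skipped, since both parameters are optional and fall back to their defaults.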

trigger_phrase (string)

A unique word or phrase baked into your captions that activates the LoRA at inference time.

In the atelier: The skill's calling word. Say it in the prompt and the painter knows to use the bracelet.

Tip: Pick something that is not a real word, like TOK or OHWX, so it does not collide with anything the base model already knows.

Watch out: If you train with a trigger and forget it in your prompts later, the LoRA will seem weak or broken.

Raw schema description

The phrase that will trigger the model to generate an image.
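Forgetting the trigger at inference time is easy to guard against in code. The sketch below uses a hypothetical helper, withTrigger, that prepends the phrase when a prompt lacks it. Putting the phrase at the start of the prompt is an assumption, not a documented requirement.

```typescript
// Prepend the trigger phrase unless the prompt already contains it.
// Simple case-insensitive substring check: a trigger that happens to appear
// inside a longer word would also count as present.
function withTrigger(prompt: string, trigger: string): string {
  const hasTrigger = prompt.toLowerCase().includes(trigger.toLowerCase());
  return hasTrigger ? prompt : `${trigger} ${prompt}`;
}

console.log(withTrigger("a dancer spinning in the rain", "TOK"));
// prints "TOK a dancer spinning in the rain"
```

A made-up token such as TOK or OHWX keeps the substring check from matching ordinary words by accident.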

auto_scale_input (boolean, default: false)

Automatically rescales each training video to 81 frames at 16 fps, per the raw schema description.

Tip: It is off by default. Turn it on unless you have pre-sized everything deliberately.

Raw schema description

If true, the input will be automatically scale the video to 81 frames at 16fps.
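For a sense of scale, the 81-frame, 16 fps target in the schema description corresponds to a clip of roughly five seconds. A small worked calculation, with clipSeconds as a hypothetical helper:

```typescript
// Target values taken from the raw schema description above.
const TARGET_FRAMES = 81;
const TARGET_FPS = 16;

// Duration in seconds of a clip with the given frame count and frame rate.
function clipSeconds(frames: number, fps: number): number {
  if (fps <= 0) throw new RangeError("fps must be positive");
  return frames / fps;
}

console.log(clipSeconds(TARGET_FRAMES, TARGET_FPS)); // 5.0625
```

How a longer or shorter source clip is fitted to that target (trimming, resampling, or something else) is not stated in the schema.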

Call it

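import { fal } from "@fal-ai/client";

// Submit the training job to the queue and wait for the result.
const result = await fal.subscribe("fal-ai/wan-trainer/t2v-14b", {
  input: {
    // Zip of training videos and/or images, with optional .txt captions.
    training_data_url: "https://your-cdn.com/dataset.zip",
    number_of_steps: 400,
    learning_rate: 0.0002,
    trigger_phrase: "TOK",
  },
  logs: true, // request logs with the queue status updates
});
console.log(result.data);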