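Efficient tuning of generative diffusion models. With the ControlLora-Tuner module, only a small number of parameters need to be trained to build a conditional generation model customized for your own scenario.
The base diffusion model is the pretrained Stable-Diffusion-v1-5; the trainable tuning module (ControlLora-Tuner) accounts for about 0.1% of the total model parameters.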
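To give a feel for why a LoRA-style tuner adds so few trainable parameters: LoRA (Hu et al., cited below) freezes a weight matrix of shape d_out x d_in and learns a rank-r update, which adds r * (d_in + d_out) parameters. The sketch below computes that ratio for a single layer. The layer size and rank are illustrative assumptions, not this model's actual configuration, and the roughly 0.1% figure above refers to the whole model rather than to one layer.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (frozen, trainable) parameter counts for one LoRA-adapted linear layer."""
    frozen = d_in * d_out              # original weight matrix, kept fixed
    trainable = rank * (d_in + d_out)  # low-rank factors A (rank x d_in) and B (d_out x rank)
    return frozen, trainable


# Hypothetical layer size and rank, chosen only for illustration.
frozen, trainable = lora_param_counts(d_in=768, d_out=768, rank=4)
ratio = trainable / frozen
print(f'frozen={frozen}, trainable={trainable}, ratio={ratio:.4%}')
```

The per-layer ratio shrinks as the layer gets wider or the rank gets smaller, and the fraction of the full model is lower still when only some layers are adapted.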
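Built on the ModelScope framework, the model can be run quickly through a predefined Pipeline.
Note: the input 'cond' (condition image) currently supports only square images, with equal width and height.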
from modelscope.pipelines import pipeline

sd_control_lora_pipeline = pipeline(
    'efficient-diffusion-tuning',
    'damo/multi-modal_efficient-diffusion-tuning-control-lora')
inputs = {
    'prompt': 'pale golden rod circle with old lace background',
    'cond': 'https://modelscope.oss-cn-beijing.aliyuncs.com/test/images/efficient_diffusion_tuning_sd_control_lora_source.png'
}
result = sd_control_lora_pipeline(inputs)
print(f'Output: {result}.')
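Because 'cond' must be square, a non-square condition image has to be cropped or padded before it is passed to the pipeline. The sketch below computes a center-crop box for that purpose. `center_square_box` is a hypothetical helper, not part of ModelScope, and center-cropping is only one possible choice. The returned tuple follows the (left, upper, right, lower) convention that Pillow's `Image.crop` accepts, assuming Pillow is used to load the image.

```python
def center_square_box(width, height):
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# Example: a 640x480 image is cropped to its central 480x480 region.
box = center_square_box(640, 480)
print(box)
```

With Pillow, the box could be applied as `image.crop(center_square_box(*image.size))` before saving or uploading the condition image.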
The following procedure trains and evaluates the SD-ControlLora model on the fill50k dataset.
import tempfile

from modelscope.metainfo import Trainers
from modelscope.msdatasets import MsDataset
from modelscope.trainers import build_trainer
from modelscope.utils.constant import DownloadMode

# Model ID
model_id = 'damo/multi-modal_efficient-diffusion-tuning-control-lora'

# Load the training set (first 100 samples only)
train_dataset = MsDataset.load(
    'controlnet_dataset_condition_fill50k',
    namespace='damo',
    split='train',
    download_mode=DownloadMode.FORCE_REDOWNLOAD).to_hf_dataset(  # noqa
    ).select(range(100))  # noqa

# Load the validation set (first 20 samples only)
eval_dataset = MsDataset.load(
    'controlnet_dataset_condition_fill50k',
    namespace='damo',
    split='validation',
    download_mode=DownloadMode.FORCE_REDOWNLOAD).to_hf_dataset(  # noqa
    ).select(range(20))  # noqa

tmp_dir = tempfile.TemporaryDirectory().name  # Use a temporary directory as the working directory
max_epochs = 1  # Number of training epochs


# Modify the configuration file
def cfg_modify_fn(cfg):
    cfg.train.max_epochs = max_epochs  # Maximum number of training epochs
    cfg.train.lr_scheduler.T_max = max_epochs  # Learning-rate scheduler parameter
    cfg.model.inference = False  # Put the model in training (non-inference) mode
    return cfg


# Build the trainer
kwargs = dict(
    model=model_id,  # Model ID
    work_dir=tmp_dir,  # Working directory
    train_dataset=train_dataset,  # Training set
    eval_dataset=eval_dataset,  # Validation set
    cfg_modify_fn=cfg_modify_fn  # Callback that modifies the training configuration
)
trainer = build_trainer(name=Trainers.vision_efficient_tuning, default_args=kwargs)

# Train
trainer.train()

# Evaluate
result = trainer.evaluate()
print('result:', result)
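The configuration callback above sets `cfg.train.lr_scheduler.T_max` to the number of epochs. The source does not say which scheduler the configuration uses. If it is a cosine-annealing schedule in the style of PyTorch's `CosineAnnealingLR` (an assumption), `T_max` is the number of steps over which the learning rate decays from its initial value to its minimum. A sketch of that closed-form schedule in plain Python, with `base_lr`, `eta_min`, and `t_max` chosen only for illustration:

```python
import math


def cosine_annealing_lr(step, t_max, base_lr, eta_min=0.0):
    """Closed-form cosine annealing: base_lr at step 0, eta_min at step t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max))


# Illustrative values only; the real ones come from the model's configuration file.
schedule = [cosine_annealing_lr(s, t_max=4, base_lr=1e-4) for s in range(5)]
print(schedule)
```

Under this assumption, keeping `T_max` equal to `max_epochs` means the rate reaches its minimum exactly at the end of training, which is why the callback updates both values together.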
If this model is helpful to you, please cite the following related papers:
@inproceedings{hu2021lora,
title={{LoRA}: Low-Rank Adaptation of Large Language Models},
author={Hu, Edward and Shen, Yelong and Wallis, Phil and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Lu and Chen, Weizhu},
booktitle={ICLR},
year={2021}
}
@misc{rombach2021highresolution,
title={High-Resolution Image Synthesis with Latent Diffusion Models},
author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
year={2021},
eprint={2112.10752},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@software{wu2023controllora,
author = {Wu Hecong},
month = {2},
title = {{ControlLoRA: A Light Neural Network To Control Stable Diffusion Spatial Information}},
url = {https://github.com/HighCWu/ControlLoRA},
version = {1.0.0},
year = {2023}
}