StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.
Get started chatting with StableLM-Tuned-Alpha by using the following code snippet:
```python
from modelscope.utils.constant import Tasks
from modelscope.pipelines import pipeline

# Load the 7B tuned checkpoint as a ModelScope text-generation pipeline.
pipe = pipeline(task=Tasks.text_generation, model='AI-ModelScope/stablelm-tuned-alpha-7b', model_revision='v1.0.2', device='cuda')

system_prompt = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

# Every prompt starts with the system prompt, followed by alternating
# <|USER|> and <|ASSISTANT|> turns; generation continues after <|ASSISTANT|>.
prompt = f"{system_prompt}<|USER|>What's your mood today?<|ASSISTANT|>"
result = pipe(prompt)
print(result)
```
StableLM Tuned should be used with prompts formatted as `<|SYSTEM|>...<|USER|>...<|ASSISTANT|>...`.
The system prompt is:
```
<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
```
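For multi-turn conversations the same special tokens simply repeat for each turn. Below is a minimal sketch of a helper that assembles such a prompt; `build_prompt` and its argument names are illustrative, not part of the modelscope API, and it reuses `system_prompt` and `pipe` from the snippet above:

```python
# Illustrative helper (hypothetical, not part of the modelscope API):
# assembles a multi-turn chat into the <|SYSTEM|>...<|USER|>...<|ASSISTANT|>... format.
def build_prompt(system_prompt, history, next_user_message):
    """history is a list of (user_message, assistant_reply) pairs from earlier turns."""
    prompt = system_prompt  # already begins with <|SYSTEM|>
    for user_message, assistant_reply in history:
        prompt += f"<|USER|>{user_message}<|ASSISTANT|>{assistant_reply}"
    # Leave the final <|ASSISTANT|> tag open so the model generates the next reply.
    prompt += f"<|USER|>{next_user_message}<|ASSISTANT|>"
    return prompt

prompt = build_prompt(system_prompt,
                      [("Hi!", "Hello! How can I help you today?")],
                      "What's your mood today?")
result = pipe(prompt)
```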
Fine-tuned checkpoints (StableLM-Tuned-Alpha) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca.
Contact: For questions and comments about the model, please email lm@stability.ai.
| Parameters | Hidden Size | Layers | Heads | Sequence Length |
|---|---|---|---|---|
| 3B | 4096 | 16 | 32 | 4096 |
| 7B | 6144 | 16 | 48 | 4096 |
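As a rough sanity check, the headline parameter counts follow from the standard decoder-only estimate of about 12 × layers × hidden² weights. The snippet below is a back-of-the-envelope calculation under that assumption, not an exact accounting of the checkpoints:

```python
# Rule-of-thumb transformer size: ~12 * layers * hidden^2 parameters
# (4h^2 for Q/K/V/output projections + 8h^2 for a 4x-wide MLP per layer),
# ignoring embedding tables, biases, and layer norms.
def approx_params(layers, hidden):
    return 12 * layers * hidden ** 2

for name, layers, hidden in [("3B", 16, 4096), ("7B", 16, 6144)]:
    print(f"{name}: ~{approx_params(layers, hidden) / 1e9:.2f}B")
# 3B: ~3.22B
# 7B: ~7.25B
```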
StableLM-Tuned-Alpha models are fine-tuned on a combination of five datasets:
- Alpaca: a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine;
- GPT4All Prompt Generations: which consists of 400k prompts and responses generated by GPT-4;
- Anthropic HH: made up of preferences about AI assistant helpfulness and harmlessness;
- DataBricks Dolly: comprising 15k instruction/response pairs generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization;
- and ShareGPT Vicuna (English subset): a dataset of conversations retrieved from ShareGPT.
Models are learned via supervised fine-tuning on the aforementioned datasets, trained in mixed precision (FP16), and optimized with AdamW. We outline the following hyperparameters:
| Parameters | Batch Size | Learning Rate | Warm-up (steps) | Weight Decay | Betas |
|---|---|---|---|---|---|
| 3B | 256 | 2e-5 | 50 | 0.01 | (0.9, 0.99) |
| 7B | 128 | 2e-5 | 100 | 0.01 | (0.9, 0.99) |
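For reference, the table maps directly onto a standard PyTorch optimizer setup. The sketch below reproduces the 7B settings with `torch.optim.AdamW` and a linear warm-up; the warm-up shape (linear, then constant) and the stand-in model are assumptions for illustration, not the authors' training script:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(8, 8)  # stand-in for the 7B checkpoint (illustration only)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,            # Learning Rate
    betas=(0.9, 0.99),  # Betas
    weight_decay=0.01,  # Weight Decay
)

# Linear warm-up over the first 100 steps (the 7B "Warm-up" column), then a
# constant learning rate; the post-warm-up schedule is an assumption here.
warmup_steps = 100
scheduler = LambdaLR(optimizer, lambda step: min(1.0, (step + 1) / warmup_steps))
```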
These models are intended to be used by the open-source community in chat-like applications, in adherence with the CC BY-NC-SA-4.0 license.
Although the aforementioned datasets help to steer the base language models into "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask users to be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as a source of truth. Please use responsibly.
This work would not have been possible without the helpful hand of Dakota Mahan (@dmayhem93).
```bibtex
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

@misc{vicuna2023,
  title = {Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality},
  url = {https://vicuna.lmsys.org},
  author = {Chiang, Wei-Lin and Li, Zhuohan and Lin, Zi and Sheng, Ying and Wu, Zhanghao and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao and Gonzalez, Joseph E. and Stoica, Ion and Xing, Eric P.},
  month = {March},
  year = {2023}
}

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}
```