BiLLa: A Bilingual LLaMA with Enhanced Reasoning Ability

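BiLLa is an open-source bilingual (Chinese-English) LLaMA model with enhanced reasoning ability. Its main features are:

Substantially improves LLaMA's Chinese understanding while minimizing damage to the original LLaMA's English ability;
Adds a large amount of task-oriented data to training, with explanations generated by ChatGPT, to strengthen the model's grasp of task-solving logic;
Updates all parameters (full-parameter training) in pursuit of better generation quality.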

GitHub: https://github.com/Neutralzz/BiLLa

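The following conclusions are drawn from a limited evaluation:

BiLLa-7B-LLM significantly outperforms Chinese-LLaMA-7B in both Chinese and English language modeling;
BiLLa-7B-SFT significantly outperforms models such as BELLE-LLaMA-Ext-7B in Chinese reasoning;
As scored by GPT-4, BiLLa-7B-SFT scores significantly higher than ChatGLM-6B on English instructions and roughly on par on Chinese instructions, though it scores higher on problem solving and coding.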

Code example

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Load the BiLLa-7B-SFT text-generation pipeline at a pinned model revision.
pipe = pipeline(task=Tasks.text_generation, model='AI-ModelScope/BiLLa-7B-SFT',
                device_map='auto', model_revision='v1.0.7')
# The prompt must follow the "Human: ...\nAssistant: " input format.
inputs = 'Human: Write a Python function that checks if a given number is even or odd.\nAssistant: '
result = pipe(inputs, min_length=10, max_length=512, num_beams=3, temperature=0.8, do_sample=False,
              early_stopping=True, top_k=50, top_p=0.8, repetition_penalty=1.2, length_penalty=1.2,
              no_repeat_ngram_size=6, max_new_tokens=1024)
print(result)

Input format

Human: [Your question]
Assistant:
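
The format above can be produced with a small helper. This is a minimal sketch for single-turn prompts only; `build_prompt` is a hypothetical name, not part of BiLLa or ModelScope, and the trailing space after `Assistant:` is an assumption taken from the code example in this README.

```python
# Hypothetical helper (not part of BiLLa or ModelScope): formats one
# single-turn question in the "Human: ...\nAssistant: " layout.
def build_prompt(question: str) -> str:
    # Strip surrounding whitespace so the question sits cleanly after "Human: ".
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # Assumption: keep the trailing space after "Assistant:", as in the
    # code example in this README.
    return f"Human: {question}\nAssistant: "


prompt = build_prompt("Write a Python function that checks if a given number is even or odd.")
print(repr(prompt))
```

The resulting string can be passed as `inputs` to the pipeline call shown in the code example.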

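Model limitations

The current BiLLa models have not been trained with RLHF, so how well they generalize remains to be seen.

BiLLa was trained on a relatively large amount of task-oriented data, so it is advisable to limit common-sense and current-events questions.

BiLLa's training data includes multi-turn dialogue summarization data but no multi-turn dialogue generation data, so the model's multi-turn dialogue ability may be weak.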

Clone with HTTP

git clone https://www.modelscope.cn/AI-ModelScope/BiLLa-7B-SFT.git