# Training text generation
import os
# Restrict this process to GPUs 2 and 3; set before torch initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Local directory holding the downloaded model; adjust the path for your machine.
model_path = '/home/majiahui/models-LLM/openbuddy-llama-7b-v1.4-fp16/'
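# The model is only needed for generation; decoding the saved token IDs below
# requires just the tokenizer, so the load stays commented out.
# model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)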
tokenizer = AutoTokenizer.from_pretrained(model_path)
# with open('../system.prompt', 'r', encoding='utf-8') as f:
# prompt = f.read()
# Example prompts, kept in the original Chinese. The first two ask the model to
# generate a thesis outline (at least 7 first-level headings, each with at least
# 3 second-level headings); the third asks it to expand section 3.3 of a given
# outline to roughly 700 characters.
# prompt = "生成论文小标题内容#论文题目为“基于红细胞分布宽度建立乙肝相关慢加急性肝衰竭不良预后的优化预测模型”,以“研究建立一种基于红细胞分布宽度的乙肝相关慢加急性肝衰竭不良预后的优化预测模型,包括数据收集、特征选择、模型构建等方面的研究。最终成果是验证所设计模型对于乙肝相关慢加急性肝衰竭不良预后的预测能力,并对模型进行优化,提高预测准确性和可靠性。”为论文的研究方向,为论文生成目录,要求只有一级标题和二级标题,一级标题使用中文数字 例如一、xxx;二级标题使用阿拉伯数字 例如1.1 xxx;一级标题不少于7个;每个一级标题至少包含3个二级标题"
#
# prompt = "\n\nUser: 生成论文小标题内容#论文题目为“基于红细胞分布宽度建立乙肝相关慢加急性肝衰竭不良预后的优化预测模型”,以“研究建立一种基于红细胞分布宽度的乙肝相关慢加急性肝衰竭不良预后的优化预测模型,包括数据收集、特征选择、模型构建等方面的研究。最终成果是验证所设计模型对于乙肝相关慢加急性肝衰竭不良预后的预测能力,并对模型进行优化,提高预测准确性和可靠性。”为论文的研究方向,为论文生成目录,要求只有一级标题和二级标题,一级标题使用中文数字 例如一、xxx;二级标题使用阿拉伯数字 例如1.1 xxx;一级标题不少于7个;每个一级标题至少包含3个二级标题\nAssistant:"
# prompt = "\n\nUser: 生成论文小标题内容#论文题目是“多无人艇一致性自主编队控制研究”,目录是“一、引言\\n 1.1 研究背景\\n 1.2 研究意义\\n 1.3 国内外研究现状\\n\\n二、多无人艇编队控制技术综述\\n 2.1 多无人艇编队控制技术分类\\n 2.2 多无人艇编队控制技术研究现状\\n 2.3 多无人艇编队控制技术存在的问题\\n\\n三、多无人艇一致性控制方法研究\\n 3.1 一致性控制方法分类\\n 3.2 多无人艇一致性控制方法研究现状\\n 3.3 多无人艇一致性控制方法存在的问题\\n\\n四、多无人艇一致性自主编队控制方法研究\\n 4.1 自主编队控制方法分类\\n 4.2 多无人艇一致性自主编队控制方法研究现状\\n 4.3 多无人艇一致性自主编队控制方法存在的问题\\n\\n五、多无人艇一致性自主编队控制仿真实验\\n 5.1 实验设计\\n 5.2 实验结果分析\\n 5.3 实验结论\\n\\n六、总结与展望\\n 6.1 研究成果总结\\n 6.2 研究不足与改进方向\\n 6.3 发展前景与应用价值”,请把其中的小标题“3.3 多无人艇一致性控制方法存在的问题”的内容补充完整,补充内容字数在700字左右\nAssistant:"
# input_ids = tokenizer.encode(prompt, return_tensors='pt')
# output_ids = model.generate(input_ids, max_new_tokens=1400)
# print(output_ids[0])
# Token IDs saved from an earlier run (an instruction prompt followed by the
# start of the response); decoded below to inspect the text.
a = [13866, 338, 385, 15278, 393, 16612, 263, 3414, 29889, 14350, 263, 2933, 393, 7128, 2486, 1614, 2167, 278, 2009, 29889, 13, 13, 2277, 29937, 2799, 4080, 29901, 13, 30486, 30494, 32229, 31596, 30210, 31367, 31455, 32190, 31495, 30503, 31474, 31349, 29937, 13, 31088, 30748, 32049, 31479, 30544, 30651, 30866, 32011, 31830, 30606, 30356, 32168, 30886, 30872, 30908, 30698, 32048, 32050, 31367, 31455, 30843, 30573, 32229, 31596, 30214, 30651, 30015, 31367, 31455, 32011, 31830, 30606, 32028, 31900, 31410, 30505, 30356, 32168, 30886, 30872, 30525, 30806, 30210, 30908, 30698, 32048, 32050, 30214, 31473, 32041, 31149, 30783, 31867, 32121, 30886, 30872, 30330, 31867, 30745, 32067, 32154, 30330, 30356, 30613, 30670, 30753, 31184, 30525, 30806, 30210, 31084, 31943, 31579, 31522, 30267, 30878, 32127, 30494, 30801, 30392, 30783, 32011, 31830, 30606, 32028, 31900, 31410, 30210, 30356, 32168, 30886, 30872, 30687, 32048, 31174, 30448, 31947, 30752, 31367, 31455, 30214, 30573, 30672, 30356, 30356, 32168, 30886, 30872, 31302, 32004, 30417, 32159, 30210, 31579, 31522, 31541, 31695, 30503, 30687, 32048, 31084, 31943, 30267, 30024, 30573, 32048, 30333, 30210, 31367, 31455, 30525, 31331, 30214, 30486, 30494, 32048, 30333, 30210, 31367, 31455, 32190, 31495, 30503, 31474, 31349, 30214, 30578, 30354, 30413, 31022, 30909, 29896, 29900, 29900, 29900, 30578, 13, 13, 2277, 29937, 13291, 29901, 13, 1, 29871, 32066, 32006, 30672, 30356, 30356, 31074, 30210, 30413, 31683, 32058, 32014, 30503, 30356, 32053, 30533, 30956, 30210, 31302, 30528, 30214, 30356, 32168, 30886, 30872, 30494, 30573, 30356, 30613, 30910, 31599, 30210, 30908, 30698, 31263, 30494, 30636, 30748, 30267, 32011, 31830, 30606, 32028, 31900, 31410, 30505, 30356, 32168, 30886, 30872, 30525, 30806, 30732, 30544, 30743, 30287, 31185, 31025, 30908, 30698, 32048, 32050, 30214, 30783, 30672, 30356, 31867, 32121, 30886, 30872, 30330, 31867, 30745, 32067, 32154, 30503, 30356, 30613, 30670, 30753, 31184, 30525, 30806, 31302, 30544, 30743, 
30287, 31185, 31025, 31084, 31943, 31579, 31522, 30267, 30810, 31959, 31084, 31943, 31579, 31522, 30413, 32148, 30783, 30672, 30356, 30356, 32168, 30886, 30872, 32001, 30417, 30908, 30698, 30210, 31424, 31195, 31474, 31349, 30214, 31325, 32038, 30783, 30909, 30666, 32014, 30356, 32053, 30670, 30753, 30733, 30732, 30330, 32072, 32060, 30533, 30467, 30503, 30606, 32152, 30495, 30214, 30651, 31436, 32047, 30846, 30753, 31539, 30503, 30606, 31267, 30910, 31599, 30953, 32001, 30417, 30908, 30698, 30210, 31474, 31349, 30267, 13, 13, 31367, 31455, 32011, 31830, 30606, 32028, 31900, 31410, 30505, 30356, 32168, 30886, 30872, 30525, 30806, 30210, 30908, 30698, 32048, 32050, 30214, 30783, 30909, 31947, 30752, 30687, 31201, 32011, 31830, 30606, 32028, 31900, 31410, 30210, 30356, 32168, 30886, 30872, 30687, 32048, 30214, 30573, 30672, 30356, 30356, 32168, 30886, 30872, 31302, 32004, 30417, 32159, 30210, 31579, 31522, 31541, 31695, 30503, 30687, 32048, 31084, 31943, 32001, 30417, 30908, 30698, 30210, 31474, 31349, 30267, 32048, 30333, 30998, 31594, 30651, 30557, 32096, 30502, 30525, 30806, 31174, 30448, 32160, 32327, 30383, 13, 13, 30287, 30330, 32011, 31830, 30606, 32028, 31900, 31410, 30210, 31867, 32121, 30886, 30872, 31579, 31522, 13, 13, 32011, 31830, 30606, 32028, 31900, 31410, 30505, 31867, 32121, 30886, 30872, 30525, 30806, 31302, 30544, 30743, 30015, 32014, 31867, 30895, 31062, 30330, 31867, 30855, 32267, 30733, 30330, 31264, 32331, 31441, 30374, 30330, 32216, 30545, 31032, 31867, 30024, 30210, 31084, 31943, 31579, 31522, 30267, 30810, 31959, 31084, 31943, 31579, 31522, 30783, 30909, 30672, 30356, 31867, 32121, 30210, 31424, 30690, 30705, 30886, 30872, 30503, 31867, 30855, 32267, 30733, 32001, 30417, 30908, 30698, 30210, 31474, 31349, 30267, 32048, 30333, 30998, 31594, 31867, 32121, 30886, 30872, 30210, 30895, 31062, 30330, 31867, 30855, 32267, 30733, 30210, 31195, 32241, 30330, 31264, 32331, 31441, 30374, 30210, 31429, 31072, 30503, 32216, 30545, 31032, 31867, 30210, 
30988, 31072, 31184, 30525, 30806, 31174, 30448, 31947, 30752, 31367, 31455, 30267, 13, 13, 30685,
]
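# Number of tokens in the saved sequence.
print(len(a))
# Decode the IDs back to text; skip_special_tokens=True drops special tokens
# such as the BOS marker.
print(tokenizer.decode(a, skip_special_tokens=True))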