16  Llama 3.2

The current version of the LLaMA family is Llama 3.2; earlier releases include LLaMA 3.1 and LLaMA 2, along with the code-focused Code LLaMA.

16.1 Local Deployment

16.1.1 Local Deployment with Ollama

Deploying LLaMA models with Ollama is relatively straightforward: it provides a command-line tool for quickly running the LLaMA family of models. The steps are as follows:

1. Install Ollama

First, install the ollama tool locally:

  • macOS users can install it directly with brew:
brew install ollama
# Show help
ollama --help
  • Windows and Linux users can download the appropriate installer from the Ollama website.

2. Load a LLaMA Model

Once installed, you can run a LLaMA model directly from the command line; Ollama downloads the model weights on first run.

ollama run llama3.2:1b

This example runs Llama 3.2 1B; you can swap in other LLaMA versions (such as Llama 3.1) as needed. Ollama defines a set of default model names, which you can look up in its documentation or from the command line.

3. Query Supported Models

You can list the models available locally with the command below; visit https://ollama.com/library to browse the full catalog:

ollama list

4. Run Models Locally

Ollama is designed for local deployment: on first run it automatically downloads the required model weights, then accelerates inference on local hardware (CPU or GPU).

# List local models
ollama list

# Run models
ollama run llama3.2:1b "Hi, who are you?"
ollama run llama3.2:3b "Hi, who are you?"
ollama run llama3.1:8b 'Hi, who are you?'

5. Use the Ollama API

If you want to call the LLaMA models served by Ollama from code, you can use its API. Here is a simple Python example:

import requests

# Send a non-streaming generation request to the local Ollama server
response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1:8b", "prompt": "Hi, who are you?", "stream": False}
)

print(response.content.decode())
{"model":"llama3.1:8b","created_at":"2024-10-15T07:29:47.257886Z","response":"I'm an artificial intelligence model known as Llama. Llama stands for \"Large Language Model Meta AI.\"","done":true,"done_reason":"stop","context":[128006,882,128007,271,13347,11,889,527,499,30,128009,128006,78191,128007,271,40,2846,459,21075,11478,1646,3967,439,445,81101,13,445,81101,13656,369,330,35353,11688,5008,16197,15592,1210],"total_duration":3170882792,"load_duration":46793334,"prompt_eval_count":16,"prompt_eval_duration":2294743000,"eval_count":23,"eval_duration":828374000}

In this example, the Ollama API runs locally, and you send it inference tasks over HTTP.
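The same /api/generate endpoint can also stream tokens as they are produced. Below is a minimal sketch, assuming the same local server and model, that consumes the newline-delimited JSON chunks of a streaming response:

import json
import requests

# Request a streaming response; Ollama returns one JSON object per line
with requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1:8b", "prompt": "Hi, who are you?", "stream": True},
    stream=True,
) as response:
    for line in response.iter_lines():
        if line:
            chunk = json.loads(line)
            # Each chunk carries a fragment of the generated text
            print(chunk.get("response", ""), end="", flush=True)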

6. Check Runtime Status

# Show running models
ollama ps

# Stop a model
ollama stop llama3.1:8b

16.1.2 Local Deployment with Hugging Face

A LLaMA 3 model can be deployed locally with Hugging Face by following the steps below; this assumes you already know how to install the necessary software environment and dependencies.

1. Check the System Configuration

LLaMA 3 is a large model and generally requires high-performance hardware, including a GPU with at least 24 GB of VRAM and ample CPU memory.

2. Install Dependencies

Make sure the Hugging Face Transformers and Accelerate libraries are installed for loading and accelerating the model:

pip install transformers accelerate

3. Get a Hugging Face Access Token

Since LLaMA 3 is a gated Meta model, you must visit its page on Hugging Face and request access. Once granted, configure your token via huggingface-cli:

huggingface-cli login
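If you prefer to stay in Python, the huggingface_hub library offers an equivalent login call (the token below is a placeholder; use your own):

from huggingface_hub import login

# Authenticate with a personal access token (placeholder value)
login(token="hf_your_token_here")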

4. Load the Model

Once you have access, you can load the model and run it with the Hugging Face Transformers library.

Note: Submit an access request on the model's page and wait for approval.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the LLaMA 3 model
model_name = "meta-llama/Llama-3.2-1B"  # replace with the model you need
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Inference ("mps" targets Apple Silicon; use "cuda" or "cpu" as appropriate)
inputs = tokenizer("hi, who are you?", return_tensors="pt").to("mps")
outputs = model.generate(inputs.input_ids, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
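For quick experiments, the same checkpoint can also be run through the Transformers pipeline API, which bundles tokenization, generation, and decoding into one call; a minimal sketch, assuming access to the gated model has been granted:

from transformers import pipeline

# pipeline() wraps the tokenizer and model into a single callable
pipe = pipeline("text-generation", model="meta-llama/Llama-3.2-1B", device_map="auto")
print(pipe("hi, who are you?", max_new_tokens=50)[0]["generated_text"])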

5. Accelerate Inference with Accelerate

If the model is very large, you may need the Accelerate library to spread inference across multiple GPUs or TPUs.

from accelerate import init_empty_weights, infer_auto_device_map
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Initialize an empty (weightless) model from the config
config = AutoConfig.from_pretrained(model_name)
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

# Infer an automatic device map
device_map = infer_auto_device_map(model, max_memory={0: "24GB", 1: "24GB"})

# Load the model with the device map
model = AutoModelForCausalLM.from_pretrained(model_name, device_map=device_map)
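If you'd rather not compute the map yourself, from_pretrained also accepts max_memory together with device_map="auto", letting Accelerate do the planning; a minimal sketch under the same two-GPU assumption:

# Let Accelerate plan placement within an explicit per-device memory budget
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    max_memory={0: "24GB", 1: "24GB"},
)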

6. Cache the Model Weights Locally (Optional)

If you want the model weights fully downloaded for offline inference, specify a cache path:

model = AutoModelForCausalLM.from_pretrained(model_name, cache_dir="./local_model")
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="./local_model")

This downloads all of the model's files into the ./local_model directory.
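Afterwards, to guarantee the cached copy is used without touching the network, you can pass local_files_only=True when loading; a minimal sketch reusing the cache path above:

# Load strictly from the local cache; fails instead of downloading
model = AutoModelForCausalLM.from_pretrained(
    model_name, cache_dir="./local_model", local_files_only=True
)
tokenizer = AutoTokenizer.from_pretrained(
    model_name, cache_dir="./local_model", local_files_only=True
)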

7. Inference and Tuning

Use the generate method for text generation. Depending on the task, you can adjust generation parameters such as max_length, temperature, and top_p:

outputs = model.generate(
    inputs.input_ids,
    max_length=100,
    num_return_sequences=1,
    do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9
)

With that, you have a local Hugging Face deployment of a LLaMA 3 model ready for inference.

16.1.3 Local Deployment with vLLM

vLLM is a high-performance inference and serving framework for large language models that can significantly speed up inference for LLaMA and similar models. The following steps deploy the Llama 3.2 1B model locally with vLLM:

  1. Install vLLM:
pip install vllm
  2. Download and serve the Llama 3.2 1B model:
vllm serve meta-llama/Llama-3.2-1B
  3. Use the vLLM API (optional):
import requests

response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={"model": "meta-llama/Llama-3.2-1B",
          "messages": [{"role": "user", "content": "Hello, who are you?"}],
          "stream": False}
)

print(response.content.decode())
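Because vLLM's server is OpenAI-compatible, the official openai Python client can also be pointed at it; a minimal sketch (vLLM ignores the API key by default, so any placeholder works):

from openai import OpenAI

# Point the OpenAI client at the local vLLM server
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(completion.choices[0].message.content)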

16.1.4 Summary

Deploying LLaMA models with Ollama is fast and requires little configuration, making it well suited to quick deployment and testing.

On a MacBook Pro with 18 GB of memory, the Llama 3.2 1B model runs very fast, returning results almost instantly; the Llama 3.2 3B and Llama 3.1 8B models are also fast, while the Llama 3.1 70B model is very slow.

16.2 Task Performance Evaluation

16.2.1 Start the API Service

ollama run llama3.1:8b --format=json
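Running a model also brings up Ollama's HTTP API on port 11434. A quick sanity check is to query the /api/tags endpoint, which lists the locally available models:

import requests

# List the models known to the local Ollama server
print(requests.get("http://localhost:11434/api/tags").json())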

16.2.2 Read the File

import pandas as pd

# Read titles and abstracts from the exported records file
file = "example/wos-fast5k/savedrecs.txt"

# Adjust the delimiter parameter to match the file's separator
# Set index_col=False so the first column is not treated as the row index
data = pd.read_csv(file, delimiter='\t', index_col=False)

# Drop columns that are entirely NaN
data = data.loc[:, data.notna().sum() != 0]

# Inspect the processed data
print(data[0:5])
  PT                                                 AU   BA   CA   GP  \
0  J  Just, Berta Singla; Marks, Evan Alexander Neth...  NaN  NaN  NaN   
1  J  Pereira-Marques, Joana; Ferreira, Rui M.; Figu...  NaN  NaN  NaN   
2  J  Golzar-Ahmadi, Mehdi; Bahaloo-Horeh, Nazanin; ...  NaN  NaN  NaN   
3  J  Li, Chenglong; Han, Yanfeng; Zou, Xiao; Zhang,...  NaN  NaN  NaN   
4  J  Hu, Yu; Wang, Yulin; Wang, Runhua; Wang, Xiaok...  NaN  NaN  NaN   

                                                  TI  \
0  Biofertilization increases soil organic carbon...   
1  A metatranscriptomics strategy for efficient c...   
2  Pathway to industrial application of heterotro...   
3  A systematic discussion and comparison of the ...   
4  Dirammox-dominated microbial community for bio...   

                                                  SO   VL   IS   BP  ...  \
0  INTERNATIONAL JOURNAL OF AGRICULTURAL SUSTAINA...   22    1  NaN  ...   
1                                       GUT MICROBES   16    1  NaN  ...   
2                             BIOTECHNOLOGY ADVANCES   77  NaN  NaN  ...   
3                SYNTHETIC AND SYSTEMS BIOTECHNOLOGY    9    4  775  ...   
4             APPLIED MICROBIOLOGY AND BIOTECHNOLOGY  108    1  NaN  ...   

     PY                                                 AB  \
0  2024  Protecting and building soil carbon has become...   
1  2024  The high background of host RNA poses a major ...   
2  2024  The transition to renewable energies and elect...   
3  2024  Synthetic microbial community has widely conce...   
4  2024  Direct ammonia oxidation (Dirammox) might be o...   

                                                  C1   CT   CY   SP   CL TC  \
0  [Just, Berta Singla; Llenas, Laia; Ponsa, Serg...  NaN  NaN  NaN  NaN  0   
1  [Pereira-Marques, Joana; Ferreira, Rui M.; Fig...  NaN  NaN  NaN  NaN  2   
2  [Golzar-Ahmadi, Mehdi; Norouzi, Forough; Holus...  NaN  NaN  NaN  NaN  0   
3  [Li, Chenglong; Han, Yanfeng; Zou, Xiao; Zhang...  NaN  NaN  NaN  NaN  0   
4  [Hu, Yu; Wang, Yulin; Liu, Shuang-Jiang] Shand...  NaN  NaN  NaN  NaN  0   

                                                  WC                   UT  
0  Agriculture, Multidisciplinary; Green & Sustai...  WOS:001247571700001  
1        Gastroenterology & Hepatology; Microbiology  WOS:001176335800001  
2               Biotechnology & Applied Microbiology  WOS:001316252000001  
3               Biotechnology & Applied Microbiology  WOS:001273513000001  
4               Biotechnology & Applied Microbiology  WOS:001252334300001  

[5 rows x 24 columns]

Next, process the dataframe in batches, converting each row into JSON-formatted text.

  1. batch_to_json(): the main function; it processes the data in batches and converts each row into a JSON object.
  2. Batch splitting: data[i:i+batch_size] slices the DataFrame into batches.
  3. Building the JSON: the UT, TI, and AB values of each row are extracted into a JSON object with ID, TI, and AB keys.
  4. json.dumps(): converts the list of JSON objects into a JSON string.
  5. ensure_ascii=False: ensures the output JSON string supports non-ASCII characters.
import pandas as pd
import json

def batch_to_json(data, batch_size):
    # Validate that data is a DataFrame and batch_size is a positive integer
    if not isinstance(data, pd.DataFrame):
        raise ValueError("data must be a pandas DataFrame")
    if not isinstance(batch_size, int) or batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")

    # Slice the dataframe into chunks of batch_size rows
    batches = [data[i:i+batch_size] for i in range(0, len(data), batch_size)]

    result = []
    for batch in batches:
        json_batch = []
        for _, row in batch.iterrows():
            # Build the JSON object for each row
            json_obj = {
                "ID": row["UT"],
                "TI": row["TI"],
                "AB": row["AB"]
            }
            json_batch.append(json_obj)
        result.append(json.dumps(json_batch, ensure_ascii=False))  # serialize to a JSON string
    return result

# Example usage
batch_size = 2
json_output = batch_to_json(data[0:1], batch_size)

from pprint import pprint
pprint(json_output)
['[{"ID": "WOS:001247571700001", "TI": "Biofertilization increases soil '
 'organic carbon concentrations: results of a meta-analysis", "AB": '
 '"Protecting and building soil carbon has become a global policy priority, '
 'and novel agronomic fertilization practices may contribute to soil '
 'protection and climate-smart agriculture. The application of microbial '
 'inoculants (biofertilizers) is considered beneficial for soil and climate '
 '-smart agriculture. Therefore, an exhaustive meta-analysis of '
 'biofertilization studies was carried out worldwide to quantify the benefits '
 'of microbial inoculants on SOC concentration. Based on 59 studies and 267 '
 'observations, was found that biofertilizers significantly increased SOC '
 'concentration by an average of 0.44 g C kg-1 soil. All biofertilizer types '
 'were estimated to contribute positively to SOC (0.18-0.70 g C kg-1soil), but '
 'only cyanobacteria, mixtures of organisms, mycorrhizal fungi, and nitrogen '
 'fixers were statistically significant. In terms of crop type, results were '
 'significant and positive for cereals, fruits, legumes and root/tuber crops '
 '(0.44-0.82 g C kg-1soil). A significant positive linear relationship was '
 'observed between crop yield and SOC changes, supporting the notion that '
 'greater productivity helps explain SOC increases, accounting for 7% of the '
 'dataset variability. This study provides the first evidence from a global '
 'assessment that biofertilizer use is associated with an augmented '
 'terrestrial agricultural organic carbon sink contributing to soil protection '
 'and food security where climate-smart solutions are sought."}]']

16.2.3 Tune the Prompt

Note: Writing the prompt in English may work better.

Note: The JSON format requires that keys and string values be wrapped in double quotes.
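As a quick illustration of the second note, Python's json parser accepts double-quoted keys but rejects single quotes:

import json

json.loads('{"score": 8}')   # valid JSON: double-quoted key
json.loads("{'score': 8}")   # raises json.JSONDecodeError: single quotes are not JSON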

# Use triple quotes for a very long system prompt
system_prompt = """  
Please identify the relevance of each article's research topic to **synthetic microbial community** research based on the following content, and provide a score (0 for completely irrelevant, 10 for highly relevant, 1-9 for partially relevant):

Please note:

1. A synthetic microbial community refers to a composite of one or more microorganisms created by humans. It can be a probiotic, a biofertilizer, or a small microbial community. All research related to these topics should be considered relevant to synthetic microbial community research.
2. The subsequent content will be provided in JSON format, including the article's ID, title (TI), and abstract (AB). Please make your judgment based on the title and abstract content, and record the ID for your response.
3. Be sure to follow the response requirements below.

Response requirements:

1. In your response, please provide the article's ID, score, and a brief reason.
2. Please use the JSON format for your response, for example:

{"results": [{"ID":"string", "score":number, "reason": "string"}]}

"""

To have the OpenAI-compatible API (served here by Ollama) return results in a consistent JSON format, set it up as follows:

import requests
import json

def analyze_content(content):
    messages = [
        {"role": "system", "content": system_prompt}
    ]

    messages.append({"role": "user", "content": content})

    response = requests.post(
        "http://localhost:11434/v1/chat/completions",
        json={
            "model": "llama3.1:8b",
            "messages": messages,
            "stream": False
        }
    )
    # Parse the API response as JSON
    response_data = response.json()
    content_str = response_data['choices'][0]['message']['content']
    # Extract the JSON-formatted text from content_str
    json_start = content_str.find("{")
    json_end = content_str.rfind("}") + 1
    json_str = content_str[json_start:json_end]

    try:
        content_dict = json.loads(json_str)
    except json.JSONDecodeError:
        print(f"Warning: the response does not contain valid JSON. Content: {content_str}")
        content_dict = {"results": []}

    # If a results key is present, return its value
    if 'results' in content_dict:
        result = content_dict['results']
    else:
        print(f"Warning: the returned JSON has no 'results' key. Content: {json_str}")
        result = []

    return result

test_output = analyze_content(json_output[0])
pprint(test_output)
[{'ID': 'WOS:001247571700001',
  'reason': 'The article discusses the benefits of biofertilization on soil '
            'organic carbon concentrations and its contribution to '
            'climate-smart agriculture, which is relevant to synthetic '
            'microbial community research as it involves a specific type of '
            'microbe (microbial inoculants) used for agricultural purposes.',
  'score': 9}]

16.2.4 Batch Processing

Now we can run the batches and save the results.

import time
from tqdm import tqdm

batches = batch_to_json(data[0:20], batch_size=1)

results = []

start_time = time.time()

for batch in tqdm(batches):
    result = analyze_content(batch)
    results.append(result)

end_time = time.time()
execution_time = end_time - start_time
print(f"Batch processing time: {execution_time:.2f} s")
Batch processing time: 88.28 s
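At roughly 4.4 s per record, sequential API calls dominate the runtime. If the backend can serve requests in parallel, a thread pool is a simple way to overlap them; this is a minimal sketch built on the analyze_content function above (note that Ollama may still serialize generations internally, so the speedup is not guaranteed):

from concurrent.futures import ThreadPoolExecutor

# Issue several API calls concurrently; executor.map preserves input order
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(analyze_content, batches))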

Finally, merge the list into a dataframe and save it to disk for later analysis.

data_list = []

for result in results:
    # Skip empty result lists
    if result:
        for item in result:
            # Keep only items that carry ID, score, and reason keys
            if 'ID' in item and 'score' in item and 'reason' in item:
                data_list.append([item['ID'], item['score'], item['reason']])

df = pd.DataFrame(data_list, columns=['ID', 'score', 'reason'])

df
# df.to_csv('output.csv', index=False)
ID score reason
0 WOS:001247571700001 8.0 The article discusses the use of biofertilizer...
1 WOS:001176335800001 8.5 The article discusses a metatranscriptomics st...
2 WOS:001316252000001 8.0 The article discusses using heterotrophic micr...
3 WOS:001273513000001 10.0 The article's title directly mentions 'synthet...
4 WOS:001252334300001 8.0 The study involves creating a synthetic microb...
5 WOS:001209713100001 8.0 The article discusses microalgal biofuels, lip...
6 WOS:001198552100002 9.0 The article discusses various methods to study...
7 WOS:001171546000001 10.0 The article discusses the construction of a sy...
8 WOS:001314268600001 10.0 The article explores the use of synthetic micr...
9 WOS:001313702300001 0.0 This article's research topic is related to th...
10 WOS:001320306700001 5.0 The research topic focuses on using microalgae...
11 WOS:001319961700001 10.0 The research topic directly relates to synthet...
12 WOS:001318177900001 10.0 The research topic is directly related to synt...
13 WOS:001318328600001 6.0 While this article is about textile printing w...
14 WOS:001317596600001 4.0 The article discusses cationic antimicrobial p...
15 WOS:001291429900001 10.0 This article investigates the effects of anaer...
16 WOS:001319156200001 6.0 The article discusses a nitrogen-fixing cyanob...
17 WOS:001307986400001 9.0 The article discusses synthetic microbial comm...
18 WOS:001319855200001 8.0 The article discusses the use of Paenibacillus...
19 WOS:001314605300007 8.0 The article discusses the role of acetic acid ...
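From here, the scores can drive screening decisions. For example, a minimal sketch of keeping only records the model rated 7 or higher (the threshold is an arbitrary choice for illustration):

# Keep records the model judged clearly relevant (threshold chosen for illustration)
relevant = df[df['score'] >= 7]
print(f"{len(relevant)} of {len(df)} records scored >= 7")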