This could be called the UniPrimer workflow.
I have several gzipped FASTA-format genomes and want to design specific primers for each of them.
gfiles = list.files(
  path = "./extdata/genomes",
  pattern = "*.fa.gz",
  full.names = TRUE
)
print(gfiles)
[1] "./extdata/genomes/CK22.fa.gz" "./extdata/genomes/CK6.fa.gz"
[3] "./extdata/genomes/CK7.fa.gz"
A specific primer pair here means one that amplifies a DNA segment from the genomic DNA of strain 1 but not from the DNA of any of the other 29 strains.
Designing specific primers takes three steps in total:
1. Obtain strain-specific sequences, using unikmer, seqkit, rush, and similar programs.
2. Design primers, using primer3 and the rPrimer3 package I wrote.
3. Validate primer specificity, using the DECIPHER package.
The official websites (documentation) of these programs and packages are:
Creating a workspace just means making a new subdirectory to collect the files produced by the run, so that they do not get mixed up with the original data (here, the genome sequences) and cause accidents when files are cleaned up.
ws = "ws" # working space
outdir = file.path("temp", ws) |> R.utils::getAbsolutePath()
if (!dir.exists(outdir)){
  dir.create(outdir, recursive = TRUE)
} else {
  warning(paste("The work space", outdir, "already exists, skipping..."))
}
library(dplyr)
strains = gfiles |> basename() |> stringr::str_remove(".fa.gz")
all_files = dplyr::tibble(
strain = strains,
genome = gfiles,
kmer_all = xfun::with_ext(strains, "all", extra = "."), # only prefix needed
kmer_uniq = xfun::with_ext(strains, "uniq", extra = "."), # only prefix needed
fasta_all = xfun::with_ext(strains, "uniq.fa", extra = "."),
fasta_one = xfun::with_ext(strains, "one.fa", extra = ".")
) |>
mutate(across(kmer_all:fasta_one, ~file.path(outdir, .x)))
all_files
# A tibble: 3 × 6
strain genome kmer_all kmer_uniq fasta_all fasta_one
<chr> <chr> <chr> <chr> <chr> <chr>
1 CK22 ./extdata/genomes/CK22.fa.gz /Users/gaoc… /Users/g… /Users/g… /Users/g…
2 CK6 ./extdata/genomes/CK6.fa.gz /Users/gaoc… /Users/g… /Users/g… /Users/g…
3 CK7 ./extdata/genomes/CK7.fa.gz /Users/gaoc… /Users/g… /Users/g… /Users/g…
Note: the scripts below have been updated and can be executed in an R environment1.
This step mainly uses k-mer filtering and assembly to obtain genomic fragments. I first wrote a workflow myself, then sent it to SHEN Wei2, the author of unikmer, for comments, and he provided the code below. The main improvements are:
Using a clean workspace (Work Space, ws).
Better unikmer settings.
When generating k-mers, duplicate k-mers are allowed (the --unique flag was removed); during "assembly" (uniqs/map), k-mers may be used multiple times (the -M flag was added). This helps obtain longer specific sequences in the final assembly.
Sequences are filtered when the genomes are read (e.g. sequences with names containing "plasmid" are removed by '-B/--seq-name-filter plasmid').
Parallel computation is implemented with rush3.
Going straight to rush is wonderful! With rush things really run much faster, and the progress reporting is very friendly. Inspired by Prof. Shen, I also modified the R code to compute in parallel.
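The unikmer commands below rely on canonical k-mers ('-K/--canonical'): a k-mer and its reverse complement are collapsed into one representative, conventionally the lexicographically smaller of the two. A minimal Python sketch of this idea (my own illustration, not unikmer's implementation):

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(comp[b] for b in reversed(seq))

def canonical_kmers(seq, k):
    """Collect canonical k-mers: the smaller of each k-mer and its reverse complement."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, revcomp(kmer)))
    return kmers
```

For example, canonical_kmers("ATCGA", 3) collapses TCG and its reverse complement CGA into the single representative CGA, so counting is strand-independent.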
Since 31 nt is enough for a primer, we start with k = 31.
## generating k-mers from each genome
# only keep the canonical k-mers ('-K/--canonical')
# sequences with name containing "plasmid" are removed ('-B/--seq-name-filter plasmid')
# sort output (-s/--sort)
k = 31
cmd = lapply(seq_along(gfiles), function(i){
  paste("unikmer count --canonical --circular --seq-name-filter plasmid --sort -k", k, "-o",
        all_files$kmer_all[[i]],
        all_files$genome[[i]])
})
Run the commands in parallel using library(parallel).
Here we build a function that runs system commands. Multiple commands can be passed to this function and executed in parallel.
#' Run command in parallel
#'
#' Check the running results and send messages or warnings.
run_cmd = function(cmd, description = paste(length(cmd), "commands"), intern = FALSE){
  message("Run commands in parallel: ", description)
  # run cmd in parallel
  library(parallel)
  n = round(detectCores() * 0.75) # use 75% of all the cores
  res = mclapply(cmd, system, intern = intern, mc.cores = n)
  # failed commands
  cmd_failed = cmd[unlist(res) != 0]
  if (length(cmd_failed) == 0){
    message("All commands run successfully.")
  } else {
    warning("There are ", length(cmd_failed), " command(s) failed.")
    warning(paste(" ", cmd_failed))
  }
  invisible(res)
}
run_cmd(cmd)
Common k-mers shared by two or more genomes will be removed. After that, unique sub-sequences are assembled from the remaining k-mers.
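The run_cmd() helper above is R-specific; the same pattern (dispatch shell commands in parallel, then report failures) can be sketched in Python with concurrent.futures. This is only an illustrative analogue, not part of the workflow itself:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_cmd(cmds, description=None, workers=4):
    """Run shell commands in parallel and return the ones that failed."""
    print("Run commands in parallel:", description or f"{len(cmds)} commands")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # each command runs in its own shell; results keep input order
        results = list(pool.map(lambda c: subprocess.run(c, shell=True), cmds))
    failed = [r.args for r in results if r.returncode != 0]
    if not failed:
        print("All commands run successfully.")
    else:
        print(f"There are {len(failed)} command(s) failed.")
    return failed
```

A thread pool is enough here because the work happens in the child processes, not in Python itself.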
Firstly, find the shared kmers of two or more genomes.
## computing k-mers shared by >= 2 files
cmd = paste("unikmer common -n 2 -o shared --verbose", paste0(all_files$kmer_all, ".unik", collapse = " "))
run_cmd(cmd)
Second, remove the shared k-mers from each genome's k-mers.
## remove common k-mers
cmd = lapply(seq_along(gfiles), function(i){
glue::glue('unikmer diff -s -o {all_files$kmer_uniq[[i]]} {all_files$kmer_all[[i]]}.unik shared.unik')
})
run_cmd(cmd, "Remove common k-mers")
Show the info of the genome-specific k-mers: how many k-mers remain in each strain?
cmd = glue::glue('unikmer stats -a {all_files$kmer_uniq}.unik')
run_cmd(cmd)
Map the specific k-mers back to each input genome.
# allow multiple mapped k-mers (-M/--allow-multiple-mapped-kmer)
# output fasta (-a/--output-fasta)
# filter genome sequence by string (-B/--seq-name-filter)
cmd = lapply(seq_along(gfiles), function(i){
glue::glue('unikmer map -m 31 -M -a -g {all_files$genome[[i]]} {all_files$kmer_uniq[[i]]}.unik | seqkit sort -l -r -o {all_files$fasta_all[[i]]}')
})
run_cmd(cmd, "constructing strain-specific DNA fragments")
What are the sizes of those FASTA output files? Note that some of the FASTA files can be empty if there are not enough available k-mers.
How many sequences are there in each strain?
## summary genome/strain specific sequences
cmd = paste('seqkit stats -T', paste(all_files$fasta_all, collapse = " "))
out = system(cmd, intern = TRUE)
read.delim(text = out) |> as_tibble()
# A tibble: 3 × 8
file format type num_seqs sum_len min_len avg_len max_len
<chr> <chr> <chr> <int> <int> <int> <dbl> <int>
1 /Users/gaoch/GitHub/Syn… FASTA DNA 1066 5519220 31 5178. 279884
2 /Users/gaoch/GitHub/Syn… FASTA DNA 1184 6101182 31 5153 216372
3 /Users/gaoch/GitHub/Syn… FASTA DNA 1375 5120607 31 3724. 365894
Some genomes may contain several thousand specific regions/DNA fragments. Only one is needed for the following primer design, so I just keep one.
## find longest specific sequence
# only keep one sequence per strain (the longest one)
cmd = lapply(seq_along(gfiles), function(i){
paste('seqkit head -n 1 --quiet', all_files$fasta_all[[i]], '-o', all_files$fasta_one[[i]])
})
run_cmd(cmd, "find longest specific sequence")
Use rPrimer3 to design primers from the *.one.fa sequences.
library(rPrimer3)
dir = rprojroot::find_rstudio_root_file()
# set the path to the primer3 parameters
setwd("/opt/homebrew/Cellar/primer3/2.4.0/share/primer3")
primers = lapply(seq_along(gfiles), function(i){
design_primer_from_file(all_files$fasta_one[[i]],
parts = 1,
PRIMER_PRODUCT_SIZE_RANGE = "75-100")
})
setwd(dir)
Subsequently, we use DECIPHER::AmplifyDNA() to check primer specificity.
suppressPackageStartupMessages(library(DECIPHER))
# read all genomes
genome = readDNAStringSet(gfiles)
n = sapply(gfiles, function(x) system(paste("zgrep '>' ", x ," | wc -l"), intern = TRUE)) |> as.integer() # coerce to integer for rep()
source = rep(basename(gfiles), times = n) |> gsub(pattern=".fa.gz", replacement="")
names(genome) = source
genome
DNAStringSet object of length 15:
width seq names
[1] 5510768 TTGGAAAACATTGCGGATCTTT...GATGGAAAAGGAGGGATATCA CK22
[2] 377383 ATGGAGGTATGTGTATGCATAA...ATATCTAAGTAAAGTTATCTT CK22
[3] 6078688 TTGGATAACATCGATAGCCTCT...TTTTGCAAAGGAGGGATAAGT CK6
[4] 216280 ATGTCAGAAGAAACCTTTTGGA...ATAAATTTAGGGGGTTCAATA CK6
[5] 5100579 TTGGAAAACATTCATGATTTAT...AGAAAAAGGAGGGATTGCTCG CK7
... ... ...
[11] 115146 ATGAAACCAACTATATATGACG...GATGAGGAAAGAGGAGGAAAA CK7
[12] 102333 ATGTCTGAAAAAGTATTAGAAA...AAAAAATAGGAGGTTGTTTCA CK7
[13] 48257 ATGAAAAAAATTGTTTTAGTTA...TAAAAAATGCGAGGAAATGTA CK7
[14] 13676 ATGAACAAGATATTGACTTGCT...CTTAAAGGAGAAGATGAACAG CK7
[15] 10216 GTGGCTAAAATATCAAAAGTGG...AAAATGATAGAGGTGGACAAT CK7
The names column of the amplification products has three parts: the first is the amplification efficiency, the second is the index of the primers used, and the third is the template ID.
products = mclapply(seq_along(gfiles), function(i){
primer = primers[[i]]
if (inherits(primer, "data.frame")){
product = AmplifyDNA(primer$sequence,
genome,
annealingTemp = 55,
P = 4e-7,
maxProductSize = 1000,
minEfficiency = 0.2)
return(product)
} else {
return(NULL)
}
}, mc.cores = 16)
Primer
# A tibble: 2 × 3
name sequence product_size
<chr> <chr> <chr>
1 tig00001:4963923-5243806f tgagttgcgctgctgttttc 81
2 tig00001:4963923-5243806r tatcagcgcgccgaaaaatg 81
Product
DNAStringSet object of length 1:
width seq names
[1] 81 TGAGTTGCGCTGCTGTTTTCCTG...CCACATTTTTCGGCGCGCTGATA 100% (1 x 2) CK22
Primer
# A tibble: 2 × 3
name sequence product_size
<chr> <chr> <chr>
1 tig00001:1326286-1542657f ttttacgaccggattggcca 98
2 tig00001:1326286-1542657r gtccgatgttcgattgctgc 98
Product
DNAStringSet object of length 1:
width seq names
[1] 98 TTTTACGACCGGATTGGCCAGTT...TCGGCAGCAATCGAACATCGGAC 99.9% (1 x 2) CK6
Primer
# A tibble: 2 × 3
name sequence product_size
<chr> <chr> <chr>
1 tig00001:988080-1353973f aacgaaaaacagcagccgac 98
2 tig00001:988080-1353973r cgcttcgccaaaaccaaaga 98
Product
DNAStringSet object of length 1:
width seq names
[1] 98 AACGAAAAACAGCAGCCGACGTA...ATATCTTTGGTTTTGGCGAAGCG 99.9% (1 x 2) CK7
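Following the three-part structure of the product names described above, a small parser can pull the fields apart. The field names forward_primer/reverse_primer are my own labels for the two numbers in the "(1 x 2)" part, assuming the exact format shown in the output above:

```python
import re

def parse_product_name(name):
    """Split an AmplifyDNA product name like '99.9% (1 x 2) CK6' into its parts."""
    m = re.match(r"([\d.]+)%\s+\((\d+)\s+x\s+(\d+)\)\s+(\S+)", name)
    if m is None:
        raise ValueError(f"unrecognized product name: {name!r}")
    eff, fwd, rev, template = m.groups()
    return {"efficiency": float(eff) / 100,  # amplification efficiency as a fraction
            "forward_primer": int(fwd),      # index of the first primer used
            "reverse_primer": int(rev),      # index of the second primer used
            "template": template}            # template (strain) ID
```

For example, parse_product_name("100% (1 x 2) CK22") yields efficiency 1.0 with template "CK22".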
Although this approach can find some fairly long specific sequences, its performance for primer design is still unsatisfactory. Here we tested 32 genomes, and for 3 to 5 of them no suitable primers were found.
The main reason no primers were obtained is that the parameters of this pipeline are too strict. If shared k-mers lie between two small fragments, removing them makes the assembly fail, so no template of sufficient length can be obtained.
In fact, specific primers do not require the amplified sequence itself to be specific. Even a fairly conserved region can be amplified specifically, as long as the primers themselves differ between strains.
If only a few genomes are used, a new problem arises in primer design: the primers' specificity may be poor. To solve this, the pipeline should include a matrix of the most common k-mers, so that k-mers can still be filtered when only a few genomes are available and the designed primers remain highly specific.
There can be multiple such k-mer matrices, corresponding to different filtering strengths.
Picking primers with unikmer
Although k-mers differ somewhat from primers designed the usual way, they should be broadly similar. If unikmer could be used to design primers directly, primer design would become much faster. I hope this can be implemented.
One feasible approach: after obtaining the k-mers specific to each species, instead of assembling them into long fragments, compare the distances between pairs of k-mers, compute each k-mer's binding affinity to the template (which determines amplification efficiency), and then, according to preset parameters, select k-mer pairs at a suitable distance (corresponding to the product length) as primers.
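The pairing step of this idea can be sketched as follows: given the sorted start positions of strain-specific k-mers on a genome, keep pairs whose implied product size falls inside a desired range. This toy sketch ignores the affinity/efficiency scoring mentioned above; all names are hypothetical:

```python
def pick_kmer_pairs(positions, k, product_range):
    """Pick pairs of specific k-mers whose implied product size is in product_range.

    positions: sorted start positions of strain-specific k-mers on the genome.
    k: k-mer length; the product spans from the left k-mer's start
       to the right k-mer's end (the right k-mer acts as the reverse primer site).
    """
    lo, hi = product_range
    pairs = []
    for i, left in enumerate(positions):
        for right in positions[i + 1:]:
            size = right + k - left  # left k-mer start to right k-mer end
            if size > hi:
                break  # positions are sorted, so size only grows from here
            if size >= lo:
                pairs.append((left, right, size))
    return pairs
```

With k = 31 and a 75-100 bp product range, only k-mer pairs roughly 44-69 bp apart would qualify.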
Virtual PCR
Given a pair of primers and an input genome, compute the amplification efficiency and obtain the amplification result. DECIPHER::AmplifyDNA() covers some of this, although it obtains the values through an external hybrid-min program. I read the source code of the calculation; computing the amplification efficiency does not look very complicated, so perhaps it could be implemented along the way.
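As a crude stand-in for such a virtual PCR (exact-match binding only, with no thermodynamics or efficiency model, unlike DECIPHER's hybrid-min based calculation), one can at least locate primer binding sites and extract the predicted products. A toy sketch with hypothetical names:

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def virtual_pcr(genome, fwd, rev, max_product=1000):
    """Exact-match in-silico PCR: the forward primer matches the + strand,
    and the reverse primer binds the - strand, i.e. its reverse complement
    appears downstream on the + strand."""
    products = []
    rev_site = revcomp(rev)
    start = genome.find(fwd)
    while start != -1:
        end = genome.find(rev_site, start)
        if end != -1:
            size = end + len(rev_site) - start
            if size <= max_product:
                products.append(genome[start:end + len(rev_site)])
        start = genome.find(fwd, start + 1)
    return products
```

A real implementation would allow mismatches and score each binding event; this only shows the bookkeeping.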
In unikmer, "K-mers are either encoded (k<=32) or hashed (k<=64, using ntHash v1) into uint64, and serialized in binary file with extension .unik". How should this sentence be understood?
This sentence describes how k-mers (short DNA sequence fragments used in bioinformatics) are processed and stored. Specifically:
K-mers: DNA sequence fragments of length k. For example, a k-mer might be a DNA sequence like "ATCGGTA".
encoded (k<=32): when k is at most 32, k-mers are "encoded". Encoding usually means converting the DNA sequence into a compact binary representation.
hashed (k<=64, using ntHash v1): when k is at most 64, k-mers are "hashed". Hashing passes the DNA sequence through a hash function (here, ntHash v1) to obtain a fixed-length value (here, a 64-bit integer).
into uint64: whether encoded or hashed, each k-mer ends up as a 64-bit unsigned integer (uint64).
serialized in binary file with extension .unik: finally, these 64-bit unsigned integers are serialized and stored in a binary file with the extension .unik. Serialization means converting a data structure into a format that can be stored or transmitted.
In summary, the sentence describes a method for processing and storing k-mers: depending on the k-mer length, a different conversion (encoding or hashing) is chosen, and the result is stored as 64-bit unsigned integers in a binary file.
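The k <= 32 limit for encoding follows directly from the arithmetic: 2 bits per base times 32 bases is 64 bits, exactly one uint64. A minimal sketch of such 2-bit packing, using the common A/C/G/T = 0/1/2/3 ordering (unikmer's actual bit layout may differ):

```python
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def encode_kmer_uint64(kmer):
    """Pack a k-mer (k <= 32) into one 64-bit integer, 2 bits per base."""
    if len(kmer) > 32:
        raise ValueError("k > 32 does not fit in 64 bits; hashing is used instead")
    code = 0
    for base in kmer:
        code = (code << 2) | ENC[base]  # shift in 2 bits per base
    return code

def decode_kmer_uint64(code, k):
    """Inverse of encode_kmer_uint64: unpack k bases from the integer."""
    return "".join(BASES[(code >> (2 * (k - 1 - i))) & 3] for i in range(k))
```

Unlike hashing, this encoding is reversible, which is why unikmer can offer both encode and decode subcommands for k <= 32.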
The principles of encoding, hashing, and serialization can be further understood through the following examples.
Encoding converts data into another format for efficient storage or processing. For DNA sequences, a common encoding maps each base (A, C, G, T) to a two-bit binary code.
DNA sequence: ATCG
Encoding: A -> 00, T -> 11, C -> 01, G -> 10
ATCG is encoded as: 00110110
Hashing maps data of arbitrary length to a value of fixed length. A hash function converts the input data into a hash value.
Suppose we use a simple hash function that sums the ASCII values of the characters and takes the result mod 10.
DNA sequence: ATCG
ASCII values: A -> 65, T -> 84, C -> 67, G -> 71
Hash value: (65 + 84 + 67 + 71) % 10 = 287 % 10 = 7
So the hash value of ATCG is 7.
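The toy hash above, in code (note that this function is order-insensitive, so any permutation of the same letters collides):

```python
def toy_hash(seq):
    """Sum the ASCII codes of the characters and take the result mod 10."""
    return sum(ord(c) for c in seq) % 10
```

Real k-mer hash functions such as ntHash avoid this kind of trivial collision and can be updated incrementally as a window slides along the sequence.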
Serialization converts a data structure or object into a format that can be stored or transmitted (such as binary or text). Deserialization restores that format to the original data structure or object.
Suppose we have a structure containing several DNA sequences:
data = {
    'sequence1': 'ATCG',
    'sequence2': 'GGTA'
}
We can use Python's pickle module to serialize it:
import pickle
import os
# serialization
try:
    serialized_data = pickle.dumps(data)
    print("Serialization succeeded")
except Exception as e:
    print(f"Serialization failed: {e}")
Serialization succeeded
# save to file
file_path = "temp/ws/data.pkl" # use the same workspace as the R project
try:
    if not os.path.exists(os.path.dirname(file_path)):
        os.makedirs(os.path.dirname(file_path), exist_ok=True)
        print(f"Directory {os.path.dirname(file_path)} created successfully")
    with open(file_path, 'wb') as file:
        file.write(serialized_data)
    print(f"Data successfully saved to {file_path}")
except Exception as e:
    print(f"Failed to save data: {e}")
54
Data successfully saved to temp/ws/data.pkl
The saved binary file data.pkl can be transmitted or stored.
Deserialization:
# read from file
with open(file_path, 'rb') as file:
    loaded_data = pickle.loads(file.read())
print(loaded_data)
{'sequence1': 'ATCG', 'sequence2': 'GGTA'}
Suppose we have a k-mer, ATCGTACG, of length 8.
def encode_kmer(kmer):
    encoding = {'A': '00', 'T': '11', 'C': '01', 'G': '10'}
    return ''.join([encoding[base] for base in kmer])
encoded_kmer = encode_kmer('ATCGTACG')
print(encoded_kmer) # output: 0011011011000110
def hash_kmer(kmer):
    return hash(kmer) % (2**64)
hashed_kmer = hash_kmer('ATCGTACG')
print(hashed_kmer) # output: a 64-bit unsigned integer, e.g. 5104474989518514624 (varies between runs, since Python's hash() is salted)
import pickle
kmer_data = {'kmer': 'ATCGTACG', 'encoded': encoded_kmer, 'hashed': hashed_kmer}
# serialization
serialized_kmer_data = pickle.dumps(kmer_data)
# save to file
with open('kmer_data.unik', 'wb') as file:
    file.write(serialized_kmer_data)
82
The point of encoding, hashing, and serializing k-mers is:
Encoding and hashing convert k-mers from character sequences into a compact binary form or fixed-length values, which significantly reduces storage space.
Encoding and hashing help improve computational efficiency, especially for large-scale data processing and querying.
Standardized encoding and hashing ensure that all k-mers are stored and processed in a consistent format, which helps data sharing and reproducibility.
Encoding and hashing compress long DNA sequences into shorter representations, saving storage space.
Serialized storage makes data easy to transmit and share, and allows data exchange between different systems and platforms.
In short, encoding, hashing, and serializing k-mers not only improves storage and computational efficiency but also ensures data consistency and portability, which matters for large-scale bioinformatics data processing.
unikmer
unikmer - a versatile toolkit for k-mers with taxonomic information
unikmer is a toolkit for nucleic acid k-mer analysis, providing functions
including set operation on k-mers optional with TaxIds but without count
information.
K-mers are either encoded (k<=32) or hashed (k<=64) into 'uint64',
and serialized in binary file with the extension '.unik'.
TaxIds can be assigned when counting k-mers from genome sequences,
and LCA (Lowest Common Ancestor) is computed during set opertions
including computing union, intersection, set difference, unique and
repeated k-mers.
Version: v0.20.0
Author: Wei Shen <shenwei356@gmail.com>
Documents : https://bioinf.shenwei.me/unikmer
Source code: https://github.com/shenwei356/unikmer
Dataset (optional):
Manipulating k-mers with TaxIds needs taxonomy file from e.g.,
NCBI Taxonomy database, please extract "nodes.dmp", "names.dmp",
"delnodes.dmp" and "merged.dmp" from link below into ~/.unikmer/ ,
ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz ,
or some other directory, and later you can refer to using flag
--data-dir or environment variable UNIKMER_DB.
For GTDB, use 'taxonkit create-taxdump' to create NCBI-style
taxonomy dump files, or download from:
https://github.com/shenwei356/gtdb-taxonomy
Note that TaxIds are represented using uint32 and stored in 4 or
less bytes, all TaxIds should be in the range of [1, 4294967295].
Usage:
unikmer [command]
Available Commands:
autocompletion Generate shell autocompletion script (bash|zsh|fish|powershell)
common Find k-mers shared by most of the binary files
concat Concatenate multiple binary files without removing duplicates
count Generate k-mers (sketch) from FASTA/Q sequences
decode Decode encoded integer to k-mer text
diff Set difference of k-mers in multiple binary files
dump Convert plain k-mer text to binary format
encode Encode plain k-mer texts to integers
filter Filter out low-complexity k-mers (experimental)
grep Search k-mers from binary files
head Extract the first N k-mers
info Information of binary files
inter Intersection of k-mers in multiple binary files
locate Locate k-mers in genome
map Mapping k-mers back to the genome and extract successive regions/subsequences
merge Merge k-mers from sorted chunk files
num Quickly inspect the number of k-mers in binary files
rfilter Filter k-mers by taxonomic rank
sample Sample k-mers from binary files
sort Sort k-mers to reduce the file size and accelerate downstream analysis
split Split k-mers into sorted chunk files
tsplit Split k-mers according to taxid
union Union of k-mers in multiple binary files
version Print version information and check for update
view Read and output binary format to plain text
Flags:
-c, --compact write compact binary file with little loss of speed
--compression-level int compression level (default -1)
--data-dir string directory containing NCBI Taxonomy files, including nodes.dmp,
names.dmp, merged.dmp and delnodes.dmp (default "/Users/gaoch/.unikmer")
-h, --help help for unikmer
-I, --ignore-taxid ignore taxonomy information
-i, --infile-list string file of input files list (one file per line), if given, they are
appended to files from cli arguments
--max-taxid uint32 for smaller TaxIds, we can use less space to store TaxIds. default value
is 1<<32-1, that's enough for NCBI Taxonomy TaxIds (default 4294967295)
-C, --no-compress do not compress binary file (not recommended)
--nocheck-file do not check binary file, when using process substitution or named pipe
-j, --threads int number of CPUs to use (default 4)
--verbose print verbose information
Use "unikmer [command] --help" for more information about a command.
sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Asia/Singapore
tzcode source: internal
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] DECIPHER_2.28.0 RSQLite_2.3.7 Biostrings_2.68.1
[4] GenomeInfoDb_1.36.4 XVector_0.40.0 IRanges_2.34.1
[7] S4Vectors_0.38.2 BiocGenerics_0.46.0 rPrimer3_0.3.2
[10] dplyr_1.1.4
loaded via a namespace (and not attached):
[1] utf8_1.2.4 generics_0.1.3 bitops_1.0-9
[4] lattice_0.21-9 stringi_1.8.4 digest_0.6.37
[7] magrittr_2.0.3 grid_4.3.2 evaluate_0.23
[10] blob_1.2.4 fastmap_1.2.0 seqinr_4.2-30
[13] Matrix_1.6-4 R.oo_1.25.0 rprojroot_2.0.4
[16] jsonlite_1.8.9 R.utils_2.12.3 DBI_1.2.3
[19] fansi_1.0.6 ade4_1.7-22 cli_3.6.3
[22] rlang_1.1.4 crayon_1.5.3 R.methodsS3_1.8.2
[25] bit64_4.5.2 cachem_1.1.0 withr_3.0.1
[28] tools_4.3.2 memoise_2.0.1 GenomeInfoDbData_1.2.10
[31] here_1.0.1 reticulate_1.39.0 png_0.1-8
[34] vctrs_0.6.5 R6_2.5.1 lifecycle_1.0.4
[37] zlibbioc_1.46.0 stringr_1.5.1 bit_4.5.0
[40] htmlwidgets_1.6.4 MASS_7.3-60 pkgconfig_2.0.3
[43] pillar_1.9.0 glue_1.8.0 Rcpp_1.0.13
[46] xfun_0.48 tibble_3.2.1 tidyselect_1.2.1
[49] knitr_1.48.6 htmltools_0.5.8.1 rmarkdown_2.27
[52] compiler_4.3.2 RCurl_1.98-1.16
Commands containing rush seem to run into problems when the R Markdown document is compiled.↩︎
SHEN Wei(沈伟), Associate Professor in Bioinformatics, Institute for Viral Hepatitis, The Second Affiliated Hospital of Chongqing Medical University, China.↩︎
R also supports parallel computation; see the documentation of the parallel package for the relevant methods.↩︎