
Updated: 2025-07-12 Source: 健明迪檢測

Predicting microbial secretory proteins with SignalP + TMHMM

Huazhong Agricultural University, PhD in Microbiology

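Secretory proteins are proteins that are synthesized inside the cell and then secreted outside the cell, where they function. The N-terminus of a secretory protein generally carries a signal peptide of 15 to 30 amino acids. A signal peptide is a short peptide chain (5-30 amino acids long) that directs a newly synthesized protein into the secretory pathway. The term usually refers to the N-terminal amino acid sequence of a newly synthesized polypeptide that directs the transmembrane transfer (localization) of the protein, although it is sometimes not located at the N-terminus. SignalP is used to annotate whether a protein sequence contains a signal peptide, and TMHMM is used to annotate whether it contains a transmembrane structure. Proteins that contain a signal peptide and no transmembrane structure are finally selected as secretory proteins.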
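The selection rule described above amounts to a set difference. The following Python sketch only illustrates that idea and is not the author's script; the function name select_secretory and the protein IDs are hypothetical.

```python
def select_secretory(signal_peptide_ids, transmembrane_ids):
    """Return IDs that have a signal peptide but no transmembrane helix."""
    return sorted(set(signal_peptide_ids) - set(transmembrane_ids))


# Hypothetical protein IDs, for illustration only
with_signal_peptide = ["prot_A", "prot_B", "prot_C"]  # e.g. reported by SignalP
with_tm_helix = ["prot_B", "prot_D"]                  # e.g. reported by TMHMM
secretory = select_secretory(with_signal_peptide, with_tm_helix)
print(secretory)
```

In the workflow below, the first list comes from the SignalP output and the second from the TMHMM output.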

Software

SignalP V6.0
SignalP 6.0 predicts signal peptides and the location of their cleavage sites in proteins from Archaea, Gram-positive Bacteria, Gram-negative Bacteria and Eukarya. In Bacteria and Archaea, SignalP 6.0 can discriminate between five types of signal peptides:

Sec/SPI: "standard" secretory signal peptides transported by the Sec translocon and cleaved by Signal Peptidase I (Lep).
Sec/SPII: lipoprotein signal peptides transported by the Sec translocon and cleaved by Signal Peptidase II (Lsp).
Tat/SPI: Tat signal peptides transported by the Tat translocon and cleaved by Signal Peptidase I (Lep).
Tat/SPII: Tat lipoprotein signal peptides transported by the Tat translocon and cleaved by Signal Peptidase II (Lsp).
Sec/SPIII: pilin and pilin-like signal peptides transported by the Sec translocon and cleaved by Signal Peptidase III (PilD/PibD).
Additionally, SignalP 6.0 predicts the regions of signal peptides. Depending on the type, the positions of n-, h- and c-regions as well as of other distinctive features are predicted.
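For downstream scripts it can be handy to keep the five types in a small lookup table. The sketch below only restates the list above in Python; the names SIGNAL_PEPTIDE_TYPES and describe are hypothetical and are not part of SignalP.

```python
# Signal peptide type -> (translocon, signal peptidase), as listed above
SIGNAL_PEPTIDE_TYPES = {
    "Sec/SPI": ("Sec", "Signal Peptidase I (Lep)"),
    "Sec/SPII": ("Sec", "Signal Peptidase II (Lsp)"),
    "Tat/SPI": ("Tat", "Signal Peptidase I (Lep)"),
    "Tat/SPII": ("Tat", "Signal Peptidase II (Lsp)"),
    "Sec/SPIII": ("Sec", "Signal Peptidase III (PilD/PibD)"),
}


def describe(sp_type):
    """Return a one-line description of a signal peptide type."""
    translocon, peptidase = SIGNAL_PEPTIDE_TYPES[sp_type]
    return f"{sp_type}: transported by the {translocon} translocon, cleaved by {peptidase}"


for sp_type in SIGNAL_PEPTIDE_TYPES:
    print(describe(sp_type))
```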

TMHMM V2.0c

Used to predict transmembrane helices in proteins.

Python

SignalP and TMHMM are free for academic users, but you need to fill in some information and an e-mail address to receive the download link (valid for 4 h).

Software installation

Installing SignalP 6.0

Download: visit the SignalP V6.0 website, find "Download", fill in the required information, obtain the download link, and download "signalp-6.0.fast.tar.gz". Two modes are offered, "slow_sequential" and "fast". The former runs the full model sequentially, taking the same amount of RAM as fast but being 6 times slower; the latter uses a smaller model that approximates the performance of the full model, requiring a fraction of the resources and being significantly faster. This tutorial downloads the fast mode.
Installation

Install the dependencies

Python
matplotlib>3.3.2
numpy>1.19.2
torch>1.7.0 (install with: pip install torch)
tqdm>4.46.1
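Before installing SignalP, the installed versions can be compared against the minimums listed above. This is a minimal sketch rather than part of SignalP: the helper names version_tuple and check_dependencies are hypothetical, and the version parser only handles plain dotted numeric versions.

```python
import re
from importlib import metadata

# Minimum versions taken from the dependency list above (strictly greater than)
MINIMUMS = {"matplotlib": "3.3.2", "numpy": "1.19.2", "torch": "1.7.0", "tqdm": "4.46.1"}


def version_tuple(version):
    """Turn '1.13.1+cu117' into (1, 13, 1); stops at the first non-numeric piece."""
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_dependencies(minimums=MINIMUMS):
    """Return a package -> status mapping for the current Python environment."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) > version_tuple(minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


for name, status in check_dependencies().items():
    print(f"{name}: {status}")
```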

Install SignalP 6.0
# Unpack the installation archive
tar zxvf signalp-6.0.fast.tar.gz
# Enter the unpacked software directory and run in a terminal
python setup.py install
# Test the installation
signalp6 --help

Installing TMHMM V2.0c

Download: visit the TMHMM V2.0c website, find "Download", fill in the required information, obtain the download link, and download "tmhmm-2.0c.Linux.tar.gz".
Installation
# Unpack
tar zxvf tmhmm-2.0c.Linux.tar.gz
# Enter the unpacked directory
cd tmhmm-2.0c
# Get the current path; mine is "/home/liu/tools/tmhmm-2.0c/bin"
pwd
# Add this path to the system environment variables (edit ~/.bashrc); see my earlier post: http://liaochenlanruo.github.io/post/f6c9.html#%E6%B7%BB%E5%8A%A0%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F
# Change the first line of tmhmm and tmhmmformat.pl in the bin directory to "#!/usr/bin/perl"
Runtime error: running the software always fails with "Segmentation fault (core dumped)", and I have no fix for now. You can use the online version instead.

Usage

SignalP 6.0

Prediction

A command takes the following form

signalp6 --fastafile /path/to/input.fasta --organism other --output_dir path/to/be/saved --format txt --mode fast

fastafile specifies the FASTA file with the protein sequences to be predicted.
organism is either other or Eukarya. Specifying Eukarya triggers post-processing of the SP predictions to prevent spurious results (only predicts type Sec/SPI).
format can take the values txt, png, eps, all. It defines what output files are created for individual sequences. txt produces a tabular .gff file with the per-position predictions for each sequence. png, eps, all additionally produce probability plots in the requested format. For larger prediction jobs, plotting will slow down the processing speed significantly.
mode is either fast, slow or slow-sequential. Default is fast, which uses a smaller model that approximates the performance of the full model, requiring a fraction of the resources and being significantly faster. slow runs the full model in parallel, which requires more than 14GB of RAM to be available. slow-sequential runs the full model sequentially, taking the same amount of RAM as fast but being 6 times slower. If the specified model is not installed, SignalP will abort with an error.
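When the command is launched from a script, the argument list can be assembled and validated first. The sketch below is only an illustration: the function build_signalp_command is hypothetical, and the allowed values are copied from the option descriptions above. It prints the command instead of running it.

```python
import shlex

# Allowed values as described in the option list above
ORGANISMS = ("other", "Eukarya")
FORMATS = ("txt", "png", "eps", "all")
MODES = ("fast", "slow", "slow-sequential")


def build_signalp_command(fasta, output_dir, organism="other", fmt="txt", mode="fast"):
    """Return the signalp6 argument list, rejecting values not listed above."""
    if organism not in ORGANISMS:
        raise ValueError(f"organism must be one of {ORGANISMS}")
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    return [
        "signalp6",
        "--fastafile", fasta,
        "--organism", organism,
        "--output_dir", output_dir,
        "--format", fmt,
        "--mode", mode,
    ]


cmd = build_signalp_command("/path/to/input.fasta", "path/to/be/saved")
print(shlex.join(cmd))
```

To actually run the prediction, the list can be passed to subprocess.run(cmd, check=True) once SignalP is installed.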

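Outputs

output_dir/output.gff3: contains only the sequences that have a signal peptide;

output_dir/prediction_results.txt: contains all sequences of the input file (not important);
output_dir/region_output.gff3: contains all signal peptide region information.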

n-region: The n-terminal region of the signal peptide. Reported for Sec/SPI, Sec/SPII, Tat/SPI and Tat/SPII. Labeled as N
h-region: The center hydrophobic region of the signal peptide. Reported for Sec/SPI, Sec/SPII, Tat/SPI and Tat/SPII. Labeled as H
c-region: The c-terminal region of the signal peptide, reported for Sec/SPI and Tat/SPI.
Cysteine: The conserved cysteine in +1 of the cleavage site of Lipoproteins that is used for Lipidation. Labeled as c.
Twin-arginine motif: The twin-arginine motif at the end of the n-region that is characteristic for Tat signal peptides. Labeled as R.
Sec/SPIII: These signal peptides have no known region structure.
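The GFF3 outputs can be read with a few lines of Python. This sketch assumes the standard GFF3 layout (tab-separated columns, sequence ID in column 1, feature type in column 3, start and end in columns 4 and 5, comment lines starting with "#"); the function name read_signal_peptides and the sample content are made up for illustration and are not real SignalP output.

```python
def read_signal_peptides(gff_text):
    """Map each sequence ID to its (feature, start, end) annotations."""
    records = {}
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 comment/header lines
        fields = line.split("\t")
        seq_id, feature = fields[0], fields[2]
        start, end = int(fields[3]), int(fields[4])
        records.setdefault(seq_id, []).append((feature, start, end))
    return records


# Made-up example content, for illustration only
SAMPLE_GFF = (
    "## gff-version 3\n"
    "prot_A\tSignalP-6.0\tsignal_peptide\t1\t22\t0.99\t.\t.\t.\n"
    "prot_C\tSignalP-6.0\tlipoprotein_signal_peptide\t1\t18\t0.97\t.\t.\t.\n"
)
peptides = read_signal_peptides(SAMPLE_GFF)
print(sorted(peptides))
```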

Batch processing and result refinement

Script name: run_SignalP.pl

#!/usr/bin/perl
use strict;
use warnings;

# Author: Liu Hualin
# Date: Oct 14, 2021

# IDs reported by SignalP but not found in the input FASTA are logged here
open(my $idnoseq, '>', 'IDNOSEQ.txt') or die "Cannot write IDNOSEQ.txt: $!";

my @faa = glob("*.faa");
foreach my $faa (@faa) {
    my ($str) = $faa =~ /(.+)\.faa$/;
    my $out    = $str . ".nodesc";
    my $sigseq = $str . ".sigseq";
    my $outdir = $str . "_signalp";

    # Keep only the ID (first word) of every FASTA header
    open(my $in, '<', $faa) or die "Cannot read $faa: $!";
    open(my $nodesc, '>', $out) or die "Cannot write $out: $!";
    while (<$in>) {
        chomp;
        if (/^(>\S+)/) {
            print $nodesc $1 . "\n";
        } else {
            print $nodesc $_ . "\n";
        }
    }
    close $in;
    close $nodesc;

    my %hash = idseq($out);
    system("signalp6 --fastafile $out --organism other --output_dir $outdir --format txt --mode fast");

    my $gff = $outdir . "/output.gff3";
    if (-s $gff) {
        open(my $gffin, '<', $gff) or die "Cannot read $gff: $!";
        <$gffin>;    # skip the header line
        open(my $sig, '>', $sigseq) or die "Cannot write $sigseq: $!";
        while (<$gffin>) {
            chomp;
            my @lines = split /\t/;
            if (exists $hash{$lines[0]}) {
                print $sig ">$lines[0]\n$hash{$lines[0]}\n";
            } else {
                print $idnoseq $str . "\t" . "$lines[0]\n";
            }
        }
        close $gffin;
        close $sig;
        system("mv $sigseq $outdir");
    }
    system("rm $out");
}
close $idnoseq;

# Read a FASTA file into a hash of ID => sequence
sub idseq {
    my ($fasta) = @_;
    my %hash;
    local $/ = ">";
    open(my $fh, '<', $fasta) or die "Cannot read $fasta: $!";
    <$fh>;    # discard the empty record before the first ">"
    while (<$fh>) {
        chomp;
        my ($header, $seq) = split(/\n/, $_, 2);
        $header =~ /(\S+)/;
        my $id = $1;
        $hash{$id} = $seq;
    }
    close $fh;
    return (%hash);
}

Place run_SignalP.pl in the same directory as the FASTA files with the ".faa" suffix, then run the following in a terminal:

perl run_SignalP.pl

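Output interpretation

* stands for the name of the input file.

*_signalp/output.gff3: contains only the sequences that have a signal peptide;
*_signalp/prediction_results.txt: contains all sequences of the input file (not important);
*_signalp/region_output.gff3: contains all signal peptide region information;
*_signalp/*.sigseq: stores the amino acid sequences of all proteins with a signal peptide; it can be used as the input file for TMHMM.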
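A .sigseq file is ordinary FASTA, so it can be loaded into a dictionary keyed by sequence ID, in the same spirit as the idseq subroutine of the Perl script. The Python version below is a hedged sketch: read_fasta is a hypothetical name and the example records are invented.

```python
def read_fasta(text):
    """Parse FASTA text into {ID: sequence}; the ID is the first word of the header."""
    sequences = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            current_id = words[0] if words else ""
            sequences[current_id] = []
        elif current_id is not None:
            sequences[current_id].append(line)
    # Join wrapped sequence lines into one string per record
    return {seq_id: "".join(parts) for seq_id, parts in sequences.items()}


# Invented records, for illustration only
example = ">prot_A some description\nMKKLLA\nVAALS\n>prot_C\nMRRTL\n"
print(read_fasta(example))
```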

TMHMM

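Prediction

The offline version always fails and I could not find the cause, so the web server is used instead. The input file is the "*_signalp/*.sigseq" generated above: upload it to the web version of TMHMM, submit the job, and wait for the result.

Results

TMHMM can output results in several formats; see its official documentation for details.

Submitting a job on the TMHMM website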

Long output format

Length: The length of the protein sequence.
Number of predicted TMHs: The number of predicted transmembrane helices.
Exp number of AAs in TMHs: The expected number of amino acids in transmembrane helices. If this number is larger than 18, it is very likely to be a transmembrane protein (OR have a signal peptide).
Exp number, first 60 AAs: The expected number of amino acids in transmembrane helices in the first 60 amino acids of the protein. If this number is more than a few, you are warned that a predicted transmembrane helix in the N-term could be a signal peptide.
Total prob of N-in: The total probability that the N-term is on the cytoplasmic side of the membrane.
POSSIBLE N-term signal sequence: A warning that is produced when "Exp number, first 60 AAs" is larger than 10.

Protein F01_bin.1_00110 has 436 amino acids in total and 5 transmembrane helices.

Protein F01_bin.1_00142 has 557 amino acids in total and the whole sequence lies outside the membrane, i.e. this sequence encodes a secretory protein.

Short output format

"len=": The length of the protein sequence.
"ExpAA=": The expected number of amino acids in transmembrane helices. If this number is larger than 18, it is very likely to be a transmembrane protein (OR have a signal peptide).
"First60=": The expected number of amino acids in transmembrane helices in the first 60 amino acids of the protein. If this number is more than a few, you are warned that a predicted transmembrane helix in the N-term could be a signal peptide.
"PredHel=": The number of predicted transmembrane helices by N-best.
"Topology=": The topology predicted by N-best. The topology is given as the positions of the transmembrane helices, separated by "i" if the loop is on the inside or "o" if it is on the outside. 'i7-29o44-66i87-109o' means that the protein starts on the inside, has a predicted TMH at positions 7 to 29, lies outside from 30 to 43, then has a TMH at positions 44-66, and so on.
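A line of the short output can be split into its fields and the topology string decoded into helix coordinates. The sketch below assumes tab-separated "key=value" fields after the sequence ID, which is also what the parsing script in this article relies on; parse_tmhmm_short is a hypothetical name and the numeric values in the example line are invented.

```python
import re


def parse_tmhmm_short(line):
    """Parse one line of TMHMM short output into a dictionary."""
    fields = line.rstrip("\r\n").split("\t")
    record = {"id": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        record[key] = value
    record["len"] = int(record["len"])
    record["ExpAA"] = float(record["ExpAA"])
    record["First60"] = float(record["First60"])
    record["PredHel"] = int(record["PredHel"])
    # Each "start-end" pair in the topology string is one predicted helix
    record["helices"] = [
        (int(start), int(end))
        for start, end in re.findall(r"(\d+)-(\d+)", record["Topology"])
    ]
    # The first letter tells on which side of the membrane the N-terminus lies
    record["n_terminus"] = "inside" if record["Topology"].startswith("i") else "outside"
    return record


# Invented values; only the topology string comes from the text above
line = "prot_B\tlen=120\tExpAA=68.50\tFirst60=40.20\tPredHel=3\tTopology=i7-29o44-66i87-109o"
record = parse_tmhmm_short(line)
print(record["helices"], record["n_terminus"])
```

A protein whose topology is just "o" yields an empty helix list, which is the case treated as secretory in the summary step.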

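Result summary

The web version only gives us a single list (short output format), and its content has to be copied from the web page and pasted into a new file by hand. I named the file *_TMHMM_SHORT.txt and stored it in the *_signalp directory, which was generated by run_SignalP.pl. Next, I count the total number of signal-peptide proteins, the number of secretory proteins and the number of transmembrane proteins of each genome into the file Statistics.txt, and extract the secretory protein sequences of each genome into *_signalp/*.secretory.faa and the transmembrane protein sequences into *_signalp/*.membrane.faa. This is done by tmhmm_parser.pl.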

#!/usr/bin/perl
use strict;
use warnings;

# Author: Liu Hualin
# Date: Oct 15, 2021

open(my $stat, '>', 'Statistics.txt') or die "Cannot write Statistics.txt: $!";
print $stat "Strain name\tSignal peptide numbers\tSecretory protein numbers\tMembrane protein numbers\n";

my @sig = glob("*_signalp");
foreach my $sig (@sig) {
    $sig =~ /(.+)_signalp/;
    my $str = $1;
    my $tmhmm = $sig . "/$str" . "_TMHMM_SHORT.txt";
    my $fasta = $sig . "/$str" . ".sigseq";
    my $secretory = $str . ".secretory.faa";
    my $membrane = $str . ".membrane.faa";
    open(my $sec, '>', $secretory) or die "Cannot write $secretory: $!";
    open(my $mem, '>', $membrane) or die "Cannot write $membrane: $!";
    my $out = 0;    # proteins entirely outside the membrane (secretory)
    my $on = 0;     # proteins with predicted transmembrane helices
    my %hash = idseq($fasta);
    open(my $in, '<', $tmhmm) or die "Cannot read $tmhmm: $!";
    while (<$in>) {
        chomp;
        $_ =~ s/[\r\n]+//g;
        next unless /\S/;    # skip blank lines left over from copy and paste
        my @lines = split /\t/;
        if ($lines[5] eq "Topology=o") {
            $out++;
            print $sec ">$lines[0]\n$hash{$lines[0]}\n";
        } else {
            $on++;
            print $mem ">$lines[0]\n$hash{$lines[0]}\n";
        }
    }
    close $in;
    close $sec;
    close $mem;
    system("mv $secretory $membrane $sig");
    my $total = $out + $on;
    print $stat "$str\t$total\t$out\t$on\n";
}
close $stat;

# Read a FASTA file into a hash of ID => sequence
sub idseq {
    my ($fasta) = @_;
    my %hash;
    local $/ = ">";
    open(my $fh, '<', $fasta) or die "Cannot read $fasta: $!";
    <$fh>;    # discard the empty record before the first ">"
    while (<$fh>) {
        chomp;
        my ($header, $seq) = split(/\n/, $_, 2);
        $header =~ /(\S+)/;
        my $id = $1;
        $hash{$id} = $seq;
    }
    close $fh;
    return (%hash);
}

How to run: place tmhmm_parser.pl in the parent directory of the *_signalp directories; each *_signalp directory must contain the *_TMHMM_SHORT.txt file and the *.sigseq file. Run the following in a terminal:

perl tmhmm_parser.pl
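The counting rule used by tmhmm_parser.pl (a sixth field equal to "Topology=o" means secretory, anything else means membrane protein) can be written compactly in Python. This is a rough sketch of that rule, not a replacement for the script; classify is a hypothetical name and the input lines are invented.

```python
def classify(short_lines):
    """Split TMHMM short-output lines into secretory and membrane protein IDs."""
    secretory, membrane = [], []
    for line in short_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines from copy and paste
        fields = line.split("\t")
        if fields[5] == "Topology=o":
            secretory.append(fields[0])  # whole protein predicted outside
        else:
            membrane.append(fields[0])   # at least one predicted helix
    return secretory, membrane


# Invented lines, for illustration only
lines = [
    "prot_A\tlen=557\tExpAA=0.10\tFirst60=0.05\tPredHel=0\tTopology=o",
    "prot_B\tlen=436\tExpAA=110.20\tFirst60=22.40\tPredHel=5\tTopology=i7-29o44-66i87-109o",
    "",
]
secretory, membrane = classify(lines)
print(len(secretory) + len(membrane), len(secretory), len(membrane))
```

The three printed numbers correspond to the last three columns of Statistics.txt.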

Getting the scripts

The scripts used in this article are available on GitHub.

Notice: When you use the scripts in this article, please cite the link of this webpage and respect the author's work. Thank you!

Reference

Original link: Predicting microbial secretory proteins with SignalP + TMHMM | liaochenlanruo

Please credit the source when reposting!

Edited on 2021-12-28 09:33

