Choosing the best matrix and parameters for aligning GDH sequences

The alignments were made with ClustalW at the EBI (Cambridge).

Reference: Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994) "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice". Nucleic Acids Res. 22, 4673-4680.

Example with subset H:

GDH NOT EC classified
Viridiplantae
length range [411 - 470]

Get the files (ZIP-compressed, FASTA format): Subset H

 

matrix                                              PAM     BLOSUM   GONNET   Id
alignment score (default values, all parameters)    25246   25366    25471    25487

 

parameter                 value   score PAM   score BLOSUM   score GONNET   score Id
Gapopen (default = 10)    1       25818       26193          26243          26720
                          2       25783       25783          26044          26224
                          100     21890       22150          22636          22636
Endgap                    10      25246       25366          25471          25487
                          20      25246       25366          25471          25487
Gapext (default = 0.2)    0.05    25246       25366          25471          25487
                          0.5     25246       25366          25471          25487
                          5       25246       25366          25471          25471
                          10      25246       25366          25471          25471
Gapdist (default = 4)     10      25246       25366          25471          25487
                          5       25246       25366          25471          25487
                          1       25246       25366          25471          25487
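
The Gapopen part of this sweep can also be summarised programmatically. The following is a minimal Python sketch, assuming the scores are copied by hand from the table; the dictionary layout and the function name best_gapopen are illustrative choices, not part of ClustalW. For each matrix it reports the Gapopen value with the highest score and the gain over the all-defaults score.

```python
# Scores copied from the tables above; the data layout is an illustrative assumption.
MATRICES = ("PAM", "BLOSUM", "GONNET", "Id")

# Alignment scores with every parameter left at its default value.
DEFAULT_SCORES = {"PAM": 25246, "BLOSUM": 25366, "GONNET": 25471, "Id": 25487}

# Gapopen value -> scores, in the same order as MATRICES.
GAPOPEN_SCORES = {
    1: (25818, 26193, 26243, 26720),
    2: (25783, 25783, 26044, 26224),
    100: (21890, 22150, 22636, 22636),
}


def best_gapopen(matrix):
    """Return (gapopen, score, gain over the default score) for one matrix."""
    col = MATRICES.index(matrix)
    value, scores = max(GAPOPEN_SCORES.items(), key=lambda item: item[1][col])
    return value, scores[col], scores[col] - DEFAULT_SCORES[matrix]


for name in MATRICES:
    value, score, gain = best_gapopen(name)
    print(f"{name:7s} best Gapopen = {value:<3d} score = {score} gain = {gain:+d}")
```

With the values tabulated here, Gapopen = 1 gives the highest score for every matrix, while Endgap, Gapext and Gapdist leave the scores almost unchanged.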

 

Matrix   Gapopen        Endgap        Gapext           Score
Gonnet   default = 10   default = ?   default = 0.05   33558
PAM      default        default       default          33207
Gonnet   1              default       default          35545 (Highest score)
PAM      1              default       default          35003
Gonnet   1              10            default          35545
Gonnet   1              20            default          35545
Gonnet   1              default       0.05             35545
Gonnet   1              default       0.5              35511
Gonnet   1              default       5                35076
Gonnet   default        default       5                33298
Gonnet   1              default       10               34737
Gonnet   1              default       1                35503 (Best alignment)
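
The runs reported here were made through the EBI web form. As a hedged sketch only, the "Best alignment" row (Gonnet, Gapopen = 1, Gapext = 1) could be reproduced with a locally installed command-line ClustalW. The executable name clustalw2 and the file names subset_H.fasta and subset_H.aln are assumptions; the options used (-INFILE, -ALIGN, -TYPE, -MATRIX, -GAPOPEN, -GAPEXT, -OUTFILE) are standard ClustalW command-line options, but the local program may not map one-to-one onto the web form (in particular for Endgap), so scores may differ.

```python
import subprocess


def clustalw_command(infile, outfile, matrix="GONNET", gapopen=1, gapext=1,
                     executable="clustalw2"):
    """Build the argument list for one multiple-alignment run.

    'executable' is an assumed name for a local ClustalW binary.
    """
    return [
        executable,
        f"-INFILE={infile}",
        "-ALIGN",                 # perform the full multiple alignment
        "-TYPE=PROTEIN",
        f"-MATRIX={matrix}",      # BLOSUM, PAM, GONNET or ID
        f"-GAPOPEN={gapopen}",
        f"-GAPEXT={gapext}",
        f"-OUTFILE={outfile}",
    ]


# Hypothetical file names, used for illustration only.
cmd = clustalw_command("subset_H.fasta", "subset_H.aln")
print(" ".join(cmd))

# Uncomment to run, if ClustalW is installed locally:
# subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes it easy to loop over matrices and gap penalties and to log exactly which settings produced each score.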

 

Finally, visual inspection of the alignments leads us to choose the Gonnet matrix, although the highest score is obtained with the Id matrix.

1. Id matrix: Gapopen = 1, other parameters at default values, score = 26720

gi|15240793|ref|NP_196361.1|      ILG-LDSKI----ERSLMI-PFREIKVECTIPKDDGTLVSYIGFRVQHDN 60
gi|15004984|dbj|BAB62170.1|       ILG-LDSKI----EKSLMI-PFREIKVECTIPKDDGTLVSYVGFRVQHDN 60
gi|28269441|gb|AAO37984.1|        LLG-LDSKL----EKSLLI-PFREIKVECTIPKDDGTLASYVGFRVQHDN 60
gi|7431768|pir||T16982            LLG-LDSKL----EQCLLI-PFREIKVECTIPKDDGSLATFIGFRVQHDN 60
gi|15054452|dbj|BAB62312.1|       -LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN 94
gi|15054450|dbj|BAB62311.1|       -LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN 94
                                   *. **  :    *: ::* * **:.**  * :***.  :::*:******

2. GONNET matrix: Gapopen = 1, other parameters at default values, score = 26243

gi|15240793|ref|NP_196361.1|      ILGLDSKIERSLMIPFREIKVECTIPKDDGTLVSYIGFRVQHDNARGPMK 66
gi|15004984|dbj|BAB62170.1|       ILGLDSKIEKSLMIPFREIKVECTIPKDDGTLVSYVGFRVQHDNARGPMK 66
gi|28269441|gb|AAO37984.1|        LLGLDSKLEKSLLIPFREIKVECTIPKDDGTLASYVGFRVQHDNARGPMK 66
gi|7431768|pir||T16982            LLGLDSKLEQCLLIPFREIKVECTIPKDDGSLATFIGFRVQHDNARGPMK 66
gi|15054452|dbj|BAB62312.1|       VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK 100
gi|15054450|dbj|BAB62311.1|       VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK 100
                                  :*.* . :*: :: * **:.**  * :***.  :::*:**********:*
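
One rough, quantitative companion to visual inspection is to count the gaps in the two excerpts above. The Python sketch below is an illustration only, not the procedure used to choose the matrix; the sequences are copied from the two blocks, and the function name gap_summary is hypothetical.

```python
# Sequence parts of the two alignment excerpts above (identifiers omitted).
ID_BLOCK = [
    "ILG-LDSKI----ERSLMI-PFREIKVECTIPKDDGTLVSYIGFRVQHDN",
    "ILG-LDSKI----EKSLMI-PFREIKVECTIPKDDGTLVSYVGFRVQHDN",
    "LLG-LDSKL----EKSLLI-PFREIKVECTIPKDDGTLASYVGFRVQHDN",
    "LLG-LDSKL----EQCLLI-PFREIKVECTIPKDDGSLATFIGFRVQHDN",
    "-LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN",
    "-LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN",
]
GONNET_BLOCK = [
    "ILGLDSKIERSLMIPFREIKVECTIPKDDGTLVSYIGFRVQHDNARGPMK",
    "ILGLDSKIEKSLMIPFREIKVECTIPKDDGTLVSYVGFRVQHDNARGPMK",
    "LLGLDSKLEKSLLIPFREIKVECTIPKDDGTLASYVGFRVQHDNARGPMK",
    "LLGLDSKLEQCLLIPFREIKVECTIPKDDGSLATFIGFRVQHDNARGPMK",
    "VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK",
    "VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK",
]


def gap_summary(rows):
    """Return (total gap characters, number of columns containing a gap)."""
    total = sum(row.count("-") for row in rows)
    gapped_columns = sum(1 for column in zip(*rows) if "-" in column)
    return total, gapped_columns


for label, block in (("Id", ID_BLOCK), ("GONNET", GONNET_BLOCK)):
    total, columns = gap_summary(block)
    print(f"{label:7s} gap characters = {total:2d}, gapped columns = {columns:2d}")
```

In these excerpts the Id alignment contains 32 gap characters spread over 10 columns, whereas the Gonnet alignment contains none. The Id matrix scores higher but breaks this region into short gapped segments, which is consistent with preferring the Gonnet alignment.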