Lab Exercise 1

Abstract

Degenerate primer design is important when designing polymerase chain reaction (PCR) experiments to identify novel genes of interest in plant gene families. In this laboratory exercise we apply ClustalW multiple sequence alignment to the sequences of 6 members of the A. thaliana NHX gene family found using NCBI³. Consideration of their degeneracies helps us identify which of the 10 primer sequences predicted by the j-CODEHOP primer prediction tool⁶ might work best. We then generate specific primers directly from the nucleotide sequences of 2 gene family members, rather than from amino acid sequences, and use the conserved amino acid sequences to inform our decisions about which primers are preferable. We conclude that degenerate primer design must be tailored to the constraints of the proposed experiment and that it is up to the researcher to decide which primers are best for their individual needs.

Introduction

We begin designing a degenerate primer for a gene of interest by finding known representative sequences of the NHX gene family using a genomic database such as NCBI³. We subsequently design possible primers to identify specific genes using their nucleotide sequences, and design degenerate primers to identify novel genes by performing a multiple sequence alignment of the amino acid sequences of our query genes.

Methods

All information concerning the methodology of this lab exercise can be found in the class lab manual¹. The author used R for data analysis and presentation. Full source code can be found here with partial code snippets embedded in the document.

Results

The first analysis used the ClustalW⁸ multiple sequence alignment algorithm, as implemented by the Kyoto University Bioinformatics Center², to identify conserved amino acid sequences shared by the six members of the A. thaliana NHX gene family identified in the lab manual¹. ClustalW aligns the sequences and outputs a conservation symbol for each alignment column. A "*" denotes a fully conserved residue, a ":" denotes substitution among strongly similar residues, and a "." denotes substitution among weakly similar residues. The website uses spaces to mark columns with no conservation, represented here by underscores. The results are summarized in Tables 1a to 1c below as 50-amino-acid blocks split into groups of 10. The groups containing the conserved regions are highlighted in gray.

Conserved Region 1

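library(readr)       # read_csv(), read_tsv()
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), row_spec(), column_spec(), scroll_box(), %>%

# Render the alignment block as an HTML table; row 7 is the conservation
# output and column 6 (91-100aa) holds the conserved region.
table1a <- read_csv('clustal_aln_table1.csv')
kable(table1a, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:7, monospace = TRUE) %>%
  row_spec(7, bold = TRUE, background = "#e6e6e6") %>%
  column_spec(6, background = "#e6e6e6")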

Sequence ID	51-60aa	61-70aa	71-80aa	81-90aa	91-100aa
AAM08403.1	RWMNESITAL	LIGLGTGVVI	LLISRGKNS-	HLLVFSEDLF	FIYLLPPIIF
sp|Q68KI4.2|NHX1_ARATH	RWMNESITAL	LIGLGTGVTI	LLISKGKSS-	HLLVFSEDLF	FIYLLPPIIF
sp|Q84WG1.2|NHX3_ARATH	RWMNESITAL	IIGSCTGIVI	LLISGGKSS-	RILVFSEDLF	FIYLLPPIIF
AAM08405.1	RWVNESITAI	LVGAASGTVI	LLISKGKSS-	HILVFDEELF	FIYLLPPIIF
AAM08407.1	YYLPEASASL	LIGLIVGGLA	NISNTETSIR	TWFNFHDEFF	FLFLLPPIIF
AAM08406.1	HYLPEASGSL	LIGLIVGILA	NISDTETSIR	TWFNFHEEFF	FLFLLPPIIF
Conservation Output	_::_*:__::	::______	_:_.__..__	__:__:::	::******

Table 1a: A subset of the Clustal 2.1 multiple sequence alignment, displayed in an easy-to-read format with the region of interest highlighted.

Conserved Region 2

table1b <- read_csv('clustal_aln_table2.csv')
kable(table1b, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:7, monospace = TRUE) %>% 
  row_spec(7, bold = TRUE, background = "#e6e6e6") %>%
  column_spec(5, background = "#e6e6e6")

Sequence ID	151-160aa	161-170aa	171-180aa	181-190aa	191-200aa
AAM08403.1	LGDFLAIGAI	FAATDSVCTL	QVLNQD-ETP	LLYSLVFGEG	VVNDATSVVL
sp|Q68KI4.2|NHX1_ARATH	LGDYLAIGAI	FAATDSVCTL	QVLNQD-ETP	LLYSLVFGEG	VVNDATSVVV
sp|Q84WG1.2|NHX3_ARATH	IADYLAIGAI	FSATDSVCTL	QVLNQD-ETP	LLYSLVFGEG	VVNDATSVVL
AAM08405.1	ARDYLAIGTI	FSSTDTVCTL	QILHQD-ETP	LLYSLVFGEG	VVNDATSVVL
AAM08407.1	FVECLMFGSL	ISATDPVTVL	SIFQELGSDV	NLYALVFGES	VLNDAMAISL
AAM08406.1	FVECLMFGAL	ISATDPVTVL	SIFQDVGTDV	NLYALVFGES	VLNDAMAISL
Conservation Output	:__:::_:	::*._.*_	.::::_____	_:***.	:**_::_:

Table 1b: A subset of the Clustal 2.1 multiple sequence alignment, displayed in an easy-to-read format with the region of interest highlighted.

Conserved Region 3

table1c <- read_csv('clustal_aln_table3.csv')
kable(table1c, "html", align = "l") %>% 
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:7, monospace = TRUE) %>% 
  row_spec(7, bold = TRUE, background = "#e6e6e6") %>%
  column_spec(4, background = "#e6e6e6")

Sequence ID	251-260aa	261-270aa	271-280aa	281-290aa	291-300aa
AAM08403.1	FGRHSTD-RE	VALMMLMAYL	SYMLAELFAL	SGILTVFFCG	IVMSHYTWHN
sp|Q68KI4.2|NHX1_ARATH	FGRHSTD-RE	VALMMLMAYL	SYMLAELFDL	SGILTVFFCG	IVMSHYTWHN
sp|Q84WG1.2|NHX3_ARATH	IGRHSTD-RE	VALMMLLAYL	SYMLAELFHL	SSILTVFFCG	IVMSHYTWHN
AAM08405.1	FGRHSTT-RE	LAIMVLMAYL	SYMLAELFSL	SGILTVFFCG	VLMSHYASYN
AAM08407.1	LDVDNLQNLE	CCLFVLFPYF	SYMLAEGLSL	SGIVSILFTG	IVMKHYTYSN
AAM08406.1	LDTENLQNLE	CCLFVLFPYF	SYMLAEGVGL	SGIVSILFTG	IVMKRYTFSN
Conservation Output	:._..____*	_.::::.:	*****_._	.::::_	::.::_

Table 1c: A subset of the Clustal 2.1 multiple sequence alignment, displayed in an easy-to-read format with the region of interest highlighted.
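The fully conserved stretches in the highlighted blocks can also be located programmatically. The following is a minimal sketch in base R, not the method used by ClustalW: it only detects columns where all six sequences share the same residue (the "*" case) and ignores the ":" and "." similarity classes. The function name `conserved_runs` and the `min_len` parameter are hypothetical, and each block is scanned on its own, so a run that crosses a block boundary would not be joined.

```r
# Find runs of fully conserved alignment columns within one aligned block.
# block: character vector of equal-length aligned sequences
# start: alignment position of the first column in the block
conserved_runs <- function(block, start, min_len = 5) {
  # One row per sequence, one column per alignment position.
  mat <- do.call(rbind, strsplit(block, ""))
  # A column is fully conserved when every row holds the same non-gap residue.
  conserved <- apply(mat, 2, function(col) {
    length(unique(col)) == 1 && col[1] != "-"
  })
  runs <- rle(conserved)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  keep <- which(runs$values & runs$lengths >= min_len)
  data.frame(
    sequence = vapply(keep, function(i) {
      paste(mat[1, starts[i]:ends[i]], collapse = "")
    }, character(1)),
    from = start + starts[keep] - 1,
    to = start + ends[keep] - 1,
    stringsAsFactors = FALSE
  )
}

# Highlighted blocks copied from Tables 1a, 1b, and 1c.
blocks <- list(
  list(start = 91, seqs = c("FIYLLPPIIF", "FIYLLPPIIF", "FIYLLPPIIF",
                            "FIYLLPPIIF", "FLFLLPPIIF", "FLFLLPPIIF")),
  list(start = 181, seqs = c("LLYSLVFGEG", "LLYSLVFGEG", "LLYSLVFGEG",
                             "LLYSLVFGEG", "NLYALVFGES", "NLYALVFGES")),
  list(start = 271, seqs = c("SYMLAELFAL", "SYMLAELFDL", "SYMLAELFHL",
                             "SYMLAELFSL", "SYMLAEGLSL", "SYMLAEGVGL"))
)

result <- do.call(rbind, lapply(blocks, function(b) {
  conserved_runs(b$seqs, b$start)
}))
print(result)
```

Counting from the column headers of Tables 1a to 1c, this reports LLPPIIF at 94-100, LVFGE at 185-189, and SYMLAE at 271-276.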

Multiple sequence alignment using ClustalW identified 3 conserved regions, summarized in Table 2 below. A degeneracy score was calculated for each conserved region by multiplying together, across its residues, the number of codons that can encode each amino acid¹. In addition, the amino acid sequences were converted to degenerate nucleotide sequences using an online reverse translation tool⁵. Conserved region 1 exhibits a much higher degeneracy score (10368) than conserved regions 2 (384) and 3 (576).

table2 <- read_tsv('Table1_conserved_regions.csv')
kable(table2, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:3, monospace = TRUE)

Conserved Region	Protein Sequence	ClustalW Position (aa)	Calculated Degeneracy	Nucleotide Sequence
1	LLPPIIF	94-100	10368	YTNYTNCCNCCNATHATHTTY
2	LVFGE	185-189	384	YTNGTNTTYGGNGAR
3	SYMLAE	271-276	576	WSNTAYATGYTNGCNGAR

Table 2: A summary of the identified conserved amino acid stretches of length 5 or greater from the Clustal 2.1 multiple sequence alignment generated from 6 members of the A. thaliana NHX gene family.
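The degeneracy values and nucleotide sequences in Table 2 can be reproduced with a few lines of base R. This is a sketch under stated assumptions rather than the code behind the online tool⁵: the codon counts are those of the standard genetic code, the IUPAC codons are the conventional one-codon-per-residue back-translations, and the names `peptide_degeneracy` and `back_translate` are hypothetical.

```r
# Number of codons encoding each amino acid in the standard genetic code.
codon_counts <- c(
  A = 4, R = 6, N = 2, D = 2, C = 2, Q = 2, E = 2, G = 4, H = 2, I = 3,
  L = 6, K = 2, M = 1, F = 2, P = 4, S = 6, T = 4, W = 1, Y = 2, V = 4
)

# Conventional IUPAC back-translation, one degenerate codon per residue.
iupac_codons <- c(
  A = "GCN", R = "MGN", N = "AAY", D = "GAY", C = "TGY", Q = "CAR", E = "GAR",
  G = "GGN", H = "CAY", I = "ATH", L = "YTN", K = "AAR", M = "ATG", F = "TTY",
  P = "CCN", S = "WSN", T = "ACN", W = "TGG", Y = "TAY", V = "GTN"
)

# Degeneracy of a peptide: the product of the codon counts of its residues.
peptide_degeneracy <- function(peptide) {
  residues <- strsplit(peptide, "")[[1]]
  stopifnot(all(residues %in% names(codon_counts)))
  prod(codon_counts[residues])
}

# Degenerate nucleotide sequence: the IUPAC codons joined in residue order.
back_translate <- function(peptide) {
  residues <- strsplit(peptide, "")[[1]]
  stopifnot(all(residues %in% names(iupac_codons)))
  paste(iupac_codons[residues], collapse = "")
}

regions <- c("LLPPIIF", "LVFGE", "SYMLAE")
summary_table <- data.frame(
  region = regions,
  degeneracy = vapply(regions, peptide_degeneracy, numeric(1), USE.NAMES = FALSE),
  nucleotide = vapply(regions, back_translate, character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(summary_table)
```

For example, LVFGE gives 6 x 4 x 2 x 4 x 2 = 384. Note that single IUPAC codons such as YTN (Leu) and WSN (Ser) cover more triplets than actually encode the residue, so a degeneracy counted from the nucleotide string would be higher than the codon-count product used here.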

For our third analysis, 10 degenerate primers were generated using the j-CODEHOP platform, part of the Base-By-Base sequence analysis suite⁶, with default settings and ClustalW for multiple sequence alignment. Five primers were predicted in the forward direction and five in the reverse direction. These results are reported in Table 3 below.

table3 <- read_csv('codehopPrimersT.csv')
kable(table3, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:10, monospace = TRUE) %>%
  kable_styling() %>%
  scroll_box(width = "105%", height = "450px", fixed_thead = TRUE)

Primer Name	Primer Sequence 5’-3’	Direction	Annealing Temperature (C)	Primer Length (NT)	Clamp Length (NT)	Core length (NT(AA))	Degeneracy	Primer Location (AA)	Primer Location (NT)	Primer AA Sequence	Clamp Score
VFGE-F 32x	ACTCCTCTTCTGTATTCTCTGgtnttyggnga	forward	N/A	32	21	11(4)	32	179-189	535-566	TPLLYSLVFGE	62
VFGE-R 64x	CAGACGTAGCATCATTAACAACACCytcnccraanac	reverse	N/A	37	25	12(4)	64	186-198	556-592	VFGEGVVNDATSV	73
YMLA-F 16x	TATGATGCTGATGGCTTATTTCTCTtayatgytngc	forward	N/A	36	25	11(4)	16	263-275	789-824	LMmLmAYLSYMLA	68
MLAE-F 32x	GATGCTGATGGCTTATTTCTCTTATatgytngcnga	forward	N/A	36	25	11(4)	32	264-276	792-827	MmLmAYLSYMLAE	70
GIVM-F 48x	TGGTATTCTGACTGTCTTCTTCTGTggnathgtnat	forward	N/A	36	25	11(4)	48	281-293	843-878	SGILTVFFCGIVM	69
AETF-F 32x	TGCCTTTGCTATGATGTCCTTTCTTgcngaracntt	forward	N/A	36	25	11(4)	32	311-323	933-968	HfFAllSFLAETF	60
YMLA-R 64x	GAATACCAGACAGAGCAAATAGTTCngcnarcatrta	reverse	N/A	37	25	12(4)	64	272-284	814-850	YMLAELFsLSGIL	64
MLAE-R 64x	TCAGAATACCAGACAGAGCAAATAGytcngcnarcat	reverse	N/A	37	25	12(4)	64	273-285	817-853	MLAELFsLSGILT	63
GIVM-R 48x	TAACATTATGCCAAGTATAATGTGTcatnacdatncc	reverse	N/A	37	25	12(4)	48	290-302	868-904	GIVMSHYTwhNVT	68
AETF-R 64x	CATCCATTCCCACATAAAGAAAGATraangtytcngc	reverse	N/A	37	25	12(4)	64	320-332	958-994	AETFIFLYVGmDA	74

Table 3: Predicted primers output by the j-CODEHOP program using ClustalW for alignment with all other settings kept at default.
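The degeneracy column of Table 3 can be checked against the primer sequences themselves. The sketch below assumes that degeneracy is the product of the number of bases each IUPAC code stands for, which matches the reported values; the function name `primer_degeneracy` is hypothetical and this is not j-CODEHOP's own code. Only four of the ten primers are shown.

```r
# Number of bases represented by each IUPAC nucleotide code.
iupac_counts <- c(
  A = 1, C = 1, G = 1, T = 1,
  R = 2, Y = 2, S = 2, W = 2, K = 2, M = 2,
  B = 3, D = 3, H = 3, V = 3, N = 4
)

# Degeneracy of a primer: the product of the base counts over all positions.
# The uppercase clamp contributes a factor of 1, so only the degenerate
# lowercase core changes the result.
primer_degeneracy <- function(primer) {
  bases <- strsplit(toupper(primer), "")[[1]]
  stopifnot(all(bases %in% names(iupac_counts)))
  prod(iupac_counts[bases])
}

# Primer sequences copied from Table 3.
primers <- c(
  "VFGE-F" = "ACTCCTCTTCTGTATTCTCTGgtnttyggnga",
  "VFGE-R" = "CAGACGTAGCATCATTAACAACACCytcnccraanac",
  "YMLA-F" = "TATGATGCTGATGGCTTATTTCTCTtayatgytngc",
  "GIVM-F" = "TGGTATTCTGACTGTCTTCTTCTGTggnathgtnat"
)

degeneracy <- vapply(primers, primer_degeneracy, numeric(1))
print(degeneracy)
```

These four primers give 32, 64, 16, and 48, in agreement with Table 3. Because only the short core is degenerate, these values are far lower than the whole-region degeneracies in Table 2.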

The final analysis of this laboratory exercise used the Primer3 platform⁷ to generate primer pairs specific to A. thaliana genes AAM08407.1 and AAM08406.1 due to their similarities outlined in the Discussion section. The primary primer pair represents the best predicted result, while the secondary primer pair indicates a less optimal alternative as scored by the Primer3 algorithm using default settings.

AAM08407.1 Primers

Primary

table4a <- read_csv('8407_primary.csv')
kable(table4a, "html", align="l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:2, monospace = TRUE)

Oligo	Start	Length	Melting Temperature (C)	GC Percentage	Overall Self Complementarity	3’ Self Complementarity	Nucleotide Sequence
Left Primer	1177	20	59.98	45	5	0	ATGGCATTTGCTCTTGCTCT
Right Primer	1375	20	59.94	45	3	0	TGTTCACCACCTCAAATCCA

Table 4a: The primary forward and reverse primers computed for A. thaliana gene AAM08407.1 using the Primer3 analysis platform.

Secondary

table4b <- read_csv('8407_secondary.csv')
kable(table4b, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:2, monospace = TRUE)

Oligo	Start	Length	Melting Temperature (C)	GC Percentage	Overall Self Complementarity	3’ Self Complementarity	Nucleotide Sequence
Left Primer	1011	20	59.94	45	4	2	TTGGTCACACTTGGGATTCA
Right Primer	1211	20	60.14	50	4	2	TCGTGAACAGATTGCAGAGC

Table 4b: A secondary pair of forward and reverse primers computed for A. thaliana gene AAM08407.1 using the Primer3 analysis platform.

AAM08406.1 Primers

Primary

table5a <- read_csv('8406_primary.csv')
kable(table5a, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:2, monospace = TRUE)

Oligo	Start	Length	Melting Temperature (C)	GC Percentage	Overall Self Complementarity	3’ Self Complementarity	Nucleotide Sequence
Left Primer	82	20	59.84	45	4	0	ATGATGCTCGTGCTTTCCTT
Right Primer	281	20	60.31	40	3	0	ATGATGGGAGGCAACAAAAA

Table 5a: The primary forward and reverse primers computed for A. thaliana gene AAM08406.1 using the Primer3 analysis platform.

Secondary

table5b <- read_csv('8406_secondary.csv')
kable(table5b, "html", align = "l") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0:2, monospace = TRUE)

Oligo	Start	Length	Melting Temperature (C)	GC Percentage	Overall Self Complementarity	3’ Self Complementarity	Nucleotide Sequence
Left Primer	82	20	59.84	45	4	0	ATGATGCTCGTGCTTTCCTT
Right Primer	278	20	60.34	40	2	0	ATGGGAGGCAACAAAAACAA

Table 5b: A secondary pair of forward and reverse primers computed for A. thaliana gene AAM08406.1 using the Primer3 analysis platform.
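The GC percentages in Tables 4 and 5 can be verified directly from the primer sequences, and the start positions give an estimate of the amplicon length. The sketch below is illustrative only. It assumes the Primer3 convention that the right primer's start is the position of its 5' base on the reverse strand, so that product size is right start minus left start plus 1; the function name `gc_percent` and the data frame layout are hypothetical.

```r
# Percentage of G and C bases in a primer.
gc_percent <- function(primer) {
  bases <- strsplit(toupper(primer), "")[[1]]
  100 * sum(bases %in% c("G", "C")) / length(bases)
}

# Start positions and sequences copied from Tables 4a, 4b, 5a, and 5b.
primers <- data.frame(
  table = c("4a", "4b", "5a", "5b"),
  left_start = c(1177, 1011, 82, 82),
  right_start = c(1375, 1211, 281, 278),
  left_seq = c("ATGGCATTTGCTCTTGCTCT", "TTGGTCACACTTGGGATTCA",
               "ATGATGCTCGTGCTTTCCTT", "ATGATGCTCGTGCTTTCCTT"),
  right_seq = c("TGTTCACCACCTCAAATCCA", "TCGTGAACAGATTGCAGAGC",
                "ATGATGGGAGGCAACAAAAA", "ATGGGAGGCAACAAAAACAA"),
  stringsAsFactors = FALSE
)

primers$left_gc <- vapply(primers$left_seq, gc_percent, numeric(1),
                          USE.NAMES = FALSE)
primers$right_gc <- vapply(primers$right_seq, gc_percent, numeric(1),
                           USE.NAMES = FALSE)

# Assumed convention: the right start is the 5' base of the right primer.
primers$product_size <- primers$right_start - primers$left_start + 1

print(primers[, c("table", "left_gc", "right_gc", "product_size")])
```

The computed GC percentages match the tabulated values, and under the stated assumption all four pairs would amplify products of roughly 200 nucleotides.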

Discussion

Table 2 shows that conserved regions 2 and 3 are better suited to degenerate primer design than conserved region 1, whose degeneracy (10368) is 27 times that of region 2 (384) and 18 times that of region 3 (576). Recall that degeneracy measures how uncertain we are of the underlying nucleotide sequence when given only an amino acid sequence: most amino acids are encoded by more than one codon, with most of the variation at the third (wobble) position. The greater the degeneracy, the less specific our primer pool will be and the greater the chance of it annealing to and amplifying genes outside of the target gene family.

While we can predict possible degenerate primers from the amino acid sequences of proteins within the same gene family, the more traditional approach is to use the nucleotide sequence of a gene of interest directly. Tables 4 and 5 summarize the output of the Primer3 platform when used to analyze the sequences of NHX genes AAM08407.1 and AAM08406.1. The primary and secondary primer pairs of AAM08407.1 differ in position, nucleotide sequence, and 3’ self-complementarity (0 versus 2). The two primer pairs of AAM08406.1, by contrast, share the same left primer, and their right primers differ only by a 3-nucleotide shift (start 281 versus 278), which changes the melting temperature and overall self-complementarity only slightly. While using individual nucleotide sequences to generate primer pairs eliminates the uncertainty of reverse translating amino acid sequences to cDNA, it may exclude genes of the same gene family that share little or none of the same nucleotide sequence yet still code for functionally similar proteins. This is why using multiple different primer pairs for the same gene can be beneficial for isolation: the more sites we can target, the more genetic material we can recover during our experiment for further analysis.

Conclusion

Degenerate primer design through bioinformatic analysis requires consideration of multiple approaches to best predict which primer sequence will most effectively probe for novel members of a plant gene family. Many factors, such as primer melting temperature and 3’ end complementarity, must be considered to narrow down the results of these approaches⁴. Overall complementarity is especially important to consider, as it indicates how likely a primer sequence is to anneal to itself and form a dimer rather than anneal to the target sequence. The 3’ end of the primer is particularly susceptible to dimer formation and receives its own score⁴. Even with careful consideration and data analysis, degenerate primer design is dependent on the constraints of the proposed PCR experiment, with different algorithms allowing for varying levels of customization for a particular need. In addition, it is often necessary to convert the file outputs of bioinformatics tools to more human-readable formats for the benefit of the researcher and reader alike. Degenerate primer design is not an exact process, and it comes down to the researcher to consider all of the information available to choose the best set of primers.

References

¹ Experiment 1: Bioinformatics. (n.d.). BIT161B SQ2020. Retrieved April 14, 2020, from https://canvas.ucdavis.edu/courses/461005/files/folder/Laboratory%20Manual?preview=8296468

² Multiple Sequence Alignment—CLUSTALW. (n.d.). Retrieved April 14, 2020, from https://www.genome.jp/tools-bin/clustalw

³ NHX and Arabidopsis—Protein—NCBI. (n.d.). Retrieved April 14, 2020, from https://www.ncbi.nlm.nih.gov/protein/?term=NHX%20and%20Arabidopsis&utm_source=gquery&utm_medium=search

⁴ Primer Design. (n.d.). Retrieved April 14, 2020, from http://bioweb.uwlax.edu/GenWeb/Molecular/seq_anal/primer_design/primer_design.htm

⁵ Protein to DNA reverse translation. (n.d.). Retrieved April 14, 2020, from http://www.biophp.org/minitools/protein_to_dna/demo.php

⁶ Shin-Lin Tu, Jeannette P. Staheli, Colum McClay, Kathleen McLeod, Timothy M. Rose and Chris Upton. 2018 Base-By-Base Version 3: New Comparative Tools for Large Virus Genomes. Viruses 2018, 10(11), 637; https://doi.org/10.3390/v10110637.

⁷ Steve Rozen, Helen J. Skaletsky (1998) Primer3. Code available at http://www-genome.wi.mit.edu/genome_software/other/primer3.html.

⁸ Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic acids research, 22(22), 4673–4680. https://doi.org/10.1093/nar/22.22.4673

**Degenerate primer design through shared sequence domain identification across six members of the A. thaliana NHX gene family**

Author: Tyler Brassel

Date: May 7th, 2020
