MCB 720 Exam II Answer Key 2010 - Ohio University
MCB 720 Exam II Answer Key – 2010‐ Key Points
1. Two different strategies
- Recognize that the amino acid sequence corresponds to leptin by running a BLAST search. You can then use a homologous sequence (e.g., gorilla or human leptin cDNA), labeled by the random primer method, to screen your cDNA library. Note that since leptin is produced by adipocytes and secreted into the blood, you will need to use adipocytes as the source of mRNA for your cDNA library.
- Design an oligonucleotide probe corresponding to the stretch of the amino acid sequence with the least degeneracy. Show the sequence and indicate that it will be end-labeled for use as a probe to screen the library.
cDNA library production
- Isolate mRNA from elephant adipocytes using oligo dT cellulose chromatography.
- Mention each step of the process, including the enzymes involved in each step.
- Indicate how the cDNA will be placed in the cloning vector and how the vector will be introduced into the host.
Verification methods
- Northern blotting: use your selected cDNA as a probe to examine RNA isolated from adipocytes. Does a hybridizing band of ~980 nt appear? (16,000 Da × 1 AA/100 Da × 3 nt/AA = 480 nt, plus ~500 nt for the 5' and 3' UTRs)
- DNA sequencing: sequence your selected cDNA and determine whether it matches the AA sequence.
- Hybrid select translation: use your selected cDNA to isolate an mRNA, which is then translated in the presence of 35S-Met. Does it produce a 16 kDa protein band on an SDS-PAGE gel?
2.
Sequencing, assembly, and annotation
- Use massively parallel sequencing, which involves attaching digested genomic DNA to individual beads for pyrosequencing (i.e., 454 sequencing).
- Assembly is facilitated by computer software that recognizes overlaps among the sequenced genomic DNA fragments.
- Annotation is facilitated by the use of computer algorithms to recognize putative genes and by database searches against known genomes and published literature on homologous sequences, in order to arrive at tentative identifications for as many genes as possible.
Microarray experiment
- First consider which organ(s) you wish to monitor; then isolate mRNAs from this organ(s) from elephants on a high fat diet and from those on a normal diet.
- Create an elephant gene microarray chip or slide using gene fragments or oligonucleotides corresponding to the various elephant genes identified by your elephant genome sequencing project.
- Produce first strand cDNA from the high fat mRNAs using a red fluorescently labeled nucleotide, and produce first strand cDNA from the normal mRNAs using a green fluorescently labeled nucleotide. Hybridize these two cDNAs in equimolar amounts to the microarray.
- Wash the microarray and scan with a laser to identify genes induced (red) or repressed (green) by the high fat diet. Repeat the experiment 2 more times to ensure accuracy.
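The red/green readout of the microarray experiment above can be illustrated with a short calculation. This is only a sketch under stated assumptions: the spot intensities are made-up numbers, the gene names are hypothetical, and the 2-fold cutoff is a common convention rather than something specified in the answer key. Each gene is scored by the mean log2(red/green) ratio across the three replicate hybridizations.

```python
import math
from statistics import mean


def log2_ratio(red, green):
    """log2 of the red (high fat) over green (normal diet) signal for one spot."""
    return math.log2(red / green)


def call_gene(replicate_pairs, fold_cutoff=2.0):
    """Classify a gene from (red, green) intensity pairs of replicate arrays.

    The fold_cutoff of 2.0 is an assumed threshold, not part of the answer key.
    """
    mean_lr = mean(log2_ratio(r, g) for r, g in replicate_pairs)
    cutoff = math.log2(fold_cutoff)
    if mean_lr >= cutoff:
        return "induced"      # spot appears red
    if mean_lr <= -cutoff:
        return "repressed"    # spot appears green
    return "unchanged"        # spot appears yellow


# Hypothetical intensities for three replicate hybridizations per gene.
spots = {
    "gene_A": [(820, 200), (760, 210), (900, 190)],
    "gene_B": [(110, 450), (95, 400), (120, 480)],
    "gene_C": [(300, 310), (280, 290), (330, 320)],
}

for gene, pairs in spots.items():
    print(gene, call_gene(pairs))
```

Averaging over the three repeats is what makes a single noisy spot less likely to produce a false call.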
3. Construction and screening of a rose genomic library
- Isolate genomic DNA from essentially any rose organ (e.g., leaves, stems, roots), do a partial digest of this DNA with Sau3A, and ligate it into the left and right arms of lambda phage that have been digested with Bam HI. Then do a packaging reaction, infect E. coli cells, and spread them on a petri dish.
- Library screening involves hybridization of nitrocellulose plaque lifts to your RBC cDNA probe, produced by random primer labeling or nick-translation using an α-32P-dNTP.
Restriction enzyme map
S S
B B
E B S E B E S B S E
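The library construction in answer 3 works because Sau3A (^GATC) and Bam HI (G^GATCC) both leave the same 5'-GATC overhang, so Sau3A fragments ligate into Bam HI-cut lambda arms. The sketch below locates both kinds of site in a short test sequence and lists the fragment lengths of a complete Sau3A digest. The test sequence and function names are hypothetical, chosen only for illustration; a real partial digest would cut only a subset of these sites.

```python
SAU3A = "GATC"    # Sau3A cuts immediately before the G: ^GATC
BAMHI = "GGATCC"  # Bam HI cuts after the first G: G^GATCC


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq = seq.upper()
    hits = []
    pos = seq.find(site)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(site, pos + 1)
    return hits


def complete_digest_lengths(seq, cut_positions):
    """Fragment lengths when a linear sequence is cut at every given position."""
    bounds = [0] + sorted(cut_positions) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical 25-nt test sequence containing one Bam HI site.
test_seq = "AAGATCTTTTGGATCCAAAGATCAA"

sau3a_sites = find_sites(test_seq, SAU3A)
bamhi_sites = find_sites(test_seq, BAMHI)

print(sau3a_sites)  # [2, 11, 19]
print(bamhi_sites)  # [10]
# Sau3A cuts at the start of each GATC, so cut positions equal site positions.
print(complete_digest_lengths(test_seq, sau3a_sites))  # [2, 9, 8, 6]
```

Note that every Bam HI site contains a Sau3A site (position 11 sits inside the GGATCC at position 10), which is the sequence-level reason the two enzymes give compatible ends.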
4. Reverse Transcription PCR: Reverse transcription PCR (RT-PCR) involves PCR amplification of cDNA that is synthesized from mRNA by the enzyme reverse transcriptase using appropriate primers. So, in addition to the normal steps of PCR, RT-PCR includes one extra step: reverse transcription of the mRNA into DNA before amplification. This step is essential because Taq polymerase cannot use mRNA as a template. It is carried out at 37 °C, the temperature at which the reverse transcriptase works. Once the cDNA is formed, the PCR proceeds as a normal reaction using primers. The amplified DNA is then analyzed after the reaction by agarose gel electrophoresis. RT-PCR tells us whether our RNA of interest is being produced or not. It does not tell us the relative abundance, but by running standard samples on the gel, semi-quantitative results can be obtained.
qRT-PCR (quantitative real-time PCR): This technique quantitatively measures the amount of DNA in real time, after each cycle of the PCR. The principal difference between this method and other PCR methods is that it measures the amount of DNA formed after each cycle rather than only at the end. Binding of a fluorescent dye or DNA probe to the DNA product gives a fluorescence reading corresponding to the amount of target DNA present in the sample. This principle is used to determine the level of product. The RNA sample is added, along with reverse transcriptase, to convert the RNA to cDNA. The fluorescent DNA probes or DNA-binding dyes are then added, and PCR is started. The DNA-bound probes/dyes fluoresce when excited by light of a characteristic wavelength, and the instrument detects this fluorescence after each cycle. The time (number of cycles) required for a particular sample to reach a threshold fluorescence above background is recorded. This method gives information about the relative expression levels of mRNAs, so the experimenter can obtain the relative mRNA copy number for the gene and compare the expression levels of genes.
Comparison of RT-PCR, qRT-PCR & Northern blot for RNA analysis: RT-PCR and qRT-PCR require very small amounts of RNA compared to Northern blotting. They are also faster and much more sensitive than Northern blotting. qRT-PCR can give relative expression levels, but the others cannot, so it is useful for studying gene expression. The flip side is that qRT-PCR tells us nothing about the size of the RNA, which can be obtained from Northern blotting. Note that RT-PCR does not tell us anything about the mRNA size either. One big advantage of Northern blotting is that it can show the size of the transcript and can reveal whether more than one transcript of different sizes is present. One advantage of qRT-PCR is that it can be used to study siRNA. One disadvantage is that the thermocycler for qRT-PCR is specialized and quite expensive.
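To make the threshold-cycle idea concrete, here is a minimal sketch of how relative expression is commonly derived from qRT-PCR threshold cycles (Ct) using the 2^-ΔΔCt calculation. Several things here are assumptions not stated in the answer above: the use of a reference (housekeeping) gene for normalization, perfect doubling of product each cycle (efficiency = 2), and the example Ct values, which are invented.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control,
                        efficiency=2.0):
    """Fold change of a target mRNA in a sample relative to a control.

    Assumes the reference gene is equally expressed in both conditions and
    that product multiplies by `efficiency` every cycle (2.0 = perfect doubling).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # normalize the sample
    delta_control = ct_target_control - ct_ref_control   # normalize the control
    delta_delta = delta_sample - delta_control
    return efficiency ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in the
# sample than in the control, while the reference gene is unchanged.
fold = relative_expression(ct_target_sample=22.0, ct_ref_sample=18.0,
                           ct_target_control=24.0, ct_ref_control=18.0)
print(fold)  # 4.0
```

A sample that reaches threshold earlier started with more template, which is why a lower Ct translates into a higher relative expression value.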
Q5. Answers
a. Citations of papers for this gene & name:
1. Title: A novel bioinformatics approach identifies candidate genes for the synthesis and feruloylation of arabinoxylan. Authors: Mitchell RA, Dupree P, Shewry PR. Journal: Plant Physiol. Year: 2007. Doc ID: 39671
2. Title: A new technique for activation tagging in Arabidopsis. Authors: Pogorelko GV, Fursova OV, Ogarkova OA, Tarasov VA. Journal: Gene. Year: 2008. Doc ID: 43633
3. Title: A highly efficient miPCR method for isolating FSTs from transgenic Arabidopsis thaliana plants. Authors: Pogorelko GV, Fursova OV. Journal: J Genet. Year: 2008. Doc ID: 47031
b. Gene sequence and organization:
Gene sequence:
   1 GTTGAGAAGA AGGAAGAAGA AGAAGACGAT GAAGCTCTCT GTGTTTCGAT
  51 TGAGCTATTG GAACCGTCGA GGAAGTAGTT TCAGATCATC GCCGTCGTTG
 101 GATCCATCAT TCGATGGCAA ATCTCCGTCG TCTGTGTTTT GGTTCGTGAT
 151 TCATGGTCTC TGCTGCTTGA TCAGCTTGAT TCTAGGGTTC CGATTCAGCC
 201 ATTTAGTACT CTTCTTCCTT TTCTCGACTT CCGTCACCAA TCTATACACA
 251 ACGCCATTTC TCTTTGCCGG AAACGGCGGT GTAAGCCAGC TTCTCCGGCT
 301 AAAACCTCTG GAAACAGCGA CTAACAGCAC GGTGAAGAAG AACTCTCGAG
 351 TGGTGGTTGG AAGACACGGG ATCCGGATCC GTCCATGGCC TCACCCGAAT
 401 CCGATTGAGG TATTGAGAGC TCATCAGTTG CTTGTGAGAG TACAGAAAGA
 451 GCAGAAATCG ATGTACGGTG TGAGGAGCCC TAGGACTGTG ATTGTGGTGA
 501 CGCCGACTTA TGTACGGACT TTTCAGGCGC TTCATTTGAC CGGAGTTATG
 551 CACTCGCTTA TGCTTGTTCC GTACGATTTG GTTTGGATCG TTGTGGAAGC
 601 TGGTGGAATC ACTAACGAGA CTGCTTCGTT TATCGCAAAA TCAGGATTAA
 651 AGACGATTCA CTTAGGATTC GATCAGAAAA TGCCTAATAC ATGGGAAGAT
 701 CGTCACAAAT TGGAGACCAA AATGAGACTT CACGCCTTGA GGTTAAATCT
 751 TTGATTCACA TTTTGTTTTG CTGTTTTGAT TGATTGATTC TTATTTGACT
 801 TGAAATTTAT TGTTCTTGTG TTGATGTTTG CTCATCAGAG TTGTGAGAGA
 851 GAAGAAGTTA GATGGGATTG TTATGTTTGC TGATGATAGC AATATGCATA
 901 GTATGGAGCT TTTTGATGAG ATTCAAACTG TGAAATGGTT TGGTGCTCTA
 951 TCTGTTGGTA TACTTGCTCA TTCTGGTAAT GCAGATGAAT TATCATCGAT
1001 CTTGAAGAAT GAACAAGGGA AGAACAAAGA GAAACCTTCA ATGCCAATCC
1051 AAGGTCCTAG TTGTAATTCC TCTGAGAAAT TAGTGGGTTG GCACATTTTC
1101 AACACACAGC CTTATGCCAA GAAGACTGCA GTGTATATCG ATGAGAAAGC
1151 GCCTGTGATG CCTAGTAAGA TGGAATGGTC AGGGTTTGTG TTGAATTCTA
1201 GATTGCTCTG GAAGGAATCT TTAGATGATA AACCAGCATG GGTTAAAGAT
1251 CTCAGCTTGT TGGATGATGG TTATGCGGAA ATTGAGAGTC CTTTGTCTTT
1301 GGTGAAGGAT CCTTCCATGG TGGAGCCACT TGGAAGCTGT GGCCGTCGTG
1351 TCTTGCTTTG GTGGCTTCGA GTTGAAGCTC GAGCTGATAG CAAATTCCCA
1401 CCTGGGTTAG TTTTCTTTAA TCATTCTCTC TGATGGAAAG CAAATTTCTC
1451 ATCACATTAT CACTTGCTGA GTTGCAGTCT TCTAACTTGT ATTGTATGAA
1501 CGCAGCTGGA TCATAAAGTC ACCTTTAGAA ATCACAGTGC CATCAAAGCG
1551 GACACCCTGG CCAGACTCTT CCTCAGAGCT CCCAGCGGCG GCGATCAAAG
1601 AGGCAAAAAG CAACTCTAAG CCAAGAGTGT CGAAGAGCAA GAGCTATAAG
1651 GAGAAACAAG AACCTAAAGC TTTCGATGGT GTCAAAGTGT CAGCAACTAG
1701 CTGAAGAAGC TTCTATAGAT AGATCGAGTT TCATTTCATA TTCTTTTTCA
1751 CCGTAAATAT CGGAAAATTG TTTTGTGTGC ATTAGCTTCA TTGTATACAG
1801 TACAAAAATC ATATGAGATG TTTTAAGTTC TTCAAATCGA TACTTAGCTC
1851 T
Beginning of start codon: 26,839,731 bp
End of stop codon: 26,841,408 bp
#   Exon Contents    5' End       3' End       + strand              - strand              Size (bp)
1   5' UTR + ORF     26,839,704   26,840,444   1,3,5,7,9,11,13,15    2,4,6,8,10,12,14,16   740
    Intron
2   ORF              26,840,542   26,841,108   17,19,21              18,20,22              566
    Intron
3   ORF + 3' UTR     26,841,209   26,841,554   23,25,27,28           24,26,29              345
    Within 3' UTR
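The exon table can be checked with simple coordinate arithmetic. The sketch below uses only the 5' End and 3' End coordinates from the table; the function names are illustrative. Note that the Size column equals 3' End minus 5' End, whereas counting both end positions inclusively gives one more base per exon. The inclusive gene span of 1,851 bp agrees with the last position number (1851) in the gene sequence listing.

```python
# (5' End, 3' End) chromosome coordinates of the three exons, from the table.
EXONS = [
    (26_839_704, 26_840_444),
    (26_840_542, 26_841_108),
    (26_841_209, 26_841_554),
]


def inclusive_length(start, end):
    """Number of bases from start to end, counting both end positions."""
    return end - start + 1


def intron_lengths(exons):
    """Bases lying strictly between each pair of consecutive exons."""
    return [next_start - prev_end - 1
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


exon_sizes = [inclusive_length(s, e) for s, e in EXONS]
introns = intron_lengths(EXONS)
gene_span = inclusive_length(EXONS[0][0], EXONS[-1][1])

print(exon_sizes)       # [741, 567, 346]
print(introns)          # [97, 100]
print(sum(exon_sizes))  # 1654 (spliced transcript length implied by the table)
print(gene_span)        # 1851 (exons plus introns)
```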
c. Full length mRNA or cDNA sequence of the gene: >gi|145359772|ref|NM_126123.4| Arabidopsis thaliana glycosyl transferase family 43 protein (AT5G67230) mRNA, complete cds GTTGAGAAGAAGGAAGAAGAAGAAGACGATGAAGCTCTCTGTGTTTCGATTGAGCTATTGGAACCGTCGA GGAAGTAGTTTCAGATCATCGCCGTCGTTGGATCCATCATTCGATGGCAAATCTCCGTCGTCTGTGTTTT GGTTCGTGATTCATGGTCTCTGCTGCTTGATCAGCTTGATTCTAGGGTTCCGATTCAGCCATTTAGTACT CTTCTTCCTTTTCTCGACTTCCGTCACCAATCTATACACAACGCCATTTCTCTTTGCCGGAAACGGCGGT GTAAGCCAGCTTCTCCGGCTAAAACCTCTGGAAACAGCGACTAACAGCACGGTGAAGAAGAACTCTCGAG TGGTGGTTGGAAGACACGGGATCCGGATCCGTCCATGGCCTCACCCGAATCCGATTGAGGTATTGAGAGC TCATCAGTTGCTTGTGAGAGTACAGAAAGAGCAGAAATCGATGTACGGTGTGAGGAGCCCTAGGACTGTG ATTGTGGTGACGCCGACTTATGTACGGACTTTTCAGGCGCTTCATTTGACCGGAGTTATGCACTCGCTTA TGCTTGTTCCGTACGATTTGGTTTGGATCGTTGTGGAAGCTGGTGGAATCACTAACGAGACTGCTTCGTT TATCGCAAAATCAGGATTAAAGACGATTCACTTAGGATTCGATCAGAAAATGCCTAATACATGGGAAGAT CGTCACAAATTGGAGACCAAAATGAGACTTCACGCCTTGAGAGTTGTGAGAGAGAAGAAGTTAGATGGGA TTGTTATGTTTGCTGATGATAGCAATATGCATAGTATGGAGCTTTTTGATGAGATTCAAACTGTGAAATG GTTTGGTGCTCTATCTGTTGGTATACTTGCTCATTCTGGTAATGCAGATGAATTATCATCGATCTTGAAG AATGAACAAGGGAAGAACAAAGAGAAACCTTCAATGCCAATCCAAGGTCCTAGTTGTAATTCCTCTGAGA AATTAGTGGGTTGGCACATTTTCAACACACAGCCTTATGCCAAGAAGACTGCAGTGTATATCGATGAGAA AGCGCCTGTGATGCCTAGTAAGATGGAATGGTCAGGGTTTGTGTTGAATTCTAGATTGCTCTGGAAGGAA TCTTTAGATGATAAACCAGCATGGGTTAAAGATCTCAGCTTGTTGGATGATGGTTATGCGGAAATTGAGA GTCCTTTGTCTTTGGTGAAGGATCCTTCCATGGTGGAGCCACTTGGAAGCTGTGGCCGTCGTGTCTTGCT TTGGTGGCTTCGAGTTGAAGCTCGAGCTGATAGCAAATTCCCACCTGGCTGGATCATAAAGTCACCTTTA GAAATCACAGTGCCATCAAAGCGGACACCCTGGCCAGACTCTTCCTCAGAGCTCCCAGCGGCGGCGATCA AAGAGGCAAAAAGCAACTCTAAGCCAAGAGTGTCGAAGAGCAAGAGCTATAAGGAGAAACAAGAACCTAA AGCTTTCGATGGTGTCAAAGTGTCAGCAACTAGCTGAAGAAGCTTCTATAGATAGATCGAGTTTCATTTC ATATTCTTTTTCACCGTAAATATCGGAAAATTGTTTTGTGTGCATTAGCTTCATTGTATACAGTACAAAA ATCATATGAGATGTTTTAAGTTCTTCAAATCGATACTTAGCTCT
d. Protein sequence:
>gi|15240245|ref|NP_201524.1| glycosyl transferase family 43 protein [Arabidopsis thaliana]
MKLSVFRLSYWNRRGSSFRSSPSLDPSFDGKSPSSVFWFVIHGLCCLISLILGFRFSHLVLFFLFSTSVT
NLYTTPFLFAGNGGVSQLLRLKPLETATNSTVKKNSRVVVGRHGIRIRPWPHPNPIEVLRAHQLLVRVQK
EQKSMYGVRSPRTVIVVTPTYVRTFQALHLTGVMHSLMLVPYDLVWIVVEAGGITNETASFIAKSGLKTI
HLGFDQKMPNTWEDRHKLETKMRLHALRVVREKKLDGIVMFADDSNMHSMELFDEIQTVKWFGALSVGIL
AHSGNADELSSILKNEQGKNKEKPSMPIQGPSCNSSEKLVGWHIFNTQPYAKKTAVYIDEKAPVMPSKME
WSGFVLNSRLLWKESLDDKPAWVKDLSLLDDGYAEIESPLSLVKDPSMVEPLGSCGRRVLLWWLRVEARA
DSKFPPGWIIKSPLEITVPSKRTPWPDSSSELPAAAIKEAKSNSKPRVSKSKSYKEKQEPKAFDGVKVSA
TS
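The relationship between the mRNA in part c and the protein in part d can be illustrated by translating the 5' end of the mRNA. This is a minimal sketch, assuming the standard genetic code and assuming translation starts at the first ATG (a simplification that happens to hold for the first 70 nt used here); the function name is illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(mrna):
    """Translate from the first ATG until a stop codon or the end of the input."""
    mrna = mrna.upper().replace("U", "T")
    start = mrna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# First 70 nt of the NM_126123.4 mRNA sequence from part c.
five_prime = ("GTTGAGAAGAAGGAAGAAGAAGAAGACGATGAAGCTCTCTGTGTTTCGATTG"
              "AGCTATTGGAACCGTCGA")

print(translate_from_first_atg(five_prime))  # MKLSVFRLSYWNRR
```

The output matches the first 14 residues of the protein sequence above, which is the kind of check the "DNA sequencing" verification step in answer 1 relies on.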
e. Predicted molecular weight of the protein: The calculated molecular weight of the protein was found to be 55348.7 Da.
f. pI: The calculated pI for the protein was found to be 10.2906.
g. Protein targeting sequences and likely cellular destinations:
Using neural networks (NN) and hidden Markov models (HMM) trained on eukaryotes
>gi_15240245_ref_NP_201524.1_ glycosyl transferase family 43 protein _Arabidopsis thaliana_
SignalP-NN result:
>gi_15240245_ref_NP_2 length = 70
# Measure   Position   Value   Cutoff   signal peptide?
max. C      18         0.213   0.32     NO
max. Y      54         0.211   0.33     NO
max. S      44         0.674   0.87     NO
mean S      1-53       0.239   0.48     NO
D           1-53       0.225   0.43     NO
SignalP-HMM result:
>gi_15240245_ref_NP_201524.1_
Prediction: Signal anchor
Signal peptide probability: 0.002
Signal anchor probability: 0.997
Max cleavage site probability: 0.002 between pos. 53 and 54
In the results from the signal peptide prediction software SignalP 3.0, the NN prediction graph found some candidate C (cleavage) sites, but the S scores for these sites were too low for any of them to be selected as a cleavage site. So, the software predicts that there is no signal peptide in the protein. Also, in the SignalP-HMM prediction plot, the probability of the C-region is very low. The HMM instead predicts a signal anchor, i.e., that this protein is anchored in the membrane.
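The NO calls in the SignalP-NN table follow from comparing each score with its cutoff, and the HMM call follows from comparing the two probabilities. The sketch below reproduces that logic using only the values reported above; the decision rule (a measure supports a signal peptide only when its value exceeds its cutoff) is an assumption about how the output should be read, not a reimplementation of SignalP.

```python
# (value, cutoff) pairs copied from the SignalP-NN result.
NN_SCORES = {
    "max. C": (0.213, 0.32),
    "max. Y": (0.211, 0.33),
    "max. S": (0.674, 0.87),
    "mean S": (0.239, 0.48),
    "D": (0.225, 0.43),
}

# Probabilities copied from the SignalP-HMM result.
HMM_PROBABILITIES = {
    "signal peptide": 0.002,
    "signal anchor": 0.997,
}


def nn_calls(scores):
    """Return YES/NO per measure: YES only if the value exceeds its cutoff."""
    return {measure: ("YES" if value > cutoff else "NO")
            for measure, (value, cutoff) in scores.items()}


def hmm_prediction(probabilities):
    """Pick the class with the highest probability."""
    return max(probabilities, key=probabilities.get)


for measure, call in nn_calls(NN_SCORES).items():
    print(f"{measure:7s} {call}")
print("HMM prediction:", hmm_prediction(HMM_PROBABILITIES))
```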
Subcellular Localization
GFP: no data
MS/MS: no data
Annotators: no data
Predictors: iPSORT: mitochondrion; LOCtree: no data; MitoPred: mitochondrion; Mitoprot 2: no data; MultiLoc: mitochondrion; PeroxP: no data; Predotar: mitochondrion; SubLoc: mitochondrion; TargetP: mitochondrion; WoLFPSORT: plastid
GFP Images: no images
Description (TAIR8) protein_coding glycosyl transferase family 43 protein similar to IRX14 (IRREGULAR XYLEM 14), transferase, transferring glycosyl groups / xylosyltransferase [Arabidopsis thaliana] (TAIR:AT4G36890.1); similar to unnamed protein product [Vitis vinifera] (GB:CAO71907.1); contains InterPro domain Glycosyl transferase, family 43; (InterPro:IPR005027)
Coordinates (TAIR8) chr5:+:26839704..26841554
Molecular Weight 55314.31 Da (calculated)
IEP 10.29 (calculated)
GRAVY ‐0.22 (calculated)
Length 492 amino acids
Sequence (TAIR8) (BLAST)
MKLSVFRL SYWNRRGS SFRSSPSL DPSFDGKS PSSVFWFV IHGLCCLI SLILGFRF SHLVLFFL FSTSVTNL YTTPFLFA GNGGVSQL LRLKPLET ATNSTVKK NSRVVVGR HGIRIRPW PHPNPIEV LRAHQLLV RVQKEQKS MYGVRSPR TVIVVTPT YVRTFQAL HLTGVMHS LMLVPYDL VWIVVEAG GITNETAS FIAKSGLK TIHLGFDQ KMPNTWED RHKLETKM RLHALRVV REKKLDGI VMFADDSN MHSMELFD EIQTVKWF GALSVGIL AHSGNADE LSSILKNE QGKNKEKP SMPIQGPS CNSSEKLV GWHIFNTQ PYAKKTAV YIDEKAPV MPSKMEWS GFVLNSRL LWKESLDD KPAWVKDL SLLDDGYA EIESPLSL VKDPSMVE PLGSCGRR VLLWWLRV EARADSKF PPGWIIKS PLEITVPS KRTPWPDS SSELPAAA IKEAKSNS KPRVSKSK SYKEKQEP KAFDGVKV SATS*
Hydropathy Plot (raw data)
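The GRAVY entry in the record above is a grand average of hydropathy: the sum of per-residue hydropathy values divided by the sequence length. A minimal sketch follows, assuming the Kyte-Doolittle scale (the record does not state which scale was used) and applied here only to a 30-residue stretch near the N-terminus of the protein rather than to the full sequence.

```python
# Kyte-Doolittle hydropathy values per amino acid (assumed scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    protein = protein.upper().rstrip("*")
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


def window_hydropathy(protein, window=9):
    """Mean hydropathy of each sliding window, as used for a hydropathy plot."""
    return [gravy(protein[i:i + window])
            for i in range(len(protein) - window + 1)]


# Hydrophobic stretch taken from the protein sequence (residues after ...KSPSS).
segment = "VFWFVIHGLCCLISLILGFRFSHLVLFFLF"

print(round(gravy(segment), 2))
print(len(window_hydropathy(segment)))  # 22 windows of 9 residues
```

A strongly positive value over a stretch like this is the kind of feature that is consistent with the signal anchor prediction, while a slightly negative whole-protein GRAVY indicates a mildly hydrophilic protein overall.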
h. Expression pattern of the gene:
1. http://jsp.weigelworld.org/expviz/expviz.jsp?experiment=development&normalization=absolute&probesetcsv=At5g67230&action=Run (accessed 02/20/2010)
i. 5 most similar DNA sequences related to the gene:
Accession Description
NM_126123.4 Arabidopsis thaliana glycosyl transferase family 43 protein (AT5G67230) mRNA, complete cds
AY070076.1 Arabidopsis thaliana putative UDP-glucuronyltransferase (At5g67230) mRNA, complete cds
BX829393.1 Arabidopsis thaliana Full-length cDNA Complete sequence from clone GSLTFB10ZF05 of Flowers and buds of strain col-0 of Arabidop
AL161590.2 Arabidopsis thaliana DNA chromosome 4, contig fragment No. 86
Z99707.1 Arabidopsis thaliana DNA chromosome 4, ESSA I AP2 contig fragment No. 1
AJ971041.1 Brassica napus mRNA for glycosyltransferase (pglcat8 gene)
NM_119853.3 Arabidopsis thaliana IRX14 (irregular xylem 14); transferase, transferring glycosyl groups / xylosyltransferase (IRX14) mRNA, comp
>gi|145359772|ref|NM_126123.4| Arabidopsis thaliana glycosyl transferase family 43 protein (AT5G67230) mRNA, complete cds GTTGAGAAGAAGGAAGAAGAAGAAGACGATGAAGCTCTCTGTGTTTCGATTGAGCTATTGGAACCGTCGA GGAAGTAGTTTCAGATCATCGCCGTCGTTGGATCCATCATTCGATGGCAAATCTCCGTCGTCTGTGTTTT GGTTCGTGATTCATGGTCTCTGCTGCTTGATCAGCTTGATTCTAGGGTTCCGATTCAGCCATTTAGTACT CTTCTTCCTTTTCTCGACTTCCGTCACCAATCTATACACAACGCCATTTCTCTTTGCCGGAAACGGCGGT GTAAGCCAGCTTCTCCGGCTAAAACCTCTGGAAACAGCGACTAACAGCACGGTGAAGAAGAACTCTCGAG TGGTGGTTGGAAGACACGGGATCCGGATCCGTCCATGGCCTCACCCGAATCCGATTGAGGTATTGAGAGC TCATCAGTTGCTTGTGAGAGTACAGAAAGAGCAGAAATCGATGTACGGTGTGAGGAGCCCTAGGACTGTG ATTGTGGTGACGCCGACTTATGTACGGACTTTTCAGGCGCTTCATTTGACCGGAGTTATGCACTCGCTTA TGCTTGTTCCGTACGATTTGGTTTGGATCGTTGTGGAAGCTGGTGGAATCACTAACGAGACTGCTTCGTT TATCGCAAAATCAGGATTAAAGACGATTCACTTAGGATTCGATCAGAAAATGCCTAATACATGGGAAGAT CGTCACAAATTGGAGACCAAAATGAGACTTCACGCCTTGAGAGTTGTGAGAGAGAAGAAGTTAGATGGGA TTGTTATGTTTGCTGATGATAGCAATATGCATAGTATGGAGCTTTTTGATGAGATTCAAACTGTGAAATG GTTTGGTGCTCTATCTGTTGGTATACTTGCTCATTCTGGTAATGCAGATGAATTATCATCGATCTTGAAG AATGAACAAGGGAAGAACAAAGAGAAACCTTCAATGCCAATCCAAGGTCCTAGTTGTAATTCCTCTGAGA AATTAGTGGGTTGGCACATTTTCAACACACAGCCTTATGCCAAGAAGACTGCAGTGTATATCGATGAGAA AGCGCCTGTGATGCCTAGTAAGATGGAATGGTCAGGGTTTGTGTTGAATTCTAGATTGCTCTGGAAGGAA TCTTTAGATGATAAACCAGCATGGGTTAAAGATCTCAGCTTGTTGGATGATGGTTATGCGGAAATTGAGA GTCCTTTGTCTTTGGTGAAGGATCCTTCCATGGTGGAGCCACTTGGAAGCTGTGGCCGTCGTGTCTTGCT TTGGTGGCTTCGAGTTGAAGCTCGAGCTGATAGCAAATTCCCACCTGGCTGGATCATAAAGTCACCTTTA GAAATCACAGTGCCATCAAAGCGGACACCCTGGCCAGACTCTTCCTCAGAGCTCCCAGCGGCGGCGATCA AAGAGGCAAAAAGCAACTCTAAGCCAAGAGTGTCGAAGAGCAAGAGCTATAAGGAGAAACAAGAACCTAA AGCTTTCGATGGTGTCAAAGTGTCAGCAACTAGCTGAAGAAGCTTCTATAGATAGATCGAGTTTCATTTC ATATTCTTTTTCACCGTAAATATCGGAAAATTGTTTTGTGTGCATTAGCTTCATTGTATACAGTACAAAA ATCATATGAGATGTTTTAAGTTCTTCAAATCGATACTTAGCTCT >gi|17979150|gb|AY070076.1| Arabidopsis thaliana putative UDP-glucuronyltransferase (At5g67230) mRNA, complete cds GTTGAGAAGAAGGAAGAAGAAGAAGACGATGAAGCTCTCTGTGTTTCGATTGAGCTATTGGAACCGTCGA 
GGAAGTAGTTTCAGATCATCGCCGTCGTTGGATCCATCATTCGATGGCAAATCTCCGTCGTCTGTGTTTT GGTTCGTGATTCATGGTCTCTGCTGCTTGATCAGCTTGATTCTAGGGTTCCGATTCAGCCATTTAGTACT CTTCTTCCTTTTCTCGACTTCCGTCACCAATCTATACACAACGCCATTTCTCTTTGCCGGAAACGGCGGT GTAAGCCAGCTTCTCCGGCTAAAACCTCTGGAAACAGCGACTAACAGCACGGTGAAGAAGAACTCTCGAG TGGTGGTTGGAAGACACGGGATCCGGATCCGTCCATGGCCTCACCCGAATCCGATTGAGGTATTGAGAGC TCATCAGTTGCTTGTGAGAGTACAGAAAGAGCAGAAATCGATGTACGGTGTGAGGAGCCCTAGGACTGTG ATTGTGGTGACGCCGACTTATGTACGGACTTTTCAGGCGCTTCATTTGACCGGAGTTATGCACTCGCTTA TGCTTGTTCCGTACGATTTGGTTTGGATCGTTGTGGAAGCTGGTGGAATCACTAACGAGACTGCTTCGTT TATCGCAAAATCAGGATTAAAGACGATTCACTTAGGATTCGATCAGAAAATGCCTAATACATGGGAAGAT CGTCACAAATTGGAGACCAAAATGAGACTTCACGCCTTGAGAGTTGTGAGAGAGAAGAAGTTAGATGGGA TTGTTATGTTTGCTGATGATAGCAATATGCATAGTATGGAGCTTTTTGATGAGATTCAAACTGTGAAATG GTTTGGTGCTCTATCTGTTGGTATACTTGCTCATTCTGGTAATGCAGATGAATTATCATCGATCTTGAAG AATGAACAAGGGAAGAACAAAGAGAAACCTTCAATGCCAATCCAAGGTCCTAGTTGTAATTCCTCTGAGA AATTAGTGGGTTGGCACATTTTCAACACACAGCCTTATGCCAAGAAGACTGCAGTGTATATCGATGAGAA AGCGCCTGTGATGCCTAGTAAGATGGAATGGTCAGGGTTTGTGTTGAATTCTAGATTGCTCTGGAAGGAA TCTTTAGATGATAAACCAGCATGGGTTAAAGATCTCAGCTTGTTGGATGATGGTTATGCGGAAATTGAGA GTCCTTTGTCTTTGGTGAAGGATCCTTCCATGGTGGAGCCACTTGGAAGCTGTGGCCGTCGTGTCTTGCT TTGGTGGCTTCGAGTTGAAGCTCGAGCTGATAGCAAATTCCCACCTGGCTGGATCATAAAGTCACCTTTA GAAATCACAGTGCCATCAAAGCGGACACCCTGGCCAGACTCTTCCTCAGAGCTCCCAGCGGCGGCGATCA AAGAGGCAAAAAGCAACTCTAAGCCAAGAGTGTCGAAGAGCAAGAGCTATAAGGAGAAACAAGAACCTAA AGCTTTCGATGGTGTCAAAGTGTCAGCAACTAGCTGAAGAAGCTTCTATAGATAGATCGAGTTTCATTTC ATATTCTTTTTCACCGTAAATATCGGAAAATTGTTTTGTGTGCATTAGCTTCATTGTATACAGTACAAAA ATCATATGAGATGTTTTAAGTTCTTCAAATCGATACTTAGCTCTAAAAAAAAAAAAAAAA
>gi|42454836|emb|BX829393.1| Arabidopsis thaliana Full-length cDNA Complete sequence from clone GSLTFB10ZF05 of Flowers and buds of strain col-0 of Arabidopsis thaliana (thale cress) GACGATGAAGCTCTCTGTGTTTCGATTGAGCTATTGGAACCGTCGAGGAAGTAGTTTCAGATCATCGCCG CTCGTTGGATCCATCATTCGATGGCAAATCTCCGTCGTCTGTGTTTTGGTTCGTGATTCATGGTCTCTGC TGCTTGATCAGCTTGATTCTAGGGTTCCGATTCAGCCATTTAGTACTCTTCTTCCTTTTCTCGACTTCCG
TCACCAATCTATACACAACGCCATTTCTCTTTGCCGGAAACGGCGGTGTAAGCCAGCTTCTCCGGCTAAA ACCTCTGGAAACAGCGACTAACAGCACGGTGAAGAAGAACTCTCGAGTGGTGGTTGGAAGACACGGGATC CGGATCCGTCCATGGCCTCACCCGAATCCGATTGAGGTATTGAGAGCTCATCAGTTGCTTGTGAGAGTAC AGAAAGAGCAGAAATCGATGTACGGTGTGAGGAGCCCTAGGACTGTGATTGTGGTGACGCCGACTTATGT ACGGAGTTTTCAGGCGCTTCATTTGACCGGAGTTATGCACTCGCTTATGCTTGTTCCGTACGATTTGGTT TGGATCGTTGTGGAAGCTGGTGGAATCACTAACGAGACTGCTTCGTTTATCGCAAAATCAGGATTAAAGA CGATTCACTTAGGATTCGATCAGAAAATGCCTAATACATGGGAAGATCGTCACAAATTGGAGACCAAAAT GAGACTTCACGCCTTGAGAGTTGTGAGAGAGAAGAAGTTAGATGGGATTGTTATGTTTGCTGATGATAGC AATATGCATAGTATGGAGCTTTTTGATGAGATTCAAACTGTGAAATGGTTTGGTGCTCTATCTGTTGGTA TACTTGCTCATTCTGGTAATGCAGATGAATTATCATCGATCTTGAAGAATGAACAAGGGAAGAACAAAGA GAAACCTTCAATGCCAATCCAAGGTCCTAGTTGTAATTCCTCTGAGAAATTAGTGGGTTGGCACATTTTC AACACACAGCCTTATGCCAAGAAGACTGCAGTGTATATCGATGAGAAAGCGCCTGTGATGCCTAGTAAGA TGGAATGGTCAGGGTTTGTGTTGAATTCTAGATTGCTCTGGAAGGAATCTTTAGATGATAAACCAGCATG GGTTAAAGATCTCAGCTTGTTGGATGATGGTTATGCGGAAATTGAGAGTCCTTTGTCTTTGGTGAAGGAT CCTTCCATGGTGGAGCCACTTGGAAGCTGTGGCCGTCGTGTCTTGCTTTGGTGGCTTCGAGTTGAAGCTC GAGCTGATAGCAAATTCCCACCTGGCTGGATCATAAAGTCACCTTTAGAAATCACAGTGCCATCAAAGCG GACACCCTGGCCAGACTCTTCCTCAGAGCTCCCAGCGGCGGCGATCAAAGAGGCAAAAAGCAACTCTAAG CCAAGAGTGTCGAAGAGCAAGAGCTATAAGGAGAAACAAGAACCTAAAGCTTTCGATGGTGTCAAAGTGT CAGCAACTAGCTGAAGAAGCTTCTATAGATAGATCGAGTTTCATTTCATATTCTTTTTCACCGTAAATAT CGGAAAATTGTTTTGTGTGCATTAGCTTCATGTATACAGTCAAAAATCATAGAGAT
>gi|66347020|emb|AJ971041.1| Brassica napus mRNA for glycosyltransferase (pglcat8 gene) GGAGAAGAAGCGGAGAAGATGAAGCTCCTCTCTGCATTTCGATCGAGTTACTTGAATCGTCGAGGAAGCA CCTTCAGATCGTTAGATCCGTCGTCGTTCGACGGAGCTTTCATCCCCAAGCCATCTCCATCCTCAATCTT CTGGCTAGCGATTCACTGCCTCTGCTGCTTGATCAGCTTGATTCTAGGGTTCCGATTCAGCCACTTGGTC CTCTTCTTCCTCTACTCCACCTCCGTCACCAATCTATACACCACCACCGCCGGGGTGAGCCGGCTTCTCC AGCTGAAGCCTCTGGAGAAAGCCAACGTCACGGCGAAGAGCTCTCGAGTGGTGGTTGGAAGACACGGGAT CCGGATCCGTCCCTGGCCTCACCCGAACCCGATCGAAGTCATGAGAGCTCACCAGCTGCTCGAGAGAGTG CAGAAGGAGCAGAAGTCGCTCTACGGCGTGAGGAGCCCCAGGGCTGTGATCGCGGTGACGCCGACTTACG TACGGACGTTCCAGGCGCTTCACCTGACCGGAGTCATGCACTCGCTGATGCTCGTGCCTTACGTCGTGGT GTGGATCGTGGTGGAAGCTGGTGGGAAGAGCAACGAGACGGCTTCGTTCGTCGGGAAATCGAGATTGAAG ACGATTCACGTTGGTTTTGATCAGAAGATGCCGAATTACTGGGAAGATCGCGGCAAGCTGGAGAGTAAGA TGAGACTTCGAGCTTTGAGAGTTGTGAAGGAAGAGAAGCTTGATGGGGTTGTGATGTTTGCTGATGATAG TAACATGCATAGTATGGAGTTTTTCGATGAGATTCAGAACGTGAAGTGGTTCGGTGCTGTTTCCGTTGGA ATCTTGGCGCATTCGGGGAATGCGGAAGAGATGGTTATGTCGATGGATAAGAGAAGAGAGATGGAGAAAG AAGAGGTACAAGGTCCTGCGTGTAACGCGACCGATAAGCTGATCGGTTGGCATGTTTTCAATACGTTGCC ATACGCGGGGAAGAGTGCGGTTTATATAGACGATGTAGCTGCGGTTTTGCCGCAGAAGCTGGAGTGGTGT GGGTTTGTATTGAACTCGAGGATTCTTTGGGATGAGGCTGAGAGTAAGCCGGAGTGGGTTAAGGAGTTTG GGTTGTTGAACGAGAACGAAGGCGTGGAGAGTCCTTTGTCTCTGTTGAATGATCCTTCGATGGTTGAGCC TCTTGGAAGCTGTGGAAGACAGGTTCTGCTTTGGTGGCTTCGTGTTGAAGCACGCGCTGATAGCAAGTTC CCCTCCCGGGATGGTGATTGATCCTCC
>gi|186516987|ref|NM_119853.3| Arabidopsis thaliana IRX14 (irregular xylem 14); transferase, transferring glycosyl groups / xylosyltransferase (IRX14) mRNA, complete cds ATGGTGGTGGTGGTGAGAGAGCAGCAACAACAGCAAGAAGAGAAAGCGATAATCGAACTGATTAAGATCG TGAAATCCAAGTAATCTCTGTTGCTTAATCTCAGATCTTTTTGATAAGGAGAAGGAAGCAGAAGAAAGAG GTCAACGAAGAAGATGAAGCTCTCTGCTTTACATCAGAGTTACTTAAATCGCCGGAGTAATAGCTTCAGA TCTCCGACGTCTCTTGATTCTTCCGTTGATGGCTCCGGGAAGTCTTTAATCGCTGTGTTTTGGCTTATCC TGCACTGTCTTTGTTGCTTGATTAGTCTAGTTCTCGGCTTTCGATTCTCCAGATTAGTCTTCTTCTTCCT CTTCTCTACTTCTTCAACCAATCTCTACTCTCTTCCGTTTCGTCCTGACTTACCTGTGAAACACCTCGAT GTTCACACAATCGGCCGTACTCTCGATCCCGGAGCTAACGGAACGACGGTGGTGGCGACGGCGACGAAAA GCTCTCGTGTTGTTGTTGGAAGACACGGGATCCGGATCCGTCCTTGGCCGCATCCGAATCCCGTTGAGGT AATGAAAGCTCATCAGATCATTGGGAGAGTTCAAAAAGAGCAAAAGATGATCTTTGGGATGAAAAGTAGT AAGATGGTTATAGCTGTGACACCGACTTATGTGAGGACTTTTCAAGCTTTGCATTTGACTGGTGTGATGC ATTCTTTGATGCTTGTTCCTTATGATCTGGTTTGGATCGTTGTTGAAGCTGGTGGTGCTACTAATGAGAC CGGTTTGATTATTGCAAAATCAGGACTTAGGACCATTCATGTTGGGATTGATCAGAGAATGCCTAATACT TGGGAAGATCGTAGTAAATTAGAAGTCTTTATGAGACTTCAAGCTTTGAGAGTTGTGAGGGAAGAGAAGC TTGATGGGATTGTGATGTTTGCGGATGATAGTAATATGCATAGTATGGAGTTGTTTGATGAGATTCAGAA TGTGAAGTGGTTTGGTACTGTTTCTGTTGGGATATTGGCTCATTCAGGAAATGCGGAAGAGATGGTTTTG TCGATGGAAAAGAGGAAAGAGATGGAGAAAGAGGAAGAAGAGGAGAGCTCTTCGTTACCTGTACAGGGTC CTGCTTGTAACTCAACTGATCAGTTGATTGGTTGGCATATTTTCAATACATTGCCATATGCGGGGAAGAG TGCAGTTTATATAGATGATGTTGCTGCGGTTTTGCCTCAAAAACTAGAGTGGTCTGGGTTTGTGTTGAAC TCGAGATTGCTTTGGGAGGAAGCTGAGAATAAGCCAGAGTGGGTTAAGGACTTTGGGTCGTTGAATGAGA ATGAAGGTGTGGAGAGTCCTTTGTCTCTGTTGAAGGATCCTTCAATGGTGGAGCCTCTTGGAAGCTGTGG AAGACAAGTTCTGCTATGGTGGCTTCGAGTAGAAGCACGCGCTGATAGCAAATTCCCTCCCGGATGGATA ATTGATCCTCCGTTAGAGATCACAGTTGCGGCTAAACGCACTCCATGGCCAGATGTTCCTCCTGAGCCAC CAACTAAAAAGAAAGATCAAATGCCATTATCCCAAGGCAACACCGTCGTGGTAATACCAAAGCAGCAACA ACATCCAACGAAAATCCGAAAACCGAAACGCAAAAGTAAGAAAAGTAAACACGAACCTAGACCAACCGAT ACAACAACACAAGTTTATTCATCTTCGTCTAAGCATCAAGAAAGAAACTGAGAAAAGAAGAAAAAAGCTT 
TTGAGAGTTCAAGAGATTTTCTTCTACTATTATTATTCATTTGTTGCCAAAGTTTTTAGAGAAAACTCAG AGATCATCTTCTCCTCCAGATACCGCAAATATTACAACTAACGGAGAGGAAAACAAAACCGGGACAGTTT AGATTATGACAATGGAGCTCCCAAAAAAGGTATACAGGGTTTTTCGGGGAATGATGATCATTCGTTTTTC AACTATCGTTTTATTTTTTGGTCCTTTAAAACTTGTCTTTATTAGTAACAAAGATTGTTTGTTTCAATGT TCTTGAGGGTTACAAAAAGTGTAGAAACAAAACTTTGGGTTGTATAGTTTTGTAAGCATATTTCTCAATT
CATTTATGTTTTCTTCTCATGGACCCAAAACGTGAAAGCCGCCCCTCAATTTTACTTATACTGATGATAT AAATCAAATTTACCTTC
j. Sequence alignment of the At5g67230 protein and 5 similar protein sequences:
42071 1 MKLSVFRLSYWNRRGSSFRSSPS-LDPSFDG--KSPSSVFWFVIHGLCCLISLILGFRFSHLVLFFLFSTSVTNL--YTT 75
NP_201524 1 MKLSVFRLSYWNRRGSSFRSSPS-LDPSFDG--KSPSSVFWFVIHGLCCLISLILGFRFSHLVLFFLFSTSVTNL--YTT 75
XP_002283249 1 MKLSALQQSYTNRRSNSFRAAGG-LDSSVDGSGKSPAAIFWLVLHGLCCLISLVLGFRFSRLVFFLFFSTASNGG---TS 76
CBI21374 1 MKLSALQQSYTNRRSNSFRAAGG-LDSSVDGSGKSPAAIFWLVLHGLCCLISLVLGFRFSRLVFFLFFSTASNGG---TS 76
CAI93186 1 MKLSALQQSYLGRRSNSFRSSGP-LDSSSDGAFKSPAAVFWLVLHGLSCLISLLLGFRFSRLVFLLLSTSST-----YTS 74
NP_195407 1 MKLSALHQSYLNRRSNSFRSPTS-LDSSVDGSGKSLIAVFWLILHCLCCLISLVLGFRFSRLVFFFLFSTSSTNL--YSL 77
XP_002310709 1 MKFSLLQQSYNNRRSGSFRGSSAPLDSSPDNTIKSPAAIFWLFLHGICCLISLVLGFRFSRLVFFFLFSTSTTTTLYVTT 80
42071 76 PF---LF--AGNGGVSQLLR-LKPLETATN--S---TVK--KNSRVVVGRHGIRIRPWPHPNPIEVLRAHQLLVRVQKEQ 142
NP_201524 76 PF---LF--AGNGGVSQLLR-LKPLETATN--S---TVK--KNSRVVVGRHGIRIRPWPHPNPIEVLRAHQLLVRVQKEQ 142
XP_002283249 77 GLYPSTPFLGTTADIAGSLS-FQANPSPNLELPPNRTAGGISSSRVVVGRHGIRIRPWPHPNPDEVMKAHRIIERVQREQ 155
CBI21374 77 GLYPS---------------------------------------RVVVGRHGIRIRPWPHPNPDEVMKAHRIIERVQREQ 117
CAI93186 75 PFHSPTE-LAKTLDIRSVIP-ADPVGNVPLPFP-NKTA---TNSRVVVGRHGIRIRPWPHPNPVEVMKAHRIIERVQTEQ 148
NP_195407 78 PFRPDLP--VKHLDVHTIGRTLDPGANGTTVVA---TAT--KSSRVVVGRHGIRIRPWPHPNPVEVMKAHQIIGRVQKEQ 150
XP_002310709 81 PFHPLSK----TSDISNPLT-NSANDLPVI----NKTV----SSRVVVGRHGIRIRPWPHPNPSEVIKAHQIIERVQREQ 147
42071 143 KSMYGVRSPRTVIVVTPTYVRTFQALHLTGVMHSLMLVPYDLVWIVVEAGGITNETASFIAKSGLKTIHLGFDQKMPNTW 222
NP_201524 143 KSMYGVRSPRTVIVVTPTYVRTFQALHLTGVMHSLMLVPYDLVWIVVEAGGITNETASFIAKSGLKTIHLGFDQKMPNTW 222
XP_002283249 156 KLQFGIKNPRTVIVVTPTYVRTFQTLHLTGLMHSLMNVPYDLIWIVIEAGGTTNETASLLAKSGLRTIHIGFDRRMPNSW 235
CBI21374 118 KLQFGIKNPRTVIVVTPTYVRTFQTLHLTGLMHSLMNVPYDLIWIVIEAGGTTNETASLLAKSGLRTIHIGFDRRMPNSW 197
CAI93186 149 RLQFGVKDPRKIIVVTPTYVRTFHALHLTGVMHSLMLVPYDLVWIVVEAGGVSNETASLIAKSGLKTIHVGFNQRMPNSW 228
NP_195407 151 KMIFGMKSSKMVIAVTPTYVRTFQALHLTGVMHSLMLVPYDLVWIVVEAGGATNETGLIIAKSGLRTIHVGIDQRMPNTW 230
XP_002310709 148 SNQFGVKSPRSLIVVTPTYVRTFQTLHMTGVMHSLMLLPYDVVWIVVEAGGVTNETALIIAKSGVKTLHIGFNQKMPNSW 227
42071 223 EDRHKLETKMRLHALRVVREKKLDGIVMFADDSNMHSMELFDEIQTVKWFGALSVGILAHSGNADELSSILKN-----EQ 297
NP_201524 223 EDRHKLETKMRLHALRVVREKKLDGIVMFADDSNMHSMELFDEIQTVKWFGALSVGILAHSGNADELSSILKN-----EQ 297
XP_002283249 236 EDRHRLEAQMRLRALRIVREEKLDGILMFGDDSNMHSMELFDEIQKVKWIGAVSVGILAHSGNTDELSSVAHK----KAE 311
CBI21374 198 EDRHRLEAQMRLRALRIVREEKLDGILMFGDDSNMHSMELFDEIQKVKWIGAVSVGILAHSGNTDELSSVAHK----KAE 273
CAI93186 229 EERHKLESKMRLRALRIIREKKLDGIVMFADDSNMHSMELFDEIQNVKWFGAVSVGILTHSVNTDEMA--------GRKK 300
NP_195407 231 EDRSKLEVFMRLQALRVVREEKLDGIVMFADDSNMHSMELFDEIQNVKWFGTVSVGILAHSGNAEEMVLSMEKRK-EMEK 309
XP_002310709 228 EGRHRLETKMRLRALRVVREEKMDGIVMFADDSNMHSMELFDEIQNVKWFGAVSVGILVHSGGADETLLTAAAAMVDKEA 307
42071 298 GKNKEKPSMPIQGPSCNSSEKLVGWHIFNTQPYAKKTAVYIDEKAPVMPSKMEWSGFVLNSRLLWKESLDDKPAWVKDLS 377
NP_201524 298 GKNKEKPSMPIQGPSCNSSEKLVGWHIFNTQPYAKKTAVYIDEKAPVMPSKMEWSGFVLNSRLLWKESLDDKPAWVKDLS 377
XP_002283249 312 EENLP---PPVQGPACNSSEKLVGWHIFNSLPYVGNGATYIDDRATVLPRKLEWSGFVLNSRLLWKAA-EDRPEWVKDLD 387
CBI21374 274 EENLP---PPVQGPACNSSEKLVGWHIFNSLPYVGNGATYIDDRATVLPRKLEWSGFVLNSRLLWKAA-EDRPEWVKDLD 349
CAI93186 301 DEEEN-PRMPVQGPACNASDMLAGWHTFNTLPFAGKSAVYIDDRATVLPRKLEWSGFVLNTRLLWKDS-SDKPKWIKDID 378
NP_195407 310 EEEEESSSLPVQGPACNSTDQLIGWHIFNTLPYAGKSAVYIDDVAAVLPQKLEWSGFVLNSRLLWEEA-ENKPEWVKDFG 388
XP_002310709 308 EENLPNPVVPVQGPACNASNKLVGWHTFNSLPYEGKSAVYIDDRATVLPRKLEWAGFMLNSRLLWKEA-EDKPEWVKDMD 386
42071 378 LLD---DGYAEIESPLSLVKDPSMVEPLGSCGRRVLLWWLRVEARADSKFPPGWIIKSPLEITVPSKRTPWPDSSSELPA 454
NP_201524 378 LLD---DGYAEIESPLSLVKDPSMVEPLGSCGRRVLLWWLRVEARADSKFPPGWIIKSPLEITVPSKRTPWPDSSSELPA 454
XP_002283249 388 KLDGVREE---IESPLSLLKDPSMVEPLGSCGRKVLLWWLRVEARTDSKFPARWIIDPPLEVTVPAKRTPWPDAPPELPS 464
CBI21374 350 KLDGVREE---IESPLSLLKDPSMVEPLGSCGRKVLLWWLRVEARTDSKFPARWIIDPPLEVTVPAKRTPWPDAPPELPS 426
CAI93186 379 MLN---GD---IESPLGLVNDPSVVEPLGNCGRQVLLWWIRVEARADSKFPPRWIIDPPLEITVPSKRTPWRDAPPELPA 452
NP_195407 389 SLN---ENEG-VESPLSLLKDPSMVEPLGSCGRQVLLWWLRVEARADSKFPPGWIIDPPLEITVAAKRTPWPDVPPEPPT 464
XP_002310709 387 LVD---EN---IENPLALLKDPSMVEPLGSCGRQVLLWWLRVEARADSKFPPGWIIDPPLEITVPSKRTPWPDAPPELPS 460
42071 455 ---------------AAIKEAKSNSKPRVSK-SKSYKEKQEPKAFDGV-KVSATS------- 492
NP_201524 455 ---------------AAIKEAKSNSKPRVSK-SKSYKEKQEPKAFDGV-KVSATS------- 492
XP_002283249 465 NVKEISIQEH---------TEKRHAKSRASRSKHSSRSKRKHESRTADPQVSSKVS---EE- 513
CBI21374 427 NVKEISIQEH---------TEKRHAKSRASRSKHSSRSKRKHESRTADPQVSSKVS---EE- 475
CAI93186 453 NEKPAMGIQD---------PIVKHSPKRASRSKH---------------------------- 477
NP_195407 465 KKKDQMPLSQGNTVVVIPKQQQHPTKIRKPK-RKSKKSKHEPRPTDTTTQVYSSSSKHQERN 525
XP_002310709 461 NEKISVNQEQ---------TAKRSSKTRSPRSKRSSRSKRKHEVVLAETQVSARHS---EQN 510
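One simple way to quantify the similarity visible in the alignment above is percent identity between two aligned rows. The sketch below counts identical residues over columns where neither sequence has a gap; this denominator is one of several conventions and is an assumption here. The two example strings are the first 31 alignment columns of the NP_201524 and NP_195407 rows.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either row
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# First 31 columns of two rows from the alignment in part j.
np_201524 = "MKLSVFRLSYWNRRGSSFRSSPS-LDPSFDG"
np_195407 = "MKLSALHQSYLNRRSNSFRSPTS-LDSSVDG"

print(round(percent_identity(np_201524, np_195407), 1))
```

Applying the same function to full-length aligned rows would give a single number per pair, which is a convenient summary to report next to the alignment.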
k. Clustal W analysis of protein sequences from “j”

CLUSTAL 2.0.12 multiple sequence alignment

gi|225438805|ref|XP_002283249.  MKLSALQQSYTNRRSNSFRAAGG-LDSSVDGSGKSPAAIFWLVLHGLCCL 49
gi|270232051|emb|CBI21374.1|    MKLSALQQSYTNRRSNSFRAAGG-LDSSVDGSGKSPAAIFWLVLHGLCCL 49
gi|224096716|ref|XP_002310709.  MKFSLLQQSYNNRRSGSFRGSSAPLDSSPDNTIKSPAAIFWLFLHGICCL 50
gi|63087742|emb|CAI93186.1|     MKLSALQQSYLGRRSNSFRSSGP-LDSSSDGAFKSPAAVFWLVLHGLSCL 49
gi|30690793|ref|NP_195407.2|    MKLSALHQSYLNRRSNSFRSPTS-LDSSVDGSGKSLIAVFWLILHCLCCL 49
gi|15240245|ref|NP_201524.1|    MKLSVFRLSYWNRRGSSFRSSPS-LDPSFDG--KSPSSVFWFVIHGLCCL 47
                                **:* :: ** .**..***..   **.* *.  **  ::**:.:* :.**

gi|225438805|ref|XP_002283249.  ISLVLGFRFSRLVFFLFFSTASNG---GTSGLYPST-PFLGTTADIAGSL 95
gi|270232051|emb|CBI21374.1|    ISLVLGFRFSRLVFFLFFSTASNG---GTSGLYP---------------- 80
gi|224096716|ref|XP_002310709.  ISLVLGFRFSRLVFFFLFSTSTTTTLYVTTPFHP-----LSKTSDISNPL 95
gi|63087742|emb|CAI93186.1|     ISLLLGFRFSRLVFLLLSTSSTYT-----SPFHSP--TELAKTLDIRSVI 92
gi|30690793|ref|NP_195407.2|    ISLVLGFRFSRLVFFFLFSTSSTN--LYSLPFRPDLPVKHLDVHTIGRTL 97
gi|15240245|ref|NP_201524.1|    ISLILGFRFSHLVLFFLFSTSVTN--LYTTPFLFAG---NGGVSQLLRLK 92
                                ***:******:**:::: :::          :

gi|225438805|ref|XP_002283249.  SFQANPSPNLELPPNRTAGGISSSRVVVGRHGIRIRPWPHPNPDEVMKAH 145
gi|270232051|emb|CBI21374.1|    -----------------------SRVVVGRHGIRIRPWPHPNPDEVMKAH 107
gi|224096716|ref|XP_002310709.  TNSANDLPVIN--------KTVSSRVVVGRHGIRIRPWPHPNPSEVIKAH 137
gi|63087742|emb|CAI93186.1|     PADPVGNVPLPFPN----KTATNSRVVVGRHGIRIRPWPHPNPVEVMKAH 138
gi|30690793|ref|NP_195407.2|    DPGANGTTVVAT-------ATKSSRVVVGRHGIRIRPWPHPNPVEVMKAH 140
gi|15240245|ref|NP_201524.1|    PLETATNSTVK----------KNSRVVVGRHGIRIRPWPHPNPIEVLRAH 132
                                                       ******************** **::**

gi|225438805|ref|XP_002283249.  RIIERVQREQKLQFGIKNPRTVIVVTPTYVRTFQTLHLTGLMHSLMNVPY 195
gi|270232051|emb|CBI21374.1|    RIIERVQREQKLQFGIKNPRTVIVVTPTYVRTFQTLHLTGLMHSLMNVPY 157
gi|224096716|ref|XP_002310709.  QIIERVQREQSNQFGVKSPRSLIVVTPTYVRTFQTLHMTGVMHSLMLLPY 187
gi|63087742|emb|CAI93186.1|     RIIERVQTEQRLQFGVKDPRKIIVVTPTYVRTFHALHLTGVMHSLMLVPY 188
gi|30690793|ref|NP_195407.2|    QIIGRVQKEQKMIFGMKSSKMVIAVTPTYVRTFQALHLTGVMHSLMLVPY 190
gi|15240245|ref|NP_201524.1|    QLLVRVQKEQKSMYGVRSPRTVIVVTPTYVRTFQALHLTGVMHSLMLVPY 182
                                ::: *** **   :*::..: :*.*********::**:**:***** :**

gi|225438805|ref|XP_002283249.  DLIWIVIEAGGTTNETASLLAKSGLRTIHIGFDRRMPNSWEDRHRLEAQM 245
gi|270232051|emb|CBI21374.1|    DLIWIVIEAGGTTNETASLLAKSGLRTIHIGFDRRMPNSWEDRHRLEAQM 207
gi|224096716|ref|XP_002310709.  DVVWIVVEAGGVTNETALIIAKSGVKTLHIGFNQKMPNSWEGRHRLETKM 237
gi|63087742|emb|CAI93186.1|     DLVWIVVEAGGVSNETASLIAKSGLKTIHVGFNQRMPNSWEERHKLESKM 238
gi|30690793|ref|NP_195407.2|    DLVWIVVEAGGATNETGLIIAKSGLRTIHVGIDQRMPNTWEDRSKLEVFM 240
gi|15240245|ref|NP_201524.1|    DLVWIVVEAGGITNETASFIAKSGLKTIHLGFDQKMPNTWEDRHKLETKM 232
                                *::***:**** :***. ::****::*:*:*::::***:** * :**  *

gi|225438805|ref|XP_002283249.  RLRALRIVREEKLDGILMFGDDSNMHSMELFDEIQKVKWIGAVSVGILAH 295
gi|270232051|emb|CBI21374.1|    RLRALRIVREEKLDGILMFGDDSNMHSMELFDEIQKVKWIGAVSVGILAH 257
gi|224096716|ref|XP_002310709.  RLRALRVVREEKMDGIVMFADDSNMHSMELFDEIQNVKWFGAVSVGILVH 287
gi|63087742|emb|CAI93186.1|     RLRALRIIREKKLDGIVMFADDSNMHSMELFDEIQNVKWFGAVSVGILTH 288
gi|30690793|ref|NP_195407.2|    RLQALRVVREEKLDGIVMFADDSNMHSMELFDEIQNVKWFGTVSVGILAH 290
gi|15240245|ref|NP_201524.1|    RLHALRVVREKKLDGIVMFADDSNMHSMELFDEIQTVKWFGALSVGILAH 282
                                **:***::**:*:***:**.***************.***:*::*****.*

gi|225438805|ref|XP_002283249.  SGNTDE------LSSVAHKKAEEENLP---PPVQGPACNSSEKLVGWHIF 336
gi|270232051|emb|CBI21374.1|    SGNTDE------LSSVAHKKAEEENLP---PPVQGPACNSSEKLVGWHIF 298
gi|224096716|ref|XP_002310709.  SGGADETLL--TAAAAMVDKEAEENLPNPVVPVQGPACNASNKLVGWHTF 335
gi|63087742|emb|CAI93186.1|     SVNTDE--------MAGRKKDEEENPR---MPVQGPACNASDMLAGWHTF 327
gi|30690793|ref|NP_195407.2|    SGNAEEMVLSMEKRKEMEKEEEEESSS---LPVQGPACNSTDQLIGWHIF 337
gi|15240245|ref|NP_201524.1|    SGNADE-------LSSILKNEQGKNKEKPSMPIQGPSCNSSEKLVGWHIF 325
                                * .::*            .:   :.      *:***:**::: * *** *

gi|225438805|ref|XP_002283249.  NSLPYVGNGATYIDDRATVLPRKLEWSGFVLNSRLLWKA-AEDRPEWVKD 385
gi|270232051|emb|CBI21374.1|    NSLPYVGNGATYIDDRATVLPRKLEWSGFVLNSRLLWKA-AEDRPEWVKD 347
gi|224096716|ref|XP_002310709.  NSLPYEGKSAVYIDDRATVLPRKLEWAGFMLNSRLLWKE-AEDKPEWVKD 384
gi|63087742|emb|CAI93186.1|     NTLPFAGKSAVYIDDRATVLPRKLEWSGFVLNTRLLWKD-SSDKPKWIKD 376
gi|30690793|ref|NP_195407.2|    NTLPYAGKSAVYIDDVAAVLPQKLEWSGFVLNSRLLWEE-AENKPEWVKD 386
gi|15240245|ref|NP_201524.1|    NTQPYAKKTAVYIDEKAPVMPSKMEWSGFVLNSRLLWKESLDDKPAWVKD 375
                                *: *:  : *.***: *.*:* *:**:**:**:****:   .::* *:**

gi|225438805|ref|XP_002283249.  LDKLDGVREEIESPLSLLKDPSMVEPLGSCGRKVLLWWLRVEARTDSKFP 435
gi|270232051|emb|CBI21374.1|    LDKLDGVREEIESPLSLLKDPSMVEPLGSCGRKVLLWWLRVEARTDSKFP 397
gi|224096716|ref|XP_002310709.  MDLVD---ENIENPLALLKDPSMVEPLGSCGRQVLLWWLRVEARADSKFP 431
gi|63087742|emb|CAI93186.1|     IDMLNG---DIESPLGLVNDPSVVEPLGNCGRQVLLWWIRVEARADSKFP 423
gi|30690793|ref|NP_195407.2|    FGSLNE-NEGVESPLSLLKDPSMVEPLGSCGRQVLLWWLRVEARADSKFP 435
gi|15240245|ref|NP_201524.1|    LSLLDDGYAEIESPLSLVKDPSMVEPLGSCGRRVLLWWLRVEARADSKFP 425
                                :. ::     :*.**.*::***:*****.***:*****:*****:*****

gi|225438805|ref|XP_002283249.  ARWIIDPPLEVTVPAKRTPWPDAPPELPSNVK---------EISIQEHTE 476
gi|270232051|emb|CBI21374.1|    ARWIIDPPLEVTVPAKRTPWPDAPPELPSNVK---------EISIQEHTE 438
gi|224096716|ref|XP_002310709.  PGWIIDPPLEITVPSKRTPWPDAPPELPSNEK---------ISVNQEQTA 472
gi|63087742|emb|CAI93186.1|     PRWIIDPPLEITVPSKRTPWRDAPPELPANEK---------PAMGIQDPI 464
gi|30690793|ref|NP_195407.2|    PGWIIDPPLEITVAAKRTPWPDVPPEPPTKKKDQMPLSQGNTVVVIPKQQ 485
gi|15240245|ref|NP_201524.1|    PGWIIKSPLEITVPSKRTPWPDSSSELP---------------AAAIKEA 460
                                . ***..***:**.:***** * ..* *                   .

gi|225438805|ref|XP_002283249.  KRHAKSRASR--SKHSSRSKRKHESRTADPQVSSKVSEE- 513
gi|270232051|emb|CBI21374.1|    KRHAKSRASR--SKHSSRSKRKHESRTADPQVSSKVSEE- 475
gi|224096716|ref|XP_002310709.  KRSSKTRSPR--SKRSSRSKRKHEVVLAETQVSARHSEQN 510
gi|63087742|emb|CAI93186.1|     VKHSPKRASR--SKH------------------------- 477
gi|30690793|ref|NP_195407.2|    QHPTKIRKPKRKSKKSKHEPRPTDTTTQVYSSSSKHQERN 525
gi|15240245|ref|NP_201524.1|    KSNSKPRVSK--SKSYKEKQEPKAFDGVKVSATS------ 492
                                   :  * .:  **

l. Names and Locations of any mutants for this gene.
Name of Mutant                    Location/Insertion position
T-DNA.GK-322A11-015946            26839848
T-DNA.SAIL_598_F01.v1             26839724
T-DNA_LB.T-DNA.SAIL_598_D01.v1    26839724
T-DNA.SALK_066961.16.30.x         26840629
6. 5′‐ACCCAAGGGTGCGCGCGTGGCTCCTGGCGCGCCGAGGCCCT‐3′
A primer complementary to this DNA sequence is used during the sequencing procedure. This means that the sequence obtained at the end of the experiment will be that of the strand complementary to the one shown above.
5′‐ACCCAAGGGTGCGCGCGTGGCTCCTGGCGCGCCGAGGCCCT‐3′ (DNA sequence)
3′‐TGGGTTCCCACGCGCGCACCGAGGACCGCGCGGCTCCGGGA‐5′ (complementary sequence, formed in the 5′→3′ direction)
By the automated dideoxy sequencing method:
5′‐A G G G C C T C G G C G C G C C A G G A G C C A C G C G C G C A C C C T T G G G T‐3′
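The read above is simply the reverse complement of the template strand, which can be checked with a short script. This is a minimal sketch in Python; the helper name `reverse_complement` is illustrative and not part of the answer key, and it assumes the sequence contains only A, C, G and T.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3' (assumes only A/C/G/T)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Template strand from question 6, written 5'->3'.
template = "ACCCAAGGGTGCGCGCGTGGCTCCTGGCGCGCCGAGGCCCT"

# Strand synthesized from the primer, i.e. what the sequencer reports.
read = reverse_complement(template)
print("5'-" + read + "-3'")
```

Applying the function twice returns the original template, which is a quick sanity check on any hand-written complement.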
By pyrosequencing:
G C T A G C T A G C T A G C T A G C T A G C T A G C T A G C T A G C T A G C T A G C T A G C T A G C T A etc.
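As a rough illustration of how a pyrosequencing read is built up, the sketch below assumes that the repeating G C T A cycle listed above is the nucleotide dispensation order and that the newly synthesized strand is the reverse complement of the template in this question. For each dispensed nucleotide it reports how many bases would be incorporated, which is what the light-signal peak height is proportional to. The function name `pyrogram` is hypothetical, and the model is idealized (no signal noise or incomplete extension).

```python
from itertools import cycle


def pyrogram(read: str, order: str = "GCTA") -> list[tuple[str, int]]:
    """Idealized peak heights: bases incorporated per nucleotide dispensation."""
    unknown = set(read) - set(order)
    if unknown:
        # Guard against looping forever on a base that is never dispensed.
        raise ValueError(f"bases not in dispensation order: {sorted(unknown)}")
    signals = []
    pos = 0
    for base in cycle(order):
        if pos >= len(read):
            break
        run = 0
        # A homopolymer run is incorporated in a single dispensation.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


# Strand synthesized from the primer (the dideoxy read shown for this question).
read = "AGGGCCTCGGCGCGCCAGGAGCCACGCGCGCACCCTTGGGT"
signals = pyrogram(read)
for base, height in signals:
    print(base, height)
```

For this read, the first three dispensations (G, C, T) give no signal, the first A gives a single-height peak, and the following G gives a triple-height peak for the GGG run. Reading off the non-zero peaks in order reconstructs the sequence.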
A BLAST search with this sequence reveals that it corresponds to the promoter region of the human leptin (ob) gene.
7. Transgenic production of human growth hormone (hGH):
Gene construction in tobacco and goats
‐For tobacco, use a leaf‐specific promoter like the RBC promoter to drive expression of the hGH cDNA (do not use the 35S CaMV promoter, which is expressed everywhere in the plant).
‐For goats, use a mammary gland‐specific promoter like the beta‐lactoglobulin promoter to drive expression of the hGH cDNA.
Of course, you will need a 5’UT, signal sequence, and 3’UT containing the poly A tail addition sequence, but the hGH cDNA sequence should provide these sequences, even in plants.
Introduction of the transgene construct into tobacco and goats
‐Explain the binary Ti plasmid system, including transformation of E. coli and subsequent transformation of Agrobacterium, which already contains the disarmed Ti plasmid.
‐Explain the Agrobacterium infection process and the tissue culturing of transformed plant tissue to regenerate tobacco plants.
‐For goats, obtain fertilized egg cells and then use microinjection to deliver your gene construct into one of the two pronuclei in the 1‐cell embryo. Subsequently, implant several microinjected embryos into surrogate goat mothers. The use of embryonic stem cells (ESC) is unnecessary and more complicated.
Testing for the transgenic DNA, RNA and protein
‐To test for the transgenic DNA, isolate DNA from any tobacco or goat organ/tissue and run a PCR using primers corresponding to the transgene. Alternatively, do a Southern blot with this DNA using the transgene as a probe.
‐To test for transgenic RNA, isolate RNA from tobacco leaves or goat mammary glands and conduct RT‐PCR using primers corresponding to the transcribed transgene. Alternatively, do northern blotting with this RNA using the transgene as a probe.
‐To test for transgenic protein, you would need an antibody against hGH. With it, you can isolate protein from tobacco leaves or goat milk and perform an ELISA or western blot using this antibody as your primary antibody and an appropriate secondary antibody conjugated to an enzyme like alkaline phosphatase or horseradish peroxidase.
Expression of the transgene
Expression will be variable in different transgenic tobacco plants as well as in different transgenic goats. The main reasons for variable expression (of RNA and protein) are the random, variable location of the insertion site (i.e., “position effect”) and the variable number of transgene construct copies inserted (i.e., “copy number”).