LOCUS XM_041591960 7358 bp mRNA linear INV 14-MAY-2021
DEFINITION PREDICTED: Drosophila obscura protein sidekick (LOC111066347),
transcript variant X9, mRNA.
ACCESSION XM_041591960
VERSION XM_041591960.1
DBLINK BioProject: PRJNA728747
KEYWORDS RefSeq.
SOURCE Drosophila obscura
ORGANISM Drosophila obscura
Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila; Sophophora.
COMMENT MODEL REFSEQ: This record is predicted by automated computational
analysis. This record is derived from a genomic sequence
(NW_024542752.1) annotated using gene prediction method: Gnomon.
Also see:
Documentation of NCBI's Annotation Process
##Genome-Annotation-Data-START##
Annotation Provider :: NCBI
Annotation Status :: Full annotation
Annotation Name :: Drosophila obscura Annotation
Release 101
Annotation Version :: 101
Annotation Pipeline :: NCBI eukaryotic genome annotation
pipeline
Annotation Software Version :: 8.6
Annotation Method :: Best-placed RefSeq; Gnomon
Features Annotated :: Gene; mRNA; CDS; ncRNA
##Genome-Annotation-Data-END##
FEATURES Location/Qualifiers
source 1..7358
/organism="Drosophila obscura"
/mol_type="mRNA"
/isolate="BZ-5 IFL"
/db_xref="taxon:7282"
/chromosome="Unknown"
/sex="male"
/tissue_type="whole fly"
/dev_stage="Adult fly"
/geo_loc_name="Serbia: Babin Zub"
/collection_date="2017"
gene 1..7358
/gene="LOC111066347"
/note="Derived by automated computational analysis using
gene prediction method: Gnomon. Supporting evidence
includes similarity to: 7 Proteins, and 100% coverage of
the annotated genomic feature by RNAseq alignments,
including 10 samples with support for all annotated
introns"
/db_xref="GeneID:111066347"
CDS 351..7079
/gene="LOC111066347"
/codon_start=1
/product="protein sidekick isoform X9"
/protein_id="XP_041447894.1"
/db_xref="GeneID:111066347"
/translation="MKRDQRRSSASSLRRRRRWCVDVNEKGTRMWLKISLSQPLEASL
FVLAALLLLNADSCSCYADANPQQQQQLVQQQQQQLQAPRFTTHPSSSGSIVSEGSTK
ILQCHALGYPQPTYRWLKDGKSVGEFSSSQFYRFHSTRREDAGSYQCIAKNDAGSIFS
EKSDVVVAYMGIFENVTEGRLTVVSGHPAIFDMPAIESVPTPSVLWQSADGSLNYDIK
YAFTQANQLIILSVDENDRRGYRARAINTQLGKEEISAFVHLNVSGDPYIEVAPEIIV
RPQDVKVKTGTGVLELQCIANARPLHELETIWLKDGLAVDTTGVRHTLNDPWNRTLAL
LQANSSHSGEYTCQVRLRSGGYPTVTASARVQILEPPVFFTPMRAETFGEFGGQVQLP
CDVVGEPTPQVEWFRNAESVEANVQSGRYSLGEDNTLIIKKLILDDSAMFQCLARNEA
GENSASTWLRVKTSAPVFEQPPQNVTALDGKDATISCRAIGSPNPNVTWIYNETQLVE
ISSRVQILESGDLLISNIRATDAGLYICVRANEAGSVKGEALLSVLVRTQIIQPPVDT
IVLLGLTATLQCKVSSDPSVPYNIDWYREGQMAPISNSQRIGVQADGQLEIQAVRASD
VGSYSCVVTSPGGNETRSARLSVIELPFPPSNVRVERLPEPQQRSINVSWTPGFDGNS
PISKFIIQRREVSELEKFVGPVPDPLLNWITELSNVSANQRWMLLENLKAATVYQFRV
SAVNRVGEGSPSEPSNVVELPQEAPSGPPVGFVGSARSMSEIITQWQPPQEEHRNGQI
LGYILRYRLFGYNNVPWSYQNITNEAQRNFLIQELITWKDYIVQIAAFNNMGVGVYTE
GAKIKTKEGVPEAPPTNVRVKALNSTAAQITWKPPNPQQINGINQGYKIQAWQRRQLD
GEERDMERRMMTVPPSLIDPLAEQTTVLGGLDKFAKFNVTVLCFTDPGDGVASQLVPV
ETLDDVPDEITALHFDDVSDRSVKVLWAPPRFANGILTGYTVRYQVKDRPETMKFFNL
TADDNELTVNQLQATTHYWFEVCAWTRVGSGPPKTATIQSGVEPVLPHAPTTLALSNI
EAFSVVLQFTPGFDGNSSITKWKVEAQTARNMTWFTLCEISDPDAETLTVTGLMPFTQ
YRLRLSATNVVGSSRPSDPTKDFQTIQAKPMHAPFNVTVRAMSALQLRVRWIPLQQME
WFGNPRGYNVTYRQMERTGKPSKHPPRSVMIEDHTANSHVLEGLEEWTLYEVIMNACN
DVGCSLDSGLAMERTREAVPSYGPLHVEANATSSTTVVVRWGEIPPHHRNGQIDGYKV
YYAATERGMQVLYKTIPNNSSFTTTLTELQKFVVYHVQVLAYTRLGNGALSTPPIRVQ
TFEDTPGSPSNVSFPDVTFSMARIIWDVPMDPNGEILAYQVTYTLNGSANLNYSREFP
PSDRTFRATGLMPERYYSFSVTAQTRLGWGKTASVLVYTTNNRDRPQAPSGPQVSRSQ
IQAHQITFNWTPGRDGFAPLRYYTVEMRENEGRWQPLPERVDPTLSSYTALGLRPYTT
YQFRIQATNDLGPSAFSRESIVVRTLPAAPAVGVGGLKVVPITTTSVRVQWGALETGM
WNGDAATGGYRILYQQLSDFAPALQSTPKTDVMGINENSVVLSDLQQDRNYEIVVLPF
NSQGPGPATPPTAVYVGEAVPTGEPRGVDATAISSTEVRLSWKPPKQSSQNGEILGYK
IFYLVTWSPQALEPGRKFEEEIEVVSATATSHSLVFLDKFTEYRIQLLAFNPAGDGPR
SAPVTAKTMPGVPSAPLNLRFSDITMQSLEVTWDPPKLLNGEIVGYLVTYETTEENEK
FSKQVKQKVSNTTLRVQNLEEEVTYTFTVRAQTNDYGPAVSANVTTGPQDGSPVAPRD
LTLTKTLSSVEVHWVNGPSGRGPILGYLIEAKKRENGEPSFIYDSRWTKIEQSRKGTM
KEFTVSYHILMPSTAYLFRVIAYNKYGISFPVYSKDSILTPSKLHLEYGYLQHKPFYR
QTWFMVSLAATSIVIIVMVIAVLCVKSKSYKYKQEAQKTLEESMAMSIDERQELALEL
YRSRHGVGTGTLNSVGTLRSGTLGTLGRKSTNRHQPVSVHLGKSPPRPSPASVAYHSD
EESLKCYDENPDDSSVTEKPSEVSSSEASQHSESENESVRSDPHSFVNHYANVNDSLR
QSWKKTKPVRNYSSYTDSEPEGSAVMSLNGGQIIVNNMARSRAPLPGFSSFV"
misc_feature 597..809
/gene="LOC111066347"
/note="Immunoglobulin domain; Region: Ig_3; pfam13927"
/db_xref="CDD:464046"
misc_feature 1164..1451
/gene="LOC111066347"
/note="Immunoglobulin domain; Region: Ig; cl11960"
/db_xref="CDD:472250"
misc_feature 1218..1232
/gene="LOC111066347"
/note="Ig strand B [structural motif]; Region: Ig strand
B"
/db_xref="CDD:409353"
misc_feature 1263..1277
/gene="LOC111066347"
/note="Ig strand C [structural motif]; Region: Ig strand
C"
/db_xref="CDD:409353"
misc_feature 1338..1352
/gene="LOC111066347"
/note="Ig strand E [structural motif]; Region: Ig strand
E"
/db_xref="CDD:409353"
misc_feature 1380..1397
/gene="LOC111066347"
/note="Ig strand F [structural motif]; Region: Ig strand
F"
/db_xref="CDD:409353"
misc_feature 1422..1433
/gene="LOC111066347"
/note="Ig strand G [structural motif]; Region: Ig strand
G"
/db_xref="CDD:409353"
misc_feature 1461..1736
/gene="LOC111066347"
/note="Immunoglobulin domain; Region: Ig; cl11960"
/db_xref="CDD:472250"
misc_feature 1515..1529
/gene="LOC111066347"
/note="Ig strand B [structural motif]; Region: Ig strand
B"
/db_xref="CDD:409543"
misc_feature 1554..1568
/gene="LOC111066347"
/note="Ig strand C [structural motif]; Region: Ig strand
C"
/db_xref="CDD:409543"
misc_feature 1632..1643
/gene="LOC111066347"
/note="Ig strand E [structural motif]; Region: Ig strand
E"
/db_xref="CDD:409543"
misc_feature 1671..1688
/gene="LOC111066347"
/note="Ig strand F [structural motif]; Region: Ig strand
F"
/db_xref="CDD:409543"
misc_feature 1710..1721
/gene="LOC111066347"
/note="Ig strand G [structural motif]; Region: Ig strand
G"
/db_xref="CDD:409543"
misc_feature 1746..2009
/gene="LOC111066347"
/note="Immunoglobulin I-set domain; Region: I-set;
pfam07679"
/db_xref="CDD:400151"
misc_feature 1797..1811
/gene="LOC111066347"
/note="Ig strand B [structural motif]; Region: Ig strand
B"
/db_xref="CDD:409562"
misc_feature 1836..1850
/gene="LOC111066347"
/note="Ig strand C [structural motif]; Region: Ig strand
C"
/db_xref="CDD:409562"
misc_feature 1908..1919
/gene="LOC111066347"
/note="Ig strand E [structural motif]; Region: Ig strand
E"
/db_xref="CDD:409562"
misc_feature 1947..1964
/gene="LOC111066347"
/note="Ig strand F [structural motif]; Region: Ig strand
F"
/db_xref="CDD:409562"
misc_feature 1986..1997
/gene="LOC111066347"
/note="Ig strand G [structural motif]; Region: Ig strand
G"
/db_xref="CDD:409562"
misc_feature 2022..2291
/gene="LOC111066347"
/note="Immunoglobulin I-set domain; Region: I-set;
pfam07679"
/db_xref="CDD:400151"
misc_feature 2070..2084
/gene="LOC111066347"
/note="Ig strand B [structural motif]; Region: Ig strand
B"
/db_xref="CDD:409544"
misc_feature 2115..2129
/gene="LOC111066347"
/note="Ig strand C [structural motif]; Region: Ig strand
C"
/db_xref="CDD:409544"
misc_feature 2187..2201
/gene="LOC111066347"
/note="Ig strand E [structural motif]; Region: Ig strand
E"
/db_xref="CDD:409544"
misc_feature <2229..>3014
/gene="LOC111066347"
/note="Fibronectin type 3 domain [General function
prediction only]; Region: FN3; COG3401"
/db_xref="CDD:442628"
misc_feature 2229..2246
/gene="LOC111066347"
/note="Ig strand F [structural motif]; Region: Ig strand
F"
/db_xref="CDD:409544"
misc_feature 2268..2279
/gene="LOC111066347"
/note="Ig strand G [structural motif]; Region: Ig strand
G"
/db_xref="CDD:409544"
misc_feature 2301..2624
/gene="LOC111066347"
/note="Fibronectin type 3 domain; One of three types of
internal repeats found in the plasma protein fibronectin.
Its tenth fibronectin type III repeat contains an RGD cell
recognition sequence in a flexible loop between 2 strands.
Approximately 2% of all...; Region: FN3; cd00063"
/db_xref="CDD:238020"
misc_feature order(2301..2303,2544..2546,2589..2591)
/gene="LOC111066347"
/note="Interdomain contacts [active]"
/db_xref="CDD:238020"
misc_feature order(2592..2597,2601..2606)
/gene="LOC111066347"
/note="Cytokine receptor motif [active]"
/db_xref="CDD:238020"
misc_feature 2958..3272
/gene="LOC111066347"
/note="Fibronectin type 3 domain; One of three types of
internal repeats found in the plasma protein fibronectin.
Its tenth fibronectin type III repeat contains an RGD cell
recognition sequence in a flexible loop between 2 strands.
Approximately 2% of all...; Region: FN3; cd00063"
/db_xref="CDD:238020"
misc_feature order(3237..3242,3246..3251)
/gene="LOC111066347"
/note="Cytokine receptor motif [active]"
/db_xref="CDD:238020"
misc_feature order(3285..3287,3480..3482,3525..3527)
/gene="LOC111066347"
/note="Interdomain contacts [active]"
/db_xref="CDD:238020"
misc_feature 3288..3539
/gene="LOC111066347"
/note="Fibronectin type III domain; Region: fn3;
pfam00041"
/db_xref="CDD:394996"
misc_feature 3579..3848
/gene="LOC111066347"
/note="Fibronectin type 3 domain; One of three types of
internal repeats found in the plasma protein fibronectin.
Its tenth fibronectin type III repeat contains an RGD cell
recognition sequence in a flexible loop between 2 strands.
Approximately 2% of all...; Region: FN3; cd00063"
/db_xref="CDD:238020"
misc_feature order(3579..3581,3777..3779,3822..3824)
/gene="LOC111066347"
/note="Interdomain contacts [active]"
/db_xref="CDD:238020"
misc_feature order(3825..3830,3834..3839)
/gene="LOC111066347"
/note="Cytokine receptor motif [active]"
/db_xref="CDD:238020"
misc_feature 3885..4151
/gene="LOC111066347"
/note="Fibronectin type 3 domain; One of three types of
internal repeats found in the plasma protein fibronectin.
Its tenth fibronectin type III repeat contains an RGD cell
recognition sequence in a flexible loop between 2 strands.
Approximately 2% of all...; Region: FN3; cd00063"
/db_xref="CDD:238020"
misc_feature 4206..4463
/gene="LOC111066347"
/note="Fibronectin type III domain; Region: fn3;
pfam00041"
/db_xref="CDD:394996"
misc_feature 4425..6080
/gene="LOC111066347"
/note="Fibronectin type 3 domain [General function
prediction only]; Region: FN3; COG3401"
/db_xref="CDD:442628"
misc_feature order(4449..4454,4458..4463)
/gene="LOC111066347"
/note="Cytokine receptor motif [active]"
/db_xref="CDD:238020"
misc_feature 6033..6326
/gene="LOC111066347"
/note="Fibronectin type 3 domain; One of three types of
internal repeats found in the plasma protein fibronectin.
Its tenth fibronectin type III repeat contains an RGD cell
recognition sequence in a flexible loop between 2 strands.
Approximately 2% of all...; Region: FN3; cd00063"
/db_xref="CDD:238020"
misc_feature order(6033..6035,6258..6260,6303..6305)
/gene="LOC111066347"
/note="Interdomain contacts [active]"
/db_xref="CDD:238020"
misc_feature order(6306..6311,6315..6320)
/gene="LOC111066347"
/note="Cytokine receptor motif [active]"
/db_xref="CDD:238020"
ORIGIN
1 cctgcgcgca taacggttgt tcttgccgac tacgtcgcat cgtcgtcgtc attttcgtcg
61 tcgctcgtag ttcgtggctc tcggtcgctc caacttgctg cggcgcgtgt ttcgaaacac
121 agcagaccag acgaggaaga agtctagaga agcaaatggt tcaataaaga aacaacattg
181 aaagcaggcg caagagaaag tcaacatgaa ttaagaaaaa cgcaagaaaa ttcaaaaaca
241 attaaaattt aatcaaacaa aacaacaaaa aaaccaaaaa caaaattcag aagcaggcga
301 aaagaatcaa cgatatcgaa gaaaaacagc cgcaagagag agagccgaaa atgaagagag
361 accagcggcg atcttcagcg tcgtcgctgc gtcgccgtcg tcgttggtgc gtcgacgtca
421 acgaaaaagg aacacgaatg tggctcaaaa tttcgctgtc gcagccgctg gaagcgtcgc
481 tgtttgtgct ggcagcgctg ctgctgctca atgcggacag ctgctcatgt tacgcggatg
541 ccaatccgca gcaacaacag cagctggtcc agcagcagca gcaacaactt caggcgccac
601 gttttaccac acacccatcg tcatcgggct cgattgtgag cgagggcagc accaagatcc
661 tacagtgcca tgctttgggt tatccacagc cgacatatcg ttggctgaag gacggcaagt
721 ccgtgggcga gttctcatcg agtcagttct atcggttcca cagcacacgg cgcgaggatg
781 cgggcagcta tcagtgcatt gccaagaacg atgccggatc catattcagc gagaagagcg
841 acgttgtagt ggcctacatg ggcatctttg agaacgtcac cgagggacgc ctaactgttg
901 tgagcggaca tccggccatc ttcgatatgc cggccattga gtcggtgcca acgccatcgg
961 tgctgtggca gtcggcggac gggtcgctca actacgacat caagtacgcc ttcacccagg
1021 ccaatcagct gattatactg agcgtggacg agaacgatcg gaggggctac cgggcgcggg
1081 cgatcaacac gcagctgggc aaggaggaga tcagcgcgtt cgttcatctg aatgtcagtg
1141 gcgatccgta catagaggtg gcacccgaga taattgtacg gccgcaggat gtcaaggtca
1201 agaccggcac tggcgtcctc gagctgcagt gcatcgccaa tgcgcgaccc ctgcacgaac
1261 tggagacgat ttggctgaag gacggcctcg ccgtggacac gaccggcgtg cggcacaccc
1321 tcaacgatcc ctggaaccgc accctggccc tcctgcaagc caacagctcg cactccggcg
1381 agtacacctg tcaggtgcgc ctgcgcagcg gtggctatcc aacggtcacc gcctcagccc
1441 gcgtccaaat tctcgagccg cccgtcttct tcacgcccat gcgagcggaa acctttggtg
1501 aatttggcgg ccaggtgcag ctgccctgcg atgtggtggg cgagcccacg ccccaagttg
1561 aatggttccg gaatgcggag tctgtcgagg cgaatgtgca aagcggaaga tactcactgg
1621 gagaggataa tacgctgata attaagaaac taatactgga tgattcggcc atgtttcagt
1681 gcctggcccg aaatgaggcc ggcgagaact cagccagcac ctggctgcgc gtcaaaacct
1741 cagcgccggt ctttgagcag ccgccccaga atgtgaccgc cctggatggc aaggatgcga
1801 cgatctcctg tcgggccatt ggctcgccca atcccaatgt tacctggatc tacaatgaaa
1861 cccaactggt tgagatatcc agtcgcgttc agatactcga atcgggtgat ttactcatct
1921 cgaatatccg tgccacggac gcgggactct acatctgtgt gcgggccaac gaggcgggca
1981 gcgtcaaggg cgaggccttg ctaagcgtgt tagtgcggac acaaatcata cagccgccag
2041 tggacaccat cgtgctgctg ggcctgaccg cgacactgca gtgcaaggtg tccagcgacc
2101 cgagcgtgcc ctacaacatc gactggtacc gggagggcca aatggcgccc atcagcaact
2161 cgcagcggat tggagtgcag gcggacgggc agctggagat ccaggcggtg cgggccagcg
2221 atgtgggcag ctattcgtgc gtggttacat cgccgggcgg caatgagaca cggtcggccc
2281 gtctcagtgt catcgagctg cccttcccgc ccagcaacgt gcgggtggag cgtctgccag
2341 agccgcagca gcgcagcatc aatgtgtcct ggacgcccgg attcgatggc aacagtccaa
2401 tctccaaatt tattatccag cgacgtgagg tctctgaatt ggaaaaattc gtaggtccag
2461 ttccagatcc ccttctcaat tggatcaccg aactgagcaa cgtatcggcc aatcagcggt
2521 ggatgctgct ggagaacctc aaggcggcca ccgtctatca gtttcgtgtc agtgccgtca
2581 atcgggtcgg cgagggctcc ccctcggagc ccagcaatgt tgttgagctg ccccaagaag
2641 ctccttcggg accgcctgtg ggctttgtgg gctcggcacg gtccatgtcc gagatcatta
2701 cgcagtggca gccgccgcag gaggagcatc gcaacggaca gatcctgggc tacattctgc
2761 gctatcgcct gttcgggtac aacaatgtgc cgtggtccta ccagaacatc accaacgagg
2821 cgcagcgcaa ctttctgatc caggagctga tcacgtggaa ggactacatc gtgcagattg
2881 cggccttcaa caacatgggc gtgggcgtct acacggaggg ggccaagatc aagaccaagg
2941 agggtgtgcc cgaggcaccg cccaccaacg tcagggtgaa ggccctcaac tcgacggcgg
3001 cgcagatcac gtggaagccg ccgaatccgc agcagatcaa cggcatcaac cagggctaca
3061 agatccaggc atggcagcga cggcagctcg atggggagga gcgggacatg gagcggcgca
3121 tgatgacggt gccgcccagc ctgatcgatc cactggccga gcagacgacg gtgctcggtg
3181 gcctggacaa gttcgccaag ttcaatgtga ccgtactctg cttcaccgat cccggtgacg
3241 gtgtggccag ccagctggtg ccggtggaga ctttggacga cgtgcccgac gagataacgg
3301 ccctgcactt tgacgatgtc tccgatcggt ccgtcaaagt gctgtgggcg ccgccgcgct
3361 tcgccaacgg catcctcacc ggctacacgg tgcgctacca ggtcaaggat cgccccgaga
3421 cgatgaagtt cttcaacctg accgccgacg acaacgagct gacggtgaac cagctgcagg
3481 cgacgaccca ctactggttc gaggtgtgcg cctggacgcg ggtgggcagc gggccgccca
3541 agacggcgac gatccaatcg ggcgtggagc cggtgctgcc gcatgcgccc accacactgg
3601 ccctgtccaa catcgaagcg ttttcggtgg tgctgcagtt cacgcccggc ttcgacggca
3661 actcgagcat caccaagtgg aaggtggagg cgcagacggc ccgcaacatg acctggttca
3721 cgctctgtga aatcagcgat cccgatgcgg agaccctcac cgtgaccggc ctgatgccct
3781 tcacccagta ccggctgcgg ctgagcgcca ccaatgtggt gggcagctcc cggccctcgg
3841 accccaccaa ggactttcaa accattcagg ccaagccgat gcacgccccc ttcaatgtga
3901 cggtacgcgc aatgagcgcc ctgcagctgc gcgtccgctg gataccgctg cagcagatgg
3961 agtggttcgg caatccgcgc ggctacaatg tcacctaccg gcaaatggag cgcaccggca
4021 agccctccaa gcacccgccc cgctccgtga tgatcgagga tcacacggcc aactcgcatg
4081 tgctcgaggg gctcgaggag tggaccctct acgaagtgat catgaacgcc tgcaacgatg
4141 tgggctgctc gctggacagc ggcctggcca tggagcgcac cagggaagcg gtgcccagct
4201 acggcccgct gcatgtggag gcgaacgcca cctcctcgac gacggtggtg gtgcgctggg
4261 gcgagatacc gccccaccat cgcaacggcc agatcgatgg ctacaaggtg tactacgcgg
4321 ccaccgagcg cggcatgcag gtgctctaca agacgatacc caacaacagc tccttcacca
4381 ccaccctcac cgagctgcag aagtttgtgg tgtaccacgt ccaggtgctg gcctacacgc
4441 ggctcggcaa cggcgccctc agcaccccgc ccatccgggt gcagacgttc gaggacacgc
4501 ccggatcacc gtccaatgtg agcttcccgg acgtcacctt ctcgatggcg cgcatcatct
4561 gggacgtgcc gatggacccc aatggcgaga tactcgccta ccaggtcacc tacacgctca
4621 acggaagcgc caatctgaac tacagccgcg agtttccgcc ctcggatcgc accttccggg
4681 ccaccggcct gatgcccgag cgctactaca gcttcagcgt gacggcccag acacgcctcg
4741 gctggggcaa aacggcctcg gtgctggtgt acacgaccaa caacagggac cgtccgcagg
4801 caccgtccgg gccgcaggtg tcgcgcagcc agatccaggc ccatcagatc accttcaact
4861 ggacgccggg ccgcgacggg ttcgccccgc tgcgatacta cacggtcgag atgcgggaga
4921 acgagggccg ctggcagccg ctgcccgagc gcgtcgatcc cacactcagc tcgtacacgg
4981 ccctgggtct gcgtccgtac accacctacc agttccgcat tcaggcgacc aacgatctgg
5041 gcccgtcggc gttcagccga gagagcattg tggtgcgcac cctgcccgcc gccccagcgg
5101 tgggtgtggg gggactgaag gtggtgccca taacgaccac ctcggtgcgg gtgcagtggg
5161 gggcgctgga gacgggcatg tggaacggcg acgcggccac cgggggatac cgcatactgt
5221 accagcagct gtcggacttc gcaccggccc tgcagtcgac cccgaagacg gatgtgatgg
5281 gcatcaatga gaacagcgtg gtgctgtccg atctgcagca ggaccgcaac tacgagatcg
5341 tggtgctgcc attcaattcg cagggaccgg gcccggccac accgccgacc gccgtctatg
5401 tgggcgaggc ggtgcccact ggagagccgc ggggcgtgga tgccacggcc atttccagca
5461 cggaggtgcg cctgagctgg aagccaccga agcagagcag ccagaacgga gagatactcg
5521 gctacaagat attctatttg gtgacgtggt cgccgcaggc cctcgagccg ggccgcaaat
5581 tcgaggagga aatcgaagtg gtctcggcca cggccacatc gcacagcctg gtctttctcg
5641 ataagttcac cgagtaccgc atccagttgc tggccttcaa tccggccgga gacgggccga
5701 ggtccgcccc cgtcactgcg aagacgatgc cgggcgtgcc cagtgccccg ctcaatctgc
5761 gcttttcgga catcacaatg cagagcctgg aggtgacctg ggacccgccc aagctgctca
5821 acggcgagat tgttggctat ctggtcacct acgagaccac cgaggagaac gaaaagttca
5881 gcaagcaggt gaagcagaag gtgtccaaca ccacgctgcg tgtgcagaat ctggaggagg
5941 aggtcaccta caccttcacc gtgcgcgccc agacgaacga ctatggaccg gcggtgagcg
6001 cgaatgtgac cacaggcccc caggatggct ccccggtggc accgcgcgat ctcacactca
6061 caaagacact gtccagcgtt gaggtacatt gggtcaatgg accctccggc cggggcccca
6121 tactgggcta cctcatcgag gccaagaagc gagaaaatgg agagccctca tttatttacg
6181 actcccgctg gactaagatt gagcagtcca gaaagggtac catgaaggag tttaccgtca
6241 gctaccacat cctgatgcca tcgacggcgt atttgttccg ggtaattgct tacaataagt
6301 atggcatatc gttccctgtt tactcgaagg actcgatact gacgccctcg aagctgcatc
6361 tggagtacgg ctatctgcag cacaagccct tctacaggca gacctggttc atggtctccc
6421 tggcggccac ctcgatcgtc atcattgtca tggtcattgc ggtgctctgt gtgaagagca
6481 agagctacaa gtacaagcag gaggcacaaa agacgctgga ggagtccatg gccatgtcga
6541 ttgatgagcg ccaggagctg gccctggagc tgtatcgttc gcgtcacggc gtcggcaccg
6601 gcaccctgaa cagcgttgga acattgcgca gcggaacttt gggaaccctc ggccgtaagt
6661 ccaccaaccg acaccagccg gtgagtgtgc atttgggtaa gagtccaccg cgaccctcgc
6721 ccgcatcggt ggcgtaccac agcgatgagg agagtctcaa gtgctacgac gagaatcccg
6781 acgacagcag tgttacggaa aagccatccg aggtgagcag ctcggaggca tcccagcact
6841 cggagagcga gaacgagagc gtgaggagcg atccgcactc gttcgtcaat cactatgcga
6901 atgtgaatga ctcgctgcgg cagtcctgga agaagaccaa gcccgtgcgc aactactcga
6961 gctacacaga ctccgagccg gagggcagtg cagtgatgag tctcaatggt ggccagatta
7021 ttgtcaataa tatggccaga tcgagggcac cactgcccgg cttctcgtca tttgtctgac
7081 aatcaaccga attctaagat ctatgccgtg gtagcagcag caccgtcatc cgcgagacat
7141 ttgtctgaat tattttggaa acgataacgg aaaacggaaa aacggaggct gaagctgaaa
7201 ccggagctgg agttgcagtg gggagcgttc taacgagttc gacacggatg tagcgagtgg
7261 gctaaactgc ctgcctgcct gcaactgttc tgtctggctc tccctggatc ttcgtagctg
7321 tccggcgagg cgctgctaca tggatattta tcgtagtt