Unfortunately due to lack of commercial feasibility, the SkyBLAST service has been suspended from December 1st, 2025.
All subscriptions for paid accounts have been paused. For further information or enquiries, please email [email protected]

basement membrane-specific heparan sulfate proteoglycan core


LOCUS       XP_044252015            4032 aa            linear   INV 09-DEC-2024
            protein isoform X23 [Drosophila takahashii].
ACCESSION   XP_044252015
VERSION     XP_044252015.1
DBLINK      BioProject: PRJNA1194641
DBSOURCE    REFSEQ: accession XM_044396080.1
KEYWORDS    RefSeq.
SOURCE      Drosophila takahashii
  ORGANISM  Drosophila takahashii
            Eukaryota; Metazoa; Ecdysozoa; Arthropoda; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila; Sophophora.
COMMENT     MODEL REFSEQ:  This record is predicted by automated computational
            analysis. This record is derived from a genomic sequence
            (NC_091683) annotated using gene prediction method: Gnomon.
            Also see:
                Documentation of NCBI's Annotation Process
            
            ##Genome-Annotation-Data-START##
            Annotation Provider         :: NCBI RefSeq
            Annotation Status           :: Full annotation
            Annotation Name             :: GCF_030179915.1-RS_2024_12
            Annotation Pipeline         :: NCBI eukaryotic genome annotation
                                           pipeline
            Annotation Software Version :: 10.3
            Annotation Method           :: Gnomon; cmsearch; tRNAscan-SE
            Features Annotated          :: Gene; mRNA; CDS; ncRNA
            Annotation Date             :: 12/07/2024
            ##Genome-Annotation-Data-END##
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..4032
                     /organism="Drosophila takahashii"
                     /strain="IR98-3 E-12201"
                     /db_xref="taxon:29030"
                     /chromosome="X"
                     /sex="female"
                     /tissue_type="Whole fly"
                     /dev_stage="Adult fly"
                     /collected_by="Originally obtained from EHIME-Fly"
     Protein         1..4032
                     /product="basement membrane-specific heparan sulfate
                     proteoglycan core protein isoform X23"
                     /calculated_mol_wt=445686
     Region          84..>169
                     /region_name="CCDC66"
                     /note="Coiled-coil domain-containing protein 66;
                     pfam15236"
                     /db_xref="CDD:434558"
     Region          385..415
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(385,394,405..406)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(398,401,405,411..412)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            408..412
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          418..451
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(423,430,441..442)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(434,437,441,447..448)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            444..448
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          504..536
                     /region_name="LDLa"
                     /note="Low-density lipoprotein receptor domain class A;
                     smart00192"
                     /db_xref="CDD:197566"
     Site            order(510,518,529..530)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(522,525,529,535..536)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            532..536
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          544..579
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(549,558,569..570)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(562,565,569,575..576)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            572..576
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          589..623
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(594,602,613..614)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(606,609,613,619..620)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            616..620
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          675..710
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(680,689,700..701)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(693,696,700,706..707)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            703..707
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          791..825
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(796,804,815..816)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(808,811,815,821..822)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            818..822
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          831..865
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(836,844,855..856)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(848,851,855,861..862)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            858..862
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          880..948
                     /region_name="Ig_3"
                     /note="Immunoglobulin domain; pfam13927"
                     /db_xref="CDD:464046"
     Region          980..1014
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(985,993,1004..1005)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(997,1000,1004,1010..1011)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            1007..1011
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          1020..1054
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(1025,1033,1044..1045)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(1037,1040,1044,1050..1051)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            1047..1051
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          1063..1097
                     /region_name="LDLa"
                     /note="Low Density Lipoprotein Receptor Class A domain, a
                     cysteine-rich repeat that plays a central role in
                     mammalian cholesterol metabolism; the receptor protein
                     binds LDL and transports it into cells by endocytosis; 7
                     successive cysteine-rich repeats of about...; cd00112"
                     /db_xref="CDD:238060"
     Site            order(1068,1076,1087..1088)
                     /site_type="active"
                     /note="putative binding surface [active]"
                     /db_xref="CDD:238060"
     Site            order(1080,1083,1087,1093..1094)
                     /site_type="other"
                     /note="calcium-binding site [ion binding]"
                     /db_xref="CDD:238060"
     Site            1090..1094
                     /site_type="other"
                     /note="D-X-S-D-E motif"
                     /db_xref="CDD:238060"
     Region          1115..1190
                     /region_name="Ig"
                     /note="Immunoglobulin domain; cl11960"
                     /db_xref="CDD:472250"
     Region          1118..1122
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:143220"
     Region          1131..1135
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:143220"
     Region          1154..1158
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:143220"
     Region          1168..1173
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:143220"
     Region          1182..1185
                     /region_name="Ig strand G"
                     /note="Ig strand G [structural motif]"
                     /db_xref="CDD:143220"
     Region          1289..1419
                     /region_name="Laminin_B"
                     /note="Laminin B (Domain IV); pfam00052"
                     /db_xref="CDD:459652"
     Region          1476..1529
                     /region_name="Laminin_EGF"
                     /note="Laminin EGF domain; pfam00053"
                     /db_xref="CDD:395007"
     Site            order(1476,1478,1490,1500,1502,1511)
                     /site_type="active"
                     /note="EGF-like motif [active]"
                     /db_xref="CDD:238012"
     Region          1536..1573
                     /region_name="EGF_Lam"
                     /note="Laminin-type epidermal growth factor-like domai;
                     smart00180"
                     /db_xref="CDD:214543"
     Region          1654..1790
                     /region_name="Laminin_B"
                     /note="Laminin B (Domain IV); pfam00052"
                     /db_xref="CDD:459652"
     Region          <1791..1817
                     /region_name="EGF_Lam"
                     /note="Laminin-type epidermal growth factor-like domain;
                     laminins are the major noncollagenous components of
                     basement membranes that mediate cell adhesion, growth
                     migration, and differentiation; the laminin-type epidermal
                     growth factor-like module occurs in...; cd00055"
                     /db_xref="CDD:238012"
     Region          1825..1874
                     /region_name="EGF_Lam"
                     /note="Laminin-type epidermal growth factor-like domain;
                     laminins are the major noncollagenous components of
                     basement membranes that mediate cell adhesion, growth
                     migration, and differentiation; the laminin-type epidermal
                     growth factor-like module occurs in...; cd00055"
                     /db_xref="CDD:238012"
     Site            order(1826,1828,1835,1842,1845,1854)
                     /site_type="active"
                     /note="EGF-like motif [active]"
                     /db_xref="CDD:238012"
     Region          <1911..1941
                     /region_name="Laminin_EGF"
                     /note="Laminin EGF domain; pfam00053"
                     /db_xref="CDD:395007"
     Region          2007..2141
                     /region_name="Laminin_B"
                     /note="Laminin B (Domain IV); pfam00052"
                     /db_xref="CDD:459652"
     Region          2234..2316
                     /region_name="IG_like"
                     /note="Immunoglobulin like; smart00410"
                     /db_xref="CDD:214653"
     Region          2245..2249
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409353"
     Region          2261..2265
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409353"
     Region          2280..2284
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409353"
     Region          2294..2299
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409353"
     Region          2309..2312
                     /region_name="Ig strand G"
                     /note="Ig strand G [structural motif]"
                     /db_xref="CDD:409353"
     Region          <2343..2396
                     /region_name="Ig_3"
                     /note="Immunoglobulin domain; pfam13927"
                     /db_xref="CDD:464046"
     Region          2427..2510
                     /region_name="IG_like"
                     /note="Immunoglobulin like; smart00410"
                     /db_xref="CDD:214653"
     Region          2438..2442
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409544"
     Region          2451..2455
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409544"
     Region          2474..2478
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409544"
     Region          2488..2493
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409544"
     Region          2523..2603
                     /region_name="Ig"
                     /note="Immunoglobulin domain; cl11960"
                     /db_xref="CDD:472250"
     Region          2540..2544
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409570"
     Region          2553..2557
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409570"
     Region          2572..2576
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409570"
     Region          2586..2591
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409570"
     Region          2599..2602
                     /region_name="Ig strand G"
                     /note="Ig strand G [structural motif]"
                     /db_xref="CDD:409570"
     Region          2621..2697
                     /region_name="I-set"
                     /note="Immunoglobulin I-set domain; pfam07679"
                     /db_xref="CDD:400151"
     Region          2629..2633
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409544"
     Region          2642..2646
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409544"
     Region          2663..2667
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409544"
     Region          2677..2682
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409544"
     Region          2690..2693
                     /region_name="Ig strand G"
                     /note="Ig strand G [structural motif]"
                     /db_xref="CDD:409544"
     Region          2710..2797
                     /region_name="IG_like"
                     /note="Immunoglobulin like; smart00410"
                     /db_xref="CDD:214653"
     Region          2720..2724
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409548"
     Region          2733..2737
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409548"
     Region          2777..2782
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409548"
     Region          2922..2998
                     /region_name="IG_like"
                     /note="Immunoglobulin like; smart00410"
                     /db_xref="CDD:214653"
     Region          2931..2935
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409541"
     Region          2944..2948
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409541"
     Region          2964..2968
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409541"
     Region          2978..2983
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409541"
     Region          3010..>3077
                     /region_name="IG_like"
                     /note="Immunoglobulin like; smart00410"
                     /db_xref="CDD:214653"
     Region          3021..3025
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409390"
     Region          3037..3044
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409390"
     Region          3060..3064
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409390"
     Region          3154..3222
                     /region_name="Ig_3"
                     /note="Immunoglobulin domain; pfam13927"
                     /db_xref="CDD:464046"
     Region          3252..3318
                     /region_name="IG_like"
                     /note="Immunoglobulin like; smart00410"
                     /db_xref="CDD:214653"
     Region          3260..3264
                     /region_name="Ig strand B"
                     /note="Ig strand B [structural motif]"
                     /db_xref="CDD:409390"
     Region          3273..3277
                     /region_name="Ig strand C"
                     /note="Ig strand C [structural motif]"
                     /db_xref="CDD:409390"
     Region          3292..3296
                     /region_name="Ig strand E"
                     /note="Ig strand E [structural motif]"
                     /db_xref="CDD:409390"
     Region          3306..3311
                     /region_name="Ig strand F"
                     /note="Ig strand F [structural motif]"
                     /db_xref="CDD:409390"
     Region          3340..3487
                     /region_name="LamG"
                     /note="Laminin G domain; Laminin G-like domains are
                     usually Ca++ mediated receptors that can have binding
                     sites for steroids, beta1 integrins, heparin, sulfatides,
                     fibulin-1, and alpha-dystroglycans. Proteins that contain
                     LamG domains serve a variety of...; cd00110"
                     /db_xref="CDD:238058"
     Region          3510..3542
                     /region_name="EGF"
                     /note="EGF-like domain; pfam00008"
                     /db_xref="CDD:394967"
     Region          3590..3743
                     /region_name="LamG"
                     /note="Laminin G domain; Laminin G-like domains are
                     usually Ca++ mediated receptors that can have binding
                     sites for steroids, beta1 integrins, heparin, sulfatides,
                     fibulin-1, and alpha-dystroglycans. Proteins that contain
                     LamG domains serve a variety of...; cd00110"
                     /db_xref="CDD:238058"
     Region          3797..3829
                     /region_name="EGF_CA"
                     /note="Calcium-binding EGF-like domain, present in a large
                     number of membrane-bound and extracellular (mostly animal)
                     proteins. Many of these proteins require calcium for their
                     biological function and calcium-binding sites have been
                     found to be located at the...; cd00054"
                     /db_xref="CDD:238011"
     Region          3838..3991
                     /region_name="LamG"
                     /note="Laminin G domain; Laminin G-like domains are
                     usually Ca++ mediated receptors that can have binding
                     sites for steroids, beta1 integrins, heparin, sulfatides,
                     fibulin-1, and alpha-dystroglycans. Proteins that contain
                     LamG domains serve a variety of...; cd00110"
                     /db_xref="CDD:238058"
     CDS             1..4032
                     /gene="trol"
                     /coded_by="XM_044396080.1:209..12307"
                     /db_xref="GeneID:108055788"
ORIGIN      
        1 mgspgsqapa iaigisngrr ghtgslllrl llvafvlnac hvpptnakqi tkskvddqdf
       61 iladtqslqg svdldddeaf lippdseekl dkkftpegnw wsqglhrvrr slnsffgsdd
      121 ddqekererq rqrqnnrdaa nrqkelrrqq kesqnrekql rlerqerqrl akrnnhvvfn
      181 rvtdprkras dlydeneasg lneeetttyr tyfvvnepys ddykerdslq fqnlqkllde
      241 dlrnffnrnf ennddeeqei hstlervdqt kdhfkirvql rvdlpssind fgskleqqln
      301 vykridrlga stdgvftfte ssvykyehqy divtprvivs gqpitekpve apqefdedpi
      361 evalptdeve gsgqdsgscr gdatfqcrrs gkticdemrc dgsrdcpdae deegcevcne
      421 lqfkcdnkcl plnkrcdnry dcedqtdeag cqryeveesq pqpqpqpqpe pepepepepe
      481 pepepepepe epitdneqpe qnsecratef rcnngdcidi rkrcdhisdc segedeneec
      541 paacsgmeyq crdgtrcisl sqqcdghsdc sdaddeehcd gsgndgedcr fdefrcgtge
      601 cipmrqvcdn iydcndysde vscaeeeeds vgipigrppq rpapkhdwld eldaneyhvy
      661 hpsnvyelan sknpcasnqf rcattnvcip lhlrcdnfyh cndmsdekdc eqyqrrtttt
      721 trrpstsarp sftftfttqg pgllerrnst tsrttagstt rateapqwpw atrptetttt
      781 npittvgvas cyadqfrcnn gdciaesahc dgnidcsdqs deldcggdsq clpnqfrckn
      841 gqcvsstarc nkrsdcldgs deqncanepn nsgrgtnqlk lktypdnqii kesrevifrc
      901 rdegpnrakv kwsrpggrpl ppgftdrngr leipnirved agayvceavg yanyipgqhv
      961 tvnlnverln ereirpdsac teyqatcmng ecidksgicd ghpdcsdgsd ehscslglkc
     1021 qpnqfmcsns kcvdrtwrcd gendcgdnsd etscdpepsd apcrydefqc rsghcipksf
     1081 qcdymndctd gtdeigcsvp spmtlpapsi vvmeyevlel tcvgtgvptp tivwrlnwgh
     1141 vpekcesksy ggtgtlrcpn mrpqdsgays cefintrgtf ypktnsivtv tpvrsdvcka
     1201 gffnmlarks eecvqcfcfg vstncdsanl ftyaiqppil shrvvsvels pfrqivinea
     1261 spgqdlltlh hgvqfrasnv hyngretpfl alpaeymgnq lksyggnlry evryngngrp
     1321 vsgpdviitg nsftlthrvr thpgqnnrvt ipflpggwtk pdgrkgtred immilanvdn
     1381 ilirlgylds tarevdlini aldsagsadq glgsaslvek ctcppgyvgd scescasgyv
     1441 rqargpwlgh cvpftpepcp agtygnprlg vpcqecpcph agannfasgc qqspdgdvic
     1501 rcnegyagkr cehcaqgyqg nplapggvcr kipdsscnvd gtyniysngt cqckdsvige
     1561 qcdtcapksf hlnsftytgc iecfcsgvgl dcdssswyrn qvtstfgrtr vnhgfalisd
     1621 ymrntpvtvp vsmstqanal sfvgsaeqag ntlywslpaa flgnkltsyg gklsytlsys
     1681 plpsgimsrn sapdvviksg edlrlihyrk sqvspsvant yaveikesaw qrgdelvpnr
     1741 ehvlmalsni taiyikatyt tstkeaslrs vtldtatatn lgtaraveve qcrcpegylg
     1801 lsceqcapgy trdpeagiyl glcrpcecng hskycnsetg ecescsdnte gfncdrcaag
     1861 yvgdatrgts ydcqyddggy ptsrppapgn qtaeclvncq qegtagcrgy qceckrnvag
     1921 drcdqcrpgt yglsaqnpdg ckecycsglt nqcrsaslyr qlipvdfist pplitdefgd
     1981 imdrdnlvpd vprnvytykh tsytpkywsl rgsvlgnqll syggrleysl ivesvgrdhr
     2041 gkdvvligng lkliwsrpdg hdneqeyhvr lhedeqwtve drgsarqatr adfmtvlsdl
     2101 qhililatpk vptvstsisn vilessittr apgathasdi elcqcpsgyt gtscescapl
     2161 hyrdasgrcs qcpcdasnte scglvsggnv ecqcrprwrg drcreietne ptpepdtstd
     2221 dpvrtqiivs iarpeitilp vggsltlsct grmrwtnspv fvnwykqgsh lpegvevqgg
     2281 nlqlfnlqis dsgiyicqav snetghsftd hvsitvsqed qrspahivdl pndvtfeeyv
     2341 sneivceveg nppptvtwtr vdghadaqst rtdnnrlvfd sprksdegry rcqaenslsr
     2401 eekyvvvyvr snppqpppqq drlyitpqev ngvagdsfql scqftsaasl rydwshdgrs
     2461 lsassprnvv vrgnvlevrd anvrdsgtyt cvafdlrtrr nftesarvyi eqpnepgilg
     2521 dkphiltleq niiivqgedl sitceasgtp ypsikwtkvq enlaenvris gnvltiyggr
     2581 senrglysci aenshgsdqs stsidiepre rpsltidtat qkvsvgsqas lycaaqgipe
     2641 ptvewvrtdg qplsprhkvq apgyvviddi vlddsgtyec rasniagqvs glatinvqep
     2701 tlvriepdrq hhivtqgdel slscvgsgvp tpsvfwsfeg rdvdrmgvpe gavfaqpfrt
     2761 ntadvkifrv skenegiyvc hgsndagedq qyirvevqpr rgdvgaggdd ngdvdtrqpp
     2821 nrpqiqpnpl snerlttelg nnvtlicnvd nvntewervd gtplphnayt vrntlvivfv
     2881 epqnlgqyrc ngigrdgrve ahvvrelvll plpritfypn ipltvelgqn ldvycqvenv
     2941 rpedvhwttd nnrplpssvr iegnvlrfas itqaaageyr csatnqygsr sknarvvvkq
     3001 psgfqpvphs qvqqrqvgds iqlrcrlttq ygdevrgniq fnwyredgsp lprgvrpdsq
     3061 vlqlvklqpe degryicnsy dlgsgqqlpp vsidlqvlrt ttqypfnrfk ggvslkdtpc
     3121 mvlyicaavp aapqnpiylp pvapprsper ilepqlslsv qssnlpagdg ttvecfssdd
     3181 sypdvvwera dgaplsenvq qvgnnlvisn vastdagnyv ckcktdegdl yttsykleve
     3241 eqphelkssk ivyakvggna dlqcgadedr qpsyrwsrqy gqlqagrslq neklsldrvq
     3301 andagtyvcs aqysdgetvd fpnilvvtga ipqfrqeprs ymsfptlsns sfkfnfeltf
     3361 rpenadglll fngqtrgsgd yialslkdry aefrfdfggk pllvraeepl aldewhtvrv
     3421 srfkrdgyiq vddqhpvafp tsqhqqipql eliedlyigg vpnweflpae avgqqsgfvg
     3481 cisrltlqgr tvelireakf kegitdcrpc aqgpcqnkgv clesqteqay tcvcqpgwtg
     3541 rdcaiegtqc tagvcgsgrc entendmecl cplnragdrc qyneilneqs lnfksnsfaa
     3601 ygtpkvtkvn itlsvrpasl edsvilytae stlpsgdyla lvlrgghael lintaarldp
     3661 vvvrsaeplp lnrwtrieir rrlgegilkv gdgperkaka pgsdrilslk thlfvggvdr
     3721 stvkinrdvn itkgfdgcis klynsqksvn llgdirdaan vqncgeanei dddeyempva
     3781 lpspkvaene rqlmapcasd pcenggscse qedmaicscp fgfsgkhcqn hlqlsfnasf
     3841 rgdgyvelnr shfqpaleqt yshigivftt nkpngllfww gqeageeytg qdfiaaavvd
     3901 gyveysmrld geeavirnsd irvdngerhi viakrdenta mleldqildt gdtrptinka
     3961 mklpgnvfvg gapdvaaftg frykdnfngc ivvvegetvg qinlssaain gvnanvcpan
     4021 deplggtepp vv