Coiled-coil domain containing 166

CCDC166
Identifiers
Aliases: CCDC166, coiled-coil domain containing 166
External IDs: MGI: 1925902; HomoloGene: 109421; GeneCards: CCDC166; OMA: CCDC166 - orthologs
Orthologs
Species: Human | Mouse
RefSeq (mRNA): NM_001162914 | NM_001163518, NM_146059
RefSeq (protein): NP_001156386 | n/a
Location (UCSC): Chr 8: 143.71 – 143.71 Mb | Chr 15: 75.85 – 75.85 Mb
PubMed search: [3] | [4]

Coiled-coil domain containing 166 is a protein that in humans is encoded by the CCDC166 gene. Its function is currently unknown. It contains a coiled-coil domain, hence its name, and is primarily expressed in the testes.[5]

Gene

The gene is currently known to contain only two exons and to produce a single isoform. Its coding sequence consists of 1320 base pairs. The gene is located on chromosome 8 at 8q24.3, spanning positions 143706694-143708109 on the + strand, near BREA2 and MAPK15.[6]

Transcripts

The gene has only a single transcript because it has only two exons, both of which are always transcribed. The coding portion of the mRNA is 1320 nucleotides long.[6] In tissues that express the gene, the transcript is typically found at low levels.[7]

Protein

CCDC166 has only one isoform in humans, which is composed of 439 amino acids and has a molecular weight of 48.7 kDa. The isoelectric point (pI) of the protein is 10.537.[8] The protein has several amino acid repeat motifs, including EREA, VQSL, and (T)QLLH, all of which are conserved in mammals.[9] The composition of the protein shows that it is rich in alanine, arginine, leucine, and serine.[8] The protein contains three conserved domains: a coiled-coil domain spanning amino acids 27-74, a domain of unknown function spanning amino acids 72-260, and a serine-rich domain spanning amino acids 288-410.[10] The region spanning amino acids 26-115 is believed to be an SH3 domain.[11] The proposed structure consists mainly of alpha-helices that form a larger coiled-coil, along with several additional coiled-coils.
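The lengths quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only figures stated in this article; the function names are illustrative, and the figure of roughly 110 Da per residue is a common rule of thumb assumed here, not a value taken from the cited sources.

```python
# Figures quoted in the article text (not fetched from any database).
GENOMIC_START, GENOMIC_END = 143_706_694, 143_708_109
PROTEIN_LENGTH_AA = 439

# Annotated regions as (start, end) residue numbers, inclusive.
REGIONS = {
    "coiled-coil": (27, 74),
    "domain of unknown function": (72, 260),
    "serine-rich": (288, 410),
}


def cds_length(n_residues: int) -> int:
    """Nucleotides needed to encode n residues plus one stop codon."""
    return 3 * n_residues + 3


def region_length(region: tuple[int, int]) -> int:
    """Number of residues in an inclusive (start, end) region."""
    start, end = region
    return end - start + 1


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of residues shared by two inclusive regions (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def rough_mass_kda(n_residues: int, da_per_residue: float = 110.0) -> float:
    """Rule-of-thumb mass estimate; the 110 Da average is an assumption."""
    return n_residues * da_per_residue / 1000


genomic_span = GENOMIC_END - GENOMIC_START + 1

if __name__ == "__main__":
    print("genomic span (bp):", genomic_span)
    print("expected CDS length (nt):", cds_length(PROTEIN_LENGTH_AA))
    print("rough mass (kDa):", rough_mass_kda(PROTEIN_LENGTH_AA))
    for name, region in REGIONS.items():
        print(name, region_length(region), "residues")
```

Under these assumptions, a 439-residue protein corresponds to a 1320-nucleotide coding sequence once the stop codon is counted, which agrees with the transcript length given above. The coiled-coil domain and the domain of unknown function share a few residues according to the quoted coordinates.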

Proposed structure of CCDC166[12]

Gene level regulation

The gene appears to be heavily expressed in the testes, and this expression pattern may be evolutionarily conserved.[13] The promoter region contains several conserved transcription factor binding sites. Notable among them are sites for the CREB family, the KLFs, and, perhaps most tellingly, testis-determining factor (SRY).[14] These transcription factors are all important during development.

Transcript level regulation

In situ hybridization (ISH) data indicate that the gene's mRNA is found mostly in the nucleus of Sertoli cells, with low expression in Leydig cells.[13] Expression of the gene has also been found in germ cell tumors.[7] In addition, the gene's primary transcript contains several predicted miRNA binding sites, including those for hsa-miR-2278, hsa-miR-3178, and hsa-miR-4516.[15]

Protein level regulation

CCDC166 is predicted to be regulated by SUMO protein: it has a conserved IKAD sequence at amino acids 220-223.[16] This, combined with a conserved nuclear localization signal, PKKKR, starting at amino acid 3, supports the idea that the protein is imported into the nucleus.[17] The protein also contains several predicted phosphorylation sites, most of which cluster in the serine-rich domain. The highest-probability sites are serine 10, serine 308, and serine 391.[18]
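Short motifs such as IKAD and PKKKR can be located in a sequence with a regular-expression scan. The sketch below is illustrative only: `TOY_SEQUENCE` is a made-up peptide, not the CCDC166 sequence, and `SUMO_PATTERN` is a simplified form of the sumoylation consensus (a hydrophobic residue, lysine, any residue, then an acidic residue) chosen here as an assumption rather than taken from the cited tools.

```python
import re

# Hypothetical toy peptide for illustration; NOT the real CCDC166 sequence.
TOY_SEQUENCE = "MAPKKKRSAEELLQIKADRS"

# Simplified sumoylation consensus (assumption): hydrophobic-K-x-[D/E].
SUMO_PATTERN = "[VILMF]K.[DE]"

# The nuclear localization signal quoted in the text, as a literal motif.
NLS_PATTERN = "PKKKR"


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every hit, overlaps included."""
    # The lookahead lets overlapping matches be reported.
    return [
        (m.start() + 1, m.group(1))
        for m in re.finditer(f"(?=({pattern}))", sequence)
    ]


if __name__ == "__main__":
    print("SUMO-like sites:", find_motif(TOY_SEQUENCE, SUMO_PATTERN))
    print("NLS sites:", find_motif(TOY_SEQUENCE, NLS_PATTERN))
```

Positions are reported with 1-based numbering to match the residue numbering used in the text. Dedicated prediction tools apply more elaborate scoring than this pattern match.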

Homology / evolution

While the function of the gene is currently unknown, many mammals possess an ortholog of it. Several primate species have been found to possess an orthologous gene that shares 90% sequence identity.[10] While the gene does not seem to have paralogs, it has homologs that have been conserved throughout its evolutionary history. Evidence that its function has been conserved comes from the promoter region, which has predicted SRY transcription factor binding sites conserved from zebrafish to humans.[14]

Evolutionary history of CCDC166
Species | Gene name | Date of divergence (MYA)[19] | Percent similarity[20] | Accession number
Human | CCDC166 | 0 | 100% | NP_001156386.1
Chimpanzee | CCDC166 isoform 1 | 6.65 | 98% | PNI46222.1
Grey mouse lemur | CCDC166 | 74 | 76% | XP_017516497.1
Horse | CCDC166 | 96 | 85% | XP_023504891.1
Florida manatee | CCDC166 | 105 | 79% | XP_004387488.1
Japanese gecko | CCDC166-like protein | 312 | 74% | XP_007444987.1
Mallard duck | CCDC166-like protein | 312 | 39% | ENSAPLG00000001712
Mexican tetra | CCDC166 | 435 | 26% | ENSAMXG00000003745.1
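Percent similarity in the table above generally falls as divergence time grows, but not strictly. The following is a minimal sketch that checks this using only the table's values; the function names are illustrative and the check is a simple consistency test, not a phylogenetic analysis.

```python
# (species, divergence in MYA, percent similarity), copied from the table.
ROWS = [
    ("Human", 0.0, 100),
    ("Chimpanzee", 6.65, 98),
    ("Grey mouse lemur", 74.0, 76),
    ("Horse", 96.0, 85),
    ("Florida manatee", 105.0, 79),
    ("Japanese gecko", 312.0, 74),
    ("Mallard duck", 312.0, 39),
    ("Mexican tetra", 435.0, 26),
]


def similarity_reversals(rows):
    """Adjacent pairs where the more distant species is the more similar one."""
    ordered = sorted(rows, key=lambda row: row[1])  # stable sort by divergence
    reversals = []
    for closer, farther in zip(ordered, ordered[1:]):
        if farther[1] > closer[1] and farther[2] > closer[2]:
            reversals.append((closer[0], farther[0]))
    return reversals


def tied_date_spread(rows):
    """Similarity range among species that share one divergence date."""
    by_date = {}
    for _, mya, percent in rows:
        by_date.setdefault(mya, []).append(percent)
    return {
        mya: max(values) - min(values)
        for mya, values in by_date.items()
        if len(values) > 1
    }


if __name__ == "__main__":
    print("reversals:", similarity_reversals(ROWS))
    print("spread at tied dates:", tied_date_spread(ROWS))
```

With the values as listed, the check reports one reversal between adjacent rows and a wide similarity spread between the two species that share a divergence date of 312 MYA.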

Function / biochemistry

The function of the protein is currently unknown.

Composition of CCDC166[21]
Amino acid | Number of occurrences | Percent composition
Ala (A) | 57 | 13.0%
Arg (R) | 58 | 13.2%
Asn (N) | 5 | 1.1%
Asp (D) | 15 | 3.4%
Cys (C) | 2 | 0.5%
Gln (Q) | 32 | 7.3%
Glu (E) | 36 | 8.2%
Gly (G) | 21 | 4.8%
His (H) | 12 | 2.7%
Ile (I) | 6 | 1.4%
Leu (L) | 56 | 12.8%
Lys (K) | 11 | 2.5%
Met (M) | 6 | 1.4%
Phe (F) | 4 | 0.9%
Pro (P) | 26 | 5.9%
Ser (S) | 47 | 10.7%
Thr (T) | 12 | 2.7%
Trp (W) | 3 | 0.7%
Tyr (Y) | 5 | 1.1%
Val (V) | 25 | 5.7%
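The percentages in the composition table follow directly from the residue counts. As a minimal sketch, the counts can be copied from the table and the percentages recomputed; the function names are illustrative and no external data is used.

```python
# Residue counts copied from the composition table (one-letter codes).
COUNTS = {
    "A": 57, "R": 58, "N": 5, "D": 15, "C": 2,
    "Q": 32, "E": 36, "G": 21, "H": 12, "I": 6,
    "L": 56, "K": 11, "M": 6, "F": 4, "P": 26,
    "S": 47, "T": 12, "W": 3, "Y": 5, "V": 25,
}


def percent_composition(counts: dict[str, int]) -> dict[str, float]:
    """Percentage of each residue, rounded to one decimal place."""
    total = sum(counts.values())
    return {aa: round(100 * n / total, 1) for aa, n in counts.items()}


def most_common(counts: dict[str, int], k: int) -> list[str]:
    """The k most frequent residues, most frequent first."""
    return sorted(counts, key=counts.get, reverse=True)[:k]


if __name__ == "__main__":
    print("total residues:", sum(COUNTS.values()))
    print("most common:", most_common(COUNTS, 4))
    for aa, pct in percent_composition(COUNTS).items():
        print(aa, pct)
```

The counts sum to the 439 residues of the protein, and the four most frequent residues are arginine, alanine, leucine, and serine.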

Interactions

The gene has been found to interact with FAT3, a tumor suppressor gene, and with INTS2, a gene involved in snRNA processing and transcription.[22] Expression of CCDC166 has been shown to be affected by methylphenidate, but the mechanism of this interaction is not known.[10]

Clinical significance

Some single nucleotide variants in CCDC166 are associated with lung, liver, colon, thyroid, pancreatic, and testicular cancers.[23] However, the clinical significance of the protein has not yet been fully characterized.

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000255181, ENSG00000278749 - Ensembl, May 2017
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000098176 - Ensembl, May 2017
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ "Entrez Gene: Coiled-coil domain containing 166". Retrieved 2018-05-05.
  6. ^ a b "CCDC166 coiled-coil domain containing protein 166 [Homo sapiens (human)]". Retrieved 2018-05-05.
  7. ^ a b "Hs.730002 - CCDC166: Coiled-coil domain containing 166". Retrieved 2018-05-05.
  8. ^ a b Rice P., Longden I. and Bleasby A. (2000) EMBOSS: The European Molecular Biology Open Software Suite Trends Genet. 16(6)276-277 PubMed: 10827456 DOI: 10.1016/S0168-9525(00)02024-2
  9. ^ Holger Dinkel, Kim Van Roey, Sushama Michael, Manjeet Kumar, Bora Uyar, Brigitte Altenberg, Vladislava Milchevskaya, Melanie Schneider, Helen Kühn, Annika Behrendt, Sophie Luise Dahl, Victoria Damerell, Sandra Diebel, Sara Kalman, Steffen Klein, Arne C. Knudsen, Christina Mäder, Sabina Merrill, Angelina Staudt, Vera Thiel, Lukas Welti, Norman E. Davey, Francesca Diella, Toby J. Gibson; ELM 2016—data update and new functionality of the eukaryotic linear motif resource, Nucleic Acids Research, Volume 44, Issue D1, 4 January 2016, Pages D294–D300, https://doi.org/10.1093/nar/gkv1291
  10. ^ a b c "CCDC166 coiled-coil domain containing protein 166". Retrieved 2018-05-05.
  11. ^ "Conserved domains on [gi|347602472|sp|P0CW27.1]".
  12. ^ Sunyaev S.R., Eisenhaber F., Rodchenkov I.V., Eisenhaber B., Tumanyan V.G., and Kuznetsov E.N. "PSIC: Profile extraction from sequence alignments with position-specific counts of independent observations" Protein Engineering (1999) 12, No.5, 387-394
  13. ^ a b CCDC166. (n.d.). Retrieved April 02, 2018, from https://www.proteinatlas.org/ENSG00000255181-CCDC166/antibody
  14. ^ a b Cartharius K, Frech K, Grote K, Klocke B, Haltmeier M, Klingenhoff A, Frisch M, Bayerlein M, Werner T (2005) MatInspector and beyond: promoter analysis based on transcription factor binding sites. Bioinformatics 21, 2933-42
  15. ^ MiRDB - MicroRNA Target Prediction And Functional Study Database. Retrieved April 02, 2018, from http://mirdb.org/
  16. ^ Cheng TS, Chang LK, Howng SL, Lu PJ, Lee CI, Hong YR (February 2006). "SUMO-1 modification of centrosomal protein hNinein promotes hNinein nuclear localization". Life Sciences. 78 (10): 1114–20.
  17. ^ Kalderon, D., Roberts, B. L., Richardson, W. D., & Smith, A. E. (1984). A short amino acid sequence able to specify nuclear location. Cell,39(3), 499-509. doi:10.1016/0092-8674(84)90457-4
  18. ^ Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S. Proteomics: Jun;4(6):1633-49, review 2004.
  19. ^ Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: A Resource for Timelines, Timetrees, and Divergence Times. Mol Biol Evol 34 (7): 1812-1819
  20. ^ Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.
  21. ^ Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R., Appel R.D., Bairoch A.; Protein Identification and Analysis Tools on the ExPASy Server; (In) John M. Walker (ed): The Proteomics Protocols Handbook, Humana Press (2005). pp. 571-607
  22. ^ "CCDC166". Retrieved 2018-05-05.
  23. ^ "CCDC166". Retrieved 2018-05-05.