Phenotypic Mutation 'seal'
Mutation Type splice site (2 bp from exon)
Coordinate 94,947,184 bp (GRCm38)
Base Change T ⇒ A (forward strand)
Gene Col1a1
Gene Name collagen, type I, alpha 1
Synonym(s) Cola1, Mov-13, Col1a-1, Cola-1
Chromosomal Location 94,936,224-94,953,042 bp (+)
MGI Phenotype FUNCTION: This gene encodes the alpha-1 subunit of the fibril-forming type I collagen, the most abundant protein of bone, skin and tendon extracellular matrices. The encoded protein, in association with the alpha-2 subunit, forms heterotrimeric type I procollagen that undergoes proteolytic processing during fibril formation. Mice lacking the encoded protein die in utero from the rupture of a major blood vessel. Transgenic mice expressing significantly lower levels of this gene exhibit morphological and functional defects in mineralized and non-mineralized connective tissue, and progressive loss of hearing. [provided by RefSeq, Nov 2015]
PHENOTYPE: Mutations in this locus cause a variable phenotype, from embryonic lethal to viable/fertile with altered fibrillogenesis. Homozygotes can show impaired bone formation and fragility, osteoporosis, dermal fibrosis, impaired uterine postpartum involution, and aortic dissection. [provided by MGI curators]
Accession Number

NCBI RefSeq: NM_007742; MGI: 88467

Mapped Yes 
Amino Acid Change
Institutional Source Beutler Lab
Gene Model not available
SMART Domains Protein: ENSMUSP00000001547
Gene: ENSMUSG00000001506

signal peptide 1 22 N/A INTRINSIC
VWC 31 86 1.04e-16 SMART
Pfam:Collagen 97 154 1.1e-9 PFAM
Pfam:Collagen 166 227 7e-10 PFAM
Pfam:Collagen 225 284 2.4e-13 PFAM
Pfam:Collagen 285 344 5.9e-12 PFAM
low complexity region 354 426 N/A INTRINSIC
internal_repeat_4 427 444 4.93e-7 PROSPERO
low complexity region 447 486 N/A INTRINSIC
low complexity region 495 516 N/A INTRINSIC
low complexity region 527 567 N/A INTRINSIC
internal_repeat_3 570 588 1.25e-9 PROSPERO
low complexity region 590 600 N/A INTRINSIC
low complexity region 603 627 N/A INTRINSIC
low complexity region 629 651 N/A INTRINSIC
internal_repeat_1 652 675 6.29e-11 PROSPERO
internal_repeat_4 658 675 4.93e-7 PROSPERO
low complexity region 678 699 N/A INTRINSIC
low complexity region 702 717 N/A INTRINSIC
internal_repeat_2 718 738 2.08e-10 PROSPERO
internal_repeat_1 718 741 6.29e-11 PROSPERO
internal_repeat_3 726 744 1.25e-9 PROSPERO
internal_repeat_5 737 752 9.8e-6 PROSPERO
Pfam:Collagen 768 827 2.8e-12 PFAM
Pfam:Collagen 828 887 6.8e-11 PFAM
internal_repeat_5 944 959 9.8e-6 PROSPERO
internal_repeat_2 952 972 2.08e-10 PROSPERO
Pfam:Collagen 1008 1077 4.8e-8 PFAM
Pfam:Collagen 1068 1127 1.2e-12 PFAM
Pfam:Collagen 1122 1184 2.8e-9 PFAM
PDB:3HR2|C 1185 1205 6e-6 PDB
COLFI 1217 1453 2.04e-162 SMART
Predicted Effect probably benign
Meta Mutation Damage Score Not available
Is this an essential gene? Essential (E-score: 1.000)
Phenotypic Category
Phenotype Literature verified References
skeleton phenotype
Candidate Explorer Status CE: no linkage results
Single pedigree
Linkage Analysis Data
Penetrance 100% 
Alleles Listed at MGI

All alleles (14): Targeted, knock-out (3); Targeted, other (6); Gene trapped (2); Transgenic (1); Chemically induced (2)

Lab Alleles
Allele Source Chr Coord Type Predicted Effect PPH Score
IGL00834:Col1a1 APN 11 94949378 missense unknown
IGL01383:Col1a1 APN 11 94945525 missense probably damaging 1.00
IGL01717:Col1a1 APN 11 94950777 missense unknown
IGL02889:Col1a1 APN 11 94951509 missense unknown
walrus UTSW 11 94942385 missense unknown
R0121:Col1a1 UTSW 11 94938069 missense unknown
R0400:Col1a1 UTSW 11 94941369 splice site probably benign
R0545:Col1a1 UTSW 11 94951594 missense unknown
R0661:Col1a1 UTSW 11 94949389 missense unknown
R1220:Col1a1 UTSW 11 94951131 missense unknown
R1717:Col1a1 UTSW 11 94948392 missense unknown
R1732:Col1a1 UTSW 11 94944415 splice site probably benign
R1879:Col1a1 UTSW 11 94951225 missense unknown
R1880:Col1a1 UTSW 11 94950568 missense unknown
R1901:Col1a1 UTSW 11 94946632 splice site probably null
R2113:Col1a1 UTSW 11 94948362 missense unknown
R2386:Col1a1 UTSW 11 94950391 missense unknown
R3803:Col1a1 UTSW 11 94938069 missense unknown
R4839:Col1a1 UTSW 11 94950095 critical splice acceptor site probably null
R4936:Col1a1 UTSW 11 94947132 missense unknown
R5081:Col1a1 UTSW 11 94951576 missense unknown
R5105:Col1a1 UTSW 11 94942385 missense unknown
R5110:Col1a1 UTSW 11 94941593 critical splice donor site probably null
R5247:Col1a1 UTSW 11 94947187 splice site probably null
R5773:Col1a1 UTSW 11 94939429 missense probably benign 0.10
R5776:Col1a1 UTSW 11 94949724 missense unknown
R5991:Col1a1 UTSW 11 94937919 missense unknown
R6415:Col1a1 UTSW 11 94940160 missense unknown
R6483:Col1a1 UTSW 11 94942618 splice site probably null
R7207:Col1a1 UTSW 11 94938526 missense unknown
Mode of Inheritance Autosomal Recessive
Local Stock Embryos
MMRRC Submission 030348-UCD
Last Updated 2017-05-02 10:56 AM by Katherine Timer
Record Created unknown
Record Posted 2007-10-15
Phenotypic Description
The seal phenotype was detected in two ENU-induced mutant G3 littermates as a defect in hindlimb movement resulting in an abnormal gait. The hindlegs are paralyzed such that seal mice appear to “waddle” like seals on land. Initially, transmissibility appeared sporadic. However, it became clear that homozygotes only demonstrate hind leg paralysis after being picked up for examination by grasping the loose skin over the nape of the neck. Otherwise, seal mice display normal locomotor activity (i.e. walking ability and cage activity). When it occurs, paralysis persists for approximately 8 days before hindlegs regain near-normal movement; most animals retain a mild “seal-like” gait. The abnormal gait of seal mice is associated with physical damage to the spine (visible hemorrhage results from grasping). Seal mice have thin bones and fragile tissues.


Nature of Mutation
The seal mutation mapped to Chromosome 11, and corresponds to a T to A transversion in the donor splice site of intron 36 (GTAAGT -> GAAAGT) in the Col1a1 gene (position 10914 in Genbank genomic region NC_000077 for linear genomic DNA sequence of Col1a1). Col1a1 cDNA from seal mice has not been sequenced. The mutation may result in skipping of the 108-nucleotide exon 36 (depicted below); this deletion would not destroy the reading frame of the encoded type I procollagen, α1 chain [α1(I)]. Col1a1 contains 51 total exons.
     <--exon 35 <--exon 36 intron 36-->  exon 37--> <--exon 51 
796   -G--P--P-……-G--P--I-               -G--N--V-………-F--V--*  1417
       correct    deleted                      correct
The donor splice site of intron 36, which is destroyed by the seal mutation, is indicated in blue lettering; the mutated nucleotide is indicated in red lettering.
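The reasoning above can be sketched in a few lines of Python. This is only an illustration: `apply_substitution`, `DONOR_WT` and `EXON36_LENGTH_NT` are hypothetical names introduced here, and the values are taken from the description above (the GTAAGT -> GAAAGT change and the 108-nucleotide exon 36).

```python
# Illustrative sketch only; helper and constant names are hypothetical.
DONOR_WT = "GTAAGT"        # first six bases of intron 36 (wild type), from the text
EXON36_LENGTH_NT = 108     # length of exon 36, from the text


def apply_substitution(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the base at 1-based position pos changed from ref to alt."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


# The seal change hits the second base of the intron (T to A).
donor_seal = apply_substitution(DONOR_WT, 2, "T", "A")

# The GT dinucleotide that opens the intron is no longer present.
gt_intact = donor_seal.startswith("GT")

# A skipped exon keeps the reading frame only if its length is a multiple of 3.
in_frame = EXON36_LENGTH_NT % 3 == 0
codons_removed = EXON36_LENGTH_NT // 3

print(donor_seal, gt_intact, in_frame, codons_removed)
```

Under these assumptions the sketch prints `GAAAGT False True 36`: the donor site loses its GT, and skipping exon 36 would remove 36 whole codons without shifting the frame. Whether skipping actually occurs in seal mice remains untested, as noted above.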
Protein Prediction
Figure 1.  Domain structure of the mouse Col1a1 protein.  Col1a1 encodes the 1453 amino acid type I procollagen, α1 chain [α1(I)].  The seal mutation is a T to A transversion occurring in the donor splice site of intron 36, indicated by the red asterisk.  The mutation may result in deletion of exon 36 (residues 807-841).  The triple helical region is flanked by N- and C-terminal propeptides.  The N propeptide contains a von Willebrand Factor type C domain (VWFC), shown in teal.  The C-terminal propeptide contains a fibrillar cartilage NC1 domain (FC NC1), indicated in dark blue.
Col1a1 encodes the 1453-amino acid type I procollagen, α1 chain [α1(I)] (Figure 1). α1(I) forms heterotrimers with the type I procollagen, α2 chain [α2(I)] with stoichiometry 2 α1(I): 1 α2(I), forming the extracellular matrix protein type I collagen (Figure 2). [All collagens consist of 3 α chains; each α chain is identified by an arabic numeral followed by a roman numeral in parentheses representing the collagen type it is part of (1).] The protein structure and folding of collagen have been extensively studied [see (2-4) and references therein] and are described here briefly. Both α1(I) and α2(I) form left-handed polyproline II-like chains, which supercoil together in a right-handed manner around a common axis to form a triple helix. The bulk of each chain consists of a triple helical domain of 1014 amino acids, encoded by 43 exons, in the uninterrupted repeating sequence Gly-X-Y. Flanking the triple helical domain are N- and C-terminal propeptides. Regions of the C-terminal propeptide nucleate the triple helix by mediating the association of the chains and tethering them together. The repeating Gly-X-Y sequence undergoes a series of required enzymatic modifications that allow the triple helix to form. These include hydroxylation of a subset of proline and lysine residues in the Y position, and subsequent glycosylation of specific hydroxylated lysines. Helix formation propagates in a C- to N-terminal direction, and depends critically on a glycine residue at every third position of the chain. Glycine is the only residue small enough to fit in the sterically constrained inner aspect of the helix, and mutation to any other residue disrupts helix folding.

Figure 2. Crystal structure of a collagen-like peptide. This protein triple helix represents a heteromeric triple helix structure consistent with fiber diffraction studies on collagen supercoiling. Based on PDB ID: 1CAG, from Bella J, et al., Crystal and molecular structure of a collagen-like peptide at 1.9 Angstrom resolution. Science, 1994. Model created with PyMOL, Schrödinger, LLC.

Each of the 43 exons encoding the triple helical domain of α1(I) begins with a codon for glycine and ends with a Y-position codon. Thus, splice site mutants that undergo exon skipping can still yield in-frame transcripts, although use of cryptic splice sites may also result in one or several out-of-frame transcripts or a premature termination codon. The seal mutation is predicted to cause skipping of Col1a1 exon 36, which encodes amino acids near the center of the triple helical domain. The effect of the mutation on Col1a1 transcription or α1(I) translation is unknown, as is the effect on collagen formation. Such a structural mutation could result in an abnormal protein that is poorly secreted, or if secreted into the extracellular matrix in significant amounts, that dominantly interferes with fibrillogenesis, collagen-matrix or collagen-cell interactions, or with bone mineralization (3).
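The exon structure described above can be checked against the genomic sequence listed in the genotyping section. The sketch below is an illustration, not part of the original analysis. `EXON36` is assumed to span positions 10805-10912 of that listing, a boundary inferred from the flanking "ag" acceptor and "gtaagt" donor sequences. `translate` is a hypothetical helper built on the standard genetic code.

```python
# Illustrative sketch only. The exon boundaries are an assumption inferred from
# the flanking splice signals in the amplicon listing; names are hypothetical.
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# 10-nt blocks copied from the listing (positions 10805-10912), then joined.
EXON36 = "".join(
    "ggtgct gatggccaac ctggtgcgaa aggtgaacct ggtgatactg gtgttaaagg "
    "tgatgctggt cctcctggcc ctgctggtcc tgctggaccc cccggcccca tt".split()
).upper()


def translate(nt: str) -> str:
    """Translate a nucleotide string whose length is a multiple of three."""
    if len(nt) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


peptide = translate(EXON36)

# In a Gly-X-Y repeat, every third residue starting from the first is glycine.
gly_every_third = all(aa == "G" for aa in peptide[::3])

print(len(EXON36), peptide, gly_every_third)
```

With the assumed boundaries, the exon is 108 nucleotides long and translates to 12 Gly-X-Y triplets, beginning with a glycine codon and ending with a Y-position codon (isoleucine). This matches the -G-P-I- ending shown in the diagram under Nature of Mutation.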
Collagens form the structural basis of skin, tendon, bone, cartilage, and other tissues. Some collagens have a restricted tissue expression pattern; for example types II, IX and XI are found almost exclusively in cartilage and type IV is only in basement membranes (1). Type I collagen, of which the α1(I) chain is a constituent, exists in tendon, bone and skin (4). Most collagens form some type of supramolecular structure, such as fibrils in the case of type I collagen. Collagens are secreted into the extracellular matrix.

Figure 3. Type I collagen formation and possible alterations in the structure due to α chain mutations. The left side of the diagram illustrates the normal steps in type I collagen formation and assembly into mature collagen fibers: 1. Propeptide domains are secreted from the rough ER. 2. C-propeptides associate and form procollagen triple helices. 3. After secretion into the extracellular space, propeptides are cleaved at the N- and C-termini. 4. Triple helix molecules form cross-linkages to form mature fibrils. 5. Fibrils associate to form type I collagen fibers. The right side of the diagram demonstrates possible disruptions in helix and fiber formation that may be seen in OI. Figure adapted from references (2) and (4).

Collagens are the most abundant proteins in the human body, making up approximately 30% of its protein mass (1). There are at least 27 collagen types and 42 α chains in vertebrates, in addition to a variety of proteins containing the collagen triple helix motif (1;4). Fibril-forming collagen orthologues have been identified in invertebrates (5), as well as in bacteria and viruses (6).
Some 1100 mutations in various collagens leading to heritable human disease have been reported (1;4). One of the best characterized collagen diseases is osteogenesis imperfecta (OI), in which patients have bone fragility due to mutations in the genes encoding the α1(I) or α2(I) chains of type I collagen (3). OI results in fractures upon minimal or absent trauma, dentinogenesis imperfecta (DI; easily worn or broken grey or brown teeth with possible translucence), and hearing loss (7). OI has varying degrees of severity, from perinatal lethality to severe skeletal deformities with impaired mobility to mild predisposition to fractures (7). Fractures may occur in any bone, but most often occur in the extremities. Cases of OI are classified into seven types (I-VII) based on mode of inheritance, clinical presentation and radiographic examination. Most types (I-V) are inherited in an autosomal dominant manner, likely because mutations that deform an α chain can disrupt the formation of collagens containing wild type chains. Type VII is inherited recessively. Mutations in COL1A1 cause OI type I (OMIM #166200), IIA (OMIM #166210), III (OMIM #259420), and IV (OMIM #166220).
Two general classes of mutations in type I collagen chains result in OI: those that cause the absence of protein (e.g. prematurely terminated alleles, alleles encoding unstable proteins), and those that cause structural defects in the protein (Figure 3) (3). The first class causes a quantitative decrease in the amount of type I collagen produced, usually by half, and generally results in a mild OI type. The second class, structural mutations, results in a wide range of OI phenotypes, depending on the type and location of the mutation. The most common mutation of this class is a single base substitution in a glycine codon, replacing one of the invariant glycine residues of the triple helical domain with one of eight bulkier amino acids and resulting in destabilization of the triple helix (4). How different glycine substitutions lead to different severities of OI remains a subject of study, and no simple rule for understanding genotype-phenotype relationships exists (3). Various models propose the importance of the identity of the substituting residue, the distance of the mutation from the C-terminus, and the sequence surrounding the mutation (8).

There are several mouse mutants of Col1a1, and they display a range of defects similar to those of OI patients (9-13). A knock-in mutant designated Brittle IV (BrtlIV) expresses a mutant α1(I) chain with a Gly 349 to Cys mutation, reproducing a mutation in a type IV OI patient (12). These mice suffer skeletal deformity, fragility, osteoporosis and disorganized trabecular structure, and display a phenotypic variability ranging from perinatal lethality to long term survival, similar to human OI patients (12). Another notable mouse mutant, the exon2Δ mouse lacking the N-terminal propeptide of α1(I), displayed no defects in collagen fibril formation and appeared as wild type, demonstrating that the N-terminal propeptide is not essential for collagen biogenesis (14).

Putative Mechanism
After single base mutations that substitute a bulkier amino acid for glycine, the second most common type of structural mutation in type I collagen α chains is a splice site alteration (3). Most such α1(I) mutations result in a mild OI type. As mentioned above (Protein prediction), splice site mutations can yield in-frame transcripts, but also out-of-frame transcripts and premature termination codons. In addition to the consequences for the integrity of the collagen triple helix, splice site mutations may affect collagen secretion, or interfere with interactions with ligands such as cell adhesion receptors (e.g. integrins) or other extracellular matrix molecules.
In a somewhat similar case, a four-exon in-frame deletion of Col1a1 exons 33-36 caused severe type III OI in two patients, and no mutant protein could be detected in mutant dermal fibroblasts and osteoblasts (15). The seal mutation is predicted to cause skipping of exon 36 of Col1a1 and yield an in-frame transcript. However, the consequences of the seal mutation for transcripts, α1(I) protein, or collagen formation have not been examined.
Primers Primers cannot be located by automatic search.
Seal genotyping is performed by amplifying the region containing the mutation using PCR, followed by sequencing of the amplified region to detect the single nucleotide change.
Primers for PCR amplification
PCR program
1) 94°C             2:00
2) 94°C             0:30
3) 56°C             0:30
4) 72°C             1:00
5) repeat steps (2-4) 29X
6) 72°C             7:00
7) 4°C               ∞
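As a rough aid for planning a run, the program above can be written down as data and its hold time summed. This is a sketch under stated assumptions: the `Step` class and the constant names are hypothetical, ramp times between temperatures are ignored, and the final 4°C hold is excluded because it is open-ended.

```python
# Illustrative sketch only; names are hypothetical and ramp times are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


INITIAL_STEP = Step(94, 120)                         # step 1
CYCLE = (Step(94, 30), Step(56, 30), Step(72, 60))   # steps 2-4
REPEATS = 29                                         # step 5: repeat steps 2-4 29X
FINAL_STEP = Step(72, 420)                           # step 6

# Steps 2-4 run once on the first pass and are then repeated 29 times.
total_cycles = 1 + REPEATS
cycle_seconds = sum(step.seconds for step in CYCLE)
hold_seconds = (
    INITIAL_STEP.seconds + total_cycles * cycle_seconds + FINAL_STEP.seconds
)

print(total_cycles, hold_seconds // 60)
```

Read this way, the program amounts to 30 cycles and 69 minutes of programmed hold time before the 4°C hold. Actual run time will be longer, depending on the thermocycler's ramp rates.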
Primers for sequencing
The following sequence of 1138 nucleotides (from Genbank genomic region NC_000077 for linear DNA sequence of Cola1) is amplified:
10352                                   tgctcccgt aagtacagaa gaccctgatc
10381 tctgttcatc ccttctcccc tacctgcttt ctgcccccac cgcaaacccc acccctttac
10441 tctatccgtt cctctccttc cctaatgttg agacatctct ccaaagtcgt ctccttcttc
10501 ttctagggag accgtggtga ggctggtccc cctggtcctg ctggctttgc cggcccccct
10561 gtgagtatca agaccctcct cattttctgt ccctagctga gacacgaggc atgggacctt
10621 ggggtggctg aatgaggaca gaagtgttac cctgagtcag aggagaaggg tggggaggta
10681 ctggtgtctc caagtgtctc tgcatctcca agtccctatc tgtggccctt cctctagccc
10741 agaggccctc tgctctcagg ctgcctcctc cactcctcca ctctccattc tccctcctgc
10801 ctagggtgct gatggccaac ctggtgcgaa aggtgaacct ggtgatactg gtgttaaagg
10861 tgatgctggt cctcctggcc ctgctggtcc tgctggaccc cccggcccca ttgtaagtat
10921 cttgtcttct gcaccataag ctttggatag ccttggactt ggggctagcc tggatctcat
10981 accttgacac tgtcttacag ggtaacgttg gtgctcctgg acccaaaggt cctcgtggtg
11041 ctgctggtcc ccctgtgagt atcatatgca tctctgtcgc gactccccaa aggcagagac
11101 tggagatgag gccaggtgac aggtgactgt tcacttctga ccacccaatg ttctctccta
11161 ccagggtgct actggcttcc ctggtgctgc tggccgtgtc ggtccccctg gtccctctgt
11221 gagtatctgt ggttctggaa tgaggatggg gtgagacatg tattgtcagg acagcaggcc
11281 tggctggggc ttgccactat gatgctttgg aagcctggac tctgacagtc cttcttgtgc
11341 ccatctaggg aaatgctgga ccccctggcc ctcccggtcc cgttggcaaa gaagggggca
11401 aaggtccccg tggtgagact ggccctgctg gacgtcctgg tgaagttggt cccccaggtc
11461 cccccggtcc tgctggtgag aaaggatct
PCR primer binding sites are underlined; sequencing primer binding sites are highlighted in gray; the mutated T is shown in red text.
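Because the color and underline markup is not preserved in plain text, the mutated base can instead be located from the coordinates. The sketch below is illustrative: `parse_numbered_lines` is a hypothetical helper, the excerpt is two lines copied from the listing above, and the end coordinate 11489 is computed from the last line of the listing (label 11461 plus 29 bases) rather than stated in the source.

```python
# Illustrative sketch only; parse_numbered_lines is a hypothetical helper.
EXCERPT = """\
10861 tgatgctggt cctcctggcc ctgctggtcc tgctggaccc cccggcccca ttgtaagtat
10921 cttgtcttct gcaccataag ctttggatag ccttggactt ggggctagcc tggatctcat
"""


def parse_numbered_lines(text: str) -> dict[int, str]:
    """Map each 1-based coordinate to its base for GenBank-style numbered lines."""
    bases: dict[int, str] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start = int(fields[0])  # label gives the coordinate of the first base
        for offset, base in enumerate("".join(fields[1:])):
            bases[start + offset] = base
    return bases


bases = parse_numbered_lines(EXCERPT)

MUTATION_POS = 10914                          # position given under Nature of Mutation
AMPLICON_START, AMPLICON_END = 10352, 11489   # end coordinate derived, see lead-in

ref_base = bases[MUTATION_POS]
# Donor hexamer: one base upstream of the mutated position through four downstream.
donor = "".join(bases[p] for p in range(MUTATION_POS - 1, MUTATION_POS + 5))
amplicon_length = AMPLICON_END - AMPLICON_START + 1
offset_in_amplicon = MUTATION_POS - AMPLICON_START + 1   # 1-based

print(ref_base, donor, amplicon_length, offset_in_amplicon)
```

Under these assumptions the reference base at 10914 is `t`, it is the second base of the `gtaagt` donor hexamer, the amplicon length comes out at 1138 nucleotides as stated above, and the mutated base is the 563rd base of the amplicon.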
Science Writers Eva Marie Y. Moresco
Illustrators Victoria Webster
Authors Koichi Tabeta, Xin Du, Bruce Beutler
Edit History
2011-01-07 9:39 AM (current)
2009-10-16 12:00 AM