Phenotypic Mutation 'Kraken'
Mutation Type splice site (3 bp from exon)
Coordinate 45,327,866 bp (GRCm38)
Base Change A ⇒ T (forward strand)
Gene Col3a1
Gene Name collagen, type III, alpha 1
Synonym(s) Col3a-1
Chromosomal Location 45,311,538-45,349,706 bp (+)
MGI Phenotype FUNCTION: This gene encodes the alpha-1 subunit of the fibril-forming type III collagen found in bone, cartilage, dentin, tendon, bone marrow stroma and other connective tissue. The encoded protein forms homotrimeric type III procollagen that undergoes proteolytic processing during fibril formation. A majority of mice lacking the encoded protein die within two days of birth but about 5% of the animals survive to adulthood. The surviving mice exhibit severe cortical malformation and experience significantly shorter lifespan. The mutant mouse named "tight skin 2" exhibiting systemic sclerosis phenotype was found to harbor a missense point mutation in this gene. A pseudogene of this gene has been defined on chromosome 8. [provided by RefSeq, Nov 2015]
PHENOTYPE: Most homozygous mutants die within 48 hours after birth. Surviving mutants have reduced body size, skin lesions, enlarged intestines, and die by 6 months of age from ruptured blood vessels. Occasionally intestinal rupture also results in early death. Heterozygotes exhibit tight skin. [provided by MGI curators]
Accession Number

NCBI RefSeq: NM_009930; MGI:88453

Mapped Yes 
Amino Acid Change
Institutional Source Beutler Lab
Gene Model predicted gene model for protein(s): [ENSMUSP00000085192] † probably from a misspliced transcript
SMART Domains Protein: ENSMUSP00000085192
Gene: ENSMUSG00000026043

Domain Start End E-Value Type
signal peptide 1 23 N/A INTRINSIC
VWC 33 89 2.73e-20 SMART
low complexity region 100 140 N/A INTRINSIC
low complexity region 163 227 N/A INTRINSIC
low complexity region 230 248 N/A INTRINSIC
internal_repeat_2 249 284 3.45e-13 PROSPERO
internal_repeat_1 250 290 1.9e-17 PROSPERO
Pfam:Collagen 293 366 7.3e-9 PFAM
low complexity region 368 419 N/A INTRINSIC
internal_repeat_4 423 476 2.52e-11 PROSPERO
internal_repeat_1 427 488 1.9e-17 PROSPERO
internal_repeat_3 427 491 7.39e-12 PROSPERO
internal_repeat_2 456 491 3.45e-13 PROSPERO
Pfam:Collagen 533 592 2.6e-11 PFAM
low complexity region 632 680 N/A INTRINSIC
low complexity region 683 776 N/A INTRINSIC
low complexity region 784 815 N/A INTRINSIC
low complexity region 818 855 N/A INTRINSIC
low complexity region 865 921 N/A INTRINSIC
low complexity region 925 950 N/A INTRINSIC
low complexity region 953 974 N/A INTRINSIC
internal_repeat_4 975 1028 2.52e-11 PROSPERO
internal_repeat_3 976 1029 7.39e-12 PROSPERO
internal_repeat_5 977 991 3.33e-5 PROSPERO
internal_repeat_5 1019 1033 3.33e-5 PROSPERO
low complexity region 1037 1058 N/A INTRINSIC
Pfam:Collagen 1076 1135 5.6e-13 PFAM
Pfam:Collagen 1136 1209 4.3e-11 PFAM
COLFI 1229 1464 5.73e-166 SMART
Predicted Effect probably null
Meta Mutation Damage Score 0.9755
Is this an essential gene? Probably essential (E-score: 0.792)
Phenotypic Category
Phenotype Literature verified References
DSS: sensitive day 10
DSS: sensitive day 7
Candidate Explorer Status CE: excellent candidate; Verification probability: 0.43; ML prob: 0.436; human score: 0
Single pedigree
Linkage Analysis Data
Alleles Listed at MGI

All Mutations and Alleles(6) : Chemically induced (ENU)(1) Chemically induced (other)(1) Spontaneous(1) Targeted(3)

Lab Alleles
Allele Source Chr Coord Type Predicted Effect PPH Score
IGL00504:Col3a1 APN 1 45347135 missense probably damaging 1.00
IGL00928:Col3a1 APN 1 45340858 intron probably benign
IGL00958:Col3a1 APN 1 45327595 missense unknown
IGL01353:Col3a1 APN 1 45333638 unclassified probably benign
IGL01820:Col3a1 APN 1 45321608 missense unknown
IGL01839:Col3a1 APN 1 45311830 missense unknown
IGL02517:Col3a1 APN 1 45325803 critical splice acceptor site probably null
IGL02879:Col3a1 APN 1 45340959 intron probably benign
IGL02960:Col3a1 APN 1 45328455 missense unknown
IGL03245:Col3a1 APN 1 45331109 unclassified probably benign
IGL03308:Col3a1 APN 1 45330617 splice site probably benign
IGL03050:Col3a1 UTSW 1 45328925 splice site probably null
PIT4520001:Col3a1 UTSW 1 45335783 critical splice donor site probably null
R0063:Col3a1 UTSW 1 45330541 splice site probably benign
R0122:Col3a1 UTSW 1 45340897 intron probably benign
R0131:Col3a1 UTSW 1 45328868 splice site probably benign
R0762:Col3a1 UTSW 1 45321526 missense unknown
R0765:Col3a1 UTSW 1 45336651 unclassified probably benign
R0853:Col3a1 UTSW 1 45343324 intron probably benign
R0898:Col3a1 UTSW 1 45333993 unclassified probably benign
R1170:Col3a1 UTSW 1 45327601 missense unknown
R1170:Col3a1 UTSW 1 45347724 missense probably damaging 1.00
R1440:Col3a1 UTSW 1 45343312 splice site probably null
R1449:Col3a1 UTSW 1 45321611 missense unknown
R1526:Col3a1 UTSW 1 45321688 missense unknown
R1572:Col3a1 UTSW 1 45345968 missense possibly damaging 0.95
R1585:Col3a1 UTSW 1 45327866 splice site probably null
R1616:Col3a1 UTSW 1 45328488 critical splice donor site probably null
R1691:Col3a1 UTSW 1 45348616 unclassified probably benign
R1876:Col3a1 UTSW 1 45342235 splice site probably null
R1937:Col3a1 UTSW 1 45334293 unclassified probably benign
R2093:Col3a1 UTSW 1 45332990 missense probably damaging 1.00
R2110:Col3a1 UTSW 1 45330145 missense unknown
R2119:Col3a1 UTSW 1 45346121 missense probably damaging 1.00
R2256:Col3a1 UTSW 1 45321632 missense unknown
R2327:Col3a1 UTSW 1 45338611 unclassified probably benign
R2518:Col3a1 UTSW 1 45337512 unclassified probably benign
R2991:Col3a1 UTSW 1 45335779 unclassified probably benign
R3405:Col3a1 UTSW 1 45338753 unclassified probably benign
R3784:Col3a1 UTSW 1 45347135 missense probably damaging 1.00
R3847:Col3a1 UTSW 1 45321990 missense unknown
R3848:Col3a1 UTSW 1 45321990 missense unknown
R3849:Col3a1 UTSW 1 45321990 missense unknown
R4502:Col3a1 UTSW 1 45348677 unclassified probably benign
R4503:Col3a1 UTSW 1 45348677 unclassified probably benign
R4764:Col3a1 UTSW 1 45346110 missense probably damaging 1.00
R4839:Col3a1 UTSW 1 45323803 splice site probably null
R4934:Col3a1 UTSW 1 45339952 unclassified probably benign
R5033:Col3a1 UTSW 1 45346110 missense probably damaging 1.00
R5123:Col3a1 UTSW 1 45333596 unclassified probably benign
R5190:Col3a1 UTSW 1 45329084 missense unknown
R5190:Col3a1 UTSW 1 45344807 intron probably benign
R5375:Col3a1 UTSW 1 45347899 splice site probably null
R5407:Col3a1 UTSW 1 45346052 missense probably benign 0.03
R5627:Col3a1 UTSW 1 45331560 unclassified probably benign
R5642:Col3a1 UTSW 1 45331712 unclassified probably benign
R6014:Col3a1 UTSW 1 45321579 nonsense probably null
R6052:Col3a1 UTSW 1 45345013 unclassified probably benign
R6263:Col3a1 UTSW 1 45321575 missense unknown
R6453:Col3a1 UTSW 1 45339378 unclassified probably benign
R6463:Col3a1 UTSW 1 45342205 intron probably benign
R6488:Col3a1 UTSW 1 45331534 unclassified probably benign
R6525:Col3a1 UTSW 1 45347179 missense possibly damaging 0.88
R6637:Col3a1 UTSW 1 45347730 missense probably damaging 1.00
R6704:Col3a1 UTSW 1 45347732 missense probably damaging 1.00
R6744:Col3a1 UTSW 1 45338622 unclassified probably benign
R6745:Col3a1 UTSW 1 45338622 unclassified probably benign
R6747:Col3a1 UTSW 1 45338622 unclassified probably benign
R6858:Col3a1 UTSW 1 45345984 missense probably damaging 1.00
R6903:Col3a1 UTSW 1 45331988 missense probably damaging 0.96
R7189:Col3a1 UTSW 1 45333657 missense unknown
R7194:Col3a1 UTSW 1 45331700 missense unknown
R7199:Col3a1 UTSW 1 45332141 missense probably null 0.99
R7204:Col3a1 UTSW 1 45322418 missense unknown
R7304:Col3a1 UTSW 1 45347811 missense unknown
R7378:Col3a1 UTSW 1 45327647 splice site probably null
R7398:Col3a1 UTSW 1 45327813 missense unknown
R7742:Col3a1 UTSW 1 45345001 missense unknown
R8072:Col3a1 UTSW 1 45321574 missense unknown
R8177:Col3a1 UTSW 1 45335764 missense unknown
R8183:Col3a1 UTSW 1 45334810 missense unknown
R8445:Col3a1 UTSW 1 45341180 nonsense probably null
R8490:Col3a1 UTSW 1 45345956 missense probably benign 0.01
R8546:Col3a1 UTSW 1 45340939 intron probably benign
R8720:Col3a1 UTSW 1 45347733 missense not run
Z1177:Col3a1 UTSW 1 45311800 missense unknown
Mode of Inheritance Autosomal Recessive
Local Stock Sperm, gDNA
Last Updated 2019-09-04 9:46 PM by Anne Murray
Record Created 2015-02-23 2:32 PM by Jeff SoRelle
Record Posted 2019-01-18
Phenotypic Description
Figure 1. Kraken mice exhibited susceptibility to DSS-induced colitis at day 10 after DSS treatment. Normalized data are shown. Abbreviations: WT, wild-type; REF, homozygous reference mice; HET, heterozygous variant mice; VAR, homozygous variant mice. Mean (μ) and standard deviation (σ) are indicated.
Figure 2. Kraken mice exhibit increased frequencies of peripheral T cells. Flow cytometric analysis of peripheral blood was utilized to determine T cell frequency. Normalized data are shown. Abbreviations: WT, wild-type; REF, homozygous reference mice; HET, heterozygous variant mice; VAR, homozygous variant mice. Mean (μ) and standard deviation (σ) are indicated.
Figure 3. Kraken mice exhibit increased frequencies of peripheral CD4+ T cells. Flow cytometric analysis of peripheral blood was utilized to determine T cell frequency. Normalized data are shown. Abbreviations: WT, wild-type; REF, homozygous reference mice; HET, heterozygous variant mice; VAR, homozygous variant mice. Mean (μ) and standard deviation (σ) are indicated.
Figure 4. Kraken mice exhibit increased frequencies of peripheral CD8+ T cells. Flow cytometric analysis of peripheral blood was utilized to determine T cell frequency. Normalized data are shown. Abbreviations: WT, wild-type; REF, homozygous reference mice; HET, heterozygous variant mice; VAR, homozygous variant mice. Mean (μ) and standard deviation (σ) are indicated.

The kraken phenotype was identified among N-ethyl-N-nitrosourea (ENU)-mutagenized G3 mice of the pedigree R1585, some of which showed susceptibility to dextran sodium sulfate (DSS)-induced colitis at 10 days after DSS exposure (Figure 1); weight loss is used to measure DSS susceptibility. Some mice also showed increased frequencies of T cells (Figure 2), CD4+ T cells (Figure 3), and CD8+ T cells (Figure 4) in the peripheral blood.

Nature of Mutation
Figure 5. Linkage mapping of the DSS susceptibility phenotype using a recessive model of inheritance. Manhattan plot shows -log10 P values (Y-axis) plotted against the chromosome positions of 82 mutations (X-axis) identified in the G1 male of pedigree R1585. Normalized phenotype data are shown for single locus linkage analysis without consideration of G2 dam identity. Horizontal pink and red lines represent thresholds of P = 0.05, and the threshold for P = 0.05 after applying Bonferroni correction, respectively.
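The two thresholds drawn in Figure 5 can be reproduced with a short calculation. The sketch below assumes that the Bonferroni correction divides the nominal P = 0.05 by the 82 mutations tested; the record does not state the exact denominator, so that divisor is an assumption. The observed P value is the one reported in this record for the recessive DSS model.

```python
import math

ALPHA = 0.05            # nominal threshold (pink line in Figure 5)
N_MUTATIONS = 82        # mutations identified in the G1 male of pedigree R1585
P_OBSERVED = 1.636e-5   # P value reported for the Col3a1 mutation

# Assumption: one test per mutation, so Bonferroni divides alpha by 82.
bonferroni_alpha = ALPHA / N_MUTATIONS


def neg_log10(p: float) -> float:
    """Return the Manhattan-plot Y-axis value for a P value."""
    return -math.log10(p)


# The mutation is significant after correction if its P value is below
# the corrected threshold (equivalently, its -log10 P is above the red line).
passes_correction = P_OBSERVED < bonferroni_alpha

print(f"nominal threshold:    -log10 P = {neg_log10(ALPHA):.2f}")
print(f"Bonferroni threshold: -log10 P = {neg_log10(bonferroni_alpha):.2f}")
print(f"observed:             -log10 P = {neg_log10(P_OBSERVED):.2f}")
print(f"passes Bonferroni correction: {passes_correction}")
```

Under this assumption the reported P value clears the corrected threshold.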

Whole exome HiSeq sequencing of the G1 grandsire identified 82 mutations. All of the above anomalies were linked by continuous variable mapping to a mutation in Col3a1: an A to T transversion at base pair 45,327,866 (v38) on chromosome 1, or base pair 16,329 in the GenBank genomic region NC_000067 encoding Col3a1. The mutation lies within the donor splice site of intron 10. The strongest association was found with a recessive model of linkage to the DSS susceptibility phenotype, wherein two variant homozygotes departed phenotypically from 19 homozygous reference mice and 17 heterozygous mice with a P value of 1.636 x 10^-5 (Figure 5).
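The relationship between the chromosome coordinate and the gene-relative position can be checked with simple arithmetic. This is a minimal sketch that assumes the NC_000067 genomic region for Col3a1 starts at the first base of the chromosomal location listed in this record and uses 1-based numbering; the function name is illustrative only.

```python
GENE_START = 45_311_538    # Col3a1 chromosomal location, start (+ strand)
GENE_END = 45_349_706      # Col3a1 chromosomal location, end
MUTATION_POS = 45_327_866  # kraken mutation, chromosome 1 (GRCm38)


def gene_offset(chrom_pos: int, gene_start: int, gene_end: int) -> int:
    """Convert a chromosome coordinate to a 1-based offset within the gene."""
    if not gene_start <= chrom_pos <= gene_end:
        raise ValueError("position lies outside the gene interval")
    # 1-based numbering: the first base of the gene is offset 1.
    return chrom_pos - gene_start + 1


offset = gene_offset(MUTATION_POS, GENE_START, GENE_END)
print(offset)  # 16329
```

The result agrees with the base pair 16,329 quoted for the genomic region.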


The effects of the mutation at the cDNA and protein levels have not been examined, but the mutation is predicted to result in the use of a cryptic site in exon 10. The resulting transcript would have a 17-base pair deletion of exon 10, which would cause a frameshifted protein product beginning after amino acid 260 of the protein and terminating after the inclusion of 1 aberrant amino acid.


           <--exon 9                 <--exon 10 intron 10-->       exon 11-->

244   ……-P--G--P--P- -G--I--K--G-……-K--G--H--R-                    G--L--*-
                  correct             deleted                      aberrant


Genomic numbering corresponds to NC_000067. The donor splice site of intron 10, which is destroyed by the kraken mutation, is indicated in blue lettering and the mutated nucleotide is indicated in red. 
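The predicted frameshift can be illustrated with a short sketch. The exon 10 sequence below is read off the amplicon listed in the Primers section of this record, with boundaries inferred from the G-I-K-G ... K-G-H-R translation shown in the diagram; those boundaries, and the idea that the cryptic donor sits at the start of the deleted segment, are assumptions and have not been confirmed at the cDNA level.

```python
# Assumed exon 10 sequence (54 nt), inferred from the amplicon in this record.
EXON10 = "ggtatcaaaggcccagctggcatgcctggattccctggtatgaaaggacacaga"
DELETION_LEN = 17  # nucleotides predicted to be lost from the 3' end of exon 10

# Minimal codon table covering only the codons that occur in exon 10.
CODONS = {
    "ggt": "G", "atc": "I", "aaa": "K", "ggc": "G", "cca": "P", "gct": "A",
    "atg": "M", "cct": "P", "gga": "G", "ttc": "F", "cac": "H", "aga": "R",
}


def translate(dna: str) -> str:
    """Translate complete codons only; trailing 1-2 nt are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))


peptide = translate(EXON10)           # wild-type exon 10 peptide
retained = EXON10[:-DELETION_LEN]     # exon 10 sequence kept in the transcript
deleted = EXON10[-DELETION_LEN:]      # sequence removed by cryptic splicing
leftover_nt = len(retained) % 3       # non-zero means the frame is shifted

print(peptide)
print(f"retained {len(retained)} nt, leftover {leftover_nt} nt -> frameshift")
print(f"deleted segment starts with: {deleted[:2]}")
```

Because 17 is not a multiple of 3, the retained 37 nucleotides end partway through a codon, which is what shifts the reading frame of the downstream exon. The deleted segment begins with GT, which is consistent with (but does not prove) use of a cryptic donor at that position.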

Illustration of Mutations in Gene & Protein
Protein Prediction
Figure 6. Domain organization of COL3A1. The kraken mutation destroys the donor splice site of intron 10. The propeptides are indicated. Abbreviations: SP, signal peptide; VWFC, von Willebrand factor type C; NC1, noncollagenous.

Col3a1 encodes the 1,464-amino acid type III procollagen, α1 chain [α1(III)]. In vertebrates, the collagen superfamily contains 28 different types of collagen (1;2). All collagens have 3 α chains; each α chain is identified by an Arabic numeral followed by a Roman numeral in parentheses representing the collagen type (2). In addition to forming a homotrimer of three α1(III) chains, COL3A1 assembles with type I collagen (see the record seal for information about Col1a1) to form heterotypic fibrils (3).


Based on sequence homology, the α-chains can be divided into two groups: the α1, α3, and α5 chains, and the α2, α4, and α6 chains. Each α chain contains three structurally distinct domains: an N-terminal domain of about 140 amino acids rich in cysteine and lysine residues (known as the 7S domain), a collagenous domain of about 1300 residues largely composed of Gly-Xaa-Yaa repeats, and a C-terminal noncollagenous (NC1) domain that is roughly 230 amino acids long (Figure 6). In the Gly-Xaa-Yaa repeats found in the large collagenous domain, Xaa and Yaa are often proline and hydroxyproline residues. These sequences have a high propensity to form supercoiled triple helical structures (4). COL3A1 also has a von Willebrand factor type C (VWFC) domain at its N-terminus, which may promote protein-protein interactions and/or oligomerization (SMART).


During incorporation into the extracellular matrix, N- and C-terminal propeptides of COL3A1 are cleaved by bone morphogenetic protein-1 and tolloid-like proteinases (5). The N-propeptide includes the VWFC domain, while the C-propeptide contains the NC1 domain. The N-terminal propeptide is a marker of liver fibrosis in patients with chronic liver diseases (6). The N-terminal propeptide is also correlated with the extent of interstitial fibrosis in the kidney (7).


The kraken mutation destroys the donor splice site of intron 10, is predicted to delete 17 nucleotides from the Col3a1 cDNA (exon 10), and is predicted to result in premature protein truncation (Figure 6).


COL3A1 is expressed in tissues that exhibit elastic properties, including skin, lung, liver, intestine and the arterial wall (8). Type III collagen is secreted into the extracellular matrix by fibroblasts and other mesenchymal cell types.

Figure 7. Components of extracellular matrix. The basement membrane is synthesized by the epithelium and mesenchymal cells. It is found at the base of the epithelium and surrounds vasculature. The basement membrane comprises laminin, proteoglycans, and type IV collagen. The interstitial matrix is found surrounding cells in the connective tissue and is synthesized by fibroblasts. Proteoglycans, types I and III collagen, elastin, hyaluronan, and fibrillin are found in the interstitial matrix.

Collagens form the structural basis of skin, tendon, bone, cartilage, and other tissues (Figure 7). Some collagens have a restricted tissue expression pattern; for example types II, IX and XI are found almost exclusively in cartilage and type IV is only in basement membranes (2). Most collagens form some type of supramolecular structure, such as fibrils in the case of type I collagen. Collagens are the most abundant proteins in the human body, making up approximately 30% of its protein mass (2). There are at least 27 collagen types and 42 α chains in vertebrates, in addition to a variety of proteins containing the collagen triple helix motif (2;9). Fibril-forming collagen orthologues have been identified in invertebrates (10), as well as in bacteria and viruses (11).


The role of type III collagen in the organization and biological properties of the extracellular matrix is not fully understood. Collagen III functions in normal type I collagen fibrillogenesis in the cardiovascular system, intestines, and skin (12). Together with collagen I, collagen III is a major component of the interstitial matrix.


Type III collagen is a ligand for several proteins, including G protein-coupled receptor-56 (GPR56), discoidin domain receptors (DDRs) 1 and 2, von Willebrand factor (vWF), and integrin α2β1 (13;14). The interaction between GPR56 and COL3A1 regulates cortical development by regulating pial basement membrane integrity and cortical lamination, subsequently inhibiting neural migration (15). The interaction between COL3A1 and DDR1/2 controls cell proliferation, adhesion, migration, and extracellular matrix remodeling. Aberrant interaction between COL3A1 and DDR1/2 is linked to fibrosis, atherosclerosis, cancer, and arthritis (16). Binding of COL3A1 to vWF and integrin α2β1 promotes wound healing.


Mutations in COL3A1 have been linked to Ehlers-Danlos syndrome type IV in humans (OMIM: #130050; (17;18)). Ehlers-Danlos syndromes are connective tissue disorders; Ehlers-Danlos syndrome type IV is the vascular type of Ehlers-Danlos syndrome. Patients with Ehlers-Danlos syndrome type IV exhibit acrogeria (i.e., emaciated face with prominent cheekbones and sunken cheeks and premature aging of the extremities), translucent skin with visible subcutaneous vessels on the trunk and lower back, easy bruising, spontaneous rupture of blood vessels, digestive ruptures, and perforations of the gravid uterus (17;19). The median age of death for Ehlers-Danlos syndrome type IV patients is 50 years; arterial ruptures cause the majority of deaths (19).


Levels of matrix metalloproteinase-9-degraded type III collagen are elevated in penetrating (Montreal B3) Crohn’s disease (20). Also, plasma concentrations of COL3A1 are higher in patients with Crohn’s disease who later developed strictures compared to Crohn’s disease patients without strictures (21). COL3A1 is a disease-associated gene in gastroesophageal reflux disease as well as a male risk factor for hiatal hernia (22).


Homozygous mice expressing a spontaneous intergenic deletion of the first 38 exons of Col3a1 exhibited embryonic lethality before embryonic day 9.5 (23). Heterozygous mice exhibited premature death on average at 6 weeks of age due to spontaneous aortic dissection between 4 and 10 weeks of age (23). Most Col3a1-deficient (Col3a1-/-) mice exhibited postnatal lethality within the first 48 hours after birth, primarily due to ruptured blood vessels and/or intestinal rupture; about 5% of Col3a1-/- mice survived to weaning age, and the survivors were reduced in size compared to wild-type littermates (12). Heterozygous (Col3a1+/-) mice showed reduced wall strength in the aorta and colon (24). Homozygous mice expressing an ENU-induced Col3a1 allele (Col3a1Tsk2/Tsk2) causing a cysteine to serine substitution at amino acid 33 (C33S) exhibited prenatal lethality. Heterozygous Col3a1Tsk2 mice exhibited adipose tissue and skin inflammation as well as thick and tight skin (25). Transgenic mice expressing a Gly182Ser mutation in COL3A1 showed vascular and dermal fragility as well as malformed dermal and aortic collagen fibrils (26).

Putative Mechanism

Fibrosis is a complication of chronic inflammation that occurs in inflammatory bowel disease, and COL3A1 is one of several profibrogenic extracellular matrix genes upregulated during the active/chronic inflammatory stage in a trinitrobenzene sulfonic acid (TNBS)-induced murine colitis model (27). The DSS-induced colitis phenotype observed in the kraken mice indicates loss of COL3A1-associated function in the intestine. However, some COL3A1 function may remain as the kraken mice did not exhibit premature death. The mutation in kraken may be leading to increased incidence of blood vessel rupture and/or aberrant fibrosis in the intestine.

Primers PCR Primer

Sequencing Primer

PCR program

1) 94°C 2:00
2) 94°C 0:30
3) 55°C 0:30
4) 72°C 1:00
5) repeat steps (2-4) 40x
6) 72°C 10:00
7) 4°C hold
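The programmed run time of this protocol can be estimated from the listed steps. The sketch below assumes that "40x" means 40 total passes through steps 2-4 and ignores ramp times between temperatures and the final 4°C hold, so the actual time on a thermocycler will be longer.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '0:30' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


INITIAL_DENATURATION = "2:00"       # step 1, 94°C
CYCLE_STEPS = ["0:30", "0:30", "1:00"]  # steps 2-4: 94°C, 55°C, 72°C
N_CYCLES = 40                       # assumption: 40 total cycles
FINAL_EXTENSION = "10:00"           # step 6, 72°C

cycle_seconds = sum(to_seconds(step) for step in CYCLE_STEPS)
total_seconds = (
    to_seconds(INITIAL_DENATURATION)
    + N_CYCLES * cycle_seconds
    + to_seconds(FINAL_EXTENSION)
)

print(f"one cycle: {cycle_seconds} s")
print(f"programmed time: {total_seconds // 60} min")  # 92 min
```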

The following sequence of 486 nucleotides is amplified (chromosome 1, + strand):

1   tgcctggacc tccagtaagt cttcattaaa taaactacct agaacataca aagtatttta
61  attttaactt caaaaaaaaa cccaaataat caaattaaaa taaacatcaa aataaatgca
121 tttttatcac tgaatgttga cttgaatgtt aattaacatc tttcccattc attactttta
181 gggtatcaaa ggcccagctg gcatgcctgg attccctggt atgaaaggac acagagtaag
241 tgaccctaat cctaaccctg tattcatgaa cttagatata atttataaat ggacagagca
301 cttttagttt taacaaactt tagataaagt caatccatcc aaaatctatc aaactaagaa
361 caatatccat caaactagag tttattcttc tatacttaag acaacttttt aatcatacta
421 aattgaatgt tctcattagt ttctcttttt tttgtaataa tacgttgcca aatgttcagc
481 agttca

Primer binding sites are underlined and the sequencing primers are highlighted; the mutated nucleotide is shown in red.
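Since the underlining and colour markup of the sequence are not preserved in plain text, the amplicon can be checked programmatically instead. The sketch below parses the numbered sequence block, confirms the stated length, and locates the intron 10 donor site. The anchor string used to find the end of exon 10 is inferred from the K-G-H-R codons and is an assumption; the motif "gtaagt" alone is not used because it occurs more than once in the amplicon.

```python
AMPLICON_BLOCK = """
1   tgcctggacc tccagtaagt cttcattaaa taaactacct agaacataca aagtatttta
61  attttaactt caaaaaaaaa cccaaataat caaattaaaa taaacatcaa aataaatgca
121 tttttatcac tgaatgttga cttgaatgtt aattaacatc tttcccattc attactttta
181 gggtatcaaa ggcccagctg gcatgcctgg attccctggt atgaaaggac acagagtaag
241 tgaccctaat cctaaccctg tattcatgaa cttagatata atttataaat ggacagagca
301 cttttagttt taacaaactt tagataaagt caatccatcc aaaatctatc aaactaagaa
361 caatatccat caaactagag tttattcttc tatacttaag acaacttttt aatcatacta
421 aattgaatgt tctcattagt ttctcttttt tttgtaataa tacgttgcca aatgttcagc
481 agttca
"""


def parse_numbered_sequence(block: str) -> str:
    """Drop the leading position number on each line and join the bases."""
    parts = []
    for line in block.strip().splitlines():
        tokens = line.split()
        parts.extend(tokens[1:])  # tokens[0] is the position number
    return "".join(parts)


seq = parse_numbered_sequence(AMPLICON_BLOCK)

# Assumed 3' end of exon 10 (codons for K-G-H-R); the donor site follows it.
EXON10_TAIL = "aaaggacacaga"
donor_start = seq.index(EXON10_TAIL) + len(EXON10_TAIL)  # 0-based index of +1
donor_site = seq[donor_start:donor_start + 6]
mutated_index = donor_start + 2                          # +3 position, 0-based

print(f"amplicon length: {len(seq)}")
print(f"donor site of intron 10: {donor_site}")
print(f"base at +3 (1-based position {mutated_index + 1}): {seq[mutated_index]}")
```

Under these assumptions the reference base at the +3 position is an A, in line with the A to T change on the forward strand reported for kraken.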

Science Writers Anne Murray
Illustrators Diantha La Vine
Authors Jeff SoRelle, Emre Turer, William McAlpine, Noelle Hutchins, and Bruce Beutler