Methods
STUDY DESIGN AND POPULATION OF PATIENTS
We recruited patients with CCCA in Durban, South Africa, from 2013 through 2016 and in Winston-Salem, North Carolina, from 2014 through 2017. All the patients provided written informed consent to participate in this study. The cohort of patients comprised two sets: a discovery set and a replication set. We first searched for candidate genes by means of exome sequencing in the discovery set and then sequenced the top candidate gene in the replication set.
The study was approved by the institutional review boards in Durban and Winston-Salem. The Tel Aviv Medical Center, in Israel, where the sequencing of exomes and targeted sequencing of PADI3 were performed, has received approval from an institutional review board for the performance of genetic studies.
We attempted to identify candidate genes with variants inherited in a dominant fashion. First, we used three criteria to identify candidate genes in the discovery set: a minor allele frequency of less than 0.05 in persons of African descent and less than 0.0001 in persons of European descent, predicted pathogenicity, and the presence of the candidate gene variants in multiple patients. If a gene was identified as a strong candidate, we would obtain evidence for its role in the pathogenesis of the disease using functional biologic assays. The identification of variants that were functionally shown to be pathogenic in a gene known to encode a protein that plays a critical role in hair-shaft maturation would provide evidence of its role in CCCA pathogenesis.
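The three filtering criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline that was actually used: the `Variant` record, its field names, and the reading of "multiple patients" as "at least two carriers" are hypothetical, and carriers are summed per gene as if each variant were found in different patients.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above.
MAX_AF_AFRICAN = 0.05
MAX_AF_EUROPEAN = 0.0001
# Assumption: "multiple patients" is read as at least two carriers.
MIN_CARRIERS = 2


@dataclass
class Variant:
    """Hypothetical record for one variant seen in the discovery set."""
    gene: str
    af_african: float           # minor allele frequency, African descent
    af_european: float          # minor allele frequency, European descent
    predicted_pathogenic: bool  # consensus of the prediction tools
    carriers: int               # discovery-set patients carrying the variant


def passes_filters(v: Variant) -> bool:
    """Apply the allele-frequency and pathogenicity criteria to one variant."""
    return (
        v.af_african < MAX_AF_AFRICAN
        and v.af_european < MAX_AF_EUROPEAN
        and v.predicted_pathogenic
    )


def candidate_genes(variants):
    """Return genes whose filtered variants are carried by multiple patients.

    Assumption: carriers of different variants in a gene are different patients.
    """
    carriers_per_gene = {}
    for v in variants:
        if passes_filters(v):
            carriers_per_gene[v.gene] = carriers_per_gene.get(v.gene, 0) + v.carriers
    return sorted(g for g, n in carriers_per_gene.items() if n >= MIN_CARRIERS)


# Made-up placeholder data, not study results.
demo = [
    Variant("GENE_A", 0.01, 0.0, True, 2),
    Variant("GENE_A", 0.03, 0.00005, True, 1),
    Variant("GENE_B", 0.20, 0.0, True, 5),   # too common in the African population
    Variant("GENE_C", 0.001, 0.0, True, 1),  # seen in only one patient
]
print(candidate_genes(demo))
```

With the placeholder data, only `GENE_A` survives all three criteria.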
NUCLEIC ACID EXTRACTION
Genomic DNA was extracted from saliva samples that had been collected with the use of dedicated kits (OG-500, DNA Genotek). Total RNA was extracted from biopsy samples of scalp skin with the use of the RNeasy Fibrous Tissue Mini Kit (Qiagen), according to the manufacturer’s instructions.
EXOME SEQUENCING
Exome sequencing in patients in the discovery set was performed by Fulgent Genetics or BGI. Whole-exome capture was carried out by in-solution hybridization with SureSelect Human All Exon, version 4.0 (Agilent), or the Roche NimbleGen Protocol (10GB), followed by massively parallel sequencing (Illumina HiSeq2000 or HiSeq4000) with 100–150-bp paired-end reads. Details regarding the methods and exome performance are provided in the Supplementary Appendix, available with the full text of this article at NEJM.org.
EXPRESSION STUDIES AND ENZYMATIC ASSAY
RNA sequencing libraries were prepared with the use of the TruSeq RNA v2 protocol (Illumina), and sequencing was performed on a HiSeq 2500 instrument (Illumina). Protein expression was assessed with the use of immunoblotting and immunofluorescence staining of cells and tissues. The enzymatic activity of the various PADI3 expression constructs was measured with the use of an antibody-based assay (ABAP, ModiQuest Research). More details are provided in the Supplementary Methods section in the Supplementary Appendix.
MUTAGENESIS
We used a vector containing the reference (nonvariant) PADI3 sequence [9] to engineer the new mutant constructs. Mutagenesis primers were designed with the use of the Agilent QuikChange Primer Design tool (www.genomics.agilent.com/primerDesignProgram.jsp) and are described in Table S1 in the Supplementary Appendix. The procedure was performed with the use of the QuikChange II XL site-directed mutagenesis kit (Agilent), according to the manufacturer’s instructions.
STATISTICAL ANALYSIS
We used the chi-square test and Fisher’s exact test to ascertain differences in frequencies of the PADI3 mutations between the patients and a control population of women of African ancestry in a post hoc analysis. PADI3 was directly resequenced in both the discovery and replication sets. We did not control for population stratification. The P value threshold was set at less than 0.05 rather than the usual 2.5×10⁻⁶ that is used for genomewide association studies because this latter design was deemed to be inappropriate for investigating the cause of CCCA given the hypothesis of genetic heterogeneity and the patient sample size. We performed a post hoc analysis involving patients in both the discovery and replication sets, pooled into a single set of patients.
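For readers who want to see how the two tests operate on a 2×2 table of mutant and nonmutant allele counts, the following is a minimal sketch that uses only the Python standard library. It is not the software used in the study. The function names are hypothetical, the chi-square test is computed without a continuity correction (an assumption, since the text does not say whether one was applied), and the counts in the example are made up.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


def chi_square_p(a, b, c, d):
    """Pearson chi-square P value (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(stat / 2))


# Made-up counts: rows are patients and controls,
# columns are mutant and nonmutant alleles.
print(fisher_exact_two_sided(3, 1, 1, 3))
print(chi_square_p(3, 1, 1, 3))
```

With small cell counts the two P values can differ appreciably, which is one reason to report both.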
To ascertain differences in enzymatic activity, we used the unpaired Student’s t-test to compare normally distributed variables. Values are reported as means with standard deviations. All P values are two-tailed, and a P value of less than 0.05 was considered to indicate statistical significance.
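As an illustration of the unpaired Student’s t-test described above, the sketch below computes the t statistic and degrees of freedom from two groups of measurements under the equal-variance assumption. The function name is hypothetical and the example values are placeholders rather than assay data. The two-tailed P value would then be read from the t distribution with the returned degrees of freedom, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def student_t_unpaired(x, y):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (sample variances use n - 1).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Placeholder values, not measured enzymatic activities.
t_stat, dof = student_t_unpaired([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(t_stat, dof)
```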
Results
IDENTIFICATION OF CANDIDATE VARIANTS IN PATIENTS WITH CCCA
Figure 1.
We initially conducted exome sequencing in a discovery set, which included 16 women of African ancestry who had received a diagnosis of CCCA (Table S2 in the Supplementary Appendix). The discovery set included one familial case (from Family 1) (Fig. S1 in the Supplementary Appendix). Each patient received a clinical diagnosis from a dermatologist at the respective study site. For each patient, the diagnosis was confirmed by means of biopsy (Figure 1). Clinical examination revealed hair loss over the crown, with centrifugal spread and a perifollicular grayish halo on dermoscopy in all patients. All the biopsy specimens showed decreased hair-follicle density and perifollicular lymphocytic infiltration with areas of fibrosis. CCCA was graded in all the patients according to the Central Hair Loss Grading scale (scores range from 0 to 5, with higher scores indicating more severe disease). Patients who were included in the discovery set had a moderate-to-severe condition (score of 3 to 5) (Table S2 in the Supplementary Appendix) [10].
Exome sequencing was performed by Fulgent Genetics in Patients 1 through 10 and by BGI in Patients 11 through 16. Variants that were identified by means of exome sequencing were classified according to their predicted effects on protein function with the use of the PolyPhen-2 (Polymorphism Phenotyping, version 2) tool [11], Provean (Protein Variation Effect Analyzer) software [12], the SIFT (Sorting Intolerant from Tolerant) algorithm [13], and the ConSurf server [14]. The prevalence of CCCA is estimated to be 2.7% among black women in South Africa [5] and 5.6% among black women in the United States [6]. Because CCCA has been reported almost exclusively in women of African ancestry [4,15,16], we selected for further analysis variants that were predicted to be pathogenic, that were shared by the patients, and that had a minor allele frequency of less than 0.05 in the African population and of less than 10⁻⁴ in the European population. (Prevalence data were derived from the Genome Aggregation Database [http://gnomad.broadinstitute.org/].)
Table 1.
Using this strategy, we identified four heterozygous mutations in the gene PADI3 (RefSeq accession number, NM_016233.2) in 5 of 16 patients (31%): c.856A→G, c.1744G→A, c.1669C→T, and c.832-2A→G (Figure 1C and Table 1). PADI3 encodes the enzyme peptidyl arginine deiminase, type III. These four mutations included one splice-site mutation and three missense mutations. All the missense mutations had a minor allele frequency in the range of 0.0001 to 0.04 in the African population while being very rare among persons of European ancestry (Table 1). The missense mutations are predicted to have a deleterious effect on protein function (Table S3 in the Supplementary Appendix). The mutant amino acids are located in the second immunoglobulin-like domain or the catalytic domain of the enzyme (Figure 1D). Protein modeling suggested that these mutations would be likely to result in protein misfolding (Fig. S2 in the Supplementary Appendix) [17,18]. The splice-site mutation c.832-2A→G is expected by several prediction tools [19-22] to abrogate the acceptor splice site of intron 7 and to effect skipping of exon 8, which in turn is expected to lead to a frame shift.
CONSEQUENCES OF CCCA-ASSOCIATED MUTATIONS IN PADI3
PADI3 [17] is a member of the peptidyl arginine deiminase family of enzymes, which are responsible for catalyzing the post-translational deimination of proteins by converting positively charged L-arginine residues into citrullines in the presence of calcium ions [23]. They have distinct substrate specificities and tissue-specific expression patterns [23,24]. PADI3 is detected mainly in the epidermis and hair follicles [25,26]. In the skin, it is responsible for mediating the modification of proteins critical for normal hair-shaft formation and shaping, such as trichohyalin, and may also play a role in interfollicular epidermal differentiation [23].
Although PADI3 has been associated with abnormal hair formation in patients who have the uncombable hair syndrome (Online Mendelian Inheritance in Man number, 191480) [9], it has been unclear whether it has a role in the pathogenesis of CCCA. In an attempt to obtain further in vivo evidence of the relevance of CCCA-associated mutations to the disease manifestations, we used deep sequencing of RNA extracted from biopsy samples of scalp skin obtained from three patients with CCCA who had mutations in PADI3 and from four healthy controls who were matched for ancestry population, age, and sex.
The expression of numerous genes differed between scalp-skin samples obtained from patients with CCCA and control samples (Fig. S3A and Table S4 in the Supplementary Appendix). Expression of PADI3 was markedly lower in the skin of patients with CCCA than in the skin of controls (Table S4 in the Supplementary Appendix), as was the expression of genes encoding several peptidyl arginine deiminase substrates (including TCHH [9] and S100A3 [27]), those known to be related to hair loss (including LIPH [28], DSG4 [29], HR [30], and CDSN [31]), and those encoding hair keratins and keratin-associated proteins (which contribute to the normal structure of hair fibers [32]). Ingenuity pathway analysis revealed that the expression of many genes encoding molecules that play a central role in hair-follicle development was reduced overall in the skin of patients with CCCA. Relevant RNA-sequencing data were validated with the use of quantitative reverse-transcriptase polymerase chain reaction. Details are provided in Figures S3 through S5 in the Supplementary Appendix.
Figure 2.
To further investigate the consequences of the CCCA-associated missense mutations in PADI3, we transiently transfected HaCaT cells (a human keratinocyte cell line) with constructs encoding nonvariant and mutated PADI3. Immunoblotting of cell extracts showed expression of all three mutant PADI3 constructs that was slightly lower than that of the nonvariant construct (Figure 2A). Accordingly, PADI3 expression was reduced in a scalp-skin sample obtained from a patient with CCCA (Figure 2B). We then examined the effect of PADI3 mutations on the subcellular location of the enzyme. Immunofluorescence analyses showed a homogeneous cytoplasmic distribution of PADI3 in cells transfected with nonmutated PADI3, as previously shown [9], in contrast with cells transfected (one at a time) with the three mutated PADI3 variants. In these cells, we observed abnormal intracellular localization of the protein with formation of aggregates in the cytoplasm (Figure 2C).
We then assayed enzymatic activity associated with the three mutant constructs, as compared with nonvariant PADI3. A construct with a mutation that had been previously associated with the uncombable hair syndrome [33] served as a positive control. We observed a significant decrease in enzymatic activity on transfection of the four constructs into HaCaT cells, as compared with the HaCaT cells transfected with the construct containing nonmutated PADI3 (Figure 2D).
FREQUENCY OF PADI3 MUTATIONS IN CCCA
We then sequenced PADI3 in a replication set, which included 42 patients (Table S2 in the Supplementary Appendix); we observed a PADI3 variant in 9 of them. Altogether, we identified a total of six different mutations in PADI3 (Figure 1C and Table 1), which were present in 14 of the 58 patients (24%) with CCCA who participated in this study. Two familial cases were identified; these patients were members of families with cosegregation of the mutations and affected status (Fig. S1 in the Supplementary Appendix).
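The pooled carrier frequency follows directly from the counts reported for the two sets (5 of 16 in the discovery set and 9 of 42 in the replication set). The short sketch below reproduces that arithmetic; the function name is hypothetical, and the percentage for the replication set is derived here from the reported counts rather than quoted from the text.

```python
def carrier_percent(carriers, patients):
    """Percentage of patients carrying a PADI3 mutation, rounded to an integer."""
    return round(100 * carriers / patients)


# (carriers, patients) as reported in the text.
discovery = (5, 16)
replication = (9, 42)
pooled = (discovery[0] + replication[0], discovery[1] + replication[1])

for label, (carriers, patients) in [
    ("discovery", discovery),
    ("replication", replication),
    ("pooled", pooled),
]:
    print(f"{label}: {carriers}/{patients} = {carrier_percent(carriers, patients)}%")
```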
In a post hoc analysis, the PADI3 mutation frequency among 58 women of African ancestry who had CCCA (116 alleles) was found to differ significantly from that calculated for a control cohort of women of African ancestry (from the gnomAD V2.1 control set) according to the chi-square test (P=0.002) and Fisher’s exact test (P=0.006). The difference remained significant after adjustment for relatedness of persons according to the chi-square test (P=0.03) and Fisher’s exact test (P=0.04). We did not control for population stratification. However, the mutation frequency was similar across various African subpopulations (Table S5 in the Supplementary Appendix).