DEPARTMENT OF MOLECULAR BIOLOGY AND GENETICS
The Database of Arylamine N-Acetyltransferases (NATs)

Introduction

Arylamine
N-acetyltransferases (NATs, EC 2.3.1.5) are polymorphic enzymes
responsible for the inter-individual variability in the effects of arylamine
and arylhydrazine
drugs and carcinogens in human populations. Humans have two NAT isoenzymes,
encoded by polymorphic genes (NAT1 and NAT2)
on chromosome 8p22. A third inactive locus, the pseudogene NATP1, is located between NAT1 and
NAT2 in humans. Loci homologous to the human NAT genes have
been identified in several eukaryotic species (including protists,
fungi and animals, but not
plants), as well as in prokaryotes (bacteria and
archaea).
The pharmacogenetic and toxicogenetic significance of NAT is
well-established, and there is evidence that the NAT polymorphisms may affect
susceptibility to disease, especially cancer. Today, investigators employ the
NAT family as a model system to study enzymatic structure and function, gene
expression, population genetics, comparative genomics and evolution. NAT has
been investigated as a candidate pharmacological target in tuberculous
mycobacteria and as a putative biomarker in tumors responsive to steroid
hormones. NATs have also been investigated for their
functional variability in bacteria and fungi [References 1-35 for selected reviews].

NAT nomenclature

The
discovery of numerous polymorphic NAT alleles in human populations and
model organisms led to the introduction of a consensus nomenclature for NATs
in 1995 [36].
The NAT Gene Nomenclature Committee
was formed at the first International NAT Workshop that took place in
1998. The
NAT committee has published two nomenclature updates [42, 43]
to advise investigators as to the proper use of symbols for the NAT genes
and alleles. General instructions regarding the correct naming of genes are
available from the HUGO Gene
Nomenclature Committee (HGNC), which has approved NAT as
the official gene symbol for arylamine N-acetyltransferase. The basic rules
for naming NAT genes and alleles are described in [36, 42-46]
and outlined below:

· The
symbols of NAT genes and alleles in all species except rodents are all
uppercase (NAT). In rodents, only the first letter is uppercase,
followed by lowercase (Nat). Protein products are always all uppercase
(NAT, for rodent species too).

· Genes
and alleles are always italicized (NAT or Nat), while protein
products are not (NAT, for rodents and other species).

· The
nomenclature is species-specific. An official organism identification code
should precede the gene symbol [e.g. (MOUSE)Nat]. Please ask the manager of this website for
appropriate organism identification codes.

· For
the purpose of taxonomic classification, a unique identification number
should be provided for each species (e.g. 10090 for Mus musculus), but not
incorporated in the gene or allele symbol. These taxon IDs must be obtained from NCBI’s Taxonomy Database.

· Arabic
numerals placed immediately after the NAT symbol indicate different NAT
genes of the same organism [e.g. (RABIT)NAT1 and (RABIT)NAT2 are
two distinct (paralogous) genes of the rabbit, encoding two functionally
differentiated isoenzymes].

· Arabic
numerals separated from the gene symbol with an asterisk indicate different
alleles of the same NAT gene [e.g. (MACMU)NAT2*1 and (MACMU)NAT2*2
are two polymorphic alleles of the NAT2 gene of the Rhesus macaque
and they produce variants of the NAT2 isoenzyme]. The asterisk is replaced by
an underscore in the non-italicized symbol of the corresponding allozymes [i.e.
(MACMU)NAT2_1 and (MACMU)NAT2_2 are the protein variants produced by the polymorphic (MACMU)NAT2*1
and (MACMU)NAT2*2 alleles of the Rhesus NAT2 gene].

· When
more than one NAT locus is discovered in a specific genome, the
symbols NAT1, NAT2 etc. should be assigned hierarchically,
according to the deduced amino acid identity between each new sequence and a
NAT reference sequence. The reference sequences are: the NAT1 protein of Salmonella typhimurium LT2 (accession
no. BAA14331) for bacteria, the
deduced NAT1 protein of Halogeometricum
borinquense, strain DSM 11551 (accession no. BN001449) for
archaea, the NAT1 protein of Gibberella
moniliformis (accession no. EU552489) for fungi
and the NAT1 protein of Homo sapiens (accession
no. X17059) for animals. In the case
of protists, which constitute a highly divergent domain of eukaryotic life,
investigators are encouraged to contact the NAT committee for advice on
appropriate reference sequences [44]. For example,
a gene of the Rhesus macaque that encodes a protein with 94% identity to
human NAT1 is assigned symbol NAT1 and a second gene, whose product is
only 82% identical to human NAT1, is assigned symbol NAT2. If functional
data are available, they should be taken into account when allocating symbols
to new NAT genes, especially if the identity to the reference sequence
is not sufficiently informative. For example, rabbit NAT1 and NAT2 are both
75% identical to human NAT1, but studies have demonstrated that rabbit NAT1
and human NAT1 (as well as rabbit NAT2 and human NAT2) are functionally equivalent. The only exception to this rule is
rodents, where the Nat2 gene is functionally more similar to human
NAT1 and vice versa. Although confusing, the NAT nomenclature of
rodents is widely accepted by scientists in the field and currently represents
the consensus.

· In humans, the legacy reference
alleles/haplotypes of the NAT1 and NAT2 genes have historically been
designated symbols NAT1*4 and NAT2*4. Human haplotypes are
commonly grouped into specific allelic groups, based on shared signature SNPs
(e.g. all haplotypes belonging to the NAT2*5
allelic group share signature SNP c.341T>C and are classified as NAT2*5A, *5B, *5C etc.). For human SNPs, it is useful to also indicate the
official “rs” numbers identifying the polymorphism in the dbSNP database (e.g. SNP c.341T>C is identified
by rs1801280). However, please
note that, from March 2024, this legacy nomenclature has been discontinued
for human NAT2 alleles (but not for
human NAT1 alleles), as the
Pharmacogenetics Variation Consortium (PharmVar) launched a
new NAT2 webpage (https://www.pharmvar.org/gene/NAT2), with many
(but not all) NAT2 alleles now
transitioned into the new PharmVar nomenclature system. As a consequence, important changes have been introduced
to the legacy NAT2 nomenclature that should be adopted and implemented
from now on. The present page will remain active as a
record of the legacy NAT2 allele nomenclature
used in the earlier literature, but it will no longer be updated with new NAT2 alleles. New submissions of human NAT2 alleles should be directed to PharmVar (https://www.pharmvar.org/submission).

· In non-human species, the reference allele
of a NAT gene is assigned symbol NAT1*1. This is usually either
the wild type allele or the first allele identified for a specific organism.
The capital letters used to indicate NAT allelic groups in humans
(e.g. NAT2*5A, *5B, *5C etc.) should not be used in non-human NAT symbols,
even if two alleles share common SNPs (e.g. former rat alleles Nat2*21A and
Nat2*21B have now been discontinued and replaced by Nat2*2 and Nat2*3).

· SNPs
are not reported for the NAT genes of non-human species, unless they
are validated experimentally. Likewise, SNPs identified outside the open
reading frame of human or non-human NAT genes (e.g. in the promoter,
5′-/3′-untranslated regions or introns) are not reported, unless a
functional effect is demonstrated.

· To
add a non-human NAT gene to the database, the sequence of the open
reading frame and deduced protein product should be provided, together with
the official (Latin) name and taxon ID of the species. Additional
information, e.g. regarding the position of SNPs or non-coding exons, may
also accompany the submission. If available, previous
scientific literature relevant to the submitted sequences can be provided.

· When
reporting the position of SNPs, non-coding exons, transcriptional regulatory
elements etc. of NAT genes, the A of the ATG translation initiation
codon should always be considered as number 1. Upstream positions are designated
with negative numbers and downstream positions with positive numbers.

· Nucleotide
and amino acid variation must be reported according to the Human Genome Variation Society (HGVS)
nomenclature guidelines and recommendations (e.g. c.590G>A and p.Arg197Gln
for the nucleotide and corresponding amino acid substitution, respectively).

Scientists who wish to name new NAT sequences should follow the above rules and
contact the manager of this website, who will provide official symbols for new
NAT genes or alleles. Colleagues are encouraged to request official
symbols for NAT sequences prior to their publication in the scientific
literature, as well as to submit their gene-specific data to the NAT website
(see below), whenever they judge that this information can be made public.
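
As an illustration only, the symbol conventions listed above could be checked programmatically along the following lines. This is a minimal sketch, not a tool provided by the NAT committee: the function names (parse_symbol, allozyme_symbol, cdna_position) are hypothetical, the pattern assumes that organism codes consist of uppercase letters or digits in parentheses and that every symbol carries a gene number, and the absence of a position 0 follows the HGVS convention for coding DNA numbering rather than anything stated on this page. Official symbols are still assigned by the website manager.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for symbols such as "(MACMU)NAT2*1" or "(MOUSE)Nat2".
# Assumption: organism codes are uppercase letters/digits in parentheses.
_SYMBOL = re.compile(
    r"^(?:\((?P<organism>[A-Z0-9]+)\))?"  # optional organism code
    r"(?P<root>NAT|Nat)"                  # all uppercase, or rodent style
    r"(?P<gene>\d+)"                      # gene number (NAT1, NAT2, ...)
    r"(?:\*(?P<allele>\d+[A-Z]*))?$"      # optional allele after the asterisk
)


@dataclass(frozen=True)
class NatSymbol:
    organism: Optional[str]  # e.g. "MACMU", or None if no code was given
    root: str                # "NAT", or "Nat" for rodent genes/alleles
    gene: int                # 1 for NAT1, 2 for NAT2, ...
    allele: Optional[str]    # e.g. "1" or "5A", or None for a gene symbol


def parse_symbol(symbol: str) -> NatSymbol:
    """Split a gene or allele symbol into its parts."""
    m = _SYMBOL.match(symbol)
    if m is None:
        raise ValueError(f"not a recognised NAT symbol: {symbol!r}")
    return NatSymbol(m["organism"], m["root"], int(m["gene"]), m["allele"])


def allozyme_symbol(allele_symbol: str) -> str:
    """Derive the protein (allozyme) symbol from an allele symbol.

    The asterisk becomes an underscore, and the protein symbol is
    all uppercase even for rodent species.
    """
    s = parse_symbol(allele_symbol)
    if s.allele is None:
        raise ValueError(f"{allele_symbol!r} is a gene symbol, not an allele")
    prefix = f"({s.organism})" if s.organism else ""
    return f"{prefix}NAT{s.gene}_{s.allele}"


def cdna_position(index0: int, atg_index0: int) -> int:
    """Convert a 0-based sequence index to a coding position.

    The A of the ATG start codon is position 1; the base immediately
    upstream is -1 (no position 0, per HGVS numbering).
    """
    offset = index0 - atg_index0
    return offset + 1 if offset >= 0 else offset


if __name__ == "__main__":
    print(allozyme_symbol("(MACMU)NAT2*1"))
    print(cdna_position(9, 10), cdna_position(10, 10))
```

The sketch deliberately accepts the capital-letter suffixes of human allelic groups (e.g. NAT2*5A) without checking the species, so it does not enforce the rule that such letters are not used in non-human symbols.
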
Release of gene-specific data on the NAT website does not preclude its
submission to central sequence repositories, such as the ENA/GenBank/DDBJ databases, which is encouraged.

The NAT website

An
official website, launched and maintained by Dr. D.W. Hein at the University
of Louisville, was created by the NAT
Gene Nomenclature Committee after the 1998 NAT workshop and contained
information relevant to the consensus nomenclature of all NAT genes
and alleles in humans and other organisms [37, 42]. At the
2007 NAT workshop, it was agreed that a second website, launched and maintained by Dr. S. Boukouvala at the University of
Thrace [39, 43, 45], would be dedicated to the
nomenclature of non-human NAT genes. With the number of NAT-homologous
genes identified in sequenced prokaryotic and eukaryotic genomes increasing
day after day, these databases were intended as a useful resource
for investigators who wish to study the genetic, evolutionary and functional
diversity of NAT homologues. At the
2010 NAT workshop, it was decided that the two websites would be consolidated
into a single website.

Please direct
all requests for new NAT gene or
allelic symbols to Dr. S. Boukouvala (sboukouv@mbg.duth.gr), apart from new submissions of human NAT2 alleles which should be directed to
PharmVar instead (https://www.pharmvar.org/submission).

The
NAT Gene Nomenclature Committee:
Created & maintained by: Dr. Sotiria Boukouvala
Contact address: Department of Molecular Biology and Genetics
Tel.: +30-25510-30632
Fax: +30-25510-30632

Special thanks to: Eirini Vagena, Vasiliki Garefalaki, Georgia Papanikolaou, Maria-Aggeliki Tsatiri and Dimitra Basdani for help with collection, annotation and presentation of the data.