The 20 Amino Acids

These 20 standard amino acids combine in a practically unlimited number of ways to build the proteins of every living organism on Earth.

The Mathematics of Protein Diversity

Theoretical Combinations

For a protein of length n:

20ⁿ possible sequences

10 amino acids: 20¹⁰ = 10,240,000,000,000 (about 10 trillion)

100 amino acids: 20¹⁰⁰ ≈ 10¹³⁰ combinations

300 amino acids (average protein): 20³⁰⁰ ≈ 10³⁹⁰ combinations

For perspective: There are only about 10⁸⁰ atoms in the observable universe. The number of possible 300-amino-acid proteins (about 10³⁹⁰) is larger by a factor of roughly 10³¹⁰.
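
The figures above can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and not part of any library. It computes the exact count for short chains and the base-10 exponent for long ones, where the exponent is simply n × log₁₀(20).

```python
import math

AMINO_ACIDS = 20  # size of the standard amino acid alphabet


def sequence_count(length: int) -> int:
    """Exact number of distinct sequences of the given length (20**length)."""
    return AMINO_ACIDS ** length


def log10_sequence_count(length: int) -> float:
    """Base-10 exponent of 20**length, computed as length * log10(20)."""
    return length * math.log10(AMINO_ACIDS)


print(sequence_count(10))  # 10240000000000
for n in (100, 300):
    print(f"n={n}: about 10^{log10_sequence_count(n):.1f} sequences")
```

Because log₁₀(20) is about 1.301, each additional residue multiplies the sequence space by 20 and adds roughly 1.3 to the exponent.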

Discovered Proteins

Proteins in UniProt (reviewed): ~500,000+

Total protein sequences in databases: ~250 million+

Structures predicted by AlphaFold: ~200 million

Human proteome: ~20,000 protein-coding genes producing ~100,000+ distinct proteins through alternative splicing and modifications.

Why Do So Few Proteins Exist?

Evolution

Only proteins that provide survival advantages are selected. Most random sequences would not fold properly or have useful functions.

Folding Constraints

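By one widely cited but contested estimate, only ~1 in 10⁷⁷ random sequences folds into a stable, functional structure; other experiments suggest that functional sequences are considerably more common. Either way, most random sequences would misfold or aggregate.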

Protein Families

Proteins evolved from common ancestors. There are only ~2,000 distinct protein fold families, with variations built on these templates.

Amino Acid Categories

Nonpolar (7)
Aromatic (3)
Polar Uncharged (5)
Acidic - Negative (2)
Basic - Positive (3)

Detailed Amino Acid Profiles

Nonpolar, 7.2% abundance

Glycine

G
Gly
Formula: C₂H₅NO₂
MW: 75.07 Da
pI: 5.97
Hydropathy: -0.4
Side Chain (R group): H (hydrogen)

Smallest amino acid. Highly flexible because its side chain is a single hydrogen atom. Found in collagen at every third residue.

Nonpolar, 8.3% abundance

Alanine

A
Ala
Formula: C₃H₇NO₂
MW: 89.09 Da
pI: 6.00
Hydropathy: +1.8
Side Chain (R group): -CH₃ (methyl)

Simple, small hydrophobic residue. One of the most common amino acids. Important in energy metabolism.

Nonpolar, 6.6% abundance

Valine

V
Val
Formula: C₅H₁₁NO₂
MW: 117.15 Da
pI: 5.96
Hydropathy: +4.2
Side Chain (R group): -CH(CH₃)₂ (isopropyl)

Branched-chain amino acid (BCAA). Essential amino acid. Important for muscle metabolism.

Nonpolar, 9.7% abundance

Leucine

L
Leu
Formula: C₆H₁₃NO₂
MW: 131.17 Da
pI: 5.98
Hydropathy: +3.8
Side Chain (R group): -CH₂CH(CH₃)₂ (isobutyl)

Branched-chain amino acid. Most abundant essential amino acid. Key regulator of protein synthesis (mTOR).

Nonpolar, 5.3% abundance

Isoleucine

I
Ile
Formula: C₆H₁₃NO₂
MW: 131.17 Da
pI: 6.02
Hydropathy: +4.5
Side Chain (R group): -CH(CH₃)CH₂CH₃ (sec-butyl)

Branched-chain amino acid. Essential. Important for hemoglobin synthesis and blood sugar regulation.

Nonpolar, 5.1% abundance

Proline

P
Pro
Formula: C₅H₉NO₂
MW: 115.13 Da
pI: 6.30
Hydropathy: -1.6
Side Chain (R group): -(CH₂)₃- (cyclic)

Unique cyclic structure. Introduces kinks in protein chains. Abundant in collagen. Disrupts secondary structures.

Aromatic, 3.9% abundance

Phenylalanine

F
Phe
Formula: C₉H₁₁NO₂
MW: 165.19 Da
pI: 5.48
Hydropathy: +2.8
Side Chain (R group): -CH₂-C₆H₅ (benzyl)

Essential aromatic amino acid. Precursor to tyrosine. People with PKU cannot metabolize it properly.

Aromatic, 2.9% abundance

Tyrosine

Y
Tyr
Formula: C₉H₁₁NO₃
MW: 181.19 Da
pI: 5.66
Hydropathy: -1.3
Side Chain (R group): -CH₂-C₆H₄-OH (p-hydroxybenzyl)

Precursor to dopamine, epinephrine, thyroid hormones. Can be phosphorylated for cell signaling.

Aromatic, 1.1% abundance

Tryptophan

W
Trp
Formula: C₁₁H₁₂N₂O₂
MW: 204.23 Da
pI: 5.89
Hydropathy: -0.9
Side Chain (R group): -CH₂-indole

Largest amino acid. Essential. Precursor to serotonin and melatonin. Rarest amino acid in proteins.

Polar, 6.9% abundance

Serine

S
Ser
Formula: C₃H₇NO₃
MW: 105.09 Da
pI: 5.68
Hydropathy: -0.8
Side Chain (R group): -CH₂OH (hydroxymethyl)

Polar, can be phosphorylated. Important in enzyme active sites. Precursor to glycine and cysteine.

Polar, 5.4% abundance

Threonine

T
Thr
Formula: C₄H₉NO₃
MW: 119.12 Da
pI: 5.60
Hydropathy: -0.7
Side Chain (R group): -CH(OH)CH₃ (1-hydroxyethyl)

Essential amino acid. Can be phosphorylated. Important for collagen, elastin, and tooth enamel.

Polar, 1.4% abundance

Cysteine

C
Cys
Formula: C₃H₇NO₂S
MW: 121.16 Da
pI: 5.07
Hydropathy: +2.5
Side Chain (R group): -CH₂SH (thiol)

Contains sulfur. Forms disulfide bonds critical for protein structure. Component of the antioxidant glutathione.

Nonpolar, 2.4% abundance

Methionine

M
Met
Formula: C₅H₁₁NO₂S
MW: 149.21 Da
pI: 5.74
Hydropathy: +1.9
Side Chain (R group): -CH₂CH₂SCH₃ (methylthioethyl)

Essential. Start codon (AUG) codes for Met. Important for methylation reactions. Contains sulfur.

Polar, 4.0% abundance

Asparagine

N
Asn
Formula: C₄H₈N₂O₃
MW: 132.12 Da
pI: 5.41
Hydropathy: -3.5
Side Chain (R group): -CH₂CONH₂ (carbamoylmethyl)

Amide of aspartic acid. Common site for N-linked glycosylation. Important for protein folding.

Polar, 3.9% abundance

Glutamine

Q
Gln
Formula: C₅H₁₀N₂O₃
MW: 146.15 Da
pI: 5.65
Hydropathy: -3.5
Side Chain (R group): -CH₂CH₂CONH₂

Most abundant amino acid in blood. Important nitrogen carrier. Fuel for rapidly dividing cells.

Acidic, 5.3% abundance

Aspartic Acid

D
Asp
Formula: C₄H₇NO₄
MW: 133.10 Da
pI: 2.77
Hydropathy: -3.5
Side Chain (R group): -CH₂COOH (carboxymethyl)

Negatively charged at pH 7. Important in enzyme active sites. Precursor to other amino acids.

Acidic, 6.2% abundance

Glutamic Acid

E
Glu
Formula: C₅H₉NO₄
MW: 147.13 Da
pI: 3.22
Hydropathy: -3.5
Side Chain (R group): -CH₂CH₂COOH

Negatively charged. Major excitatory neurotransmitter. Used as flavor enhancer (MSG).

Basic, 5.7% abundance

Lysine

K
Lys
Formula: C₆H₁₄N₂O₂
MW: 146.19 Da
pI: 9.74
Hydropathy: -3.9
Side Chain (R group): -(CH₂)₄NH₂ (aminobutyl)

Essential. Positively charged. Important for collagen crosslinking. Can be acetylated/methylated (histones).

Basic, 5.7% abundance

Arginine

R
Arg
Formula: C₆H₁₄N₄O₂
MW: 174.20 Da
pI: 10.76
Hydropathy: -4.5
Side Chain (R group): -(CH₂)₃NHC(=NH)NH₂ (guanidino)

Positively charged. Precursor to nitric oxide. Important in immune function and wound healing.

Basic, 2.3% abundance

Histidine

H
His
Formula: C₆H₉N₃O₂
MW: 155.16 Da
pI: 7.59
Hydropathy: -3.2
Side Chain (R group): -CH₂-imidazole

Side-chain pKa (about 6) is near physiological pH, so it acts as a buffer. Essential amino acid (once thought to be essential only for infants). Found in the heme-binding site of hemoglobin.
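
The hydropathy and MW values in the profiles above can be combined into two simple per-sequence quantities. The sketch below assumes that the hydropathy numbers are on the Kyte-Doolittle scale (the listed values match it) and that a peptide's mass is the sum of its free amino acid masses minus one water (about 18.015 Da) per peptide bond. The function names `gravy` and `peptide_mass` are illustrative and not taken from any library.

```python
# Hydropathy values and free amino acid masses (Da), copied from the
# profiles above and keyed by one-letter code.
HYDROPATHY = {
    "G": -0.4, "A": 1.8, "V": 4.2, "L": 3.8, "I": 4.5, "P": -1.6,
    "F": 2.8, "Y": -1.3, "W": -0.9, "S": -0.8, "T": -0.7, "C": 2.5,
    "M": 1.9, "N": -3.5, "Q": -3.5, "D": -3.5, "E": -3.5,
    "K": -3.9, "R": -4.5, "H": -3.2,
}
MASS = {
    "G": 75.07, "A": 89.09, "V": 117.15, "L": 131.17, "I": 131.17,
    "P": 115.13, "F": 165.19, "Y": 181.19, "W": 204.23, "S": 105.09,
    "T": 119.12, "C": 121.16, "M": 149.21, "N": 132.12, "Q": 146.15,
    "D": 133.10, "E": 147.13, "K": 146.19, "R": 174.20, "H": 155.16,
}
WATER = 18.015  # Da released for each peptide bond formed


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy over all residues."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(HYDROPATHY[aa] for aa in sequence) / len(sequence)


def peptide_mass(sequence: str) -> float:
    """Approximate mass (Da) of a linear peptide built from the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = sum(MASS[aa] for aa in sequence)
    return total - (len(sequence) - 1) * WATER


print(round(gravy("GAVL"), 2))       # 2.35 (hydrophobic on average)
print(round(peptide_mass("GG"), 3))  # 132.125 (glycylglycine)
```

A positive average indicates a mostly hydrophobic sequence and a negative one a mostly hydrophilic sequence. Both functions raise `KeyError` for letters outside the 20 standard one-letter codes.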

Essential vs Non-Essential

Essential Amino Acids (9)

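Must be obtained from the diet because the body cannot synthesize them.

His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Val

Mnemonic: "PVT TIM HaLL": Phe, Val, Thr, Trp, Ile, Met, His, Arg, Leu, Lys. The lowercase "a" marks arginine, which is only conditionally essential and is therefore not among the nine listed above.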

Non-Essential Amino Acids (11)

The body can synthesize these from other molecules.

Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Pro, Ser, Tyr

Note: Arginine, Cysteine, Glutamine, Tyrosine, Glycine, and Proline are "conditionally essential" during illness or stress.
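
As a small illustration, the two groups can be written as sets of one-letter codes and used to measure how much of a sequence consists of essential residues. This is a sketch only; `essential_fraction` is a hypothetical helper name, and conditionally essential residues are counted as non-essential here.

```python
# One-letter codes for the two groups listed above.
ESSENTIAL = frozenset("HILKMFTWV")        # His Ile Leu Lys Met Phe Thr Trp Val
NON_ESSENTIAL = frozenset("ARNDCQEGPSY")  # Ala Arg Asn Asp Cys Gln Glu Gly Pro Ser Tyr


def essential_fraction(sequence: str) -> float:
    """Fraction of residues in the sequence that are essential amino acids."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    unknown = set(sequence) - ESSENTIAL - NON_ESSENTIAL
    if unknown:
        raise ValueError(f"unknown residue codes: {sorted(unknown)}")
    return sum(aa in ESSENTIAL for aa in sequence) / len(sequence)


print(essential_fraction("MKWA"))  # 0.75 (Met, Lys, Trp are essential; Ala is not)
```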

The Genetic Code

Each amino acid is encoded by one or more three-nucleotide sequences (codons) in mRNA. This redundancy helps protect against mutations.

Amino Acid | 1-Letter | 3-Letter | Codons | # Codons
Alanine | A | Ala | GCU, GCC, GCA, GCG | 4
Arginine | R | Arg | CGU, CGC, CGA, CGG, AGA, AGG | 6
Asparagine | N | Asn | AAU, AAC | 2
Aspartic Acid | D | Asp | GAU, GAC | 2
Cysteine | C | Cys | UGU, UGC | 2
Glutamic Acid | E | Glu | GAA, GAG | 2
Glutamine | Q | Gln | CAA, CAG | 2
Glycine | G | Gly | GGU, GGC, GGA, GGG | 4
Histidine | H | His | CAU, CAC | 2
Isoleucine | I | Ile | AUU, AUC, AUA | 3
Leucine | L | Leu | UUA, UUG, CUU, CUC, CUA, CUG | 6
Lysine | K | Lys | AAA, AAG | 2
Methionine | M | Met | AUG (Start) | 1
Phenylalanine | F | Phe | UUU, UUC | 2
Proline | P | Pro | CCU, CCC, CCA, CCG | 4
Serine | S | Ser | UCU, UCC, UCA, UCG, AGU, AGC | 6
Threonine | T | Thr | ACU, ACC, ACA, ACG | 4
Tryptophan | W | Trp | UGG | 1
Tyrosine | Y | Tyr | UAU, UAC | 2
Valine | V | Val | GUU, GUC, GUA, GUG | 4
Stop Codons | * | Stop | UAA, UAG, UGA | 3

Total: 64 codons, of which 61 encode the 20 amino acids and 3 are stop signals. Because 61 codons map onto only 20 amino acids, the genetic code is redundant.
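
The codon table can be turned into a lookup dictionary to show how an mRNA sequence is read three nucleotides at a time. The sketch below is a simplified model, not a full description of translation: it assumes reading starts at the first AUG and stops at the first in-frame stop codon, and the `translate` function is illustrative rather than a library API.

```python
from collections import Counter

# Codons grouped by one-letter residue code, as listed in the table above.
# "*" stands for the three stop codons.
CODONS_BY_RESIDUE = {
    "A": "GCU GCC GCA GCG",
    "R": "CGU CGC CGA CGG AGA AGG",
    "N": "AAU AAC",
    "D": "GAU GAC",
    "C": "UGU UGC",
    "E": "GAA GAG",
    "Q": "CAA CAG",
    "G": "GGU GGC GGA GGG",
    "H": "CAU CAC",
    "I": "AUU AUC AUA",
    "L": "UUA UUG CUU CUC CUA CUG",
    "K": "AAA AAG",
    "M": "AUG",
    "F": "UUU UUC",
    "P": "CCU CCC CCA CCG",
    "S": "UCU UCC UCA UCG AGU AGC",
    "T": "ACU ACC ACA ACG",
    "W": "UGG",
    "Y": "UAU UAC",
    "V": "GUU GUC GUA GUG",
    "*": "UAA UAG UGA",
}

# Invert the grouping into a codon -> residue lookup (64 entries).
CODON_TABLE = {
    codon: residue
    for residue, codons in CODONS_BY_RESIDUE.items()
    for codon in codons.split()
}

# Number of codons per residue, i.e. the "# Codons" column.
DEGENERACY = Counter(CODON_TABLE.values())


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(len(CODON_TABLE))               # 64
print(translate("CCAUGGCUUGGUAACC"))  # MAW
```

Grouping codons by residue mirrors the table layout and makes the redundancy visible: `DEGENERACY` reports 6 codons for leucine but only 1 for tryptophan. A trailing partial codon is ignored, and characters other than A, C, G, and U raise `KeyError`.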
