Species Selection for Toxicology Studies: ICH S6 and S9 Guide

Species selection is one of those decisions that feels straightforward until you realize the downstream consequences of getting it wrong. Pick the wrong species for your tox studies and you've generated data that FDA considers irrelevant. Not useless in a scientific sense — irrelevant in a regulatory sense. Which means you repeat the study in the right species, on the CRO's timeline, with your runway burning.

I've spent a significant portion of my career thinking about pharmacology-toxicology relevance. The technical question — which animal model best predicts human response — is fascinating. The regulatory question is more practical: which species does ICH say you need, and what happens if your data doesn't come from that species?

The answer depends on what you're developing.

Small molecules: ICH M3(R2) default

For small molecules, ICH M3(R2) requires repeat-dose toxicity studies in two mammalian species: one rodent, one non-rodent. The default pairing is rat and dog. It has been the default for decades, and unless you have a specific reason to deviate, it's what FDA expects.

Rat: Sprague-Dawley is the standard strain for most programs. Wistar is also acceptable. Rats are well-characterized, relatively inexpensive, and there's an enormous historical database for comparison. Clinical pathology reference ranges, spontaneous tumor rates, organ weight ratios — all available from the CRO for the specific strain and age group.

Dog: Beagle, almost without exception. There's something slightly absurd about this if you think about it too long — the beagle was selected decades ago for practical reasons (good size for serial blood sampling, cooperative temperament, big historical control database), not because beagles are metabolically similar to humans. They're not, particularly for CYP2D6-mediated pathways. But the regulatory infrastructure is built around beagle data, FDA reviewers are calibrated to interpret it, and switching to a different breed would mean starting the historical database from scratch. So beagles it is. Path dependence in action.

When you might deviate:

Minipig instead of dog: Göttingen minipigs are increasingly used, particularly for dermal studies (skin physiology more similar to humans than dog), and for drugs where canine metabolism is known to be non-representative. FDA accepts minipig data, but discuss it in the pre-IND meeting. Some review divisions prefer dog.
Non-human primate instead of dog: For certain therapeutic areas — particularly CNS — NHP may be more relevant. But NHP studies are expensive ($500K-$1M+), ethically complex, and subject to increasing scrutiny. Use NHP only when scientific justification is strong.

For small molecules, the species selection rationale in your Nonclinical Overview should be straightforward. State the species, cite M3(R2), provide any relevant metabolic comparison data (in vitro metabolic profiling across species is standard practice), and move on.

Biologics: ICH S6(R1) — the relevance requirement

This is where species selection gets complicated, and where I've seen the most costly mistakes.

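ICH S6(R1) governs nonclinical evaluation of biotechnology-derived pharmaceuticals: monoclonal antibodies, fusion proteins, recombinant proteins, and similar products. (Gene and cell therapies are outside its formal scope, though the same relevance principle is usually applied to them.) The key principle: use a pharmacologically relevant species.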

A pharmacologically relevant species is one in which the test article is pharmacologically active. For a monoclonal antibody targeting human PD-1, you need a species where the antibody binds to the homologous receptor and produces a pharmacological response. If your antibody doesn't bind to rat PD-1 (and it probably doesn't — species cross-reactivity varies widely for biologics), then conducting rat tox studies generates safety data from an animal that wasn't actually exposed to your drug's pharmacological effects. That data is scientifically misleading and regulatorily useless.

Finding the relevant species

The relevance determination typically involves:

Target homology: Sequence comparison of the target protein across species. High sequence homology suggests the biologic will bind the target in that species.
Binding affinity: In vitro binding studies (SPR, ELISA, flow cytometry) measuring affinity of the biologic for the target in multiple species. You want quantitative Kd values, not just "positive" or "negative."
Functional activity: In vitro functional assays demonstrating pharmacological activity in the candidate species. Binding isn't enough — the biologic should produce a measurable downstream effect.
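Tissue cross-reactivity: Immunohistochemistry study showing where the biologic binds across a tissue panel. Its main job is identifying potential off-target binding in human tissues. The S6(R1) addendum treats cross-reactivity in animal tissues as being of limited value for species selection, so use it as supporting evidence rather than as the basis for the choice.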

For most monoclonal antibodies, the result of this analysis is: cynomolgus monkey is the relevant species, and rodents are not. This is why so many biologic IND programs rely on cynomolgus monkey as the single toxicology species.
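The screening logic above can be sketched as a small triage function. This is only an illustration: the `SpeciesData` fields, the 10-fold affinity window, and the example Kd values are all hypothetical assumptions, not criteria from ICH S6(R1), which leaves the relevance call to scientific judgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeciesData:
    """Hypothetical record of in vitro relevance data for one species."""
    name: str
    kd_nM: Optional[float]      # measured binding affinity; None = no detectable binding
    functionally_active: bool   # downstream effect seen in a functional assay


def is_relevant(candidate, human_kd_nM, max_fold_shift=10.0):
    """Flag a species as pharmacologically relevant.

    Assumption for illustration: relevance needs measurable binding within
    `max_fold_shift` of the human Kd AND functional activity. The fold
    window is a placeholder, not a regulatory threshold.
    """
    if candidate.kd_nM is None:
        return False
    fold_shift = candidate.kd_nM / human_kd_nM
    return fold_shift <= max_fold_shift and candidate.functionally_active


def relevant_species(panel, human_kd_nM):
    """Return the names of species that pass the screen, in panel order."""
    return [s.name for s in panel if is_relevant(s, human_kd_nM)]


# Made-up panel for an antibody with a 0.5 nM Kd against the human target.
panel = [
    SpeciesData("cynomolgus monkey", kd_nM=0.8, functionally_active=True),
    SpeciesData("rat", kd_nM=None, functionally_active=False),
    SpeciesData("mouse", kd_nM=450.0, functionally_active=False),
]
print(relevant_species(panel, human_kd_nM=0.5))  # ['cynomolgus monkey']
```

Keeping quantitative Kd values in the record, rather than a yes/no flag, is the point: the same table that drives the decision is the one the reviewer will want to see.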

One species or two?

ICH S6(R1) explicitly states that if only one species is pharmacologically relevant, toxicology studies in one species are acceptable. This is a major departure from the two-species requirement for small molecules under M3(R2).

But you must justify the single-species approach with actual data. Not a paragraph saying "we evaluated multiple species." The justification needs species evaluation data — binding, functional activity, tissue cross-reactivity — and a clear explanation of why the second species isn't relevant. I've reviewed submissions where the justification was essentially "our antibody doesn't bind rat PD-1, so we only used cynomolgus." That's the right conclusion, but it was stated without the supporting binding data. The reviewer sent questions. Three weeks lost because somebody skipped a table.

If two species are relevant, use both. FDA prefers two species when the data supports it.

The surrogate molecule trap

When your biologic doesn't cross-react with the target in any animal species, one option is to develop a surrogate molecule: a species-specific version of your biologic that binds the homologous target in the test species. For example, an anti-mouse PD-1 antibody used in mice as a surrogate for your anti-human PD-1 clinical candidate.

Surrogates have been used successfully, but they come with significant limitations:

The surrogate is a different molecule with potentially different pharmacological properties
PK/TK comparison between the surrogate and the clinical candidate is difficult
Manufacturing the surrogate adds time and cost
FDA's confidence in surrogate data is generally lower than its confidence in data from the clinical candidate tested in a relevant species

ICH S6(R1) acknowledges surrogates as an option but notes their limitations. Use surrogates when there's no relevant species. Don't use them when a relevant species exists — it's more work for lower-confidence data.

Transgenic models

Another option: genetically modified animals expressing the human target. Transgenic mice expressing human PD-1, for example. These models are available for some targets. They allow testing of the actual clinical candidate rather than a surrogate.

The challenge: the transgenic animal's immune system and physiology are otherwise murine. The pharmacological context may not fully recapitulate human biology. And the historical database for toxicology endpoints in transgenic models is limited compared to conventional strains.

Discuss the use of transgenic models with FDA before committing. The pre-IND meeting is the right venue for this conversation.
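The branching across the last three sections (two relevant species, one, or none) can be summarized in a few lines. This is a rough sketch of the reasoning in the text, not a rule from the guideline; the function name and the wording of the returned strings are my own, and real programs settle this with the agency.

```python
def species_strategy(relevant_species):
    """Map the list of pharmacologically relevant species to an approach.

    A simplification for illustration: the actual decision also weighs
    study duration, modality, and agency feedback.
    """
    n = len(relevant_species)
    if n == 0:
        return "no relevant species: consider a surrogate molecule or transgenic model"
    if n == 1:
        return f"single-species program in {relevant_species[0]}, justified with data"
    return "two-species program in " + " and ".join(relevant_species[:2])


print(species_strategy(["cynomolgus monkey"]))
print(species_strategy([]))
```

The empty-list branch is the expensive one, which is why it pays to find out which branch you are on as early as possible.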

Oncology: ICH S9 flexibility

ICH S9 covers nonclinical evaluation for anticancer pharmaceuticals. It provides significant flexibility compared to M3(R2), reflecting the benefit-risk calculus for patients with life-threatening diseases.

Key species selection considerations under S9:

Reduced species requirements. For cytotoxic drugs with well-understood mechanisms, toxicology in one species (typically rodent) may be sufficient for Phase 1 in patients with advanced disease. This is a meaningful reduction — saving $500K+ and 6-12 months compared to a two-species program.

Pharmacologically relevant species requirement still applies. For targeted therapies (kinase inhibitors, monoclonal antibodies, ADCs), the species must express the target. This follows the same logic as S6(R1): data from a species where the drug isn't pharmacologically active doesn't tell you about target-mediated toxicity.

Genotoxicity flexibility. Under S9, genotoxicity studies are not considered essential to support clinical trials in patients with advanced cancer; they are expected later, to support marketing. Part of the rationale: cytotoxic drugs are expected to be genotoxic (they kill dividing cells), and the benefit-risk for patients with advanced cancer supports deferral. For small molecules developed outside that population, the standard genotoxicity battery applies.

Safety pharmacology flexibility. S9 does not call for standalone safety pharmacology studies to support trials in patients with advanced cancer, provided vital organ function (cardiovascular, respiratory, CNS) is assessed before the clinical study, for example within the general toxicology studies. If a specific concern has been identified, the relevant ICH S7A and/or S7B studies should be considered.

I should note: ICH S9 flexibility applies specifically to studies supporting Phase 1 trials in patients with advanced malignancies. If your oncology program includes healthy volunteer studies (some do, particularly for certain targeted therapies), the standard M3(R2) requirements apply. The patient population determines the regulatory framework, not the drug's intended therapeutic area.
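To make the "population, not therapeutic area" point concrete, here is a minimal sketch of how the framing guideline falls out of two inputs. The function name and modality labels are hypothetical, and the mapping is deliberately simplified: it ignores ADCs, gene therapies, and every program-specific nuance.

```python
def applicable_framework(modality, advanced_cancer_patients):
    """Return the ICH guidelines that frame species selection.

    modality: "small molecule" or "biologic" (illustrative labels only).
    advanced_cancer_patients: True if the trial enrolls patients with
    advanced malignancies rather than healthy volunteers.
    """
    if modality not in ("small molecule", "biologic"):
        raise ValueError(f"unknown modality: {modality!r}")
    guidelines = []
    if advanced_cancer_patients:
        guidelines.append("ICH S9")
    elif modality == "small molecule":
        guidelines.append("ICH M3(R2)")
    if modality == "biologic":
        # The pharmacological relevance principle applies in either population.
        guidelines.append("ICH S6(R1)")
    return guidelines


# Same oncology kinase inhibitor, two different Phase 1 populations.
print(applicable_framework("small molecule", advanced_cancer_patients=True))   # ['ICH S9']
print(applicable_framework("small molecule", advanced_cancer_patients=False))  # ['ICH M3(R2)']
```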

ADCs: a special case

Antibody-drug conjugates combine a biologic component (the antibody) with a small molecule component (the cytotoxic payload). Species selection needs to account for both:

The antibody component: pharmacological relevance per ICH S6(R1) — does the antibody bind the target in the test species?
The payload: systemic toxicity from the small molecule — which may have species-specific metabolism

This frequently results in different species for different studies. NHP for the intact ADC (based on antibody-target binding), and rat for the free payload (based on metabolic relevance). The resulting data package is complex to interpret, and the nonclinical written summary needs to clearly integrate findings from both approaches.

Gene therapies and cell therapies

Gene and cell therapies fall outside the formal scope of ICH S6(R1), but species selection for them follows the same general principle of pharmacological relevance, with additional complexity:

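Viral vector tropism: AAV capsids have species-specific, and sometimes strain-specific, tissue tropism. The engineered capsid AAV-PHP.B, for example, crosses the blood-brain barrier efficiently in C57BL/6 mice but not in NHP or in some other mouse strains, and natural serotypes such as AAV9 also differ across species in how well they transduce the CNS. If your program relies on CNS delivery, rodent data may not predict human biodistribution.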

Immune response: Animals may mount immune responses to human transgene products. This complicates interpretation of repeat-dose toxicity studies and may limit study duration.

Species-specific biology: For cell therapies, the human cells may not survive or function in animal models because of immune rejection, unless immunocompromised or humanized mice are used.

These are areas where FDA guidance is still evolving. The pre-IND meeting is essential for gene and cell therapy programs. Ask FDA specifically about their species expectations.

The species selection document

Include a dedicated species selection rationale in your IND. It should appear in the Nonclinical Overview (Module 2.4) and include:

Drug modality — small molecule, biologic, ADC, gene therapy, etc.
Applicable ICH guideline — M3(R2), S6(R1), S9, or combination
Species evaluated — list all species considered
Relevance data — binding affinity, functional activity, metabolic profiling, tissue cross-reactivity (as applicable)
Selected species — with justification
Rejected species — with explanation of why they're not relevant

This section doesn't need to be long — 2-4 pages is typical. But it needs to be complete. A reviewer who doesn't understand your species choice will question all of your toxicology data. Getting this right up front prevents questions downstream.
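One low-tech way to enforce completeness is to treat the six items above as a checklist and refuse to call the draft done while any is empty. The sketch below assumes the rationale is tracked as a plain dictionary; the key names and the example draft are hypothetical, not a CTD-defined structure.

```python
REQUIRED_SECTIONS = [
    "drug_modality",
    "applicable_guidelines",
    "species_evaluated",
    "relevance_data",
    "selected_species",
    "rejected_species",
]


def missing_sections(rationale):
    """Return the required sections that are absent or empty, in checklist order."""
    return [key for key in REQUIRED_SECTIONS if not rationale.get(key)]


# Made-up draft: the conclusion is there, the supporting data table is not.
draft = {
    "drug_modality": "monoclonal antibody",
    "applicable_guidelines": ["ICH S6(R1)"],
    "species_evaluated": ["rat", "cynomolgus monkey"],
    "relevance_data": [],
    "selected_species": {"cynomolgus monkey": "binds target, functionally active"},
    "rejected_species": {"rat": "no detectable binding"},
}
print(missing_sections(draft))  # ['relevance_data']
```

This is exactly the failure mode from the single-species anecdote earlier: the right conclusion, submitted without the table behind it.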

Mistakes I keep seeing

The one that costs the most money: defaulting to rat/dog for biologics without checking cross-reactivity first. I've watched — well, "watched" is too strong, I've heard about after the fact — companies that ran full rat and dog tox programs for monoclonal antibodies that don't cross-react with the target in either species. $600K+ in studies that FDA will consider irrelevant. The data isn't bad. It's just not evidence of what FDA needs evidence of.

Test cross-reactivity during lead optimization, not after candidate selection. If you discover at the IND-enabling stage that your antibody doesn't bind the cynomolgus monkey target, you've just added 6+ months for surrogate development. That's a discovery you want to make when it's cheap.

The NHP assumption is related. Not every biologic needs cynomolgus monkey. If your molecule cross-reacts in rabbit or pig, that may work — particularly for shorter studies. NHP tox runs $500K-$1M per study and carries ethical weight that sponsors increasingly have to justify to their own teams, not just to FDA. I think there's a broader conversation happening in the field about NHP use that goes beyond any individual IND program, but that's... honestly a topic for a different article. Or a different decade.

One more: make sure your PK characterization species matches your tox species. I've seen PK data generated in rats paired with tox data from dogs, with no PK in dogs. You can't relate exposure to toxicity if the data comes from different species. The reviewer will ask, and the answer is always embarrassing.
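The PK/tox mismatch is easy to catch mechanically before the package goes out. A minimal sketch, assuming you keep simple lists of the species used in each study type (the function name is my own):

```python
def tox_species_without_pk(tox_species, pk_species):
    """Return, sorted, the tox species for which no PK data exists."""
    return sorted(set(tox_species) - set(pk_species))


# The scenario from the text: PK in rat only, tox in rat and dog.
print(tox_species_without_pk(["rat", "dog"], ["rat"]))  # ['dog']
```

An empty result means every tox species has exposure data to relate findings to. Anything else is a gap to close before the reviewer finds it.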

Pick the right species early. Get the binding data. Discuss it at the pre-IND meeting. Then run the studies.

Have you confirmed species relevance with binding data before committing to your tox program? If not, that's probably worth doing before you schedule anything at a CRO.
