Salem Witch Trials Model

Witch Hunt

The Salem witch trails provide a horrifying example of the effects of mass hysteria. In 1692, twenty unfortunate souls in Salem, Massachusetts were accused of witchcraft, convicted and hanged. This was the subject of Arthur Miller’s famous play “The Crucible”. This subject is typically studied in english or history class, but I hope to demonstrate in this post it also can provide an example of mathematical modelling.

Using a mathematical model to understand this phenomena allows us to get at the core of the driving dynamics. Why did the witch trials in Salem spiral out of control? What unique factors in Salem may have contributed to the hysteria? How are witch trials similar to other phenomena like the growth in a biological population, rumor spreading or disease epidemics? A good mathematical model can help us understand all of these questions.

In the witch trials the accused were almost always convicted and then given the choice of hanging or confessing. If they confessed then they had to provide the names of other people they had observed when consorting with the devil. Those named were in turn faced with the same dilemma: name people or hang.

Defining Variables

The first step in developing a mathematical model is define the variables and parameters which you would like to consider. There is a definite art to this process, although it is typically best to start as simple as possible. If the model is deemed overly simple you can always add complexity later.

In this vein we define the total population of Salem as {N_s}. It is convenient to organize our thinking into discrete trials {t=1,2,3,...} so that everyone who is currently accused of witchcraft stands trial together. Thus, let {W_t} be the number of people accused of witchcraft at trial {t}. Since we will assume that if you are accused of witchcraft you will most assuredly be convicted, we can divide the accused witches into two groups: Those who will confess to witchcraft and name {k} other witches and those that will hang for their silence. Let {p_m} be the fraction of martyrs in the Salem population which will refuse to confess.

Let us assume that those who choose to name names do so in a purely random manner. That is when they are accused they choose k people randomly from the Salem population to implicate in witchcraft. This of course is an oversimplification as people will likely have allies, friends and enemies and will be more likely to name people they dislike. This phenomena could be added to the model later. Also, notice we allow the accused to name those who may already have been implicated.

At this point we also should make clear what information we hope to get out of our model. Notice, the witch hunt comes to a close when no one is accused of witchcraft any longer W_N \rightarrow 0. Therefore, we can ask how many people we expect to hang, call this {H}, when the dust settles and the witch hunt is called off. We can ask how this total hung is affected by the fraction of martyrs {p_m} and by {k} the number of new witches named by each non-martyrs accused.

Extreme Cases

It is often useful when developing a model to examine extreme cases first. This can give an idea as to whether the model lines up with our intuition. First we can consider a population of 100 percent martyrs. Here the very first person who is accused has to be a martyr and will hang without naming anyone else. Therefore, if we assume that the witch hunt begins with a single accusation ({W_1=1}) then if {p_m=1} we have {H=1}.

At the other extreme, in a population with no martyrs then no one will hang as everyone will keep naming names {H=0}. We can also derive an upper bound for how many people can possibly be hung {H \leq p_m * N_s} the total number of martyrs in Salem. When {H=p_m N_s} then everybody in the village who was willing to be hung instead of confessing has already been hung!

Thought Experiments

For simplicity let us assume that {W_1=1} that is the witch hunt is started by a single accusation. Clearly if the first person accused is a martyr then they will hang without naming anyone else and they will be the only one hung. This will happen with probability {p_m} if we assume the first accusation occurs at random in the population. However, if the first person accused wants to save themselves they choose to name {k} other people then the witch hunt will continue. The second trial will commence with the {k} people named by the first witch. We can estimate the probability of the witch hunt ending after just two trials the probability of the first trial leading to a confession {(1-p_m)} times the probability that all {k} of those accused of witchcraft in the second trial refuse to name names and are hung {\approx p_m^k}. Thus we have that the probability that {H=k} is roughly {(1-p_m)p_m^k}.

These calculations will become more complicated as we move on. The important lesson we can take from this section is that we expect the witch hunt to end after only one trial in p_m fraction of cases. In the rest we can expect to see the witch hunt begin to grow quickly as people begin naming names and thus the number of hanged people will increase quickly.

This hints that our distribution of hanged people should be bimodal. In the case that the first person is hung the witch hunt is quickly contained. However, if the first person chooses to confess then the witch hunt will grow quickly, but will extinguish eventually as nobody is left to be hung.


It might be relatively difficult to calculate the probability of a certain number of people being hung, but it is relatively easy to simulate the witch trial process. If we simulate the process many times and record the number of people hung in each run we can get an idea of the probability distribution for {H}.

The below plot shows the number of hung people for 10,000 simulations with p_m=0.23, N_s=100 and k=3.



We can see that the plot has two peaks,  one as expected from our analysis sits has only one person being hung. This corresponds to the case that the first person accused is a martyr. However, we see a second peak with around twenty people being hung during the witch hunt which happens about 1-p_m or \approx 75% of the time.

If we increase the percentage of the martyrs in the population we will see that the chances that a full-fledged witch hunt ensues will decrease. However, if a witch hunt starts it may lead to a larger number of people being hanged. Below are the results for a simulation with p_m=0.40.


The Salem witch trials provide a good example of the use of mathematical modelling. From a relatively simple model we were able to understand the role of the population structure on the evolution of a witch hunt. Having a large percentage of martyrs in the population leads to an inherent resistance against a witch hunt escalating. However, once the hunt has begun growing it can easily lead to the death of the majority of the martyrs.

I mentioned earlier that a strength of mathematical modelling is that is can allow us to see parallels between seemingly unrelated phenomena. For example our witch hunt model could be easily changed to study rumor spreading in a population which can be divided into gossips-who when they receive the information spread it to k other individuals and trustworthy individuals who don’t spread the rumor when it reaches them. In this case we would also expect a bimodal distribution depending on whether the rumor is initially spread to a gossip. As we have all experienced a rumor can quickly get out of control when it reaches the wrong ears!

The model we considered here is an example of a branching process a fascinating topic for which much mathematical theory exists.

Python Code

If you are curious the python source code for the simulations is given below.

import scipy as sp
import numpy as np
import seaborn as sbn
import pylab as plt

class Witch:

    def __init__(self, pm=0.5, Nw=100, W1=1, k=3):
        self.Mcutoff=int(pm*Nw) #establish the martyr pop
        self.pop=range(1,Nw+1) #total population
    def runSim(self):

        self.currentWitch=[] #currently accused witches

        self.currentWitch.append(np.random.choice(self.pop,1)) #choose an initial witch at random
        while (len(self.currentWitch)>0):
            for g in range(len(self.currentWitch)):
                if w < self.Mcutoff:
                    newWitches=np.random.choice(self.pop, self.k) #choose k people to accuse
                    for nw in newWitches:
                        if nw not in self.accused:
                    self.accused+=list(newWitches) #add the new accusations to the list, can't be accused twice


if __name__=="__main__":
    for i in range(10000):
    sbn.distplot(dataHung, bins=MaxW, kde=False, norm_hist=True, color='red')
    plt.xlabel('Number Hung')
    plt.ylabel('Probability of Occurance')
    plt.title('Salem Witches')