Implementing Gillespie's Algorithm.

Question

Argento on 19 Jun 2015

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/224604-implementing-gillespie-s-algorithm

Answered: Jeremy Huard on 1 Jun 2022

Hello everyone,

Being pretty new to Matlab, I've been struggling trying to implement Gillespie's Algorithm (1977). Truth be told, I am still somewhat confused by certain aspects of the algorithm itself (such as the calculation of the propensity function). I've been tying to stick pretty close to the methods outlined in his paper.

Below I posted my attempt at coding fig.6 from his (1977) paper, but I've been getting nowhere.

Could someone with some experience with the algorithm tell me what I'm doing wrong? Many thanks.

clear
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%Species and reaction channels %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
M = 2;
N = 1;
%%%%%%%%%%%%%%
%%%STEP 0 %%%
%%%%%%%%%%%%%%
c1X = 5;            %% ????  %%
c1 = 5;             %% ????  %%
c2 = 0.005;
Y = 10;
X1 = Y;
t = 0;                            %%% Time  %%%
n = 0;                            %%% Counter  %%%
t_max = 100;                      %%% Max. number of iterations %%%
T = zeros(1, t_max + 1);          %%% Time values array (for plotting purposes) %%%
XX = zeros(1, t_max + 1);          %%% X values array  %%%
Yc = zeros(1, t_max + 1);          %%% Y values array (for plotting purposes) %%%
%%%%%%%%%%%%%%
%%%STEP 1 %%%
%%%%%%%%%%%%%%
while n < t_max
    h1 = X1*Y;
    h2 = Y*(Y-1)*0.5;
      a = zeros(1, 2);
      a1 = h1*c1;
      a2 = h2*c2;
      a(1, 1) = a1;
      a(1, 2) = a2;
      a0 = sum(a);
      %%%%%%%%%%%%%%
      %%%STEP 2 %%%
      %%%%%%%%%%%%%%
      r1 = rand();
      r2 = rand();
      tau = -log(r1)*(1/a0);
      t = t + tau;
      if r2*a0 <= sum(a);
          Y = Y + h1;
          Yc(1, n+1) = Y;
      else
          Y = Y - h2;
          Yc(1, n+1) = Y;
      end
      %%%%%%%%%%%%%%
      %%%STEP 3 %%%
      %%%%%%%%%%%%%%
      t = t + tau;
      T(1, n+1) = t;
      n = n + 1;
  end
plot(T, Yc)

4 Comments
Show 2 older commentsHide 2 older comments

Harley Day on 8 Jul 2015

Edited: Harley Day on 11 Feb 2019

Open in MATLAB Online

Hi Argento

If I were you, I'd work up to reaction (29) in Gillespie's paper from reaction (22). This simpler reaction is irreversible isomerisation. Because it contains only one reaction species, you can play with the algorithm without having to worry so much about indexing your vectors. Here's my code for reaction (22). I'll work towards reaction (29) and post that code when it's ready.

Hope that helps.

Harley

 %code
clear
%%Irreversible isomerisation reaction.
%This gives figure 3 of Gillespie's 1977 paper.
M = 1;                  % Number of reaction pathways
N = 1;                  % Number of molecular species considered
%%STEP 0
c = 0.5;                % In general, a vector of length M of stochastic reaction constants.
X = 1000;               % Initial number of molecules of species X. In general, a vector of length N of initial population numbers for each species.
Xplot = X*ones(M,1);    % For plotting.
t = 0;                  % Time variable.
tplot = zeros(1);       % For plotting.
n = 0;                  % Reaction counter.
n_max = 1000;
%%STEP 1
while n < n_max
    h = X;              % Number of molecular reactant combinations available in current state. In general, h is a vector of length M consisting of combinatorial functions of molecular population numbers X (which is of length N).
    a = h*c;            % a is the propensity of the reaction pathway in current state. In general, a vector of length M
    a0 = sum(a);        % a0 is total propensity that anything happens. This number emerges more out of mathematical necessity than physical intuition.
    %%STEP 2
    r = rand(2,1);
    tau = -log(r(1))/a0;
    % Since this reaction is irreversible isomerisation (only one event can
    % occur) the value of mu is always 1, and so the next step is
    % unnecessary in this case.
    mu = sum(r(2)*a0 <= cumsum(a));   % Currently always 1 since 0<=r(2)<=1. Note 0<mu<=M.   
    %%STEP 3
    t = t + tau;
    % Adjust population levels based on reaction formula(s). Again, because
    % only one reaction is occuring, this switch statement is unnecessary.
    switch mu
        case 1
            X(mu) = X(mu) - 1;
    end
    n = n + 1;
    % At this point, all the physics has been simulated, it only remains to
    % record the new values of t and X to vectors to we can plot it later.   
    Xplot(mu,n+1) = X(mu);
    tplot(n+1) = t;  
end
stairs(tplot, Xplot)

Harley Day on 11 Feb 2019

Edited: Harley Day on 17 Feb 2019

Open in MATLAB Online

Hi all,

Just received an email asking me if I ever completed the code I promised almost 4 years ago now. Yes, I did, but I forgot to get back to this forum with my results! Anyway, here you go...

Equation 29 example

Here's some code which simulates equation 29 from Gillespie's 1977 paper.

If you run the code below, you should end up with something looking like figure 6 (though not exactly the same of course since it's stochastic):

figure
hold on
[tplot, Yplot3000] = eqn29 ( 3000 );
stairs(tplot, Yplot3000);
[tplot, Yplot10] = eqn29 ( 10 );
stairs(tplot, Yplot10);
line ( [0, 5], [1000, 1000] );
legend( {'$Y_0=3000$','$Y_0=10$', 'deterministic steady state'}, 'interpreter', 'latex' );
hold off
xlabel('time (minutes)', 'interpreter', 'latex');
ylabel('number of $Y$ molecules', 'interpreter', 'latex');
function [tplot, Yplot] = eqn29 ( Y0 )
% function to simulate reaction dynamics of equation 29 from Gillespie's
% 1977 paper. The sigle argument to this function specifies the initial
% number of molecules of species Y.
%% Equation 29 reactions
%This gives figure 6 of Gillespie's 1977 paper.
M = 2;                  % Number of reaction pathways
N = 1;                  % Number of molecular species considered
%% STEP 0
% Initial number of molecules of species Y(1) and X(2). In general, a
% vector of length N of initial population numbers for each species.
%------------------------
% equation         | rate
%------------------|-----
% Xbar + Y -> 2*Y  | c1
% Y + Y -> Z       | c2
%------------------------
Y = Y0;
Xbar = 10;
% In general, a vector of length M of stochastic reaction constants.
c = [5/Xbar,0.005]; % units: (expected number of) reactions per minute
Yplot = Y;            % For plotting.
t = 0;                  % Time variable.
tplot = zeros(1);       % For plotting.
t_stop = 5;
n = 0;                  % Reaction counter.
n_max = 1000;
%% STEP 1
while t < t_stop
    % Number of molecular reactant combinations available in current state.
    % In general, h is a vector of length M consisting of combinatorial
    % functions of molecular population numbers X (which is of length N).
    h = [Xbar*Y,Y*(Y-1)/2];
    % a is the propensity of the reaction pathway in current state. In
    % general, a vector of length M
    a = h.*c;
    % a0 is total propensity that anything happens. This number emerges
    % more out of mathematical necessity than physical intuition.
    a0 = sum(a);
    %% STEP 2
    r = rand(2,1);
    tau = -log(r(1))/a0;
    % Note 0<=mu<=M. This decides which reaction occurs. If mu=0, no
    % reaction occurs. The switch statement does nothing in this case.
    mu = sum(r(2)*a0 <= cumsum(a));
    
    %% STEP 3
    t = t + tau;
    % Adjust population levels based on reaction formula(s).
    switch mu
        case 2
            Y = Y + 1;
        case 1
            Y = Y - 2;
    end
    n = n + 1;
    % At this point, all the physics has been simulated, it only remains to
    % record the new values of t and X to vectors to we can plot it later.
    Yplot(n+1,:) = Y;
    tplot(n+1) = t;  
end
end

An equivalent of this system can be modelled deterministically like this:

This determinstic ODE represents the system under the assumption that the numbers of reactant molecules are high enough that the reaction rates and number of molecules can be assumed to be smooth without loss of detail. Obviousely this assumption is always wrong in some sense (you can't actually have 30000000.2 molecules of something). Remember however that all models are wrong, but some are useful!

The solution to the deterministic ODE system should be the average of many Gillespie algorithm runs, and gives us a nice way to check our working. If the Gillespie algorithm we've coded up yields results which are on average in agreement with the deterministic ODEs, then we can be fairly sure that we haven't made any errors.

[t3000, YDetplot3000] = ode45 ( @eqn29Deterministic, [0, 5], 3000 );
[t10, YDetplot10]   = ode45 ( @eqn29Deterministic, [0, 5], 10 );
figure;
hold on;
plot ( t3000, YDetplot3000 );
plot ( t10, YDetplot10 );
legend( {'$Y_0=3000$','$Y_0=10$'}, 'interpreter', 'latex' );
xlabel('time (minutes)', 'interpreter', 'latex');
ylabel('mean number of $Y$ molecules', 'interpreter', 'latex');
xlim ( [0, 5] );
function dy = eqn29Deterministic (t, y)
Xbar = 10;
c_1 = 5/Xbar;
c_2 = 0.005;
dy = c_1 * Xbar * y - c_2 *y^2;
end

I advise anyone starting to use the Gillespie algorithm to check their working using this method. Take, say 1000 runs using a Gillespie algorithm, and take an average. Then simulate the deterministic equivalent system, and check to see if your curves look similar. Use caution if you're dealing with limit cycle behaviour (such as the Lotka reactions below). In such cases, the stochastic nature of your system will likely cause the amplitude and phase to vary quite a bit in the stochastic case because small fluctiations in molecule numbers at any point in the cycle will result in slight devations in the rate at which your system works its way around the limit cycle, and these small fluctions will become compounded over time. Therefore oscillations from your stochastic and deterministic systems will probably not line up with each other after some time has passed. The stochastic system will gradually go out of phase with the deterministic system, but the amplitudes and frequencies of the limit cycle will, on average, be the same as for the deterministic system.

Lotka reactions

This next bit of code is along the same lines, but simulates the Lotka system defined in equation 38 from Gillespie's 1977 paper.

The code below simulates this system, and records the numbers of the species

and

.

Note the changes to the reaction rates vector "c", and the vector "h" which enumerates the number of combinations of molecular interactions which can bring about each reaction. There are three different reactions modelled here, so the switch condition now has three cases.

%clear
%% Lotka reactions
%This gives figure 6 of Gillespie's 1977 paper.
M = 3;                  % Number of reaction pathways
N = 2;                  % Number of molecular species considered
%% STEP 0
% Initial number of molecules of species X(1) and X(2). In general, a
% vector of length N of initial population numbers for each species.
%--------------------------
% equation           | rate
%--------------------|-----
% Xbar + Y1 -> 2*Y1  |  c1
% Y1 + Y2 -> 2*Y2    |  c2
% Y2 -> Z            |  c3
%--------------------------
Y1 = 1000;
Y2 = 1000;
X = 10;
% In general, a vector of length M of stochastic reaction constants.
c = [10/X,0.01,10]; % units: (expected number of) reactions per minute
Y1plot = Y1;            % For plotting.
Y2plot = Y2;
t = 0;                  % Time variable.
t_max = 30;
tplot = zeros(1);       % For plotting.
n = 0;                  % Reaction counter.
n_max = 1000000;
%% STEP 1
while t < t_max
% Number of molecular reactant combinations available in current state.
% In general, h is a vector of length M consisting of combinatorial
% functions of molecular population numbers X (which is of length N).
h = [X*Y1,Y1*Y2,Y2];
% a is the propensity of the reaction pathway in current state. In
% general, a vector of length M
a = h.*c;
% a0 is total propensity that anything happens. This number emerges
% more out of mathematical necessity than physical intuition.
a0 = sum(a);
%% STEP 2
r = rand(2,1);
tau = -log(r(1))/a0;
% Note 0<=mu<=M. This decides which reaction occurs. If mu=0, no
% reaction occurs. The switch statement does nothing in this case.
mu = sum(r(2)*a0 <= cumsum(a));
%% STEP 3
t = t + tau;
% Adjust population levels based on reaction formula(s).
switch mu
case 3
Y1 = Y1 + 1;
case 2
Y1 = Y1 - 1;
Y2 = Y2 + 1;
case 1
Y2 = Y2 - 1;
end
n = n + 1;
% At this point, all the physics has been simulated, it only remains to
% record the new values of t and X to vectors to we can plot it later.
Y1plot(n+1,:) = Y1;
Y2plot(n+1,:) = Y2;
tplot(n+1) = t;
end
figure
hold on
stairs(tplot, Y1plot)
stairs(tplot, Y2plot)
legend({'$Y_1$','$Y_2$'}, 'interpreter','latex');
xlabel('time (minutes)', 'interpreter', 'latex');
ylabel('numbers of $Y_1$ and $Y_2$ molecules', 'interpreter', 'latex');
hold off
figure
plot(Y1plot,Y2plot);
xlabel('number of $Y_1$ molecules', 'interpreter', 'latex');
ylabel('number of $Y_2$ molecules', 'interpreter', 'latex');

Time sampling (to make things quicker)

One problem which you may encounter when trying to analyse results from the Gillespie algorithm is that of sampling many Gillespie runs at a particular moment (or moments) in time. Let's say you're interested in looking at the distribution of species Y at time

minutes for the equation 29 example above. You can run lots of simulations which will tell you the number of Y species at that time, but your results will tell you the number of species for all time between time

up to time

minutes, and you'll have to sift through the time vector from each simulation run to extract the times you're interested in.

For example, one Gillespie run might give you:

meaning that

whereas another might give you:

meaning that

It's not efficient to ask MATLAB to store the results for all time and then go through your time vectors to obtain species numbers at the times you're actually interested in since you will end up spending alot of time storing numbers you end up just throwing away.

It is far more efficient to obtain results only at time-points you're actually interested in. Let's say you want to know the number of species Yat times

. We want the algorithm to output the following information:

or for another Gillespie run, perhaps the following would be obtained:

I.e. we want to have the times already sampled for us, and not have to bother sifting through huge arrays of results to extract these samples.

To sample only times you're actually interested in, it's best to introduce this time sampling within the Gillespie algorithm itself. I do this as follows (look for the if statement below):

% in this code, we will sample the timescale at regular intervals, rather
% than obtaining information about every change in species numbers which
% occurs (where the time sampling will be irregular).
%%%%%%%%%%%%%%%%%%%%%%% in my "equation_29.m" code, the gillespie algorithm
%%%%%%%%%%%%%%%%%%%%%%% usually gives us around 41000 time points when we
%%%%%%%%%%%%%%%%%%%%%%% simulate from Y0=10, but we can ask for just 1000
%%%%%%%%%%%%%%%%%%%%%%% equally spaced timepoints from this set.
[tplot, Yplot3000] = eqn29 ( 3000, linspace(0, 5, 1000) );
figure (1);
hold on;
stairs(tplot, Yplot3000);
%%%%%%%%%%%%%%%%%%%%%%% in my "equation_29.m" code, the gillespie algorithm
%%%%%%%%%%%%%%%%%%%%%%% usually gives us around 31000 time points when we
%%%%%%%%%%%%%%%%%%%%%%% simulate from Y0=10, but we can ask for just 1000
%%%%%%%%%%%%%%%%%%%%%%% equally spaced timepoints from this set.
[tplot, Yplot10] = eqn29 ( 10, linspace(0, 5, 1000) );
stairs(tplot, Yplot10);
line ( [0, 5], [1000, 1000] );
legend( {'$Y0=3000$','$Y0=10$', 'deterministic steady state'}, 'interpreter', 'latex' );
hold off
xlim([0,5]);
xlabel('time', 'interpreter', 'latex');
ylabel('number of Y molecules', 'interpreter', 'latex');
title ( 'Gillespie algorithm with time sampling' );
%%%%%%%%%%%%%%%%%%%%%% If we ask for samples at a rate which is higher than
%%%%%%%%%%%%%%%%%%%%%% the actual rate at which species numbers change, the
%%%%%%%%%%%%%%%%%%%%%% algorithm still works correctly.
%%%%%%%%%%%%%%%%%%%%%%% in my "equation_29.m" code, the gillespie algorithm
%%%%%%%%%%%%%%%%%%%%%%% usually gives us around 41000 time points when we
%%%%%%%%%%%%%%%%%%%%%%% simulate from Y0=10. Let's oversample (ask for more
%%%%%%%%%%%%%%%%%%%%%%% than 41000 timepoints), and see if this causes
%%%%%%%%%%%%%%%%%%%%%%% problems... We can ask for 1 million time samples
[tplot, Yplot3000] = eqn29 ( 3000, linspace(0, 5, 1e6) );
figure (2);
hold on;
stairs(tplot, Yplot3000);
%%%%%%%%%%%%%%%%%%%%%%% Same as above
[tplot, Yplot10] = eqn29 ( 10, linspace(0, 5, 1e6) );
stairs(tplot, Yplot10);
line ( [0, 5], [1000, 1000] );
legend( {'$Y0=3000$','$Y0=10$', 'deterministic steady state'}, 'interpreter', 'latex' );
hold off
xlim([0,5]);
xlabel('time', 'interpreter', 'latex');
ylabel('number of Y molecules', 'interpreter', 'latex');
title ( 'Gillespie algorithm with time sampling (oversampling)' );
%%%%%%%%%%%%%%%%%%%%%% If we ask for samples at a rate which is lower than
%%%%%%%%%%%%%%%%%%%%%% the actual rate at which species numbers change, the
%%%%%%%%%%%%%%%%%%%%%% algorithm still works correctly.
%%%%%%%%%%%%%%%%%%%%%%% in my "equation_29.m" code, the gillespie algorithm
%%%%%%%%%%%%%%%%%%%%%%% usually gives us around 41000 time points when we
%%%%%%%%%%%%%%%%%%%%%%% simulate from Y0=10. Let's undersample (ask for
%%%%%%%%%%%%%%%%%%%%%%% fewer than 41000 timepoints), and see if this causes
%%%%%%%%%%%%%%%%%%%%%%% problems... We can ask for 10 time samples
[tplot, Yplot3000] = eqn29 ( 3000, linspace(0, 5, 10) );
figure (3);
hold on;
stairs(tplot, Yplot3000);
%%%%%%%%%%%%%%%%%%%%%%% Same as above
[tplot, Yplot10] = eqn29 ( 10, linspace(0, 5, 10) );
stairs(tplot, Yplot10);
line ( [0, 5], [1000, 1000] );
legend( {'$Y0=3000$','$Y0=10$', 'deterministic steady state'}, 'interpreter', 'latex' );
hold off
xlim([0,5]);
xlabel('time', 'interpreter', 'latex');
ylabel('number of Y molecules', 'interpreter', 'latex');
title ( 'Gillespie algorithm with time sampling (undersampling)' );
%%%%%%%%%%%%%%%%%%%%%% if you want to make a histogram of the distribution
%%%%%%%%%%%%%%%%%%%%%% of species numbers at a particular time, you can
%%%%%%%%%%%%%%%%%%%%%% just run this algorithm multiple times (I advise
%%%%%%%%%%%%%%%%%%%%%% using parfor to run this algorithm in parallel on
%%%%%%%%%%%%%%%%%%%%%% lots of CPU cores since the computations will each
%%%%%%%%%%%%%%%%%%%%%% be independent). I won't write parfor here since I
%%%%%%%%%%%%%%%%%%%%%% don't know what kind of computer you're using. You
%%%%%%%%%%%%%%%%%%%%%% can read about it here: https://uk.mathworks.com/help/distcomp/parfor.html
number_of_Gillespie_runs = 100;
Y_samples = zeros( 1, number_of_Gillespie_runs);
sample_time = 2;
for i=1:length(Y_samples)
    [~, Y_samples(i)] = eqn29 ( 10, sample_time );
end
figure ( 4 );
hist ( Y_samples );
xlabel ( ['number of species $Y$ at time $t=', num2str(sample_time),'$'], 'interpreter', 'latex' );
ylabel ( 'frequency', 'interpreter', 'latex' );
title ( [num2str(number_of_Gillespie_runs), ' samples'], 'interpreter', 'latex' );
function [tplot, Yplot] = eqn29 ( Y0, timeSamples )
% function to simulate reaction dynamics of equation 29 from Gillespie's
% 1977 paper. The sigle argument to this function specifies the initial
% number of molecules of species Y.
% parameter   | meaning
%-------------|-------------------------------------------------------------------------
% Y0          | number of species Y0 at time t=0
% timeSamples | vector of times at which we would like to know the nuber of each sepcies
%---------------------------------------------------------------------------------------
%% Equation 29 reactions
%This gives figure 6 of Gillespie's 1977 paper.
% M = 2;                  % Number of reaction pathways
% N = 1;                  % Number of molecular species considered
%% STEP 0
% Initial number of molecules of species Y(1) and X(2). In general, a
% vector of length N of initial population numbers for each species.
%------------------------
% equation         | rate
%------------------|-----
% Xbar + Y -> 2*Y  | c1
% Y + Y -> 2*Y     | c2
%------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%% Harley's time sampling code...%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
obs_index = 1; % output index
n_times_to_observe = length ( timeSamples ); % the number of times which we will be sampling
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Y = Y0;
Xbar = 10;
% In general, a vector of length M of stochastic reaction constants.
c = [5/Xbar,0.005];
Yplot = zeros ( n_times_to_observe, length( Y ) ); % Prealocate the output array (makes it easier for the computer if we claim all of the memory we need at the beginning rather than dynamically allocating memory on the fly).
t = 0;                  % Time variable.
tplot = zeros(1);       % For plotting.
%% STEP 1
while obs_index<=n_times_to_observe % note this used to be "t < t_stop", but we now look at the time index for our stopping condition
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%%%%%%%%%%%%%%%% Harley's time sampling code...%%%%%
    %%%%%%%%%%%%%%%%% this if statement catches instances where the time t
    %%%%%%%%%%%%%%%%% passes one of our sampling times, and records the species
    %%%%%%%%%%%%%%%%% numbers when this happens. We can now sample time at any
    %%%%%%%%%%%%%%%%% resolution we like, and only record the times we're
    %%%%%%%%%%%%%%%%% interested in.
    if(t>=timeSamples(obs_index)) % if we've exceeded an obsevation time, record the current population values up to the new time change
        Yplot(obs_index,:) = Y;
        tplot(obs_index) = t;
        obs_index = obs_index + 1; % move our attention to next observation time
        continue; % restart while loop
    end
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % Number of molecular reactant combinations available in current state.
    % In general, h is a vector of length M consisting of combinatorial
    % functions of molecular population numbers X (which is of length N).
    h = [Xbar*Y,Y*(Y-1)/2];
    % a is the propensity of the reaction pathway in current state. In
    % general, a vector of length M
    a = h.*c;
    % a0 is total propensity that anything happens. This number emerges
    % more out of mathematical necessity than physical intuition.
    a0 = sum(a);
    
    %% STEP 2
    r = rand(2,1);
    tau = -log(r(1))/a0;
    % Note 0<=mu<=M. This decides which reaction occurs. If mu=0, no
    % reaction occurs. The switch statement does nothing in this case.
    mu = sum(r(2)*a0 <= cumsum(a));
    
    %% STEP 3
    t = t + tau;
    % Adjust population levels based on reaction formula(s).
    switch mu
        case 2
            Y = Y + 1;
        case 1
            Y = Y - 2;
    end
end
end

As a final example, I'll show you a detailed simulation of equation 29 to find the distribution of species Y at time

minutes given that

.

Given

We will simulate to find the distribution

using 1 million Gillespie runs. We display a histogram of these runs to show the predicted distribution of species Y after 30 seconds. These distributions cannot usually be written in closed form, and so numerical simulations such as these must be employed.

number_of_Gillespie_runs = 1e6;
Y_samples = zeros( 1, number_of_Gillespie_runs);
sample_time =0.5;
parfor i=1:length(Y_samples)
    [~, Y_samples(i)] = eqn29 ( 10, sample_time );
end
fig4 = figure;
histogram ( Y_samples, [min(Y_samples):max(Y_samples)], 'EdgeColor', 'none' );
xlabel ( sprintf('number of species $Y$ at time $t=%G$', sample_time), 'interpreter', 'latex' );
ylabel ( 'frequency', 'interpreter', 'latex' );
title ( [num2str(number_of_Gillespie_runs), ' samples'], 'interpreter', 'latex' );
print ( fig4, 'histogram_of_Gillespie_runs', '-dsvg' );

I hope this helps

Harley

David Goodmanson on 17 Feb 2019

Hi Harley,

quite a tour de force.

I have often felt that this site should allow voting for comments as well as answers, for the simple reason that comments can sometimes be better and more insightful than answers.

Serhat Unal on 23 Sep 2020

why is it Y1,Y2=1000,1000 and how would the code for Lotka reactions look like

if we solve the ODE's using ODE45 in a seperate m-file, the we would have

three columns and many rows for Y1,Y2 and X for each time-step.

Sign in to comment.

Sign in to answer this question.

Answer 1

Harley Day on 24 Sep 2020

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/224604-implementing-gillespie-s-algorithm#answer_499573

Edited: Harley Day on 1 Oct 2020

Open in MATLAB Online

Hi Serhat,

I assume you're talking about the Lotka reactions (equations 36 of the Gillespie paper).

We set

Y1 = 1000;
Y2 = 1000;

as our initial conditions. This choice of initial condition is completely arbitrary. You can change this is you like. In fact, if you copy paste the Lotka reaction code from above, and change the intial condition, you can see how/if it changes the dynamics.

The second part of your question seems to be about finding the deterministic equivalent of the Lotka example. The way we produce the detministic equivalent system is to look at each of the reactions in the system, converting each of these into terms in the ODE. I'll take us through the process for this example:

Let's just look at the first raction:

. To have this reaction occur, we need the molecule

to meet the molecule

. In the deterministic system, we assume that the numbers of these molecules is represented by a continuous, single valued function

(which remains constant so I'm going to drop the

notation for

) and

, which is a reasonable assumption when these numbers remain fairly large and the system is well-mixed.

The number of ways in which the molecules of

can combine with the molecules of

is

, and we produce two

molecules as a result (though one is just a reactant being released again so we only gain one

molecule overall). This reaction occurs at rate

. From all this information, we can write the first term of our ODE system:

We just have to work through the remaining reactions to fill in the remaining terms of the ODE. Note that the second reaction on the list

will produce two ODE terms, since it uses up a

molecule and produces a

molecule.

When we've finished this process, we obtain the coupled system of ODEs:

These ODEs can be straighforwardly solved using ODE45 as you suggest. Here's a quick script to do just that, applying the initial conditions

and

, we discussed above.

[t, y] = ode45 ( @Lotka_deterministic, [0, 5], [1000, 1000] );
figure;
hold on;
plot ( t, y );
legend( {'$Y_1=1000$','$Y_2=1000$'}, 'interpreter', 'latex' );
xlabel('time (minutes)', 'interpreter', 'latex');
ylabel('mean number of $Y_n$ molecules', 'interpreter', 'latex');
xlim ( [0, 5] );
function dy = Lotka_deterministic(t, y)
    Xbar = 10;
    c_1 = 10/Xbar;
    c_2 = 0.01;
    c_3 = 10;
    
    dy = [c_1*Xbar*y(1) - c_2*y(1)*y(2);
        c_2*y(1)*y(2) - c_3*y(2)];
end

Now, clearly we're not seeing any dynamics here (note that the

solution in blue is hidden behind the

solution in red so we can't see it). This is because the system is at an (unstable) steady state, so the ODE45 algorithm evaluates the ODE at

and discovers that

and

. You can confirm this using the code by running

Lotka_deterministic(0, [1000; 1000])
dy =
     0
     0

However, if you purturb the initial conditions by even a small amount, you will see limit cycle behaviours emerging. For example setting

instead:

[t, y] = ode45 ( @Lotka_deterministic, [0, 5], [1000.1, 1000] ); % note the small change to the initial condition for Y_1
figure;
hold on;
plot ( t, y );
legend( {'$Y_1=1000$','$Y_2=1000$'}, 'interpreter', 'latex' );
xlabel('time (minutes)', 'interpreter', 'latex');
ylabel('mean number of $Y_n$ molecules', 'interpreter', 'latex');
xlim ( [0, 5] );
function dy = Lotka_deterministic(t, y)
    Xbar = 10;
    c_1 = 10/Xbar;
    c_2 = 0.01;
    c_3 = 10;
    
    dy = [c_1*Xbar*y(1) - c_2*y(1)*y(2);
        c_2*y(1)*y(2) - c_3*y(2)];
end

So why does this happen? Why does the stochastic simulation using the Gillespie algorithm show us limit-cycling behaviour even if we put the initial conditions on a steady state, while the deterministic equivalent just stays at steady state? The answer is that the random fluctuations in the numbers of molecules due to the random occurance of the reactions rapidly moves the system away from the steady state, pushing it towards the limit cycle.

The Gillespie paper uses this example specifically because it shows this counterintiutive behaviour: The deterministic system can show us a steady state, whereas the stochastic simulation shows us that the steady state is quickly destroyed by stochastic noise. The point Gillespie was trying to make was that the introduction of Stochastic noise can fundamentally change the behaviours of whatever system you're studying.

I hope this helps. Let me know if there are any points which are unclear. :)

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Answer 2

Jeremy Huard on 1 Jun 2022

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/224604-implementing-gillespie-s-algorithm#answer_976190

Open in MATLAB Online

While this post is about understanding Gillespie's algorithm, it might be interesting for anyone landing on this page to know that this algorithm is one of the stochastic solvers implemented in SimBiology.

Figure 6 of Gillespie's paper can be reproduced with the following code.

Please note that SimBiology requires deterministic rate constants and will scale them automatically to compute stochastic rate constants.

% create SimBiology model

model = sbiomodel('reaction29');

c = model.addcompartment('c',1);

% species

c.addspecies('Xbar',5,Constant=true);

y = c.addspecies('Y',3000);

c.addspecies('Z',0);

% stochastic rate constants

c1 = 1;

c2 = 0.005;

% deterministic rate constants

volume = c.Value;

model.addparameter('k1',volume*c1);

model.addparameter('k2',volume*c2/2);

% reactions

r1 = model.addreaction('Xbar + Y -> 2 Y');

kin1 = r1.addkineticlaw('MassAction');

kin1.setparameter( 'Forward Rate Parameter','k1');

r2 = model.addreaction('2 Y -> Z');

kin2 = r2.addkineticlaw('MassAction');

kin2.setparameter( 'Forward Rate Parameter','k2');

verify(model);

% run stochastic simulations

cs = model.getconfigset();

cs.StopTime = 5;

cs.RuntimeOptions.StatesToLog = 'Y';

cs.SolverType = 'ssa';

cs.SolverOptions.LogDecimation = 10; % 10 rpd plot

y.Value = 3000;

results1 = sbiosimulate(model, cs);

y.Value = 10;

results2 = sbiosimulate(model, cs);

% run deterministic simulations

cs.SolverType = 'ode15s';

y.Value = 3000;

results1d = sbiosimulate(model, cs);

y.Value = 10;

results2d = sbiosimulate(model, cs);

% plot results

colors = colororder;

figure;

hold on;

stairs(results1.Time, results1.Data, Color=colors(1,:), LineStyle='--');

plot(results1d.Time, results1d.Data, Color=colors(1,:), LineStyle='-');

stairs(results2.Time, results2.Data, Color=colors(2,:), LineStyle='--');

plot(results2d.Time, results2d.Data, Color=colors(2,:), LineStyle='-');

yline(1000);

legend("Y0 = 3000, stochastic","Y0 = 3000, deterministic",...

"Y0 = 10, stochastic", "Y0 = 10, deterministic")

ylabel("number of Y molecules")

xlabel("time")

box off;

set(gca,"XLimitMethod","padded","YLimitMethod","padded");

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Implementing Gillespie's Algorithm.

4 Comments
Show 2 older commentsHide 2 older comments

Answers (2)

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Community Treasure Hunt

Implementing Gillespie's Algorithm.

4 Comments Show 2 older commentsHide 2 older comments

Answers (2)

0 Comments Show -2 older commentsHide -2 older comments

0 Comments Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Community Treasure Hunt

4 Comments
Show 2 older commentsHide 2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments