How to use Markov Chain Monte Carlo for data fitting

I'd like to use a Markov Chain Monte Carlo (MCMC) method for parameter estimation when fitting a model to data. I usually use LSQNONLIN in Matlab, but that does not do a good job for more complicated nonlinear models. I was recommended (by a Python user) to use MCMC, or something more robust for parameter estimation. Apparently I would need the "DREAM" toolbox, which is nowhere to be found. What should I use instead?
Can you give me an example of basic code, something that I can adapt slightly by changing the 'model' function?
Attached is an example .csv data file (X vs Y). The model is the following (with 2 parameters, Imax and k):
Y = 1+(Imax -1)*(k*X)./(55.3e3 + k*X);
The parameter starting values are Imax0 = 1 and k0 = 1e5.
Let's say the parameters' "prior distribution" should be "Uniform" and the "posterior distribution" is "Normal".
I'd very much appreciate an example. Thanks!

1 Comment

Your example doesn't appear to be anything LSQNONLIN or LSQCURVEFIT would struggle with:
[X,Y]=readvars('MCMCDataFile.csv');
RHS=(Y-1)*55.3;
c=[(1-Y).*X, X]\RHS;
k0=c(1), Imax0=c(2)/k0+1
k0 = 272.5221
Imax0 = 1.8800
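For reference, the backslash solve above comes from rearranging the model into a form that is linear in the unknowns (a sketch of the algebra; note the code uses 55.3 where the original model has 55.3e3, so the fitted k here differs from the original parameterization by a factor of 1000):
Y = 1 + (Imax-1)*(k*X)./(55.3 + k*X)
(Y-1).*(55.3 + k*X) = (Imax-1)*k*X
(Y-1)*55.3 = k*(1-Y).*X + k*(Imax-1)*X
so the solved coefficients are c(1) = k and c(2) = k*(Imax-1), which is why k0 = c(1) and Imax0 = c(2)/k0 + 1.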
F=@(p,X) 1+(p(1)-1)*(p(2)*X)./(55.3 + p(2)*X);
opts=optimoptions('lsqcurvefit', StepTolerance=1e-12, OptimalityTolerance=1e-12, FunctionTolerance=1e-12);
[p,resn,res]=lsqcurvefit(F,[Imax0,k0], X,Y,[0,100],[],opts);
Local minimum possible. lsqcurvefit stopped because the final change in the sum of squares relative to its initial value is less than the value of the function tolerance.
x=linspace(min(X),max(X));
plot(X,Y,'o',x,F(p,x))


Answers (2)

I assume your Python user friend was talking about PyDREAM,
and that you should call it using Matlab's Python interface, e.g.,
Python Code:
# run_dream.py
import numpy as np
from pyDREAM import DREAM
from pyDREAM.parameters import SampledParam
from pyDREAM.Distribution import Uniform

# Define the (placeholder) log-likelihood function
def likelihood(params):
    x, y = params
    return -((x - 3)**2 + (y + 1)**2)

# Define uniform priors for the two parameters
prior = [SampledParam(Uniform, -10, 10),
         SampledParam(Uniform, -10, 10)]

# Run the DREAM sampler
dream = DREAM(prior, likelihood, niterations=1000)
chains = dream.run()

# Save the chains for MATLAB to read
np.save('chains.npy', chains)
Matlab Code:
% Call the Python script using pyrun
pyrun("exec(open('run_dream.py').read())");
% Load the output
chains = py.numpy.load('chains.npy');
chains_mat = double(chains); % Convert to MATLAB array

2 Comments

Thanks for the suggestions, Matt! The 'lsqcurvefit' works well for this simple model, but not for more complicated ones (when 2 parameters are somewhat correlated, etc.).
I'd like to try calling "pydream" from Matlab, but after installing pydream, some of the packages are not found, like the pyDREAM.Distribution. Below is what I have in the directory.
Also, what exactly is the return value -((x - 3)**2 + (y + 1)**2) supposed to describe?
Thanks!
"when 2 parameters are somewhat correlated, etc."
I don't know what that means. In what way aren't they correlated in your example? Regardless, if you only have 2 parameters, you can usually just do an exhaustive parameter search. No need to fuss with evolutionary algorithms.
"Also, what exactly is the return value -((x - 3)**2 + (y + 1)**2) supposed to describe?"
A fake log-likelihood function. You would replace it with the real log-likelihood for your data distribution.


Hi @Ella,
I read your comments. Since I am using MATLAB Mobile and don't have access to toolboxes, I created a custom MCMC algorithm. Below, I will provide a detailed example that includes the necessary code, explanations, and final plots.
Step 1: Define the Model Function
First, define your model function in MATLAB.
function Y = model(params, X)
    Imax = params(1);
    k = params(2);
    Y = 1 + (Imax - 1) * (k .* X) ./ (55.3e3 + k .* X);
end
Step 2: Define the Log-Likelihood Function
Next, define a log-likelihood function that will evaluate how well the model fits the data given the parameters.
function logL = logLikelihood(params, X, Y)
    Y_pred = model(params, X);
    residuals = Y - Y_pred;
    logL = -0.5 * sum(residuals.^2); % Gaussian errors with unit variance (up to an additive constant)
end
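The log-likelihood above implicitly fixes the noise standard deviation at 1. A minimal variant that makes the noise level explicit is sketched below; the function name and the idea of passing sigma as a known value are illustrative (sigma could also be sampled as a third parameter).
function logL = logLikelihoodSigma(params, X, Y, sigma)
    % Gaussian log-likelihood with an explicit noise standard deviation sigma
    Y_pred = model(params, X);
    residuals = Y - Y_pred;
    n = numel(Y);
    logL = -0.5 * n * log(2*pi*sigma^2) - 0.5 * sum(residuals.^2) / sigma^2;
end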
Step 3: Implement the MCMC Algorithm
Now, implement the MCMC algorithm. This will involve proposing new parameter values, calculating the acceptance ratio, and deciding whether to accept or reject the new parameters.
function [samples, logL_values] = mcmc(X, Y, num_samples, initial_params)
    samples = zeros(num_samples, length(initial_params));
    logL_values = zeros(num_samples, 1);
    current_params = initial_params;
    current_logL = logLikelihood(current_params, X, Y);
    for i = 1:num_samples
        % Propose new parameters (random walk)
        proposed_params = current_params + randn(size(current_params));
        % Calculate log-likelihood for proposed parameters
        proposed_logL = logLikelihood(proposed_params, X, Y);
        % Calculate acceptance ratio
        acceptance_ratio = exp(proposed_logL - current_logL);
        % Accept or reject the proposed parameters
        if rand < acceptance_ratio
            current_params = proposed_params;
            current_logL = proposed_logL;
        end
        % Store the samples
        samples(i, :) = current_params;
        logL_values(i) = current_logL;
    end
end
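One tuning knob worth noting: the random walk above takes unit-variance steps for both parameters, but Imax is of order 1 while k is of order 1e5, so k effectively never moves. A sketch of a variant with per-parameter step sizes follows; the function name and the step values are illustrative and untuned.
function [samples, logL_values] = mcmc_scaled(X, Y, num_samples, initial_params, step)
    % Random-walk Metropolis with per-parameter proposal standard deviations,
    % e.g. step = [0.05, 5e3] for [Imax, k] (illustrative values)
    samples = zeros(num_samples, length(initial_params));
    logL_values = zeros(num_samples, 1);
    current_params = initial_params;
    current_logL = logLikelihood(current_params, X, Y);
    for i = 1:num_samples
        proposed_params = current_params + step .* randn(size(current_params));
        proposed_logL = logLikelihood(proposed_params, X, Y);
        if rand < exp(proposed_logL - current_logL)
            current_params = proposed_params;
            current_logL = proposed_logL;
        end
        samples(i, :) = current_params;
        logL_values(i) = current_logL;
    end
end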
Step 4: Run the MCMC Simulation
Now, run the MCMC simulation using the provided data and initial parameter values.
% Provided data
X = [0; 0.333333333; 0.661417323; 0.984375; 1.302325581; 1.615384615; ...
     1.923664122; 2.227272727];
Y = [1; 1.669344148; 1.712769626; 1.721821986; 1.73873249; 1.774320879; ...
     1.792085924; 1.8151507];
% Initial parameters
initial_params = [1, 1e5]; % Imax0 and k0
num_samples = 10000; % Number of MCMC samples
% Run MCMC
[samples, logL_values] = mcmc(X, Y, num_samples, initial_params);
Step 5: Analyze and Plot the Results
Finally, analyze the results and create plots to visualize the parameter estimates and the fitted model.
% Parameter estimates
Imax_est = mean(samples(:, 1));
k_est = mean(samples(:, 2));
% Generate fitted values
Y_fit = model([Imax_est, k_est], X);
% Plotting
figure;
scatter(X, Y, 'filled', 'MarkerFaceColor', 'b'); % Original data
hold on;
plot(X, Y_fit, 'r-', 'LineWidth', 2); % Fitted model
xlabel('X');
ylabel('Y');
title('Data Fitting using MCMC');
legend('Data', 'Fitted Model');
grid on;
hold off;
% Display estimated parameters
fprintf('Estimated Imax: %.4f\n', Imax_est);
Estimated Imax: 2.1047
fprintf('Estimated k: %.4f\n', k_est);
Estimated k: 100068.4244
Please see attached.
So, if you follow the mentioned steps, you should be able to implement this basic MCMC algorithm for parameter estimation in MATLAB without relying on external toolboxes. You can adapt this simple model function as needed for different models. The provided code includes all necessary components, from defining the model to running the MCMC simulation and visualizing the results. Feel free to modify the number of samples or the proposal distribution to suit your specific needs.
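As an additional sketch (not part of the steps above), it is common to discard an initial burn-in portion of the chain and to report a posterior spread rather than only the posterior mean. The 20% burn-in fraction below is an illustrative choice, and only base MATLAB functions are used.
burn_in = round(0.2 * num_samples);           % discard the first 20% of the chain
post = samples(burn_in+1:end, :);
post_sorted = sort(post);                     % column-wise sort, used for rough quantiles
n = size(post_sorted, 1);
lo = post_sorted(max(1, round(0.025*n)), :);  % approximate 2.5% quantile of each parameter
hi = post_sorted(round(0.975*n), :);          % approximate 97.5% quantile of each parameter
fprintf('Imax: mean %.4f, approx. 95%% interval [%.4f, %.4f]\n', mean(post(:,1)), lo(1), hi(1));
fprintf('k:    mean %.4f, approx. 95%% interval [%.4f, %.4f]\n', mean(post(:,2)), lo(2), hi(2));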
Please note that the comments provided by @Matt J sound reasonable, and it is not a bad idea to follow them. However, we are all here to help you.
Hope this helps.

8 Comments

Torsten, thanks a lot! This is exactly what I had in mind. It is interesting though that, in this simple case, 'lsqcurvefit' does a better job than the MCMC. Are there other options with MCMC besides the "number of samples or the proposal distribution" that I can adjust for a better guess?
I only formatted and ran the code because I was curious to see the result. Questions should be addressed to @Umar
"It is interesting though that, in this simple case, 'lsqcurvefit' does a better job than the MCMC."
I'm not terribly familiar with this family of solvers, but I believe the DREAM algorithm, recommended by your friend, is more advanced than the basic MCMC algorithm. DREAM is an evolutionary algorithm, so it is robust against local minima, whereas basic MCMC is not and may have the same vulnerabilities as lsqcurvefit.
In that case, thanks, Umar! I'd appreciate more feedback; see my follow-up question. Can you add a few more options, like defining a "prior" distribution, etc.?
Ella
Hi @Ella,
No problem. Also, I don’t disagree with @Matt J recent comments. I also read your recent comments and it makes sense. By incorporating prior distributions into the MCMC framework certainly enhances the robustness of parameter estimation by allowing you to express your beliefs about the parameters before observing the data. In this case, I will use a uniform distribution as our prior for the parameters I_{max} and (k). Please see Step-by-Step Modifications mentioned below.
Define the Prior Distribution: Define a uniform prior for both parameters. For instance, set the prior for Imax to be between 1 and 2, and for k between 1e4 and 1e6.
Modify the Proposal Step: Instead of a random walk based on a normal distribution, propose new parameters uniformly within the defined bounds.
Update the Acceptance Criterion: The acceptance criterion remains the same, but the proposed parameters must fall within the defined prior bounds.
Plotting and Results: Finally, visualize the results and display the estimated parameters.
Here is the modified MATLAB code:
function Y = model(params, X)
    Imax = params(1);
    k = params(2);
    Y = 1 + (Imax - 1) * (k .* X) ./ (55.3e3 + k .* X);
end
function logL = logLikelihood(params, X, Y)
    Y_pred = model(params, X);
    residuals = Y - Y_pred;
    logL = -0.5 * sum(residuals.^2); % Gaussian errors with unit variance (up to an additive constant)
end
function logPrior = logPrior(params)
    % Define uniform priors
    Imax_min = 1;
    Imax_max = 2;
    k_min = 1e4;
    k_max = 1e6;
    % Check if parameters are within the prior bounds
    if (params(1) >= Imax_min && params(1) <= Imax_max) && ...
       (params(2) >= k_min && params(2) <= k_max)
        logPrior = 0; % Uniform prior (constant)
    else
        logPrior = -Inf; % Outside prior bounds
    end
end
function [samples, logL_values] = mcmc(X, Y, num_samples, initial_params)
    samples = zeros(num_samples, length(initial_params));
    logL_values = zeros(num_samples, 1);
    current_params = initial_params;
    current_logL = logLikelihood(current_params, X, Y) + logPrior(current_params);
    for i = 1:num_samples
        % Propose new parameters uniformly within the prior bounds
        proposed_params = [rand * (2 - 1) + 1, rand * (1e6 - 1e4) + 1e4];
        % Calculate log-likelihood and log-prior for proposed parameters
        proposed_logL = logLikelihood(proposed_params, X, Y) + logPrior(proposed_params);
        % Calculate acceptance ratio
        acceptance_ratio = exp(proposed_logL - current_logL);
        % Accept or reject the proposed parameters
        if rand < acceptance_ratio
            current_params = proposed_params;
            current_logL = proposed_logL;
        end
        % Store the samples
        samples(i, :) = current_params;
        logL_values(i) = current_logL;
    end
end
% Provided data
X = [0; 0.333333333; 0.661417323; 0.984375; 1.302325581; 1.615384615;...
1.923664122; 2.227272727];
Y = [1; 1.669344148; 1.712769626; 1.721821986; 1.73873249; 1.774320879;...
1.792085924; 1.8151507];
% Initial parameters
initial_params = [1, 1e5]; % Imax0 and k0
num_samples = 10000; % Number of MCMC samples
% Run MCMC
[samples, logL_values] = mcmc(X, Y, num_samples, initial_params);
% Parameter estimates
Imax_est = mean(samples(:, 1));
k_est = mean(samples(:, 2));
% Generate fitted values
Y_fit = model([Imax_est, k_est], X);
% Plotting
figure;
scatter(X, Y, 'filled', 'MarkerFaceColor', 'b'); % Original data
hold on;
plot(X, Y_fit, 'r-', 'LineWidth', 2); % Fitted model
xlabel('X');
ylabel('Y');
title('Data Fitting using MCMC with Uniform Prior');
legend('Data', 'Fitted Model');
grid on;
hold off;
% Display estimated parameters
fprintf('Estimated Imax: %.4f\n', Imax_est);
Estimated Imax: 1.6264
fprintf('Estimated k: %.4f\n', k_est);
Estimated k: 528787.1207
Please see attached.
In the modified code provided above, you will notice the following:
Log Prior Function: The logPrior function checks if the proposed parameters fall within the specified uniform bounds. If they do, it returns a log prior of zero; otherwise, it returns negative infinity, effectively rejecting those samples.
Proposal Mechanism: The proposal mechanism has been changed to generate parameters uniformly within the defined ranges for Imax and k (see the diagnostic sketch after this list).
Final Results: The code concludes with a plot of the original data and the fitted model, along with printed estimates of the parameters.
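One quick diagnostic worth running on the output (a sketch using the samples matrix returned by the code above): with an independence proposal drawn uniformly over a wide prior, the acceptance rate can be very low, in which case the chain barely moves and the posterior mean can correspond to a poor-looking fit.
moves = any(diff(samples) ~= 0, 2);   % iterations on which the chain moved to a new point
fprintf('Acceptance (move) rate: %.3f\n', mean(moves));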
Hope this helps.

Hi @Ella,

Please note that the original estimates were k = 100068.4244 and Imax = 2.1174. After incorporating the uniform prior distributions, the new estimates are Imax = 1.6233 and k = 533156.8786. Why? The original estimates were derived without the influence of prior distributions; the new ones reflect the uniform priors.

You probably are aware that the uniform prior restricts the parameter space, effectively guiding the MCMC sampling towards more plausible values based on prior knowledge. This can lead to more reliable estimates, especially when the data is limited or noisy.

Interpretation of New Estimates:

Estimated Imax = 1.6233: This value suggests a more constrained upper limit of the model's response, indicating that the model's saturation point is lower than previously estimated.
Estimated k = 533156.8786: This increase in k reflects a stronger response rate in the model, which may indicate a more rapid approach to saturation under the given conditions.

The constraints imposed by the priors can guide the estimation process, leading to more reliable and interpretable results.

Umar, this doesn't seem to fit. Possibly something is wrong with the definition of the likelihood function. Thanks, though!

Hi @Ella,

Am I missing something?

