This Python tutorial explains how to use the curve_fit() method of SciPy (“Python Scipy Curve Fit”) to fit data to various functions, including exponential and Gaussian functions.
The curve_fit() method of the scipy.optimize module applies non-linear least squares to fit a function to data.
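For reference, the non-linear least-squares problem mentioned above can be written as follows. The symbols are chosen here for illustration and are not from the original text: θ stands for the fit parameters, f for the model function, and M for the number of data points.

```latex
\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{M} \bigl( y_i - f(x_i, \theta) \bigr)^2
```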
The syntax is given below.
scipy.optimize.curve_fit(f, xdata, ydata, p0=None, sigma=None, absolute_sigma=False, check_finite=True, bounds=(-inf, inf), method=None, jac=None, full_output=False, **kwargs)
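Where parameters are:
f: the model function. It must take the independent variable as its first argument and the parameters to fit as the remaining arguments.
xdata: the independent variable at which the data is measured.
ydata: the dependent data.
p0: the initial guess for the parameters. If None, every parameter starts at 1.
sigma: the uncertainty in ydata.
absolute_sigma: if True, sigma is used in an absolute sense; if False, only the relative magnitudes of the sigma values matter.
check_finite: whether to check that the input arrays contain no NaNs or infs.
bounds: lower and upper bounds on the parameters. By default there are no bounds.
method: the optimization method to use, one of 'lm', 'trf' or 'dogbox'.
jac: a function that computes the Jacobian of the model function with respect to the parameters.
full_output: if True, additional information about the fit is returned.
**kwargs: keyword arguments passed on to the underlying least-squares solver.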
The method curve_fit() returns popt (the optimal parameter values, which minimize the sum of the squared residuals of “f(xdata, *popt) - ydata”) and pcov (the estimated covariance of popt; its diagonal provides the variance of each parameter estimate). When full_output=True, it additionally returns infodict (a dictionary of optional outputs), mesg (a string message containing details about the solution) and ier (an integer status flag).
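As a minimal sketch of how the two main return values are typically used, the example below fits a straight line to synthetic data and turns the diagonal of pcov into standard errors. The model function `line`, the data, and the seed are all assumptions made for illustration and do not come from the tutorial.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, m, c):
    """Hypothetical model: a straight line with slope m and intercept c."""
    return m * x + c


# Synthetic data (assumed for illustration): slope 2, intercept 1, small noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = line(x, 2.0, 1.0) + 0.1 * rng.normal(size=x.size)

popt, pcov = curve_fit(line, x, y)

# One-standard-deviation uncertainty of each parameter from the diagonal of pcov
perr = np.sqrt(np.diag(pcov))

print("best-fit parameters:", popt)
print("standard errors:    ", perr)
```

The square root of the diagonal is a common way to report parameter uncertainties; the off-diagonal entries of pcov describe how the estimates are correlated.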
Now we will use this method to fit the data in the following subtopics.
When we plot a dataset, for example as a histogram, the shape of the resulting plot is what we call the dataset’s distribution. The bell curve, usually referred to as the Gaussian or normal distribution, is the most frequently seen shape for continuous data.
Let’s fit the data to a Gaussian function using the method curve_fit() by following the steps below:
Import the required methods or libraries using the below python code.
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
Create x and y data using the below code.
x_data = [-7.0, -6.0, -10.0, -9.0, -8.0, -1.0, 0.0, 1.0, 2.0, -5.0, -4.0,
          -3.0, -2.0, 7.0, 8.0, 3.0, 4.0, 5.0, 6.0, 9.0, 10.0]
y_data = [8.3, 10.6, 1.2, 4.2, 6.7, 15.7, 16.1, 16.6, 11.7, 13.5, 14.5,
          16.0, 12.7, 10.3, 8.6, 15.4, 14.4, 14.2, 6.1, 3.9, 2.1]
Convert x_data and y_data into NumPy arrays so that the model function can operate on them element-wise, then plot the raw points.
x_data = np.asarray(x_data)
y_data = np.asarray(y_data)
plt.plot(x_data, y_data, 'o')
Create a Gaussian function using the below code.
def Gaussian_fun(x, a, b):
    y_res = a*np.exp(-1*b*x**2)
    return y_res
Now fit the data to the gaussian function and extract the required parameter values using the below code.
params, cov = curve_fit(Gaussian_fun, x_data, y_data)
fitA = params[0]
fitB = params[1]
fity = Gaussian_fun(x_data, fitA, fitB)
Plot the fitted data using the below code.
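plt.plot(x_data, y_data, '*', label='data')
# x_data is not sorted, so sort it before drawing the fit as a line
order = np.argsort(x_data)
plt.plot(x_data[order], fity[order], '-', label='fit')
plt.legend()
plt.show()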
The output shows that the Gaussian function fits the data approximately.
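The Gaussian used above is centered at zero and has only two parameters. As a hedged sketch of a more general form, the example below fits a Gaussian with an amplitude, a center and a standard deviation. The function `gaussian`, the synthetic data and the starting values are assumptions for illustration, not part of the tutorial; in this parameterization the b of Gaussian_fun corresponds to 1/(2*sigma**2) with the center fixed at 0.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amp, mu, sigma):
    """Gaussian with amplitude amp, center mu and standard deviation sigma."""
    return amp * np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Synthetic data (assumed for illustration), not the dataset used above
rng = np.random.default_rng(1)
x = np.linspace(-10.0, 10.0, 81)
y = gaussian(x, 16.0, 1.5, 3.0) + 0.3 * rng.normal(size=x.size)

# Rough starting values read off the data: peak height, peak position,
# and a guessed width
p0 = [y.max(), x[np.argmax(y)], 2.0]

popt, pcov = curve_fit(gaussian, x, y, p0=p0)
amp_fit, mu_fit, sigma_fit = popt
# The model depends only on sigma**2, so the sign of sigma is arbitrary
sigma_fit = abs(sigma_fit)

print(amp_fit, mu_fit, sigma_fit)
```

Starting values taken from the data tend to help here, because a three-parameter Gaussian is more sensitive to the starting point than the two-parameter version.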
The independent variables can be passed to curve_fit() as a multi-dimensional array, provided that our model function accepts them in that form. Let’s understand with an example by following the below steps:
Import the required libraries or methods using the below python code.
from scipy import optimize
import numpy as np
Create a function that takes the array P and unpacks it into p and q using the below code.
def fun(P, x, y, z):
    p, q = P
    return np.log(x) + y*np.log(p) + z*np.log(q)
Create some noisy data to fit using the below code.
p = np.linspace(0.1, 1.2, 100)
q = np.linspace(1.1, 2.1, 100)
x, y, z = 8., 5., 9.
z = fun((p, q), x, y, z) * 1 + np.random.random(100) / 100
Define an initial guess and fit the data to multiple variables using the below code.
p0 = 7., 3., 6.
print(optimize.curve_fit(fun, (p, q), z, p0))
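The same idea can be sketched with a simpler model whose true parameters are known, which makes it easier to check the result. In the example below, two independent variables are stacked into a single array of shape (2, M) and unpacked inside the model function. The function `plane`, the variable names and the data are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def plane(X, a, b, c):
    """Hypothetical model with two independent variables stacked in X."""
    u, v = X  # X has shape (2, M): first row is u, second row is v
    return a * u + b * v + c


# Synthetic data (assumed for illustration)
rng = np.random.default_rng(2)
u = rng.uniform(0.0, 5.0, size=200)
v = rng.uniform(0.0, 5.0, size=200)
w = plane((u, v), 1.5, -0.7, 4.0) + 0.05 * rng.normal(size=u.size)

# Stack the independent variables into one (2, M) array for xdata
X = np.vstack((u, v))
popt, pcov = curve_fit(plane, X, w)

print(popt)
```

The dependent data stays one-dimensional with M entries; only the first argument of the model carries the extra dimension.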
By default, curve_fit() starts from an initial guess of 1.0 for every fit parameter. However, there are cases where the fit does not converge from that starting point, and we must then supply a sensible initial guess ourselves. Let’s see with an example by following the below steps:
Import the required libraries or methods using the below python code.
from scipy import optimize import numpy as np
Here, we’ll define some data points that are equally spaced in time, together with a range of temperatures that we hope will follow an exponential resembling a charging capacitor. We add some random noise to the array of temperatures and also define error bars on the temperature values.
def capcitor(x, y, z):
    return y*(1-np.exp(z*x))

t = np.linspace(0.5, 3.0, 9)
tempretures = np.array([14.77, 18.47, 20.95, 22.62, 23.73, 24.48, 24.98, 25.32, 25.54])
tempretures = tempretures + 0.4*np.random.normal(size=len(tempretures))
dTempretures = np.array([1.3, 0.8, 1.1, 0.9, 0.8, 0.8, 0.7, 0.6, 0.6])
Now fit data using the below code.
fit_Params, fit_Covariances = optimize.curve_fit(capcitor, t, tempretures)
print(fit_Params)
print(fit_Covariances)
Running the code above fails: curve_fit() raises an error reporting that the optimal parameters were not found because the number of calls to the function reached maxfev = 600.
As soon as we add some educated guesses (p0) for the two parameters y and z, the fit converges.
fit_Params, fit_Covariances = optimize.curve_fit(capcitor, t, tempretures, p0=[30.0, -1.0])
print(fit_Params)
print(fit_Covariances)
This is how to use the initial guesses with the method curve_fit() for fitting.
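An educated guess can often be read off the data itself. The sketch below shows one possible heuristic for a model of the same form as above: take the largest observed value as the plateau, and estimate the rate from the time at which the curve reaches about 63% of that plateau. The function `charging`, the parameter names, the synthetic data and the heuristic are all assumptions for illustration, not a method from the tutorial or from SciPy.

```python
import numpy as np
from scipy.optimize import curve_fit


def charging(t, y_max, rate):
    """Same functional form as the model above: y_max * (1 - exp(rate * t))."""
    return y_max * (1.0 - np.exp(rate * t))


# Synthetic data (assumed for illustration): y_max = 26, rate = -1.5
rng = np.random.default_rng(3)
t = np.linspace(0.1, 3.0, 30)
temps = charging(t, 26.0, -1.5) + 0.2 * rng.normal(size=t.size)

# Heuristic guess 1: the curve levels off near its largest observed value
y_max0 = temps.max()
# Heuristic guess 2: the curve reaches about 63% of its plateau after one
# time constant tau, and rate = -1 / tau
tau0 = t[np.argmax(temps >= 0.632 * y_max0)]
rate0 = -1.0 / tau0

popt, pcov = curve_fit(charging, t, temps, p0=[y_max0, rate0])
print("initial guess:", [y_max0, rate0])
print("fitted values:", popt)
```

The guess only has to be in the right neighborhood; the optimizer refines it from there.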
The method curve_fit() of Python Scipy accepts the parameter maxfev, the maximum number of function calls. In the above subsection, fitting the function to the data without an initial guess produced the error Optimal parameters not found: Number of calls to function has reached maxfev = 600.
That means the function was called 600 times without finding optimal parameters. (With the default 'lm' method and no user-supplied Jacobian, the default limit is 200*(N+1) calls for N parameters, which gives 600 for the two parameters here.) Let’s increase the value of the argument maxfev and see whether the fit then finds the optimal parameters, using the same example as in the above subsection “Python Scipy Curve Fit Initial Guess”.
fit_Params, fit_Covariances = optimize.curve_fit(capcitor, t, tempretures, maxfev=800)
print(fit_Params)
print(fit_Covariances)
From the output, we can see that the optimal parameters are found once up to 800 function calls are allowed.
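Because curve_fit() raises a RuntimeError when the least-squares minimization fails, the retry with a larger maxfev can also be automated. The sketch below tries a list of increasing limits until one succeeds. The helper `fit_with_budgets`, the function `charging`, the limits and the synthetic data are hypothetical and only illustrate the pattern.

```python
import numpy as np
from scipy.optimize import curve_fit


def charging(t, y_max, rate):
    """Same functional form as the capacitor model above."""
    return y_max * (1.0 - np.exp(rate * t))


def fit_with_budgets(model, x, y, p0, budgets=(5, 2000)):
    """Try increasing maxfev values until curve_fit converges (hypothetical helper)."""
    for maxfev in budgets:
        try:
            popt, pcov = curve_fit(model, x, y, p0=p0, maxfev=maxfev)
            return popt, pcov, maxfev
        except RuntimeError as err:
            # curve_fit raises RuntimeError when the minimization fails,
            # for example when the maxfev limit is reached
            print(f"maxfev={maxfev} failed: {err}")
    raise RuntimeError("no budget in the list was sufficient")


# Synthetic data (assumed for illustration)
rng = np.random.default_rng(4)
t = np.linspace(0.1, 3.0, 30)
temps = charging(t, 26.0, -1.5) + 0.2 * rng.normal(size=t.size)

popt, pcov, used = fit_with_budgets(charging, t, temps, p0=[30.0, -1.0])
print("converged with maxfev =", used, "->", popt)
```

Raising maxfev only helps when the optimizer is heading toward a solution; if the starting point is poor, a better p0 is usually the more reliable fix.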
The curve_fit() method in the scipy.optimize module of the SciPy Python package fits a function to data using non-linear least squares. In this section, we will define an exponential function and pass it to the method curve_fit() so that it can be fitted to generated data.
Let’s take an example by following the below steps:
Import the required libraries using the below python code.
import numpy as np
import matplotlib.pyplot as plt
# The next line is a Jupyter/IPython magic; remove it when running a plain script
%matplotlib inline
from scipy import optimize
Create an exponential function using the below code.
def expfunc(x, y, z, s):
    return y * np.exp(-z * x) + s
Use the code below to generate noisy data from known parameter values, fit the parameters of the function “expfunc” to that data, and plot the result.
x_data = np.linspace(0, 5, 60)
y_data = expfunc(x_data, 3.1, 2.3, 1.0)
random_numgen = np.random.default_rng()
noise_y = 0.3 * random_numgen.normal(size=x_data.size)
data_y = y_data + noise_y
plt.plot(x_data, data_y, 'b-', label='data')

p_opt, p_cov = optimize.curve_fit(expfunc, x_data, data_y)
# Label the curve with the fitted values of the parameters y, z and s
plt.plot(x_data, expfunc(x_data, *p_opt), 'r-',
         label='fit: y=%5.3f, z=%5.3f, s=%5.3f' % tuple(p_opt))
plt.legend()
plt.show()
The output shows the exponential function fitted to the data using the method curve_fit(). This is how to fit data to an exponential function.
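The optimization can also be restricted to a region of parameter space with the bounds argument. The sketch below repeats the exponential fit with assumed limits on each parameter. The limits, the seed and the noise level are chosen for illustration only. When bounds are given, curve_fit() switches from the default 'lm' method to 'trf', because 'lm' does not handle bounds.

```python
import numpy as np
from scipy.optimize import curve_fit


def expfunc(x, y, z, s):
    """Same exponential model as above."""
    return y * np.exp(-z * x) + s


# Synthetic data (assumed for illustration), seeded so the run is repeatable
rng = np.random.default_rng(5)
x_data = np.linspace(0, 5, 60)
data_y = expfunc(x_data, 3.1, 2.3, 1.0) + 0.1 * rng.normal(size=x_data.size)

# Assumed limits: 0 <= y <= 5, 0 <= z <= 5, 0 <= s <= 2
lower = [0.0, 0.0, 0.0]
upper = [5.0, 5.0, 2.0]
p_opt, p_cov = curve_fit(expfunc, x_data, data_y, bounds=(lower, upper))

print(p_opt)
```

Bounds are useful when some parameter values are physically meaningless, such as a negative decay rate in this model.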
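So, in this tutorial, we have learned about the “Python Scipy Curve Fit” and covered fitting a Gaussian, fitting with multiple independent variables, supplying an initial guess, increasing maxfev, and fitting an exponential function.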
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last 5 years. During this time I have gained expertise in various Python libraries such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, and Scikit-Learn for various clients in the United States, Canada, the United Kingdom, Australia, and New Zealand.