Commit c70a796e authored by Pospelov, Gennady

Manual: review of fitting section

parent a588cacf
......@@ -6,20 +6,23 @@ X-ray and neutron scattering by
multilayered samples, \BornAgain\ also offers the option to
fit the numerical model to reference data by modifying a selection of
sample parameters from the numerical model. This aspect
of the software is discussed in the following chapter.
\SecRef{FittingGentleIntroducion} gives a short introduction to the
basic concepts of data fitting. Users familiar with fitting can
directly proceed to \SecRef{FittingImplementation}, which details the
implementation of fittings in
\BornAgain\ . \Python\ fitting examples with detailed
of the software is discussed in the current chapter.
%\SecRef{FittingGentleIntroducion} gives a short introduction to the
%basic concepts of data fitting. Users familiar with fitting can
%directly proceed to \SecRef{FittingImplementation}, which details the
%implementation of fittings in
%\BornAgain\ .
\SecRef{FittingImplementation} details the
implementation of fittings in \BornAgain\ .
\Python\ fitting examples with detailed
explanations of every fitting step are given in \SecRef{FittingExamples}. Advanced fitting techniques, including fine tuning of minimization
algorithms, simultaneous fits of different data sets, and parameter
correlations, are covered in
\SecRef{FittingAdvanced}. \SecRef{FittingRightAnswers} contains some practical advise which might
\SecRef{FittingAdvanced}. \SecRef{FittingRightAnswers} contains some practical advice which might
help the user to get right answers from \BornAgain\ fitting.
\input{FittingGentleIntroduction}
%\input{FittingGentleIntroduction}
\input{FittingImplementation}
......
......@@ -13,7 +13,7 @@ The script can also be found at
\noindent
This example uses the same sample geometry as in \SecRef{Example1Python}.
Cylindrical and
prismatic particles in equal proportion, in an air layer, are deposited on a substrate layer, with no interference
prismatic particles in equal proportion are deposited on a substrate layer, with no interference
between the particles. We consider the following parameters to be unknown
\begin{itemize}
\item the radius of cylinders,
......@@ -70,7 +70,7 @@ def get_sample(): @\label{script2::get_sample}@
# air layer with particles and substrate form multi layer
air_layer = Layer(m_air)
air_layer.setDecoration(particle_decoration)
substrate_layer = Layer(m_substrate, 0)
substrate_layer = Layer(m_substrate)
multi_layer = MultiLayer()
multi_layer.addLayer(air_layer)
multi_layer.addLayer(substrate_layer)
......@@ -143,17 +143,8 @@ the screen the information about fit progress once per 10 iterations.
\end{lstlisting}
Lines ~\ref{script2::fitpars1}--~\ref{script2::fitpars2} enter the
list of fitting parameters. Here we use the cylinders' height and
radius and the prisms' height and half side length. The syntax of
\Code{addFitParameter} is
\begin{lstlisting}[language=python, style=eclipse,numbers=none]
FitSuite().addFitParameter(<name>, <initial value>, <iteration step>, <limits>)
\end{lstlisting}
where \Code{<name>} is the name of the sample pool parameters (see \SecRef{WorkingWithSampleParameters}
) selected
as a fitting parameter. Then we input its initial
value and the iteration step to be used in the minimization process. Finally
\Code{<limits>} specify the boundaries of the parameter's value. Here
the cylinder's length and prism half side are initially equal to $4\,{\rm nm}$,
radius and the prisms' height and half side length.
The cylinder's length and prism half side are initially equal to $4\,{\rm nm}$,
whereas the cylinder's radius and the prism length are equal to $6\,{\rm nm}$ before the minimization. The
iteration step is equal to $0.01\,{\rm nm}$ and only the lower
boundary is imposed to be equal to $0.01\,{\rm nm}$.
......
......@@ -33,7 +33,7 @@ as a function of $(x,y)$, the positions on the detector ``measured'' in this toy
The scattering picture, similar to some GISAS patterns, has been generated using the simple function
$$I(x,y) = G(0.1,~0.01) + \frac{\sin(x)}{x} \cdot \frac{\sin(y)}{y}$$
$$I(x,y) = G(0.1,~0.01) + {\rm sinc}(x) \cdot {\rm sinc}(y)$$
Here $G(0.1, 0.01)$ is a random variable distributed according to a Gaussian distribution
with a mean of 0.1 and a standard deviation $\sigma=0.01$.
Constant $0.1$ symbolizes our experimental background and constant $0.01$ refers
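The toy intensity described in this hunk can be reproduced with a few lines of \Python. This is only a minimal sketch: the grid size, the axis range and the names \Code{sinc} and \Code{toy_intensity} are assumptions made for illustration and are not taken from the manual.

```python
import math
import random

def sinc(t):
    """Unnormalized sinc, sin(t)/t, with the limiting value 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sin(t) / t

def toy_intensity(x, y, rng, background=0.1, sigma=0.01):
    """One pixel of I(x, y) = G(background, sigma) + sinc(x) * sinc(y)."""
    return rng.gauss(background, sigma) + sinc(x) * sinc(y)

# Assumed detector grid: 50 x 50 pixels spanning [-10, 10] in x and y.
n_bins = 50
axis = [-10.0 + 20.0 * i / (n_bins - 1) for i in range(n_bins)]
rng = random.Random(1)  # fixed seed so the "measurement" is reproducible
data = [[toy_intensity(x, y, rng) for x in axis] for y in axis]
```

Each pixel receives its own Gaussian draw, so the background fluctuates from pixel to pixel around 0.1 with a spread of 0.01.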
......@@ -54,7 +54,7 @@ Figure~\ref{fig:toyfit_data},right shows the intensity as a function
of $(x,y)$ calculated using our model and a fixed set of parameters
$p_0=0, p_1=1, p_2=0, p_3=0$.
Two distributions are qualitatively identical. However in order to find the
The two distributions are qualitatively identical. However in order to find the
values of parameters which best describe the experimental data, one has to
\begin{itemize}
\item define criteria for the difference between the actual data and the model,
......@@ -103,8 +103,7 @@ to total number of detector pixels.
The model function has the form
$f(\mathbf{x_{i}},\mathbf{p})$
where adjustable parameters are held in vector $\mathbf{p}$.
The least squared method finds the optimum of the model function which best
fits the data by searching the minimum of the sum of squared
The least squares method finds the optimum of the model function by searching for the minimum of the sum of squared
residuals
$$ \chi^{2}(\mathbf{p}) = \sum_{i=1}^{N}r_{i}^{2}$$
where the residual is defined as the difference between the measured value and the value predicted by the model.
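The sum of squared residuals defined above can be written down directly. The sketch below uses a hypothetical straight-line model purely for illustration; it is not the four-parameter model of the toy example.

```python
def chi_squared(model, params, xs, measured):
    """Sum of squared residuals r_i = measured_i - model(x_i, params)."""
    return sum((y - model(x, params)) ** 2 for x, y in zip(xs, measured))

# Hypothetical model for illustration only: a straight line p[0] + p[1] * x.
def line(x, p):
    return p[0] + p[1] * x

xs = [0.0, 1.0, 2.0, 3.0]
measured = [1.0, 3.0, 5.0, 7.0]  # generated from p = (1, 2) without noise

print(chi_squared(line, (1.0, 2.0), xs, measured))  # 0.0 at the true parameters
print(chi_squared(line, (0.0, 2.0), xs, measured))  # 4.0: each residual equals 1
```

A minimizer repeatedly evaluates such a function for trial values of the parameters and keeps the set that gives the smallest value.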
......@@ -145,7 +144,7 @@ The $\chi^2$
objective function is obtained by calculating the sum of squared residuals between
the measured (Fig.~\ref{fig:toyfit_data}, left) and the
predicted (Fig.~\ref{fig:toyfit_data}, right) values over $x,y$ space. It is defined
in parameter space $\mathbf{p}$, which have 4 dimensions.
in parameter space $\mathbf{p}$, which has 4 dimensions.
Figure~\ref{fig:toyfit_chi2} (left) shows the $\chi^2$ distribution as a function of
parameters $p_1,p_2$ while parameters $p_0,p_3$ remain fixed.
......@@ -155,7 +154,7 @@ parameters $p_2,p_3$ while parameters $p_0,p_1$ remain fixed.
One can see that the given objective function has a strongly
pronounced global minimum, which we aim to determine. In addition the
objective function presents a number of local minima.
The presence of these minima leads to a poor or slow convergence
The presence of these minima leads to slow, or even no, convergence
towards the single global minimum.
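The effect of local minima can be illustrated with a one-dimensional objective function and a plain gradient descent. Both the function and the descent routine are hypothetical and chosen for illustration only; they do not correspond to the toy model or to any minimizer shipped with \BornAgain.

```python
import math

def objective(p):
    """Hypothetical 1D objective: global minimum at p = 0, local minima elsewhere."""
    return 0.1 * p * p - math.cos(3.0 * p)

def gradient(p):
    """Analytic derivative of the objective."""
    return 0.2 * p + 3.0 * math.sin(3.0 * p)

def descend(start, step=0.01, iterations=2000):
    """Plain fixed-step gradient descent; it only ever walks downhill."""
    p = start
    for _ in range(iterations):
        p -= step * gradient(p)
    return p

for start in (0.5, 2.5):
    p_min = descend(start)
    print(f"start={start:.1f} -> p={p_min:.3f}, objective={objective(p_min):.3f}")
```

The first starting value lies in the basin of the global minimum, whereas the second one ends in a neighbouring local minimum with a larger value of the objective function. A purely local method cannot tell the two situations apart.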
......
......@@ -5,12 +5,14 @@
Fitting in \BornAgain\ deals with estimating the optimum parameters
in the numerical model by minimizing the difference between
numerical and reference data using $\chi^2$ or maximum likelihood methods. The features include
numerical and reference data.
%using $\chi^2$ or maximum likelihood methods.
The features include
\begin{itemize}
\item a variety of multidimensional minimization algorithms and strategies.
\item the choice over possible fitting parameters, their properties and correlations.
\item the full control on $\chi^2$ calculations, including applications of different normalizations and assignments of different masks and weights to different areas of reference data.
\item the full control on objective function calculations, including applications of different normalizations and assignments of different masks and weights to different areas of reference data.
\item the possibility to fit simultaneously an arbitrary number of data sets.
\end{itemize}
......@@ -24,8 +26,9 @@ Fitting work flow.
}
\label{fig:minimization_workflow}
\end{figure}
Before running the fitting the user is required to prepare some data and to
configure the fitting kernel of \BornAgain\ . The required stages consist in
configure the fitting kernel of \BornAgain\ . The required stages are
\begin{itemize}
\item Preparing the sample and the simulation description (multilayer, beam, detector parameters).
......@@ -96,21 +99,20 @@ to read it before proceeding any further.
The user specifies selected sample parameters as fit parameters using \Code{FitSuite}
and its \Code{addFitParameter} method
\begin{lstlisting}[language=shell, style=commandline]
fit_suite = FitSuite()
fit_suite.addFitParameter(<name>, <value>, <AttLimits>)
fit_suite.addFitParameter(<name>, <initial value>, <step>, <limits>)
\end{lstlisting}
Here \Code{<name>} corresponds to the parameter name in the sample's parameter pool.
By using wildcard's in the parameter name the group of sample parameters, corresponding to the given
pattern, can be associated with single fitting parameter and
fitted simultaneously to get common optimal value.
The second parameter \Code <value> correspond to the initial value of
the fitting parameter
while the third one \Code{<AttLimits>} corresponds to
the boundaries imposed on the range of variations of that value. It can be
where \Code{<name>} corresponds to the parameter name in the sample's parameter pool.
By using wildcards in the parameter name, a group of sample parameters, corresponding to the given
pattern, can be associated with a single fitting parameter and
fitted simultaneously to get a common optimal value (see \SecRef{WorkingWithSampleParameters}).
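The idea of one wildcard pattern selecting several pool parameters can be sketched with the standard \Python\ module \Code{fnmatch}. The parameter names, values and helper functions below are hypothetical, and the matching rules actually used by \BornAgain\ may differ from shell-style wildcards.

```python
from fnmatch import fnmatchcase

# Hypothetical parameter pool: the names and values below are illustrative only.
parameter_pool = {
    "/MultiLayer/Layer0/Cylinder/height": 5.0,
    "/MultiLayer/Layer0/Cylinder/radius": 5.0,
    "/MultiLayer/Layer0/Prism3/height": 5.0,
    "/MultiLayer/Layer0/Prism3/half_side": 5.0,
}

def matching_names(pattern, pool):
    """Return the pool names selected by a shell-style wildcard pattern."""
    return sorted(name for name in pool if fnmatchcase(name, pattern))

def set_matching(pattern, value, pool):
    """Assign one common value to every parameter matched by the pattern."""
    for name in matching_names(pattern, pool):
        pool[name] = value

# One fitting parameter drives both heights; radius and half side are untouched.
set_matching("*height", 4.0, parameter_pool)
print(matching_names("*height", parameter_pool))
```
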
The second parameter \Code{<initial value>} corresponds to the initial value of
the fitting parameter, while the third one, \Code{<step>},
sets the initial iteration step size.
The last parameter \Code{<limits>} corresponds to
the boundaries imposed on the parameter value. It can be
\begin{itemize}
\item \Code{limitless()} by default,
\item \Code{fixed()},
......@@ -144,25 +146,25 @@ this feature in the following releases and welcome users' requests on this subje
}
\vspace*{1mm}
To associate the simulation with the reference data, method \newline
To associate the simulation and the reference data with the fitting engine, the method \newline
\Code{addSimulationAndRealData} has to be used as shown
\begin{lstlisting}[language=python, style=eclipseboxed,numbers=none]
fit_suite = FitSuite()
fit_suite.addSimulationAndRealData(<simulation>, <reference>, <chi2_module>)
\end{lstlisting}
Here \Code{<simulation>} corresponds to \BornAgain\ simulation object
Here \Code{<simulation>} corresponds to a \BornAgain\ simulation object
with the sample, beam and detector fully defined, \Code{<reference>}
corresponds to the experimental data object obtained from the ASCII file and \Code{<chi2\_module>} is an optional parameter for advanced
control of $\chi^2$ calculations.
It is possible to call this given method more than once to submit more than one pair of
\Code{<simulation>, <reference>} to the fitting procedure and so in
order to proceed to simultaneous fits of
some combined data sets.
\Code{<simulation>, <reference>} to the fitting procedure.
In this way, simultaneous fits of
some combined data sets are performed.
By using the third \Code{<chi2\_module>} parameter different normalizations and weights
can be applied to let the user in full control of the way $\chi2$ is calculated.
By using the third \Code{<chi2\_module>} parameter, different normalizations and weights
can be applied to give the user full control of the way $\chi^2$ is calculated.
This feature will be explained in \SecRef{FittingAdvanced}.
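How weights, masks and several data sets can enter a single objective function is sketched below. The function names, the normalization by the number of unmasked points and the plain summation over data sets are assumptions made for illustration; this is not the implementation of the \BornAgain\ $\chi^2$ module.

```python
def weighted_chi2(simulated, reference, weights=None, mask=None):
    """Weighted sum of squared residuals, divided by the number of unmasked points."""
    n = len(reference)
    weights = [1.0] * n if weights is None else weights
    mask = [True] * n if mask is None else mask
    used = [i for i in range(n) if mask[i]]
    if not used:
        raise ValueError("all points are masked")
    total = sum(weights[i] * (reference[i] - simulated[i]) ** 2 for i in used)
    return total / len(used)

def combined_chi2(pairs):
    """Objective for a simultaneous fit: sum over (simulated, reference) pairs."""
    return sum(weighted_chi2(sim, ref) for sim, ref in pairs)

# Two small data sets fitted together; the masked point of the first call is ignored.
print(weighted_chi2([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], mask=[True, True, False]))
print(combined_chi2([([0.0], [1.0]), ([0.0, 0.0], [2.0, 0.0])]))
```

Masked points contribute neither to the sum nor to the normalization, so excluding a region of the detector does not artificially lower the value of the objective function.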
......@@ -175,7 +177,7 @@ This feature will be explained in \SecRef{FittingAdvanced}.
libraries. They are listed in Table~\ref{table:fit_minimizers}.
By default \Code{Minuit2} minimizer with default settings will be used and no additional
configuration needs to be done.
The remainder of this section explains some of the expert setting, which can be applied to get better
The remainder of this section explains some of the expert settings, which can be applied to get better
fit results.
The default minimization algorithm can be changed using
......@@ -276,7 +278,7 @@ fit_suite.runFit()
\end{lstlisting}
Depending on the complexity of the sample and the number of free sample parameters the fitting
process can count from tenths to thousands of iterations. The results of the fit can
process can take from tens to thousands of iterations. The results of the fit can
be printed on the screen using the command
\begin{lstlisting}[language=python, style=eclipseboxed, numbers = none]
fit_suite.printResults()
......
......@@ -8,12 +8,13 @@ local minima in the objective function. Many problems can cause the
fit to fail, for example:
\begin{itemize}
\item an unreliable physical model,
\item an inappropriate choice of objective function,
\item multiple local minima,
\item an unphysical behavior of the objective function, unphysical regions
in the parameters space,
\item an unreliable parameter error calculation in the presence of
limits on the parameter value,
\item often an exponential behavior of the objective function and the
\item an exponential behavior of the objective function and the
corresponding numerical inaccuracies, excessive numerical roundoff
in the calculation of its value and derivatives,
\item large correlations between parameters,
......@@ -34,10 +35,10 @@ fitting. It remains applicable to any fitting program and any kind of theoretica
\item provide a good initial guess for the fit parameters,
\item start from the default minimizer settings and perform some fine tuning after some experience has been acquired,
\item repeat the fit using different starting values for the parameters or their limits,
\item repeat the fit fixing and varying different groups of parameters,
\item use \Code{Minuit2} minimizer with \Code{Migrad} algorithm
(default) to get the most reliable parameter error estimation,
\item try \Code{GSLMultiFit} minimizer or \Code{Minuit2} minimizer with \Code{Fumili} algorithm to get fewer iterations.
\item repeat the fit, fixing and varying different groups of parameters,
%\item use \Code{Minuit2} minimizer with \Code{Migrad} algorithm
% (default) to get the most reliable parameter error estimation,
%\item try \Code{GSLMultiFit} minimizer or \Code{Minuit2} minimizer with \Code{Fumili} %algorithm to get fewer iterations.
%\subsection*{Interpretation of errors.}
......
......@@ -46,7 +46,7 @@ The framework consists of two shared libraries, \Code{libBornAgainCore} and
The library \Code{libBornAgainFit} contains a number of minimization engines
and interfaces to them, allowing the user to fit real data with the model previously defined.
\BornAgain\ depends from a few external and well established
\BornAgain\ depends on a few external and well established
open-source libraries: \Code{boost}, GNU scientific library, Eigen and
Fast Fourier Transformation libraries. They are required to be
installed on the system to run \BornAgain\ on Unix Platforms. In the
......