Likelihood-based statistics

1. Working environment
2. Model+parameters

We address the statistical problem of parameter inference (which is different from model testing): assuming a model is correct, what can we say about its parameters given some measurements ("data")? We will use for that the CAMEL software, which allows one to perform parameter estimates in both the frequentist (profile-likelihoods) and Bayesian (MCMC) approaches. This gives some idea of the influence of the statistical method used to analyse the data. Note that the example here is simple and the parameters well constrained, so both methods give similar results. But this may not be the case if you work on more general models (such as modified GR). Think about using CAMEL if you need some confidence in your statistical method (such as checking the influence of priors in the MCMC approach). We will first learn how to write a parameter file ("parFile"), which is a CAMEL (text) file that describes your inputs. The very first questions you should ask are:
Here we will work on the popular (flat) Lambda-CDM model: how many free parameters does it contain? In the following, for CPU reasons, we will focus on only 3 of them, i.e. we assume the others are fixed to some default values. "Data" will be represented by a (fake) power-spectrum measurement at different redshifts and some BAO scale. In linear theory, the shape of the matter power spectrum is mostly sensitive to the Hubble constant and to the matter densities, where one should distinguish between the standard "baryonic" and the "cold dark matter" contributions, which behave very differently.
In the following we will call your choice of parameters "par1,par2,par3" and your order-of-magnitude guesses "guess1,guess2,guess3".

Instructions
Edit there with your preferred editor a new file called for instance lcdm.par and complete it with your choices:

  fix par1 cosmo guess1
  fix par2 cosmo guess2
  fix par3 cosmo guess3
  class sBBN\ file bbn/sBBN.dat
  precisionFile=class_pre/hpjul2.pre
The starting keyword "class" can be used to transmit parameters to CLASS (similar to explanatory.ini but without the = sign).
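For illustration only (your own choice may differ): if one picked the Hubble constant and the two matter densities, using the standard CLASS parameter names, the parFile could look like this (the values are only order-of-magnitude guesses):

  fix H0        cosmo 70
  fix omega_b   cosmo 0.022
  fix omega_cdm cosmo 0.12
  class sBBN\ file bbn/sBBN.dat
  precisionFile=class_pre/hpjul2.pre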
Validation: all the executables are located in $EXECDIR (set up in your working environment). Run:

  $EXECDIR/writeSpectraPk lcdm.par 2000 output.fits 0.5

The second argument (here 2000) is lmax for the CMB Cl's, the next one is the name of the FITS output file, and the last one (here 0.5) is the redshift z at which the matter power spectrum is computed. You may now have a look at the Pk spectrum (which is in the 3rd header of the output.fits file). You can do this in Python; if you are not familiar with it, a minimal sketch follows (but it is better if you do it yourself). DOCKER users: you can use your own Python installation, since the output area is shared. Does the spectrum look "OK"?
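Here is such a minimal sketch, assuming astropy and matplotlib are available; the HDU index and the column names ("k", "Pk") are guesses, so check them first with hdul.info():

  # Inspect the matter power spectrum stored in output.fits.
  # The HDU index (2 = "3rd header") and the column names are
  # assumptions -- verify them with hdul.info() and hdu.columns.
  import matplotlib.pyplot as plt
  from astropy.io import fits

  with fits.open("output.fits") as hdul:
      hdul.info()                    # locate the P(k) table
      data = hdul[2].data
      k, pk = data["k"], data["Pk"]

  plt.loglog(k, pk)                  # P(k) is usually shown log-log
  plt.xlabel("k [1/Mpc]")
  plt.ylabel("P(k) [Mpc^3]")
  plt.show()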
3. Likelihood

A likelihood is a real-valued function of the parameters (a vector, here (par1,par2,par3)) that returns a high value when the model matches the data and low values otherwise. If correctly written (a big job in real life), it contains all the information concerning the observables. We will use equally the term "chi2", which by definition is chi2(pars) = -2 ln[L(pars)], so that a "high likelihood" means a "low chi2". An essential piece of information is the value of pars for which L(pars) is maximal, or chi2(pars) minimal. This is the Maximum Likelihood Estimator (MLE), and it is most often the best one can do to infer which set of pars describes the data best.

We work with some very basic likelihoods, which assume we measured a set of "Pk" points at redshifts z=0.5, 1, 1.5 over some k range and that we know exactly the associated Gaussian errors (a very over-simplified example, but something similar will really happen soon with the EUCLID/LSST missions, with a much higher precision than what we are playing with today). We also add another likelihood related to the BAO peak position. The "Pk" likelihood value (or chi2, a single number) for a set of cosmological parameters pars=(par1,par2,par3) is in fact computed in two steps: first, CLASS computes the theoretical matter power spectrum at the likelihood's redshift for these parameters; then, this prediction is compared to the measured points through a Gaussian chi2 built from the known errors.

Instructions

To use these likelihoods, add the following lines to lcdm.par:

  fakePk1=fake/pk_0.5.lik
  fakePk2=fake/pk_1.lik
  fakePk3=fake/pk_1.5.lik
  fakeBAO=fake/BAO_scale.lik
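To fix ideas, the Gaussian chi2 of the second step boils down to the following (a toy sketch with made-up numbers, not CAMEL's actual internals):

  import numpy as np

  # chi2 = sum over k-bins of ((data - model) / sigma)^2
  def gaussian_chi2(pk_model, pk_data, sigma):
      return np.sum(((pk_data - pk_model) / sigma) ** 2)

  pk_data  = np.array([2.1e4, 1.3e4, 0.7e4])      # fake measurements
  pk_model = np.array([2.0e4, 1.4e4, 0.7e4])      # model prediction
  sigma    = np.array([0.2e4, 0.1e4, 0.1e4])      # known Gaussian errors
  print(gaussian_chi2(pk_model, pk_data, sigma))  # small value = good match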
Then we add to lcdm.par some CLASS directives to compute the spectrum only at the right redshifts (the default being z=0). For efficiency with these likelihoods, we also don't need CLASS to compute spectra above some k_max value, right? Then complete X,Y,Z (a comma-separated list without spaces) and W in the following CLASS-specific parameters:

  class z_pk X,Y,Z
  class P_k_max_1/Mpc W

To check that the parFile is correctly configured, we then run:

  $EXECDIR/writeChi2 lcdm.par

What happens (see the log messages) is that the likelihood is lacking a parameter, the (galactic) bias here. It is not a cosmological parameter but a nuisance parameter, which comes from the fact that one does not measure the full matter power spectrum directly, but the power spectrum inferred from some population of galaxies. Its value is not well known (it actually varies with the galaxy selection and redshift range; here, for simplicity, we assume it is the same in all the likelihoods). There are generally many nuisance parameters, describing for instance possible systematic errors in the measurement (values we are not very sure about). They impact the likelihood directly (not the cosmological model) and should be fitted together with the cosmological parameters (whose precision decreases when adding them). So finally, what we call "parameters" is the set of (cosmological, nuisance) ones. With a linear galaxy bias, the likelihood actually depends on them through the model P_gal(k) = bias^2 x P_m(k; cosmo).
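Schematically, the chi2 sketch from before now also takes the nuisance parameter (again a toy illustration, not CAMEL's code):

  import numpy as np

  # The model compared to the data is the galaxy power spectrum,
  # i.e. the matter spectrum scaled by the (linear) bias squared.
  def chi2(pk_theory, pk_data, sigma, bias):
      model = bias ** 2 * pk_theory
      return np.sum(((pk_data - model) / sigma) ** 2)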
Complete the parFile by adding the nuisance parameter (in this case, instead of "cosmo", use the keyword "nui"). For the moment we will fix its value to 1.07, so add:

  fix bias nui 1.07

and rerun writeChi2.

4. The Maximum Likelihood Estimator, a.k.a. "best-fit"

Given some measurement(s), we are looking for the set of parameters (within a model) that is the most likely to produce those data (note the "likely" semantics: it is not the same as "probable", a confusion often made by Bayesians). Technically, this is performed by looking for the parameters that minimize the chi2(pars) function (or maximize the likelihood). "pars" is a vector that includes both the cosmological and all the nuisance parameters. Here it has low dimensionality, but it is frequent to go up to 10, 30 or 100 parameters. So we are dealing with the minimization of a multi-dimensional function. There are several algorithms to do that, but there is no magical one (unless your likelihood is linear, which is rarely the case). Furthermore, we don't have access here to the analytical gradients of the function, which complicates the convergence. In CAMEL we use the MINUIT package, which is widely used in High Energy Physics. However, remember: no magic; you always have to experiment with such tools to ensure you really captured the minimum-chi2 solution. This means trying different starting points and checking that the solution is not blocked at some boundary, for example. So far our parameters were fixed. Let us release one.
Modify one of your parameters by adding a (rough) estimate of its error and some bounds within which it should be varied:

  par par1 cosmo guess1 error1 low1 high1

with guesses for the "error" and "bounds" (use spaces or tabs to separate the fields). Also, to see what is happening during the minimization, add to your parFile:

  verbose=true

and run:

  $EXECDIR/Minimize lcdm.par bf.txt hess.txt

(the last two arguments are output files to which the best-fit and the Hessian will be written). What is then shown are the values being tried by MINUIT. The last two columns are of interest too: they give respectively the chi2 value for this set of parameters and the computation time of the model (in s). Don't be worried at the end by this kind of message:

  WARNING: FunctionMinimum is invalid.

which is hard to avoid because of numerical noise. The minimum can actually be valid (but maybe not the estimate of its error; see the last part). At the end you need to check the values under:

  # ext. || Name || type || Value || Error +/-

Did the parameter reach a boundary? Does the error seem correct? Modify slightly the starting values (or errors) and rerun. Do you obtain the same minimum? Why? Which one is better? Now release all the parameters but the bias one (i.e. change the "fix" lines into "par" ones) and look for the global best-fit. This is essentially try-and-guess work (on batch centers it is performed more automatically); here are a few recommendations: start from the best values found so far and refine them progressively; try several different starting points and keep the solution with the lowest chi2; and check that no parameter ends up stuck at one of its bounds.
When you have a reasonable first guess for the 3 parameters, you may also try releasing the "bias" parameter (but do not spend too much time on it if it hardly converges; rather move on to the next part).
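CAMEL drives MINUIT internally, but to get a feel for what Minimize does you can experiment with the same algorithm from Python through the iminuit package (a standalone toy sketch with a made-up chi2, not CAMEL's interface):

  from iminuit import Minuit

  # Toy quadratic chi2 standing in for the real (expensive) CLASS-based one.
  def chi2(h0, omega_b, omega_cdm):
      return (((h0 - 67.5) / 0.5) ** 2
              + ((omega_b - 0.022) / 0.001) ** 2
              + ((omega_cdm - 0.12) / 0.005) ** 2)

  m = Minuit(chi2, h0=70.0, omega_b=0.02, omega_cdm=0.1)  # starting point
  m.errordef = Minuit.LEAST_SQUARES  # chi2-like function (UP = 1)
  m.limits["h0"] = (50, 90)          # bounds, as in the parFile
  m.migrad()                         # MIGRAD minimization
  print(m.values, m.errors)          # best-fit values and parabolic errors
  print(m.fmin.is_valid)             # MINUIT's validity flag

Just as with Minimize, try changing the starting point and the bounds to see how the convergence behaves.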
5. Profile-likelihoods

The MLE (which is classical statistics) gives you a solution in a multi-dimensional space, which is not very convenient to visualize. We would like to make statements about each variable separately. What can we say about individual parameters? Only here do you have to choose your religion and decide what (philosophically) a probability means! It's time to ask yourself: are you a Bayesian or a frequentist?

Certainly the sanest way for a scientist is to be agnostic and compare both approaches (which, when the problem is well constrained as here, give similar results in practice, but this is not always the case). Profile-likelihoods are the frequentist solution (MCMC sampling being the Bayesian one) to estimating intervals for individual variables. Rather than writing a lot of theory, let us see in practice how it works. We choose one variable, here H0, and we will profile it: fix H0 at several values spanning its interesting range; for each value, minimize the chi2 over all the other (cosmological and nuisance) parameters; then plot this minimum chi2 as a function of H0. The 68% confidence interval on H0 is the region where the curve stays within 1 of its global minimum (Delta chi2 <= 1).
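As a toy illustration of this scan (the helper chi2_min_at is hypothetical, standing in for one Minimize run with H0 fixed):

  import numpy as np

  # Placeholder: in the tutorial, one evaluation of this function means
  # "fix H0 cosmo <value>" with all the other parameters released,
  # then reading back the minimum chi2 from the Minimize output.
  def chi2_min_at(h0_fixed):
      return ((h0_fixed - 67.5) / 0.8) ** 2  # toy parabola

  h0_grid = np.linspace(64, 71, 29)
  dchi2 = np.array([chi2_min_at(h) for h in h0_grid])
  dchi2 -= dchi2.min()                       # profile relative to its minimum

  best = h0_grid[dchi2.argmin()]
  inside = h0_grid[dchi2 <= 1.0]             # 68% CL: Delta chi2 <= 1
  print(f"H0 = {best:.2f} +{inside.max() - best:.2f} -{best - inside.min():.2f}")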
Think about it: where is the minimum located with respect to the global best-fit (i.e. the MLE, when all the parameters are free)? Sometimes starting from a slightly different point leads to a slightly better chi2, and you should then use that one. Here it may be a bit cumbersome because you are doing it by hand. In real life there are tools to send several minimizations (with different starting values) to a cluster and keep the best chi2, but this is outside the scope of this school. You may look at a more real-life case here (see Fig. 4). Finally, write your statement about H0 using the "+-" notation.

6. Conclusions

The goal of this lecture was to show that minimizing a multi-dimensional likelihood to get parameter estimates is not always a trivial task, because observables in cosmology are generally complicated functions of the parameters. It may require trying different "starting values" or "error estimates" for the fit in order to find the best one (in real life you should automate this process). You may also have to increase the precision of the computations in CLASS (there are many parameters in CLASS for that), since numerical noise may spoil the numerical convergence to the minimum. Profile-likelihoods are a powerful way of estimating ranges for individual parameters in a "frequentist" way. Note that there is no such thing as "priors" here. So if you used an MCMC for your parameter estimation, getting a similar result with a profile-likelihood gives you good confidence that you are not actually pulling your results with the priors (even if you implicitly used a flat prior, which is not non-informative). This is very valuable in particular when you work with models that are less constrained than LCDM (such as modified GR).