
Int. J. Pure Appl. Sci. Technol., 16(1) (2013), pp. 7-19 International Journal of Pure and Applied Sciences and Technology
ISSN 2229 - 6107

Available online at www.ijopaasat.in

Research Paper
Statistical Bayesian Analysis of Experimental Data
Labdaoui Ahlam1, * and Merabet Hayet1
1 Department of Mathematics, University Constantine 1, Route of Ain El Bey, 25000 Constantine
* Corresponding author, e-mail: ahlem_stat@live.fr
Abstract:
The Bayesian researcher should know the basic ideas underlying
Bayesian methodology and the computational tools used in modern Bayesian
econometrics. Some of the most important methods of posterior simulation are
Monte Carlo integration, importance sampling, Gibbs sampling and the Metropolis-
Hastings algorithm. The Bayesian should also be able to put the theory and
computational tools together in the context of substantive empirical problems. We
focus primarily on recent developments in Bayesian computation, and then on
particular models, inevitably combining theory and computation in the context of
those models. Although we have tried to be reasonably complete in covering the
basic ideas of Bayesian theory and the computational tools most commonly used
by the Bayesian, there is no way to cover all the classes of models used in
econometrics. We address the user of analysis of variance and of the linear
regression model.

Keywords:
Bayesian analysis, Markov Chain Monte Carlo Algorithms, regression
models.
1. Introduction
Regression is by far the largest field of statistics, both theoretical and applied. It is the preferred
method of econometrics, and in the practice of social science modeled on econometrics, "econometric
model" has come to mean any regression model, even without reference to economic problems.
The framework of the regression model is defined by a variable to predict (or "dependent" variable, with
dedicated notation y), and one (simple regression) or several (multiple regression) known predictor
(or "independent") variables. Regression consists of constructing from the predictor variables a regressed
variable ŷ that is as close as possible (in a sense to be specified) to the dependent variable.

Classical linear regression procedures, applicable to numeric variables, have recently been joined by logistic regression and its variants for categorical variables. The considerations of this module focus on linear regression and apply (mutatis mutandis) to the various forms of regression. For statistical experimental data, regression can be considered a special case of the analysis of variance, in the case of numeric independent variables. For observational data, new problems arise, related to the fact that in general the predictor variables are not statistically independent. It is these problems that have focused our recent work.

In the Bayesian framework, there is no fundamental difference between the observation and the parameter of a statistical model; both are considered variable quantities. If we denote by x the data, with sampling density f(x|θ), and by θ the parameters of the model under consideration (plus possibly latent variables), inference formally requires a prior π(θ) and an updating to the conditional distribution f(θ|x) of the parameter. Determining π(θ) and f(x|θ) gives the joint density

f(x, θ) = f(x|θ) π(θ).

After observing x, we can use Bayes' theorem to determine the distribution of θ conditional on the data (the posterior) (see ):

f(θ|x) = f(x|θ) π(θ) / ∫ f(x|θ) π(θ) dθ.

For the Bayesian approach, all the features of the posterior distribution are relevant for inference: moments, quantiles, etc. These quantities can often be expressed as the conditional expectation of a function h(θ) with respect to the posterior law:

E[h(θ)|x] = ∫ h(θ) f(x|θ) π(θ) dθ / ∫ f(x|θ) π(θ) dθ.
We can calculate the posterior distribution directly in simple cases; otherwise the calculation is carried
out by MCMC simulation, when the integrals involved are too complex.
In our work we first present the simple and multiple linear regression models and the logistic model;
we then set out the conditions for the use of Markov chain Monte Carlo (MCMC) algorithms and introduce
some MCMC algorithms, in particular the Metropolis-Hastings algorithm and the Gibbs sampling
method. Finally, we present the numerical results and their interpretations.
We used the software WinBUGS to estimate the parameters and interpret the results on actual data.
WinBUGS (the MS Windows operating system version of BUGS: Bayesian
Analysis Using Gibbs Sampling) is a versatile package that has been designed to carry out Markov
chain Monte Carlo (MCMC) computations for a wide variety of Bayesian models (see ).
2. Methodology
2.1 Regression Models
2.1.1. Linear Regression Model:
Regression addresses the type of problem in which two continuous
quantitative variables X and Y play asymmetrical roles: the variable Y depends on the variable X.
The connection between the dependent variable Y and the independent variable X can be modeled as
a function Y = α + βX + ε (see ).

Y: dependent variable (explained). X: independent variable (predictor). α: intercept (value of Y for X = 0). β: slope (average variation of Y for a one-unit increase of X). α and β can be calculated from the data by least squares:

β = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)², α = Ȳ − β X̄.

Another important quantity, closely related to β, is the correlation coefficient

r = Σ(X − X̄)(Y − Ȳ) / √(Σ(X − X̄)² Σ(Y − Ȳ)²),

a measure of the strength of the association between the X- and Y-data: the stronger the association, the
better X predicts Y.
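The least-squares formulas above can be checked numerically. The following is a minimal Python sketch; the arrays xs and ys are hypothetical illustrative values, not the trial data of this paper.

```python
# Minimal sketch: slope, intercept and correlation for simple linear
# regression, computed from the textbook formulas (hypothetical data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

beta = sxy / sxx                      # slope
alpha = y_bar - beta * x_bar          # intercept (value of Y for X = 0)
r = sxy / (sxx * syy) ** 0.5          # correlation coefficient

print(round(beta, 3), round(alpha, 3), round(r, 4))
```

With these illustrative data, r is close to 1, reflecting a nearly linear association.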
2.1.2 Multiple Linear Models: The multiple regression model is a generalization of the simple
regression model to the case where the explanatory variables are finite in number. The connection
between the dependent variable Y and the independent variables X1 and X2 can be modeled as a function
Y = α + β1 X1 + β2 X2.

A linear regression model is defined by an equation of the form:

Y(n×1) = X(n×p) β(p×1) + ε(n×1)

Y: an n-dimensional random vector.
X: a known n × p matrix, called the design matrix.
β: the p-dimensional vector of unknown model parameters.
ε: the centered, n-dimensional vector of errors.
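In this matrix form, the classical least-squares estimate solves the normal equations (X'X)β = X'Y. The sketch below illustrates this with simulated data; all names and values are illustrative assumptions, not part of the paper's analysis.

```python
import numpy as np

# Sketch of the matrix model Y = X beta + eps with simulated data
# (illustrative only): n = 20 observations, p = 3 parameters.
rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
Y = X @ beta_true + 0.001 * rng.normal(size=n)  # small centered errors

# Least-squares estimate: solve the normal equations (X'X) beta = X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)
```

With small errors, beta_hat recovers beta_true closely, as expected from the model.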

2.1.3 Logistic Model:
A standard qualitative regression model is the logistic regression, or logit model,
in which the conditional distribution of y given the explanatory variables z ∈ R^p is (see ):

P(y = 1) = 1 − P(y = 0) = exp(z'γ) / (1 + exp(z'γ)).

Consider the particular case where z = (1, x) and γ = (α, β). The random variables yi with values in {0, 1}, associated with the explanatory variables xi, are modeled using a Bernoulli conditional distribution:

yi | xi ~ B( exp(α + βxi) / (1 + exp(α + βxi)) ).

Assume that our parameters follow the improper prior π(α, β) = 1. The likelihood of our model for a sample (y1, x1), ..., (yn, xn) is equal to

f(y1, ..., yn | x1, ..., xn, α, β) = Π_{i=1}^{n} exp{(α + βxi) yi} / (1 + exp(α + βxi)).

The posterior distribution of (α, β) is then deduced by formal application of Bayes' theorem (see ):

π(α, β | y, x) ∝ Π_{i=1}^{n} exp{(α + βxi) yi} / (1 + exp(α + βxi)) = exp{Σ_{i=1}^{n} (α + βxi) yi} / Π_{i=1}^{n} (1 + exp(α + βxi)).

2.2. MCMC Methods
Markov chain Monte Carlo (MCMC) methods are used when the law of
interest cannot be simulated directly by the usual methods and/or when its density is known only up
to a normalizing constant.
2.2.1. Metropolis-Hasting Algorithm:
The Metropolis-Hastings algorithm is based on the use of a
conditional density q(y|x) with respect to the dominating measure of the model. It can be put into
practice if q(.|x) can be simulated quickly and is available either analytically, up to a constant independent
of x, or in symmetric form, that is to say such that q(y|x) = q(x|y). The Metropolis-Hastings algorithm (see
) associated with the objective law π and the conditional q produces a Markov chain x^(t) based on
the following transition:

Initialization: x0.
At each step k ≥ 0:
• Simulate a value y_k ~ q(.|x_k).
• Simulate a value u_k ~ U(0, 1).
• Set x_{k+1} = y_k if u_k ≤ ρ(x_k, y_k), and x_{k+1} = x_k otherwise, where

ρ(x_k, y_k) = min{ 1, [π(y_k) q(x_k|y_k)] / [π(x_k) q(y_k|x_k)] }.

The law q is called the instrumental or proposal law. This algorithm systematically accepts simulations y_t such that the ratio π(y_t)/q(y_t|x^(t)) is greater than the previous value π(x^(t))/q(x^(t)|y_t). It is only in the symmetric case that acceptance is governed by the ratio π(y_t)/π(x_t).
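As a standalone illustration (not taken from the paper), the transition above can be sketched with a random-walk Gaussian proposal, which is symmetric, so the acceptance probability reduces to min{1, π(y)/π(x)}. The target here is a standard normal, chosen purely for simplicity.

```python
import math
import random

random.seed(42)

def target(x):
    # Unnormalized density of the objective law pi: a standard normal.
    return math.exp(-0.5 * x * x)

def metropolis_hastings(n_iter, x0=0.0, sigma=1.0):
    # Random-walk proposal q(y|x) = N(x, sigma^2): symmetric, so
    # rho(x, y) = min(1, pi(y)/pi(x)) as noted in the text.
    chain = [x0]
    x = x0
    for _ in range(n_iter):
        y = x + random.gauss(0.0, sigma)          # simulate y ~ q(.|x)
        u = random.random()                        # simulate u ~ U(0, 1)
        if u <= min(1.0, target(y) / target(x)):   # accept with prob rho
            x = y
        chain.append(x)                            # else keep current x
    return chain

chain = metropolis_hastings(20000)
burn = chain[2000:]                                # discard burn-in
mean = sum(burn) / len(burn)
var = sum((v - mean) ** 2 for v in burn) / len(burn)
print(round(mean, 2), round(var, 2))
```

The empirical mean and variance of the chain should approximate the target's values 0 and 1.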
2.2.2 The Gibbs Sampling:
The Gibbs sampling algorithm is a simulation method for a law π(x) such that:
• x admits a decomposition of the form x = (x1, ..., xn);
• the conditional laws πi(.|x1, ..., x_{i−1}, x_{i+1}, ..., xn) are easily simulated (see ).

Example: (X, Y) ~ N(0, Σ), with Σ = [[1, ρ], [ρ, 1]].

Principle of the algorithm: updating "component by component":

X1 ~ π1(.|X2, ..., Xn)
...
Xi ~ πi(.|X1, ..., X_{i−1}, X_{i+1}, ..., Xn)
...
Xn ~ πn(.|X1, ..., X_{n−1})

3. Applications
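The applications below were run in WinBUGS, whose engine is Gibbs sampling. As a minimal standalone sketch of the component-by-component updating of Section 2.2.2, consider the bivariate normal example there: with Σ = [[1, ρ], [ρ, 1]], each full conditional is normal, X | Y = y ~ N(ρy, 1 − ρ²) and symmetrically for Y | X. The function name and parameter values below are illustrative assumptions.

```python
import random

random.seed(1)

def gibbs(n_iter, rho=0.8):
    # Gibbs sampler for (X, Y) ~ N(0, [[1, rho], [rho, 1]]):
    # X | Y = y ~ N(rho * y, 1 - rho^2), and symmetrically for Y | X.
    x, y = 0.0, 0.0
    s = (1.0 - rho * rho) ** 0.5   # conditional standard deviation
    draws = []
    for _ in range(n_iter):
        x = random.gauss(rho * y, s)   # update component X
        y = random.gauss(rho * x, s)   # update component Y
        draws.append((x, y))
    return draws

draws = gibbs(20000)[2000:]            # discard burn-in
mx = sum(d[0] for d in draws) / len(draws)
exy = sum(d[0] * d[1] for d in draws) / len(draws)
print(round(mx, 2), round(exy, 2))
```

The empirical mean of X approximates 0 and the empirical E[XY] approximates ρ, as expected for this target.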
3.1 Example with WinBUGS: Linear Model “Calculated α and β”
Table 1 gives the real data of a crossover study comparing a new laxative versus a standard laxative,
bisacodyl. Days with stool are used as primary endpoint. The table shows that the new drug is more
efficacious than bisacodyl (see ).
Table 1: Example of a crossover trial comparing efficacy of a new laxative versus bisacodyl
*Model with software WinBUGS. Y-variables: new treatment (days with stool). X-variables: bisacodyl (days with stool).

Y ~ N(mu_i, tau), mu_i = α + β * X_i

The model is:

model {
   for(i in 1 : 35) {
      y[i] ~ dnorm(mu[i], tau)
      mu[i] <- alpha + beta * X[i]
   }
   alpha ~ dnorm(0, 1.0E-6)
   beta ~ dnorm(0, 1.0E-6)
   tau ~ dgamma(1.0E-3, 1.0E-3)
   sigma <- 1/sqrt(tau)
}

We then proceed to the estimation, this time on two chains, with 110,000 iterations each (1,000 would be enough), keeping one iteration in 150. The parameters of the line are estimated as α = 8.669 with a standard deviation of 3.236 and β = 2.062 with a standard deviation of 0.2854. The WinBUGS summary outputs (MC error, 2.5%, median and 97.5% quantiles, start iteration) are as follows:
We now present a graphical representation of the parameters alpha and beta: kernel density in Fig. 1, quantiles in Fig. 2 and the autocorrelation function in Fig. 3.

Figure 1: Kernel density
Figure 2: Quantiles
Figure 3: Autocorrelation function

3.2 Example with WinBUGS: Multiple Linear Model “Calculated α, β1 and β2”

We may be interested to know whether age is an independent contributor to the effect of
the new laxative. For that purpose the simple regression equation has to be extended
as follows: Y = α + β1 X1 + β2 X2, where β1 and β2 are called partial regression coefficients. Just like a
simple linear regression, a multiple linear regression can give us the best fit for the given data, although
it is hard to display the correlations in a figure. Table 2 gives the data from Table 1
extended by the variable age (see ).
Table 2: Example of a crossover trial comparing efficacy of a new laxative versus bisacodyl
*Model with software WinBUGS. Y-variables: new treatment (days with stool). X1-variables: bisacodyl (days with stool). X2-variables: age (years).

Y ~ N(mu, tau), mu = α + β1 * X1 + β2 * X2

The model is as in Section 3.1, with the linear predictor replaced by:

mu[i] <- alpha + beta1 * X1[i] + beta2 * X2[i]

We then proceed to the estimation, this time on two chains, with 110,000 iterations each (1,000 would be enough), keeping one iteration in 150. The parameters are estimated as α = 2.332 with a standard deviation of 4.985, β1 = 1.876 with a standard deviation of 0.3003 and β2 = 0.282 with a standard deviation of 0.171. The WinBUGS summary outputs (MC error, 2.5%, median and 97.5% quantiles) are as follows:
We now present a graphical representation of the parameters alpha, beta1 and beta2: kernel density in Fig. 4, quantiles in Fig. 5 and the autocorrelation function in Fig. 6.

Figure 4: Kernel density
Figure 5: Quantiles
Figure 6: Autocorrelation function

3.3 Example with WinBUGS: Logistic Model

Our study is based on a comparison of an antiseptic cream with a placebo; the endpoint is the cure of an
infection. We seek to estimate the effect of the cream versus placebo. The following table gives the
responses of the 8 centers that we considered (see ):
Table 3: Processed data
The model is:
model
{
for(i in 1 : 8) {
rp[i] ~ dbin(pp[i], np[i])
rc[i] ~ dbin(pc[i], nc[i])
logit(pp[i]) <- alpha - beta / 2 + u[i]
logit(pc[i]) <- alpha + beta / 2 + u[i]
u[i] ~ dnorm(0.0, tau)
}
alpha ~ dnorm(0.0, 1.0E-6)
beta ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.1, 0.1)
sigma <- 1/ sqrt(tau)
OR <- exp(beta)
}
We then proceed to the estimation, this time on three chains, with 110,000 iterations each (1,000 would be
enough), keeping one iteration in 150. The (assumed homogeneous) effect of the cream is estimated at
0.757, with a standard deviation of 0.304.
The WinBUGS summary outputs (sd, MC error, 2.5%, median and 97.5% quantiles) are as follows:
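The last line of the model defines the odds ratio OR = exp(beta), which can be summarized directly from the posterior draws of beta. The sketch below uses hypothetical normal draws centered on the reported estimate (0.757, sd 0.304) purely for illustration; in practice, the draws would come from the WinBUGS chains themselves.

```python
import math
import random

random.seed(7)

# Hypothetical posterior draws of beta (illustration only): centered
# on the reported posterior mean 0.757 with sd 0.304.
draws_beta = [random.gauss(0.757, 0.304) for _ in range(10000)]

# The odds ratio is OR = exp(beta); summarize it from the draws.
or_draws = sorted(math.exp(b) for b in draws_beta)
or_median = or_draws[len(or_draws) // 2]
lo = or_draws[int(0.025 * len(or_draws))]   # 2.5% quantile
hi = or_draws[int(0.975 * len(or_draws))]   # 97.5% quantile
print(round(or_median, 2), round(lo, 2), round(hi, 2))
```

Because exp is monotone, the quantiles of OR are the exponentials of the quantiles of beta, so the interval can equivalently be obtained by transforming the beta quantiles.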
We now present a graphical representation of the parameters alpha and beta: kernel density in Fig. 7, quantiles in Fig. 8 and finally the autocorrelation function in Fig. 9.

Figure 7: Kernel density
Figure 8: Quantiles
Figure 9: Autocorrelation function
4. Discussion

In the linear regression model the regression line is Y = 8.646 + 2.065 X. The slope is 2.065 and the intercept is 8.646; for x = 1 we get y ≈ 10.7, therefore the new treatment is better than the standard treatment.
The regression line in the multiple linear regression model is Y = 2.332 + 1.876 X1 + 0.282 X2: whether or not we add the parameter age, the new treatment remains the best.
• As the odds ratio OR is greater than 1, with the 97.5% interval ranging from 1.191 to 3.88, we conclude that our antiseptic cream is effective.
5. Conclusion

One of the merits of our work is to have shown, using experimental data from clinical trials, that such
data can be modeled in a natural way and appropriate inferences drawn, namely the estimation of the
parameters of regression models (the simple and multiple linear models and the logit model) using
Markov chain Monte Carlo (MCMC) methods. This is all the more true as computer performance has
made effective simulation processes feasible, and the availability of computer programs has facilitated
the calculation of posterior probabilities, which were previously of daunting complexity.

6. Acknowledgement

We sincerely thank Mr. Pierre Druilhet, Professor at Blaise Pascal University,
Clermont-Ferrand, France, for his help and advice toward the successful completion of this work.
References

A. Agresti, Categorical Data Analysis (Volume 359), Wiley Series in Probability and Statistics, 2002.
A. Altaleb and C.P. Robert, Analyse bayésienne du modèle logit: Algorithme par tranches ou Metropolis-Hastings? Revue de Statistique Appliquée, 49(4) (2001), 53-70.
C.P. Robert and J.M. Marin, Bayesian Core: A Practical Approach to Computational Bayesian Statistics, Springer Texts in Statistics, 2007.
C.P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer, 2004.
D.J. Lunn, A. Thomas, N. Best and D. Spiegelhalter, WinBUGS – A Bayesian modelling framework: Concepts, structure and extensibility, Statistics and Computing, 10 (2000), 325-337.
É. Parent and J. Bernier, Le Raisonnement Bayésien, Springer-Verlag France, Paris, 2007.
L.R. França, Statistique Bayésienne, INSERM U669, May, 2009.
C.P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer-Verlag, New York, 1999.
C.P. Robert, L'Analyse Statistique Bayésienne, Economica, Paris, 1992.
T.J. Cleophas, A.H. Zwinderman and T.F. Cleophas, Statistics Applied to Clinical Trials, Springer, Dordrecht, The Netherlands, 2006.