5th International Summer School
Achievements and Applications of Contemporary Informatics,
Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 3-15, 2010




                        On Foundations of Parameter Estimation for
                              Generalized Partial Linear Models
                        with B–Splines and Continuous Optimization

                                             Gerhard-Wilhelm WEBER
                              Institute of Applied Mathematics, METU, Ankara, Turkey
                            Faculty of Economics, Business and Law, University of Siegen, Germany
                         Center for Research on Optimization and Control, University of Aveiro, Portugal
                                             Universiti Teknologi Malaysia, Skudai, Malaysia

                                                     Pakize TAYLAN
                           Department of Mathematics, Dicle University, Diyarbakır, Turkey

                                                          Lian LIU
                        Roche Pharma Development Center in Asia Pacific, Shanghai, China
Outline


•      Introduction
•      Estimation for Generalized Linear Models
•      Generalized Partial Linear Model (GPLM)
•      Newton-Raphson and Scoring Methods
•      Penalized Maximum Likelihood
•      Penalized Iteratively Reweighted Least Squares (P-IRLS)
•      An Alternative Solution for (P-IRLS) with CQP
•      Solution Methods
•      Linear Model + MARS,     and Robust CMARS
•      Conclusion
Introduction


The class of Generalized Linear Models (GLMs) has gained popularity as a statistical modeling tool.

This popularity is due to:

• The flexibility of GLM in addressing a variety of statistical problems,
• The availability of software (Stata, SAS, S-PLUS, R) to fit the models.


The class of GLMs extends traditional linear models by allowing:

•  the mean of a dependent variable to depend on a linear predictor through a nonlinear link function,

•  the probability distribution of the response to be any member of an exponential family of distributions.

   Many widely used statistical models belong to the GLM class:

o linear models with normal errors,
o logistic and probit models for binary data,
o log-linear models for multinomial data.
Introduction


Many other useful statistical models, for instance those with

•    Poisson, binomial,
•    Gamma or normal distributions,


can be formulated as GLM by the selection of an appropriate link function
and response probability distribution.

A GLM looks as follows:

            η_i = H(μ_i) = x_i^T β ;

•   μ_i = E(Y_i) :  expected value of the response variable Y_i ,
•   H :             smooth monotonic link function,
•   x_i :           observed value of the explanatory variables for the i-th case,
•   β :             vector of unknown parameters.
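To make this concrete, here is a minimal sketch (added for illustration, not part of the original slides) of fitting a GLM with a log link in Python; the use of the statsmodels package and the simulated data are assumptions of this example.

```python
# Minimal sketch (illustrative): fitting a GLM with a log link in Python.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 2)))       # design matrix with intercept column
eta = X @ np.array([0.5, 1.0, -0.3])                 # linear predictor x_i^T beta
y = rng.poisson(np.exp(eta))                         # response with mu_i = H^{-1}(eta_i) = exp(eta_i)

model = sm.GLM(y, X, family=sm.families.Poisson())   # log link is the canonical choice here
result = model.fit()                                 # IRLS / Fisher scoring under the hood
print(result.params)                                 # estimated beta
```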
Introduction


•   Assumptions: the Y_i are independent and can have any distribution from the exponential family of densities

            Y_i ~ f_{Y_i}(y_i, θ_i, φ) = exp( [ y_i θ_i − b_i(θ_i) ] / a_i(φ) + c_i(y_i, φ) )        (i = 1, 2, ..., n),

•   a_i , b_i , c_i are known functions, φ is a scale (dispersion) parameter, and θ_i is called the natural parameter.

•   General expressions for the mean and variance of the dependent variable Y_i :

            μ_i = E(Y_i) = b_i'(θ_i) ,
            Var(Y_i) = V(μ_i) φ ,
            V(μ_i) = b_i''(θ_i) / ω_i ,        a_i(φ) := φ / ω_i .
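As a worked example (added for illustration, using only the definitions above), the Poisson case can be written in this exponential-family form:

```latex
% Worked example: the Poisson distribution as a member of the exponential family.
% With f_{Y_i}(y_i) = e^{-\mu_i}\mu_i^{y_i}/y_i!, we identify the building blocks:
\begin{align*}
f_{Y_i}(y_i;\theta_i) &= \exp\bigl(y_i\log\mu_i - \mu_i - \log y_i!\bigr),\\
\theta_i &= \log\mu_i, \quad b_i(\theta_i) = e^{\theta_i}, \quad a_i(\phi)=1, \quad c_i(y_i,\phi) = -\log y_i!,\\
\mu_i &= b_i'(\theta_i) = e^{\theta_i}, \qquad \operatorname{Var}(Y_i) = b_i''(\theta_i)\,a_i(\phi) = \mu_i .
\end{align*}
```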
Estimation for GLM

•    Estimation and inference for GLMs are based on the theory of maximum likelihood estimation and on a least-squares approach:


            l(β) := Σ_{i=1}^n ( φ^{-1} [ y_i θ_i − b_i(θ_i) ] + c_i(y_i, φ) ).


•    The dependence of the right-hand side on β is solely through the dependence of the θ_i on β.

•    Score equations:

            Σ_{i=1}^n x_{ij} (∂μ_i / ∂η_i) V_i^{-1} (y_i − μ_i) = 0 ,

     η_i = η_i(β) = Σ_{j=0}^m x_{ij} β_j ,   x_{i0} = 1        (i = 1, 2, ..., n;  j = 0, 1, ..., m).

•    The score equations are solved by the Fisher scoring procedure, which is based on the Newton-Raphson algorithm.
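A minimal sketch of the Fisher scoring / IRLS iteration that solves these score equations, under the illustrative assumption of a logistic regression with the canonical logit link (the model choice and all names are assumptions of this example):

```python
# Minimal sketch (assumption: logistic regression with canonical logit link):
# Fisher scoring for the GLM score equations, using numpy only.
import numpy as np

def fisher_scoring_logistic(X, y, n_iter=25, tol=1e-8):
    """X: (n, m+1) design matrix with intercept column; y: (n,) 0/1 responses."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                         # linear predictor eta_i = x_i^T beta
        mu = 1.0 / (1.0 + np.exp(-eta))        # mu_i = H^{-1}(eta_i)
        W = mu * (1.0 - mu)                    # V(mu_i), equal to dmu/deta for the canonical link
        score = X.T @ (y - mu)                 # score vector
        fisher = X.T @ (W[:, None] * X)        # expected information E(C)
        step = np.linalg.solve(fisher, score)
        beta += step
        if np.linalg.norm(step) < tol:
            break
    return beta
```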
Generalized Partial Linear Models                                      (GPLMs)


•   Particular semiparametric models are the Generalized Partial Linear Models (GPLMs) :

    They extend the GLMs in that the usual parametric terms are augmented by a
    single nonparametric component:

            E(Y | X, T) = G( X^T β + γ(T) ) ;

•   β ∈ ℝ^m is a vector of parameters, and
    γ(·) is a smooth function, which we try to estimate by splines.


•   Assumption: an m-dimensional random vector X represents the (typically discrete) covariates, and
    a q-dimensional random vector T comprises the continuous covariates,
    which come from a decomposition of the explanatory variables.


                     Other interpretations of T :        role of the environment,
                                                          expert opinions,
                                                          Wiener processes,     etc.
Newton-Raphson and Scoring Methods

The Newton-Raphson algorithm is based on a quadratic Taylor series approximation.

•    An important statistical application of the Newton-Raphson algorithm is given by
     maximum likelihood estimation:

                l(θ, y) ≈ l^a(θ, y)
                         := l(θ^0, y) + (∂l(θ, y)/∂θ)|_{θ^0}^T (θ − θ^0)
                            + ½ (θ − θ^0)^T (∂²l(θ, y)/∂θ∂θ^T)|_{θ^0} (θ − θ^0) ;        θ^0 : starting value;

•    l(θ, y) = log L(θ, y) : log-likelihood function of θ, based on the observed data y = (y_1, y_2, ..., y_n)^T.

•    Next, determine the new iterate θ^1 from ∂l^a(θ, y)/∂θ = 0 :

            θ^1 := θ^0 + C^{-1} r ,        r := (∂l(θ, y)/∂θ)|_{θ^0} ,   C := − (∂²l(θ, y)/∂θ∂θ^T)|_{θ^0} .

•    Fisher's scoring method replaces C by its expectation E(C).
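A minimal sketch (added for illustration) of the update θ^1 := θ^0 + C^{-1} r, with the score r and the negative Hessian C supplied as functions; the Poisson-mean example at the end is purely an assumption of this sketch.

```python
# Minimal sketch: generic Newton-Raphson maximization of a log-likelihood.
import numpy as np

def newton_raphson(score, neg_hessian, theta0, n_iter=50, tol=1e-10):
    """score(theta) -> gradient of l; neg_hessian(theta) -> C = -d^2 l / d theta d theta^T."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        r = score(theta)
        C = neg_hessian(theta)
        step = np.linalg.solve(C, r)           # C^{-1} r
        theta = theta + step
        if np.linalg.norm(step) < tol:
            break
    return theta

# Illustrative example: MLE of the mean of a Poisson sample (closed form is the sample mean).
y = np.array([2.0, 3.0, 1.0, 4.0, 2.0])
score = lambda th: np.array([np.sum(y / th[0] - 1.0)])        # d l / d mu
neg_hess = lambda th: np.array([[np.sum(y) / th[0] ** 2]])    # -d^2 l / d mu^2
print(newton_raphson(score, neg_hess, [1.0]))                 # -> approx y.mean() = 2.4
```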
Penalized Maximum Likelihood

•     Penalized maximum likelihood criterion for the GPLM:

            j(β, γ) := l(η, y) − ½ α ∫_a^b ( γ''(t) )² dt .


•   l : log-likelihood of the linear predictor η; the second term penalizes the integrated squared
        curvature of γ(t) over the given interval [a, b].

•   α : smoothing parameter controlling the trade-off between
        accuracy of the data fitting and its smoothness (stability, robustness or regularity).


•     Maximization of j(β, γ) is performed with B-splines through the local scoring algorithm.
      For this, we write a degree-k B-spline with knots at the values t_i (i = 1, 2, ..., n) for γ(t):

            γ(t) = Σ_{j=1}^v λ_j B_{j,k}(t) ,

      where the λ_j are coefficients, and the B_{j,k} are degree-k B-spline basis functions.
Penalized Maximum Likelihood


•   Degree-zero and degree-k B-spline basis functions are defined by

            B_{j,0}(t) = 1  if t_j ≤ t < t_{j+1} ,   B_{j,0}(t) = 0  otherwise,

            B_{j,k}(t) = (t − t_j) / (t_{j+k} − t_j) · B_{j,k−1}(t)  +  (t_{j+k+1} − t) / (t_{j+k+1} − t_{j+1}) · B_{j+1,k−1}(t)        (k ≥ 1).

•   We write γ(t) := ( γ(t_1), ..., γ(t_n) )^T and define an n × v matrix B by B_ij := B_j(t_i);
    then,

            γ(t) = B λ ,        λ := (λ_1, λ_2, ..., λ_v)^T .

•   Further, define a v × v matrix K by

            K_kl := ∫_a^b B_k''(t) B_l''(t) dt .
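The recursion and the design matrix B can be coded directly; the sketch below (an illustration, with a hypothetical knot vector) evaluates B_{j,k}(t) by the Cox-de Boor recursion above and assembles B_ij = B_j(t_i). The penalty matrix K could then be obtained by numerically integrating products of second derivatives.

```python
# Minimal sketch: Cox-de Boor recursion for B_{j,k}(t) and the n x v design matrix.
import numpy as np

def bspline_basis(j, k, t, knots):
    """Evaluate B_{j,k}(t) for a non-decreasing knot sequence `knots` (0-based j)."""
    if k == 0:
        return 1.0 if knots[j] <= t < knots[j + 1] else 0.0
    left = 0.0
    if knots[j + k] > knots[j]:                           # skip degenerate spans (0/0 := 0)
        left = (t - knots[j]) / (knots[j + k] - knots[j]) * bspline_basis(j, k - 1, t, knots)
    right = 0.0
    if knots[j + k + 1] > knots[j + 1]:
        right = (knots[j + k + 1] - t) / (knots[j + k + 1] - knots[j + 1]) \
                * bspline_basis(j + 1, k - 1, t, knots)
    return left + right

def design_matrix(ts, knots, k):
    """Rows: observation points t_i; columns: the v = len(knots) - k - 1 basis functions."""
    v = len(knots) - k - 1
    return np.array([[bspline_basis(j, k, t, knots) for j in range(v)] for t in ts])

knots = np.array([0., 0., 0., 0., 0.25, 0.5, 0.75, 1., 1., 1., 1.])   # clamped cubic knots
B = design_matrix(np.linspace(0.0, 0.999, 20), knots, k=3)
print(B.shape)   # (20, 7)
```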
Penalized Maximum Likelihood


•   Then, the j(β, γ) criterion can be written as

            j(β, γ) = l(η, y) − ½ α λ^T K λ .


•   If we insert the least-squares estimate λ̂ = (B^T B)^{-1} B^T γ(t), we get

            j(β, γ) = l(η, y) − ½ α γ(t)^T M γ(t) ,

    where M := B (B^T B)^{-1} K (B^T B)^{-1} B^T .
•   Now, we will find β̂ and γ̂ by solving the optimization problem of maximizing j(β, γ).

•   Let

            H(μ) = η(X, t) = g_1 + g_2 ;        g_1 := X β ,   g_2 := γ(t) .
Penalized Maximum Likelihood


•   To maximize j(β, γ) with respect to g_1 and g_2, we solve the following system of equations:

            ∂j(β, γ)/∂g_1 = (∂η/∂g_1)^T ∂l(η, y)/∂η = 0 ,

            ∂j(β, γ)/∂g_2 = (∂η/∂g_2)^T ∂l(η, y)/∂η − α M g_2 = 0 ,

    which we treat by the Newton-Raphson method.


•   These system equations are nonlinear in β and g_2 .
    We linearize them around a current guess η^0 by

            ∂l(η, y)/∂η  ≈  (∂l(η, y)/∂η)|_{η^0} + (∂²l(η, y)/∂η∂η^T)|_{η^0} (η − η^0) .
Penalized Maximum Likelihood

•     We use this equation in the system of equations:

            [ C    C ;  C    C + αM ] [ g_1^1 − g_1^0 ;  g_2^1 − g_2^0 ] = [ r ;  r − α M g_2^0 ] ,        r := ∂l(η, y)/∂η ,   C := − ∂²l(η, y)/∂η∂η^T ,

      where (g_1^0, g_2^0) → (g_1^1, g_2^1) is a Newton-Raphson step, and C and r are evaluated at η^0.

•    In simpler form:

     (A*)        [ C    C ;  S_B    I ] [ g_1^1 ;  g_2^1 ] = [ C ;  S_B ] h ,        h := η^0 + C^{-1} r ,   S_B := (C + α M)^{-1} C ,

     which can be resolved for

            g_1^1 = X β^1 = X (X^T C X)^{-1} X^T C (h − g_2^1) ,        g_2^1 = S_B (h − g_1^1) .
Penalized Maximum Likelihood



•    β̂ and γ̂ can be found explicitly, without iterating this inner backfitting loop:

            ĝ_1 = X β̂ = X { X^T C (I − S_B) X }^{-1} X^T C (I − S_B) h ,
            ĝ_2 = γ̂ = S_B (h − X β̂) .

•    Here, X represents the regression matrix for the input data x_i,
     S_B computes a weighted B-spline smoothing of the variable t_i,
     with weights given by

            C = − ∂²l(η, y)/∂η∂η^T ,

     and h is the adjusted dependent variable.
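A minimal sketch of this explicit one-step solution, assuming the matrices X, C, S_B and the adjusted dependent variable h are already available as numpy arrays:

```python
# Minimal sketch: the explicit solution for beta-hat and gamma-hat from the slide.
import numpy as np

def explicit_backfit(X, C, S_B, h):
    """beta = {X^T C (I - S_B) X}^{-1} X^T C (I - S_B) h;  gamma = S_B (h - X beta)."""
    n = X.shape[0]
    A = X.T @ C @ (np.eye(n) - S_B)            # C is the (n x n) weight matrix
    beta_hat = np.linalg.solve(A @ X, A @ h)
    gamma_hat = S_B @ (h - X @ beta_hat)
    return beta_hat, gamma_hat
```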
Penalized Maximum Likelihood


•    From the updated β̂, γ̂, the outer loop must be iterated to update η̂ and, hence, h and C;
     then, the loop is repeated until sufficient convergence is achieved.

     Step-size optimization is performed by η(ω) = ω η^1 + (1 − ω) η^0,
     and we turn to maximizing j(η(ω)).

•    Standard results on the Newton-Raphson procedure ensure local convergence.

•    Asymptotic properties of the estimate η̂:

            η̂ = R_B ( η̂ + C^{-1} r̂ ) = R_B h ,        r̂ := (∂l(η, y)/∂η)|_{η = η̂} ,

     where R_B is the weighted additive fit operator.

     If we replace h, R_B and C by their asymptotic versions h_0, R_B^0 and C_0,
     then we get the covariance matrix for η̂:
Penalized Maximum Likelihood



                            Cov(η̂) ≅ R_B^0 C_0^{-1} (R_B^0)^T                       ( "≅" : asymptotically )
                                    ≅ R_B C^{-1} R_B^T ,

     and

                            Cov(ĝ_s) ≅ R_B^s C^{-1} (R_B^s)^T        (s = 1, 2).

•    Here, h ≈ h_0 has mean η and variance C_0^{-1} ≈ C^{-1}, and R_B^j is the matrix
     that produces ĝ_j from h based on B-splines.

•    Furthermore, η̂ is asymptotically normally distributed with covariance matrix

                            R_B^0 C_0^{-1} (R_B^0)^T .
Penalized Iteratively Reweighted Least Squares                                             (P-IRLS)


The penalized likelihood is maximized by penalized iteratively reweighted least squares
to find the (p+1)-th estimate η^[p+1] of the linear predictor; it is obtained by minimizing

   (B*)        ‖ (C^[p])^{1/2} (h^[p] − η) ‖² + α λ^T K λ ,        η_i^[p] = X_i^T β̂^[p] + γ̂^[p](t_i) ,
                                                                    μ_i^[p] = H^{-1}(η_i^[p]) ,

where h^[p] is the iteratively adjusted dependent variable, given by

            h_i^[p] := η_i^[p] + H'(μ_i^[p]) (y_i − μ_i^[p]) ;

here, H' represents the derivative of H with respect to μ, and
C^[p] is a diagonal weight matrix with entries

            C_ii^[p] := 1 / ( V(μ_i^[p]) H'(μ_i^[p])² ) ,

where V(μ_i^[p]) is proportional to the variance of Y_i according to the current estimate μ_i^[p].
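A minimal sketch of one P-IRLS update, under the illustrative assumption of Poisson responses with the log link (so H'(μ) = 1/μ and V(μ) = μ); X, B, K and the smoothing parameter α are taken as given:

```python
# Minimal sketch (assumption: Poisson responses, log link): one P-IRLS update for (beta, lambda).
import numpy as np

def p_irls_step(X, B, K, alpha, y, beta, lam):
    eta = X @ beta + B @ lam                       # current linear predictor
    mu = np.exp(eta)                               # mu = H^{-1}(eta)
    h = eta + (y - mu) / mu                        # adjusted dependent variable h_i
    C = np.diag(mu)                                # weights 1 / (V(mu) H'(mu)^2) = mu
    Z = np.hstack([X, B])                          # combined regression matrix
    P = np.zeros((Z.shape[1], Z.shape[1]))
    P[X.shape[1]:, X.shape[1]:] = alpha * K        # penalty acts on lambda only
    theta = np.linalg.solve(Z.T @ C @ Z + P, Z.T @ C @ h)
    return theta[:X.shape[1]], theta[X.shape[1]:]  # new (beta, lambda)
```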
Penalized Iteratively Reweighted Least Squares                                         (P-IRLS)


•   If we use γ(t) = B λ in (B*), we rewrite it as

            ‖ (C^[p])^{1/2} ( h^[p] − X β − B λ ) ‖² + α λ^T K λ .


•   Following Green and Yandell (1985), we suppose that K is of rank z ≤ v.
    Two matrices J and T can be formed such that

            J^T K J = I ,   T^T K T = 0   and   J^T T = 0 ,

    where J and T have v rows and full column ranks z and v − z, respectively.
    Then, we rewrite λ as

    (C*)        λ = J δ + T ξ

    with vectors δ, ξ of dimensions z and v − z, respectively.
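One possible construction of J and T (an assumption made for illustration, not necessarily Green and Yandell's original recipe) uses the eigendecomposition of the penalty matrix K; a sketch follows, after which we return to rewriting (B*).

```python
# Minimal sketch: J and T from the eigendecomposition of the (symmetric PSD) penalty matrix K.
import numpy as np

def green_yandell_JT(K, tol=1e-10):
    """Return J (v x z) and T (v x (v-z)) with J^T K J = I, T^T K T = 0, J^T T = 0."""
    w, Q = np.linalg.eigh(K)                   # K = Q diag(w) Q^T, eigenvalues ascending
    nonzero = w > tol * w.max()
    z = int(nonzero.sum())
    J = Q[:, nonzero] / np.sqrt(w[nonzero])    # scaled so that J^T K J = I_z
    T = Q[:, ~nonzero]                         # null-space directions: T^T K T = 0
    return J, T, z
```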

    Then, (B*) becomes

            ‖ (C^[p])^{1/2} ( h^[p] − [X, BT] (β^T, ξ^T)^T − B J δ ) ‖² + α δ^T δ .
Penalized Iteratively Reweighted Least Squares                                         (P-IRLS)


•   Using a Householder decomposition, the minimization can be split
    by separating the solution with respect to (β, ξ) from the one with respect to δ:

    (D*)        Q_1^T (C^[p])^{1/2} [X, BT] = R ,        Q_2^T (C^[p])^{1/2} [X, BT] = 0 ,

    where Q = [Q_1, Q_2] is orthogonal, and R is nonsingular, upper triangular and of full rank m + v − z.

    Then, we get the bilevel minimization problem of

    (E*_upper)      ‖ Q_1^T (C^[p])^{1/2} h^[p] − R (β^T, ξ^T)^T − Q_1^T (C^[p])^{1/2} B J δ ‖²        (upper level)

    with respect to (β, ξ), given δ, which in turn is based on minimizing

    (E*_lower)      ‖ Q_2^T (C^[p])^{1/2} h^[p] − Q_2^T (C^[p])^{1/2} B J δ ‖² + α δ^T δ        (lower level).
Penalized Iteratively Reweighted Least Squares                                   (P-IRLS)

•   The term (E*_upper) can be set to 0.
•   If we put

            H̃ := Q_2^T (C^[p])^{1/2} h^[p] ,        V := Q_2^T (C^[p])^{1/2} B J ,

    (E*_lower) becomes the problem of minimizing

            ‖ H̃ − V δ ‖² + α δ^T δ ,

    which is a ridge regression problem. Its solution is

    (E*)        δ = (V^T V + α I)^{-1} V^T H̃ .

    The other parameters can be found as

            (β^T, ξ^T)^T = R^{-1} Q_1^T (C^[p])^{1/2} ( h^[p] − B J δ ) .

•   Now, we can compute λ using (C*) and, finally,

            η^[p+1] = X β + B λ .
An Alternative Solution for (P-IRLS) with CQP


•     Both the penalized maximum likelihood and the P-IRLS method contain a smoothing parameter α.
      This parameter can be estimated by

o              generalized cross-validation (GCV),
o              minimization of an unbiased risk estimator (UBRE).



•     A different method solves P-IRLS by conic quadratic programming (CQP).

      Use the Cholesky decomposition of the v × v matrix K in (B*), such that K = U^T U.

      Then, (B*) becomes

      (F*)        ‖ W θ − v ‖² + α ‖ U λ ‖² ,        θ := (β^T, λ^T)^T ,
                                                      W := (C^[p])^{1/2} (X, B) ,
                                                      v := (C^[p])^{1/2} h^[p] .

•     The regression problem (F*) can be reinterpreted as

      (H*)        min_θ  G(θ)   subject to   g(θ) ≤ 0 ,        G(θ) := ‖ W θ − v ‖² ,
                                                                g(θ) := ‖ U λ ‖² − M ,
                                                                M ≥ 0 .
An Alternative Solution for (P-IRLS) with CQP



•     Then, our optimization problem (H*) is equivalent to

                  min_{t, θ}  t ,

                  where      ‖ W θ − v ‖² ≤ t² ,   t ≥ 0 ,
                             ‖ U λ ‖² ≤ M ;

      here, W and U are n × (m + v) and v × v matrices, and
      θ and v are (m + v)- and n-vectors, respectively.

•     This means:

      (I*)        min_{t, θ}  t ,

                  where      ‖ W θ − v ‖ ≤ t ,
                             ‖ U λ ‖ ≤ √M .
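A minimal sketch of (I*) as a second-order cone program, written with the cvxpy modelling package (the use of cvxpy, and all variable names, are assumptions of this illustration, not part of the original slides):

```python
# Minimal sketch: problem (I*) as an SOCP, with W, v, U, M given and theta = (beta, lambda).
import numpy as np
import cvxpy as cp

def solve_cqp(W, v, U, M, m):
    """m: number of beta-components; columns of W are ordered as (beta, lambda)."""
    n_par = W.shape[1]
    theta = cp.Variable(n_par)
    t = cp.Variable()
    lam = theta[m:]                                   # spline coefficients
    constraints = [cp.norm(W @ theta - v, 2) <= t,    # ||W theta - v|| <= t
                   cp.norm(U @ lam, 2) <= np.sqrt(M)] # ||U lambda|| <= sqrt(M)
    cp.Problem(cp.Minimize(t), constraints).solve()
    return theta.value
```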
An Alternative Solution for (P-IRLS) with CQP

•     A conic quadratic programming (CQP) problem has the form

                  min_x  c^T x ,

                  where  ‖ D_i x − d_i ‖ ≤ p_i^T x − q_i        (i = 1, 2, ..., k) ;

      our problem is a CQP with

            c = (1, 0_{m+v}^T)^T ,   x = (t, θ^T)^T = (t, β^T, λ^T)^T ,   D_1 = (0_n, W) ,   d_1 = v ,   p_1 = (1, 0, ..., 0)^T ,

            q_1 = 0 ,   D_2 = (0_{v×(m+1)}, U) ,   d_2 = 0_v ,   p_2 = 0_{m+v+1} ,   q_2 = −√M ;   k = 2.

•     We first reformulate (I*) as a primal problem:

                  min_{t, θ}  t ,

                  such that   χ_1 := [ 0_n   W ;  1   0_{m+v}^T ] (t, θ^T)^T + (−v^T, 0)^T  =  ( (W θ − v)^T , t )^T ,

                              χ_2 := [ 0_v   0_{v×m}   U ;  0   0_m^T   0_v^T ] (t, θ^T)^T + (0_v^T, √M)^T  =  ( (U λ)^T , √M )^T ,

                              χ_1 ∈ L^{n+1} ,   χ_2 ∈ L^{v+1} ,
An Alternative Solution for (P-IRLS) with CQP



      with ice-cream (or second-order, or Lorentz) cones:

            L^{l+1} := { x = (x_1, ..., x_{l+1})^T ∈ ℝ^{l+1} | x_{l+1} ≥ √(x_1² + x_2² + ... + x_l²) } .
•     The corresponding dual problem is

                  max   (v^T, 0) ω_1 + (0_v^T, −√M) ω_2

                  such that   [ 0_n^T   1 ;  W^T   0_{m+v} ] ω_1  +  [ 0_v^T   0 ;  0_{m×v}   0_m ;  U^T   0_v ] ω_2  =  (1, 0_{m+v}^T)^T ,

                              ω_1 ∈ L^{n+1} ,   ω_2 ∈ L^{v+1} .
Solution Methods


•   Polynomial-time algorithms are requested.

     –   Usually, only local information on the objective and the constraints is given.

     –   Such algorithms cannot utilize a priori knowledge of the problem's structure.

     –   CQPs belong to the well-structured convex problems.



•   Interior point algorithms:

     –   They exploit the structure of the problem.

     –   They yield better complexity bounds.

     –   They exhibit much better practical performance.
Outlook


•  Important new class of GPLMs:

            E(Y | X, T) = G( X^T β + γ(T) ) ,        e.g.,

•  GPLM (X, T)   =   LM (X)  +  MARS (T)



[Figure: the pair of MARS basis functions c^-(x, τ) and c^+(x, τ) plotted against x; Tikhonov-regularized least squares on such basis functions leads to CMARS.]
Outlook


•  Robust CMARS:

[Figure: data with outliers for Robust CMARS (RCMARS); around the data, confidence intervals of given semi-length are placed.]
References


[1] Aster, A., Borchers, B., and Thurber, C., Parameter Estimation and Inverse Problems, Academic
     Press, 2004.
[2] Craven, P., and Wahba, G., Smoothing noisy data with spline functions, Numer. Math. 31 (1979),
     377-403.
[3] De Boor, C., Practical Guide to Splines, Springer Verlag, 2001.
[4] Dongarra, J.J., Bunch, J.R., Moler, C.B., and Stewart, G.W., Linpack User’s Guide, Philadelphia,
     SIAM, 1979.
[5] Friedman, J.H., Multivariate adaptive regression splines, (1991), The Annals of Statistics
    19, 1, 1-141.
[6] Green, P.J., and Yandell, B.S., Semi-Parametric Generalized Linear Models, Lecture Notes in
     Statistics, 32 (1985).
[7] Hastie, T.J., and Tibshirani, R.J., Generalized Additive Models, New York, Chapman and Hall,
     1990.
[8] Kincaid, D., and Cheney, W., Numerical Analysis: Mathematics of Scientific computing, Pacific
     Grove, 2002.
[9] Müller, M., Estimation and testing in generalized partial linear models – a comparative study,
     Statistics and Computing 11 (2001), 299-309.
[10] Nelder, J.A., and Wedderburn, R.W.M., Generalized linear models, Journal of the Royal Statistical
     Society A, 145, (1972) 470-484.
[11] Nemirovski, A., Lectures on modern convex optimization, Israel Institute of Technology
     http://iew3.technion.ac.il/Labs/Opt/opt/LN/Final.pdf.
References


[12] Nesterov, Y.E., and Nemirovskii, A.S., Interior Point Methods in Convex Programming,
     SIAM, 1993.
[13] Ortega, J.M., and Rheinboldt, W.C., Iterative Solution of Nonlinear Equations in Several
     Variables, Academic Press, New York, 1970.
[14] Renegar, J., Mathematical View of Interior Point Methods in Convex Programming, SIAM,
     2000.
[15] Scheid, F., Numerical Analysis, McGraw-Hill Book Company, New York, 1968.
[16] Taylan, P., Weber, G.-W., and Beck, A., New approaches to regression by generalized
     additive and continuous optimization for modern applications in finance, science and
     technology, Optimization 56, 5-6 (2007), pp. 1-24.
[17] Taylan, P., Weber, G.-W., and Liu, L., On foundations of parameter estimation for
     generalized partial linear models with B-splines and continuous optimization, in the
     proceedings of PCO 2010, 3rd Global Conference on Power Control and Optimization,
     February 2-4, 2010, Gold Coast, Queensland, Australia.
[18] Weber, G.-W., Akteke-Öztürk, B., İşcanoğlu, A., Özöğür, S., and Taylan, P., Data Mining:
     Clustering, Classification and Regression, four lectures given at the Graduate Summer
     School on New Advances in Statistics, Middle East Technical University, Ankara, Turkey,
     August 11-24, 2007 (http://www.statsummer.com/).
[19] Wood, S.N., Generalized Additive Models, An Introduction with R, New York, Chapman
     and Hall, 2006.
Thank you very much for your attention!




http://www3.iam.metu.edu.tr/iam/images/7/73/Willi-CV.pdf
