Ridge function

In mathematics, a ridge function is any function f : R d → R {\displaystyle f:\mathbb {R} ^{d}\rightarrow \mathbb {R} } that can be written as the composition of an univariate function g : R → R {\displaystyle g:\mathbb {R} \rightarrow \mathbb {R} } , that is called a profile function, with an affine transformation, given by a direction vector a ∈ R d {\displaystyle a\in \mathbb {R} ^{d}} with shift b ∈ R {\displaystyle b\in \mathbb {R} } .

.mw-parser-output .hatnote{font-style:italic}.mw-parser-output div.hatnote{padding-left:1.6em;margin-bottom:0.5em}.mw-parser-output .hatnote i{font-style:normal}.mw-parser-output .hatnote+link+.hatnote{margin-top:-0.5em}@media print{body.ns-0 .mw-parser-output .hatnote{display:none!important}}

In mathematics, a ridge function is any function

    f
    :
    
      
        R
      
      
        d
      
    
    →
    
      R
    
  

{\displaystyle f:\mathbb {R} ^{d}\rightarrow \mathbb {R} }

that can be written as the composition of an univariate function

    g
    :
    
      R
    
    →
    
      R
    
  

{\displaystyle g:\mathbb {R} \rightarrow \mathbb {R} }

, that is called a profile function, with an affine transformation, given by a direction vector

    a
    ∈
    
      
        R
      
      
        d
      
    
  

{\displaystyle a\in \mathbb {R} ^{d}}

with shift

    b
    ∈
    
      R
    
  

{\displaystyle b\in \mathbb {R} }

.

Then, the ridge function reads

    f
    (
    x
    )
    =
    g
    (
    
      x
      
        ⊤
      
    
    a
    +
    b
    )
  

{\displaystyle f(x)=g(x^{\top }a+b)}

for

    x
    ∈
    
      
        R
      
      
        d
      
    
  

{\displaystyle x\in \mathbb {R} ^{d}}

.

Coinage of the term 'ridge function' is often attributed to B.F. Logan and L.A. Shepp.

A ridge function is not susceptible to the curse of dimensionality, making it an instrumental tool in various estimation problems. This is a direct result of the fact that ridge functions are constant in

    d
    −
    1
  

{\displaystyle d-1}

directions: Let

      a
      
        1
      
    
    ,
    …
    ,
    
      a
      
        d
        −
        1
      
    
  

{\displaystyle a_{1},\dots ,a_{d-1}}

be

    d
    −
    1
  

{\displaystyle d-1}

independent vectors that are orthogonal to

    a
  

{\displaystyle a}

, such that these vectors span

    d
    −
    1
  

{\displaystyle d-1}

dimensions. Then

f

      (
      
        
          x
        
        +
        
          ∑
          
            k
            =
            1
          
          
            d
            −
            1
          
        
        
          c
          
            k
          
        
        
          
            a
          
          
            k
          
        
      
      )
    
    =
    g
    
      (
      
        
          x
        
        ⋅
        
          a
        
        +
        
          ∑
          
            k
            =
            1
          
          
            d
            −
            1
          
        
        
          c
          
            k
          
        
        
          
            a
          
          
            k
          
        
        ⋅
        
          a
        
      
      )
    
    =
    g
    
      (
      
        
          x
        
        ⋅
        
          a
        
        +
        
          ∑
          
            k
            =
            1
          
          
            d
            −
            1
          
        
        
          c
          
            k
          
        
        0
      
      )
    
    =
    g
    (
    
      x
    
    ⋅
    
      a
    
    )
    =
    f
    (
    
      x
    
    )
  

{\displaystyle f\left({\boldsymbol {x}}+\sum _{k=1}^{d-1}c_{k}{\boldsymbol {a}}_{k}\right)=g\left({\boldsymbol {x}}\cdot {\boldsymbol {a}}+\sum _{k=1}^{d-1}c_{k}{\boldsymbol {a}}_{k}\cdot {\boldsymbol {a}}\right)=g\left({\boldsymbol {x}}\cdot {\boldsymbol {a}}+\sum _{k=1}^{d-1}c_{k}0\right)=g({\boldsymbol {x}}\cdot {\boldsymbol {a}})=f({\boldsymbol {x}})}

for all

      c
      
        i
      
    
    ∈
    
      R
    
    ,
    1
    ≤
    i
    <
    d
  

{\displaystyle c_{i}\in \mathbb {R} ,1\leq i

. In other words, any shift of

      x
    
  

{\displaystyle {\boldsymbol {x}}}

in a direction perpendicular to

      a
    
  

{\displaystyle {\boldsymbol {a}}}

does not change the value of

    f
  

{\displaystyle f}

.

Ridge functions play an essential role in amongst others projection pursuit, generalized linear models, and as activation functions in neural networks. For a survey on ridge functions, see. For books on ridge functions, see.

.mw-parser-output .reflist-columns-2{column-width:30em}.mw-parser-output .reflist-columns-3{column-width:25em}body.skin-vector-2022 .mw-parser-output .reflist-columns-2{column-width:27em}body.skin-vector-2022 .mw-parser-output .reflist-columns-3{column-width:22.5em}.mw-parser-output .references[data-mw-group=upper-alpha]{list-style-type:upper-alpha}.mw-parser-output .references[data-mw-group=upper-roman]{list-style-type:upper-roman}.mw-parser-output .references[data-mw-group=lower-alpha]{list-style-type:lower-alpha}.mw-parser-output .references[data-mw-group=lower-greek]{list-style-type:lower-greek}.mw-parser-output .references[data-mw-group=lower-roman]{list-style-type:lower-roman}.mw-parser-output div.reflist-liststyle-upper-alpha .references{list-style-type:upper-alpha}.mw-parser-output div.reflist-liststyle-upper-roman .references{list-style-type:upper-roman}.mw-parser-output div.reflist-liststyle-lower-alpha .references{list-style-type:lower-alpha}.mw-parser-output div.reflist-liststyle-lower-greek .references{list-style-type:lower-greek}.mw-parser-output div.reflist-liststyle-lower-roman .references{list-style-type:lower-roman}