$\newcommand{\mvec}[1]{\mathbf{#1}}\newcommand{\gvec}[1]{\boldsymbol{#1}}\definecolor{eqcol2}{RGB}{114,0,172}\definecolor{eqcol1}{RGB}{172,0,114}\definecolor{eqcol3}{RGB}{0,172,114}\definecolor{eqcol4}{RGB}{230,190,120}$In the description of the physically based rendering implementation of Unreal Engine 4[1], Karis states that instead of using the classical Fresnel formula \begin{align*} F(\mvec{v},\mvec{h}) = F_0 + (1-F_0)(1 - \mvec{v}\cdot\mvec{h})^5 \end{align*} he is using the approximation \begin{align*} F(\mvec{v},\mvec{h}) = F_0 + (1-F_0)2^{ (−5.55473(\mvec{v}\cdot\mvec{h}) −6.98316)(\mvec{v}\cdot\mvec{h}) } \end{align*} Note that $-5.55473$ and $−6.98316$ are oddly exact numbers. It is unkely that Karis found them by guessing. Instead, it is more likely that they were found by some numerical algorithm. In this article, we shall describe one method that could have been used for finding these constants.

First, note that $\mvec{v}$ and $\mvec{h}$ are in practice always normalized before they are used in the Fresnel formula. But then we must have $0 \le \mvec{v}\cdot\mvec{h} \le 1$, by the definition of the dot product. Consequently, if we want to find an approximation for $(1 - \mvec{v}\cdot\mvec{h})^5$, it is sufficient to find an approximation for $(1 - x)^5$, where $0 \le x \le 1$, since this is the only interval that is of interest when we are using the Fresnel formula. Within this interval, Karis finds the approximation $2^{ (−5.55473x −6.98316)x}$. Indeed, plotting reveals that this approximation is only useful within the interval $0 \le x \le 1$:

We shall denote this interval $0 \le x \le 1$ the *domain* of our
curve fitting problem.

We shall use the root mean square error(RMSE) to measure the quality of an approximation. The root mean square error between some original function $y(x)$ and its approximation $\hat{y}(x)$ is given by \begin{align*} RMSE = \sqrt{ \frac{\sum_{i=1}^N (y(x_i) - \hat{y}(x_i))^2}{n}}. \end{align*} That is, we define $N$ sample points at which we evaluate $y$ and $\hat{y}$. These sample points are uniformly distributed on the domain. For instance, if $N=5$, then the sample points are $x_1=0.00$, $x_2=0.25$, $x_3=0.50$, $x_4=0.75$, $x_5=1.00$. At every such sample point, we evaluate the squared difference between the original function and the approximation: $(y(x_i) - \hat{y}(x_i))^2$. If all these squared differences were 0, then $\hat{y}$ would be a very close fit to $y$, assuming that the number of sample points $N$ is large. The higher $N$ is, the better we can measure the similarity of $y$ and $\hat{y}$.

Finally, the sum of the squared differences is divided by the number of samples $N$ and then the square root is taken, and the result of this is the RMSE. These last two steps are done for the purpose of normalization.

The approximation of Karis has a RMSE of $0.002238$ with respect to the original Fresnel function $(1-x)^5$. For properly chosen values of $A$ and $B$, the curve of $2^{(Ax + B)x}$ is close to $(1-x)^5$. For instance, the assignments $A=-5$ and $B=-7$ result in an RMSE of $0.003689$. However, this is clearly worse than the approximation of Karis, where $A=-5.55473$ and $B=−6.98316$. What is necessary is an algorithm that allows us to find values of these two parameters that result in a low RMSE. One such algorithm shall described in the following section.

Therefore, to find a good value for $h$, we set the first partial
derivative to zero and solve for $h$, in order to find the value of
$h$ for which we have a local minimum:
\begin{align*}
-2\sum_{i=1}^N \left[ \left(y(x_i) -
\hat{y}(x_i,p) - h\frac{\partial \hat{y}}{\partial
p}(x_i,p)\right)\frac{\partial \hat{y}}{\partial p}(x_i,p) \right] = 0 \\
\end{align*}
Rearranging the above yields
\begin{equation}
\left( \sum_{i=1}^N \left(\frac{\partial \hat{y}}{\partial p}(x_i,p) \right)^2
\right)h = \sum_{i=1}^N \left( \left( y(x_i) - \hat{y}(x_i,p) \right) \frac{\partial \hat{y}}{\partial
p}(x_i,p) \right)
\label{eq:single}
\end{equation}
Therefore, to compute a value of $h$ that will move $\hat{y}(x,p)$ closer to
$y(x)$, we simply solve for $h$ in equation $\eqref{eq:single}$. The
new value of $p$ shall be set to $p+h$. Then $h$ will be solved for
again, and $p$ will be updated again, and this iterative scheme is
repeated until the RMSE is small enough. This algorithm is called the
*Gauss-Newton Algorithm*.

In practice, $\hat{y}$ will often be dependent on several $m$ parameters and not a single parameter, so that $p$ is instead a vector $\mvec{p}$ of dimensions $m \times 1$, so that our approximation is instead written $\hat{y}(x, \mvec{p})$. The update step is also a vector $\mvec{h}$ of dimensions $m \times 1$. For every iteration, we will find our update step by solving the matrix equation \begin{equation} [\mvec{J}^T \mvec{J}] \mvec{h} = \mvec{J}^T (\mvec{y} - \mvec{\hat{y}}) \label{eq:multi} \end{equation} The jacobian matrix $\mvec{J}$ is a matrix with dimensions $n \times m$. It is defined as follows: In column $j$ in row $i$, we store the value $\frac{\partial \hat{y}}{\partial p_j}(x_i, \mvec{p})$. So the jacobian simply stores the values of the partial derivatives for all the sample points. Finally, the vectors $\mvec{y}$ and $\mvec{\hat{y}}$ contains the values of $y(x)$ and $\hat{y}(x,\mvec{p})$ evaluated at all the sample points. Finally, note that $\eqref{eq:multi}$ is simply a generalization of $\eqref{eq:single}$. The derivation of this more general equation is slightly more involved, since it involves matrices and vectors, but the derivation is still based on the exact same principles that we applied when deriving equation $\eqref{eq:single}$. The interested reader may peruse section 3 in [2] for a derivation of $\eqref{eq:multi}$.

Source code that implements the Gauss-Newton algorithm to find the parameters $A=-5.55473$ and $B=6.98316$ can be found in a gist.

[1] Brian Karis, "Real Shading in Unreal Engine 4". Link.

[2] Henri P. Gavin, "The Levenberg-Marquardt method for nonlinear least squares curve-fitting problems". Link.