Notes on Matrix Calculus


Matrix Products

The elements of the product $C = AB$, where $A$ is an $m \times n$ matrix and $B$ is an $n \times m$ matrix, are given by $$c_{ij}=\sum_{k=1}^{n} a_{ik}b_{kj}.$$ The elements of the matrix-vector product $y=Ax$, where $A$ is an $m \times n$ matrix and $x$ is an $n \times 1$ vector, are given by $$y_{k}=\sum_{i=1}^{n} a_{ki}x_{i}.$$ The same formula gives the elements of the row vector $y^T=x^TA^T$. The elements of the vector-matrix product $y=x^TA$, where $A$ is an $m \times n$ matrix and $x$ is an $m \times 1$ vector, are given by $$y_{k}=\sum_{i=1}^{m} a_{ik}x_{i}.$$ Finally, the scalar vector-matrix-vector product $\alpha=y^TAx$, where $A$ is an $m \times n$ matrix, $y$ is an $m \times 1$ vector, and $x$ is an $n \times 1$ vector, is given by $$\alpha=\sum_{j=1}^{m} \sum_{k=1}^{n} a_{jk}y_{j}x_{k}.$$
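The index sums above translate directly into loops. The following is a minimal sketch in plain Python, assuming matrices are stored as nested lists of rows; the function names (`matmul`, `matvec`, `vecmat`, `bilinear`) and the sample values are illustrative choices, not part of the notes.

```python
def matmul(A, B):
    """c_ij = sum_k a_ik * b_kj, with A (m x n) and B (n x p) as nested lists."""
    n = len(B)
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def matvec(A, x):
    """y_k = sum_i a_ki * x_i, i.e. y = A x with x of length n."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def vecmat(x, A):
    """y_k = sum_i a_ik * x_i, i.e. y = x^T A with x of length m."""
    return [sum(A[i][k] * x[i] for i in range(len(A)))
            for k in range(len(A[0]))]


def bilinear(y, A, x):
    """alpha = sum_j sum_k a_jk * y_j * x_k, i.e. alpha = y^T A x."""
    return sum(A[j][k] * y[j] * x[k]
               for j in range(len(A))
               for k in range(len(A[0])))


# Illustrative data: A is 2 x 3, B is 3 x 2, x has length 3, y has length 2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [1, 1]]
x = [1, 0, -1]
y = [2, 1]

C = matmul(A, B)
Ax = matvec(A, x)
yTA = vecmat(y, A)
alpha = bilinear(y, A, x)
```

Note that `bilinear(y, A, x)` agrees with both the dot product of `y` with `matvec(A, x)` and the dot product of `vecmat(y, A)` with `x`, which is the associativity the later sections rely on.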

Derivatives

Constant Vectors

The derivative of the matrix-vector product $y=Ax$, where the matrix $A$ is constant with respect to $x$, follows from $\partial y_k / \partial x_i = a_{ki}$ and is given by $$\frac{\partial y}{\partial x} = A.$$ In the case of a vector-matrix-vector product $\alpha=y^TAx$, where $y$ is also constant with respect to $x$, we can treat the product $y^TA$ as another constant matrix (a $1 \times n$ row vector), and thus the derivative with respect to $x$ is given by $$\frac{\partial \alpha}{\partial x} = y^TA.$$
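Both results can be checked numerically. The sketch below approximates the Jacobian by central differences and compares it with $A$ and with $y^TA$; the helper `numerical_jacobian`, the step size `h`, and the sample data are assumptions made for illustration only.

```python
def matvec(A, x):
    """y = A x for A stored as a nested list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def numerical_jacobian(f, x, h=1e-6):
    """Central-difference estimate of J[i][k] = d f_i / d x_k."""
    fx = f(x)
    J = [[0.0] * len(x) for _ in fx]
    for k in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(len(fx)):
            J[i][k] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


# Illustrative constants: A is 2 x 3, y has length 2; x0 is the evaluation point.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
y = [2.0, 1.0]
x0 = [0.5, -1.0, 2.0]

# d(Ax)/dx should reproduce A (a 2 x 3 matrix).
J_linear = numerical_jacobian(lambda x: matvec(A, x), x0)


def alpha(x):
    """alpha = y^T A x, wrapped in a list so its Jacobian is a 1 x n row."""
    return [sum(yj * Axj for yj, Axj in zip(y, matvec(A, x)))]


# d(alpha)/dx should reproduce the row vector y^T A.
J_alpha = numerical_jacobian(alpha, x0)
yTA = [sum(y[j] * A[j][k] for j in range(len(A))) for k in range(len(A[0]))]
```

Because both functions are linear in $x$, the finite-difference estimates match the closed forms up to floating-point rounding, independent of the point `x0`.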

Quadratic Forms

The derivative of the quadratic form $\alpha=x^TAx$, where $A$ is an $n \times n$ matrix and $x$ is an $n \times 1$ vector, is derived as follows: $$ \begin{aligned} \alpha &= \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}x_{i}x_{j}, \\ \frac{\partial \alpha}{\partial x_k} &= \sum_{j=1}^{n} a_{kj}x_{j} + \sum_{i=1}^{n} a_{ik}x_{i}, \\ \frac{\partial \alpha}{\partial x} &= x^TA^T + x^TA \\ &= x^T\left(A^T + A\right), \end{aligned} $$ where in the second step the first term is the derivative of the terms of the double summation with $i=k$ and the second term is the derivative of the terms with $j=k$. For a symmetric matrix, $A=A^T$, this reduces to $$ \frac{\partial \alpha}{\partial x} = 2x^TA. $$
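As a quick check of the general result, consider a small non-symmetric example (the matrix is chosen only for illustration): $$ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad \alpha = x^TAx = x_1^2 + 5x_1x_2 + 4x_2^2. $$ Differentiating the polynomial directly gives $$ \frac{\partial \alpha}{\partial x} = \begin{bmatrix} 2x_1 + 5x_2 & 5x_1 + 8x_2 \end{bmatrix} = x^T \begin{bmatrix} 2 & 5 \\ 5 & 8 \end{bmatrix} = x^T\left(A^T + A\right). $$ Here $2x^TA = \begin{bmatrix} 2x_1 + 6x_2 & 4x_1 + 8x_2 \end{bmatrix}$ gives a different result, so the shortcut $2x^TA$ applies only when $A$ is symmetric.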

Dot Products

The derivative of the dot product $\alpha = y^Tx$, where $x$ and $y$ are $n \times 1$ vectors that are both functions of a third vector $z$, is given by $$ \begin{aligned} \alpha &= \sum_{i=1}^{n} x_iy_i, \\ \frac{\partial \alpha}{\partial z_k} &= \sum_{i=1}^{n}\left( x_i \frac{\partial y_i}{\partial z_k} + y_i \frac{\partial x_i}{\partial z_k}\right), \\ \frac{\partial \alpha}{\partial z} &= x^T \frac{\partial y}{\partial z} + y^T \frac{\partial x}{\partial z}. \end{aligned} $$ Setting $y=x$ in the above equations, we get $$ \begin{aligned} \alpha &= x^Tx, \\ \frac{\partial \alpha}{\partial z} &= 2x^T \frac{\partial x}{\partial z}. \end{aligned} $$
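The product rule above can be sanity-checked on a concrete case. The sketch below assumes two hypothetical maps $x(z)$ and $y(z)$ from $\mathbb{R}^2$ to $\mathbb{R}^3$ with hand-written Jacobians, and compares $x^T \partial y/\partial z + y^T \partial x/\partial z$ with a central-difference gradient of $\alpha$; all function names and values are illustrative.

```python
def x_of(z):
    """Hypothetical x(z) = [z1^2, z1*z2, z2]."""
    return [z[0] ** 2, z[0] * z[1], z[1]]


def y_of(z):
    """Hypothetical y(z) = [z2, z1, z1*z2]."""
    return [z[1], z[0], z[0] * z[1]]


def jac_x(z):
    """Jacobian dx/dz (3 x 2), rows indexed by x_i, columns by z_k."""
    return [[2 * z[0], 0.0],
            [z[1], z[0]],
            [0.0, 1.0]]


def jac_y(z):
    """Jacobian dy/dz (3 x 2)."""
    return [[0.0, 1.0],
            [1.0, 0.0],
            [z[1], z[0]]]


def vecmat(v, M):
    """Row vector times matrix: (v^T M)_k = sum_i v_i * M[i][k]."""
    return [sum(v[i] * M[i][k] for i in range(len(M)))
            for k in range(len(M[0]))]


def alpha(z):
    """alpha(z) = y(z)^T x(z)."""
    return sum(xi * yi for xi, yi in zip(x_of(z), y_of(z)))


def grad_formula(z):
    """d(alpha)/dz = x^T dy/dz + y^T dx/dz, as a 1 x 2 row."""
    term_y = vecmat(x_of(z), jac_y(z))
    term_x = vecmat(y_of(z), jac_x(z))
    return [a + b for a, b in zip(term_y, term_x)]


def grad_numeric(f, z, h=1e-6):
    """Central-difference gradient of a scalar function f."""
    g = []
    for k in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[k] += h
        z_minus[k] -= h
        g.append((f(z_plus) - f(z_minus)) / (2 * h))
    return g


z0 = [1.5, -0.5]
g_formula = grad_formula(z0)
g_numeric = grad_numeric(alpha, z0)
```

For these particular maps $\alpha = 2z_1^2z_2 + z_1z_2^2$, so the gradient at `z0` can also be confirmed by hand as $[4z_1z_2 + z_2^2,\; 2z_1^2 + 2z_1z_2] = [-2.75,\; 3.0]$.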

Vector-Matrix-Vector Products

Given a vector-matrix-vector product $\alpha = y^TAx$, where $A$ is constant and $x$ and $y$ are functions of $z$, we can define the vector $w = A^Ty$, so that $w^T = y^TA$ and $\partial w / \partial z = A^T \, \partial y / \partial z$. Applying the dot-product rule then gives $$ \begin{aligned} \alpha &= y^TAx = w^Tx, \\ \frac{\partial \alpha}{\partial z} &= x^T \frac{\partial w}{\partial z} + w^T \frac{\partial x}{\partial z} \\ &= x^T A^T\frac{\partial y}{\partial z} + y^TA \frac{\partial x}{\partial z}. \end{aligned} $$ Setting $y = x$ gives the quadratic form of $x(z)$, $$\alpha = x^TAx, $$ and its derivative is given by $$ \frac{\partial \alpha}{\partial z} = x^T(A+A^T)\frac{\partial x}{\partial z}. $$ For a symmetric $A$, we have $$\frac{\partial \alpha}{\partial z} = 2x^TA\frac{\partial x}{\partial z}.$$
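The general formula and its quadratic-form special case can be checked in the same finite-difference style. The sketch below assumes a constant $3 \times 2$ matrix and hypothetical maps $x(z)$, $y(z)$ with hand-written Jacobians; every name and value here is an illustrative assumption.

```python
def matvec(M, v):
    """M v for M stored as a nested list of rows."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def vecmat(v, M):
    """Row vector times matrix: (v^T M)_k = sum_i v_i * M[i][k]."""
    return [sum(v[i] * M[i][k] for i in range(len(M)))
            for k in range(len(M[0]))]


def grad_numeric(f, z, h=1e-6):
    """Central-difference gradient of a scalar function f."""
    g = []
    for k in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[k] += h
        z_minus[k] -= h
        g.append((f(z_plus) - f(z_minus)) / (2 * h))
    return g


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # constant, 3 x 2
S = [[1.0, 2.0], [3.0, 4.0]]               # constant, 2 x 2, non-symmetric


def x_of(z):
    return [z[0] * z[1], z[0] + z[1]]        # hypothetical x(z), length 2


def y_of(z):
    return [z[0] ** 2, z[1], z[0]]           # hypothetical y(z), length 3


def jac_x(z):
    return [[z[1], z[0]], [1.0, 1.0]]        # dx/dz, 2 x 2


def jac_y(z):
    return [[2 * z[0], 0.0], [0.0, 1.0], [1.0, 0.0]]  # dy/dz, 3 x 2


def alpha(z):
    """alpha = y^T A x."""
    return sum(yj * r for yj, r in zip(y_of(z), matvec(A, x_of(z))))


def grad_alpha(z):
    """x^T A^T dy/dz + y^T A dx/dz, using (A x)^T = x^T A^T."""
    term_y = vecmat(matvec(A, x_of(z)), jac_y(z))
    term_x = vecmat(vecmat(y_of(z), A), jac_x(z))
    return [a + b for a, b in zip(term_y, term_x)]


def quad(z):
    """Quadratic form x^T S x."""
    x = x_of(z)
    return sum(xi * r for xi, r in zip(x, matvec(S, x)))


def grad_quad(z):
    """x^T (S + S^T) dx/dz."""
    x = x_of(z)
    s_plus_st = [[S[i][j] + S[j][i] for j in range(2)] for i in range(2)]
    return vecmat(vecmat(x, s_plus_st), jac_x(z))


z0 = [0.7, -1.2]
check_alpha = (grad_alpha(z0), grad_numeric(alpha, z0))
check_quad = (grad_quad(z0), grad_numeric(quad, z0))
```

Each pair in `check_alpha` and `check_quad` should agree to within finite-difference error; using a non-symmetric `S` exercises the general $A + A^T$ form rather than the symmetric shortcut.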

References