|
For the sake of clarity we do not write variables in italics in this module.
|
|
|
This is not a course in matrix algebra (including vector and tensor calculus), but a quick reminder, assuming you know the basic facts of life here.
|
|
We also cut a lot of corners, not distinguishing much between matrices (a mathematical object) and tensors (a physical object), "polar" and "axial" vectors, Cartesian and non-Cartesian coordinate systems, and the like.
|
|
We will deal with some topics of matrix algebra roughly in the sequence in which they come up in the backbone chapters.
|
|
|
|
|
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}

A (3 × 3) matrix then is an assembly of nine numbers arranged as shown above.
|
|
|
|
|
|
In a simplified way of speaking, a matrix (or better, tensor) allows one to relate vectors in a simple linear way.

Every component of the vector r = (r_1, r_2, r_3) can be expressed as a linear function of the components of a second vector t = (t_1, t_2, t_3) by the equations
|
|
|
|
|
r_1 = a_{11} \cdot t_1 + a_{12} \cdot t_2 + a_{13} \cdot t_3
r_2 = a_{21} \cdot t_1 + a_{22} \cdot t_2 + a_{23} \cdot t_3
r_3 = a_{31} \cdot t_1 + a_{32} \cdot t_2 + a_{33} \cdot t_3
|
|
|
|
|
|
|
In matrix notation we simply write

r = A \cdot t

with A being the symbol for the matrix defined above.
|
|
We have then already defined how a matrix is multiplied with a vector, and that a new vector is the result of the multiplication.
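As a minimal numerical sketch (plain Python; the matrix and vector are made up for illustration), the three equations above can be evaluated directly:

```python
# Matrix-vector multiplication written out exactly as in the equations above:
# r_i = a_i1*t_1 + a_i2*t_2 + a_i3*t_3 (indices counted from 1 in the text).

A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
t = [1, 1, 1]

r = [sum(A[i][k] * t[k] for k in range(3)) for i in range(3)]
print(r)  # [3, 4, 5]
```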
|
The matrix A, if interpreted as an entity that relates two vectors with each other, must have certain properties that are not required of a general matrix (which might express, e.g., the coefficients of a linear system of equations with several unknowns).
|
|
If we change the coordinate system in which we express the vectors, the components of the vectors will be different numbers, but the vectors themselves (the arrows) stay unchanged. This imposes some conditions on the set of nine numbers - the matrix - connecting the components of the vectors, and any matrix meeting these conditions we call a tensor.
|
|
A tensor thus is a set of nine numbers, and the numerical value of these numbers depends on the coordinate system in which the tensor is expressed. If we do a coordinate transformation, the numerical values of the nine components must then transform in a specific way.
|
Transforming one coordinate system into another is done by matrices as follows: If the first vector r is chosen to be one of the unit vectors defining some Cartesian coordinate system, the second vector r', obtained by multiplying r with the transformation matrix T, can be interpreted as a unit vector of some new coordinate system.
|
|
The set of unit vectors r_i with i = x, y, z will be changed to a new set r'_i by

r'_i = T \cdot r_i
and T is called the transformation matrix. It is clear that T must have certain properties if the r'_i are also supposed to be unit vectors.
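For a concrete example (chosen for illustration; any rotation would do), take T to be a rotation by 90° about the z-axis. Applying it to the Cartesian unit vectors yields the unit vectors of the rotated system, and their lengths stay 1, as required:

```python
import math

# Rotation by phi = 90 degrees about the z-axis as transformation matrix T.
phi = math.pi / 2
T = [[math.cos(phi), -math.sin(phi), 0.0],
     [math.sin(phi),  math.cos(phi), 0.0],
     [0.0,            0.0,           1.0]]

def apply(M, v):
    """r'_i = sum_k M_ik * v_k, as in the transformation equation."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1]):
    e_new = apply(T, e)
    length = math.sqrt(sum(c * c for c in e_new))
    # e_x -> e_y, e_y -> -e_x, e_z -> e_z; all lengths remain 1
    print([round(c, 9) for c in e_new], round(length, 9))
```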
|
|
While this is clear, it is not so clear what we have to do if we want to reverse the transformation. The simple thing is to write

r_i = T^{-1} \cdot r'_i

and to define T^-1 as the inverse matrix to T, so that the operation is reversed.
|
But how do we calculate the numerical values of the components of T^-1 if we know the numerical values of the components of T?
|
|
In order to be able to give a simple formula, we first have to introduce something else: the determinant of a matrix.
|
|
|
|
The determinant |A| of a matrix A is a single number calculated by summing up the diagonal products in a special fashion.
|
|
For a (3 × 3) matrix we have
|
|
|
|
|
|A| = a_{11} \cdot a_{22} \cdot a_{33} + a_{12} \cdot a_{23} \cdot a_{31} + a_{13} \cdot a_{21} \cdot a_{32} - a_{13} \cdot a_{22} \cdot a_{31} - a_{11} \cdot a_{23} \cdot a_{32} - a_{12} \cdot a_{21} \cdot a_{33}
|
|
|
|
|
|
|
Look at the matrix A written above and you will see that you start by going down diagonally from left to right, adding the products of the three possible diagonals - always completing a diagonal by repeating the matrix if necessary. Then you subtract the products obtained by going down the diagonals from right to left.
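The recipe can be written out directly; a small sketch in plain Python (the example matrix is made up):

```python
def det3(a):
    """Determinant of a 3x3 matrix by the diagonal rule: add the three
    left-to-right diagonal products, subtract the three right-to-left ones."""
    return (  a[0][0] * a[1][1] * a[2][2]
            + a[0][1] * a[1][2] * a[2][0]
            + a[0][2] * a[1][0] * a[2][1]
            - a[0][2] * a[1][1] * a[2][0]
            - a[0][0] * a[1][2] * a[2][1]
            - a[0][1] * a[1][0] * a[2][2])

A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
print(det3(A))  # 25
```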
|
|
This sounds more complicated than it is; graphically it looks like this:
|
|
|
|
|
|
|
|
|
|
The determinant of a matrix obtained in this way is a number that comes up a lot in all kinds of matrix operations; the same is true for a related quantity, the subdeterminant A_ik of the matrix A.
|
|
There are as many subdeterminants as there are elements in the matrix. A_ik is obtained by
- erasing the row and the column that contain the element a_ik and calculating the determinant of the (2 × 2) matrix that remains, and
- multiplying the number obtained by (-1)^(i + k).
|
With the concept of a subdeterminant, we can also define the rank of a matrix:
|
|
The rank of a matrix is the number of rows (or columns, respectively) of the largest determinant or subdeterminant with a non-zero value. In other words, the rank of a (3 × 3) matrix A is rank(A) = 3 if |A| ≠ 0; if |A| = 0, you look for the largest non-zero subdeterminant.
|
With determinant and subdeterminant, the inverse matrix is easy to formulate:
|
|
The inverse matrix A^-1 to A has the elements (a^-1)_ik given by

(a^{-1})_{ik} = \frac{A_{ki}}{|A|}
|
i.e. the value of the respective subdeterminant divided by the value of the determinant. Note that the indices are interchanged ("ik" → "ki"), and that the "-1" must be read as "inverse"; it is not an exponent!
|
|
We will not prove this here, but it is not too difficult - just solve the system of equations given above for the t_i.
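The formula is easy to turn into code. A sketch with the determinant and subdeterminants spelled out (the example matrix is made up), checking that A^-1 · A gives the identity:

```python
def det3(a):
    # diagonal rule for a 3x3 determinant
    return (a[0][0]*a[1][1]*a[2][2] + a[0][1]*a[1][2]*a[2][0]
          + a[0][2]*a[1][0]*a[2][1] - a[0][2]*a[1][1]*a[2][0]
          - a[0][0]*a[1][2]*a[2][1] - a[0][1]*a[1][0]*a[2][2])

def subdet(a, i, k):
    """Subdeterminant A_ik: erase row i and column k, take the 2x2
    determinant, and attach the sign (-1)**(i + k)."""
    m = [[a[r][c] for c in range(3) if c != k] for r in range(3) if r != i]
    return (-1) ** (i + k) * (m[0][0] * m[1][1] - m[0][1] * m[1][0])

def inverse(a):
    d = det3(a)
    assert d != 0, "no inverse if |A| = 0"
    # note the interchanged indices: element (i,k) of the inverse uses A_ki
    return [[subdet(a, k, i) / d for k in range(3)] for i in range(3)]

A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
Ainv = inverse(A)
prod = [[sum(Ainv[i][j] * A[j][k] for j in range(3)) for k in range(3)]
        for i in range(3)]
print([[round(x, 9) for x in row] for row in prod])  # the identity matrix
```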
|
Two more important points follow directly:
|
An inverse matrix A^-1 to A only exists if the determinant of A is not zero!
|
|
The product of A^-1 and A results in the identity matrix I:
|
|
|
|
|
|
A^{-1} \cdot A = I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
|
|
|
|
|
|
The last claim is as yet unproved; we first need the multiplication rule for matrices to prove it.
|
|
Multiplication of the matrix A with the matrix B gives a new matrix C; the element c_ik of C is obtained by taking the scalar product of the row vector in row i of matrix A with the column vector in column k of matrix B. This is best seen in a kind of graph:
|
|
|
|
|
\begin{pmatrix} \times & \times & \times \\ \times & \times & \times \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \cdot \begin{pmatrix} \times & b_{12} & \times \\ \times & b_{22} & \times \\ \times & b_{32} & \times \end{pmatrix} = \begin{pmatrix} \times & \times & \times \\ \times & \times & \times \\ \times & c_{32} & \times \end{pmatrix}

with c_{32} = a_{31} \cdot b_{12} + a_{32} \cdot b_{22} + a_{33} \cdot b_{32}
|
|
|
|
|
|
Now it is still fairly messy, but straightforward, to prove the claim from above - you may want to try it.
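The rule above, sketched in plain Python (the matrices are made up); the element c_32 is indeed the scalar product of row 3 of A with column 2 of B:

```python
def matmul(A, B):
    """C = A . B: c_ik is the scalar product of row i of A and column k of B."""
    return [[sum(A[i][j] * B[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
B = [[1, 0, 2],
     [0, 1, 0],
     [3, 0, 1]]
C = matmul(A, B)
# c_32 (counting rows and columns from 1) = a31*b12 + a32*b22 + a33*b32
print(C[2][1] == A[2][0]*B[0][1] + A[2][1]*B[1][1] + A[2][2]*B[2][1])  # True
```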
|
A useful relation is that multiplying any matrix with the identity matrix I doesn't change anything:

A \cdot I = I \cdot A = A
|
And this is also true for multiplying a vector with I:

I \cdot r = r
|
From the various definitions you may get the feeling that signs are important and possibly tricky. Well, that's true.
|
|
Matrix multiplication, in general, is not commutative, i.e. A · B ≠ B · A - you must watch out whether you multiply from the left or from the right.
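A quick demonstration of both points, non-commutativity and the neutral role of I (the two shear-like matrices are made up):

```python
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

# two simple shear matrices: they do not commute
A = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
B = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
print(matmul(A, B) == matmul(B, A))  # False

# the identity matrix changes nothing, from the left or from the right
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(A, I) == A == matmul(I, A))  # True
```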
|
|
Still, we can now solve "mixed" vector-matrix equations. Take, for example,

r_0 = A^{-1} \cdot r_0 + T(I)
|
Multiplying with the identity matrix I yields

I \cdot r_0 = I \cdot A^{-1} \cdot r_0 + I \cdot T(I)

i.e.

I \cdot r_0 = A^{-1} \cdot r_0 + T(I)
|
|
That looks a bit stupid, but with this cheap trick we now have only tensors in connection with r_0, which means we can now combine the "factors" of r_0, giving
|
|
|
|
|
|
|
|
|
|
|
(I - A^{-1}) \cdot r_0 = T(I)

our O-lattice theory master equation.
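A tiny numerical sketch of the master equation (all numbers made up): with a diagonal A = diag(2, 2, 2), so that A^-1 = diag(1/2), the equation (I - A^-1) · r_0 = T(I) can be solved componentwise:

```python
A_inv_diag = [0.5, 0.5, 0.5]   # A = diag(2, 2, 2), hence A^-1 = diag(1/2)
T = [1.0, 0.0, 0.0]            # an invented translation vector T(I)

# (I - A^-1) is diag(0.5); solve componentwise for r0
r0 = [T[i] / (1.0 - A_inv_diag[i]) for i in range(3)]
print(r0)  # [2.0, 0.0, 0.0]

# consistency check against the un-combined form r0 = A^-1 . r0 + T(I)
assert r0 == [A_inv_diag[i] * r0[i] + T[i] for i in range(3)]
```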
|
One last important property of transformation matrices is that their determinant directly gives the volume ratio of the unit cells:
|
|
|
|
|
|A| = \frac{V(\text{after the transformation})}{V(\text{before the transformation})}
|
|
|
|
|
|
This is not particularly easy to see, but simply consider two points:
|
|
1. The base vector a(I) is transformed into the base vector a(II) via

a(II) = A \cdot a(I)
|
2. The volume V of an elementary cell is given by the scalar triple product of its base vectors a, b, c:

V = a \cdot (b \times c)
|
Since we produce the O-lattice from a crystal lattice with the matrix (I - A^-1), the volume V_O of an O-lattice cell (in units of the volume of a crystal unit cell) is

V_O = \frac{1}{|I - A^{-1}|}
Again, as remarked above: watch out for signs.
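The volume-ratio property can be checked numerically (the example transformation is made up): the cell volume is the scalar triple product of the base vectors, and after transforming the base vectors with A the volume has grown by exactly |A|:

```python
def det3(a):
    return (a[0][0]*a[1][1]*a[2][2] + a[0][1]*a[1][2]*a[2][0]
          + a[0][2]*a[1][0]*a[2][1] - a[0][2]*a[1][1]*a[2][0]
          - a[0][0]*a[1][2]*a[2][1] - a[0][1]*a[1][0]*a[2][2])

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def volume(a, b, c):
    """Cell volume as scalar triple product a . (b x c)."""
    w = cross(b, c)
    return sum(a[i] * w[i] for i in range(3))

A = [[2, 0, 0],
     [0, 1, 1],
     [0, 0, 1]]                              # a stretch plus a shear, |A| = 2

base = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]     # unit cube, volume 1
new = [[sum(A[i][k] * v[k] for k in range(3)) for i in range(3)] for v in base]

print(volume(*new) / volume(*base), det3(A))  # 2.0 2
```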
|
|
|
© H. Föll