Tensor

There seems to be some uncertainty in the computer science community about exactly what a tensor is. The only things involved are functions and their composition. This short note gives names to functions then composes them. The mathematical definitions can be translated directly into any computer language that allows functions to be defined and called. We cover topics described in The Tensor Cookbook but have no need for their graphical notations that do not have a direct translation to computer implementation.

Set

Everything in classical mathematics is a set. We assume the reader knows what a set is and use \in to denote set membership. The set exponential of the set A and B is B^A = \{f\colon A\to B\}, the set of all functions from A to B. We also write this as \{A\to B\}. The domain of a function f\colon A\to B is \operatorname{dom}f = A and the codomain is \operatorname{cod}f = B.

The cartesian product of sets A and B is {A\times B = \{(a,b)\mid a\in A, b\in B\}}, the set of ordered pairs of elements from the two sets.

Exercise. (Kuratowski) Show we can identify the ordered pair (a,b) with the set \{\{a\},\{a,b\}\}.

Hint: An ordered pair is defined by (a,b) = (c,d) if and only if a = c and b = d. Show \{\{a\}, \{a,b\}\} = \{\{c\}, \{c,d\}\} if and only if a = c and b = d.

Exercise. A function can be identified with its \operatorname{graph}f = \{(a, f(a))\mid a\in A\}\subseteq A\times B.

Hint: f(a) = b if and only if (a,b)\in\operatorname{graph}f.

If f\colon A\to B we say a,a'\in A are equivalent if f(a) = f(a'). The equivalence class of a\in A is [a] = \{a'\in A\mid f(a) = f(a')\}. The kernel of a f is \ker f = \{[a]\mid a\in A\}.

The kernel of a function is a partition of its domain.

Exercise. Show the union of \ker f is A.

Hint. Show \cup_{a\in A} [a] = A using a\in[A] for a\in A.

Exercise. Show the elements of \ker f are either disjoint or equal.

Hint. If x\in [a]\cap[a'] then f(x) = f(a) and f(x) = f(a') so f(a) = f(a'). If y\in [a] then f(y) = f(a) = f(a') so y\in [a']. Show the converse also holds.

The last two exercises show \ker f is a partiion of A. Partitions are a way to loosen the notion of equality to equivalence. They are also a way of modeling partial information. Full information is knowing the element a\in A, partial information is only knowing [a], no information is only knowing a\in A.

Exercise. Show the parition \{\{a\}\mid a\in A\} represents full information and the partition \{A\} represents no information.

A function f\colon A\to B is one-to-one, or injective, if [a] is a singleton (has exactly one element) for all a\in A.

Exercise. If f\colon A\to B is injective then f(a) = f(a') implies a = a'.

The range of f\colon A\to B is \operatorname{ran}f = \{f(a)\mid a\in A\}. It is onto, or surjective if \operatorname{ran}f = B.

Exercise. If f\colon A\to B is surjective then for every b\in B there exists at least one a\in A with f(a) = b.

Set exponential, domain, codomain…

Composition

The composition of two compatible functions is a function \circ\colon\{A\to B\}\times\{B\to C\}\to\{A\to C\} defined by \circ(f,g)(a) = g(f(a)), a\in A. This is also written as g\circ f or simply gf when it is clear composition is intended. Note the order of f and g are reversed. If \operatorname{cod}f = \operatorname{dom}g then the composition {\circ(f,g) = g\circ f = gf} exists. We could write B^A as \{B\leftarrow A\} and define composition as \circ\colon\{C\leftarrow B\}\times\{B\leftarrow A\}\to\{C\leftarrow A\} by \circ(g,f)(a) = g(f(a)) so we don’t have to reverse the order of f and g, but we resist this temptation and stick to standard mathematical notation.

Exercise. Show composition of functions is associative.

Hint: If f\colon A\to B, g\colon B\to C, and h\colon C\to D show h\circ(g\circ f) = (h\circ g)\circ f.

This allows us to write h\circ g\circ f unambiguosly.

Given a set A define the identity function 1_A\colon A\to A by 1_A(a) = a for a\in A.

Exercise. If f\colon A\to B is a function then f1_A = f and 1_Bf = f.

The last two exercises shows sets and functions are the objects and arrows of a category denoted by \text{\bf{Set}}.

Exercise. _Show if f\colon A\to B is injective then there exists g\colon B\to A with gf = 1_A.

Hint: If f(a) = b then b\in[a]. Since [a] is a singleton we can define g(b) = a.

Exercise. _Show if f\colon A\to B is surjective then there exists g\colon B\to A with fg = 1_B.

Hint: For every b\in B there exists a\in A with f(a) = b. Let g(b) = a. In general, there are many right inverses.

These two exercises show it is possible to define injective and surjective without mentioning elements of sets.

Curry

Currying and uncurrying provides a connection between set exponential and cartesian product. The set {\{A\times B\to C\}} is in one-to-one correspondence with the set {\{A\to\{B\to C\}\}}. Given {f\in\{A\times B\to C\}} define {\operatorname{curry}f\colon A\to\{B\to C\}} by {((\operatorname{curry}f)(a))(b) = f(a,b)}. The inverse is uncurrying. Given {g\in\{A\to\{B\to C\}\}} define {\operatorname{uncurry}g\colon A\times B\to C} by {(\operatorname{uncurry}g)(a,b) = (g(a))(b)}.

Using the non-standard notation \operatorname{curry}f = f, then f,a(b) = f(a,b) and \operatorname{uncurry}g = ,g then ,g(a,b) = ga(b).

Exercise. Show ,(f,) = f and (,g), = g.

Hint. ,(f,)(a,b) = f,a(b) = f(a,b) and (,g),a(b) = ,g(a,b) = ga(b).

Evaluation

Evaluation is a function \operatorname{eval}\colon\{A\to B\}\times A\to B defined by \operatorname{eval}(f,a) = f(a), f\in\{A\to B\}, a\in A.

Exercise. Show \operatorname{eval}(f,a) = f1_{\{a\}}(a).

Relation

A relation on sets A and B is a subset R\subseteq A\times B. We write aRb whenever (a,b)\in R. The left coset is Rb = \{a\in A\mid aRb\} and the right coset iis aRb = \{b\in B\mid aRb\}. If for every a\in A the right coset aR is a singleton then R is a function and we write R(a) = b where b\in B is the unique element of aR.

The composition of relations R\subseteq A\times B and S\subseteq B\times C is SR = \{a\in A, c\in C\mid aRb\text{ and }bSc\text{ for some }b\in B\}. This is used in the definition of the various JOIN commands in SQL.

Relations on A\times A are classified by properties they satisfy

The relation equal is =\subseteq\{(a,a)\mid a\in A\}. It is a left and right identity for composition.

Exercise. Show =R, R, and R= are all equal.

Exercise. Show if R is reflexive then =\subseteq R.

Exercise. Show R is symmetric if and only if aR = Ra, a\in A.

The Kleen star of a relation R is R^* = \cup_{n>0} R^n where R^n is R composed with itself n times.

Exercise. Show the Kleen star of any relation is transitive.

Hint. Note aR^*b if and only if aR^nb for some n and R^nR^m = R^{n+m}.

An equivalence relation is reflexive, symmetric, and transitive. Note equal is an equivalence relation.

Exercise. _Show if R is an equivalence relation then aRb if and only if aR = Rb.

This shows equivalence is equality of cosets.

Exercise. If R is an equivalence relation show \{aR\mid a\in A\} is a partition of A.

Hint. Show \cup_{a\in A}aR = A and if aR\cap Rb\neq\emptyset then aR = Rb.

Exercise. If \{A_i\}_{i\in I} is a partition of A and aRb if and only if there exists i\in I with a,b\in A_i then R is an equivalence relation.

Vector Space

If the base of a set exponential is a field then \bm{F}^I is a vector space for any index set I. Typically \bm{F} is the real numbers \bm{R}, the complex numbers \bm{C}, the rational numbers \bm{Q}, or integers modulo p \bm{Z}_p where p is prime. If I = \{1,\ldots,n\} or I = \{0,\ldots,n-1\} we write \bm{F}^{\bm{n}} when the base index of 1 or 0 is understood. Scalar multiplication and vector addition are defined pointwise: (ax)(i) = a(x(i)) and (x + y)(i) = x(i) + y(i) for a\in\bm{F}, x,y\in\bm{F}^I. The additive identity is \bm{0} = 0_I defined by \bm{0}(i) = 0, i\in I. The additive inverse of x\in\bm{F}^I is -x defined by (-x)(i) = -x(i), i\in I.

Exercise. Show \bm{F}^I satisfies the vector space axioms.

The standard basis of \bm{R}^I is (e_i)_{i\in I} where e_i(j) = \delta_{ij} is the Kronecker delta. We write e_i^I to indicate e_i\in\bm{F}^I.

Exercise. Show x = \sum_{i\in I} x(i) e_i when I is finite.

Hint. If x,y\in\bm{F}^I then x = y if and only if x(j) = y(j) for all j\in I. What is x(j)?

We can identify \bm{F}^I with the cartesian product of I copies of the field \bm{F}. The element (x_i)_{i\in I} in the cartesian product corresponds to the function x\in\bm{F}^I by x(i) = x_i, i\in I.

A simple but quite useful observation is that if \sigma\colon J\to I is any function then \circ\sigma\colon\bm{F}^I\to\bm{F}^J by x\mapsto x\circ\sigma, x\in\bm{F}^I.

Exercise. Show \circ\sigma(ax) = a(\circ\sigma(x)) and \circ\sigma(x + y) = \circ\sigma(x) + \circ\sigma(y) for a\in\bm{R} and x,y\in\bm{F}^I.

Hint: Show (ax)\sigma = a(x\sigma) and (x + y)\sigma = x\sigma + y\sigma on \bm{F}^J.

If i\in I and {\sigma_i\colon\{i\}\to I} is the inclusion {i\mapsto i} then {\circ\sigma_i\colon\bm{F}^I\to\bm{F}^{\{i\}}}. We can identify \bm{F}^{\{i\}} with \bm{F} by (x)\leftrightarrow x and get a map {e_i^* = e_i^{I*}\colon\bm{F}^I\to\bm{F}}.

Exercise. Show x = \sum_{i\in I}e_i^*(x) e_i, x\in\bm{F}^I, if I is finite.

Hint: Show e_i*(x) = x(i).

A subspace is a subset V\subseteq\bm{R}^I that is closed under scalar multiplication and vector addition. There are two trivial subspaces.

Exercise. Show \{0\} and \bm{F}^I are subspaces of \bm{F}^I.

Exercise. Show if v\in\bm{F}^I then \{av\mid a\in\bm{F}\} is a subspace.

Hint: b(av) = (ba)v and av + bv = (a + b)v for a,b\in\bm{F}.

Given v_1,\ldots,v_k\in\bm{F}^I define their span as the set of linear combinations \operatorname{span}\{v_j\}_{j=1}^k = \{\sum_{j=1}^k a_j v_j\mid a_f\in\bm{F}\}.

Exercise. Show the span is a subspace.

Hint. Use a(\sum_j a_j v_j) = \sum_j aa_j v_j and \sum_j a_j v_j + \sum_j b_j v_j = \sum_j (a_j + b_j) v_j.

The vectors v_1,\ldots,v_k\in\bm{F}^I are independent if for a_1,\ldots,a_k\in\bm{F}, \sum_j a_j v_j = 0 implies a_j = 0 for all j.

The vectors v_1,\ldots,v_k\in\bm{F}^I are dependent (not independent) if there exist a_1,\ldots,a_k\in\bm{F}, not all 0, with \sum_j a_j v_j = 0. If a_j\not=0 then v_j = -\sum_{i\not=j} a_i/a_j v_i.

Linear Operator

Any function T\colon\bm{F}^I\to\bm{F}^J satisfying T(ax) = a(Tx) and T(x + y) = Tx + Ty for a\in\bm{F}, x,y\in\bm{F}^I is a linear operator. We use \mathcal{L}(\bm{F}^I, \bm{F}^J) and [\bm{R}^I\to\bm{F}^J] to denote the space of all linear operators. Linear operators also are a vector space with scalar multiplication (aT)(x) = aT(x) and vector addition (T + U)(x) = Tx + Ux, a\in\bm{F}, T,U\in[\bm{F}^I\to\bm{F}^J], x\in\bm{F}^I.

Exercise. Prove T0 = 0.

Hint: Usee x + x = x implies x=\bm{0}.

The kernel of a linear operator T\colon\bm{F}^I\to\bm{F}^J is {\ker T = \{x\in\bm{F}^I\mid Tx = 0\}\subseteq\bm{F}^J}.

Exercise. Show \ker T is a subspace of \bm{F}^I.

Hint. Show if Tx = 0 then T(ax) = 0 and Tx = 0, Ty = 0 implies T(x + y) = 0$.

Exercise. Show if Tx = Ty then x - y\in\ker T.

Hint. Use T is linear.

Exercise. Show if \ker T = \{0\} then T is injective.

The matrix of a linear operator T\in[\bm{F}^I\to\bm{F}^J] is [T]\in\bm{F}^{I\times J} defined by [t_{ij}] = [T](i,j) = e_j^{J*} Te_i^I, i\in I, j\in J.

Since I\times J is a set \bm{F}^{I\times J} is a vector space and [T] = \sum_{i,j} t_{ij} e_{ij} where (e_{ij})_{i\in I, j\in J} is the standard basis of \bm{F}^{I\times J}.

Exercise. Show T\mapsto [T] is linear.

Hint. Show [aT] = a[T] and [T + U] = [T] + [U].

Exercise. If T\colon\bm{F}^I\bm{F}^J and U\colon\bm{F}^J\to\bm{F}^K then [UT](i,k) = \sum_{j\in J} t(i,j) u(j,k).

Hint.