Tensor

Keith A. Lewis

March 30, 2025

There seems to be some uncertainty in the computer science community about exactly what a tensor is. This short note clarifies this by giving names to functions and composing them. The mathematical definitions can be translated directly into any computer language that allows functions to be defined and called.

The vector space \bm{R}^n is the set of tuples {x = (x_1,\ldots,x_n)} where {x_i\in\bm{R}}, {1\le i\le n}. Scalar multiplication is defined by {a(x_1,\ldots,x_n) = (ax_1,\ldots,ax_n)} for {a\in\bm{R}}. Vector addition is defined by {(x_1,\ldots,x_n) + (y_1,\ldots,y_n) = (x_1 + y_1,\ldots,x_n + y_n)}.

Definitions involving dots are not ammenable to computer implementation. Every vector {x\in\bm{R}^n} determines a function {\bm{x}\colon\bm{n}\to\bm{R}} by {\bm{x}(i) = x_i} for {i\in\bm{n} = \{1,\ldots,n\}}. Scalar multiplication and vector addition can be defined pointwise by {(a\bm{x})(i) = a(\bm{x}(i))} and {(\bm{x} + \bm{y})(i) = \bm{x}(i) + \bm{y}(i)}, {1\le i\le n}.

The set of tuples \bm{R}^n is not the same as the set of functions \bm{R}^\bm{n} but they are in one-to-one correspondence. Define a map {\iota\colon\bm{R}^n\to\bm{R}^\bm{n}} by {\iota x(i) = x_i}, {x\in\bm{R}^n}, {i\in\bm{n}}.

Exercise. Show if \iota x = \iota y then x = y, x,y\in\bm{R}^n

Hint: Use x = y if and only if x_i = y_i for all i\in\bm{n}.

This shows \iota is one-to-one, or injective.

Exercise. Show for every \bm{x}\in\bm{R}^\bm{n} there exists x\in\bm{R}^n with \iota x = \bm{x}.

Hint: Given \bm{x}\in\bm{R}^\bm{n} let x_i = \bm{x}(i), i\in\bm{n}.

This show \iota is onto, or surjective. A function that is one-to-one and onto is a one-to-one correspondence, or bijective.

Exercise. Show \iota(ax) = a(\iota x) and \iota(x + y) = \iota x + \iota y for a\in\bm{R}, x,y\in\bm{R}^n.

This shows \iota is a linear operator, a function from one vector space to another that preserves the linear structure. If a linear operator is bijective we call it an isomorphism.

Cartesian Product

The cartesian product \prod_{i\in I}A_i of sets A_i, i\in I, is defined by projections \pi_i\colon \Pi_{i\in I}A_i\to A_i, i\in I having the property that if P is a set with functions p_i\colon P\to A_i, i\in I, then there exists a function p\colon P\to\prod_{i\in I}A_i with \pi_ip = p_i, i\in I.

If P is a set with functions p_i\colon P\to A_i, i\in I having the property that if Q is a set with functions q_i\colon Q\to A_i, i\in I, then there exists a function q\colon Q\to P with p_iq = q_i, i\in I, then P is in one-to-one correspondence with \prod_{i\in I}A_i.

Exercise. Prove this.

Hint: Let Q = \prod_{i\in I}A_i and q_i = \pi_i.

Solution We have p\colon P\to\prod_{i\in I}A_i with \pi_ip = p_i and q\colon\prod_{i\in I}A_i\to P with p_i q = \pi_i. Use this to show this defines an isomorphism.

Set Exponential

Given sets A and B, the set exponential B^A is the set of all functions from A to B. It is related to cartesian product by C^{A\times B} is in one-to-one correspondence with C^{B^A}. If we write B^A = \{A\to B\} this says \{A\times B\to C\} is in one-to-one correspondence with \{A\to\{B\to C\}\} and is more easily recognized as currying. If we have a function f\colon A\times B\to C and a\in A we can define a function from B\to C by b\mapsto f(a,b). Conversly, if we have a function g\colon A\to C^B we can define a function from A\times B to C by (a,b)\mapsto (g(a))(b) = (ga)b.

If f\colon A\to B is a function then the value of f at a\in A is f(a). We reify this with the name \operatorname{eval}_B^A\colon B^A\times A\to B defined by \operatorname{eval}_B^A(f,a) = f(a). If g\in (C^B)^A and a\in A then \operatorname{eval}_{C^B}^A(g, a)\in C^B is called partial evaluation.

Tuple, Array

Given any index set I the cartesian product \prod_{i\in I}\bm{R} is isomorphic to the set exponential \bm{R}^I. Elements of the cartesian product are tuples. Elements of the set exponential are arrays.

A view of \bm{R}^I is a function \sigma\colon J\to I and is used to define a function \circ\sigma\colon\bm{R}^I\to\bm{R}^J by right composition (\circ\sigma)x = x\circ\sigma = x\sigma so (x\sigma)(j) = x(\sigma(j)), x\in\bm{R}^I, j\in J.

Exercise. Show \circ\sigma is a linear operator.

Hint: Show a(x\sigma) = (ax)\sigma and (x + y)\sigma = x\sigma + y\sigma.

Exercise. Show if \sigma is one-to-one then \circ\sigma is onto.

Hint: Show there exists \sigma^\dashv\colon I\to J with \sigma^\dashv\sigma = 1_J. If x\in\bm{R}^J then x\sigma^\dashv\in\bm{R}^I and x\sigma^\dashv\sigma = x.

Exercise. Show if \sigma is onto then \circ\sigma is one-to-one.

Hint: Show there exists \sigma^\vdash\colon J\to I with \sigma\sigma^\vdash = 1_I. If x\sigma = y\sigma, x,y\in\bm{R}^I, then x\sigma\sigma^\vdash = y\sigma\sigma^\vdash and x = y.

This shows if \sigma is one-to-one and onto then \circ\sigma is an isomorphism.

If J\subset I we write [J]\colon J\to I for the inclusion map and call x[J] the projection on J of x\in\bm{R}^I.

Dual

The dual of the vector space \bm{R}^I is the set of all linear operators from \bm{R}^I to \bm{R}. The standard basis of \bm{R}^I is (e_i)_{i\in I} where e_i(j) = \delta_{ij} is the Kronecker delta: \delta_{ij} = 1 if i = j and \delta_{ij} = 0 if i \not= j.

Exercise. Show x = \sum_{i\in I} x(i)e_i if I is finite.

Hint: Evaluate x(j), j\in I.

Matrix

A matrix is an array where the index set is the cartesian product of two sets \bm{R}^{I\times J}.

A vector is usually represented on a computer as a contiguous array of memory with elements of the same type. The real numbers are uniquely characterized mathematically as a complete Archemedean ordered field. Computers can only model real numbers as a finite number of bits. The most common representation is 64-bit or 32-bit IEEE 754 floating point.