Logical Entropy

Keith A. Lewis

January 26, 2025

Abstract

An improvement over Shannon entropy

Shannon entropy is defined for measures, not random variables. It is a common mistake to define the entropy of a discrete random variable XX, where the probability XX takes on the value xjx_j with probability pjp_j, as jpjlogpj-\sum_j p_j \log p_j. Note this definition of entropy does not involve any xjx_j.

It is an even worse mistake to define the entropy of a random variable with density ff as Rf(x)logf(x)dx-\int_\mathbf{R}f(x)\log f(x)\,dx. It is true that every random variable XX with cumulative distribution F(x)=P(Xx)F(x) = P(X\le x) can be modelled as the identity function on the sample space of real numbers with probablility measure P((,x])=xdF(x)P((-\infty, x]) = \int_{-\infty}^x dF(x).

Joint distributions. Negative entropy.

Logical entropy can incorporate values or random variables…

Logic

Boole and De Morgan layed down the foundation for computer science by showing predicate calculus can be reduced to arithmetic.

Frege introduced the notion of parameterizing a predicate with a variable.

Set

Frege advanced the predicate calculus of logic developed by Boole and De Morgan by

A set is a collection of elements that are members of the set.

S={a,...}S = \{a,...\}.

Relation

A relation RR on sets AA and BB is a subset of their cartesian product RA×BR\subseteq A\times B. The domain of RR is domR=A\operatorname{dom}R = A and the codomain of RR is codR=B\operatorname{cod}R = B. We write R ⁣:AB{R\colon A\leftrightarrow B} and aRbaRb when (a,b)R{(a,b)\in R}. The codset aR={bBaRb}{aR = \{b\in B\mid aRb\}} and the domset Rb={aAaRb}{Rb = \{a\in A\mid aRb\}}. If SB×CS\subseteq B\times C is a relation on sets BB and CC, then the composition SR ⁣:ACSR\colon A\leftrightarrow C is a relation on A×CA\times C where a(SR)ca(SR)c if and only if there is a bBb\in B with aRbaRb and bScbSc. This is used to define the join of relational database tables.

Function

A relation R ⁣:ABR\colon A\leftrightarrow B is a function if the codset aRaR has exactly one element for every aAa\in A. To indicate this we write R ⁣:ABR\colon A\to B and b=R(a)b = R(a) where bb is the unique element of aRaR. A function is injective, or one-to-one, if RbRb has at most one element for all bBb\in B.

Exercise. If f ⁣:ABf\colon A\to B is injective and g,h ⁣:CAg,h\colon C\to A are functions where fg=fhfg = fh then g=hg = h.

A function is surjective, or onto, if Rb>0|Rb| > 0 for all bBb\in B. There is some aAa\in A with aRbaRb.

Exercise. If f ⁣:ABf\colon A\to B is surjective and g,h ⁣:BCg,h\colon B\to C are functions where gf=hfgf = hf then g=hg = h.

A part of a set AA is an injective function P ⁣:IAP\colon I\to A. It corresponds to the subset P(I)={P(i)iI}AP(I) = \{P(i)\mid i\in I\}\subseteq A.

A partition of a set AA is a surjective function Π ⁣:AI\Pi\colon A\to I. Let Ai={aAΠ(a)=i}{A_i = \{a\in A\mid \Pi(a) = i\}}.

Exercise. Show either Ai=AjA_i = A_j or AiAj=A_i\cap A_j = \emptyset for i,jIi,j\in I.

Exercise. Show iIAi=A\cup_{i\in I}A_i = A.

Partitions are a mathematically rigorous way of representing partial information. Given a set AA, full information is the partition of singletons {{a}aA}{\{\{a\}\mid a\in A\}}, no information is {A}\{A\}, partial information is knowing only which AiA_i in the partition that aAa\in A belongs to.