September 18, 2025
The model underlying the perceptron was introduced by (McCulloch and Pitts 1943), a neuropsychiatrist and a logician, respectively, working in Chicago. Their abstract starts with:
Because of the “all-or-none” character of nervous activity, neural events and the relations among them can be treated by means of propositional logic.
It is difficult to imagine the tenor of 1943, in the middle of World War II.
Many years ago one of us, by considerations impertinent to this argument, was led to conceive of the response of any neuron as factually equivalent to a proposition which proposed its adequate stimulus. He therefore attempted to record the behavior of complicated nets in the notation of the symbolic logic of propositions.
To present the theory, the most appropriate symbolism is that of Language II of R. Carnap (1938), augmented with various notations drawn from B. Russell and A. N. Whitehead (1927)…
(Rosenblatt 1957) coined the term perceptron and described the model used today. The limitations of a single perceptron were quickly realized, and Rosenblatt used layers containing multiple perceptrons, a neural network, to build his Mark I Perceptron. His model assumed the output of a perceptron was 0 or 1, corresponding to an actual neuron either firing or not firing. Later models softened this to emitting a number between 0 and 1 that was a smooth function of the inputs, such as the logistic sigmoid; modern networks often use the (non-smooth) ReLU instead.
The New York Times, reporting in 1958 on the Navy-funded project, described it as “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”
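To make the difference between these outputs concrete, here is a minimal Python sketch contrasting the all-or-none threshold with a smooth sigmoid; the function names and sample inputs are my own illustration, not code from the papers above.

```python
import math

# The all-or-none threshold of the early models next to a smooth
# sigmoid activation.
def step(z: float) -> int:
    """Fires (1) or does not fire (0): the original 0/1 output."""
    return 1 if z > 0 else 0

def sigmoid(z: float) -> float:
    """Smooth output strictly between 0 and 1, differentiable everywhere."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"z = {z:+.1f}   step = {step(z)}   sigmoid = {sigmoid(z):.3f}")
```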
Large Language Models are functions from strings to strings: given a string of characters, a model returns a string of characters. Models are trained by specifying a set of input strings and the expected output strings, and then finding a function that interpolates this data.
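As a toy illustration of this functional view, the sketch below uses a lookup table to stand in for a trained model; the names `LanguageModel` and `train` and the sample pairs are hypothetical, not any real API.

```python
from typing import Callable

# A language model viewed abstractly: a function from strings to strings,
# fit to (input, expected output) pairs.
LanguageModel = Callable[[str], str]

training_data: list[tuple[str, str]] = [
    ("The capital of France is", " Paris"),
    ("2 + 2 =", " 4"),
]

def train(data: list[tuple[str, str]]) -> LanguageModel:
    """Toy 'training': memorize the pairs exactly. A real model instead
    learns a parametric function that interpolates between them."""
    table = dict(data)
    return lambda prompt: table.get(prompt, "")

model = train(training_data)
print(model("2 + 2 ="))  # -> " 4"
```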
In (Searle 1980) the philosophical aspects of this were considered. He stipulated a Chinese Room in which text written in Chinese is slipped under a door and a human being who does not know Chinese simply applies a set of fixed rules to produce a reply in Chinese characters.
Recall that a function from set A to set B is an element of the set exponential B^A = \{f\colon A\to B\}. Every function can be represented by its graph \{(a, f(a))\mid a\in A\}, which is a subset of the cartesian product A\times B. If A has |A| elements and B has |B| elements then B^A has |B^A| = |B|^{|A|} elements, so the count grows extremely fast: with |A| = |B| = 100 there are already 100^{100} = 10^{200} functions. Given that the current estimate of the number of elementary particles in the universe is considerably less than 10^{100}, it is not physically possible to represent the graphs of all functions in B^A on a computer, even for moderate sizes of A and B.
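The counting argument can be sanity-checked by brute force for small sets; the sets `A` and `B` below are arbitrary choices of mine.

```python
from itertools import product

# Enumerate every function f: A -> B as its graph {(a, f(a))} and
# confirm |B^A| = |B|^|A| for small sets.
A = ("a1", "a2", "a3")   # |A| = 3
B = (0, 1)               # |B| = 2

graphs = [tuple(zip(A, values)) for values in product(B, repeat=len(A))]
print(len(graphs), "==", len(B) ** len(A))  # 8 == 8
```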
(Church 1941) invented the lambda calculus to express any computable function. His student Alan Turing invented his eponymous Turing Machine, which has equivalent computing power.
Turing considered an infinite discrete tape that could have marks made on it, with a head that moves left and right. Its legacy lives on in the programming language Brainfuck.
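Here is a minimal Brainfuck interpreter, written as a sketch in Python (the input command `,` is omitted); its tape of cells and left/right-moving head mirror Turing's machine.

```python
def brainfuck(code: str, tape_len: int = 30000) -> None:
    """Minimal Brainfuck interpreter: a long tape of cells plus a head
    that moves left and right, echoing Turing's tape-and-marks model."""
    tape, head, pc = [0] * tape_len, 0, 0
    # Pre-match brackets so '[' and ']' can jump in O(1).
    jumps, stack = {}, []
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(code):
        c = code[pc]
        if c == ">":
            head += 1
        elif c == "<":
            head -= 1
        elif c == "+":
            tape[head] = (tape[head] + 1) % 256
        elif c == "-":
            tape[head] = (tape[head] - 1) % 256
        elif c == ".":
            print(chr(tape[head]), end="")
        elif c == "[" and tape[head] == 0:
            pc = jumps[pc]   # skip the loop body
        elif c == "]" and tape[head] != 0:
            pc = jumps[pc]   # jump back to the matching '['
        pc += 1

brainfuck("++++++++[>++++++++<-]>+++++++++.")  # prints "I"
```

Pre-matching the brackets is an implementation choice for speed, not something the language requires; a naive scan for the matching bracket works too.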
Church’s lambda calculus specified an expression as either a variable, an abstraction, or an application. A variable is a unique symbol; an abstraction is \lambda x.\, E where x is a variable and E is an expression; an application is E\,F where E and F are expressions. Unlike Brainfuck, many languages based on the lambda calculus have been written. One of the earliest was Standard ML, with its well-known Standard ML of New Jersey implementation; the ML family also produced OCaml, which Jane Street Capital uses. The most popular functional programming language today is Haskell; early implementations of lazy functional languages like it compiled programs to SKI combinators.
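This grammar translates almost directly into an algebraic data type; below is one possible Python encoding with a small pretty-printer. The constructor names are my choice; only variable, abstraction, and application come from the source.

```python
from dataclasses import dataclass

# One encoding of Church's grammar: an expression is a variable,
# an abstraction \x. E, or an application E F.
@dataclass
class Var:
    name: str

@dataclass
class Abs:
    var: str        # the bound variable x
    body: "Expr"    # the expression E in \x. E

@dataclass
class App:
    func: "Expr"    # E in the application E F
    arg: "Expr"     # F

Expr = Var | Abs | App

def show(e: Expr) -> str:
    """Render an expression in conventional lambda notation."""
    match e:
        case Var(name):
            return name
        case Abs(var, body):
            return f"(λ{var}. {show(body)})"
        case App(func, arg):
            return f"({show(func)} {show(arg)})"

print(show(App(Abs("x", Var("x")), Var("y"))))  # ((λx. x) y)
```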
Given sets S_0, S_1\subseteq\boldsymbol{R}^n, a function f\colon\boldsymbol{R}^n\to\{0,1\} classifies the data if x\in S_0 implies f(x) = 0 and x\in S_1 implies f(x) = 1. The classification problem is to find such a function.
A perceptron is a function \pi\colon\boldsymbol{R}^n\to\{0,1\} defined by a bias b\in\boldsymbol{R} and weights w\in\boldsymbol{R}^n, where \pi(x) = 1 if b + w\cdot x > 0 and \pi(x) = 0 otherwise.
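A direct transcription of this definition in Python, with weights and bias chosen by hand (my example values) so that the perceptron computes logical AND:

```python
# pi(x) = 1 if b + w.x > 0, else 0.
def perceptron(w: list[float], b: float, x: list[float]) -> int:
    return 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# Example: this choice of w and b computes logical AND on {0,1}^2.
w, b = [1.0, 1.0], -1.5
for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(x, "->", perceptron(w, b, x))  # 0, 0, 0, 1
```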
Given an input x with desired output d, the perceptron learning rule updates the weights by w' = w + r(d - f(x))x, where r > 0 is the learning rate, with the analogous update b' = b + r(d - f(x)) for the bias.
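A sketch of the rule in action, fitting the OR function; the learning rate, epoch count, and data are illustrative choices of mine.

```python
# Repeated application of w' = w + r(d - f(x))x and b' = b + r(d - f(x)).
def train_perceptron(data, r=0.1, epochs=20):
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, d in data:
            y = 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            err = d - y                      # d - f(x)
            w = [wi + r * err * xi for wi, xi in zip(w, x)]
            b += r * err
    return w, b

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # OR
w, b = train_perceptron(data)
print(w, b)  # a separating hyperplane for OR
```

Since OR is linearly separable, the perceptron convergence theorem guarantees this loop reaches a correct classifier in finitely many updates.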