*complexity class P*. Having recently joined the Cryptography group at Bristol as a mathematician, I knew very little theoretical computer science when I first started my PhD, and I'm sure there will be plenty of others in my situation, so this blog will start right from the beginning and you can skip over parts you know already. First, we'll give an overview of what *complexity* means and why it matters, then we'll define *Turing machines*, and finally arrive at the *complexity class P*, concluding with an example.

Much of what follows is drawn from *Introduction to the Theory of Computation* by Michael Sipser [1], which I have found hugely helpful.

**Section 1: Complexity and Big O Notation**

A natural way to measure how hard a computational task is, is to count the *number of operations* that a certain *model* of a computer would take to complete it. This is called *(time) complexity theory*.

Typically, though, the number of operations required will depend on the *input* to the task and may vary even with inputs of the same length. As a pertinent example, say we design a computer program which tells you whether or not an integer you input is prime. If we give as input the number 256, the program will probably output 'not prime' sooner than if we had given it the number 323 (even though they both have length 9 when written as binary integers, for example), since the first integer has a very small prime factor (2) and the second has larger factors (17 and 19). Therefore we usually opt for a *worst-case analysis*, where we record the longest running time of all inputs of a particular length. So we obtain an algebraic expression $t(n)$ that reflects the longest running time of all inputs of length $n$.
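
To make the primality example concrete, here is a minimal Python sketch (my own illustration; the function name and step counter are invented) of a trial-division test that records how many divisions it performs:

```python
def is_prime(n):
    """Trial division: return (is n prime?, number of divisions tried)."""
    steps = 0
    if n < 2:
        return False, steps
    d = 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            return False, steps   # found a factor, so we can stop early
        d += 1
    return True, steps

# 256 has the tiny factor 2, so one trial division settles it;
# 323 = 17 * 19 forces us to try every d from 2 up to 17.
print(is_prime(256))  # (False, 1)
print(is_prime(323))  # (False, 16)
```

The early exit on a small factor is exactly why 256 is settled faster than 323 even though both inputs have the same length; a worst-case analysis instead records the slowest input of each length.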

Since we care about behaviour on large inputs, we keep only the *dominant* term in the expression and also ignore any constant factors. This is called *asymptotic analysis*; we assume $n$ is enormous and ask roughly how many steps the model of computation will take to 'finish' when given the worst possible input of length $n$, writing our answer in the form $\mathcal{O}\left(t\left(n\right)\right)$. For example, if we find that our process takes $6n^3 - n^2 + 1$ steps, we write that it is $\mathcal{O}\left(n^{3}\right)$, since all other terms can be ignored for very large $n$.
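
As a quick numerical illustration (using the hypothetical step count $6n^3 - n^2 + 1$ from above), the ratio of the full expression to its dominant term settles towards the constant 6, which is why the lower-order terms can be ignored:

```python
def t(n):
    # hypothetical worst-case step count from the text
    return 6 * n**3 - n**2 + 1

# The ratio t(n) / n^3 approaches the constant 6 as n grows,
# so t(n) is O(n^3): lower-order terms and constant factors wash out.
for n in (10, 100, 1000):
    print(n, t(n) / n**3)
```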

**Section 2: Turing Machines**

First, some terminology: an *alphabet* is a non-empty finite set, a *string* is a finite sequence of elements (symbols) from an alphabet, and a *language* is simply a set of strings.

A *Turing machine* models what real computers can do. Its 'memory' is an infinitely long tape. At any time, each square of the tape is either blank or contains a symbol from some specified alphabet. The machine has a tape head that can move left or right along the tape, one square at a time, and read from and write to that square. At first, the tape is all blank except for the leftmost $n$ squares, which constitute the input (none of which can be blank, so that it is clear where the input ends). The tape head starts at the leftmost square, reads the first input symbol and then decides what to do next according to a *transition function*. The transition function depends on the symbol at the square the head is currently on and the *state* that the machine is currently in (like a record of what it has done so far), and returns

- a new state
- another symbol to write to the square it is on (though this symbol might be the same as what was already written there)
- a direction to move in: left or right.
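
The description above can be turned into a small simulator. This is a sketch under my own conventions (a dict-based transition function, `'_'` for the blank symbol, a one-way infinite tape), not a standard API; the example machine decides the language of binary strings containing an even number of 1s:

```python
BLANK = '_'

def run_turing_machine(transitions, input_string, start, accept, reject,
                       max_steps=10_000):
    """Simulate a single-tape Turing machine.

    `transitions` maps (state, symbol) -> (new_state, write_symbol, direction),
    where direction is 'L' or 'R'. Returns True (accept) or False (reject)."""
    tape = dict(enumerate(input_string))  # sparse tape: unset squares are blank
    head, state = 0, start
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, BLANK)
        state, write, direction = transitions[(state, symbol)]
        tape[head] = write
        head += 1 if direction == 'R' else -1
        head = max(head, 0)  # the tape has a leftmost square
    raise RuntimeError("machine did not halt within max_steps")

# Example machine (my own, for illustration): the states 'even'/'odd' record
# the parity of 1s seen so far; reading the first blank ends the input.
T = {
    ('even', '0'): ('even', '0', 'R'),
    ('even', '1'): ('odd',  '1', 'R'),
    ('odd',  '0'): ('odd',  '0', 'R'),
    ('odd',  '1'): ('even', '1', 'R'),
    ('even', BLANK): ('accept', BLANK, 'R'),
    ('odd',  BLANK): ('reject', BLANK, 'R'),
}
print(run_turing_machine(T, "1011", 'even', 'accept', 'reject'))  # False
print(run_turing_machine(T, "1001", 'even', 'accept', 'reject'))  # True
```

Each pass through the loop applies the transition function once, so counting iterations is exactly the 'number of operations' of Section 1.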

Among the machine's states are two special ones: an *accept state* and a *reject state*. If the machine ever enters the accept state, it stops and *accepts its input*. Similarly, it may *reject its input*. In either case we say the machine *halts* on its input. But note that it may enter a loop without accepting or rejecting, i.e. it may never halt. If a Turing machine accepts every string in some language and rejects all other strings, then we say the machine *decides* that language. We can think of this as the machine testing whether or not the input string is a member of the language. Given a language, if there is a Turing machine that decides it, we say the language is *decidable*.

Although simple, Turing machines are believed to capture everything that any reasonable model of computation can do (this is the *Church-Turing thesis* [2]). We define the *time complexity class* $\mathrm{TIME}\left(t\left(n\right)\right)$ to be the collection of all languages that are decidable by an $\mathcal{O}\left(t\left(n\right)\right)$ time Turing machine. Then we turn computational problems into questions about language membership (is an input string a member of a certain language? e.g. does this string representing an integer belong to the language of strings representing prime integers?) and can partition computational problems into time complexity classes.

**Section 3: The Complexity Class P**

A Turing machine runs in *polynomial time* if it halts within $\mathcal{O}\left(n^{k}\right)$ steps on inputs of length $n$, for some constant $k$. The *complexity class P* is the class of all languages that are decidable in polynomial time by a Turing machine. Since $k$ could be very large, such Turing machines are not necessarily all practical (let alone 'fast'!), but this class is a rough model for what can be realistically achieved by a computer. Note that the class P is fundamentally different to those languages where $t(n)$ has $n$ in an exponent, such as $2^n$, which grow much, much faster as $n$ increases; so fast that even if you have a decider for some language, you may find that the universe ends before it halts on your input!
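
To see how different polynomial and exponential growth really are, compare $n^3$ with $2^n$ for a few input lengths (a toy illustration in Python):

```python
# Polynomial bound n^3 versus exponential bound 2^n
# (values of n chosen purely for illustration).
for n in (10, 50, 100):
    print(n, n**3, 2**n)
```

Already at $n = 100$, $n^3$ is a million steps, entirely feasible, while $2^n$ is a 31-digit number, far beyond any realistic number of operations.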

As an example, suppose we are given two nodes A and B of a *directed graph* (a set of nodes and edges where there is at most one edge between any pair of nodes and each edge has an arrow indicating a direction). If we encode the graph and the two nodes as a single string, we can form a language consisting of those strings representing a graph and two nodes such that it is possible to follow the edges from the first node and eventually arrive at the second. So a decider for this language will effectively answer the question of whether there is a path from the first node A to the second B, called the *path problem*, by accepting or rejecting the graph and nodes you input. We give a decider for this language and show that it decides in polynomial time.

- First put a mark on A.
- Scan all the edges of the graph. If you find an edge from a marked node to an unmarked node, mark the unmarked node.
- Repeat the above until you mark no new nodes.
- If B is marked, accept. Otherwise, reject.
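
The four steps above translate almost directly into Python. This is my own sketch (the function name and graph encoding are invented; `edges` is a set of directed pairs):

```python
def path_exists(edges, a, b):
    """Decide the path problem by the marking procedure above."""
    marked = {a}                      # first, put a mark on A
    changed = True
    while changed:                    # repeat until no new node is marked
        changed = False
        for (u, v) in edges:          # scan all the edges of the graph
            if u in marked and v not in marked:
                marked.add(v)         # mark the unmarked endpoint
                changed = True
    return b in marked                # accept iff B ends up marked

edges = {('A', 'C'), ('C', 'B'), ('B', 'D'), ('E', 'A')}
print(path_exists(edges, 'A', 'D'))  # True:  A -> C -> B -> D
print(path_exists(edges, 'A', 'E'))  # False: no edge leads into E from A's side
```

Each pass scans every edge, and the loop runs at most once per node (a pass that marks nothing ends it), so the running time is polynomial in the size of the graph.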

The first and last steps are each carried out once, and each scan of the edges takes time polynomial in the size of the input. The loop repeats at most as many times as there are nodes, since every repetition except the last marks at least one new node. So the whole procedure runs in polynomial time, and *the path problem is in P*.