Decision problems are yes-or-no problems; this means, problems in which for every (sensible) input a "yes" or "no" answer must be given. Here are examples of some decision problems:
(Actually, to make these problems precise, one would need to specify what representation is used for the input; notice that there are several representations of graphs, etc.)
Let is ask if for any problem above there is an algorithm,
which accepts the problem inputs, always correctly answers "yes" or "no", and always terminates.
The reader can construct such algorithms for the first five problems. Is there
an algorithm for problem 6? Running the Java program (or simulating a run) provides
an answer to the question about halting, but only if the Java program terminates! If the Java program does not terminate
you will never know if it is going to run indefinitely or if it is going to
terminate after just one more minute of execution. So, an approach would be needed other than running the
program. You can look at the code, and if you
spot
while(true){...},
which is executed in the program, you will know that the
program does not terminate. But there are programs which do not terminate and do
not contain any error as obvious as
while(true){...}.
Is there an algorithm to perform such an analysis in those less obvious
cases? It has been proved
that such an algorithm does not exist. I do not mean that such an algorithm has not
been yet discovered -- I mean, there is no such an algorithm and there never will
be! So, we say that
the halting problem is undecidable. Remarkably,
undecidability of the halting problem has been proved for all sufficiently advanced programming languages (not just
Java) and this result has been obtained in 1930's -- years before computers and
Java came into being. It is also remarkable that the proof is based on an idea
of Epimenides, an ancient Greek philosopher who asked: if a man is
saying "I am lying", is he lying or telling the truth? This
logical problem lead to formulation of the diagonal method which proved
not only the undecidability of the halting problem but also many other results.
It is easy to provide an algorithm corresponding to any decision problem which accepts that problem's inputs and which gives only correct answers, however it may turn out impossible to make this algorithm to give an answer for every input. This may be restated: it may turn out impossible to make the algorithm to terminate on every input.
Now let us take a look at problems which have terminating decision algorithms. Let us ask, in the case of problems 1-5 above, how long it takes for an algorithm to terminate. Of course, for any chosen problem this still depends on:
With some technical considerations, the question how long it takes for an algorithm to terminate can be made independent of how it is translated to a program and what computer it runs on. (You can take it for granted or you can make some simplifying assumptions, such as: we will consider only the algorithms that are already written in Java and we will run them on unix.cs.plattsburgh.edu.) The two essential parameters remain: what algorithm you chose for this problem and what input you give.
If for (sufficiently big) n, on all inputs of length up to n, an algorithm terminates in
a time proportional to n or better, we will call it an O(n) algorithm or
a linear-or-better algorithm. (O(n) is read "big Oh of n").
If for (sufficiently big) n, on all inputs of length up to n, an
algorithm terminates in a time proportional to n2 or better we will call it an O(n2) algorithm or a quadratic-or-better
algorithm.
Similarly, we can define:
O(n3) algorithms, i.e. cubic-or-better algorithms,
O(n log n) algorithms, i.e. (n log n)-or-better algorithms,
O(2n) algorithms, i.e. 2n-or-better algorithms,
etc.
Notice the phrase "or better" used in these definitions -- because of it, every O(n) algorithm is also a O(n log n) algorithm, which is also a O(n2) algorithm, which is also a O(n3) algorithm, which is also a O(2n) algorithm.
If an algorithm is a O(n) algorithm but not a O(nk) algorithm for any k<1 then we call it a theta(n) algorithm or a linear algorithm. If an algorithm is a O(n2) algorithm but not a O(nk) algorithm for any k<2 then we call it a theta(n2) algorithm or a quadratic algorithm. If an algorithm is a O(n3) algorithm but not a O(nk) algorithm for any k<3 then we call it a theta(n3) algorithm or a cubic algorithm. One can also define:
theta(n log n) algorithms, i.e. n log n algorithms,
theta(2n) algorithms, i.e. 2n algorithms,
etc.
Algorithms which are 2n or 3n or kn for some fixed real number k>1 are called exponential algorithms. Notice that 23n = 8n , so a 23n algorithm is exponential.
There can be many algorithms for the same problem; roughly speaking there can be some best (fastest) algorithm and other algorithms which give the same output but waste time by doing things differently or even doing things which lead nowhere. Among many algorithms for one problem, a linear algorithm is better than any n log n algorithm, which is better than any quadratic algorithm, which is better than any cubic algorithm, which is better than any 2n algorithm. We say that a problem has linear complexity if the best (fastest) algorithm for that problem is linear. We say that a problem has quadratic complexity if the best (fastest) algorithm for that problem is quadratic. Etc.
For instance, the problem if for given two sorted lists, every member of the first list belongs to the second list, can be decided by a naive quadratic algorithm which involves a loop nested in another loop. The same problem can be decided by a linear algorithm which uses only one loop and takes advantage of the assumption that both lists are sorted. (Can you formulate such an algorithm?) So, this problem is linear or better. Actually, it cannot be better than linear because in order to process all the n elements of the lists one needs at least time proportional to n. So the problem is linear.
Algorithms which are linear or n log n are considered fast. Algorithms which are quadratic, cubic or nk for some fixed k which is not too big are considered acceptable. Exponential algorithms are considered impractical.
Hamiltonian path problem has an exponential algorithm. It is not known if a better (faster) algorithm exists. Proving or disproving existence of a better algorithm would solve the most famous problem of theoretical computer science -- the P=NP? problem. This problem was formulated in 1971 and remains open. It is one of the seven Millennium Problems with a $1,000,000 prize funded by Clay Mathematics Institute.
Decision problems without algorithms and related issues are considered in our csc422 - Theory of Computation. Useful algorithms and their complexity are considered in our csc321 - Design and Analysis of Algorithms. P=NP? is an extra credit problem given in either of theses courses; solving it will give you an A+ in the course (and a world fame rivalling that of Albert Einstein).
Originally published in Standard Output, Vol. 2, No. 3, April 2007.