**Asymptotic Performance:** In algorithm design and development, we are concerned with asymptotic performance, i.e., how does the algorithm behave as the problem size gets very large? The quantities of interest are:

- Running time - Time Complexity
- Memory/storage requirements - Space Complexity - essentially the number of memory cells an algorithm needs
- Bandwidth/power requirements/logic gates/etc.

There is often a *time-space tradeoff* involved in a problem, that is, it cannot be solved with both little computing time *and* low memory consumption. One then has to make a compromise and trade computing time for memory consumption or vice versa, depending on which algorithm one chooses and how one parametrizes it.

**Analysis of algorithms** is performed with respect to a computational model. A generic uniprocessor random-access machine (RAM) is used, with:

- All memory equally expensive to access
- No concurrent operations
- All reasonable instructions taking unit time, except, of course, function calls
- Constant word size, unless bits are explicitly manipulated

Time and space complexity are generally functions of the input size. How we characterize input size depends on the problem:

- Number of input items - Sorting
- Total number of bits - Multiplication
- Number of nodes and edges - Graph Algorithms etc.

**Time Complexity analysis**

- Worst Case provides an upper bound on running time, i.e., an absolute guarantee
- Average Case gives the expected time for random (equally likely) or real inputs

For example:

__Insertion Sort__

```
Statement                                    Effort
------------------------------------------   --------------
InsertionSort(A, n) {
    for i = 2 to n {                         c₁·n
        key = A[i]                           c₂·(n-1)
        j = i - 1;                           c₃·(n-1)
        while (j > 0) and (A[j] > key) {     c₄·T
            A[j+1] = A[j]                    c₅·(T-(n-1))
            j = j - 1                        c₆·(T-(n-1))
        }                                    0
        A[j+1] = key                         c₇·(n-1)
    }                                        0
}
```

where T = t₂ + t₃ + … + tₙ and tᵢ is the number of while-expression evaluations for the i-th iteration of the outer loop.

T(n) = c₁n + c₂(n-1) + c₃(n-1) + c₄T + c₅(T-(n-1)) + c₆(T-(n-1)) + c₇(n-1) = c₈T + c₉n + c₁₀

- Best Case: the inner loop never gets executed => tᵢ = 1 => T(n) is linear, i.e., Ω(n)
- Worst Case: the inner loop gets executed for all previous elements => tᵢ = i => T(n) is quadratic

S(n) = *O*(*n*) - in place, *O*(1) extra space

**Correctness of Algorithms and Loop Invariants:** To prove the incorrectness of an algorithm, one counter-example is enough. Proving the correctness of an algorithm is similar to proving a mathematical theorem. Fundamentally, it is algorithm-dependent, but there are still some general guidelines we can follow. An example: proof by loop invariant. In simple words, a loop invariant is some predicate (condition) that holds for every iteration of the loop. For example, let's look at a simple `for` loop:

    int j = 9;
    for (int i = 0; i < 10; i++)
        j--;

In this example it is true (for every iteration) that `i + j == 9`. A weaker invariant that is also true is `i >= 0 && i < 10` (because this is the loop's continuation condition) or `j <= 9 && j >= 0`.

By itself, a loop invariant doesn't do much. However, given an appropriate invariant, it can be used to help prove the correctness of an algorithm. For example, let your loop invariant be: at the start of the loop, the first `i` entries of this array are sorted. If you can prove that this is indeed a loop invariant (i.e., that it holds before and after every loop iteration), you can use it to prove the correctness of a sorting algorithm: at the termination of the loop, the loop invariant is still satisfied, and the counter `i` is the length of the array; therefore, "the first `i` entries are sorted" means the entire array is sorted. For insertion sort (see the sketch below):

- Step 0: Loop invariant: at the start of each iteration of the for loop, the sub-array A[1..j-1] consists of the elements originally in A[1..j-1], but in sorted order.
- Step 1: Initialization (the base case): when j = 2, A[1..1] is trivially sorted.
- Step 2: Maintenance (the inductive step): the loop body shifts elements of A[1..j-1] right until the correct position for key = A[j] is found and inserts it there, so A[1..j] is in sorted order at the start of the next iteration.
- Step 3: Termination: upon completion, j = n+1 and, by the loop invariant, A[1..n] is sorted.
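To make this concrete, here is a minimal C sketch of insertion sort (0-based indexing rather than the 1-based pseudocode above) with the loop invariant checked at runtime; `prefix_is_sorted` is a hypothetical helper added for illustration, not code from the original notes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if A[0..len-1] is in non-decreasing order. */
static int prefix_is_sorted(const int *A, int len) {
    for (int k = 1; k < len; k++)
        if (A[k - 1] > A[k])
            return 0;
    return 1;
}

void insertion_sort(int *A, int n) {
    for (int i = 1; i < n; i++) {            /* corresponds to "for i = 2 to n" */
        assert(prefix_is_sorted(A, i));      /* loop invariant: A[0..i-1] is sorted */
        int key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > key) {       /* shift larger elements right */
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;                      /* insert key into its correct position */
    }
    assert(prefix_is_sorted(A, n));          /* at termination the whole array is sorted */
}

int main(void) {
    int A[] = {5, 2, 4, 6, 1, 3};
    insertion_sort(A, 6);
    for (int k = 0; k < 6; k++) printf("%d ", A[k]);  /* prints: 1 2 3 4 5 6 */
    printf("\n");
    return 0;
}
```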

**Order of growth:** Only the highest-order term counts because, for asymptotic analysis, the higher-order term dominates as the input size grows larger. In general, asymptotic bounds are given as:

- Upper bound: f(n) is O(g(n)) if ∃ positive constants c and n₀ such that f(n) ≤ c·g(n) ∀ n ≥ n₀
- Lower bound: f(n) is Ω(g(n)) if ∃ positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) ∀ n ≥ n₀
- Tight bound: f(n) is Θ(g(n)) if ∃ positive constants c₁, c₂ and n₀ such that c₁·g(n) ≤ f(n) ≤ c₂·g(n) ∀ n ≥ n₀. f(n) is Θ(g(n)) iff it is both O(g(n)) and Ω(g(n)).
- Strict upper bound: f(n) is o(g(n)) if for every positive constant c there exists n₀ such that f(n) < c·g(n) ∀ n ≥ n₀
- Strict lower bound: f(n) is ω(g(n)) if for every positive constant c there exists n₀ such that 0 ≤ c·g(n) < f(n) ∀ n ≥ n₀

Note that, unlike Θ, there is no "strict tight bound": because o and ω require strict inequalities for every constant c, no function can be both o(g(n)) and ω(g(n)). A short worked example of choosing these constants follows.
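As a quick worked illustration of picking the constants (an added example, not from the original notes), consider f(n) = 3n² + 10n and g(n) = n²:

```latex
\begin{gather*}
3n^2 + 10n \le 13n^2 \quad \text{for } n \ge 1 \quad (c = 13,\ n_0 = 1) \;\Rightarrow\; 3n^2 + 10n = O(n^2),\\
3n^2 + 10n \ge 3n^2 \quad \text{for } n \ge 1 \quad (c = 3,\ n_0 = 1) \;\Rightarrow\; 3n^2 + 10n = \Omega(n^2),\\
\text{hence } 3n^2 + 10n = \Theta(n^2).
\end{gather*}
```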

So Insertion sort run time is:

- Upper bound: O(n²)
- Lower bound: Ω(n)

**Recurrences:** Analysis of algorithms requires extensive use of solving recurrences, wherein we break a larger problem into smaller problems and then set up a recurrence to define the time/space complexity of the larger problem in terms of the time/space complexities of the smaller problems. An example of a recurrence is T(n) = 2T(n/2) + Θ(n), where the time for input size n is defined in terms of the time for input size n/2. Recurrences are solved by three methods.

**Substitution Method (the "make a good guess" method):** Guess the form of the answer, then use induction to find the constants and show that the solution works (a proof sketch for the second example appears below). For example:

- T(n) = 2T(n/2) + Θ(n) => T(n) = Θ(n lg n)
- T(n) = 2T(⌊n/2⌋) + n => T(n) = Θ(n lg n)
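As a sketch of how the induction goes for the second recurrence (an added illustration following the standard substitution argument): guess T(n) ≤ c·n·lg n and verify the inductive step.

```latex
\begin{align*}
T(n) &= 2\,T(\lfloor n/2 \rfloor) + n
      \le 2\,c\,\lfloor n/2 \rfloor \lg(\lfloor n/2 \rfloor) + n \\
     &\le c\,n \lg(n/2) + n
      = c\,n \lg n - c\,n + n
      \le c\,n \lg n \qquad \text{for } c \ge 1,
\end{align*}
```

so, once suitable base cases are checked for some n₀, T(n) = O(n lg n); a symmetric lower-bound argument gives Θ(n lg n).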
**Iteration Method:** Expand the recurrence, work some algebra to express it as a summation, and evaluate the summation. For example:

    T(n) = aT(n/b) + cn
         = a(aT(n/b^2) + cn/b) + cn
         = a^2·T(n/b^2) + cn·(a/b) + cn
         ...
         = a^k·T(n/b^k) + cn·(a^(k-1)/b^(k-1) + a^(k-2)/b^(k-2) + … + a^2/b^2 + a/b + 1)

For k = log_b n, i.e., n = b^k:

    T(n) = a^k·T(1) + cn·(a^(k-1)/b^(k-1) + ... + a^2/b^2 + a/b + 1)
         = c·a^k + cn·(a^(k-1)/b^(k-1) + ... + a^2/b^2 + a/b + 1)
         = cn·a^k/b^k + cn·(a^(k-1)/b^(k-1) + ... + a^2/b^2 + a/b + 1)
         = cn·(a^k/b^k + ... + a^2/b^2 + a/b + 1)

- For a = b: T(n) = cn(k + 1) = cn(log_b n + 1) = Θ(n log n)
- For a < b: since x^k + x^(k-1) + … + x + 1 = (x^(k+1) − 1)/(x − 1), which is Θ(1) for x = a/b < 1, T(n) = cn·Θ(1) = Θ(n)
- For a > b: the geometric sum is dominated by its largest term a^k/b^k, so T(n) = cn·Θ(a^(log_b n)/b^(log_b n)) = cn·Θ(a^(log_b n)/n) = cn·Θ(n^(log_b a)/n) = Θ(n^(log_b a))

**The Master Theorem:** Given a divide-and-conquer algorithm, i.e., an algorithm that divides a problem of size n into a sub-problems, each of size n/b, let the cost of each stage (i.e., the work to divide the problem and combine the solved sub-problems) be described by the function f(n). Then, if T(n) = aT(n/b) + f(n) with a ≥ 1 and b > 1, the standard statement of the theorem gives:

1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a))
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · lg n)
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n))

For example, for T(n) = 4T(n/2) + n we have n^(log_2 4) = n² and f(n) = n = O(n^(2−ε)), so case 1 applies; thus the solution is T(n) = Θ(n²).
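As a cross-check (an added illustration), applying the theorem to the recurrence T(n) = 2T(n/2) + n used in the substitution-method examples above:

```latex
a = 2,\ b = 2 \;\Rightarrow\; n^{\log_b a} = n^{\log_2 2} = n; \qquad
f(n) = n = \Theta\!\left(n^{\log_b a}\right) \;\Rightarrow\; \text{case 2: } T(n) = \Theta(n \lg n).
```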

**Other sorting examples:** Any algorithm that works **by exchanging** one inversion per step will have O(n²) complexity on *average*. Insertion sort, bubble sort, and selection sort are examples of such algorithms. To do any better, an algorithm has to remove more than one inversion per step, at least on some steps.

**Heapsort:** T(n) = O(n log n). A heap is a complete binary tree satisfying the heap property, i.e., A[Parent(i)] ≥ A[i] for all nodes i > 1. Heapify() floats the node at i down into the left or right sub-heap. If the heap rooted at i has n elements, the subtrees at l and r have at most 2n/3 elements each, so the worst case is T(n) ≤ T(2n/3) + Θ(1), or, by case 2 of the master theorem, T(n) = O(log n). So Heapify takes time linear in the height of the heap, i.e., O(lg n). The call to BuildHeap() takes O(n) time, and each of the n - 1 calls to Heapify() takes O(lg n) time. Thus the total time taken by HeapSort() = O(n) + (n - 1)·O(lg n) = O(n) + O(n lg n) = O(n lg n). S(n) = *O*(*n*) - in place, *O*(1) extra space.

**Quicksort:** Pick a pivot from the array, throw all lesser elements to the left of the pivot and all greater elements to the right, so that the pivot lands in its correct position; then recursively quicksort the left side and the right side. Use two pointers, leftptr and rightptr, starting at the two ends of the input array. Choose an element as pivot. Increment leftptr as long as it points to an element < pivot, then decrement rightptr as long as it points to an element > pivot. When both loops stop, both pointers point to elements on the wrong sides, so exchange them. An element equal to the pivot is treated as being on the *wrong* side (stopping on equal elements keeps the partitions balanced when there are many duplicates). Force the respective pointers to increment (leftptr) and decrement (rightptr), and continue as above until the two pointers cross each other; at that point the *partitioning* of the input array is done (after putting the pivot in the middle with another exchange, i.e., at the current leftptr, if you started by swapping the pivot to one end of the array). Then recursively quicksort both the right side and the left side of the pivot. Recursion stops, as usual, for a single-element array. Quicksort sorts in place. A C sketch of this partition step follows the case analysis below.

- **Worst case:** The pivot is always at one end (e.g., a sorted array with the pivot always chosen as the last element). T(n) = T(n-1) + cn. The additive cn term is the overhead of the partitioning (swapping operations) before another QuickSort call is made. Because the pivot here is at one end, only one of the two QuickSort calls is actually invoked, and it gets one less element (minus the pivot) than the array the caller works with, hence T(n-1). This gives T(n) = O(n²).
- **Best case:** The pivot *luckily* always (in each recursive call) balances the partitions equally. T(n) = 2T(n/2) + cn => T(n) = O(n log n).
- **Average case:** Suppose the division takes place at the i-th element. Then T(n) = T(i) + T(n-i-1) + cn. To study the average case, vary i from 0 through n-1: T(n) = (1/n) Σ_{i=0}^{n-1} [T(i) + T(n-i-1) + cn], which gives T(n) = O(n log n). Intuitively, a real-life run of quicksort will produce a mix of "bad" and "good" splits. We can assume they alternate between best case (n/2 : n/2) and worst case (n-1 : 1), and the O(n) cost of a bad split (or 2 or 3 bad splits) can be absorbed into the O(n) cost of each good split. Thus the running time with alternating bad and good splits is still O(n lg n), with slightly larger constants.
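As promised above, here is a minimal C sketch of the two-pointer partition step and the recursive driver, with the pivot taken as the last element (`partition` and `quicksort` are illustrative names, not code from the original notes):

```c
#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Two-pointer partition as described above: the pivot is kept at A[right],
 * leftptr and rightptr move toward each other, elements equal to the pivot
 * are treated as being on the "wrong" side, and the pivot is finally swapped
 * into its correct position at leftptr. */
static int partition(int A[], int left, int right) {
    int pivot = A[right];                 /* pivot chosen as the last element */
    int leftptr = left;
    int rightptr = right - 1;
    while (1) {
        while (leftptr <= rightptr && A[leftptr] < pivot) leftptr++;
        while (rightptr >= leftptr && A[rightptr] > pivot) rightptr--;
        if (leftptr >= rightptr) break;   /* pointers crossed: partitioning done */
        swap(&A[leftptr], &A[rightptr]);  /* both point to wrong-side elements */
        leftptr++;
        rightptr--;
    }
    swap(&A[leftptr], &A[right]);         /* put the pivot in the middle */
    return leftptr;
}

void quicksort(int A[], int left, int right) {
    if (left >= right) return;            /* single-element (or empty) range */
    int p = partition(A, left, right);
    quicksort(A, left, p - 1);
    quicksort(A, p + 1, right);
}

int main(void) {
    int A[] = {3, 8, 7, 1, 9, 5, 0, 2};
    quicksort(A, 0, 7);
    for (int i = 0; i < 8; i++) printf("%d ", A[i]);   /* prints: 0 1 2 3 5 7 8 9 */
    printf("\n");
    return 0;
}
```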

Quicksort's space: S(n) = *O*(*n*) - in place, *O*(log *n*) extra space (for the recursion stack).

**Merge sort:** T(n) = O(n log n). A recursive algorithm based on merging two sorted arrays using a third one. Mergesort finds the center, recursively calls mergesort on both halves, and then merges the two sorted halves (see the sketch of the merge step below). For n elements: T(n) = 2T(n/2) + n. There are two *MergeSort* calls with n/2 elements each, and the *Merge* takes O(n). The solution is O(n log n). *MergeSort* needs extra O(n) space for the *temp* array. The best, worst, and average cases for MergeSort are the same, O(n log n); it is __a very stable algorithm__ whose complexity is independent of the actual input array and depends only on its size. S(n) = *O*(*n*) - *O*(*n*) extra space.
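A minimal C sketch of the merge step referenced above, using a temporary array (names such as `merge` and `tmp` are illustrative):

```c
#include <string.h>

/* Merge the two sorted halves A[lo..mid] and A[mid+1..hi] using a temp array. */
static void merge(int A[], int lo, int mid, int hi, int tmp[]) {
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
        tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];   /* take the smaller head */
    while (i <= mid) tmp[k++] = A[i++];                /* copy any leftovers */
    while (j <= hi)  tmp[k++] = A[j++];
    memcpy(&A[lo], &tmp[lo], (size_t)(hi - lo + 1) * sizeof(int));
}

void mergesort(int A[], int lo, int hi, int tmp[]) {
    if (lo >= hi) return;                 /* 0 or 1 element: already sorted */
    int mid = lo + (hi - lo) / 2;
    mergesort(A, lo, mid, tmp);           /* T(n/2) */
    mergesort(A, mid + 1, hi, tmp);       /* T(n/2) */
    merge(A, lo, mid, hi, tmp);           /* O(n) merge, O(n) extra space */
}
```

The caller allocates the O(n) `tmp` array once, e.g. `int *tmp = malloc(n * sizeof(int)); mergesort(A, 0, n - 1, tmp);`.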

**Why does Quicksort win?** Heapsort's worst-case running time is always *O*(*n* log *n*), but on average it is somewhat slower than in-place quicksort. Mergesort is a stable sort and can easily be adapted to operate on linked lists and on very large lists stored on slow-to-access media such as disk. Quicksort can be written to operate on linked lists, but it will often suffer from poor pivot choices without random access. The main disadvantage of mergesort is that, when operating on arrays, it requires *O*(*n*) additional space, whereas quicksort with in-place partitioning and tail recursion uses only *O*(log *n*) space. When operating on linked lists, however, mergesort requires only a small, constant amount of auxiliary storage.

**Linear Time Sorting Algorithms:** No comparisons between elements! These depend on assumptions about the numbers being sorted.

**Counting Sort:** We assume the numbers are in the range 1..k. The algorithm (a sketch in C follows the list):

- Input: A[1..n], where A[j] ∈ {1, 2, 3, …, k}
- Output: B[1..n], sorted (notice: not sorting in place)
- Also: an array C[1..k] for auxiliary storage
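A minimal C sketch of counting sort under the assumptions above (0-based arrays, keys in 1..k; function and variable names are illustrative):

```c
#include <stdlib.h>

/* Counting sort: A[0..n-1] holds keys in 1..k, B[0..n-1] receives the sorted
 * output, C[0..k] is auxiliary storage.  Stable, runs in O(n + k) time. */
void counting_sort(const int A[], int B[], int n, int k) {
    int *C = calloc((size_t)k + 1, sizeof(int));   /* C[v] = count of key v */
    for (int j = 0; j < n; j++)
        C[A[j]]++;
    for (int v = 2; v <= k; v++)
        C[v] += C[v - 1];                          /* C[v] = number of keys <= v */
    for (int j = n - 1; j >= 0; j--) {             /* back to front => stable */
        B[C[A[j]] - 1] = A[j];
        C[A[j]]--;
    }
    free(C);
}
```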

**Radix Sort:** Sort on the least significant digit first, then the next digit, and so on up to the most significant digit, using a stable sort such as counting sort for each pass. Sorting n numbers with d digits, where each digit ranges over 1..k, takes d · O(n+k) = O(dn + dk). Since d is constant and k = O(n), overall T(n) = O(n). In general, radix sort based on counting sort is fast, asymptotically fast (i.e., O(n)), simple to code, and a good choice.
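As an added illustration (not from the original notes), an LSD radix sort on 32-bit unsigned integers using one stable counting-sort pass per 8-bit digit (d = 4, k = 256):

```c
#include <stdlib.h>
#include <string.h>

/* LSD radix sort: one stable counting-sort pass per 8-bit digit, so the cost
 * is d·(n + k) with d = 4 and k = 256, i.e. O(n) for fixed-width keys. */
void radix_sort(unsigned A[], int n) {
    unsigned *B = malloc((size_t)n * sizeof(unsigned));   /* output of each pass */
    for (int shift = 0; shift < 32; shift += 8) {
        int count[257] = {0};
        for (int j = 0; j < n; j++)                        /* histogram of this digit */
            count[((A[j] >> shift) & 0xFF) + 1]++;
        for (int v = 1; v <= 256; v++)                     /* prefix sums = start index */
            count[v] += count[v - 1];
        for (int j = 0; j < n; j++)                        /* stable scatter */
            B[count[(A[j] >> shift) & 0xFF]++] = A[j];
        memcpy(A, B, (size_t)n * sizeof(unsigned));
    }
    free(B);
}
```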

**NP-Completeness:** Some problems are intractable, i.e., as they grow large, we are unable to solve them in reasonable time. The working definition of reasonable time is polynomial time, which means that on an input of size n the worst-case running time is O(n^k) for some constant k.

- Polynomial time: O(n²), O(n³), O(1), O(n lg n)
- Not in polynomial time: O(2ⁿ), O(nⁿ), O(n!)

**NP-Complete** problems are an interesting class of problems whose status is unknown:

- No polynomial-time algorithm has been discovered for any NP-Complete problem
- No super-polynomial lower bound has been proved for any NP-Complete problem, either

This is the **P = NP question**, the biggest open problem in CS. An example of an NP-Complete problem is the **hamiltonian-cycle problem**: given a graph G, does it have a hamiltonian cycle? (A hamiltonian cycle of an undirected graph is a simple cycle that contains every vertex.) P is the set of problems that can be solved in polynomial time, and NP (nondeterministic polynomial time) is the set of problems that can be solved in polynomial time by a nondeterministic computer - a computer that magically "guesses" a solution and then has to verify that it is correct; if a solution exists, the computer always guesses it. One way to imagine it: a parallel computer that can freely spawn an infinite number of processes, have one processor work on each possible solution, and have all processors attempt to verify their solutions, answering "yes" if some processor finds a working solution. So: NP = problems verifiable in polynomial time.

NP-Complete problems are the "hardest" problems in NP. If any one NP-Complete problem can be solved in polynomial time, then every NP-Complete problem can be solved in polynomial time, and in fact every problem in NP can be solved in polynomial time (which would show P = NP). For example, if we solve hamiltonian-cycle in O(n^100) time, we've proved that P = NP.

__Reduction:__ The crux of NP-Completeness is reducibility. A problem P can be reduced to another problem Q if any instance of P can be "easily rephrased / transformed" into an instance of Q, the solution to which provides a solution to the instance of P. For example, P: given a set of Booleans, is at least one TRUE? and Q: given a set of integers, is their sum positive? The transformation can be given as (x1, x2, …, xn) → (y1, y2, …, yn), where yi = 1 if xi = TRUE and yi = 0 if xi = FALSE. Another example: solving linear equations is reducible to solving quadratic equations. If P is polynomial-time reducible to Q, we denote this P ≤p Q. If P is NP-Complete, then R ≤p P ∀ R ∈ NP.
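A tiny C sketch of the Boolean-to-integer reduction above (function names are illustrative): an instance of P is transformed into an instance of Q, and Q's answer is P's answer.

```c
#include <stdbool.h>
#include <stdio.h>

/* Q: given a set of integers, is their sum positive? */
static bool sum_is_positive(const int y[], int n) {
    long sum = 0;
    for (int i = 0; i < n; i++) sum += y[i];
    return sum > 0;
}

/* P: given a set of Booleans, is at least one TRUE?
 * Solved by reduction to Q: map TRUE -> 1, FALSE -> 0, then ask Q. */
static bool at_least_one_true(const bool x[], int n) {
    int y[64];                            /* assume n <= 64 for this sketch */
    for (int i = 0; i < n; i++)
        y[i] = x[i] ? 1 : 0;              /* the polynomial-time transformation */
    return sum_is_positive(y, n);
}

int main(void) {
    bool x[] = {false, false, true, false};
    printf("%s\n", at_least_one_true(x, 4) ? "yes" : "no");   /* prints: yes */
    return 0;
}
```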

If P ≤p Q and P is NP-Complete, then Q is NP-hard (and also NP-Complete if Q ∈ NP).

Examples of NP-complete problems: Hamiltonian cycle, Hamiltonian path, Knapsack problem, Traveling salesman.

Steps to prove that a problem Q is NP-complete:

1. Pick a known NP-Complete problem P
2. Reduce P to Q:
   - Describe a transformation that maps instances of P to instances of Q, such that "yes" for Q = "yes" for P
   - Prove the transformation works
   - Prove it runs in polynomial time
3. Prove Q ∈ NP

P ⊆ NP, but no one knows whether P = NP

NP-hard and NP-complete: A problem Q is NP-hard if every problem R ∈ NP satisfies R ≤p Q. A problem Q is NP-complete if Q is NP-hard and Q ∈ NP.

Vinay Deolalikar, a mathematician who works for HP Labs, claims to have proven that P is not equal to NP. Deolalikar's proof works by connecting certain ideas in computer science and finite model theory to ideas in statistical mechanics. The proof works by showing that if certain problems known to be in NP were also in P then those problems would have impossible statistical properties. Computer scientists and mathematicians have expressed a variety of opinions about Deolalikar's proof, ranging from guarded optimism to near certainty that the proof is incorrect.
