2. Data Structures
A data structure is a way to store and
organize data in a computer so that it can
be used efficiently.
3. Algorithm
An algorithm is a sequence of clear and
precise step-by-step instructions for
solving a problem in a finite amount of
time.
4. Good Algorithms?
Run in less time
Consume less memory
But running time (time complexity) is
usually the more important of the two
5. Algorithm Analysis
It would seem that the most obvious way
to measure the efficiency of an algorithm
is to run it and measure how much
processor time is needed
But is it correct???
7. Time and Space Complexity
Analyzing an algorithm means determining the
amount of resources (such as time and space)
needed to execute it.
Time Complexity
◦ The time complexity of an algorithm is basically the
running time of a program as a function of the input
size.
Space Complexity
◦ The space complexity of an algorithm is the amount
of computer memory that is required during the
program execution as a function of the input size.
8. Space Complexity
Fixed part
◦ It varies from problem to problem. It includes
the space needed for storing instructions,
constants, variables, and structured
variables such as arrays
Variable part
◦ It includes the space needed for recursion
stack, and for variables that are allocated
space dynamically during the runtime of a
program.
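As a rough illustration of the variable part, the sketch below compares a recursive and an iterative sum. The function names and the depth counter are hypothetical, introduced only for this example; the recorded recursion depth stands in for the space taken by the recursion stack.

```python
def sum_recursive(values, i=0, depth=1, stats=None):
    """Recursive sum; records the deepest call level reached in stats."""
    if stats is not None:
        stats["max_depth"] = max(stats.get("max_depth", 0), depth)
    if i == len(values):
        return 0
    return values[i] + sum_recursive(values, i + 1, depth + 1, stats)


def sum_iterative(values):
    """Iterative sum; uses a single accumulator whatever the input size."""
    total = 0
    for v in values:
        total += v
    return total


stats = {}
data = list(range(50))
recursive_result = sum_recursive(data, stats=stats)
iterative_result = sum_iterative(data)
print(recursive_result, iterative_result, stats["max_depth"])
```

Both versions return the same sum, but the recursive one reaches a call depth of len(data) + 1, so its stack usage grows with the input size while the iterative one needs only a fixed number of variables.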
10. Moving Beyond Experimental Analysis
Our goal is to develop an approach to analyze
the efficiency of algorithms that:
Allows us to evaluate the relative efficiency of
any two algorithms in a way that is independent
of the hardware and software environment.
Is performed by studying a high-level
description of the algorithm without need for
implementation.
Takes into account all possible inputs.
11. Running Time of an Algorithm
Depends upon
Input Size
Nature of Input
Generally time grows with size of input,
so running time of an algorithm is usually
measured as a function of input size.
Running time is measured in terms of
number of steps/primitive operations
performed
Independent of the machine and OS
12. Types of running time
Worst case running time
◦ This denotes the behavior of an algorithm with respect to the
worst possible case of the input instance.
Average case running time
◦ It is an estimate of the running time for an average input. It
specifies the expected behavior of the algorithm when the input
is randomly drawn from a given distribution.
Best case running time
◦ It is used to analyze an algorithm under optimal conditions.
Amortized running time
◦ Amortized running time refers to the time required to
perform a sequence of (related) operations averaged over all
the operations performed.
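The best, worst and average cases can be sketched with a linear search that counts comparisons. Linear search is not part of these slides; it and the function name are assumptions chosen only because the three cases are easy to see. The average assumes every present target is equally likely.

```python
def linear_search_steps(values, target):
    """Return the number of comparisons a linear search makes."""
    comparisons = 0
    for v in values:
        comparisons += 1
        if v == target:
            break
    return comparisons


data = list(range(1, 11))                  # the values 1..10
best = linear_search_steps(data, 1)        # target is the first element
worst = linear_search_steps(data, 99)      # target is absent: every element is compared
# Average over every present target, each assumed equally likely.
average = sum(linear_search_steps(data, t) for t in data) / len(data)
print(best, worst, average)
```

For this 10-element list the best case takes 1 comparison, the worst case 10, and the average over present targets 5.5.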
13. Time-Space Trade-off
If space is a big constraint then one might
choose an algorithm that takes less space
at the cost of more CPU time.
If time is the major constraint, then one might
choose an algorithm that takes minimum
time to execute at the cost of more
space.
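One way to see the trade-off is a sketch that computes Fibonacci numbers with and without a cache. The Fibonacci example and all names here are assumptions, not taken from the slides; the call counters are only there to make the time cost visible.

```python
def fib_plain(n, counter):
    """No extra memory: recomputes the same subproblems many times."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_plain(n - 1, counter) + fib_plain(n - 2, counter)


def fib_cached(n, counter, cache):
    """Spends memory on a cache so each subproblem is computed once."""
    if n in cache:
        return cache[n]
    counter[0] += 1
    if n < 2:
        result = n
    else:
        result = fib_cached(n - 1, counter, cache) + fib_cached(n - 2, counter, cache)
    cache[n] = result
    return result


plain_calls = [0]
cached_calls = [0]
cache = {}
plain_value = fib_plain(20, plain_calls)
cached_value = fib_cached(20, cached_calls, cache)
print(plain_value, plain_calls[0], cached_calls[0], len(cache))
```

The cached version stores one entry per value of n (more space) and does far fewer computations (less time); the plain version stores nothing extra and repeats work.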
14. Complexity Analysis - 1
'''Input: int A[N], array of N integers
Output: Sum of all numbers in array A'''
def Sum(intList):
    s = 0
    for i in range(len(intList)):
        s = s + intList[i]
    return s
Sum([5,6,7,8])
How should we analyse this?
15. Count the instructions
def Sum(intList):                  # 1
    s = 0                          # 2
    for i in range(len(intList)):  # 3-6 (loop header and body)
        s = s + intList[i]
    return s                       # 7
Sum([5,6,7,8])                     # 8
Instructions 1, 2, 7, 8: executed once
Instructions 3, 4, 5, 6: executed once per iteration of the for loop, N iterations
Total: 4N + 4
The complexity function of the algorithm is: f(N) = 4N + 4
16. Growth of function - 4N+4
Estimated running time for different values of N:
N = 10 => 44 steps
N = 100 => 404 steps
N = 1,000 => 4004 steps
N = 1,000,000 => 4,000,004 steps
As N grows, the number of steps grows in linear
proportion to N for this function "Sum"
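The step counts above can be reproduced with a small instrumented sketch. The cost model is an assumption taken from the instruction count: 4 steps executed once plus 4 steps per loop iteration. The function name and the step counter are hypothetical.

```python
def sum_with_step_count(int_list):
    """Sum a list and count steps under the assumed 4N + 4 cost model."""
    steps = 4                      # assumed: the four instructions executed once
    s = 0
    for i in range(len(int_list)):
        s = s + int_list[i]
        steps += 4                 # assumed: four instructions per iteration
    return s, steps


for n in (10, 100, 1000):
    total, steps = sum_with_step_count(list(range(n)))
    print(n, steps)
```

Under this model the counter reports 44, 404 and 4004 steps for N = 10, 100 and 1,000, matching f(N) = 4N + 4.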
17. What Dominates in Previous Example?
What about the +4 and the coefficient 4 in 4N+4?
◦ As N gets large, the +4 becomes insignificant
◦ The coefficient 4 is inaccurate, as different operations require
varying amounts of time, and it has no significant importance
either
What is fundamental is that the time is linear in N.
Asymptotic Complexity: As N gets large, concentrate
on the highest order term:
Drop lower order terms such as +4
Drop the constant coefficient of the highest order
term, i.e. 4N becomes N
18. Asymptotic Complexity
The 4N+4 time bound is said to "grow
asymptotically" like N
This gives us an approximation of the
complexity of the algorithm
Ignores lots of (machine dependent)
details and concentrates on the bigger picture
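A short sketch of what "grows asymptotically like N" means for f(N) = 4N + 4: the ratio f(N) / N settles towards the constant 4 as N grows, so only the factor N describes the growth. The names f and ratios are introduced only for this illustration.

```python
def f(n):
    """Complexity function of the Sum example."""
    return 4 * n + 4


sizes = (10, 100, 1000, 1_000_000)
ratios = [f(n) / n for n in sizes]
for n, r in zip(sizes, ratios):
    print(n, r)
```

The ratio keeps shrinking towards 4 and never drops below it, which is why the +4 and the coefficient can be ignored when comparing growth rates.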
19. Big Oh Notation – Worst Case Analysis
If f(N) and g(N) are two complexity functions, we
say
f(N) = O(g(N))
(read "f(N) is order g(N)", or "f(N) is big-O of g(N)")
if g is an upper bound on f, that is, if there are
positive constants c and N0 such that
f(N) ≤ c * g(N)
for all N > N0 (all sufficiently large N).
21. Example
Consider f(n) = 2n^2 + 3 and g(n) = n^2
Is f(n) = O(g(n))? i.e. is 2n^2 + 3 = O(n^2)?
Proof:
2n^2 + 3 ≤ c * n^2
Assume N0 = 1 and c = 1?
Assume N0 = 1 and c = 2?
Assume N0 = 1 and c = 3?
If the inequality holds for one pair of N0 and c, then there exists
an infinite set of such pairs of N0 and c
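The three candidate pairs can be tried numerically. This is a sketch, not a proof: it only checks integers up to an arbitrary limit (the limit parameter and the helper name bound_holds are assumptions). The algebra settles it: 2n^2 + 3 ≤ 3n^2 exactly when n^2 ≥ 3, which holds for every n > 1.

```python
def f(n):
    return 2 * n**2 + 3


def g(n):
    return n**2


def bound_holds(c, n0, limit=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 < n <= limit."""
    return all(f(n) <= c * g(n) for n in range(n0 + 1, limit + 1))


results = {c: bound_holds(c, n0=1) for c in (1, 2, 3)}
print(results)
```

With N0 = 1, the bound fails for c = 1 and c = 2 (2n^2 + 3 is always larger than 2n^2) and holds for c = 3. Any larger c, or any larger N0 with c = 3, also works, which is the infinite set of pairs mentioned above.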
23. Counting Primitive Operations/Simple
Statements
Assigning an identifier to an object
Determining the object associated with an
identifier
Performing an arithmetic operation (for example,
adding two numbers)
Comparing two numbers
Accessing a single element of a Python list by index
Calling a function (excluding operations executed
within the function)
Returning from a function.
28. Performance Classification
f(n)     | Classification
1        | Constant: run time is fixed and does not depend upon n. Most instructions are executed once, or only a few times, regardless of the amount of information being processed.
log n    | Logarithmic: when n increases, so does run time, but much more slowly. Common in programs which solve large problems by transforming them into smaller problems.
n        | Linear: run time varies directly with n. Typically, a small amount of processing is done on each element.
n log n  | When n doubles, run time slightly more than doubles. Common in programs which break a problem down into smaller sub-problems, solve them independently, then combine the solutions.
n^2      | Quadratic: when n doubles, run time increases fourfold. Practical only for small problems; typically the program processes all pairs of input (e.g. in a double nested loop).
n^3      | Cubic: when n doubles, run time increases eightfold.
2^n      | Exponential: when n doubles, run time squares. This is often the result of a natural, "brute force" solution.
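The doubling behaviour in the table can be checked with a small sketch that evaluates each growth function at n and 2n. The dictionary and helper names are hypothetical, and base-2 logarithms are an assumption (the table does not fix a base; changing it only changes a constant factor).

```python
import math

# One entry per class in the table; log base 2 is an assumption.
growth_functions = {
    "1": lambda n: 1,
    "log n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n**2,
    "n^3": lambda n: n**3,
    "2^n": lambda n: 2**n,
}


def growth_row(n):
    """Evaluate every growth function at n."""
    return {name: fn(n) for name, fn in growth_functions.items()}


for n in (8, 16, 32):
    print(n, growth_row(n))
```

Going from n = 8 to n = 16, the n^2 entry grows fourfold, the n^3 entry eightfold, and the 2^n entry is squared, while the log n entry only rises from 3 to 4.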
32. Analyzing Loops-Uniform step size
Any loop has two parts:
◦ How many iterations are performed?
◦ How many steps per iteration?
sum = 0
for j in range(N):
    sum = sum + j
◦ Loop executes N times (0..N-1)
◦ O(1) steps per iteration
Total time is N * O(1) = O(N*1) = O(N)
33. Analyzing Loops – Deceptive case
What about this for loop?
sum = 0
for j in range(100):
    sum = sum + j
Loop executes 100 times
O(1) steps per iteration
Total time is 100 * O(1) = O(100 * 1) = O(100)
= O(1)
37. Analyzing Nested Loops – Independent
loops
Treat just like a single loop and evaluate each
level of nesting as needed:
for j in range(N):
    for k in range(N, -1, -1):
        sum = k + j
Start with outer loop:
◦ How many iterations? N
◦ How much time per iteration? Need to evaluate
inner loop
Inner loop uses O(N) time
Total time is N * O(N) = O(N*N) = O(N^2)
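The iteration count of these independent loops can be confirmed with a counting sketch (the function name is hypothetical). Note that range(N, -1, -1) yields N + 1 values, so the exact count is N * (N + 1) = N^2 + N, which is still O(N^2) once the lower order term is dropped.

```python
def count_independent(n):
    """Count how often the inner statement of the nested loops runs."""
    count = 0
    for j in range(n):
        for k in range(n, -1, -1):   # k = n, n-1, ..., 0: n + 1 values
            count += 1
    return count


for n in (10, 100):
    print(n, count_independent(n))
```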
38. Analyzing Nested Loops – dependent
loop
What if the number of iterations of one loop
depends on the counter of the other?
for i in range(N):
    for k in range(i, N):
        sum += k + i
Analyze inner and outer loop together
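Analyzing the two loops together amounts to summing the inner iteration counts: for i = 0 the inner loop runs N times, for i = 1 it runs N - 1 times, and so on down to 1, giving N + (N - 1) + ... + 1 = N(N + 1)/2, which is O(N^2). The counting sketch below (hypothetical function name) checks that formula.

```python
def count_dependent(n):
    """Count how often the inner statement of the dependent loops runs."""
    count = 0
    for i in range(n):
        for k in range(i, n):   # n - i iterations for this value of i
            count += 1
    return count


for n in (10, 100):
    print(n, count_dependent(n), n * (n + 1) // 2)
```

The dependent loop does about half the work of the independent version, but a constant factor of 1/2 does not change the O(N^2) classification.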