1
Chapter 1
Introduction
All the material is drawn from the textbook “Fundamentals of Data Structures
in C” and the NTU website
2
How to create programs
 Requirements
 Analysis: bottom-up vs. top-down
 Design: data objects and operations
 Refinement and Coding
 Verification
 Program Proving
 Testing
 Debugging
3
Data Structure: How Data Are
Organized and Operated On
 You are already familiar with arrays from
your programming assignments. They are
basic (yet powerful!) data structures.
 They can hold data (objects), e.g., integers.
 They are structured: the data held inside is
arranged so that it can be operated on.
 Each element in an array has an index. With that
index, you can store or retrieve an element.
4
Learning Data Structures, and
Algorithms
 You want your tasks to be performed
efficiently. You need good methods
(algorithms).
 Data must be structured in some
manner before it can be operated on.
 Good structures can be operated on efficiently.
5
Example
 For example, suppose that you are
building a database storing the data of
all (past, present, and future) students
of NTHU, so the database keeps growing.
 You’ll need to find anybody’s data in the
database (to search/retrieve).
 You’ll need to enter new entries into it (to insert).
 Etc.
6
Example
 You can use an array to implement the
database.
 To insert a new entry, simply add it to the
first empty array cell.
 If the current array is full, allocate a new array
twice the size of the current one. Then
copy all the original entries from the old array
to the new one.
 To search for an entry, simply look at all
entries in the array, one by one.
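The insert and search steps above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the course's implementation: the `Student` record, its `id` and `gpa` fields, and the `db_*` function names are all hypothetical, and the array starts empty and grows from capacity 1.

```c
#include <stdlib.h>

/* Hypothetical record type: the slides do not say what an entry holds. */
typedef struct {
    int id;      /* assumed search key */
    double gpa;  /* assumed payload */
} Student;

typedef struct {
    Student *items;   /* heap-allocated array of entries */
    size_t size;      /* number of cells in use */
    size_t capacity;  /* number of cells allocated */
} StudentDB;

/* Start with an empty database; the first insert allocates storage. */
void db_init(StudentDB *db) {
    db->items = NULL;
    db->size = 0;
    db->capacity = 0;
}

/* Insert into the first empty cell, doubling the array when it is full.
   Returns 0 on success and -1 if memory could not be allocated. */
int db_insert(StudentDB *db, Student s) {
    if (db->size == db->capacity) {
        size_t new_capacity = db->capacity ? 2 * db->capacity : 1;
        /* realloc provides the larger array and copies the old entries. */
        Student *grown = realloc(db->items, new_capacity * sizeof(Student));
        if (grown == NULL) return -1;
        db->items = grown;
        db->capacity = new_capacity;
    }
    db->items[db->size++] = s;
    return 0;
}

/* Sequential search: look at every entry, one by one.
   Returns the index of the entry with the given id, or -1 if absent. */
long db_search(const StudentDB *db, int id) {
    for (size_t i = 0; i < db->size; i++)
        if (db->items[i].id == id) return (long)i;
    return -1;
}

/* Release the array and reset the database to empty. */
void db_free(StudentDB *db) {
    free(db->items);
    db_init(db);
}
```

A caller would run `db_init`, then `db_insert` for each record, then `db_search` by id, and finally `db_free`.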
7
Example
 Your programming experience, however,
tells you that this is not a good method.
 In Chapter 10, you’ll see sophisticated
data structures that can be operated on
(searched, inserted into, deleted from, etc.)
efficiently.
 There you’ll see how data can be
cleverly structured to serve as the basis of
fast algorithms.
8
Example 2
 This time you want to work with
polynomials, i.e., functions of the form
f(x) = a_n x^n + a_{n-1} x^{n-1} + … + a_0
 You’ll need to store them, as well as to
multiply or add them.
 You may allocate an array to store the
coefficients of a polynomial.
 E.g., A[i] stores a_i.
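As a rough sketch of the array representation, the C functions below store the coefficient of x^i in `a[i]`. The function names are hypothetical, the evaluation uses Horner's rule (a standard technique that the slides do not mention), and the addition assumes both polynomials are stored with the same degree n.

```c
#include <stddef.h>

/* Evaluate f(x) = a[n] x^n + ... + a[1] x + a[0] by Horner's rule.
   The array a has n + 1 entries; a[i] is the coefficient of x^i. */
double poly_eval(const double a[], size_t n, double x) {
    double result = a[n];
    for (size_t i = n; i > 0; i--)
        result = result * x + a[i - 1];
    return result;
}

/* Add two polynomials of the same degree n, coefficient by coefficient.
   The caller supplies sum with room for n + 1 entries. */
void poly_add(const double a[], const double b[], double sum[], size_t n) {
    for (size_t i = 0; i <= n; i++)
        sum[i] = a[i] + b[i];
}
```

For instance, 3x^2 + 5 would be stored as `{5.0, 0.0, 3.0}` with n = 2. Note that the zero coefficient of x still occupies a cell, which is the space cost of this representation for sparse polynomials.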
9
Example 2
 Alternatively, you can use a linked list to
store it.
 Each node needs to store the coefficient and
the exponent.
E.g., to store 3x^2 + 5, a linked list like
[3|2] -> [5|0] is constructed.
 To represent polynomials, both data
structures have their relative advantages
and drawbacks in terms of time and space
(see later chapters).
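The linked-list representation might look like the following C sketch. The `PolyNode` type and the function names are hypothetical, and terms are simply prepended, so the caller pushes them in increasing order of exponent to obtain a list such as [3|2] -> [5|0].

```c
#include <stdlib.h>

/* One term coef * x^exp of a polynomial. */
typedef struct PolyNode {
    double coef;
    int exp;
    struct PolyNode *next;
} PolyNode;

/* Prepend a term to the list. Returns the new head, or NULL if
   allocation fails (the old list is left untouched in that case). */
PolyNode *poly_push(PolyNode *head, double coef, int exp) {
    PolyNode *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    node->coef = coef;
    node->exp = exp;
    node->next = head;
    return node;
}

/* Evaluate the polynomial at x by summing coef * x^exp over all nodes.
   Exponents are assumed to be non-negative. */
double poly_list_eval(const PolyNode *head, double x) {
    double total = 0.0;
    for (const PolyNode *p = head; p != NULL; p = p->next) {
        double power = 1.0;
        for (int k = 0; k < p->exp; k++)
            power *= x;
        total += p->coef * power;
    }
    return total;
}

/* Free every node of the list. */
void poly_free(PolyNode *head) {
    while (head != NULL) {
        PolyNode *next = head->next;
        free(head);
        head = next;
    }
}
```

Only non-zero terms are stored, so a sparse polynomial such as x^1000 + 1 needs two nodes rather than 1001 array cells. The price is that the coefficient of a given power can no longer be read in a single indexing step.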
10
Searching in Arrays
 You are able to write, in minutes, a
program to search sequentially in an
array.
 However, when the numbers in an array
are sorted, you can search much faster…
 …using binary search, which you are probably
also familiar with.
 You could say that a sorted array is not
the same structure as a general array.
11
Binary Search
 Let A[0..n - 1] be an array of n integers.
 We want to ask: is some integer x
stored in A?
 Suppose that A is sorted (say, in
ascending order), i.e., (for simplicity
assume that the numbers are distinct)
A[0] < A[1] < … < A[n - 1].
 To be more concrete, let n be 5.
12
Binary Search
 Observation:
If x > A[2], then x must fall in A[3..4]
(or x is not in A).
If x < A[2], then x must fall in A[0..1]
(or x is not in A).
 Ex: Let x = 11.
 A = 3 5 7 11 13
 x (= 11) > A[2] (= 7), so we need not
consider A[0..1].
13
Binary Search
 Example: Let x = 17. Let A contain the
following 8 integers. Initially, let left = 0
and right = 7.
 left and right define the range to be
searched.
 2 3 5 7 11 13 17 19
 mid = (0 + 7) / 2 = 3 (rounded down).
 A[mid] = 7 < x, so let left = mid + 1 = 4,
and continue the search.
14
 2 3 5 7 11 13 17 19
 mid = (4 + 7) / 2 = 5, A[mid] = 13, x = 17.
 A[mid] < x, so let left = mid + 1 = 6, and
continue the search.
 2 3 5 7 11 13 17 19
 mid = (6 + 7) / 2 = 6, A[mid] = 17, x = 17.
 A[mid] = x, so return mid = 6, the desired
position, and we’re done!
15
Binary Search
 Binsearch(A, x, left, right) /* finds x in A[left..right] */
    while left <= right do
        mid = (left + right) / 2;
        if x < A[mid] then right = mid - 1;
        else if x == A[mid] then return mid;
        else left = mid + 1;
    return -1; /* not found */
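The pseudocode above translates almost line for line into C. The sketch below assumes an array of `int` sorted in ascending order. The one deliberate change is how the midpoint is computed: `left + (right - left) / 2` gives the same value as `(left + right) / 2` but avoids integer overflow for very large indices.

```c
/* Search the sorted array a[left..right] for x.
   Returns the index of x, or -1 if x is not present. */
int binsearch(const int a[], int x, int left, int right) {
    while (left <= right) {
        /* Same value as (left + right) / 2, but cannot overflow. */
        int mid = left + (right - left) / 2;
        if (x < a[mid])
            right = mid - 1;   /* x can only be in the left half */
        else if (x == a[mid])
            return mid;        /* found */
        else
            left = mid + 1;    /* x can only be in the right half */
    }
    return -1;                 /* not found */
}
```

With the eight integers 2 3 5 7 11 13 17 19 from the worked example, `binsearch(a, 17, 0, 7)` visits mid = 3, 5, 6 and returns 6.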
16
Recursive Functions
 A function that invokes (or is defined in
terms of) itself directly or indirectly is a
recursive function.
 Fibonacci sequence: F_n = F_{n-1} + F_{n-2}
 Summation: sum(n) = sum(n - 1) + A[n]
 Where sum(i) is the sum of the first i items in A
 Binomial coefficient (n choose k):
C(n, k) = C(n - 1, k) + C(n - 1, k - 1)
 The combinatorial interpretation of this identity is
worth working out if you have not seen it yet!
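Each of the three recurrences above becomes a recursive C function once base cases are added. The base cases here are assumptions, since the slide states only the recursive step: F_0 = 0 and F_1 = 1, sum(0) = 0, and C(n, 0) = C(n, n) = 1. The function names are hypothetical, and C arrays are 0-based, so the n-th item is `a[n - 1]`.

```c
/* Fibonacci: F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1 assumed. */
unsigned long fib(unsigned n) {
    if (n < 2) return n;
    return fib(n - 1) + fib(n - 2);
}

/* Sum of the first n items of a: sum(n) = sum(n-1) + (n-th item).
   With 0-based indexing the n-th item is a[n - 1]. */
double sum_first(const double a[], int n) {
    if (n <= 0) return 0.0;
    return sum_first(a, n - 1) + a[n - 1];
}

/* Binomial coefficient: C(n, k) = C(n-1, k) + C(n-1, k-1),
   with C(n, 0) = C(n, n) = 1 and C(n, k) = 0 for k > n. */
unsigned long binomial(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k == 0 || k == n) return 1;
    return binomial(n - 1, k) + binomial(n - 1, k - 1);
}
```

These direct translations are meant to show the shape of the recursion only. `fib` and `binomial` recompute the same subproblems many times, so they take exponential time as written.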
17
Recursive Binary Search
 Binsearch(A, x, left, right)
    if left > right then return -1;
    mid = (left + right) / 2;
    if x < A[mid] then
        return Binsearch(A, x, left, mid - 1);
    else if x == A[mid] then return mid;
    else return Binsearch(A, x, mid + 1, right);
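A C rendering of the recursive pseudocode above might look like this. It assumes an ascending array of `int`, uses the hypothetical name `binsearch_rec`, and computes the midpoint in an overflow-safe way.

```c
/* Recursively search the sorted array a[left..right] for x.
   Returns the index of x, or -1 if x is not present. */
int binsearch_rec(const int a[], int x, int left, int right) {
    if (left > right)
        return -1;                                   /* empty range */
    int mid = left + (right - left) / 2;
    if (x < a[mid])
        return binsearch_rec(a, x, left, mid - 1);   /* left half */
    else if (x == a[mid])
        return mid;
    else
        return binsearch_rec(a, x, mid + 1, right);  /* right half */
}
```

Each call makes at most one recursive call on a range about half as large, so the recursion depth grows only logarithmically with the size of the range.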
18
Data Abstraction
 Before implementing anything, we need to
know the specification of the objects to be
stored as well as the operations that must
be supported.
 In our previous polynomial example, we first
need to store polynomials (the objects) and to
support multiplication and other operations.
The specification is independent of how it is
implemented (e.g., using arrays or linked lists).
19
Abstract Data Type (ADT)
 An abstract data type (ADT) is a data type
that is organized in such a way that the
specification of the objects and the
specification of the operations on the objects
are separated from the representation of the
objects and the implementation of the
operations.
 There is no fixed syntax for describing ADTs. The
specifications need only be clear.
20
ADT for Natural Numbers (an
example)
 Structure Natural_Number:
 Objects: an ordered subrange of the integers
starting at 0 and ending at the maximum integer
(INT_MAX) on the computer.
 Functions:
 Nat_Num Zero() ::= 0
 Nat_Num Add(x, y) ::= if (x + y) <= INT_MAX then
return x + y, else return INT_MAX.
 And so on. This example is so simple that the
specification may look as if it already states the
implementation of the functions. That is generally
not the case.
21
*Structure 1.1: Abstract data type Natural_Number (p.17)
structure Natural_Number is
  objects: an ordered subrange of the integers starting at zero and
           ending at the maximum integer (INT_MAX) on the computer
  functions:
    for all x, y ∈ Nat_Number; TRUE, FALSE ∈ Boolean
    and where +, -, <, and == are the usual integer operations.
    Nat_No Zero()           ::= 0
    Boolean Is_Zero(x)      ::= if (x) return FALSE
                                else return TRUE
    Nat_No Add(x, y)        ::= if ((x+y) <= INT_MAX) return x+y
                                else return INT_MAX
    Boolean Equal(x, y)     ::= if (x == y) return TRUE
                                else return FALSE
    Nat_No Successor(x)     ::= if (x == INT_MAX) return x
                                else return x+1
    Nat_No Subtract(x, y)   ::= if (x < y) return 0
                                else return x-y
end Natural_Number
(::= is read “is defined as”)
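One possible C implementation of this specification is sketched below. The type name `NatNum` and the `nat_*` function names are hypothetical. One detail differs from the specification on purpose: in C, evaluating `x + y` when the true sum exceeds `INT_MAX` is signed overflow (undefined behaviour), so the test `(x+y) <= INT_MAX` is rewritten as `x <= INT_MAX - y`, which is equivalent for non-negative operands.

```c
#include <limits.h>
#include <stdbool.h>

/* Representation choice (an assumption): a natural number is an int
   that callers keep within 0..INT_MAX. */
typedef int NatNum;

NatNum nat_zero(void) { return 0; }

bool nat_is_zero(NatNum x) { return x == 0; }

/* Saturating addition: the comparison is arranged so that the
   sum is computed only when it cannot overflow. */
NatNum nat_add(NatNum x, NatNum y) {
    return (x <= INT_MAX - y) ? x + y : INT_MAX;
}

bool nat_equal(NatNum x, NatNum y) { return x == y; }

/* Successor stays at INT_MAX once the top of the range is reached. */
NatNum nat_successor(NatNum x) {
    return (x == INT_MAX) ? x : x + 1;
}

/* Subtraction never goes below zero. */
NatNum nat_subtract(NatNum x, NatNum y) {
    return (x < y) ? 0 : x - y;
}
```

Code that uses only these functions does not depend on the representation, which is the point of separating the specification from the implementation.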
22
Performance Analysis
 We’re most interested in the time and
space requirements of an algorithm.
 The space complexity of a program is
the amount of memory that it needs to
run to completion. The time complexity
is the amount of computer time that it
needs to run to completion.
23
Algorithm ?
 A set of rules, to be followed in a
prescribed order, for solving a specific
type of problem.
 Computer Algorithm
 Finiteness (terminates after a finite number of steps)
 Definiteness (each step is clear and unambiguous)
 Effectiveness
 Input/Output
 (An O.S. never terminates, so it is a computational procedure rather than an algorithm.)
24
Algorithms Are Everywhere!
 Operating Systems
 System Programming
 Numerical Applications
 Non-numerical Applications
 …
 Algorithm implementation:
 Software
 Hardware
 Firmware
25
Measurements
 Criteria
 Is it correct?
 Is it readable?
 …
 Performance Analysis (machine independent)
 space complexity: storage requirement
 time complexity: computing time
 Performance Measurement (machine
dependent)
26
Space Complexity
 The space needed includes two parts:
 (1) Fixed space requirement: Not
dependent on the number and size of
the program’s inputs and outputs.
 Instruction space, simple variables (e.g.,
int), fixed-size structure variables (such as
struct), and constants.
27
Space Complexity
 (2) Variable space requirement S(I):
Depends on the instance I involved.
 In recursive calls, many copies of simple
variables (e.g. many int’s) may exist. Such
space requirement is included in S(I).
 S(I) may depend on some characteristics
of I. The characteristic we’ll most often
encounter is n, the size of the instance.
 In this case we denote S(I) as S(n).
28
Space Complexity
float abc(float a, float b, float c) {
    return a+b+b*c + (a+b-c) / (a+b) + 4;
}
 S_abc(I) = 0.
 Only has fixed space requirement.
29
Space Complexity
float sum(float list[], int n) { /* adds up list[] */
    float tempsum = 0;
    int i;
    for (i = 0; i < n; i++) tempsum += list[i];
    return tempsum;
}
 In C, the array is passed by address (a pointer to its first
element). So S_sum(n) = 0.
 In Pascal, the array may be passed by copying values. If
this is the case, then S_sum(n) = n.
30
Space Complexity
float rsum(float list[], int n) {
    if (n > 0) return rsum(list, n-1) + list[n-1];
    return 0;
}
 For each recursive call, the system must save
the parameters, the local variables, and the return
address.
 In this example, two parameters (list[] and n)
and the return address (internally) are saved
for each recursive call.
31
 To add a list of n numbers, there are n
recursive calls in total.
 So S_rsum(n) = (c1 + c2 + c3) * n, where c1, c2
and c3 are the numbers of bytes (or other unit
of interest) needed for each of these items
(list[], n, and the return address).
32
Time Complexity
 We’re interested in the number of steps
taken by a program. But what is a step?
 A program step is a syntactically or
semantically meaningful program segment
whose execution time is independent of the
instance characteristics.
 For a given program, everyone may define his or her own
steps. What matters is that the execution time of a step
is independent of the instance size, etc.
33
Statement                                    s/e   Freq.   Total
float rsum(float list[], int n) {
    if (n > 0)                                1     n+1     n+1
        return rsum(list, n-1) + list[n-1];   1     n       n
    return 0;                                 1     1       1
}
Total                                                       2n+2
s/e: number of steps per execution.
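The step count in the table can be checked by instrumenting the function. The sketch below is an illustration only: the global counter `step_count` and the name `rsum_counted` are hypothetical, and one step is charged for each statement that the table charges.

```c
/* Global step counter, an instrumentation device for this sketch only. */
long step_count = 0;

/* rsum with one counter increment per step charged in the table. */
float rsum_counted(const float list[], int n) {
    step_count++;          /* the test: if (n > 0) */
    if (n > 0) {
        step_count++;      /* return rsum(list, n-1) + list[n-1]; */
        return rsum_counted(list, n - 1) + list[n - 1];
    }
    step_count++;          /* return 0; */
    return 0;
}
```

For a list of n = 5 numbers the test executes 6 times, the recursive return 5 times, and `return 0` once, so the counter ends at 12 = 2n + 2.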
34
Time Complexity
 For some programs, for a fixed n, the time
taken by different instances may still be
different.
 For example, the binary search algorithm
depends on the position of x in array A.
 Let A = <3, 5, 7, 11>.
 If x = 5, then we need fewer steps than if
x = 7.
 But in both cases, n = 4.
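The difference between instances of the same size can be made visible by counting loop iterations. The function below is a sketch with a hypothetical name. It runs the iterative binary search on `a[0..n-1]` and returns the number of iterations performed instead of the position.

```c
/* Run binary search for x on the sorted array a[0..n-1] and return
   how many loop iterations were executed, whether or not x is found. */
int binsearch_iterations(const int a[], int n, int x) {
    int left = 0, right = n - 1, iterations = 0;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        iterations++;
        if (x < a[mid])
            right = mid - 1;
        else if (x == a[mid])
            return iterations;   /* found on this iteration */
        else
            left = mid + 1;
    }
    return iterations;           /* x is not in the array */
}
```

On A = <3, 5, 7, 11>, searching for 5 succeeds in the first iteration (mid = 1), while searching for 7 needs a second iteration (mid = 1, then mid = 2).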
35
Time Complexity
 So we may consider the worst case, average case,
or best case time complexity of an algorithm.
 Worst case: the maximum number of steps needed
for any possible instance of size n.
 Most commonly used. It gives a guarantee.
 Average case: under some assumption about the instance
distribution, the expected number of steps needed.
 Useful, but usually the most complicated to analyze.
 Best case: the minimum number of steps needed
for any possible instance of size n.
 Rarely seen.
36
Asymptotic Notations
 Making exact step counts is often unnecessary.
 The concept of a “step” is itself inexact.
 It is more impressive to obtain a “functional”
improvement in the time complexity than an
improvement by a constant multiple.
 It is good to improve 2n^2 to n^2.
 It is even better to improve 2n^2 to 1000n.
 For n large enough, you know that n^2 is much larger
than 1000n.
37
*Figure 1.7: Function values (p.38)
38
39
*Figure 1.9: Times on a 1 billion instruction per second
computer (p.40)
40
Asymptotic Notations
 Therefore, in many cases we do not worry about
the coefficients in the time complexity.
 Not in “all” cases since, when we cannot have a functional
improvement, we’ll still want improvements in the
coefficients.
 We regard 2n^2 as equivalent to n^2. They belong to
the same class in this sense.
 Similarly, 2n^2 + 100n and n^2 belong to the same
class, since the quadratic term is dominant.
 n^2 and n belong to different classes.
41
Asymptotic Notation (O)
 Definition
f(n) = O(g(n)) iff there exist positive
constants c and n0 such that f(n) ≤ c g(n) for
all n, n ≥ n0.
 Examples
 3n + 2 = O(n) /* 3n + 2 ≤ 4n for n ≥ 2 */
 3n + 3 = O(n) /* 3n + 3 ≤ 4n for n ≥ 3 */
 100n + 6 = O(n) /* 100n + 6 ≤ 101n for n ≥ 10 */
 10n^2 + 4n + 2 = O(n^2) /* 10n^2 + 4n + 2 ≤ 11n^2 for n ≥ 5 */
 6*2^n + n^2 = O(2^n) /* 6*2^n + n^2 ≤ 7*2^n for n ≥ 4 */
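The constants c and n0 quoted in the comments above can be sanity-checked numerically. The helper below is a hypothetical sketch that tests f(n) ≤ c g(n) for every integer n from n0 up to a chosen limit. A finite check like this cannot prove a big-O claim, but it does catch a wrong choice of c or n0.

```c
#include <stdbool.h>

typedef double (*Func)(double);

/* Return true if f(n) <= c * g(n) for every integer n in [n0, n_max]. */
bool bound_holds(Func f, Func g, double c, int n0, int n_max) {
    for (int n = n0; n <= n_max; n++)
        if (f(n) > c * g(n))
            return false;
    return true;
}

/* Two of the example functions from the slide and their bounds. */
double f_linear(double n)    { return 3 * n + 2; }
double f_quadratic(double n) { return 10 * n * n + 4 * n + 2; }
double g_n(double n)         { return n; }
double g_n2(double n)        { return n * n; }
```

For example, `bound_holds(f_linear, g_n, 4, 2, 1000)` is true, while starting from n0 = 1 fails because 3(1) + 2 = 5 > 4. Likewise the quadratic example holds from n0 = 5 but fails at n = 4, where 178 > 176.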
42
Example
 Complexity of c1 n^2 + c2 n and c3 n
 for sufficiently large values of n, c3 n is faster than c1 n^2 + c2 n
 for small values of n, either could be faster
 c1 = 1, c2 = 2, c3 = 100 --> c1 n^2 + c2 n ≤ c3 n for n ≤ 98
 c1 = 1, c2 = 2, c3 = 1000 --> c1 n^2 + c2 n ≤ c3 n for n ≤ 998
 break-even point
 no matter what the values of c1, c2, and c3 are, there is an n
beyond which c3 n is always faster than c1 n^2 + c2 n
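The break-even point can be found by direct search. The function below is a minimal sketch with a hypothetical name. It assumes integer constants with c1 > 0 and returns the largest n for which the quadratic cost c1 n^2 + c2 n is still no larger than the linear cost c3 n.

```c
#include <assert.h>

/* Largest n >= 1 with c1*n*n + c2*n <= c3*n, or 0 if no such n exists.
   Requires c1 > 0, so that the quadratic eventually overtakes c3*n
   and the loop terminates. */
long break_even(long c1, long c2, long c3) {
    assert(c1 > 0);
    long n = 0;
    /* Advance while the next value of n still satisfies the inequality. */
    while (c1 * (n + 1) * (n + 1) + c2 * (n + 1) <= c3 * (n + 1))
        n++;
    return n;
}
```

With c1 = 1 and c2 = 2 this gives 98 for c3 = 100 and 998 for c3 = 1000, matching the two cases above. In closed form the inequality reduces to n ≤ (c3 - c2) / c1.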
43
 O(1): constant
 O(n): linear
 O(n^2): quadratic
 O(n^3): cubic
 O(2^n): exponential
 O(log n): logarithmic
 O(n log n)
44
Asymptotic Notations
 To make these concepts more precise,
asymptotic notations are introduced.
 f(n) = O(g(n)) if there exist positive
constants c and n0 such that f(n) ≤ c g(n) for
all n > n0.
 1000n = O(2n^2). This is to say that 1000n is
no larger than (≤) 2n^2 in this sense.
 You can choose c = 500 and n0 = 1.
 Then 1000n ≤ 500 * 2n^2 = 1000n^2 for all n > 1.
45
Asymptotic Notations
 1000n = O(n) since you can choose c =
1000 and n0 = 1.
 1000 = O(1).
 For constant functions, the choice of n0 is
arbitrary.
 n^2 ≠ O(n) since for any c > 0 and any n0 > 0,
there always exists some n > n0 such that n^2
> cn.
46
Asymptotic Notations
 2n^2 + 100n = O(n^2), since
2n^2 + 100n < 200n^2 = O(n^2)
 This last equation may be validated by choosing
c = 200 and n0 = 1.
 So in asymptotic notation we only care about
the dominant term. Simply throw out the minor
terms. Throw out the constants, too.
 log_2 n = O(n). May choose c = 1 and n0 = 2.
 n log_2 n = O(n^2).
47
Asymptotic Notations
 n^100 = O(2^n). You may choose c = 1 and n0 = 1000.
 O(1): constant, O(n): linear,
O(n^2): quadratic, O(n^3): cubic,
O(log n): logarithmic, O(n log n),
O(2^n): exponential.
 These are the most commonly encountered time
complexities.
 The base of the log is not relevant asymptotically,
since log_b a = (1/log_2 b) log_2 a, which differs only by a
constant multiple 1/log_2 b.
48
Asymptotic Notations
 f(n) = O(g(n)) is just a notation. f(n)
and O(g(n)) are not the same thing.
 So you can’t write O(g(n)) = f(n).
 It is also common to view O(g(n)) as a
set of functions, and f(n) = O(g(n))
actually means f(n) ∈ O(g(n)).
 O(g(n)) = {f(n): there exist c > 0 and n0 > 0
such that f(n) ≤ c g(n) for all n > n0}
49
Asymptotic Notations
 n = O(n), n = O(n2), n = O(n3), …
 Choosing a function g(n) closer to f(n) is
more informative.
 You may ask, why not use f(n) itself? f(n)
= O(f(n)) is the best choice. The difficulty
is: when we analyze an algorithm, we may
not even know the exact f(n). We can only
obtain an upper bound in some cases.
50
Asymptotic Notations
 Thm 1.2: If f(n) = a_m n^m + … + a_1 n + a_0,
then f(n) = O(n^m).
 Pf: Let a = max{|a_m|, …, |a_0|} + 1. Then
 f(n) < a n^m + … + a n + a
< (m+1)a * n^m for n > 1.
 So choosing c = (m+1)a and n0 = 1 just
works.
 So, again, drop the constants and minor
terms.
51
Asymptotic Notations
 We also have a notation for lower bounds.
 f(n) = Ω(g(n)) if for some c > 0 and n0 > 0,
f(n) ≥ c g(n) for all n > n0.
 n^2 = Ω(2n); choose c = 1/2 and n0 = 1.
 2^n = Ω(n^100); choose c = 1 and n0 = 1000.
 You can prove that: If f(n) = O(g(n)), then
g(n) = Ω(f(n)).
52
Asymptotic Notations
 Thm 1.3:
If f(n) = a_m n^m + … + a_1 n + a_0 and a_m
> 0, then f(n) = Ω(n^m).
 Pf: Exercise.
 Note that, if a_m < 0, then f(n) ≠ Ω(n^m).
For example, -n^2 + 1000n ≠ Ω(n^2) since
for any c > 0 and n0 > 0, we have
-n^2 + 1000n < c n^2 for n large.
53
Asymptotic Notations
 We also have a notation for “equivalence”.
 f(n) = Θ(g(n)) if there exist c1 > 0, c2 > 0,
and n0 > 0 such that
c1 g(n) ≤ f(n) ≤ c2 g(n) for all n > n0.
 2n^2 + 100n = Θ(n^2).
 n^2 ≠ Θ(n).
 You can prove that f(n) = Θ(g(n)) if and
only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
54
Asymptotic Notations
 Thm 1.4: If f(n) = a_m n^m + … + a_1 n + a_0
and a_m > 0, then f(n) = Θ(n^m).
 Pf: Immediate from Thm 1.2 and 1.3.
55
Time Complexity of Binary
Search
 At each iteration, the search range for binary
search is reduced by about a half.
 So, in any case, the number of iterations needed
cannot exceed ⌊log_2 n⌋ + 1. The time complexity is
O(log n) in any case (each iteration takes O(1)).
 The worst case time complexity is Ω(log n), which
occurs when, e.g., the number to search for is not in
the array.
 Therefore, the worst case time complexity is
Θ(log n).
 In the best case, one iteration suffices. The best
case time complexity is Θ(1).
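The logarithmic worst case can be observed by simulating an unsuccessful search. The sketch below uses a hypothetical function name and needs no array at all: it assumes the key is larger than every element, so every comparison sends the search to the right half, and it counts how many iterations pass before the range becomes empty.

```c
/* Number of iterations binary search performs on n sorted elements
   when the key is larger than every element (an unsuccessful search). */
int worst_case_iterations(int n) {
    int left = 0, right = n - 1, iterations = 0;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        iterations++;
        left = mid + 1;   /* key > a[mid] always holds for this key */
    }
    return iterations;
}
```

Each iteration shrinks a range of s elements to ⌊s/2⌋ elements, so the count is ⌊log_2 n⌋ + 1: one iteration for n = 1, four for n = 8, and ten for n = 1000.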
56
Time Complexity of Binary
Search
 Note that the worst case time complexity of a
sequential search is Θ(n), which occurs
when, e.g., the number to search for is not
in the array.
 So binary search is faster than
sequential search, but it requires the
array to be sorted.
57
Time Complexity of Binary
Search
 The worst case time complexity of the
recursive binary search can be stated
elegantly as:
 T(n) = T(n/2) + Θ(1) for n > 1; T(1) = Θ(1).
 The Θ(1) term stands for some anonymous
function f(n) s.t. f(n) = Θ(1), which accounts for
the time needed in addition to the time
taken by the recursive call.
58
Time Complexity of Binary
Search
 Note that any function f(n) = Θ(1) is
bounded above by a constant, and is
bounded below by a constant.
 By the definition of Θ, there exist c > 0
and n0 > 0 such that f(n) < c for n > n0.
So f(n) < sup{|f(n)|: n ≤ n0} + c. Similarly
f(n) is bounded below by a constant.
 So T(n) < T(n/2) + c.
59
Time Complexity of Binary
Search
 For simplicity let n = 2^m, so m = log_2 n. Then
 T(n) = T(2^m) < T(2^{m-1}) + c
< T(2^{m-2}) + 2c < …
< T(1) + m*c = O(m)
 So T(n) = O(log n).
 This may be seen as an initial guess. To
confirm the answer we may use induction.
60
Time Complexity of Binary
Search
 Ind. Hyp.: T(m) < d log m for m < n.
 We are free to choose the constant d, as long as it
satisfies the base case. So we can choose d to
be larger than c.
 Induction: T(n) < T(n/2) + c < d log(n/2) +
c = d log n - d + c < d log n.
 Base case: T(2) < T(1) + c < c’ < d log 2.
 Need to choose d to satisfy d > c’.
 So T(n) < d log n, implying T(n) = O(log n).
61
Selection Sort
 Given several numbers, how do we sort them?
 You can find the smallest number, set it
aside, find the next smallest number, and so
on, until all the numbers are done.
62
Selection Sort
 sort(A, n)
/* Assume that A is indexed by 1..n */
    for i = 1 to n - 1 do
        find the index of the min. elem. in A[i..n];
        swap the min. elem. with A[i];
 Observe that, at the end of the i-th iteration,
A[j] holds the j-th smallest element of A, for
all j ≤ i.
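In C, where arrays are indexed from 0 rather than 1, the pseudocode above could be written as follows. This is a sketch for an array of `int`, and the function name is hypothetical. The loop bounds are shifted by one to account for 0-based indexing.

```c
/* Sort a[0..n-1] into ascending order by selection sort. */
void selection_sort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        /* Find the index of the minimum element in a[i..n-1]. */
        int min_index = i;
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min_index])
                min_index = j;
        /* Swap the minimum element with a[i]. */
        int tmp = a[i];
        a[i] = a[min_index];
        a[min_index] = tmp;
    }
}
```

After the pass with loop index i, the cells a[0..i] hold the i + 1 smallest elements in sorted order, which is the 0-based form of the observation above.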
63
Time Complexity of Selection
Sort
 Finding the minimum in A[i..n] takes
Θ(n - i) time.
 Total time:
Σ_{i=1}^{n-1} Θ(n - i) = Θ(n(n-1)/2) = Θ(n^2)
64
Comparison of Two Strategies
 Suppose that you want to do searches in A.
 If the search will be performed only once,
then a sequential search is good.
 If the search is to be done very frequently
(much more than n times), then it is worth
paying Θ(n^2) time to sort the array first
(preprocessing), so that every later search
can use binary search.

III_Data Structure_Module_1.ppt

  • 1. 1 Chapter 1 Introduction All the material are integrated from the textbook "Fundamentals of Data Structures in C” and NTU Website
  • 2. 2 How to create programs  Requirements  Analysis: bottom-up vs. top-down  Design: data objects and operations  Refinement and Coding  Verification  Program Proving  Testing  Debugging
  • 3. 3 Data Structure: How data are organized and hence operated  You’ve been very familiar with arrays in your programming assignments. They are basic (yet powerful!) data structures.  They can hold data (objects)—e.g., integers.  They are structured—structured in a way that the data held inside can be operated.  Each element in an array has an index. With that, you can store or retrieve an element.
  • 4. 4 Learning Data Structures, and Algorithms  You want your tasks to be performed efficiently. You need good methods (algorithms).  Data must be structured in some manner to be operated.  Good structures can be operated efficiently.
  • 5. 5 Example  For example, suppose that you are building a database storing the data of all (past, present, and future) students of NTHU, which are growing in size.  You’ll need to find anybody’s data in the database (to search/retrieve).  To enter new entries into it (to insert).  Etc.
  • 6. 6 Example  You can use an array to implement the database.  To insert a new entry, simply add it to the first empty array cell.  If the current array is full, allocate a new array whose size is the double of the current. Then copy all the original entries from the old array to the new one.  To search for an entry, simply looks at all entries in the array, one by one.
  • 7. 7 Example  Your programming experience, however, tells you that this is not a good method.  In Chapter 10, you’ll see sophisticated data structures that can be operated (searched, inserted/deleted, etc.) efficiently.  Then, you’ll just feel how data can be cleverly structured to serve as the basis of fast algorithms.
  • 8. 8 Example 2  This time you want to work with polynomials, i.e., functions of the form f(x) = an xn + an-1 xn-1 + … + a0  You’ll need to store them, as well as to multiply or add them.  You may allocate an array to store the coefficients of a polynomial.  E.g., A[i] stores ai.
  • 9. 9 Example 2  Alternatively, you can use a linked list to store it.  Each node needs to store the coefficient and the exponent. E.g., to store 3x2 + 5, a linked list like [3|2] -> [5|0] is constructed.  To represent polynomials, both data structures have their relative advantages and drawbacks in time and space considerations, etc (see later chapters).
  • 10. 10 Searching in Arrays  You are able to write, in minutes, a program to search sequentially in an array.  However, when the numbers in an array are sorted, you can do much faster…  …using binary search, which you probably are also familiar with.  You may say that, a sorted array is not the same structure as a general array.
  • 11. 11 Binary Search  Let A[0..n - 1] be an array of n integers.  We want to ask: is some integer x stored in A?  Suppose that A is sorted (say, in ascending order), i.e., (for simplicity assume that the numbers are distinct) A[0] < A[1] < … < A[n - 1].  To be more concrete, let n be 5.
  • 12. 12 Binary Search  Observation: If x > A[2], then x must fall in A[3..4], (or x is not in A). If x < A[2], then x must fall in A[0..1].  Ex: Let x = 11.  A = 3 5 7 11 13  x (= 11) > A[2] (= 7), so we need not consider A[0..1].
  • 13. 13 Binary Search  Example: Let x = 17. Let A contain the following 8 integers. Initially, let left = 0 and right = 7 (shown in red).  left and right define the range to be searched.  2 3 5 7 11 13 17 19  mid = (0 + 7) / 2 = 3 (rounded).  A[mid] = 7 < x, so let left = mid + 1 = 4, and continue the search.
  • 14. 14  2 3 5 7 11 13 17 19  mid = (4 + 7) / 2 = 5, A[mid] = 13, x = 17.  A[mid] < x, so let left = mid + 1 = 6, and continue the search.  2 3 5 7 11 13 17 19  mid = (6 + 7) / 2 = 6, A[mid] = 17, x = 17.  A[mid] = x, so return mid = 6, the desired position, and we’re done!
  • 15. 15 Binary Search  Binsearch(A, x, left, right) /* finds x in A[left..right] */ while left <= right do mid = (left + right) / 2; if x < A[mid] then right = mid – 1; else if x == A[mid] then return mid; else left = mid + 1; return -1; /* not found */
  • 16. 16 Recursive Functions  A function that invokes (or is defined in terms of) itself directly or indirectly is a recursive function.  Fibonacci sequence: Fn = Fn-1 + Fn-2  Summation: sum(n) = sum(n - 1) + A[n]  Where sum(i) is the sum of the first i items in A  Binomial coefficient (n choose k): C(n, k) = C(n – 1, k) + C(n – 1, k – 1)  The combinatorial interpretation of this is profound— try it out if you’ve not learned it yet!
  • 17. 17 Recursive Binary Search  Binsearch(A, x, left, right) if left > right then return -1; mid = (left + right) / 2; if x < A[mid] then return Binsearch(A, x, left, mid – 1); else if x == A[mid] then return mid; else return Binsearch(A, x, mid+ 1, right);
  • 18. 18 Data Abstraction  Before implementation, we need first to know the specification of the objects to be stored as well as the operations that should be supported, before we can implement it.  In our previous polynomial example, first we have the demand to store polynomials (the objects), supporting multiplications and other operations. The specification is independent of how it is implemented (e.g. using arrays or linked lists).
  • 19. 19 Abstract Data Type (ADT)  An abstract data type (ADT) is a data type that is organized in such a way that the specification of the objects and the specification of the operations on the objects is separated from the representation of the objects and the implementation of the operations.  No fixed syntax to describe them. The specifications need only be clear.
  • 20. 20  Structure Natural_Number:  Objects: an ordered subrange of the integers starting at 0 and ending at the maximum integer (INT_MAX) on the computer.  Functions:  Nat_Num Zero() ::= 0  Nat_Num Add(x, y) ::= if (x+y) <= INT_MAX then return x + y, else return INT_MAX.  And so on. This example is actually too simple so that you may feel that the implementation of the functions have been stated. But this is generally not the case. ADT for Natural Numbers (an example)
  • 21. 21 *Structure 1.1:Abstract data type Natural_Number (p.17) structure Natural_Number is objects: an ordered subrange of the integers starting at zero and ending at the maximum integer (INT_MAX) on the computer functions: for all x, y  Nat_Number; TRUE, FALSE  Boolean and where +, -, <, and == are the usual integer operations. Nat_No Zero ( ) ::= 0 Boolean Is_Zero(x) ::= if (x) return FALSE else return TRUE Nat_No Add(x, y) ::= if ((x+y) <= INT_MAX) return x+y else return INT_MAX Boolean Equal(x,y) ::= if (x== y) return TRUE else return FALSE Nat_No Successor(x) ::= if (x == INT_MAX) return x else return x+1 Nat_No Subtract(x,y) ::= if (x<y) return 0 else return x-y end Natural_Number ::= is defined as
  • 22. 22 Performance Analysis
      - We're most interested in the time and space requirements of an algorithm.
      - The space complexity of a program is the amount of memory that it needs to run to completion. The time complexity is the amount of computer time that it needs to run to completion.
  • 23. 23 Algorithm?
      - A number of rules, to be followed in a prescribed order, for solving a specific type of problem.
      - Computer algorithm:
          - Finiteness (terminates after a finite number of steps)
          - Definiteness (each step is precisely defined)
          - Effectiveness
          - Input/Output
          - (An operating system never terminates, so it is a computational procedure rather than an algorithm.)
  • 24. 24 Algorithms are everywhere!
      - Operating Systems
      - System Programming
      - Numerical Applications
      - Non-numerical Applications
      - Algorithm implementation:
          - Software
          - Hardware
          - Firmware
  • 25. 25 Measurements
      - Criteria
          - Is it correct?
          - Is it readable?
          - …
      - Performance Analysis (machine independent)
          - space complexity: storage requirement
          - time complexity: computing time
      - Performance Measurement (machine dependent)
  • 26. 26 Space Complexity
      - The space needed has two parts:
      - (1) Fixed space requirement: does not depend on the number or size of the program's inputs and outputs.
          - Instruction space, simple variables (e.g., int), fixed-size structured variables (such as structs), and constants.
  • 27. 27 Space Complexity
      - (2) Variable space requirement $S(I)$: depends on the instance $I$ involved.
          - With recursion, many copies of simple variables (e.g., many ints) may exist at once. This space is included in $S(I)$.
          - $S(I)$ may depend on some characteristics of $I$. The characteristic we'll most often encounter is $n$, the size of the instance.
          - In this case we write $S(I)$ as $S(n)$.
  • 28. 28 Space Complexity

        float abc(float a, float b, float c)
        {
            return a+b+b*c + (a+b-c) / (a+b) + 4;
        }

      - $S_{abc}(I) = 0$.
      - The function has only a fixed space requirement.
  • 29. 29 Space Complexity

        float sum(float list[], int n)
        {   /* adds up list[] */
            float tempsum = 0;
            int i;
            for (i = 0; i < n; i++)
                tempsum += list[i];
            return tempsum;
        }

      - In C, an array argument is passed as the address of its first element, so $S_{sum}(n) = 0$.
      - In Pascal, the array may be passed by copying its values. In that case $S_{sum}(n) = n$.
  • 30. 30 Space Complexity

        float rsum(float list[], int n)
        {
            if (n > 0)
                return rsum(list, n-1) + list[n-1];
            return 0;
        }

      - For each recursive call, the program must save the parameters, the local variables, and the return address (typically on the run-time stack).
      - In this example, two parameters (list[] and n) and the return address (internally) are saved for each recursive call.
  • 31. 31
      - To add a list of $n$ numbers, there are $n$ recursive calls in total.
      - So $S_{rsum}(n) = (c_1 + c_2 + c_3) \cdot n$, where $c_1$, $c_2$, and $c_3$ are the numbers of bytes (or other unit of interest) needed for list[], n, and the return address, respectively.
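The linear growth of the recursion stack can be observed with a small instrumented variant of rsum. This is a sketch only: the counters `cur_depth` and `max_depth` and the helper names are invented for illustration, and `rsum_frame_bytes` is merely a rough estimate of $c_1 + c_2 + c_3$, since real stack frames also contain padding and saved registers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping: live activations of rsum_tracked right now,
   and the largest number seen so far. */
static int cur_depth = 0;
static int max_depth = 0;

/* Same computation as rsum, plus depth tracking. */
float rsum_tracked(const float list[], int n)
{
    float result = 0.0f;
    assert(n >= 0);
    cur_depth++;
    if (cur_depth > max_depth)
        max_depth = cur_depth;
    if (n > 0)
        result = rsum_tracked(list, n - 1) + list[n - 1];
    cur_depth--;
    return result;
}

/* Deepest nesting reached while summing n elements. */
int rsum_max_depth(const float list[], int n)
{
    cur_depth = 0;
    max_depth = 0;
    (void)rsum_tracked(list, n);
    return max_depth;
}

/* Rough per-call estimate of c1 + c2 + c3: a pointer for list[],
   an int for n, and a pointer-sized return address (an assumption). */
size_t rsum_frame_bytes(void)
{
    return sizeof(const float *) + sizeof(int) + sizeof(void *);
}
```

For an array of n elements the measured depth is n + 1: the initial call plus the n recursive calls counted on the slide.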
  • 32. 32 Time Complexity
      - We're interested in the number of steps taken by a program. But what is a step?
      - A program step is a syntactically or semantically meaningful program segment whose execution time is independent of the instance characteristics.
      - Different people may define the steps of a program differently. What matters is that the cost of each step does not depend on the instance size or other instance characteristics.
  • 33. 33

        Statement                                    s/e    Freq.   Total
        float rsum(float list[], int n)
        {
          if (n > 0)                                  1      n+1     n+1
            return rsum(list, n-1) + list[n-1];       1      n       n
          return 0;                                   1      1       1
        }
        Total                                                        2n+2

      s/e: number of steps per execution.
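The total of 2n+2 can be checked by instrumenting rsum with a counter, one increment per row of the table. This is a sketch: the global `step_count` and the wrapper `rsum_steps` are hypothetical names added for the experiment.

```c
#include <assert.h>

/* Hypothetical global step counter, incremented once per counted statement. */
static long step_count = 0;

float rsum_counted(const float list[], int n)
{
    assert(n >= 0);
    step_count++;                 /* the test "if (n > 0)" */
    if (n > 0) {
        step_count++;             /* the return containing the recursive call */
        return rsum_counted(list, n - 1) + list[n - 1];
    }
    step_count++;                 /* "return 0" */
    return 0.0f;
}

/* Runs rsum_counted on n elements and reports how many steps were counted. */
long rsum_steps(const float list[], int n)
{
    step_count = 0;
    (void)rsum_counted(list, n);
    return step_count;
}
```

With n = 3 the test is executed 4 times, the recursive return 3 times, and `return 0` once, for 8 = 2·3 + 2 steps.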
  • 34. 34 Time Complexity
      - For some programs, even for a fixed n, different instances may take different amounts of time.
      - For example, the running time of binary search depends on the position of x in array A.
          - Let A = <3, 5, 7, 11>.
          - If x = 5, we need fewer steps than if x = 7.
          - But in both cases, n = 4.
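The difference between the two instances can be made concrete with an iterative binary search that counts its loop iterations. This is a sketch under one assumption: the midpoint is rounded down, so the exact counts depend on that convention. The name `bin_search` and the `iterations` out-parameter are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Binary search in the sorted array a[0..n-1].
   Returns the index of x, or -1 if x is absent; *iterations receives
   the number of loop iterations that were executed. */
int bin_search(const int a[], int n, int x, int *iterations)
{
    int left = 0, right = n - 1;
    assert(n >= 0 && iterations != NULL);
    *iterations = 0;
    while (left <= right) {
        int mid = left + (right - left) / 2;   /* midpoint, rounded down */
        (*iterations)++;
        if (a[mid] == x)
            return mid;
        if (a[mid] < x)
            left = mid + 1;
        else
            right = mid - 1;
    }
    return -1;
}
```

On A = <3, 5, 7, 11>, searching for 5 hits the first midpoint and stops after one iteration, while searching for 7 needs two.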
  • 35. 35 Time Complexity
      - So we may consider the worst-case, average-case, or best-case time complexity of an algorithm.
      - Worst case: the maximum number of steps needed over all possible instances of size n.
          - Most commonly used, because it gives a guarantee.
      - Average case: the expected number of steps needed, under some assumption about the distribution of instances.
          - Useful, but usually the most complicated to analyze.
      - Best case: the opposite of the worst case.
          - Rarely seen.
  • 36. 36 Asymptotic Notations
      - Exact step counts are often unnecessary.
          - The notion of a "step" is itself inexact.
      - A "functional" improvement in the time complexity is more impressive than an improvement by a constant multiple.
          - It is good to improve $2n^2$ to $n^2$.
          - It is even better to improve $2n^2$ to $1000n$.
          - For $n$ large enough (here, $n > 1000$), $n^2$ is much larger than $1000n$.
  • 39. 39 *Figure 1.9: Times on a computer that executes 1 billion instructions per second (p. 40)
  • 40. 40 Asymptotic Notations
      - Therefore, in many cases we do not worry about the coefficients in the time complexity.
          - Not in all cases: when no functional improvement is possible, we still want to improve the coefficients.
      - We regard $2n^2$ as equivalent to $n^2$. They belong to the same class in this sense.
      - Similarly, $2n^2 + 100n$ and $n^2$ belong to the same class, since the quadratic term is dominant.
      - $n^2$ and $n$ belong to different classes.
  • 41. 41 Asymptotic Notation (O)
      - Definition: $f(n) = O(g(n))$ iff there exist positive constants $c$ and $n_0$ such that $f(n) \le c\,g(n)$ for all $n \ge n_0$.
      - Examples
          - $3n+2 = O(n)$        /* $3n+2 \le 4n$ for $n \ge 2$ */
          - $3n+3 = O(n)$        /* $3n+3 \le 4n$ for $n \ge 3$ */
          - $100n+6 = O(n)$      /* $100n+6 \le 101n$ for $n \ge 10$ */
          - $10n^2+4n+2 = O(n^2)$   /* $10n^2+4n+2 \le 11n^2$ for $n \ge 5$ */
          - $6 \cdot 2^n + n^2 = O(2^n)$   /* $6 \cdot 2^n + n^2 \le 7 \cdot 2^n$ for $n \ge 4$ */
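The witnesses $c$ and $n_0$ in these examples can be sanity-checked numerically. The sketch below only tests the inequality $f(n) \le c\,g(n)$ on a finite range, so it can refute a bad choice of constants but cannot prove a bound. All names (`bound_holds`, `f_linear`, and so on) are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef double (*growth_fn)(double);

/* Returns true if f(n) <= c * g(n) for every integer n in [n0, n_max].
   A finite check only: it is evidence for, not a proof of, f = O(g). */
bool bound_holds(growth_fn f, growth_fn g, double c, int n0, int n_max)
{
    assert(c > 0 && n0 >= 1 && n_max >= n0);
    for (int n = n0; n <= n_max; n++)
        if (f(n) > c * g(n))
            return false;
    return true;
}

/* Two of the examples from the slide. */
double f_linear(double n) { return 3 * n + 2; }
double g_linear(double n) { return n; }
double f_quad(double n)   { return 10 * n * n + 4 * n + 2; }
double g_quad(double n)   { return n * n; }
```

For instance, `bound_holds(f_linear, g_linear, 4.0, 2, 100000)` succeeds, while starting the range at 1 fails because 3·1 + 2 = 5 exceeds 4·1. That is why the slide requires $n \ge 2$.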
  • 42. 42 Example
      - Compare the complexities $c_1 n^2 + c_2 n$ and $c_3 n$.
          - For sufficiently large values of $n$, $c_3 n$ is faster than $c_1 n^2 + c_2 n$.
          - For small values of $n$, either could be faster:
              - $c_1 = 1$, $c_2 = 2$, $c_3 = 100$: $c_1 n^2 + c_2 n \le c_3 n$ for $n \le 98$
              - $c_1 = 1$, $c_2 = 2$, $c_3 = 1000$: $c_1 n^2 + c_2 n \le c_3 n$ for $n \le 998$
      - Break-even point
          - No matter what the values of $c_1$, $c_2$, and $c_3$ are, there is an $n$ beyond which $c_3 n$ is always faster than $c_1 n^2 + c_2 n$.
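The break-even points 98 and 998 can be reproduced with a direct search. The function name `break_even` is hypothetical, and the sketch assumes positive integer coefficients small enough that the products do not overflow a `long`.

```c
#include <assert.h>

/* Largest n >= 0 such that c1*n*n + c2*n <= c3*n, i.e. the last n at which
   the quadratic cost is still no worse than the linear one.
   Because c1 > 0, the inequality fails for all larger n once it fails. */
long break_even(long c1, long c2, long c3)
{
    long n = 0;
    assert(c1 > 0 && c2 >= 0 && c3 >= 0);
    while (c1 * (n + 1) * (n + 1) + c2 * (n + 1) <= c3 * (n + 1))
        n++;
    return n;
}
```

Dividing the inequality by $n$ gives the closed form $n \le (c_3 - c_2)/c_1$, which agrees with the loop: (100 - 2)/1 = 98 and (1000 - 2)/1 = 998.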
  • 43. 43
      - $O(1)$: constant
      - $O(n)$: linear
      - $O(n^2)$: quadratic
      - $O(n^3)$: cubic
      - $O(2^n)$: exponential
      - $O(\log n)$
      - $O(n \log n)$
  • 44. 44 Asymptotic Notations
      - To make these concepts more precise, asymptotic notations are introduced.
      - $f(n) = O(g(n))$ if there exist positive constants $c$ and $n_0$ such that $f(n) \le c\,g(n)$ for all $n > n_0$.
      - $1000n = O(2n^2)$. This says that $1000n$ is no larger than ($\le$) $2n^2$ in this sense.
          - You can choose $c = 500$ and $n_0 = 1$.
          - Then $1000n \le 500 \cdot 2n^2 = 1000n^2$ for all $n > 1$.
  • 45. 45 Asymptotic Notations
      - $1000n = O(n)$, since you can choose $c = 1000$ and $n_0 = 1$.
      - $1000 = O(1)$.
          - For constant functions, the choice of $n_0$ is arbitrary.
      - $n^2 \ne O(n)$, since for any $c > 0$ and any $n_0 > 0$ there always exists some $n > n_0$ such that $n^2 > cn$.
  • 46. 46 Asymptotic Notations
      - $2n^2 + 100n = O(n^2)$, since $2n^2 + 100n < 200n^2 = O(n^2)$.
          - The last equation can be validated by choosing $c = 200$ and $n_0 = 1$.
      - So in asymptotic notation we care only about the dominant term: drop the minor terms, and drop the constant factors too.
      - $\log_2 n = O(n)$. We may choose $c = 1$ and $n_0 = 2$.
      - $n \log_2 n = O(n^2)$.
  • 47. 47 Asymptotic Notations
      - $n^{100} = O(2^n)$. You may choose $c = 1$ and $n_0 = 1000$.
      - $O(1)$: constant, $O(n)$: linear, $O(n^2)$: quadratic, $O(n^3)$: cubic, $O(\log n)$: logarithmic, $O(n \log n)$, $O(2^n)$: exponential.
          - These are the most commonly encountered time complexities.
      - The base of the logarithm is not relevant asymptotically, since $\log_b a = (1/\log_2 b)\,\log_2 a$, which differs from $\log_2 a$ only by the constant multiple $1/\log_2 b$.
  • 48. 48 Asymptotic Notations
      - $f(n) = O(g(n))$ is just a notation. $f(n)$ and $O(g(n))$ are not the same thing.
          - So you can't write $O(g(n)) = f(n)$.
      - It is also common to view $O(g(n))$ as a set of functions, so that $f(n) = O(g(n))$ actually means $f(n) \in O(g(n))$.
          - $O(g(n)) = \{ f(n) : \text{there exist } c > 0 \text{ and } n_0 > 0 \text{ such that } f(n) \le c\,g(n) \text{ for all } n > n_0 \}$
  • 49. 49 Asymptotic Notations
      - $n = O(n)$, $n = O(n^2)$, $n = O(n^3)$, …
      - Choosing a function $g(n)$ closer to $f(n)$ is more informative.
      - You may ask: why not use $f(n)$ itself? After all, $f(n) = O(f(n))$ is the best choice. The difficulty is that when we analyze an algorithm we may not even know the exact $f(n)$. In some cases we can only obtain an upper bound.
  • 50. 50 Asymptotic Notations
      - Thm 1.2: If $f(n) = a_m n^m + \dots + a_1 n + a_0$, then $f(n) = O(n^m)$.
      - Pf: Let $a = \max\{|a_m|, \dots, |a_0|\} + 1$. Then
          - $f(n) < a n^m + \dots + a n + a < (m+1)\,a\,n^m$ for $n > 1$.
          - So choosing $c = (m+1)a$ and $n_0 = 1$ works.
      - So, again, drop the constants and minor terms.
  • 51. 51 Asymptotic Notations
      - We also have a notation for lower bounds.
      - $f(n) = \Omega(g(n))$ if for some $c > 0$ and $n_0 > 0$, $f(n) \ge c\,g(n)$ for all $n > n_0$.
      - $n^2 = \Omega(2n)$; choose $c = 1/2$ and $n_0 = 1$.
      - $2^n = \Omega(n^{100})$; choose $c = 1$ and $n_0 = 1000$.
      - You can prove that if $f(n) = O(g(n))$, then $g(n) = \Omega(f(n))$.
  • 52. 52 Asymptotic Notations
      - Thm 1.3: If $f(n) = a_m n^m + \dots + a_1 n + a_0$ and $a_m > 0$, then $f(n) = \Omega(n^m)$.
      - Pf: Exercise.
      - Note that if $a_m < 0$, then $f(n) \ne \Omega(n^m)$. For example, $-n^2 + 1000n \ne \Omega(n^2)$, since for any $c > 0$ and $n_0 > 0$ we have $-n^2 + 1000n < c\,n^2$ for all sufficiently large $n$.
  • 53. 53 Asymptotic Notations
      - We also have a notation for "equivalence".
      - $f(n) = \Theta(g(n))$ if there exist $c_1 > 0$, $c_2 > 0$, and $n_0 > 0$ such that $c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n > n_0$.
      - $2n^2 + 100n = \Theta(n^2)$.
      - $n^2 \ne \Theta(n)$.
      - You can prove that $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.
  • 54. 54 Asymptotic Notations
      - Thm 1.4: If $f(n) = a_m n^m + \dots + a_1 n + a_0$ and $a_m > 0$, then $f(n) = \Theta(n^m)$.
      - Pf: Immediate from Thm 1.2 and Thm 1.3.
  • 55. 55 Time Complexity of Binary Search
      - At each iteration, the search range for binary search is reduced by about half.
      - So the number of iterations never exceeds $\lfloor \log_2 n \rfloor + 1$. Since each iteration takes $O(1)$ time, the time complexity is $O(\log n)$ in every case.
      - The worst-case time complexity is $\Omega(\log n)$. The worst case occurs when, e.g., the number being searched for is not in the array.
      - Therefore, the worst-case time complexity is $\Theta(\log n)$.
      - In the best case, one iteration suffices. The best-case time complexity is $\Theta(1)$.
  • 56. 56 Time Complexity of Binary Search
      - Note that the worst-case time complexity of a sequential search is $\Theta(n)$. The worst case occurs when, e.g., the number being searched for is not in the array.
      - So binary search is faster than sequential search, but it requires the array to be sorted.
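For comparison, a sequential search that reports how many elements it examined might look as follows. This is a sketch: the function name and the `comparisons` out-parameter are illustrative. When x is absent, every one of the n elements is examined.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential search in a[0..n-1] (the array need not be sorted).
   Returns the index of the first occurrence of x, or -1 if x is absent;
   *comparisons receives the number of elements examined. */
int seq_search(const int a[], int n, int x, int *comparisons)
{
    assert(n >= 0 && comparisons != NULL);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == x)
            return i;
    }
    return -1;
}
```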
  • 57. 57 Time Complexity of Binary Search
      - The worst-case time complexity of the recursive binary search can be stated elegantly as:
          - $T(n) = T(n/2) + \Theta(1)$ for $n > 1$; $T(1) = \Theta(1)$.
      - The $\Theta(1)$ term stands for some anonymous function $f(n)$ with $f(n) = \Theta(1)$. It accounts for the time needed in addition to the time taken by the recursive call.
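A recursive binary search whose running time follows this recurrence can be sketched as below. The signature, with explicit `left` and `right` bounds, is an assumption made for the illustration. Each call does a constant amount of work and then makes at most one recursive call on a range of about half the size.

```c
#include <assert.h>
#include <stddef.h>

/* Recursive binary search in the sorted range a[left..right].
   Returns the index of x, or -1 if x is not in the range. */
int rbin_search(const int a[], int left, int right, int x)
{
    assert(a != NULL);
    if (left > right)                     /* empty range: x is absent */
        return -1;
    int mid = left + (right - left) / 2;  /* Theta(1) work per call */
    if (a[mid] == x)
        return mid;
    if (a[mid] < x)
        return rbin_search(a, mid + 1, right, x);   /* upper half */
    return rbin_search(a, left, mid - 1, x);        /* lower half */
}
```

A search over a whole array of n elements starts with `rbin_search(a, 0, n - 1, x)`.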
  • 58. 58 Time Complexity of Binary Search
      - Note that any function $f(n) = \Theta(1)$ is bounded above by a constant and bounded below by a constant.
          - By the definition of $\Theta$, there exist $c > 0$ and $n_0 > 0$ such that $f(n) \le c$ for $n > n_0$. So $f(n) < \sup\{|f(n)| : n \le n_0\} + c$ for all $n$. Similarly, $f(n)$ is bounded below by a constant.
      - So $T(n) < T(n/2) + c$ for a suitable constant $c$.
  • 59. 59 Time Complexity of Binary Search
      - For simplicity let $n = 2^m$, so $m = \log n$. Then
          - $T(n) = T(2^m) < T(2^{m-1}) + c < T(2^{m-2}) + 2c < \dots < T(1) + mc = O(m)$
      - So $T(n) = O(\log n)$.
      - This may be seen as an initial guess. To confirm the answer we may use induction.
  • 60. 60 Time Complexity of Binary Search
      - Induction hypothesis: $T(m) < d \log m$ for all $2 \le m < n$.
          - We are free to choose the constant $d$ as long as it satisfies the base case, so we can choose $d$ larger than $c$.
      - Induction step: $T(n) < T(n/2) + c < d \log(n/2) + c = d \log n - d + c < d \log n$.
      - Base case: $T(2) < T(1) + c < c' < d \log 2$ for some constant $c'$.
          - We need to choose $d$ so that $d > c'$.
      - So $T(n) < d \log n$, implying $T(n) = O(\log n)$.
  • 61. 61 Selection Sort
      - Given several numbers, how do we sort them?
      - Find the smallest number and set it aside, then find the next smallest, and so on, until all the numbers are done.
  • 62. 62 Selection Sort

        sort(A, n)   /* Assume that A is indexed by 1..n */
          for i = 1 to n - 1 do
            find the index of the min. elem. in A[i..n];
            swap the min. elem. with A[i];

      - Observe that, at the end of the i-th iteration, A[j] holds the j-th smallest element of A, for all j ≤ i.
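The pseudocode translates to C roughly as follows. This is a sketch that uses 0-based indexing, as is usual in C, instead of the 1..n indexing assumed on the slide, and sorts an array of ints. The function name is illustrative.

```c
#include <assert.h>

/* Selection sort on a[0..n-1], ascending order.
   After the pass for index i, a[0..i] holds the i+1 smallest elements
   in sorted order. */
void selection_sort(int a[], int n)
{
    assert(n >= 0);
    for (int i = 0; i < n - 1; i++) {
        int min = i;                      /* index of the minimum of a[i..n-1] */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        int tmp = a[i];                   /* swap the minimum into position i */
        a[i] = a[min];
        a[min] = tmp;
    }
}
```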
  • 63. 63 Time Complexity of Selection Sort
      - Finding the minimum in A[i..n] takes $\Theta(n - i)$ time.
      - Total time: $\sum_{i=1}^{n-1} \Theta(n - i) = \Theta(n^2)$
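The sum can be worked out explicitly. Assuming, as a simplification, that finding the minimum of A[i..n] costs exactly $n - i$ comparisons, the total number of comparisons is:

```latex
\sum_{i=1}^{n-1} (n - i) \;=\; (n-1) + (n-2) + \cdots + 1 \;=\; \frac{n(n-1)}{2} \;=\; \Theta(n^2)
```

The dominant term is $n^2/2$, so by Thm 1.4 the total is $\Theta(n^2)$ regardless of the constant hidden in each $\Theta(n - i)$.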
  • 64. 64 Comparison of Two Strategies
      - Suppose that you want to do searches in A.
      - If the search will be performed only once, then a sequential search is good enough.
      - If the search is to be done very frequently (much more than n times), then it is worth paying $n^2$ time to sort the array first (preprocessing), so that every subsequent search can be a binary search.
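The trade-off can be illustrated with a rough cost model. Everything here is an assumption made for the sketch: all constant factors are ignored, one sequential search is charged n steps, sorting is charged $n^2$ steps, and one binary search is charged $\lceil \log_2 n \rceil$ steps. The function names are made up.

```c
#include <assert.h>

/* Smallest r with 2^r >= n, for n >= 1. */
static long ceil_log2(long n)
{
    long r = 0, p = 1;
    assert(n >= 1);
    while (p < n) {
        p *= 2;
        r++;
    }
    return r;
}

/* Model cost of k sequential searches in an unsorted array of size n. */
long cost_sequential(long n, long k)
{
    assert(n >= 1 && k >= 0);
    return k * n;
}

/* Model cost of sorting once (n^2) and then doing k binary searches. */
long cost_sort_then_binary(long n, long k)
{
    assert(n >= 1 && k >= 0);
    return n * n + k * ceil_log2(n);
}
```

Under this model with n = 1000, a single search costs 1000 steps sequentially against about a million with sorting, whereas 10000 searches cost ten million steps sequentially against roughly 1.1 million with sorting. The crossover lies a little above k = n, which matches the slide's "much more than n times".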