Lessons: Data Structure and Algorithm in C/C++

Data Structure and Algorithms. Lecturer: CHHAY Nuppakun. E-mail: [email protected]. Department of Computer Studies, Norton University - 2013

Upload: ngeam-soly

Post on 08-May-2015


DESCRIPTION

These are all the lessons for the course Data Structure and Algorithm in C/C++

TRANSCRIPT


Fundamental ideas of data structure and algorithm

Chapter 1

Read Ahead
- You are expected to read the lecture notes before the lecture.
- This will facilitate more productive discussion during class.
- Like in an English class.
- Also, please proofread assignments and tests.

Programs and programming
- What is a program?
  - A set of instructions working with data, designed to accomplish a specific task
  - The "recipe" analogy:
    - Ingredients are the data
    - Directions are the program statements
- What is programming?
  - The art and craft of writing programs
  - The art of controlling these "idiot servants" and "naïve children"

Introduction to Programming
- Programming is solving problems using computers
  - How to do it at all?
  - How to do it robustly?
  - How to do it effectively?
- Programming consists of two steps:
  - Algorithmic design (the architects)
  - Coding (the construction workers)
- Programming requires:
  - A programming language (C/C++/C#) to express your ideas
  - A set of tools to design, edit, and debug your code
  - A compiler to translate your programs into machine code
  - A machine to run the executable code

Crafting Programs Effectively
- Program design
  - design process
  - stepwise refinement & top-down design
  - bottom-up design
  - modularization, interfaces
  - use of abstractions
- Programming style
  - structured programming
  - readable code
  - effective use of language constructs
  - "formatting"
  - software organization
- Documentation and comments

Good Programs
- There are a number of facets to good programs: they must
  - run correctly
  - run efficiently
  - be easy to read and understand
  - be easy to debug, and
  - be easy to modify
- Better running times will generally be obtained from use of the most appropriate data structures and algorithms

Why Data Structure and Algorithms
- Computers are becoming ubiquitous ...
  - programming gets you more out of a computer
  - learn how to solve problems
  - deal with abstractions
  - be more precise
- Unfortunately, most people
  - know little about Computer Science
  - know little about programming
  - write bad or buggy programs
  - become lost when writing large programs

Algorithms and Data Structures
- Algorithm: a strategy for computing something, e.g.,
  - sorting: putting data in order by key
  - searching: finding data in some kind of index
  - finding primes and generating random numbers
  - string processing
  - graphics: drawing lines, arcs, and other geometric objects
- Data structure: a way to store data, e.g.,
  - arrays and vectors
  - linked lists
- The two are related:
  - data structures organize data
  - algorithms use that organization

What are computers?
- "idiot servants" that can do simple operations incredibly fast if you tell them every step to do
- like little children in their need for specific and detailed instruction
- computers are not "brains" and are not "smart": they are only as good as the program they are running

Computer Environment: Hardware
- Hardware
  - the physical, tangible parts of a computer
  - e.g., CPU, storage, keyboard, monitor
- [Figure: monitor, keyboard, main memory, central processing unit, CD-ROM, hard disk]
  - Central processing unit: chip that executes program commands, e.g., Intel Pentium IV, Sun Sparc, Transmeta
  - Main memory: primary storage area for programs and data; also called RAM

Computer Environment: Software
- Operating system
  - e.g., Linux, Mac OS X, Windows 2000, Windows XP
  - manages resources such as CPU, memory, and disk
  - controls all machine activities
- Application programs
  - generic term for any other kind of software
  - compilers, word processors, missile control systems, games

Operating System
- What does an OS do?
  - hides low-level details of the bare machine
  - arbitrates competing resource demands
- Useful attributes
  - multi-user
  - multi-tasking
- [Figure: user program, operating system, CPU, disk, network]

Review of C++ Essentials

Chapter 2

Main Program and Library Files

<preprocessor directives>
<global data and function declarations>
int main( )
{
    <local data declarations>
    <statements>
    return 0;
}
<main program function implementation>

Program Comments

/*
   <multiline comments>
*/

// <end-of-line comments>

C++ Data Types
- simple
  - integral
  - enum
  - floating: float, double, long double
- address
  - pointer
  - reference
- structured

Simple Data Types
- int
- float
- char
- long, double, unsigned

Variables
- <data type> <list of identifiers>;
- <data type> <identifier> = <initial value>;

Symbolic Constants

const float PI = 3.141592653589793238;
const int UPPER_BOUND = 100;
const char BLANK = ' ';

Expressions and Assignment
- Operators: +, -, *, /, %, =, <, <=, >, >=, ==, !=, &&, ||, !, ( )
- Examples:
    a = b = c = 5;
    ((a = b) = c) = 5;   // ?
    a == 0;
    a = 0;

Type Conversion

<type name> (<expression>)   or   (<type name>) <expression>

Example:
    int (3.14) returns 3
    (float) 3 returns 3.0

Interactive I/O

cout << "Enter an int, a float, and a string, "
     << "separated by spaces";
cin >> int_value >> float_value >> string_value;

Functions

double pow(double base, double exponent);
cout << pow(5, 2) << endl;    // function call

*********************************

void hello_world( )
{
    cout << "Hello World" << endl;
}

hello_world( );    // call to a void function

Selection
- if
- if ... else
- switch

Iteration
- for (<initialization>; <termination>; <update>) { <statements> }
- while (<condition>) { <statements> }
- do { <statements> } while (<condition>);

User Defined Type
- Using typedef
    typedef int boolean;
- Using enum
    enum weekday {MON, TUE, WED, THUR, FRI};
    enum primary_color {RED, YELLOW, BLUE};
    weekday day = MON;
    primary_color color = RED;

Structured Data Types
- Arrays
- Strings
- Structs
- Files

Why do we need an array?

#include <iostream>
using namespace std;

int value0;
int value1;
int value2;
...
int value999;

cin >> value0;
cin >> value1;
...
cin >> value999;
cout << value0;
cout << value1;
cout << value2;
...
cout << value999;

Array Declaration
- <type> <ArrayName>[Size];
- Example: int value[1000];

Multidimensional Array Declaration
- <type> <ArrayName>[index0][...][indexN];
- Examples:
    int hiTemp[52][7];
    int ThreeD[10][10][5];

Accessing an Array
- Array initialization
    for (I = 0; I <= 999; I++)
        value[I] = 2 * I - 1;
- Each of an array's elements can be accessed in sequence by varying an array index variable within a loop
- Multidimensional arrays can be accessed with nested loops.

Algorithms

Chapter 3

Algorithm
- Definition
  - A step-by-step procedure for solving a problem in a finite amount of time
- Pseudo-code
  - A compact and informal high-level description of a computer programming algorithm that uses the structural conventions of a programming language

Algorithms (Continued)

The term "algorithm" is used in computer science to describe a problem-solving method suitable for implementation as a computer program:

1. Most algorithms of interest involve methods of organizing the data involved in the computation. Objects created in this way are called data structures, so algorithms and data structures go hand in hand.
2. When we use a computer to help us solve a problem, whether small or huge, we quickly become motivated to devise methods that use time or space as efficiently as possible.
3. Careful algorithm design is an extremely effective part of the process of solving a huge problem, whatever the application area.

Algorithms (Continued)

4. When a huge or complex computer program is to be developed, a great deal of effort must go into understanding and defining the problem to be solved. In most cases, however, there are a few algorithms whose choice is critical because most of the system resources will be spent running those algorithms.
5. The sharing of programs in computer systems is becoming more widespread, yet the need to reimplement basic algorithms arises frequently: we are often faced with completely new computing environments (hardware and software) with new features that old implementations may not use to best advantage. Reimplementing also makes our solutions more portable and longer lasting.
6. The choice of the best algorithm for a particular task can be a complicated process, perhaps involving sophisticated mathematical analysis. The branch of computer science that comprises the study of such questions is called analysis of algorithms.

Analysis of Algorithms
- Analysis is the key to being able to understand algorithms sufficiently well
- Analysis plays a role at every point in the process of designing and implementing algorithms
- Mathematical analysis can play a role in the process of comparing the performance of algorithms
- The following are among the reasons that we perform mathematical analysis of algorithms:
  - To compare different algorithms for the same task
  - To predict performance in a new environment
  - To set values of algorithm parameters

Growth of Functions
- Most algorithms have a primary parameter N that affects the running time most significantly
- The parameter N might be
  - the degree of a polynomial
  - the size of a file to be sorted or searched
  - the number of characters in a text string
  - or some other abstract measure of the size of the problem being considered
- The goal is to express the running time in terms of N, using mathematical formulas that are as simple as possible and that are accurate for large values of the parameters

Growth of Functions (Continued)

Algorithms typically have running times proportional to one of the following functions:

- 1: Most instructions of most programs are executed once or at most only a few times. If all the instructions of a program have this property, the program's running time is constant.
- log N: When the running time of a program is logarithmic, the program gets slightly slower as N grows. This running time commonly occurs in programs that solve a big problem by transformation into a series of smaller problems.
- N: When the running time of a program is linear, it is generally the case that a small amount of processing is done on each input element.
- N log N: The N log N running time arises when algorithms solve a problem by breaking it up into smaller subproblems, solving them independently, and then combining the solutions.

Growth of Functions (Continued)

- N^2: When the running time of an algorithm is quadratic, that algorithm is practical for use on only relatively small problems.
- N^3: Similarly, an algorithm that processes triples of data items (perhaps in a triple nested loop) has a cubic running time and is practical for use on only small problems.
- 2^N: Few algorithms with exponential running time are likely to be appropriate for practical use, even though such algorithms arise naturally as brute-force solutions to problems.

The running time of a particular program is likely to be some constant multiplied by one of these terms (the leading term) plus some smaller terms.

Running Time
- Most algorithms transform input objects into output objects.
- The running time of an algorithm typically grows with the input size.
- Average-case time is often difficult to determine.
- We focus on the worst-case running time.
  - Easier to analyze
  - Crucial to applications such as games, finance and robotics

Experimental Studies
- Write a program implementing the algorithm
- Run the program with inputs of varying size and composition
- Use a function, like the built-in clock() function, to get an accurate measure of the actual running time
- Plot the results

Limitations of Experiments
- It is necessary to implement the algorithm, which may be difficult
- Results may not be indicative of the running time on other inputs not included in the experiment
- In order to compare two algorithms, the same hardware and software environments must be used

Algorithm Analysis

c = a + b;

Operands: c, a, b
Operators: +, =

Simple model of computation steps:
- load operands (fetch time for c, a, b)
- perform operations (operation time for + and =)
- so the above instruction needs 3 T_fetch + 1 T_+ + 1 T_store

Algorithm Analysis

int num = 25;
    Operands: num; constant: 25; operator: =
    Time needed: 1 T_fetch + 1 T_store

n >= i;
    Operands: n, i; operator: >=
    Time needed: 2 T_fetch + 1 T_>=

++i;    (same as i = i + 1;)
    Time needed: 2 T_fetch + 1 T_+ + 1 T_store

Algorithm Analysis

Exercises
1- cout << i;
2- area = l * w;
3- C = 5/9 * (F-32);
4- return i;
5- *p = &a;

Computing running time

Arithmetic series summation (e.g.)

1- unsigned int Sum (unsigned int n)
2- {
3-     unsigned int result = 0;
4-     for (int i = 1; i <= n; i++)
5-         result += i;
6-     return result;
7- }

Computing running time of the program

Statement   Time                                  Code
3           T_fetch + T_store                     result = 0;
4a          T_fetch + T_store                     i = 1;
4b          (2 T_fetch + T_<) * (n+1)             i <= n;
4c          (2 T_fetch + T_+ + T_store) * n       i++;
5           (3 T_fetch + T_+ + T_store) * n       result += i;
6           T_fetch + T_return                    return result;
Total       (7 T_fetch + 2 T_+ + 2 T_store + T_<) * n
            + (5 T_fetch + 2 T_store + T_< + T_return)

Big-Oh Notation
- The mathematical artifact that allows us to suppress detail when we are analyzing algorithms is called the O-notation, or "big-Oh notation"
- Definition 1: A function g(N) is said to be O(f(N)) if there exist constants c0 and N0 such that g(N) < c0 f(N) for all N > N0.
- We use the O-notation for three distinct purposes:
  - To bound the error that we make when we ignore small terms in mathematical formulas
  - To bound the error that we make when we ignore parts of a program that contribute a small amount to the total being analyzed
  - To allow us to classify algorithms according to upper bounds on their total running times

Big-Oh Notation (Continued)
- Often, the results of a mathematical analysis are not exact, but rather are approximate in a precise technical sense
- The O-notation allows us to keep track of the leading terms while ignoring smaller terms when manipulating approximate mathematical expressions
- For example, if we expand the expression
    (N + O(1)) (N + O(log N) + O(1)),
  we get six terms:
    N^2 + O(N) + O(N log N) + O(log N) + O(N) + O(1),
  but we can drop all but the largest O-term, leaving the approximation
    N^2 + O(N log N).
  That is, N^2 is a good approximation to this expression when N is large.
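Written out term by term, the expansion multiplies each of the two terms in the first factor by each of the three terms in the second. The following is a sketch of that step, using the usual rules that a function times an O-term is the O of the product and that smaller O-terms are absorbed by the largest one:

```latex
\begin{align*}
(N + O(1))\,(N + O(\log N) + O(1))
  &= N \cdot N + N \cdot O(\log N) + N \cdot O(1) \\
  &\quad + O(1) \cdot N + O(1) \cdot O(\log N) + O(1) \cdot O(1) \\
  &= N^2 + O(N \log N) + O(N) + O(N) + O(\log N) + O(1) \\
  &= N^2 + O(N \log N)
\end{align*}
```

Because (N log N) / N^2 tends to 0 as N grows, the remaining O-term becomes negligible relative to N^2.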

Another Example
- What if the input size is 10,000?
  - Algorithm 1: 1,000,000
  - Algorithm 2: 100,000,000
- Conclusion
  - Algorithm 1 is better!
- Question: which one is REALLY better?
  - Confused!
- Reason
  - Too precise!
- Solution
  - Big-O notation: the order of the algorithm
  - A rougher measurement
  - Measure the growth rate, ignoring the constants and smaller terms
  - Better algorithms have lower growth rates
- Remember
  - The order of an algorithm generally is more important than the speed of the processor (CPU)
  - Why?

Data Structure

Chapter 4

Data Structure
- Definition
  - A data structure is a collection of data, generally organized so that items can be stored and retrieved by some fixed techniques
- Example
  - An array
  - Stored and retrieved based on an index assigned to each item

Data Structures vs. Software
- How they are related
  - Software is designed to help people solve problems in reality
  - To solve the problems, there are some THINGS, or INFOs, in reality to be processed
  - Those THINGS or INFOs are called DATA
  - DATA and their RELATIONS can be complicated

Data Structures vs. Software
- How they are related
  - Reasonable organization of DATA helps improve software efficiency and decreases software design difficulty
  - Experience accumulated in the past will be taught in this course in the form of certain DATA STRUCTURES, such as the linked list and the binary tree
  - A DATA STRUCTURE is a smart way to organize DATA; it depends on the features of the DATA and on how the DATA are processed

Phases of Software Development
- Phases
  - Specification of the task
  - Design of a solution
  - Implementation of the solution
  - Analysis of the solution
  - Testing and debugging
  - Maintenance and evolution of the system
  - Obsolescence

Phases of Software Development
- Features of the phases
  - NOT a fixed sequence
    - For example, in a widely used OO DESIGN method, the Unified Process (UP), there are many iterations, and each iteration involves specification, design, implementation and testing. Feedback from the previous iteration helps improve the next iteration
    - You can find other examples in the textbook
  - Most phases are independent of programming languages
    - We will use C/C++ for IMPLEMENTATION
    - However, most of what we learn in this course applies to other languages

Arrays
- The most fundamental data structure is the array
- An array is a fixed number of data items that are stored contiguously and that are accessible by an index
- A simple example of the use of an array, which prints out all the prime numbers less than 1000:

    #include <iostream>
    using namespace std;

    const int N = 1000;

    int main( )
    {
        int i, j, a[N+1];
        // assume every number from 2 to N is prime until a factor is found
        for (a[1] = 0, i = 2; i <= N; i++) a[i] = 1;
        // cross out every product i*j that does not exceed N
        for (i = 2; i <= N/2; i++)
            for (j = 2; j <= N/i; j++) a[i*j] = 0;
        // the indices still marked 1 are the primes
        for (i = 1; i <= N; i++)
            if (a[i]) cout << i << ' ';
        cout << '\n';
        return 0;
    }

Arrays (Continue)

The primary feature of arrays is that if the index is known, any item can be accessed in constant time.

The size of the array must be known before the array is used; it is possible, however, to declare the size of an array at execution time.

Arrays are fundamental data structures in that they have a direct correspondence with memory systems on virtually all computers.

The entire computer memory can be viewed as an array, with the memory addresses corresponding to array indices.

Linked Lists

The second elementary data structure to consider is the linked list.

The primary advantage of linked lists over arrays is that:
  linked lists can grow and shrink in size during their lifetime
  their maximum size need not be known in advance
  this makes it possible to have several data structures share the same space

Linked Lists (Continue)

A second advantage of linked lists is that they provide flexibility in allowing the items to be rearranged efficiently.

This flexibility is gained at the expense of quick access to any arbitrary item in the list.

A linked list is a set of items organized sequentially, just like an array.

[Figure: A linked list holding the items A, L, I, S, T]

Linked Lists (Continue)

Flexible space use
  Dynamically allocate space for each element as needed
  Include a pointer to the next item
Linked list
  Each node of the list contains
    the data item (an object pointer in our ADT)
    a pointer to the next node

[Figure: a node with Data and Next fields; Data points to the object]

Linked Lists (Continue)

The Collection structure has a pointer to the list head
  Initially NULL
Add first item
  Allocate space for node
  Set its data pointer to object
  Set Next to NULL
  Set Head to point to new node

[Figure: the Collection's Head points to the node; the node's Data points to object]

Linked Lists (Continue)

Add second item
  Allocate space for node
  Set its data pointer to object
  Set Next to current Head
  Set Head to point to new node

[Figure: the Collection's Head points to the new node, whose Data points to object2 and whose Next points to the first node; the first node's Data points to object]
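The add-to-head steps listed on the last two slides can be sketched in C++ as follows. This is a minimal sketch rather than the course's own code: the names `Node`, `Collection`, `add_to_head`, `length`, and `clear` are assumptions, and a `void*` stands in for the "object pointer in our ADT".

```cpp
#include <cstddef>

// Hypothetical node: a pointer to the data item plus a pointer to the next node.
struct Node {
    void* data;
    Node* next;
};

// Hypothetical collection: only a pointer to the list head, initially NULL.
struct Collection {
    Node* head = nullptr;
};

// Add an item at the head: allocate a node, set its data pointer,
// set Next to the current Head, then set Head to point to the new node.
void add_to_head(Collection& c, void* object) {
    Node* n = new Node;
    n->data = object;
    n->next = c.head;   // NULL for the first item, the old head afterwards
    c.head = n;
}

// Count the nodes by following Next pointers from the head.
std::size_t length(const Collection& c) {
    std::size_t count = 0;
    for (Node* p = c.head; p != nullptr; p = p->next) ++count;
    return count;
}

// Free every node; the objects themselves stay owned by the caller.
void clear(Collection& c) {
    while (c.head != nullptr) {
        Node* n = c.head;
        c.head = n->next;
        delete n;
    }
}
```

Because the first item is added with Head still NULL, the same function covers both the "first item" and "second item" cases.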

Linked Lists (Continue)

[Figure: A linked list with its dummy nodes: head, A, L, I, S, T, z]

[Figure: Rearranging a linked list: head, A, L, I, S, T, z becomes head, T, A, L, I, S, z]

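Linked Lists (Continue)

[Figure: Insertion into and deletion from a linked list. Three panels: the list head, A, L, I, S, T, z together with a new item X; the list after X is inserted between I and S; and a deletion panel showing the items A, L, I, X, S, T]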
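Insertion and deletion each come down to changing one or two links. The following is a minimal sketch, assuming a list with a dummy `head` node in front and a dummy `z` node at the end as in the figures; the names `List`, `insert_after`, `delete_next`, and `contents` are hypothetical.

```cpp
#include <string>

struct Node {
    char key;
    Node* next;
};

struct List {
    Node* head;  // dummy node before the first item
    Node* z;     // dummy node after the last item; its link points to itself

    List() : head(new Node), z(new Node) {
        head->next = z;
        z->next = z;
    }
    List(const List&) = delete;
    List& operator=(const List&) = delete;
    ~List() {
        while (head->next != z) delete_next(head);
        delete head;
        delete z;
    }

    // Insert key v right after node t: the new node takes over t's link.
    Node* insert_after(char v, Node* t) {
        Node* x = new Node;
        x->key = v;
        x->next = t->next;
        t->next = x;
        return x;
    }

    // Unlink and free the node that follows t (no effect at the end of the list).
    void delete_next(Node* t) {
        Node* x = t->next;
        if (x == z) return;
        t->next = x->next;
        delete x;
    }
};

// Collect the keys from the first real item up to, but not including, z.
std::string contents(const List& l) {
    std::string s;
    for (Node* p = l.head->next; p != l.z; p = p->next) s += p->key;
    return s;
}
```

Neither operation moves any of the other items, which is the flexibility the slides contrast with arrays.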

Linked Lists - LIFO and FIFO

Single Linked List
  One-way cursor
  Can only move forward
Simplest implementation
  Add to head
  Last-In-First-Out (LIFO) semantics
Modifications
  First-In-First-Out (FIFO)
  Keep a tail pointer

[Figure: a singly linked list with head and tail pointers]
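The FIFO modification can be sketched as follows: items are still removed at the head, but a tail pointer lets new items be added at the other end in constant time. The names `LinkedQueue`, `put`, and `get` are assumptions made for this sketch.

```cpp
#include <cassert>

struct Node {
    int item;
    Node* next;
};

// Hypothetical singly linked FIFO list: remove at the head, add at the tail.
struct LinkedQueue {
    Node* head = nullptr;
    Node* tail = nullptr;

    bool empty() const { return head == nullptr; }

    // Add at the tail; the tail pointer avoids walking the whole list.
    void put(int v) {
        Node* n = new Node{v, nullptr};
        if (empty()) head = n;
        else tail->next = n;
        tail = n;
    }

    // Remove at the head, so the oldest item comes out first.
    int get() {
        assert(!empty());
        Node* n = head;
        int v = n->item;
        head = n->next;
        if (head == nullptr) tail = nullptr;
        delete n;
        return v;
    }
};
```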

Linked Lists - Doubly linked

Doubly linked lists
  Can be scanned in both directions
  Two-way cursor
  Can move forward and backward

[Figure: a doubly linked list with head and tail pointers; each node also has a prev link]

Linked List vs. Array

Arrays are better at random access
  What is the 4th element in the list?
  Arrays need O(1) time
  Linked lists need O(n) time in the worst case

Linked lists are better at additions and removals at a cursor
  Operations at the cursor need O(1) time
  Arrays don’t have a cursor, so addition and removal operations need O(n) time in the worst case

Linked List vs. Array

Resizing can be inefficient for an array
  For arrays, capacity must be maintained in an inefficient way
  For linked lists, no problem

Summary
  Array
    Frequent random access operations
  Linked lists
    Operations occur at a cursor
    Frequent capacity changes
    Operations occur at a two-way cursor (DLL)

Storage Allocation

Arrays are a rather direct representation of the memory of the computer.
One direct-array representation of linked lists is to use "parallel arrays".
The advantage of using parallel arrays is that the structure can be built "on top of" the data: the array key contains data and only data; all the structure is in the parallel array next.
More data can be added with more parallel arrays.
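A parallel-array list might look like the sketch below. The array names `key` and `next` come from the slide; everything else (the struct `ParallelList`, the capacity `MAX`, the choice of slots 0 and 1 for the dummy nodes, and the free-slot counter `x`) is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>

struct ParallelList {
    static const int MAX = 100;   // hypothetical capacity
    char key[MAX + 2];            // data and only data
    int next[MAX + 2];            // all the structure: index of the following item
    int head, z, x;               // dummy head, dummy end node, next free slot

    ParallelList() : head(0), z(1), x(2) {
        next[head] = z;
        next[z] = z;
    }

    // Store v in the next free slot and link it in after position t.
    int insert_after(char v, int t) {
        assert(x < MAX + 2);
        key[x] = v;
        next[x] = next[t];
        next[t] = x;
        return x++;
    }

    // Skip over the item that follows position t.
    void delete_next(int t) { next[t] = next[next[t]]; }

    // Follow the next indices from the head until the end node z.
    std::string contents() const {
        std::string s;
        for (int t = next[head]; t != z; t = next[t]) s += key[t];
        return s;
    }
};
```

In this sketch a deleted slot is simply skipped and never reused; reclaiming freed slots would need a separate free list.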

Pushdown Stacks

The most important restricted-access data structure is the pushdown stack. Items are added and removed in a Last In First Out (LIFO) manner.

Two basic operations are involved: one can push an item onto the stack (insert it at the beginning) and pop an item (remove it from the beginning).

Pushdown stacks appear as the fundamental data structure for many algorithms.

The stack is represented with an array stack and a pointer p to the top of the stack. The functions push, pop, and empty are straightforward implementations of the basic stack operations.
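The array representation described above can be sketched as a small class. The member names `stack` and `p` and the operations `push`, `pop`, and `empty` follow the slide; the default capacity of 100 and the assertion checks are assumptions added for this sketch.

```cpp
#include <cassert>

class Stack {
public:
    explicit Stack(int max = 100) : stack(new int[max]), capacity(max), p(0) {}
    ~Stack() { delete[] stack; }
    Stack(const Stack&) = delete;
    Stack& operator=(const Stack&) = delete;

    // Push: store the item at position p, then move p up.
    void push(int v) {
        assert(p < capacity);
        stack[p++] = v;
    }

    // Pop: move p down and return the item that was on top.
    int pop() {
        assert(p > 0);
        return stack[--p];
    }

    bool empty() const { return p == 0; }

private:
    int* stack;     // the array holding the items
    int capacity;   // number of slots in the array
    int p;          // index of the first free slot above the top item
};
```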

Stack Example - Math Parser

Define Parser
  9 * ( 3 + 5 ) * (4 + 2) = ?
  Why not 10?
In INFIX notation
Convert to Postfix using a STACK
  9 3 5 + * 4 2 + *
Then compute using a STACK
  Answer: 432

Infix -> Postfix Algorithm

9 * ( 3 + 5 ) * (4 + 2) = ?
Only worrying about +, *, and ()
Initialize Stack
If you get a #, output it
If you get an operator, entries are popped and output until we get a lower priority, and then the operator is pushed
If you get a ‘(’, push it
If you get a ‘)’, pop and output operators until you clear a ‘(’
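The conversion rules can be sketched in C++ as follows. This is one possible reading of the slide, not a definitive implementation: the function names `priority` and `infix_to_postfix` are assumptions, only +, *, parentheses, and non-negative integers are handled, and an operator on the stack is popped when its priority is greater than or equal to the incoming one.

```cpp
#include <cctype>
#include <stack>
#include <string>

// '*' binds tighter than '+'; '(' gets 0 so no operator ever pops it.
int priority(char op) {
    if (op == '*') return 2;
    if (op == '+') return 1;
    return 0;
}

std::string infix_to_postfix(const std::string& infix) {
    std::stack<char> ops;
    std::string out;
    for (std::size_t i = 0; i < infix.size(); ++i) {
        char c = infix[i];
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // A number: copy all of its digits straight to the output.
            while (i < infix.size() &&
                   std::isdigit(static_cast<unsigned char>(infix[i])))
                out += infix[i++];
            out += ' ';
            --i;
        } else if (c == '(') {
            ops.push(c);
        } else if (c == ')') {
            // Pop and output operators until the matching '(' is cleared.
            while (!ops.empty() && ops.top() != '(') {
                out += ops.top();
                out += ' ';
                ops.pop();
            }
            if (!ops.empty()) ops.pop();   // discard the '('
        } else if (c == '+' || c == '*') {
            // Pop until the top of the stack has lower priority, then push.
            while (!ops.empty() && priority(ops.top()) >= priority(c)) {
                out += ops.top();
                out += ' ';
                ops.pop();
            }
            ops.push(c);
        }
        // Any other character (a space, for example) is skipped.
    }
    while (!ops.empty()) {   // end of input: pop until the stack is empty
        out += ops.top();
        out += ' ';
        ops.pop();
    }
    if (!out.empty()) out.erase(out.size() - 1);   // drop the trailing space
    return out;
}
```

With these rules the example 9 * ( 3 + 5 ) * (4 + 2) becomes 9 3 5 + * 4 2 + *. The ordering 9 3 5 + 4 2 + * *, which performs the two multiplications in the other order, has the same value.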

Infix -> Postfix

Start: 9 * ( 3 + 5 ) * (4 + 2) = ?
[Figure: the first token, 9, is a number, so it is sent to the output. Output: 9]

End: 9 * ( 3 + 5 ) * (4 + 2) = ?
[Figure: at the end of the input, pop until the stack is empty. Output: 9 3 5 + * 4 2 + *]

Calculate Postfix

9 3 5 + 4 2 + * *
Given a #, push it
Given an operator
  Pop the top two #s
  Apply the operator
  Push result back onto stack

Calculate Postfix: step-by-step trace of 9 3 5 + 4 2 + * *

Step                            Stack afterwards (bottom to top)
Push 9                          9
Push 3                          9 3
Push 5                          9 3 5
Pop two numbers (5 and 3)       9
Apply +                         9
Push result (8)                 9 8
Push 4                          9 8 4
Push 2                          9 8 4 2
Pop 2 and 4, add                9 8
Push result (6)                 9 8 6
Pop 6 and pop 8, multiply       9
Push result (48)                9 48
Pop 48 and pop 9, multiply      (empty)

Answer: 432
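The evaluation rule (push numbers; on an operator, pop two numbers, apply it, and push the result) can be sketched as below. The function name `evaluate_postfix` and the error handling are assumptions; only + and * on non-negative integers are supported, matching the example.

```cpp
#include <cctype>
#include <stack>
#include <stdexcept>
#include <string>

long evaluate_postfix(const std::string& postfix) {
    std::stack<long> values;
    for (std::size_t i = 0; i < postfix.size(); ++i) {
        char c = postfix[i];
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // Given a number, read all of its digits and push it.
            long n = 0;
            while (i < postfix.size() &&
                   std::isdigit(static_cast<unsigned char>(postfix[i])))
                n = n * 10 + (postfix[i++] - '0');
            --i;
            values.push(n);
        } else if (c == '+' || c == '*') {
            // Given an operator, pop the top two numbers, apply it, push the result.
            if (values.size() < 2)
                throw std::invalid_argument("operator needs two numbers");
            long right = values.top(); values.pop();
            long left = values.top(); values.pop();
            values.push(c == '+' ? left + right : left * right);
        }
        // Spaces and any other characters are skipped.
    }
    if (values.size() != 1)
        throw std::invalid_argument("malformed postfix expression");
    return values.top();
}
```

Calling `evaluate_postfix("9 3 5 + 4 2 + * *")` follows exactly the steps of the trace above and returns 432.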

Using Stacks

Computer Architecture
Operating Systems
Event Planning (Networking, OS)
Computer Graphics (Scene graphs)
Compilers, Parsers

Queues

Another fundamental restricted-access data structure is called the queue.
Two basic operations are involved: one can insert (add) an item at one end of the queue and remove an item from the other end.
Queues obey a "first in, first out" (FIFO) discipline.
There are three class variables: the size of the queue and two indices, one to the beginning of the queue (head) and one to the end (tail).
If head and tail are equal, then the queue is defined to be empty; but if put would make them equal, then it is defined to be full.
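The three class variables and the empty/full convention can be sketched as a circular array. The operation names `put` and `get` and the variables `size`, `head`, and `tail` follow the slide; the class name `Queue`, the helper `next`, and the extra array slot that keeps "full" distinguishable from "empty" are assumptions of this sketch.

```cpp
#include <cassert>

class Queue {
public:
    // One extra slot is allocated so that a full queue never has head == tail.
    explicit Queue(int max = 100)
        : queue(new int[max + 1]), size(max), head(0), tail(0) {}
    ~Queue() { delete[] queue; }
    Queue(const Queue&) = delete;
    Queue& operator=(const Queue&) = delete;

    bool empty() const { return head == tail; }
    // Full when one more put would make head and tail equal.
    bool full() const { return next(tail) == head; }

    // Insert at the tail end.
    void put(int v) {
        assert(!full());
        queue[tail] = v;
        tail = next(tail);
    }

    // Remove from the head end, so the oldest item comes out first.
    int get() {
        assert(!empty());
        int v = queue[head];
        head = next(head);
        return v;
    }

private:
    // Advance an index, wrapping around to 0 past the last slot.
    int next(int i) const { return i == size ? 0 : i + 1; }

    int* queue;
    int size;
    int head;
    int tail;
};
```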

Applications of Queues

Direct applications
  Waiting lines
  Access to shared resources (e.g., printer)
  Multiprogramming
Indirect applications
  Auxiliary data structure for algorithms
  Component of other data structures

Queue Example

You: Bank of America employee
Boss: How many tellers do I need?
How do you go about solving this problem?
  Simulations!
  What are the parameters?

Bank Teller Example

Classes
Data structures
Input
  Time step = 5 sec
  Transaction = 2 minutes
  Customer Frequency = 50% chance every 15 seconds
What questions do we want to know?
  Average wait time
  Average line length
How a simulation would work

More Queue examples

Networking: Router
Computer Architecture: Execution Units
Printer queues
File systems
Wal-Mart checkout lines
Disney entrance

Recursion

Two Necessary Parts
  Recursive calls
  Stopping or base cases
Infinite recursion
  Every recursive call produces another recursive call
  Stopping case not well defined, or not reached
Very useful technique
  Definition of mathematical functions
  Definition of data structures
  Recursive structures are naturally processed by recursive functions!
Recursively defined functions
  factorial
  Fibonacci
  GCD by Euclid’s algorithm
  Games
  Towers of Hanoi

Recurrences

The factorial function is defined by the formula
  N! = N · (N - 1)!, for N ≥ 1, with 0! = 1.

This corresponds directly to the following simple recursive program:

    int factorial(int N)
    {
        if (N == 0) return 1;         // termination condition
        return N * factorial(N-1);    // the function calls itself
    }

This program illustrates the basic features of a recursive program: it calls itself, and it has a termination condition in which it directly computes its result.

Recurrences (Continue)

A well-known recurrence relation is the one that defines the Fibonacci numbers:
  F(N) = F(N-1) + F(N-2), for N ≥ 2, with F(0) = F(1) = 1

The recurrence corresponds directly to the simple recursive program:

    int fibonacci(int N)
    {
        if (N <= 1) return 1;    // F(0) = F(1) = 1
        return fibonacci(N-1) + fibonacci(N-2);
    }

This is an even less convincing example of the "power" of recursion than factorial: the recursive calls indicate that F(N-1) and F(N-2) should be computed independently, although F(N-2) is itself needed to compute F(N-1).

Recurrences (Continue)

The relationship between recursive programs and recursively defined functions is often more philosophical than practical.

The factorial function really could be implemented with a loop, and the Fibonacci function is better handled by storing all precomputed values in an array.
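Both alternatives can be sketched in a few lines. This is a minimal sketch under stated assumptions: the bound `MAX` on the Fibonacci index is hypothetical, `long long` is used only to delay overflow, and F(0) = F(1) = 1 as in the recurrence given earlier.

```cpp
#include <cassert>

const int MAX = 25;   // hypothetical upper bound on N for fibonacci()

// Factorial with a loop instead of recursion.
long long factorial(int N) {
    long long t = 1;
    for (int i = 2; i <= N; ++i) t *= i;
    return t;
}

// Fibonacci with F(0) = F(1) = 1, keeping every computed value in an array
// so that each F(i) is computed once instead of being recomputed.
int fibonacci(int N) {
    assert(N >= 0 && N <= MAX);
    int F[MAX + 1];
    F[0] = 1;
    F[1] = 1;
    for (int i = 2; i <= N; ++i) F[i] = F[i - 1] + F[i - 2];
    return F[N];
}
```

The loop version of `fibonacci` does N - 1 additions, whereas the recursive version repeats the same subcomputations many times.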

Divide-and-Conquer

Most of the recursive programs we consider use two recursive calls, each operating on about half the input. This is called the "divide and conquer" paradigm for algorithm design.

Divide-and-conquer is a general algorithm design paradigm:
  Divide: divide the input data S into two or more disjoint subsets S1, S2, …
  Recur: solve the subproblems recursively
  Conquer: combine the solutions for S1, S2, …, into a solution for S

Divide-and-Conquer (Continue)

A divide-and-conquer recursive program is a straightforward way to accomplish our objective, drawing the marks on a ruler:

    void rule(int l, int r, int h)
    {
        int m = (l+r)/2;    // middle of the interval
        if (h > 0)
        {
            // With mark() between the two calls this is the inorder version;
            // with mark() before both calls it is the preorder version.
            rule(l, m, h-1);
            mark(m, h);
            rule(m, r, h-1);
        }
    }

The idea behind the method is the following: to make the marks in an interval, first make the long mark in the middle. This divides the interval into two halves, and the shorter marks in each half are made with the same procedure.

    rule(0,8,3)
      mark(4,3)
      rule(0,4,2)
        mark(2,2)
        rule(0,2,1)
          mark(1,1)
          rule(0,1,0)
          rule(1,2,0)
        rule(2,4,1)
          mark(3,1)
          rule(2,3,0)
          rule(3,4,0)
      rule(4,8,2)
        mark(6,2)
        rule(4,6,1)
          mark(5,1)
          rule(4,5,0)
          rule(5,6,0)
        rule(6,8,1)
          mark(7,1)
          rule(6,7,0)
          rule(7,8,0)

Drawing a ruler (preorder version) in detail: the list of procedure calls and marks resulting from the call rule(0, 8, 3). We mark the middle and call rule for the left half, then do the same for the left half, and so forth, until a mark of length 0 is called for. Eventually we return from rule and mark right halves in the same way.

    rule(0,8,3)
      rule(0,4,2)
        rule(0,2,1)
          rule(0,1,0)
          mark(1,1)
          rule(1,2,0)
        mark(2,2)
        rule(2,4,1)
          rule(2,3,0)
          mark(3,1)
          rule(3,4,0)
      mark(4,3)
      rule(4,8,2)
        rule(4,6,1)
          rule(4,5,0)
          mark(5,1)
          rule(5,6,0)
        mark(6,2)
        rule(6,8,1)
          rule(6,7,0)
          mark(7,1)
          rule(7,8,0)

Drawing a ruler (inorder version).

In general, divide-and-conquer algorithms involve doing some work to split the input into two pieces, or to merge the results of processing two independent "solved" portions of the input, or to help things along after half of the input has been processed.
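The two call traces can be reproduced with a small experiment. In this sketch `mark` does not draw anything; it is a hypothetical stand-in that appends "position:height" to a string, and the names `marks`, `rule_preorder`, and `rule_inorder` are assumptions.

```cpp
#include <string>

// Hypothetical stand-in for drawing: log each mark as "position:height ".
std::string marks;
void mark(int m, int h) {
    marks += std::to_string(m) + ":" + std::to_string(h) + " ";
}

// Preorder version: make the middle mark first, then handle both halves.
void rule_preorder(int l, int r, int h) {
    int m = (l + r) / 2;
    if (h > 0) {
        mark(m, h);
        rule_preorder(l, m, h - 1);
        rule_preorder(m, r, h - 1);
    }
}

// Inorder version: left half, then the middle mark, then the right half.
void rule_inorder(int l, int r, int h) {
    int m = (l + r) / 2;
    if (h > 0) {
        rule_inorder(l, m, h - 1);
        mark(m, h);
        rule_inorder(m, r, h - 1);
    }
}
```

For the call with arguments (0, 8, 3), the preorder version logs the marks in the order 4, 2, 1, 3, 6, 5, 7 and the inorder version logs them left to right, 1 through 7. Both produce the same set of marks.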

Divide-and-Conquer (Continue)

A nonrecursive algorithm, which does not correspond to any recursive implementation, is to draw the shortest marks first, then the next shortest, etc.

    void rule(int l, int r, int h)
    {
        int i, j, t;
        // i is the mark height; j doubles each pass, so marks of height i are 2*j apart.
        for (i = 1, j = 1; i <= h; i++, j += j)
            for (t = 0; l+j+t*(j+j) < r; t++)    // stay inside the interval
                mark(l+j+t*(j+j), i);
    }

Combine and conquer: a method of algorithm design where we solve a problem by first solving trivial subproblems, then combining those solutions to solve slightly bigger subproblems, etc., until the whole problem is solved.
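The shortest-marks-first order can be checked with the same kind of logging experiment. This is a sketch under assumptions: `mark` is a hypothetical stand-in that records "position:height" instead of drawing, and the inner loop is written to stop as soon as the next mark would reach r.

```cpp
#include <string>

// Hypothetical stand-in for drawing: log each mark as "position:height ".
std::string marks;
void mark(int m, int h) {
    marks += std::to_string(m) + ":" + std::to_string(h) + " ";
}

// Bottom-up ruler: all marks of height 1, then height 2, and so on.
void rule(int l, int r, int h) {
    // j is the offset of the first mark of height i; such marks are 2*j apart.
    for (int i = 1, j = 1; i <= h; i++, j += j)
        for (int t = 0; l + j + t * (j + j) < r; t++)
            mark(l + j + t * (j + j), i);
}
```

For the interval from 0 to 8 with height 3, this makes the height-1 marks at 1, 3, 5, 7, then the height-2 marks at 2 and 6, then the height-3 mark at 4.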

TREES

Chapter 5

TREES GLOSSARY

The structures considered so far are one-dimensional: one item follows the other. We now consider two-dimensional linked structures called trees.
Trees are encountered frequently in everyday life.
A tree is a nonempty collection of vertices and edges:
  A vertex is a simple object (also referred to as a node)
  An edge is a connection between two vertices
A path in a tree is a list of distinct vertices in which successive vertices are connected by edges in the tree.
One node in the tree is designated as the root. The defining property of a tree is that there is exactly one path between the root and each of the other nodes.
If there is more than one path between the root and some node, or if there is no path between the root and some node, then what we have is a graph, not a tree.

TREES

In computer science, a tree is an abstract model of a hierarchical structure.
Nodes with no children are sometimes called leaves, or terminal nodes.
Nodes with at least one child are sometimes called nonterminal nodes.
We refer to nonterminal nodes as internal nodes and terminal nodes as external nodes.
Applications:
  Organization charts
  File systems
  Programming environments

[Figure: A sample tree, with nodes labeled by the letters of "A SAMPLE TREE"]

TREES (Continue)

The nodes in a tree divide themselves into levels: the level of a node is the number of nodes on the path from the node to the root (not counting the node itself).

The height of a tree is the maximum level among all nodes in the tree (or the maximum distance to the root from any node).

The path length of a tree is the sum of the levels of all the nodes in the tree (or the sum of the lengths of the paths from each node to the root).

The tree in the figure of slide No 3 has height 3 and path length 21.
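Height and path length can both be computed with a short recursive walk. The sketch below is an illustration only: the names `TreeNode`, `height`, `path_length`, and `sample_tree` are assumptions, and the tree shape is an assumed one with 1, 3, 3, and 4 nodes on levels 0 to 3, chosen so that it has height 3 and path length 21 as stated above. The real figure's shape may differ.

```cpp
#include <algorithm>
#include <vector>

struct TreeNode {
    char label;
    std::vector<TreeNode> children;
};

// Height: the maximum level among all nodes, with the root on level 0.
int height(const TreeNode& t) {
    int h = 0;
    for (const TreeNode& c : t.children) h = std::max(h, 1 + height(c));
    return h;
}

// Path length: the sum of the levels of all the nodes in the tree.
int path_length(const TreeNode& t, int level = 0) {
    int sum = level;
    for (const TreeNode& c : t.children) sum += path_length(c, level + 1);
    return sum;
}

// An assumed 11-node tree: 3 nodes on level 1, 3 on level 2, 4 on level 3.
TreeNode sample_tree() {
    return TreeNode{'E', {
        TreeNode{'A', {TreeNode{'A', {}}, TreeNode{'S', {}}}},
        TreeNode{'R', {TreeNode{'T', {TreeNode{'M', {}}, TreeNode{'P', {}},
                                      TreeNode{'L', {}}, TreeNode{'E', {}}}}}},
        TreeNode{'E', {}}
    }};
}
```

For this shape the level sums are 3·1 + 3·2 + 4·3 = 21, and the deepest nodes are on level 3.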

Binary Trees

A binary tree has nodes, similar to nodes in a linked list structure.

Data of one sort or another may be stored at each node.

But it is the connections between the nodes which characterize a binary tree.

A Binary Tree of States

In this example, the data contained at each node is one of the 50 states.

Each tree has a special node called its root, usually drawn at the top.

A Binary Tree of States

Each node is permitted to have two links to other nodes, called the left child and the right child.

Some nodes have only one child.

Arkansas has a left child, but no right child.

A Binary Tree of States

A node with no children is called a leaf.

Each node is called the parent of its children.

Washington is the parent of Arkansas and Colorado.

A Binary Tree of States

Two rules about parents:
  The root has no parent.
  Every other node has exactly one parent.

A Binary Tree of States

Two nodes with the same parent are called siblings.

Arkansas and Colorado are siblings.

Complete Binary Trees

A complete binary tree is a special kind of binary tree which will be useful to us.

When a complete binary tree is built, its first node must be the root.

The second node of a complete binary tree is always the left child of the root...

... and the third node is always the right child of the root.

The next nodes must always fill the next level from left to right.

Binary Tree

Consists of
  Node
  Left and Right sub-trees
  Both sub-trees are binary trees

Each sub-tree is itself a binary tree.

Trees - Performance: Find

Complete Tree

Height, h
  Nodes traversed in a path from the root to a leaf

Number of nodes, n
  $n = 1 + 2^1 + 2^2 + \dots + 2^h = 2^{h+1} - 1$
  $h = \lfloor \log_2 n \rfloor$

Trees - Performance: Find

Complete Tree

Since we need at most h+1 comparisons, find is O(h+1), or O(log n).

Same as binary search.

Summary

Binary trees contain nodes.
Each node may have a left child and a right child.
If you start from any node and move upward, you will eventually reach the root.
Every node except the root has one parent. The root has no parent.
Complete binary trees require the nodes to fill in each level from left to right before starting the next level.

PROPERTIES

Property 1 - There is exactly one path connecting any two nodes in a tree
  Any two nodes have a least common ancestor
  It follows that any node can be the root: each node in a tree has the property that there is exactly one path connecting that node with every other node in the tree
Property 2 - A tree with N nodes has N - 1 edges
  Each node, except the root, has a unique parent, and every edge connects a node to its parent
Property 3 - A binary tree with N internal nodes has N + 1 external nodes
  A binary tree with no internal nodes has one external node
  For N > 0, if the left subtree has k internal nodes, then the left subtree has k + 1 external nodes and the right subtree (with N - 1 - k internal nodes) has N - k external nodes, for a total of N + 1

PROPERTIES (Continue)

Property 4 - The external path length of any binary tree with N internal nodes is 2N greater than the internal path length
  Start with the binary tree consisting of one external node
  The process starts with a tree with internal and external path length both 0 and, for each of N steps, increases the external path length by 2 more than the internal path length
Property 5 - The height of a full binary tree with N internal nodes is about $\log_2 N$
  If the height is n, then we must have $2^{n-1} < N + 1 \le 2^n$, since there are N + 1 external nodes

Representing Binary Trees

The most prevalent representation of binary trees is a straightforward use of records with two links per node.

For some applications, it may be appropriate to have two different types of records, one for internal nodes and one for external nodes; for others, it may be appropriate to use just one type of node and to use the links in external nodes for some other purpose.

The parse tree for an expression is defined by the simple recursive rule: "put the operator at the root and then put the tree for the expression corresponding to the first operand on the left and the tree corresponding to the expression for the second operand on the right."

125125

Representing Binary Trees (Continue)

[Figure: Parse tree for A * ( ( ( B + C ) * ( D * E ) ) + F )]

This is also the parse tree for A B C + D E * * F + * (the same expression in postfix). Infix and postfix are two ways to represent arithmetic expressions; parse trees are a third.

Representing Binary Trees (Continue)

[Figure: Building the parse tree for A B C + D E * * F + *]
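The construction in the figure can be sketched with a stack of subtrees: operands are pushed as one-node trees, and each operator pops two subtrees and becomes their root. This is only a sketch under assumptions: the names ParseNode, buildFromPostfix and toInfix are hypothetical, operands are single letters or digits, and every operator is binary.

```cpp
#include <cctype>
#include <memory>
#include <stack>
#include <string>
#include <utility>

// Hypothetical node type: one record with two links, used for both
// operators (internal nodes) and operands (external nodes, links left null).
struct ParseNode {
    char token;
    std::unique_ptr<ParseNode> left, right;
};

// Build a parse tree from a postfix string; spaces are ignored.
// Returns null if the input is not a well-formed postfix expression.
std::unique_ptr<ParseNode> buildFromPostfix(const std::string& postfix) {
    std::stack<std::unique_ptr<ParseNode>> pending;
    for (char c : postfix) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        auto node = std::make_unique<ParseNode>();
        node->token = c;
        if (!std::isalnum(static_cast<unsigned char>(c))) {
            // Operator: the two most recent subtrees become its operands.
            if (pending.size() < 2) return nullptr;
            node->right = std::move(pending.top()); pending.pop();
            node->left = std::move(pending.top()); pending.pop();
        }
        pending.push(std::move(node));
    }
    if (pending.size() != 1) return nullptr;
    auto root = std::move(pending.top());
    return root;
}

// A fully parenthesized inorder walk recovers an infix form of the tree.
std::string toInfix(const ParseNode* t) {
    if (t == nullptr) return "";
    if (!t->left && !t->right) return std::string(1, t->token);
    return "(" + toInfix(t->left.get()) + " " + t->token + " " +
           toInfix(t->right.get()) + ")";
}
```

Calling buildFromPostfix("A B C + D E * * F + *") and then toInfix on the result gives back a parenthesized version of the infix expression from the earlier slide.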

There are two other commonly used solutions. One option is to use a different type of node for external nodes, one with no links. Another option is to mark the links in some way (to distinguish them from other links in the tree), then have them point elsewhere in the tree.


TRAVERSING TREES

How to traverse a tree, that is, how to systematically visit every node: there are a number of different ways to proceed.

The first method to consider is preorder traversal. The method is defined by the simple recursive rule: "Visit the root, then visit the left subtree, then visit the right subtree."

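void traverse(struct node *t)
  {
    stack.push(t);                        // stack, visit and z are not defined on this slide
    while (!stack.empty())
      {
        t = stack.pop(); visit(t);        // pop() is taken to return the removed node
        if (t->r != z) stack.push(t->r);  // right is pushed first so that left is visited first
        if (t->l != z) stack.push(t->l);  // z marks an external node
      }
  }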
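The program above depends on a stack class, a visit function and the node z from elsewhere in the course. A self-contained version might look like the following sketch, which assumes std::stack, a hypothetical Node type with an int key, null pointers in place of z, and "visiting" a node by recording its key.

```cpp
#include <stack>
#include <vector>

// Hypothetical node type for illustration; null links stand in for z.
struct Node {
    int key;
    Node* l;
    Node* r;
};

// Nonrecursive preorder: visit a node, then its left subtree, then its right.
std::vector<int> preorder(Node* root) {
    std::vector<int> visited;
    if (root == nullptr) return visited;
    std::stack<Node*> pending;
    pending.push(root);
    while (!pending.empty()) {
        Node* t = pending.top();
        pending.pop();
        visited.push_back(t->key);                 // "visit"
        if (t->r != nullptr) pending.push(t->r);   // pushed first, popped last
        if (t->l != nullptr) pending.push(t->l);
    }
    return visited;
}
```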

TRAVERSING TREES (Continue)

[Figure: Preorder traversal]

TRAVERSING TREES (Continue)

The second method to consider is inorder traversal, sometimes called symmetric order. It is defined by the recursive rule: "visit the left subtree, then visit the root, then visit the right subtree."

The implementation of a stack-based program for inorder is almost identical to the above program.

This method of traversal is probably the most widely used.
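One common way to write a stack-based inorder traversal is sketched here. It is not necessarily the variant the lecturer had in mind; the Node type and the function name are hypothetical, and null pointers mark external nodes.

```cpp
#include <stack>
#include <vector>

// Hypothetical node type for illustration.
struct Node {
    int key;
    Node* l;
    Node* r;
};

// Nonrecursive inorder: left subtree, then the node itself, then right subtree.
std::vector<int> inorder(Node* root) {
    std::vector<int> visited;
    std::stack<Node*> pending;
    Node* t = root;
    while (t != nullptr || !pending.empty()) {
        while (t != nullptr) {       // descend left, saving each node on the way
            pending.push(t);
            t = t->l;
        }
        t = pending.top();           // leftmost node not yet visited
        pending.pop();
        visited.push_back(t->key);   // "visit"
        t = t->r;                    // then handle its right subtree
    }
    return visited;
}
```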

TRAVERSING TREES (Continue)

[Figure: Inorder traversal]

TRAVERSING TREES (Continue)

The third method to consider is postorder traversal. It is defined by the recursive rule: "visit the left subtree, then visit the right subtree, then visit the root."

Implementation of a stack-based program for postorder is more complicated than for the other two because one must arrange for the root and the right subtree to be saved while the left subtree is visited and for the root to be saved while the right subtree is visited.
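The extra bookkeeping can be handled in several ways. The sketch below uses one of them: it keeps each root on the stack until its right subtree is done, and remembers the last node visited to tell whether that has happened. The Node type and names are hypothetical.

```cpp
#include <stack>
#include <vector>

// Hypothetical node type for illustration.
struct Node {
    int key;
    Node* l;
    Node* r;
};

// Nonrecursive postorder: left subtree, right subtree, then the node itself.
std::vector<int> postorder(Node* root) {
    std::vector<int> visited;
    std::stack<Node*> pending;
    Node* t = root;
    Node* last = nullptr;                  // most recently visited node
    while (t != nullptr || !pending.empty()) {
        if (t != nullptr) {
            pending.push(t);               // save the root while going left
            t = t->l;
        } else {
            Node* top = pending.top();
            if (top->r != nullptr && top->r != last) {
                t = top->r;                // right subtree still to be visited
            } else {
                visited.push_back(top->key);   // both subtrees done: "visit"
                last = top;
                pending.pop();
            }
        }
    }
    return visited;
}
```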

TRAVERSING TREES (Continue)

[Figure: Postorder traversal]

TRAVERSING TREES (Continue)

The fourth method to consider is level-order traversal. It is not recursive at all: simply visit the nodes as they appear on the page, reading down from top to bottom and from left to right, so that all the nodes on each level appear together.

Level-order traversal can be achieved by using the program above for preorder, with a queue instead of a stack:

void traverse(struct node *t)
  {
    queue.put(t);                         // queue, visit and z are not defined on this slide
    while (!queue.empty())
      {
        t = queue.get(); visit(t);
        if (t->l != z) queue.put(t->l);   // left goes in first, so it comes out first
        if (t->r != z) queue.put(t->r);
      }
  }
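A self-contained version of the queue-based program might look like this sketch. It assumes std::queue in place of the course's queue class, a hypothetical Node type, and null pointers instead of z.

```cpp
#include <queue>
#include <vector>

// Hypothetical node type for illustration.
struct Node {
    int key;
    Node* l;
    Node* r;
};

// Level-order traversal: nodes come out top to bottom, left to right.
std::vector<int> levelOrder(Node* root) {
    std::vector<int> visited;
    if (root == nullptr) return visited;
    std::queue<Node*> pending;
    pending.push(root);
    while (!pending.empty()) {
        Node* t = pending.front();
        pending.pop();
        visited.push_back(t->key);                 // "visit"
        if (t->l != nullptr) pending.push(t->l);   // first in, first out
        if (t->r != nullptr) pending.push(t->r);
    }
    return visited;
}
```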

TRAVERSING TREES (Continue)

[Figure: Level order traversal]

Heaps

A heap is a certain kind of complete binary tree.

When a complete binary tree is built, its first node must be the root.

[Figure: a complete binary tree with a single node, the root]

Heaps

Complete binary tree.

The second node is always the left child of the root.

The third node is always the right child of the root.

The next nodes always fill the next level from left to right.

[Figure: a complete binary tree showing the left child of the root and the right child of the root]

Heaps

A heap is a certain kind of complete binary tree.

Each node in a heap contains a key that can be compared to other nodes' keys.

The "heap property" requires that each node's key is >= the keys of its children.

[Figure: example heap with 45 at the root and the keys 35, 23, 27, 21, 22, 4, 19 below it]

Adding a Node to a Heap

Put the new node in the next available spot.

Push the new node upward, swapping with its parent until the new node reaches an acceptable location.

[Figure: the key 42 is added to the example heap in the next available spot and then pushed upward]

Adding a Node to a Heap

The new node stops moving upward when:
- the parent has a key that is >= the new node's key, or
- the node reaches the root.

The process of pushing the new node upward is called reheapification upward.

[Figure: the heap after 42 has been pushed upward, containing the keys 45, 42, 23, 35, 21, 22, 4, 19, 27]
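Reheapification upward can be sketched in a few lines if the heap is kept in an array. The layout is an assumption made here: the tree is stored level by level in a std::vector starting at index 0, so the parent of index i is (i - 1) / 2. The function name heapPush is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Add key to a max-heap stored level by level in a vector (index 0 = root).
void heapPush(std::vector<int>& heap, int key) {
    heap.push_back(key);                     // next available spot
    std::size_t i = heap.size() - 1;
    while (i > 0) {                          // stop if the node reaches the root
        std::size_t parent = (i - 1) / 2;
        if (heap[parent] >= heap[i]) break;  // parent's key is >= new key: stop
        std::swap(heap[parent], heap[i]);    // otherwise swap with the parent
        i = parent;
    }
}
```

With the example keys stored as 45, 35, 23, 27, 21, 22, 4, 19, adding 42 swaps it past 27 and 35 and stops below 45.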

Removing the Top of a Heap

Move the last node onto the root.

Push the out-of-place node downward, swapping with its larger child until the out-of-place node reaches an acceptable location.

[Figure: the heap after the last node, 27, has been moved onto the root]

Removing the Top of a Heap

The out-of-place node stops moving downward when:
- the children all have keys <= the out-of-place node's key, or
- the node reaches a leaf.

The process of pushing the out-of-place node downward is called reheapification downward.

[Figure: the heap after reheapification downward, with 42 at the root]
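Reheapification downward can be sketched the same way. As before, the array layout is an assumption (level by level in a std::vector from index 0, so the children of index i are 2i + 1 and 2i + 2), and the name heapPopMax is hypothetical. The caller must make sure the heap is not empty.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Remove and return the largest key of a non-empty max-heap stored in a vector.
int heapPopMax(std::vector<int>& heap) {
    int top = heap.front();
    heap.front() = heap.back();              // move the last node onto the root
    heap.pop_back();
    const std::size_t n = heap.size();
    std::size_t i = 0;
    while (true) {
        std::size_t left = 2 * i + 1, right = left + 1, largest = i;
        if (left < n && heap[left] > heap[largest]) largest = left;
        if (right < n && heap[right] > heap[largest]) largest = right;
        if (largest == i) break;             // children are all <=, or node is a leaf
        std::swap(heap[i], heap[largest]);   // swap with the larger child
        i = largest;
    }
    return top;
}
```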

Implementing a Heap

Data from the root goes in the first location of the array.

Data from the next row goes in the next two array locations.

[Figure: a heap with the keys 42, 35, 23, 27, 21 beside an array of data whose first three locations hold 42, 35, 23]

Implementing a Heap

Data from the next row goes in the next two array locations.

We don't care what's in the part of the array beyond the heap's last entry.

[Figure: the same heap beside an array of data holding 42, 35, 23, 27, 21]
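Storing the tree row by row means the links never have to be stored: they can be computed from array positions. The sketch below assumes the first location is index 0; the helper names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Index arithmetic for a complete binary tree stored row by row from index 0.
std::size_t parentOf(std::size_t i) { return (i - 1) / 2; }  // only valid for i > 0
std::size_t leftOf(std::size_t i)   { return 2 * i + 1; }
std::size_t rightOf(std::size_t i)  { return 2 * i + 2; }

// Check the heap property: every node's key is <= its parent's key.
bool isMaxHeap(const std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i)
        if (a[parentOf(i)] < a[i]) return false;
    return true;
}
```

For the array 42, 35, 23, 27, 21, the children of 35 (index 1) are found at indices 3 and 4, which hold 27 and 21.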

Summary

A heap is a complete binary tree, where the entry at each node is greater than or equal to the entries in its children.

To add an entry to a heap, place the new entry at the next available spot, and perform a reheapification upward.

To remove the biggest entry, move the last node onto the root, and perform a reheapification downward.

SORTING

Chapter 6

Sorting

In numerous sorting applications, a simple algorithm may be the method of choice:
- we often use a sorting program only once, or just a few times
- elementary methods are always suitable for small files

As a rule, the elementary methods take time proportional to $N^2$ to sort N randomly arranged items. If N is small, this running time may be perfectly adequate.

SELECTION SORT

- Find the smallest element in the array, and exchange it with the element in the first position.
- Find the second smallest element and exchange it with the element in the second position.
- Continue in this way until the entire array is sorted.

- It works by repeatedly selecting the smallest remaining element.
- A disadvantage of selection sort is that its running time depends only slightly on the amount of order already in the file.

Selection sort

For each i from l to r-1, exchange a[i] with the minimum element in a[i], ..., a[r]. As the index i travels from left to right, the elements to its left are in their final position in the array (and will not be touched again), so the array is fully sorted when i reaches the right end.

template <class Item>
void selection(Item a[], int l, int r)
  { for (int i = l; i < r; i++)
      { int min = i;                      // index of the smallest element seen so far
        for (int j = i+1; j <= r; j++)
            if (a[j] < a[min]) min = j;
        exch(a[i], a[min]);               // exch swaps its two arguments; it is not defined on this slide
      }
  }
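The remark that the running time hardly depends on the existing order can be checked by counting comparisons. The following is a sketch only: it is a variant of the program above that works on a std::vector<int>, uses std::swap in place of exch, and has the hypothetical name selectionSortCounting.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort that also reports how many key comparisons it made.
std::size_t selectionSortCounting(std::vector<int>& a) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t min = i;
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            ++comparisons;                  // one comparison per inner step
            if (a[j] < a[min]) min = j;
        }
        std::swap(a[i], a[min]);
    }
    return comparisons;
}
```

For N items the inner loop always runs N(N-1)/2 times in total, so a sorted input and a reversed input cost the same number of comparisons.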


INSERTION SORT

The method people often use to sort bridge hands:
- consider the elements one at a time, inserting each into its proper place
- make space for the element being inserted by moving larger elements one position to the right
- then insert the element into the vacated position

[Figure: Insertion sort example]

During the first pass of insertion sort, the S in the second position is larger than the A, so it does not have to be moved. On the second pass, when the O in the third position is encountered, it is exchanged with the S to put A O S in sorted order, and so forth. Unshaded elements that are not circled are those that were moved one position to the right.

The running time of insertion sort primarily depends on the initial order of the keys in the input. For example, if the file is large and the keys are already in order (or even are nearly in order), then insertion sort is quick and selection sort is slow.

Insertion sort

The program first puts the smallest element in the array into the first position, so that that element can serve as a sentinel. Then, for each i, it sorts the elements a[l], ..., a[i] by moving one position to the right the elements in the sorted list a[l], ..., a[i-1] that are larger than a[i], then putting a[i] into its proper position.

template <class Item>
void insertion(Item a[], int l, int r)
  { int i;
    // compexch(x, y) exchanges x and y if y < x; it is not defined on this slide
    for (i = r; i > l; i--) compexch(a[i-1], a[i]);   // smallest element moves to a[l]
    for (i = l+2; i <= r; i++)
      { int j = i; Item v = a[i];
        while (v < a[j-1])                // the sentinel in a[l] stops this loop
          { a[j] = a[j-1]; j--; }
        a[j] = v;
      }
  }
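How strongly insertion sort depends on the initial order can be seen by counting how many elements get moved. The sketch below is a simpler variant than the program above: it works on a std::vector<int> and uses a bounds check (j > 0) instead of the sentinel pass. The name insertionSortCounting is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort that also reports how many one-position moves it made.
std::size_t insertionSortCounting(std::vector<int>& a) {
    std::size_t moves = 0;
    for (std::size_t i = 1; i < a.size(); ++i) {
        int v = a[i];
        std::size_t j = i;
        while (j > 0 && v < a[j - 1]) {   // bounds check replaces the sentinel
            a[j] = a[j - 1];              // move the larger element right
            --j;
            ++moves;
        }
        a[j] = v;                         // put v into the vacated position
    }
    return moves;
}
```

An input that is already in order needs no moves at all, while a reversed input of N items needs N(N-1)/2 of them.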


BUBBLE SORT

- Keep passing through the file, exchanging adjacent elements that are out of order, continuing until the file is sorted.
- Whether it is actually easier to implement than insertion or selection sort is arguable.
- Bubble sort generally will be slower than the other two methods.

Bubble Sort (Continue)

/* Bubble sort for integers */
#define SWAP(a,b) { int t; t=a; a=b; b=t; }

void bubble( int a[], int n )
{
    int i, j;
    for(i=0;i<n;i++)
    { /* n passes thru the array */
        /* From start to the end of unsorted part */
        for(j=1;j<(n-i);j++)
        { /* If adjacent items out of order, swap */
            if( a[j-1]>a[j] ) SWAP(a[j-1],a[j]);
        }
    }
}

[Figure: Bubble sort example]

Small keys percolate over to the left in bubble sort. As the sort moves from right to left, each key is exchanged with the one on its left until a smaller one is encountered. On the first pass, the E is exchanged with the L, the P, and the M before stopping at the A on the right; then the A moves to the beginning of the file, stopping at the other A, which is already in position. The ith smallest key reaches its final position after the ith pass, just as in selection sort, but other keys are moved closer to their final position, as well.

Bubble sort: $O(n^2)$ - Very simple code

Insertion sort: Slightly better than bubble sort - Fewer comparisons - Also $O(n^2)$

SEARCHING

Chapter 7

Searching

- The goal of the search is to find all records with keys matching a given search key.
- Applications of searching are widespread, and involve a variety of different operations.
- Two common terms often used to describe data structures for searching are dictionaries and symbol tables.
- Search programs are in widespread and frequent use, so it is worthwhile to study a variety of methods, beginning with those that store records in arrays that are either searched with key comparisons or indexed by key value.

Searching (Continue)

It is useful to think of search algorithms as belonging to packages implementing a variety of generic operations that can be separated from particular implementations, so that alternate implementations can be substituted easily. The operations of interest include:
- Initialize the data structure.
- Search for a record (or records) having a given key.
- Insert a new record.
- Delete a specified record.
- Join two dictionaries to make a large one.
- Sort the dictionary; output all the records in sorted order.

Searching (Continue)

A combined search-and-insert operation is often included for efficiency in situations where records with duplicate keys are not to be kept within the data structure.

Records with duplicate keys can be handled in several ways:
- insist that the primary searching data structure contain only records with distinct keys
- leave records with equal keys in the primary searching data structure and return any record with the given key for a search
- assume that each record has a unique identifier (apart from the key) and require that a search find the record with a given identifier, given the key
- arrange for the search program to call a specified function for each record with the given key

Sequential Searching

The simplest method for searching is to store the records in an array:
- When a new record is to be inserted, we put it at the end of the array.
- When a search is to be performed, we look through the array sequentially.

Sequential Searching (Continue)

Property 1 - Sequential search (array implementation) uses N + 1 comparisons for an unsuccessful search (always) and about N/2 comparisons for a successful search (on the average).

For unsuccessful search, this property follows directly from the code: each record must be examined to decide that a record with any particular key is absent. For successful search, if we assume that each record is equally likely to be sought, then the average number of comparisons is $(1 + 2 + \dots + N)/N = (N + 1)/2$, exactly half the cost of unsuccessful search.
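The code that Property 1 refers to is not on the slide. The counts can be reproduced with a sketch like the following, which assumes the usual sentinel trick: a copy of the search key is placed just past the last record, so the loop always stops and an unsuccessful search costs exactly N + 1 comparisons. The names SearchResult and sequentialSearch are hypothetical, and the vector is copied on each call only to keep the example short.

```cpp
#include <cstddef>
#include <vector>

struct SearchResult {
    bool found;
    std::size_t comparisons;
};

// Sequential search with a sentinel; counts every key comparison made.
SearchResult sequentialSearch(std::vector<int> records, int key) {
    const std::size_t n = records.size();
    records.push_back(key);              // sentinel in position n
    std::size_t i = 0;
    std::size_t comparisons = 1;         // the first test of the loop condition
    while (records[i] != key) {
        ++i;
        ++comparisons;                   // one more test for the next position
    }
    return {i < n, comparisons};         // stopping at the sentinel means "absent"
}
```

With the four records 10, 20, 30, 40, the successful searches cost 1, 2, 3 and 4 comparisons, an average of (4 + 1)/2, and any unsuccessful search costs 5.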

Sequential Searching (Continue)

Property 2 - Sequential search (sorted list implementation) uses about N/2 comparisons for both successful and unsuccessful search (on the average).

For successful search, the situation is the same as before. For unsuccessful search, if we assume that the search is equally likely to be terminated by the tail node z or by each of the elements in the list (which is the case for a number of "random" search models), then the average number of comparisons is the same as for successful search in a table of size N + 1, or $(N + 2)/2$.

Binary Search

Binary Search is an incredibly powerful technique for searching an ordered list.

The basic algorithm is to:
- find the middle element of the list
- compare it against the key
- decide which half of the list must contain the key
- and repeat with that half

Two requirements to support binary search:
- Random access of the list elements, so we need arrays instead of linked lists.
- The array must contain elements in sorted order by the search key.
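The basic algorithm can be sketched as follows, assuming the records are plain int keys held in a sorted std::vector. The function name and the convention of returning -1 for an absent key are choices made for this example.

```cpp
#include <vector>

// Returns the index of key in the sorted vector a, or -1 if it is absent.
int binarySearch(const std::vector<int>& a, int key) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;    // middle element of the current range
        if (a[mid] == key) return mid;   // compare it against the key
        if (a[mid] < key) lo = mid + 1;  // key can only be in the upper half
        else hi = mid - 1;               // key can only be in the lower half
    }
    return -1;                           // range is empty: key is not present
}
```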


Binary Search (Continue)

Property 3 - Binary search never uses more than lg N + 1 comparisons for either successful or unsuccessful search.

This follows from the fact that the subfile size is at least halved at each step: an upper bound on the number of comparisons satisfies the recurrence $C_N = C_{N/2} + 1$ with $C_1 = 1$, which implies the stated result.
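To see how the recurrence gives the bound, it can be unrolled for the special case $N = 2^n$ (an assumption made here only so that every division by 2 is exact):

```latex
\begin{aligned}
C_{2^n} &= C_{2^{n-1}} + 1 \\
        &= C_{2^{n-2}} + 2 \\
        &\;\;\vdots \\
        &= C_{2^0} + n \\
        &= n + 1 = \lg N + 1 .
\end{aligned}
```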

It is important to note that the time required to insert new records is high for binary search, because the array must be kept in sorted order.

Property 4 - Interpolation search uses fewer than lg lg N + 1 comparisons for both successful and unsuccessful search, in files of random keys.

This function is a very slowly growing one, which can be thought of as a constant for practical purposes: if N is one billion, lg lg N < 5. Thus, any record can be found using only a few accesses (on the average), a substantial improvement over binary search.
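The slides do not show interpolation search itself. One common formulation is sketched below: instead of always probing the middle, it estimates where the key should be by assuming the keys are spread evenly between the two ends of the current range. The function name and the int keys are assumptions for this example.

```cpp
#include <vector>

// Interpolation search in a sorted vector; returns the index of key or -1.
int interpolationSearch(const std::vector<int>& a, int key) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi && key >= a[lo] && key <= a[hi]) {
        if (a[hi] == a[lo]) return lo;   // all keys in range equal the key
        // Estimate the position, assuming evenly spread keys in [a[lo], a[hi]].
        long long offset = (static_cast<long long>(key) - a[lo]) * (hi - lo) /
                           (static_cast<long long>(a[hi]) - a[lo]);
        int mid = lo + static_cast<int>(offset);
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;                           // key lies outside the remaining range
}
```

On evenly spaced keys the first probe often lands on the record directly; on badly skewed keys the method can be much slower than binary search, which is why the property is stated for files of random keys.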


Binary Tree Search

Binary tree search is a simple, efficient dynamic searching method that qualifies as one of the most fundamental algorithms in computer science.

The defining property of a binary tree is that each node has left and right links.

[Figure: A binary search tree]

Binary Tree Search (Continue)

Property 5 - A search or insertion in a binary search tree requires about 2 ln N comparisons, on the average, in a tree built from N random keys.

For each node in the tree, the number of comparisons used for a successful search to that node is the distance to the root. The sum of these distances for all nodes is called the internal path length of the tree. Dividing the internal path length by N, we get the average number of comparisons for successful search. But if $C_N$ denotes the average internal path length of a binary search tree of N nodes, we have the recurrence

[Equation missing from the slide]

Property 6 - In the worst case, a search in a binary search tree with N keys can require N comparisons.

For example, when the keys are inserted in order (or in reverse order), the binary-tree search method is no better than the sequential search method that we saw at the beginning of this chapter.
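The worst case in Property 6 is easy to reproduce. The sketch below assumes a binary search tree in which smaller keys go to the left and all other keys to the right; the names BstNode, bstInsert and bstSearch are hypothetical, and a search counts one comparison per node it examines.

```cpp
#include <memory>

// Hypothetical node type: a key plus left and right links.
struct BstNode {
    int key;
    std::unique_ptr<BstNode> l, r;
    explicit BstNode(int k) : key(k) {}
};

// Insert key by following links until an empty (external) link is found.
void bstInsert(std::unique_ptr<BstNode>& root, int key) {
    std::unique_ptr<BstNode>* link = &root;
    while (*link) {
        link = key < (*link)->key ? &(*link)->l : &(*link)->r;
    }
    *link = std::make_unique<BstNode>(key);
}

// Search for key; comparisons receives the number of nodes examined.
bool bstSearch(const BstNode* t, int key, int& comparisons) {
    comparisons = 0;
    while (t != nullptr) {
        ++comparisons;
        if (key == t->key) return true;
        t = key < t->key ? t->l.get() : t->r.get();
    }
    return false;
}
```

Inserting the keys 1, 2, 3, 4, 5 in order produces a tree that is really a linked list, so finding 5 takes 5 comparisons; inserting the same keys in the order 3, 1, 4, 2, 5 lets the same search finish in 3.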