sorting classnotes aam

7/30/2019 Sorting Classnotes AAM


Introduction

Sorting and searching are fundamental operations in computer science. Sorting refers to the operation of arranging data in some given order. Searching refers to the operation of finding a particular record in the existing information. Normally, information retrieval involves searching, sorting and merging. In this chapter we will discuss the searching and sorting techniques in detail.

Sorting

Sorting is important in nearly every computer application. Sorting refers to arranging data elements in some given order. Many sorting algorithms are available to sort a given set of elements. We will now discuss two classes of sorting techniques and analyze their performance. The two techniques are:

internal sorting
external sorting

Internal sorting

Internal sorting takes place in the main memory of a computer. Internal sorting methods are applied to small collections of data; that is, the entire collection of data to be sorted is small enough that the sorting can take place within main memory. We will study the following methods of internal sorting:

Insertion sort
Selection sort
Merge sort
Radix sort
Quick sort
Heap sort
Bubble sort



Insertion sort

In this sort we read the given elements from 1 to n, inserting each element into its proper position. For example, consider a card player arranging the cards dealt to him: the player picks up each card and inserts it into its proper position. At every step, we insert the item into its proper place.

This sorting algorithm is frequently used when n is small. The insertion sort algorithm scans A from A[1] to A[N], inserting each element A[K] into its proper position in the previously sorted subarray A[1], A[2], . . . , A[K-1]. That is:

Pass 1. A[1] by itself is trivially sorted.

Pass 2. A[2] is inserted either before or after A[1] so that: A[1], A[2] is sorted.

Pass 3. A[3] is inserted into its proper place in A[1], A[2], that is, before A[1], between A[1] and A[2], or after A[2], so that: A[1], A[2], A[3] is sorted.

Pass 4. A[4] is inserted into its proper place in A[1], A[2], A[3] so that: A[1], A[2], A[3], A[4] is sorted.

Pass N. A[N] is inserted into its proper place in A[1], A[2], . . . , A[N - 1] so that: A[1], A[2], . . . , A[N] is sorted.



Algorithm INSERTION (A, N)

This algorithm sorts the array A with N elements.

1. Set A[0] := -∞. [Initializes sentinel element]
2. Repeat Steps 3 to 5 for K = 2, 3, ..., N:
3.     Set TEMP := A[K] and PTR := K-1.
4.     Repeat while TEMP < A[PTR]:
           (a) Set A[PTR+1] := A[PTR]. [Moves element forward]
           (b) Set PTR := PTR-1.
       [End of loop]
5.     Set A[PTR+1] := TEMP. [Inserts element in proper place]
   [End of Step 2 loop]
6. Return.
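The INSERTION steps above can be sketched in Python as follows. This is a minimal illustration, not the notes' own listing: it assumes 0-based indexing, returns a sorted copy, and replaces the sentinel in A[0] with an explicit bounds check. The function name `insertion_sort` is chosen here for illustration.

```python
def insertion_sort(a):
    """Return a sorted copy of `a` using insertion sort."""
    a = list(a)
    for k in range(1, len(a)):
        temp = a[k]          # element to insert (TEMP)
        ptr = k - 1          # last index of the sorted prefix (PTR)
        # Move larger elements one slot forward. The `ptr >= 0` check
        # plays the role of the sentinel stored in A[0].
        while ptr >= 0 and temp < a[ptr]:
            a[ptr + 1] = a[ptr]
            ptr -= 1
        a[ptr + 1] = temp    # insert in its proper place
    return a


print(insertion_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```

Because the strict comparison `temp < a[ptr]` never moves an element past an equal one, this version keeps equal keys in their original order.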

Selection sort

In this sort we find the smallest element in the list and put it in the first position. Then we find the second smallest element in the list and put it in the second position, and so on.

Pass 1. Find the location LOC of the smallest element in the list of N elements A[1], A[2], . . . , A[N], and then interchange A[LOC] and A[1]. Then A[1] is sorted.

Pass 2. Find the location LOC of the smallest element in the sublist of N - 1 elements A[2], A[3], . . . , A[N], and then interchange A[LOC] and A[2]. Then A[1], A[2] is sorted, since A[1] ≤ A[2].



Example

Suppose an array A contains 8 elements as follows: 77, 33, 44, 11, 88, 22, 66, 55

Algorithm

1. To find the minimum element

MIN (A, K, N, LOC)

An array A is in memory. This procedure finds the location LOC of the smallest element among A[K], A[K+1], ..., A[N].

1. Set MIN := A[K] and LOC := K. [Initializes pointers]
2. Repeat for J = K+1, K+2, ..., N:
       If MIN > A[J], then: Set MIN := A[J] and LOC := J.
3. Return.

2. To sort the elements

SELECTION (A, N)

1. Repeat Steps 2 and 3 for K = 1, 2, ..., N-1:
2.     Call MIN(A, K, N, LOC).
3.     [Interchange A[K] and A[LOC]]
       Set TEMP := A[K], A[K] := A[LOC] and A[LOC] := TEMP.
4. Exit.
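The MIN and SELECTION procedures above can be sketched in Python as follows. This is a minimal illustration under a few assumptions: indices are 0-based, `find_min` returns LOC instead of setting an output parameter, and the function names are chosen here for illustration.

```python
def find_min(a, k):
    """Return the index of the smallest element in a[k:]."""
    loc = k
    for j in range(k + 1, len(a)):
        if a[j] < a[loc]:
            loc = j
    return loc


def selection_sort(a):
    """Return a sorted copy of `a` using selection sort."""
    a = list(a)
    for k in range(len(a) - 1):
        loc = find_min(a, k)
        a[k], a[loc] = a[loc], a[k]   # interchange A[K] and A[LOC]
    return a


print(selection_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```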



Merge sort

Combining two lists is called merging. For example, suppose A is a sorted list with r elements and B is a sorted list with s elements. The operation that combines the elements of A and B into a single sorted list C with n = r + s elements is called merging. The two lists are combined in sorted order by using the following merging algorithm.

Suppose one is given two sorted decks of cards. The decks are merged as in Fig. That is, at each step, the two front cards are compared and the smaller one is placed in the combined deck. When one of the decks is empty, all of the remaining cards in the other deck are put at the end of the combined deck. Similarly, suppose we have two lines of students sorted by increasing heights, and suppose we want to merge them into a single sorted line. The new line is formed by choosing, at each step, the shorter of the two students who are at the head of their respective lines. When one of the lines has no more students, the remaining students line up at the end of the combined line.

The above discussion will now be translated into a formal algorithm which merges a sorted r-element array A and a sorted s-element array B into a sorted array C with n = r + s elements. First of all, we must always keep track of the locations of the smallest element of A and the smallest element of B which have not yet been placed in C. Let NA and NB denote these locations, respectively. Also, let PTR denote the location in C to be filled. Thus, initially, we set NA := 1, NB := 1 and PTR := 1. At each step of the algorithm, we compare A[NA] and B[NB] and assign the smaller element to C[PTR]. Then we increment PTR by setting PTR := PTR + 1, and we either increment NA by setting NA := NA + 1 or increment NB by setting NB := NB + 1, according to whether the new element in C has come from A or from B. Furthermore, if NA > r, then the remaining elements of B are assigned to C; or if NB > s, then the remaining elements of A are assigned to C.
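The merging procedure described above can be sketched in Python as follows. This is a minimal illustration, assuming 0-based indices and a list C that grows by appending, so PTR is implicit. The recursive `merge_sort` wrapper is an addition not spelled out in these notes: it is the usual top-down way to build a full sort out of the merge step.

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list."""
    c = []
    na, nb = 0, 0
    # Compare the front elements and take the smaller one.
    while na < len(a) and nb < len(b):
        if a[na] <= b[nb]:
            c.append(a[na])
            na += 1
        else:
            c.append(b[nb])
            nb += 1
    # One list is exhausted; the rest of the other goes to the end of C.
    c.extend(a[na:])
    c.extend(b[nb:])
    return c


def merge_sort(a):
    """Sort by splitting in half, sorting each half, and merging."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge([11, 33, 77], [22, 44, 55, 66, 88]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```

Taking from A when the two front elements are equal (`<=`) keeps the merge stable.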

Algorithm MERGING ( A, R, B, S, C)

Let A and B be sorted arrays with R and S elements. This algorithm

merges A and B into an array C with N = R + S elements.

1. [Initialize] Set NA := 1, NB := 1 and PTR := 1.

2. [Compare] Repeat while NA



Quick sort Algorithm

Quick sort is an example of the "divide and conquer" approach to solving problems.

Quick sort works by placing the last element of the array (the pivot) in its proper position by comparing it with the other elements, starting from the first end of the array.

The steps followed by the quick sort algorithm are as follows:

1. Choose the dividing (pivot) point in the array, i.e. the last element of the array.

2. Compare each element of the array with the pivot, starting from the beginning of the array. If an element is less than the pivot, place it on the left-hand side by exchanging elements; elements greater than the pivot stay on the right-hand side.

3. After completing the pass, exchange the pivot with the first element of the right-hand side, so that all elements to its left are smaller and all elements to its right are greater. With the pivot in place, divide the array into two parts.

4. Choose a pivot element in each of the two parts and sort them separately by repeating steps 1, 2 and 3.

5. Repeat the process until the whole array is sorted. Once each sub-array has been sorted recursively, the parts together form one sorted array.

The structure of the quick sort algorithm is as follows. Quick_sort first calls Partition, which sets the pointers that split the array:

Quick_sort(Array, S, Piv)
    If S < Piv
        Then q ← Partition(Array, S, Piv)
            Quick_sort(Array, S, q-1)
            Quick_sort(Array, q+1, Piv)

Partition(Array, S, Piv)
    x ← Array[Piv]
    i ← S - 1
    For j ← S to Piv-1
        do If Array[j] ≤ x
            Then i ← i + 1
                Exchange Array[i] ↔ Array[j]
    Exchange Array[i + 1] ↔ Array[Piv]
    Return i + 1 (the final position of the pivot)
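The Quick_sort and Partition procedures above can be sketched in Python as follows. This is a minimal illustration, assuming 0-based indices and in-place sorting of a Python list; the default arguments on `quick_sort` are a convenience added here.

```python
def partition(arr, s, piv):
    """Place arr[piv] in its final position within arr[s..piv].

    Returns the index where the pivot ends up.
    """
    x = arr[piv]
    i = s - 1
    for j in range(s, piv):
        if arr[j] <= x:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[piv] = arr[piv], arr[i + 1]
    return i + 1


def quick_sort(arr, s=0, piv=None):
    """Sort arr[s..piv] in place, using the last element as pivot."""
    if piv is None:
        piv = len(arr) - 1
    if s < piv:
        q = partition(arr, s, piv)
        quick_sort(arr, s, q - 1)
        quick_sort(arr, q + 1, piv)


data = [77, 33, 44, 11, 88, 22, 66, 55]
quick_sort(data)
print(data)
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```

With the last element as pivot, an already sorted input gives the most unbalanced splits, which is the worst case for this scheme.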



Bubble sort Algorithm

Bubble sort is also sometimes called sinking sort. The bubble sort algorithm works step by step, comparing each element with the next element and swapping them if they are out of order. This procedure repeats until all elements in the array are sorted.

Bubble sort gets its name because each comparison covers only a short range: just the next value in the array is checked and swapped, so larger elements gradually "bubble" toward the end of the array. Because it orders elements only by comparing pairs, it is also a comparison sort.

Now we will see the algorithm structure as follows:
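A minimal Python sketch of bubble sort, assuming 0-based indexing and a sorted copy as the result. The early exit when a pass makes no swaps is a common refinement added here, not something stated in these notes, and the name `bubble_sort` is chosen for illustration.

```python
def bubble_sort(a):
    """Return a sorted copy of `a` using bubble sort."""
    a = list(a)
    # After each pass the largest remaining element sits at index `end`.
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:
            break   # no swaps in this pass: already sorted
    return a


print(bubble_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```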



Radix Sort

Radix sorting involves looking at a radix (or digit) of a number and placing the number in an array of linked lists to sort it.

Algorithm for radix sorting:

1. Look at the rightmost digit.
2. Assign the full number to that digit's index.
3. Look at the next digit to the left, working FROM the current sorted array. IF there is no digit, pad a 0.
4. REPEAT STEP 3 UNTIL all numbers have been sorted.

Let's see a step-by-step example of a radix sort of the following set of unsorted numbers. The rightmost digit of each number is the first digit to look at when sorting the list. Each number is appended to the end of the linked list at that digit's index in the array.

212 21 72 5 431 898 616 24 9

Step 1:

0
1  21 -> 431
2  212 -> 72
3
4  24
5  05
6  616
7
8  898
9  09

Step 2: (working from step 1)

0  005 -> 009
1  212 -> 616
2  021 -> 024
3  431
4
5
6
7  072
8
9  898

Step 3: (working from step 2)

0  5 -> 9 -> 21 -> 24 -> 72
1
2  212
3
4  431
5
6  616
7
8  898
9

Step 3 is the final step and the list is sorted.

One benefit of radix sort is that it can be carried out with pencil and paper. It also needs only a fixed data structure (an array of size 10). The downside of radix sort is that it takes time to carry out, since you may have to go through numerous steps to sort the list, depending on how many numbers you have to sort and how many digits they have.

Here is another example of radix sort, this time using numbers up to 4 digits in length. You will notice something interesting here.

58 99 999 47 200 101 1002 12 1111

Step 1:

0  200
1  101 -> 1111
2  1002 -> 12
3
4
5
6
7  47
8  58
9  99 -> 999

Step 2: (working from step 1)

0  200 -> 101 -> 1002
1  1111 -> 012
2
3
4  047
5  058
6
7
8
9  099 -> 999

Step 3: (working from step 2)

0  1002 -> 0012 -> 0047 -> 0058 -> 0099
1  0101 -> 1111
2  0200
3
4
5
6
7
8
9  0999

Step 4: (working from step 3)

0  12 -> 47 -> 58 -> 99 -> 101 -> 200 -> 999
1  1002 -> 1111
2
3
4
5
6
7
8
9

Step 4 is the final step here. Notice, however, that index 0 now holds the numbers from 0 to 999, while index 1 holds the numbers from 1000 to 1999, and so on.
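The radix sort steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the inputs are non-negative integers in base 10, and plain Python lists stand in for the array of linked lists. The name `radix_sort` is chosen here for illustration.

```python
def radix_sort(nums):
    """Return a sorted copy of non-negative integers (LSD radix sort)."""
    nums = list(nums)
    if not nums:
        return nums
    largest = max(nums)
    place = 1                      # 1 = units, 10 = tens, 100 = hundreds, ...
    while largest // place > 0:
        buckets = [[] for _ in range(10)]
        for n in nums:
            digit = (n // place) % 10   # a missing digit counts as 0
            buckets[digit].append(n)    # append to the end of that list
        # Read the buckets back in index order to get the next working list.
        nums = [n for bucket in buckets for n in bucket]
        place *= 10
    return nums


print(radix_sort([212, 21, 72, 5, 431, 898, 616, 24, 9]))
# prints [5, 9, 21, 24, 72, 212, 431, 616, 898]
```

Appending to the end of each bucket is what preserves the order established by earlier passes, so each later pass can rely on the work of the previous ones.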



Heapsort

Heaps

The (Binary) heap data structure is an array object that can be viewed as a nearly

complete binary tree.

A binary tree with n nodes and depth k is complete iff its nodes correspond to

the nodes numbered from 1 to n in the full binary tree of depth k.



Attributes of a Heap

An array A that represents a heap has two attributes:

length[A]: the number of elements in the array.

heapsize[A]: the number of elements in the heap stored within array A.

length[A] ≥ heapsize[A]

Basic procedures

If a complete binary tree with n nodes is represented sequentially, then for any node with index i, 1 ≤ i ≤ n, we have:

A[1] is the root of the tree

the parent PARENT(i) is at ⌊i/2⌋ if i ≠ 1

the left child LEFT(i) is at 2i

the right child RIGHT(i) is at 2i+1

The LEFT procedure can compute 2i in one instruction by simply shifting the binary representation of i left one bit position.

Similarly, the RIGHT procedure can quickly compute 2i+1 by shifting the binary representation of i left one bit position and adding in a 1 as the low-order bit.

The PARENT procedure can compute ⌊i/2⌋ by shifting i right one bit position.
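The three index computations can be written with bit shifts, as described above. This is a small Python sketch that assumes the 1-based node numbering used in these notes; the lowercase function names are chosen here for illustration.

```python
def parent(i):
    """Index of the parent of node i (1-based): floor(i / 2)."""
    return i >> 1          # shift right one bit


def left(i):
    """Index of the left child of node i (1-based): 2i."""
    return i << 1          # shift left one bit


def right(i):
    """Index of the right child of node i (1-based): 2i + 1."""
    return (i << 1) | 1    # shift left, then set the low-order bit


print(parent(5), left(5), right(5))
# prints 2 10 11
```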



Heap properties

There are two kinds of binary heaps: max-heaps and min-heaps.

In a max-heap, the max-heap property is that for every node i other than the root, A[PARENT(i)] ≥ A[i].

the largest element in a max-heap is stored at the root

the subtree rooted at a node contains values no larger than that contained at the node itself

In a min-heap, the min-heap property is that for every node i other than the root, A[PARENT(i)] ≤ A[i].

the smallest element in a min-heap is at the root

the subtree rooted at a node contains values no smaller than that contained at the node itself



The height of a heap

The height of a node in a heap is the number of edges on the longest simple downward path from the node to a leaf, and the height of the heap is the height of its root, that is, Θ(lg n).

For example:

the height of node 2 is 2

the height of the heap is 3

The MAXHEAPIFY procedure

MAXHEAPIFY is an important subroutine for manipulating max-heaps.

Input: an array A and an index i

Output: the subtree rooted at index i becomes a max-heap

Assume: the binary trees rooted at LEFT(i) and RIGHT(i) are max-heaps, but A[i] may be smaller than its children

Method: let the value at A[i] float down in the max-heap

MAXHEAPIFY(A, i)

1. l ← LEFT(i)
2. r ← RIGHT(i)
3. if l ≤ heapsize[A] and A[l] > A[i]
4.     then largest ← l
5.     else largest ← i
6. if r ≤ heapsize[A] and A[r] > A[largest]
7.     then largest ← r
8. if largest ≠ i
9.     then exchange A[i] ↔ A[largest]
10.         MAXHEAPIFY(A, largest)
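The MAXHEAPIFY procedure above can be sketched in Python as follows. This is a minimal illustration that assumes 0-based list indices, so the children of node i are at 2i+1 and 2i+2 rather than 2i and 2i+1. The heap size is passed as an explicit parameter instead of being stored as an attribute of the array.

```python
def max_heapify(a, i, heap_size):
    """Let a[i] float down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    l = 2 * i + 1
    r = 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


heap = [1, 14, 10, 8, 7, 9, 3, 2, 4]   # only the root violates the property
max_heapify(heap, 0, len(heap))
print(heap)
# prints [14, 8, 10, 4, 7, 9, 3, 2, 1]
```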



Building a Heap

We can use the MAXHEAPIFY procedure to convert an array A[1..n] into a max-heap in a bottom-up manner.

The elements in the subarray A[(⌊n/2⌋+1)..n] are all leaves of the tree, and so each is a 1-element heap.

The procedure BUILDMAXHEAP goes through the remaining nodes of the tree and runs MAXHEAPIFY on each one.

BUILDMAXHEAP(A)

1. heapsize[A] ← length[A]
2. for i ← ⌊length[A]/2⌋ downto 1
3.     do MAXHEAPIFY(A, i)



The heapsort algorithm

Since the maximum element of the array is stored at the root A[1], we can exchange it with A[n].

If we now discard A[n], we observe that A[1..(n-1)] can easily be made into a max-heap.

The children of the root A[1] remain max-heaps, but the new root element A[1] may violate the max-heap property, so we need to readjust the max-heap. That is, we call MAXHEAPIFY(A, 1).

HEAPSORT(A)

1. BUILDMAXHEAP(A)
2. for i ← length[A] downto 2
3.     do exchange A[1] ↔ A[i]
4.         heapsize[A] ← heapsize[A] - 1
5.         MAXHEAPIFY(A, 1)
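Putting the pieces together, HEAPSORT can be sketched in Python as follows. This is a minimal illustration under a few assumptions: indices are 0-based, the list is sorted in place, and the float-down step is written as a loop (`sift_down`, a name chosen here) rather than as a recursive call.

```python
def sift_down(a, i, heap_size):
    """Iterative max-heapify: float a[i] down within a[:heap_size]."""
    while True:
        l, r, largest = 2 * i + 1, 2 * i + 2, i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Turn the whole list into a max-heap, bottom-up."""
    # Indices len(a)//2 .. len(a)-1 are leaves, so start just before them.
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))


def heap_sort(a):
    """Sort the list in place using heapsort."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to its place
        sift_down(a, 0, end)          # the heap now occupies a[:end]


data = [77, 33, 44, 11, 88, 22, 66, 55]
heap_sort(data)
print(data)
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```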



3 essential properties of algorithms:

In computer science, an in-place algorithm (or in Latin in situ) is an algorithm

which transforms input using a data structure with a small, constant amount of

extra storage space. The input is usually overwritten by the output as the algorithm

executes. An algorithm which is not in-place is sometimes called not-in-place or

out-of-place

In computer science, an online algorithm is one that can process its input piece-by-

piece in a serial fashion, i.e., in the order that the input is fed to the algorithm,

without having the entire input available from the start. In contrast, an offline

algorithm is given the whole problem data from the beginning and is required to

output an answer which solves the problem at hand.

A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the unsorted input array.

Algorithm        In-place   Online   Stable
Insertion sort   Yes        Yes      Yes
Selection sort   Yes        No       No
Merge sort       No         No       Yes
Radix sort       No         No       Yes
Quick sort       Yes        No       No
Heap sort        Yes        No       No
Bubble sort      Yes        No       Yes

External sorting

External sorting is a term for a class of sorting algorithms that can handle massive

amounts of data. External sorting is required when the data being sorted do not fit

into the main memory of a computing device (usually RAM) and instead they must

reside in the slower external memory (usually a hard drive). External sorting

typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary

file. In the merge phase, the sorted subfiles are combined into a single larger file.



Basic External Sorting Algorithm

Assume unsorted data is on disk at start.

Let M = maximum number of records that can be stored and sorted in internal memory at one time.

Algorithm

Repeat:

1. Read M records into main memory and sort internally.
2. Write this sorted sub-list onto disk. (This is one run.)

Until all data is processed into runs.

Repeat:

1. Merge two runs into one sorted run twice as long.
2. Write this single run back onto disk.

Until all runs are processed into runs twice as long.

Merge runs again as often as needed until only one large run remains: the sorted list.
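The two phases above can be sketched in Python as follows. This is only a simulation under stated assumptions: ordinary Python lists stand in for the runs that a real external sort would write to and read from disk, and the names `external_sort` and `m` are chosen here for illustration. The two-way merge uses `heapq.merge` from the standard library.

```python
import heapq


def external_sort(records, m):
    """Simulate external sorting with runs of at most `m` records."""
    if m < 1:
        raise ValueError("m must be at least 1")
    # Phase 1: read M records at a time, sort them, and store each run.
    runs = [sorted(records[i:i + m]) for i in range(0, len(records), m)]
    # Phase 2: merge runs pairwise until a single run remains.
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                merged.append(list(heapq.merge(runs[i], runs[i + 1])))
            else:
                merged.append(runs[i])   # odd run out: carry it forward
        runs = merged
    return runs[0] if runs else []


print(external_sort([77, 33, 44, 11, 88, 22, 66, 55, 5], 3))
# prints [5, 11, 22, 33, 44, 55, 66, 77, 88]
```

With 9 records and M = 3, phase 1 produces three runs, and phase 2 needs two merge passes to reduce them to one.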

