sorting classnotes aam

Upload: apurvamehta

Post on 14-Apr-2018


  • 7/30/2019 Sorting Classnotes AAM


    Introduction

    Sorting and searching are fundamental operations in computer science. Sorting
    refers to the operation of arranging data in some given order. Searching refers to
    the operation of finding a particular record in the existing information.
    Information retrieval normally involves searching, sorting and merging. In this
    chapter we will discuss searching and sorting techniques in detail.

    Sorting

    Sorting is important in nearly every computer application. It refers to arranging
    data elements in some given order, and many algorithms are available to sort a
    given set of elements. We will now discuss two classes of sorting techniques and
    analyze their performance: internal sorting and external sorting.

    Internal sorting

    Internal sorting takes place in the main memory of a computer. Internal sorting
    methods are applied to small collections of data; that is, the entire collection to
    be sorted is small enough that the sorting can take place within main memory. We
    will study the following methods of internal sorting: insertion sort, selection sort,
    merge sort, radix sort, quick sort, heap sort and bubble sort.


    Insertion sort

    In this method we read the given elements from 1 to n, inserting each element
    into its proper position. An everyday example is a card player arranging the cards
    dealt to him: the player picks up each card and inserts it into its proper position
    in his hand. At every step, we insert the item into its proper place.

    This sorting algorithm is frequently used when n is small. The insertion sort
    algorithm scans A from A[1] to A[N], inserting each element A[K] into its proper
    position in the previously sorted subarray A[1], A[2], . . . , A[K-1]. That is:

    Pass 1. A[1] by itself is trivially sorted.
    Pass 2. A[2] is inserted either before or after A[1] so that: A[1], A[2] is sorted.
    Pass 3. A[3] is inserted into its proper place in A[1], A[2], that is, before A[1],
            between A[1] and A[2], or after A[2], so that: A[1], A[2], A[3] is sorted.
    Pass 4. A[4] is inserted into its proper place in A[1], A[2], A[3] so that:
            A[1], A[2], A[3], A[4] is sorted.
    . . .
    Pass N. A[N] is inserted into its proper place in A[1], A[2], . . . , A[N-1] so
            that: A[1], A[2], . . . , A[N] is sorted.


    Algorithm INSERTION (A, N)

    This algorithm sorts the array A with N elements.

    1. Set A[0] := -∞. [Initializes sentinel element]
    2. Repeat Steps 3 to 5 for K = 2, 3, . . . , N:
    3.     Set TEMP := A[K] and PTR := K - 1.
    4.     Repeat while TEMP < A[PTR]:
               (a) Set A[PTR + 1] := A[PTR]. [Moves element forward]
               (b) Set PTR := PTR - 1.
           [End of loop]
    5.     Set A[PTR + 1] := TEMP. [Inserts element in proper place]
       [End of Step 2 loop]
    6. Return.
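    The INSERTION algorithm above can be sketched in Python as follows. This is
    a minimal illustration, not the notes' own code: the function name
    insertion_sort is an assumption, indices are 0-based rather than 1-based, and
    a bounds check (ptr >= 0) replaces the sentinel A[0] = -∞.

```python
def insertion_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    a = list(values)              # work on a copy so the input is left untouched
    for k in range(1, len(a)):    # a[0:k] is already sorted at this point
        temp = a[k]
        ptr = k - 1
        # Shift every element larger than temp one slot to the right.
        # The check `ptr >= 0` plays the role of the sentinel A[0] = -infinity.
        while ptr >= 0 and temp < a[ptr]:
            a[ptr + 1] = a[ptr]
            ptr -= 1
        a[ptr + 1] = temp         # insert the element in its proper place
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # prints [1, 2, 3, 4, 5, 6]
```

    Because the comparison is strict (temp < a[ptr]), equal elements are never
    moved past each other, so this sketch keeps equal keys in their input order.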

    Selection sort

    In this method we find the smallest element in the list and put it in the first
    position. Then we find the second smallest element in the list and put it in the
    second position, and so on.

    Pass 1. Find the location LOC of the smallest in the list of N elements
            A[1], A[2], . . . , A[N], and then interchange A[LOC] and A[1]. Then
            A[1] is sorted.
    Pass 2. Find the location LOC of the smallest in the sublist of N - 1 elements
            A[2], A[3], . . . , A[N], and then interchange A[LOC] and A[2]. Then
            A[1], A[2] is sorted, since A[1] ≤ A[2].


    Example

    Suppose an array A contains 8 elements as follows: 77, 33, 44, 11, 88, 22, 66, 55

    Algorithm

    1. To find the minimum element

    MIN (A, K, N, LOC)

    An array A is in memory. This procedure finds the location LOC of the smallest
    element among A[K], A[K+1], . . . , A[N].

    1. Set MIN := A[K] and LOC := K. [Initializes pointers]
    2. Repeat for J = K + 1, K + 2, . . . , N:
           If MIN > A[J], then: Set MIN := A[J] and LOC := J.
       [End of loop]
    3. Return.

    2. To sort the elements

    SELECTION (A, N)

    1. Repeat Steps 2 and 3 for K = 1, 2, . . . , N - 1:
    2.     Call MIN(A, K, N, LOC).
    3.     [Interchange A[K] and A[LOC]]
           Set TEMP := A[K], A[K] := A[LOC] and A[LOC] := TEMP.
       [End of Step 1 loop]
    4. Exit.
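    The two procedures above can be sketched in Python as follows. The names
    min_location and selection_sort are assumptions made for this illustration,
    and indices are 0-based, so the outer loop runs over positions 0 to N - 2.

```python
def min_location(a, k):
    """Return the index of the smallest element in a[k:] (the MIN procedure)."""
    loc = k
    for j in range(k + 1, len(a)):
        if a[j] < a[loc]:
            loc = j
    return loc


def selection_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    a = list(values)
    for k in range(len(a) - 1):
        loc = min_location(a, k)
        a[k], a[loc] = a[loc], a[k]   # interchange A[K] and A[LOC]
    return a


print(selection_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```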


    Merge sort

    Combining two lists is called merging. For example, suppose A is a sorted list
    with r elements and B is a sorted list with s elements. The operation that combines
    the elements of A and B into a single sorted list C with n = r + s elements is
    called merging. The two lists are combined by the following merging procedure.

    Suppose one is given two sorted decks of cards. The decks are merged as in Fig.
    That is, at each step, the two front cards are compared and the smaller one is
    placed in the combined deck. When one of the decks is empty, all of the remaining
    cards in the other deck are put at the end of the combined deck. Similarly,
    suppose we have two lines of students sorted by increasing heights, and suppose we
    want to merge them into a single sorted line. The new line is formed by choosing,
    at each step, the shorter of the two students who are at the head of their
    respective lines. When one of the lines has no more students, the remaining
    students line up at the end of the combined line.

    The above discussion will now be translated into a formal algorithm which merges
    a sorted r-element array A and a sorted s-element array B into a sorted array C
    with n = r + s elements. First of all, we must always keep track of the locations
    of the smallest element of A and the smallest element of B which have not yet been
    placed in C. Let NA and NB denote these locations, respectively. Also, let PTR
    denote the location in C to be filled. Thus, initially, we set NA := 1, NB := 1 and
    PTR := 1. At each step of the algorithm, we compare A[NA] and B[NB] and
    assign the smaller element to C[PTR]. Then we increment PTR by setting
    PTR := PTR + 1, and we either increment NA by setting NA := NA + 1 or increment
    NB by setting NB := NB + 1, according to whether the new element in C has come
    from A or from B. Furthermore, if NA > r, then the remaining elements of B are
    assigned to C; or if NB > s, then the remaining elements of A are assigned to C.

    Algorithm MERGING ( A, R, B, S, C)

    Let A and B be sorted arrays with R and S elements. This algorithm

    merges A and B into an array C with N = R + S elements.

    1. [Initialize] Set NA := 1, NB := 1 and PTR := 1.

    2. [Compare] Repeat while NA
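    The merging procedure described in the text above can be sketched in Python
    as follows. This is an illustration under stated assumptions: the function
    name merge is hypothetical, indices are 0-based, the output list C is grown
    with append instead of being filled through PTR, and ties are resolved in
    favour of A (using <=), which is a choice of this sketch.

```python
def merge(a, b):
    """Merge two sorted lists a and b into one new sorted list."""
    c = []
    na, nb = 0, 0                     # next unplaced element of a and of b
    while na < len(a) and nb < len(b):
        if a[na] <= b[nb]:            # the smaller front element goes to c
            c.append(a[na])
            na += 1
        else:
            c.append(b[nb])
            nb += 1
    # One list is exhausted; the rest of the other is already sorted.
    c.extend(a[na:])
    c.extend(b[nb:])
    return c


print(merge([1, 4, 9], [2, 3, 10, 11]))  # prints [1, 2, 3, 4, 9, 10, 11]
```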


    Quick sort Algorithm

    Quick sort is an example of the "divide and conquer" approach to solving
    problems.

    The quick sort algorithm works by placing the last element of the array in its
    proper position, comparing it with the other elements starting from the first
    end of the array.

    The steps followed by the quick sort algorithm are as follows:

    1. Choose the dividing (pivot) element, i.e. the last element of the array.
    2. Compare each element with the pivot, starting from the beginning of the
       array. If the element is less than or equal to the pivot, exchange it into
       the left-hand side; elements greater than the pivot stay on the right-hand
       side.
    3. After completing the pass, exchange the pivot with the first element of the
       right-hand side, so that no element to its left is greater and every
       element to its right is greater. The pivot is now in its final position and
       divides the array into two parts.
    4. Choose a pivot in each of the two parts and sort them separately by
       repeating steps 1 to 3.
    5. Repeat the process until every part is sorted. Because each part is sorted
       in place, the parts together then form one sorted array.

    The structure of the quick sort algorithm is as follows. Here S is the first
    index and Piv the last index (the pivot position) of the part being sorted.
    Partition rearranges that part around the pivot and returns the pivot's final
    position:

    Quick_sort(Array, S, Piv)
        If S < Piv
            Then q ← Partition(Array, S, Piv)
                 Quick_sort(Array, S, q - 1)
                 Quick_sort(Array, q + 1, Piv)

    Partition(Array, S, Piv)
        x ← Array[Piv]
        i ← S - 1
        For j ← S to Piv - 1
            do If Array[j] ≤ x
                Then i ← i + 1
                     Exchange Array[i] ↔ Array[j]
        Exchange Array[i + 1] ↔ Array[Piv]
        Return i + 1    (Return the position of the pivot)
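    The Quick_sort and Partition pseudocode above can be sketched in Python as
    follows. The sketch keeps the parameter names S and Piv in lower case, uses
    0-based indices, and sorts the list in place; the default arguments that let
    the caller omit the bounds are an addition of this sketch.

```python
def partition(array, s, piv):
    """Place array[piv] in its final position within array[s..piv]; return it."""
    x = array[piv]                    # pivot value (last element of the part)
    i = s - 1                         # end of the "less than or equal" region
    for j in range(s, piv):
        if array[j] <= x:
            i += 1
            array[i], array[j] = array[j], array[i]
    array[i + 1], array[piv] = array[piv], array[i + 1]
    return i + 1


def quick_sort(array, s=0, piv=None):
    """Sort array[s..piv] in place (the whole list when bounds are omitted)."""
    if piv is None:
        piv = len(array) - 1
    if s < piv:
        q = partition(array, s, piv)
        quick_sort(array, s, q - 1)
        quick_sort(array, q + 1, piv)


data = [77, 33, 44, 11, 88, 22, 66, 55]
quick_sort(data)
print(data)  # prints [11, 22, 33, 44, 55, 66, 77, 88]
```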


    Bubble sort Algorithm

    Bubble sort is also called sinking sort. The algorithm works step by step,
    comparing each element with the next element and swapping them if they are out
    of order. This procedure is repeated until all elements in the array are sorted
    in the required sequence.

    Bubble sort gets its name because each comparison covers only a short range of
    the array: only the element immediately following the current one is checked
    and, if necessary, swapped, so elements move gradually toward their final
    positions. Because it orders the elements purely by comparing them, it is a
    comparison sort.

    Now we will see the algorithm structure as follows:
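    A minimal Python sketch of the procedure described above, under stated
    assumptions: the function name bubble_sort is hypothetical, indices are
    0-based, and the early exit when a pass makes no swap is an optional
    refinement that the text does not mention.

```python
def bubble_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    a = list(values)
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        # Compare each element with its right-hand neighbour; after this pass
        # the largest element of a[0..end] has sunk to position `end`.
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:               # no swap means the list is already sorted
            break
    return a


print(bubble_sort([77, 33, 44, 11, 88, 22, 66, 55]))
# prints [11, 22, 33, 44, 55, 66, 77, 88]
```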


    Radix Sort

    Radix sorting involves looking at one radix (digit) of each number at a time and
    placing the number in an array of linked lists indexed by that digit.

    Algorithm for radix sorting:

    1. Look at the rightmost digit.
    2. Assign the full number to that digit's index, appending it to the end of the
       linked list at that index.
    3. Look at the next digit to the left, taking the numbers FROM the current
       sorted array. IF there is no digit, pad a 0.
    4. REPEAT STEP 3 UNTIL all numbers have been sorted.

    Let's see a step by step example of a radix sort of the following set of unsorted
    numbers. The rightmost digit of each number is the first digit to look at when
    attempting to sort the list.

    212 21 72 5 431 898 616 24 9

    Step 1:

    Index   List
    0
    1       21 -> 431
    2       212 -> 72
    3
    4       24
    5       05
    6       616
    7
    8       898
    9       09

    Step 2: (working from step 1)

    Index   List
    0       005 -> 009
    1       212 -> 616
    2       021 -> 024
    3       431
    4
    5
    6
    7       072
    8
    9       898


    Step 3: (working from step 2)

    Index   List
    0       5 -> 9 -> 21 -> 24 -> 72
    1
    2       212
    3
    4       431
    5
    6       616
    7
    8       898
    9

    Step 3 is the final step and the list is sorted. Reading the lists in index
    order gives 5, 9, 21, 24, 72, 212, 431, 616, 898.
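    The steps above can be sketched in Python as follows. This is an illustration
    only: the function name radix_sort is an assumption, ordinary Python lists
    stand in for the linked lists, the input is assumed to consist of non-negative
    integers, and a missing digit is treated as 0 by integer arithmetic rather
    than by explicit padding.

```python
def radix_sort(numbers):
    """Sort non-negative integers, least significant digit first."""
    result = list(numbers)
    if not result:
        return result
    digits = len(str(max(result)))    # number of passes needed
    place = 1                         # 1 = ones, 10 = tens, 100 = hundreds, ...
    for _ in range(digits):
        buckets = [[] for _ in range(10)]          # one list per digit 0..9
        for n in result:
            # Appending keeps the order from the previous pass within a bucket.
            buckets[(n // place) % 10].append(n)
        # Read the buckets back in index order to form the new working list.
        result = [n for bucket in buckets for n in bucket]
        place *= 10
    return result


print(radix_sort([212, 21, 72, 5, 431, 898, 616, 24, 9]))
# prints [5, 9, 21, 24, 72, 212, 431, 616, 898]
```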

    The benefit of radix sort is that it can be done with pencil and paper. It also
    needs only a fixed data structure (an array of size 10). The downside is that it
    takes time to carry out by hand, since you may have to go through numerous steps;
    the amount of work grows with how many numbers you have to sort and how many
    digits they contain.

    Here is another example of radix sort, this time using numbers up to 4 digits in
    length. You will notice something interesting here.

    58 99 999 47 200 101 1002 12 1111

    Step 1:

    Index   List
    0       200
    1       101 -> 1111
    2       1002 -> 12
    3
    4
    5
    6
    7       47
    8       58
    9       99 -> 999

    Step 2: (working from step 1)

    Index   List
    0       200 -> 101 -> 1002
    1       1111 -> 012
    2
    3
    4       047
    5       058
    6
    7
    8
    9       099 -> 999

    Step 3: (working from step 2)

    Index   List
    0       1002 -> 0012 -> 0047 -> 0058 -> 0099
    1       0101 -> 1111
    2       0200
    3
    4
    5
    6
    7
    8
    9       0999

    Step 4: (working from step 3)

    Index   List
    0       12 -> 47 -> 58 -> 99 -> 101 -> 200 -> 999
    1       1002 -> 1111
    2
    3
    4
    5
    6
    7
    8
    9

    Step 4 is the final step here. Notice, however, that index 0 holds the numbers
    from 0 to 999, while index 1 holds the numbers from 1000 to 1999, and so on.


    Heapsort

    Heaps

    The (Binary) heap data structure is an array object that can be viewed as a nearly

    complete binary tree.

    A binary tree with n nodes and depth k is complete iff its nodes correspond to

    the nodes numbered from 1 to n in the full binary tree of depth k.


    Attributes of a Heap

    An array A that represents a heap has two attributes:

    length[A]: the number of elements in the array.
    heapsize[A]: the number of elements in the heap stored within array A.

    length[A] ≥ heapsize[A]

    Basic procedures

    If a complete binary tree with n nodes is represented sequentially, then for any
    node with index i, 1 ≤ i ≤ n, we have:

    A[1] is the root of the tree
    the parent PARENT(i) is at ⌊i/2⌋ if i ≠ 1
    the left child LEFT(i) is at 2i
    the right child RIGHT(i) is at 2i+1

    The LEFT procedure can compute 2i in one instruction by simply shifting the
    binary representation of i left one bit position.

    Similarly, the RIGHT procedure can quickly compute 2i+1 by shifting the
    binary representation of i left one bit position and adding in a 1 as the
    low-order bit.

    The PARENT procedure can compute ⌊i/2⌋ by shifting i right one bit position.
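    The three index computations can be written with bit shifts in Python as a
    small sketch. The lower-case function names are assumptions for this
    illustration, and i is the 1-based node index used in the text.

```python
def parent(i):
    """Index of the parent of node i (for i > 1): floor(i / 2)."""
    return i >> 1             # shift right one bit position


def left(i):
    """Index of the left child of node i: 2i."""
    return i << 1             # shift left one bit position


def right(i):
    """Index of the right child of node i: 2i + 1."""
    return (i << 1) | 1       # shift left, then set the low-order bit


print(parent(5), left(5), right(5))  # prints 2 10 11
```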


    Heap properties

    There are two kinds of binary heaps: max-heaps and min-heaps.

    In a max-heap, the max-heap property is that for every node i other than the
    root, A[PARENT(i)] ≥ A[i].

    the largest element in a max-heap is stored at the root
    the subtree rooted at a node contains values no larger than that contained at the
    node itself

    In a min-heap, the min-heap property is that for every node i other than the
    root, A[PARENT(i)] ≤ A[i].

    the smallest element in a min-heap is at the root
    the subtree rooted at a node contains values no smaller than that contained at the
    node itself
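    As a small illustration of the two properties, the following Python sketch
    checks whether a list satisfies them. The function names are hypothetical,
    and the sketch assumes that the heap element A[i] of the text is stored at
    position i - 1 of a 0-based Python list.

```python
def is_max_heap(a):
    """True if A[PARENT(i)] >= A[i] for every node i other than the root."""
    # Node i (1-based) lives at a[i - 1]; its parent i // 2 lives at a[i // 2 - 1].
    return all(a[i // 2 - 1] >= a[i - 1] for i in range(2, len(a) + 1))


def is_min_heap(a):
    """True if A[PARENT(i)] <= A[i] for every node i other than the root."""
    return all(a[i // 2 - 1] <= a[i - 1] for i in range(2, len(a) + 1))


print(is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # prints True
print(is_min_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))  # prints False
```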


    The height of a heap

    The height of a node in a heap is the number of edges on the longest simple
    downward path from the node to a leaf. The height of the heap is defined to be
    the height of its root, which is Θ(lg n).

    For example:

    the height of node 2 is 2
    the height of the heap is 3

    The MAXHEAPIFY procedure

    MAXHEAPIFY is an important subroutine for manipulating max-heaps.

    Input: an array A and an index i
    Output: the subtree rooted at index i becomes a max-heap
    Assume: the binary trees rooted at LEFT(i) and RIGHT(i) are max-heaps, but
    A[i] may be smaller than its children
    Method: let the value at A[i] float down in the max-heap

    MAXHEAPIFY(A, i)
    1. l ← LEFT(i)
    2. r ← RIGHT(i)
    3. if l ≤ heapsize[A] and A[l] > A[i]
    4.     then largest ← l
    5.     else largest ← i
    6. if r ≤ heapsize[A] and A[r] > A[largest]
    7.     then largest ← r
    8. if largest ≠ i
    9.     then exchange A[i] ↔ A[largest]
    10.         MAXHEAPIFY(A, largest)
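    MAXHEAPIFY can be sketched in Python as follows. This is a sketch under
    stated assumptions: the list is 0-based, so the children of index i are at
    2i + 1 and 2i + 2 instead of 2i and 2i + 1, and heap_size is passed as an
    explicit argument instead of being an attribute of the array.

```python
def max_heapify(a, i, heap_size):
    """Let a[i] float down until the subtree rooted at i is a max-heap."""
    l = 2 * i + 1                     # left child in 0-based indexing
    r = 2 * i + 2                     # right child in 0-based indexing
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


heap = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]   # only index 1 violates the property
max_heapify(heap, 1, len(heap))
print(heap)  # prints [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```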


    Building a Heap

    We can use the MAXHEAPIFY procedure to convert an array A[1..n] into a
    max-heap in a bottom-up manner.

    The elements in the subarray A[(⌊n/2⌋+1)..n] are all leaves of the tree, and so
    each is a 1-element heap.

    The procedure BUILDMAXHEAP goes through the remaining nodes of the
    tree and runs MAXHEAPIFY on each one.

    BUILDMAXHEAP(A)
    1. heapsize[A] ← length[A]
    2. for i ← ⌊length[A]/2⌋ downto 1
    3.     do MAXHEAPIFY(A, i)
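    BUILDMAXHEAP can be sketched in Python as follows. The sketch is
    self-contained, so it repeats a 0-based max_heapify helper; with 0-based
    indices the last non-leaf node is at len(a) // 2 - 1, which corresponds to
    ⌊length[A]/2⌋ in the 1-based pseudocode. Function names are assumptions.

```python
def max_heapify(a, i, heap_size):
    """Let a[i] float down until the subtree rooted at i is a max-heap."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a):
    """Rearrange the list a in place so that it becomes a max-heap."""
    n = len(a)
    # Leaves are already 1-element heaps, so start at the last non-leaf node
    # and work back toward the root.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)


values = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(values)
print(values)  # prints [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```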


    The heapsort algorithm

    Since the maximum element of the array is stored at the root, A[1], we can
    exchange it with A[n], which puts it into its final sorted position.

    If we now discard A[n] from the heap, we observe that A[1..(n-1)] can easily be
    made into a max-heap.

    The children of the root remain max-heaps, but the new root element A[1]
    may violate the max-heap property, so we need to readjust the max-heap by
    calling MAXHEAPIFY(A, 1).

    HEAPSORT(A)
    1. BUILDMAXHEAP(A)
    2. for i ← length[A] downto 2
    3.     do exchange A[1] ↔ A[i]
    4.        heapsize[A] ← heapsize[A] - 1
    5.        MAXHEAPIFY(A, 1)
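    Putting the pieces together, HEAPSORT can be sketched in Python as follows.
    The block is self-contained and therefore repeats the helper functions; all
    names are assumptions, indices are 0-based, and the heap size is tracked in a
    local variable rather than as an attribute of the array.

```python
def max_heapify(a, i, heap_size):
    """Let a[i] float down until the subtree rooted at i is a max-heap."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a):
    """Rearrange the list a in place so that it becomes a max-heap."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heapsort(a):
    """Sort the list a in place in ascending order."""
    build_max_heap(a)
    heap_size = len(a)
    for i in range(len(a) - 1, 0, -1):
        a[0], a[i] = a[i], a[0]       # move the current maximum to its final slot
        heap_size -= 1                # discard it from the heap
        max_heapify(a, 0, heap_size)  # restore the max-heap property at the root


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(data)
print(data)  # prints [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```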


    Three essential properties of algorithms:

    In computer science, an in-place algorithm (in Latin, in situ) is an algorithm
    which transforms its input using a data structure with a small, constant amount
    of extra storage space. The input is usually overwritten by the output as the
    algorithm executes. An algorithm which is not in-place is sometimes called
    not-in-place or out-of-place.

    In computer science, an online algorithm is one that can process its input
    piece by piece in a serial fashion, i.e., in the order in which the input is fed
    to the algorithm, without having the entire input available from the start. In
    contrast, an offline algorithm is given the whole problem data from the beginning
    and is required to output an answer which solves the problem at hand.

    A sorting algorithm is said to be stable if two objects with equal keys appear in
    the same order in the sorted output as they appear in the unsorted input array.

    Algorithm        In-place   Online   Stable
    Insertion sort   Yes        Yes      Yes
    Selection sort   Yes        No       No
    Merge sort       No         No       Yes
    Radix sort       No         No       Yes
    Quick sort       Yes        No       No
    Heap sort        Yes        No       No
    Bubble sort      Yes        No       Yes
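    Stability can be seen on a small example. The following Python sketch sorts
    records by key with an insertion sort; the function name, the key parameter
    and the sample records are all assumptions made for the illustration.

```python
def insertion_sort_by_key(records, key):
    """Return the records sorted by key(record), using insertion sort."""
    a = list(records)
    for k in range(1, len(a)):
        temp = a[k]
        ptr = k - 1
        # Strict comparison: a record never moves past another with an equal key.
        while ptr >= 0 and key(temp) < key(a[ptr]):
            a[ptr + 1] = a[ptr]
            ptr -= 1
        a[ptr + 1] = temp
    return a


records = [(3, "a"), (1, "b"), (3, "c"), (1, "d")]
result = insertion_sort_by_key(records, key=lambda rec: rec[0])
print(result)  # prints [(1, 'b'), (1, 'd'), (3, 'a'), (3, 'c')]
```

    Records with equal keys ("b" before "d", "a" before "c") keep their input
    order, which is what the definition of a stable sort requires.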

    External sorting

    External sorting is a term for a class of sorting algorithms that can handle massive
    amounts of data. External sorting is required when the data being sorted do not
    fit into the main memory of a computing device (usually RAM) and must instead
    reside in the slower external memory (usually a hard drive). External sorting
    typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data
    small enough to fit in main memory are read, sorted, and written out to a
    temporary file. In the merge phase, the sorted subfiles are combined into a single
    larger file.


    Basic External Sorting Algorithm

    Assume unsorted data is on disk at start.

    Let M = maximum number of records that can be stored and sorted in internal
    memory at one time.

    Algorithm

    Repeat:
        1. Read M records into main memory and sort them internally.
        2. Write this sorted sub-list onto disk. (This is one run.)
    Until all data is processed into runs.

    Repeat:
        1. Merge two runs into one sorted run twice as long.
        2. Write this single run back onto disk.
    Until all runs are processed into runs twice as long.

    Merge runs again as often as needed until only one large run remains: the
    sorted list.
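    The two phases above can be sketched in Python as follows. This is a
    simplified illustration, not a real disk-based implementation: in-memory
    lists stand in for the runs written to disk, the function names are
    hypothetical, and the two-way merge uses heapq.merge from the standard
    library. The parameter m plays the role of M and is assumed to be at least 1.

```python
import heapq


def make_runs(records, m):
    """Sorting phase: cut the data into sorted runs of at most m records."""
    return [sorted(records[i:i + m]) for i in range(0, len(records), m)]


def merge_pass(runs):
    """One merge pass: merge runs two at a time into runs twice as long."""
    merged = []
    for i in range(0, len(runs), 2):
        pair = runs[i:i + 2]          # the last pair may hold a single run
        merged.append(list(heapq.merge(*pair)))
    return merged


def external_sort(records, m):
    """Sort records using runs of at most m records and repeated merging."""
    runs = make_runs(records, m)
    while len(runs) > 1:              # merge again until one large run remains
        runs = merge_pass(runs)
    return runs[0] if runs else []


print(external_sort([9, 4, 7, 1, 8, 2, 6, 3, 5], 2))
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

    With 9 records and m = 2 the sorting phase produces 5 runs, and the merge
    passes reduce them to 3, then 2, then 1 run.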
