Random Selection Algorithm
This is the implementation of Random Selection Algorithm as instructed in Algorithms: Design and Analysis, Part1, Coursera. But I think there may be something wrong in his slides.
1. Analysis of the pseudo-code
The pseudo-code he gives is as follows:
RSelect (array A, length n, order statistics i)
- if n = 1, return A[1]
- choose pivot p from A uniformly at random
- partition A around p, let j = new index of p
- if j = i, return p
-if j > i, return RSelect(1st part of A, j-1, i)
-else if j < i, return RSelect(2nd part of A, n-j, i-j)
It should be noted that j is index of pivot point as opposed to the original array. Therefore, the last line here should be modified. Let me use an example to illustrate my point. Note that in this example, index of an array starts at 1.
Let A = {5,7,1,4,2}, suppose we want to pick out the 4th statistics (j=4), which is 5 in our case. In the first round, suppose 2 is chosen as the pivot. After PartitionAroundPivot subroutinem the array would be A = {1,2,5,7,4}, where j = 2. Therefore, we should recursively call RSelect on the second half of the array and look for the 2nd statistics, which should be computed as j - j =2, correct. But this set j to 2. On the second round ,suppose we choose 4 as pivot, then the subarray should be sub-A = {4,5,7}, with j = 3, and now we should call RSelect on the second half, and look for the 1st order statistics. BUT, j - j = 2 - 3 = -1. Problem has arose. Problem is, j does not denote the index of pivot in the subarray, while j does denote the statistics in new subarray, so the comparison between j and j is meaningless. We should therefore modify j to j - j, where j means the left boundary of the subarray.
Another possible explanation may be that ‘new index‘ means index in the ‘new‘ subarray. But this may cause trouble to return the final result.
2. My Implementation
The code will generate a random array and select 1st statistics to nth statistics sequentially, which equals a sorting program. Another things worth noticing is that array index in C starts at 0. So we should increment pivot index by one when comparing with order statistics, where the incremented pivot index itself becomes an order statistics. And also, I am kind of rusty when dealing with random number generation. I have learned it in freshmen year but kind of forget. So maybe someone can give me a link or something to a concise tutorial?
1 #include <stdio.h> 2 #include <stdlib.h> 3 #define N 14 4 int ChoosePivot(int l, int r){ 5 srand(time(NULL)); 6 int randNum = rand(); 7 int pivotIndex; 8 pivotIndex = randNum %(r-l+1); 9 pivotIndex = pivotIndex + l; 10 return pivotIndex; 11 } 12 13 int PartitionAroundPivot(int* A, int pivotIndex, int l, int r){ 14 int i,j; 15 int pivot = A[pivotIndex], temp; 16 17 i = l + 1; 18 j = l; 19 //swap pivot to the first one 20 if (pivotIndex != l){ 21 temp = A[l]; A[l] = A[pivotIndex]; A[pivotIndex] = temp; 22 } 23 for (j = l + 1; j < r + 1; j++){ 24 if (A[j] < pivot) { 25 temp = A[i]; A[i] = A[j]; A[j] = temp; 26 i++; 27 } 28 } 29 temp = A[i-1]; A[i-1] = pivot; A[l] = temp; 30 return i-1; 31 32 } 33 34 //l and r specifies the sub-array, i specifies the pivot 35 int RSelect(int* A, int l, int r, int i){ 36 int p, p_th; 37 if (r-l+1 == 1) 38 return A[l]; 39 p = ChoosePivot(l,r); 40 p_th = PartitionAroundPivot(A,p,l,r); 41 42 if (p_th-l+1==i) 43 return A[p_th]; 44 if (p_th-l+1>i) 45 return RSelect(A,l,p_th-1,i); 46 if (p_th-l+1<i) 47 return RSelect(A,p_th+1,r,i-p_th+l-1); 48 } 49 50 int main() 51 { 52 int A[N]; 53 int r,i; 54 srand(time(NULL)); 55 for (i=0;i<N;i++){ 56 A[i] = rand(); 57 } 58 for (i=1;i<N+1;i++){ 59 r = RSelect(A,0,N-1,i); 60 printf("%d\n",r); 61 } 62 63 return 0; 64 }