0. Why sorting?
When we put our clothes in cabinate, generally we don't put them messed up, because we want them to be looked neat, and more importantly, it is for ourselves's convenient to easy to find some of them when we need. Similarly, when we use computer to store data, we don't want the data stored randomly, because that would be hard to find some data when we need it, especially when the data is huge today. For example, if we have 1000 numbers stored and we want to find some number, it may cost us 1000 time to check each number. But if the data was stored sorted, then we need to check lg1000 = 10 times. See, that is realy a big difference. So we sort the data and store them is order, such as in sorted list, binary sort tree, etc.
1. Types of sorting
In CSC108 we learned Bubble sort, Selection sort, and Insertion sort; in CSC148 we learned Quick sort and Merge sort. We also heard that the Python builtin sort is Timsort.
2. Algorithm of sorts
Bubble sort compares adjacent items and exchanges those that are out of order, each pass it places the largest of unsorted part of list to the right position, so it gets the name. Its running time is n + (n-1) + (n-2) + .... + 2 = n(n-1)/2, so its compexity is O(n^2). Selection sort selects the smallest item from the unsorted part and put it at the proper position at each pass. So similarly, its running time is also n(n-1)/2, same complexity as Bubble sort. Insertion sort is kind of different from the first two, it compares the index item to its left and insert it in the proper position of the sorted part, so when the list is already sorted (best case), it only cost n steps, but in worst case, it still needs n(n-1)/2 steps.
Different from the three sorts above, Quick sort and Merge sort use recersion, break down the list into two portion, and check and merge them, so their complexity is O(nlgn), generally much faster than the three above.
Big-Oh | Timsort | Mergesort | Quicksort | Insertionsort | Selectionsort | Bubblesort |
BestCase | O(n) | O(nlgn) | O(nlgn) | O(n) | O(n^2) | O(n^2) |
AverageCase | O(nlgn) | O(nlgn) | O(nlgn) | O(n^2) | O(n^2) | O(n^2) |
WorstCase | O(nlgn) | O(nlgn) | O(n^2) | O(n^2) | O(n^2) | O(n^2) |
3 Performance in lab
This week we had lab ran and compared the sorts above. The following is their behavior:
case a) the list is random;
case b) the list is sorted;
case c) the list is sorted, but reversed.
Their performance:
Bubble sort is always the slowest one; selection sort always costs the same time no matter the list is sorted or not; Insertion sort has a big change when the list is sorted, it goes much faster than others.
Time sort and merge sort are always fast, but Quick sort has big different perfomance, when the list is already sorted, it costs much more time to do nothing.