Friday 28 March 2014

Sorting and Efficiency


0. Why sorting?

    When we put our clothes in cabinate, generally we don't put them messed up, because we want them to be looked neat, and more importantly, it is for ourselves's convenient to easy to find some of them when we need. Similarly, when we use computer to store data, we don't want the data stored randomly, because that would be hard to find some data when we need it, especially when the data is huge today. For example, if we have 1000 numbers stored and we want to find some number, it may cost us 1000 time to check each number. But if the data was stored sorted, then we need to check lg1000 = 10 times. See, that is realy a big difference. So we sort the data and store them is order, such as in sorted list, binary sort tree, etc.

1. Types of sorting
    In CSC108 we learned Bubble sort, Selection sort, and Insertion sort; in CSC148 we learned Quick sort and Merge sort. We also heard that the Python builtin sort is Timsort.

2. Algorithm of sorts
    Bubble sort compares adjacent items and exchanges those that are out of order, each pass it places the largest of unsorted part of list to the right position, so it gets the name. Its running time is n + (n-1) + (n-2) + .... + 2  = n(n-1)/2, so its compexity is O(n^2). Selection sort selects the smallest item from the unsorted part and put it at the proper position at each pass. So similarly, its running time is also n(n-1)/2, same complexity as Bubble sort. Insertion sort is kind of different from the first two, it compares the index item to its left and insert it in the proper position of the sorted part, so when the list is already sorted (best case), it only cost n steps, but in worst case, it still needs n(n-1)/2 steps.
    Different from the three sorts above, Quick sort and Merge sort use recersion, break down the list into two portion, and check and merge them, so their complexity is O(nlgn), generally much faster than the three above.

Big-Oh Timsort Mergesort Quicksort Insertionsort Selectionsort Bubblesort
BestCase O(n) O(nlgn) O(nlgn) O(n) O(n^2) O(n^2)
AverageCase O(nlgn) O(nlgn) O(nlgn) O(n^2) O(n^2) O(n^2)
WorstCase O(nlgn) O(nlgn) O(n^2) O(n^2) O(n^2) O(n^2)

3 Performance in lab
    This week we had lab ran and compared the sorts above. The following is their behavior:
case a) the list is random;
case b) the list is sorted;
case c) the list is sorted, but reversed.
Their performance:
Bubble sort is always the slowest one; selection sort always costs the same time no matter the list is sorted or not; Insertion sort has a big change when the list is sorted, it goes much faster than others.
Time sort and merge sort are always fast, but Quick sort has big different perfomance, when the list is already sorted, it costs much more time to do nothing.







































































































































































































































































































































































































































Sunday 2 March 2014

Recursion -- move cheeses using four (4) stools

The first assignment enhances our understanding of recursion and we start to realize its strong power.

The first assignment (A1) of this course is related to "The tour of Anne Hoy" -- move cheeses from one stool to another. In lecture, the instructor teaches us how to move n cheeses using three(3) stools, and the strategy is : (1) move n-1 cheeses from source stool to intermediate stool; (2) move 1 (the biggest one) cheese from source to destination stool; (3) move n-1 cheeses from intermediate to destination. The total moves is 2^n-1. To reduce moves, Anne decides to invest one more stool. Our task is to find the strategy to using four(4) stools to move cheeses from one stool to another and significantly reduce the moves used by 3 stools.

Because we have 4 stools now, so we do not have to move n-1 cheeses to intermediate stool and move 1 to destination stool. Instead we can (1) move n-i cheeses from source stool to one intermediate stool; (2) move i cheeses from source stool to destination stool; (3) move n-i cheeses from intermediate stool to destination stool. In step (2), Anne can use her old strategy to move the i cheeses using three(3) stools because the n-i smallest cheeses occupies one intermediate stool and she can not use it. But in step (1) and (3), Anne needs a better strategy to move the n-i cheeses because she can uses 4 stools and she can divides the n-i cheeses into two(2) parts (just like we divided the n cheeses into n-i and i two(2) parts), using the same strategy to move the n-i cheeses from source to one intermediate stool -- she will (1) move (n-i)-k cheeses from source to intermediate-1; (2) move k cheeses from source to intermediate-2; (3) move (n-i)-k cheeses from intermediate-1 to intermediate-2. Now you can easily find here is a repeat process -- recursion appear. After getting the idea, then code. At each step of the recursion, we code another function for choosing i value.

I used the console to trace each move and found all the moves were legal. Once again, I saw the power of recursion and its efficiency.

Sunday 23 February 2014

OOP -- What is OPP (Object-Oriented Programming) ?

What is OOP(Object-Oriented Programming), my partner and I met this question during the first lab of this course. Now half time of this course has gone, and we should have better understood the meaning of OOP and how to use it, but the first lab really gave me and my partner deep impression about OOP.

The first lab, our main task is to count the number of occurrences of words in a piece of text, and store the words with their frequency, find the "top n" words, and compare one collection to another. We thought it was not difficult, after a short discussion, we started to code. We knew how to open a file, read file, split, strip, and get a list of words. We use "for loop" to check every word in the list, find their frequency, and store them in dictionary, words are the keys and frequencies are the value. Every thing went smooth well. After we finished coding and implementation, we shew our work to TA. To our big surprise, TA told us that what we were required was Object-Oriented Programming, and what we did was just a procedure, though it was program, but not what this lab asked. Then we read the lab handout again, this time we found something we missed: "Perform an object-oriented analysis of the specifications in specs.txt". We just did discuss how to make the procedure and how to make it work on Python, but ignored the key word "object-oriented analysis, or we simply did not understand what it meant. We had to redo it: find the major noun and write the header for an appropriate class definition, along with a short docstring for the class; then find the main verb and write the header for an appropriate method, along with a short docstring for the method, and so on...

This first lab did give me a deep impression on OPP.

Thursday 6 February 2014

Recursion -- The Power of Recursion Insight

This week's lab, we were requested to write some code using recursion which we learned in lectures. The first question is to write a function to check if some number is in a possibly nested list. The header and description as below:

def find_num(SL: 'SearchList', n: int) -> bool:
"""
Return True if n is in SL, False otherwise

after a short discussion with my partners, we quickly formed a solution. The idea is: "for item in SL", check each item in the list SL, if the item is a number, then check if item == n; if the item is a list, then we check its items one by one, using for item in list. If we find any item equals to n, then return True and stop; if have checked all the items but no one equal to n, return False.

We were happy to have a solution. While my partner was typing the code in Pyphon, suddenly I remembered that the function the instructor showed in class has only one line -- return line for function body while our code has 8 someting lines. I realized what we had done is what we learned in CSC108, not CSC148. Then I asked my partner stop typing and told him that we need to write a one-line body function. Surprised by my words, my partner didn't believe we can do the job by one line. Then we checked the examples teacher used in class.

We checked the example "sum_list(L: list)", it uses only one line to sum up the numbers in a possibly nested list of numbers. We studied the function carefully, and wrote our one-line body function. save, run, and doctect, call check, everything was ok. Then we called TA to check our job, "GOOD!" was his answer.

Using this idea, we quickly completed the second question, and then did  some extra problems.

The key is using recursive code; and the power is recursion insight.



Tuesday 21 January 2014

CSC148 courSe LOG (SLOG)

University of Toronto
Introduction to Computer Science
CSC148 courSe LOG (SLOG)
Student exercises
A new Start Point ...