As we saw last time, searching goes a lot faster if the array is in sorted order. Algorithms for searching and sorting are some of the most fundamental. To sort a sequence means to re-arrange the elements of the sequence so that they are "in order," for instance either numeric order or alphabetical order. Let's concentrate on the idea of numeric order for now. Which of these arrays are sorted?
A[] = { 5, 9, 10, 11, 15, 20, 25 }B[] = { 5, 5, 6, 12, 15, 15, 16 }C[] = { 5, 6, 7, 8, 6, 7, 9, 10 }If you have an array of numbers, how do you tell whether or not it is sorted? One way is to look at each pair of adjacent numbers in the array, and ensure that the first one is no bigger than the second. For instance:
to CheckSorting given N[]: 1. Set index to 0 2. While index < (N.length - 1) do 3. If N[index] > N[index + 1] then 4. Print "Not sorted" 5. Halt and return False. 6. Else 7. Add 1 to index 8. End while 9. Print "Sorted" 10. Halt and return True.
This is similar to the sequential search algorithm described last time.
Now, suppose you have an array of numbers, and they are not sorted. How do you sort them? In other words, how do you rearrange them so that they are in the correct order? This is the problem of sorting.
to Sort given N[]:
N is an array of numbers in any order.Sort halts, running CheckSorting(N) should return True.N remain the same, only their order changes.Below are some intuitive descriptions of some sorting algorithms. Pretend like you've been dealt a hand of cards, which you're holding in your right hand.
Some demos, to follow up on the in-class demos:
In addition to being correct, we would ulimately like our algorithms to be efficient. For example, in writing an algorithm to play chess, we can't have one move take trillions of years. The analysis of algorithms is a complex and at times difficult process, but crucial to nearly every aspect of computer science. We will not spend too much time discussing the mathematics behind this, but I want to give you some appreciation for why this analysis is important -- beyond how long you have to wait for MS Word to start up. For example, such analysis can prove that your password is secure (more on this later).
Let's consider the following four algorithms on an array whose length is n.
| 1+1 | 2+1 | ... | 5+1 |
| 1+2 | 2+2 | ... | 5+2 |
| 1+3 | 2+3 | ... | 5+3 |
| 1+4 | 2+4 | ... | 5+4 |
| 1+5 | 2+5 | ... | 5+5 |
Now, computers are always getting faster, but these "orders of growth" help us see at a glance the inherent differences in run-time for different algorithms. Supposing a computer could do a single operation in 0.0001 second, we'd have the following total amounts of time, for various problem sizes and various orders of growth.
| order | 10 | 50 | 100 | 1000 |
|---|---|---|---|---|
| log(n) | 0.0003 s | 0.0006 s | 0.0007 s | 0.001 s |
| n | 0.001 s | 0.005 s | 0.01 s | 0.1 s |
| n² | 0.01 s | 0.25 s | 1 s | 1.67 min |
| 2n | 0.1024 s | 3570 yrs | 4x1018 yrs | forget about it |
Clearly, when designing algorithms we need to be careful. For example, a brute-force chess algorithm has runtime 2n which makes it completely impractical. Interestingly, though, this type of complexity can help us. In particular, the reason that it is difficult for someone to crack your password is because the best known algorithm for cracking passwords runs in 2n time (specifically factoring large numbers into primes).