CS61B Review Doc
Created by Yunhao Cao (Github@ToiletCommander) in Fall 2021 for UC Berkeley CS61B.
Reference Notice: Material highly and mostly derived from Prof. Hillfinger's lecture slides, some ideas were borrowed from CS61C Fall 2021 Lectures(Wawrzynek & Weaver) as well as wikipedia lol.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
Basics
Language Features
Pointers
Destructive / non-destructive functions
- Destructive - means if a parameter is passed in as pointer / reference, the place the parameter is pointed to would be changed
- Non-destructive - means if the parameter is passed in as pointer / reference, the place the parameter is pointed to would not changed
Lambda Expressions
Should remind you of a special class Consumer<T>, has signature
class Consumer<T>{
public void accept(T something);
}
To implement Consumer<T>, either use:
//Anonymous class expression
Consumer<String> printerLambda = new Consumer<String>(){
public void accept(String something){
System.out.println(something);
}
}
or
//Single-line lambda expression
Consumer<String> printerLambda = (something) -> System.out.println(something);
or
//Multi-line lambda expression
Consumer<String> printerLambda = (something) -> {
System.out.println(something);
return;
}
Static Imports
Say I want to use System.out.println many times, but println is actually a static method, and out is a static class in System, so what do we do?
import static System.out;
And now we can use out.println as many times as we want.
Arrays
To declare an array of type T, use the following expression
T[] varName = new T[lengthOfArray];
Note that T can be any primitive / class types, including array as an acceptable type!
Some fun things to look at may include:
- Accessing Array Elements
//this prints out ith element of varName, i ranging from 0 to varName.length
System.out.println(varName[i]);
- Copying Array Contents
import java.lang.System;
System.arrayCopy(src,srcStartIndex,dest,destStartIndex,length);
Note that System.arrayCopy has the following signature
public static void arraycopy(Object src, int srcPos, Object dest, int destPos, int length)
Parameters
src − This is the source array.
srcPos − This is the starting position in the source array.
dest − This is the destination array.
destPos − This is the starting position in the destination data.
length − This is the number of array elements to be copied.
Exception
IndexOutOfBoundsException − if copying would cause access of data outside array bounds.
ArrayStoreException − if an element in the src array could not be stored into the dest array because of a type mismatch.
NullPointerException − if either src or dest is null.
Access modifiers
Determines if a class method(function) / member(variable) can be seen by itself / its sprouts / other classes
We have:
- public - can be seen by everyone
- protected - can be seen by itself and its sprouts
- private - can be only seen by itself
Other modifiers
- static - means there's only one instance per class
- final - means the value cannot be changed after its construction
- default - in an interface declaration you can provide a default implementation for abstract methods.
Some Commonly Used Interfaces / Types in 61B
/**
* The comparable interface defines a class that can be compared to other classes
* compares to returns value that < 0 if this item is smaller, = 0 is this item is the
* same as o, or > 0 is this item is bigger than o
*
* In normal settings like int, usually A.compareTo(B) returns A - B;
*/
interface Comparable<T>{
int compareTo(T o);
}
Typings
Integer Typings
Yeah, it sucks, but 61B actually teaches you something that 61C should be teaching you - type conventions and integer radices, bitwise operations etc.
But good news is that it's not as hard as 61C, so yeahhh?
We will touch on how Java represents its integers in memory later on...
But for now, think about how many numbers can each type represent?
Say we have an integer type that has n bits to store, we know each bit position can take 0 / 1 as its value, so how many numbers(N) can this type represent?
because we can take 2 different values for each bit position and we have n different positions!
Convert Numbers
We have number 159 in decimal, how do we convert this into binary?
Thus 159 in binary is 0b010011111
, I left the leftmost 0 there to avoid confusion with negative numbers.
Then what is 159 in hexidecimal? I love to view it starting from binary perspective
159 = 0b1001.1111
You see that I removed the front 0 because each 4 bit corresponds to a hexidecimal character.
159 = 0x9F
since 0b1001 = 9 in decimal, and 9 in dec. corresponds to 9 in hex, and 0b1111 = 15 in decimal, and 15 corresponds to F in hex.
Note: A lot of confusion when talking about bitwise memories is the notion of Bit and Byte
And also just a side note: Let's say our data is 0xABCDEFGHIJKLMN, let the alphabets represent bit positions, we say that the N position is the LSB(Least Significant Bit), and A position is MSB(Most Significant Bit)
Just for a little bit extension and fun, see how this idea of MSB and LSB extends to the unit of byte and how data is laid out in memory (a concept in 61C)
Overflow - how does java deal with bigger number than type bounds?
Ans: Using finite fields, overflowed numbers will always be "wrapped around".
Complex Two's - How java represents signed integers
So basically the MSB(bit) is used as a sign(0 for positive, 1 for negative).
And when the sign is positive, just interpret it as an unsigned value.
If the sign is negative, you can calculate the value of remaining bits as k, and suppose your integer type has n bit positions, the real value of your memory is
For example, if I have a 16-bit integer, a value of 0b1111111111111111
(16 of 1)
we would interpret it as , and the value of it being
Easier two ways to convert complex two's negative values:
- flip the bits first and add 1 to the final result.
- minus the number by 1 first and then flip the bits.
The two methods both work!
Think about this representation, why not just use first bit as the positive/negative sign and the rest bit as magnitude? Well the "magnitude" representation also exists and you will probably learn it in CS61C. But think about our complex 2's representation - adding 1 to max positive(in binary, a zero bit followed by arbitrary 1 bits) will result in a new sign bit of 1, followed by arbitrary 0 bits and the result will be exactly the min negative representation in our complex two's - so the java overflow automatically happens inside circuit logic - how cool is that! And also think about addition...
Bit operations in Java
Operation | & | | | ^ | ! |
Description | AND(Bitwise Mask) | OR(Bitwise Set) | XOR(Bitwise Flip) | NOT(Flip ALL) |
A | 00101100 | 00101100 | 00101100 | |
B | 10100111 | 10100111 | 10100111 | 10100111 |
A op B | 00100100 | 10101111 | 10001011 | 01011000 |
Operation | << | >> | >>> |
Description | Left Shift | Arithmetic Right (Sign-extended right shift) | Logical Right (Zero-extended right shift) |
A | 10101101 | 10101101 | 10101100 |
B | 3 | 3 | 3 |
A op B | 01101000 | 11110101 | 00010101 |
Algorithms
General
Complexity
We will define cost first.
In engineering scenarios, there are lots of costs, including:
- Operational cost(for programs, time to run, space requirements).
- Development costs: How much engineering time? When delivered?
- Maintenance costs: Upgrades, bug fixes.
- Costs of failure: How robust? How safe?
Currently stuck in lecture16 Complexity
In 61B we use asymptotic bound to determine the time complexity of a program
What does this mean?
We will use notion where is an coefficient-free function based on N, the size of input of function f.
We have being the upper bound (), being the tight bound () and being the lower bound()
Amortized Time 平摊时间
Note that there's also the concept of amortized time: actual costs of a sequence of N operations are , which may differ from each other by arbitrary amounts and where .
Now if there exist another sequence where .
If then we will say the operations all run in amortized time
Tree
Traversal Order
Depth-first traversal (dotted path) of a binary tree:
- Pre-order (node visited at position red ●): F, B, A, D, C, E, G, I, H;
- In-order (node visited at position green ●): A, B, C, D, E, F, G, H, I;
- Post-order (node visited at position blue ●): A, C, E, D, B, H, I, G, F.
Source: https://en.wikipedia.org/wiki/Tree_traversal
Hashing
Characteristics
Takes in a binary stream and outputs a fixed-length interger
- Must output exactly the same integer when the same binary stream is given
- For it to be good:
- Very hard to find two binary streams that outputs the same fixed-length integer (avoid hash collision)
Hash Table
Array called "buckets" to store Linked List of items
Load Factor = where N is the number of items stored and M is the bucket size
Open Addressing
- Linear probes: If there's a collision at , try
- Quadratic Probes: try
- Double hashing: try
Heapify:
Heap can be viewed as array representing a priority queue, which the 0th item of the array is discarded.
Heapify: we have an array already, and we can heapify from the middle index downward moving up an index for each operation
Merge Sort
divide data in 2 equal parts; recursively sort halves ⇒ merge results
Quicksort
Partition data into pieces, everything > a pivot value at the high end of the sequence to be sorted, and everything ≤ on the low end
Usually the pivot is chosen as the median between first, middle, and last element
When the array is small enough, use insertion sort
Insertion Sort
Start with empty sequence of outputs, add each item from input, inserting into output sequence at right point.
Inversion Sort
Shell's Sort
Radix Sort
Sort keys one character at a time
we have
- LSD(Least Significant Digit) radix sort (Direction: ←)
- MSD radix sort (Direction →)
- Must keep lists from each step seperate
The Trie 字典树
Multi-way decision tree with one decision per character of key
Skip List
heights are random
Red-Black Tree 红黑树
Properties:
- Root is black
- Every leaf node is NULL and is black
- Every leaf node has same number of black ancestors
- From any point on graph to every leaf, we pass through same number of black nodes
- Every internal node has two children
- Every red node has two black children.
- Guarentees searches
Dijkstra's Algorithm
Use priority queue to update distance to every node.
A* Search A* 寻路算法
A hueristic distance is needed, but for consistance the heuristic funciton has to be:
- Has to follow triangular inequality
-
- Not too high
- having for every heuristic value makes it a dijkstra
We will search just like what dijkstra, but we will stop when we take target from queue.
Minimum Spanning Trees
Prim's Algorithm
Start from an arbitraty node, at each step we added the shortest edge connecting some node already in the tree to one that isn't yet.
So obtain a subgroup and another subgroup that contains vertices not selected in the first subgroup. Whatever edges that hasn't been selected yet that connects the first group with the second and has the smallest weight will be selected.
Kruskal's Algorithm,
Sort the edges by their weights, everytime check if the smallest edge would add a cycle to the graph. If it doesn't then it gets added to the graph.
Compression
Shannon-Fano Coding
- Count frequencies of all characters in text to be compressed
- Break grouped characters into groups of roughly equal frequency
- Encode left group with leading 1, right group with leading 1
- Repeat until all groups are of size 1
Huffman Coding 哈夫曼编码
- Put each symbol in a node labeled with the symbol's relative frequency
- Repeat the following until there is just one node:
- Combine the two nodes with smallest frequencies as childre of a new single node whose frequency is the sum of those of the two nodes being combined.
- Let the edge to the left child be labeled as "0" and to the right be labeled as "1"
LZW Coding (multiple symbols per codeword)
Encoding:
- Start with a trivial mapping of codewords to single symbols
- After outputting a codeword that matches the longest possible prefix X, of the remaining input, add a new codeword Y that maps to the substring X followed by the next input symbol.
Decoding:
- Reconstruct the table as we process
- ⇒ The symbol encoded by the codeword X
- ⇒ character k of string Y
- For each codeword X, add to our result
- Whenever we decoded two consecutive codewords, X1 and X2, add a new codeword that maps to