1. Precision and recall
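Precision is how precise I am at showing good stuff on my website: of the reviews I predict to be positive, what fraction really are positive.
Recall is how good I am at finding all the positive reviews: of the reviews that really are positive, what fraction I predict to be positive.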
                | Predicted y = +1 | Predicted y = -1 |
True label = +1 | true positive    | false negative   |
True label = -1 | false positive   | true negative    |
precision = number of true positives / (number of true positives + number of false positives)
recall = number of true positives / (number of true positives + number of false negatives)
Pessimistic model (predicts positive only when very sure): high precision, low recall
Optimistic model (predicts positive for almost everything): low precision, high recall
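A minimal sketch of the two formulas, and of how a pessimistic vs. optimistic model trades one for the other. The scores, labels, thresholds and helper names (`confusion_counts`, `precision_recall`, `predict`) are all made up for illustration. Returning 0.0 when a denominator is zero is just one convention, not something from these notes.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for labels in {+1, -1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    return tp, fp, fn, tn

def precision_recall(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    # Assumption: report 0.0 when the ratio is undefined (empty denominator).
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return precision, recall

def predict(scores, threshold):
    """Predict +1 when the score reaches the threshold, else -1."""
    return [1 if s >= threshold else -1 for s in scores]

# Made-up true labels and made-up model scores P(y = +1 | x).
y_true = [1, 1, 1, 1, -1, -1, -1, -1]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.10]

# High threshold = pessimistic model, low threshold = optimistic model.
pessimistic = precision_recall(y_true, predict(scores, 0.9))
optimistic = precision_recall(y_true, predict(scores, 0.15))
print("pessimistic (precision, recall):", pessimistic)
print("optimistic  (precision, recall):", optimistic)
```

On this toy data the pessimistic threshold gives precision 1.0 and recall 0.25, while the optimistic one gives precision 4/7 and recall 1.0.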
2. Stochastic ascent
Gradient ascent is slow because every update requires a full pass over the data.
Stochastic gradient ascent uses only a small subset of the data (as little as one data point) for each update.
Stochastic gradient ascent usually converges faster than gradient ascent; however, it is very sensitive to parameters like the step size.
The gradient is the direction of steepest ascent, but any direction that goes up would be useful for ascent.
Stochastic gradient works because the gradient contributions of most data points point in an upward direction.
At the end, stochastic ascent oscillates a bit (noisily) around the optimum.
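A small sketch of why single data points are still useful. It assumes the objective is the logistic regression log-likelihood with labels in {+1, -1} (the notes do not name the model), and the toy data and helper names are made up. The full gradient is the sum of the per-point contributions, and a contribution "goes up" when its dot product with the full gradient is positive.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def point_gradient(w, x, y):
    """Gradient of ln sigmoid(y * w.x) with respect to w, for y in {+1, -1}."""
    margin = y * sum(wj * xj for wj, xj in zip(w, x))
    scale = y * sigmoid(-margin)
    return [scale * xj for xj in x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Made-up toy data; the last point is deliberately a noisy (mislabeled) one.
X = [(1.0, 2.0), (2.0, 1.0), (-1.0, -1.0), (-2.0, -0.5), (1.0, 1.0)]
y = [1, 1, -1, -1, -1]
w = [0.0, 0.0]

# One contribution per data point; gradient ascent uses their sum,
# stochastic gradient ascent uses one of them at a time.
contributions = [point_gradient(w, xi, yi) for xi, yi in zip(X, y)]
full_gradient = [sum(g[j] for g in contributions) for j in range(len(w))]

upward = sum(1 for g in contributions if dot(g, full_gradient) > 0)
print(full_gradient, "|", upward, "of", len(X), "points go uphill")
```

Here 4 of the 5 contributions point uphill; only the noisy point pushes the wrong way, which is the kind of step that makes stochastic ascent noisy.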
Issues:
1. Systematic order in data can introduce significant bias
- shuffle the data before running stochastic ascent
2. If the step size is too small, convergence takes a long time; if it is too large, the updates oscillate a lot and behave erratically
- a step size that decreases with iterations is very important (e.g. the initial step size divided by the iteration number)
3. It never fully converges, so do not trust the last coefficients
- output the average weight vector, 1/T (W1 + ... + WT)
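The three fixes above can be sketched together as follows. This is only an illustration under assumptions: the model is taken to be logistic regression with labels in {+1, -1}, updates use one data point at a time, and the function name, `step0`, `passes`, `seed` and the toy data are all hypothetical choices, not values from these notes.

```python
import math
import random

def sigmoid(z):
    # Numerically stable form, so large |z| does not overflow math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def stochastic_gradient_ascent(X, y, step0=0.5, passes=50, seed=0):
    """Return (last weights, averaged weights) for labels in {+1, -1}."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    w = [0.0] * len(X[0])
    w_sum = [0.0] * len(w)
    t = 0
    for _ in range(passes):
        rng.shuffle(order)            # fix 1: remove systematic order
        for i in order:
            t += 1
            step = step0 / t          # fix 2: step size divided by iteration
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            scale = y[i] * sigmoid(-margin)
            w = [wj + step * scale * xj for wj, xj in zip(w, X[i])]
            w_sum = [s + wj for s, wj in zip(w_sum, w)]
    w_avg = [s / t for s in w_sum]    # fix 3: 1/T (W1 + ... + WT)
    return w, w_avg

# Made-up, cleanly separable toy data.
X = [(1.0, 2.0), (2.0, 1.0), (-1.0, -1.0), (-2.0, -0.5)]
y = [1, 1, -1, -1]

w_last, w_avg = stochastic_gradient_ascent(X, y)
print("last weights:    ", w_last)
print("averaged weights:", w_avg)
```

Dividing by the iteration number shrinks the step quickly; other decreasing schedules exist, and which one works best is something to tune rather than something this sketch settles.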