## Andrew Ng Machine Learning Course Notes -- Bayesian Statistics and Regularization


Bayesian statistics: it turns out that if you add this prior term, the optimization objective you end up optimizing is the one where you add an extra term that penalizes your parameter theta for being large. This algorithm tends to keep your parameters small, and keeping the parameters small has the effect of making the functions you fit smoother and less likely to overfit. Plain logistic regression would be very much prone to overfitting, but it turns out that with this sort of Bayesian regularization, with a Gaussian prior, logistic regression becomes a very effective text classification algorithm.
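The MAP estimate with a Gaussian prior on theta is equivalent to maximizing the log-likelihood minus an L2 penalty. A minimal sketch of that regularized logistic regression fit in NumPy (the function name, learning rate, and tiny synthetic dataset are illustrative assumptions, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_l2(X, y, lam=1.0, lr=0.1, n_iter=500):
    """Gradient ascent on log-likelihood(theta) - (lam/2)*||theta||^2,
    i.e. MAP estimation under a Gaussian prior on theta."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iter):
        # Gradient of the penalized log-likelihood: the -lam*theta term
        # shrinks the parameters toward zero, giving a smoother fit.
        grad = X.T @ (y - sigmoid(X @ theta)) - lam * theta
        theta += lr * grad / n
    return theta

# Tiny illustration on separable synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
theta = fit_logistic_l2(X, y, lam=1.0)
```

Increasing `lam` (a stronger prior) shrinks theta harder, trading training accuracy for less overfitting.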

Regularization:

Online learning: a setting in which you have to make predictions even while you are still in the process of learning.
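The predict-then-update loop of online learning can be sketched with the perceptron, a classic online algorithm (the function name and the example stream are assumptions for illustration; labels are in {-1, +1}):

```python
import numpy as np

def online_perceptron(stream):
    """Process (x, y) pairs one at a time: predict first, then update.
    Returns the learned weights and the number of online mistakes."""
    theta = None
    mistakes = 0
    for x, y in stream:
        x = np.asarray(x, dtype=float)
        if theta is None:
            theta = np.zeros_like(x)
        pred = 1 if theta @ x >= 0 else -1   # predict before seeing y
        if pred != y:                        # learn only from mistakes
            mistakes += 1
            theta += y * x
    return theta, mistakes
```

The key property of the online setting is that every prediction is made before the true label is revealed, so performance is measured by the total number of mistakes along the way.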

Advice for applying machine learning:

Diagnostics for debugging learning algorithms:

A brief discussion of error analysis and ablative analysis:

Advice for how to get started on a machine learning problem:

In case 1, let's say that J(θ_SVM) is indeed greater than J(θ_BLR). But we know that Bayesian logistic regression was trying to maximize J(θ); that's the definition of Bayesian logistic regression. So this means the value of θ output by Bayesian logistic regression actually fails to maximize J, because the support vector machine returned a value of θ that does a better job of maximizing J. This tells me that Bayesian logistic regression did not actually maximize J correctly, and so the problem is with the optimization algorithm: the optimization algorithm has not converged.

The other case is as follows: J(θ_BLR) ≥ J(θ_SVM) means that Bayesian logistic regression actually attains a higher value of the optimization objective J than the SVM does. The SVM, which does worse on your optimization objective, actually does better on the weighted accuracy measure. So something that does worse on your optimization objective J can actually do better on the weighted accuracy objective, and this really means that maximizing J(θ) does not correspond that well to maximizing your weighted accuracy criterion. That tells you that J(θ) is probably the wrong optimization objective to be maximizing.
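The two-case diagnostic above can be sketched as a simple comparison. This assumes, as in the lecture, that the SVM already beats Bayesian logistic regression on the weighted accuracy measure; the function name and return strings are illustrative:

```python
def diagnose(J, theta_blr, theta_svm):
    """Sketch of the diagnostic from the notes, assuming the SVM already
    beats Bayesian logistic regression (BLR) on weighted accuracy.
    J is the objective that BLR was supposed to maximize."""
    if J(theta_svm) > J(theta_blr):
        # Case 1: the SVM found a theta with higher J, so BLR's
        # optimizer failed, e.g. it has not converged.
        return "optimization: BLR did not maximize J (has not converged)"
    # Case 2: BLR attains higher J yet worse accuracy, so maximizing J
    # does not track the accuracy criterion; J is the wrong objective.
    return "objective: maximizing J does not track weighted accuracy"
```

In case 1 the fix is a better optimizer (more iterations, a different method); in case 2 the fix is changing the objective J itself.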
