课讲内容
这门课以8周设计,分成 4个核心问题,每个核心问题约需2周的时间来探讨.每个约2个小时的录影中,每个小时为一个主题,以会各分成4到5个小段落,每个段落里会有一个后多个随堂的练习.我们在探讨每个核心问题的第二周。依上所述,課程的規畫如下:
When Can Machines Learn? [何时可以使用机器学习]
- 第一周:(NTU-Coursera机器学习:机器学习问题与二元分类)
第一讲:The Learning Problem [机器学习问题]
第二讲:Learning to Answer Yes/No [二元分类] - 第二周:
作业一: (NTU-Coursera机器学习:HomeWork 1 Q15-20)
第三讲:Types of Learning [各式機器學習問題] (NTU-Coursera机器学习:Types of Learning )
第四讲:Feasibility of Learning [机器学习的可行性] (NTU-Coursera机器学习:机器学习的可行性 & 训练与测试 )
Why Can Machines Learn? [为什么机器可以学习]
- 第三週:
第五讲:Training versus Testing [训练与测试] (NTU-Coursera机器学习:机器学习的可行性 & 训练与测试)
第六讲:Theory of Generalization [举一反三的一般化理论] - 第四週:
作业二: (NTU-Coursera机器学习:HomeWork 2 Q16-20)
第七讲:The VC Dimension (NTU-Coursera机器学习:VC Bound和VC维度)
第八讲:Noise and Error (NTU-Coursera机器学习:Noise and Error)
How Can Machines Learn? [机器可以怎么样学习]
- 第五週:
第九讲:Linear Regression [線性迴歸]
第十讲:`Soft‘ Classification [軟性分類] - 第六週:
十一讲:Multiclass Classification [多類別分類]
十二讲:Nonlinear Transformation [非線性轉換]
How Can Machines Learn Better? [機器可以怎麼樣學得更好]
- 第七週:
十三讲:Hazard of Overfitting [過度訓練的危險]
十四讲:Regularization [探制調適] - 第八週:
十五讲:Validation [自我檢測]
十六讲:Three Learning Principles [三個機器學習的重要原則]
延伸阅读
預备知识
- 作业零(機率統計、線性代數、微分之基本知識)
参考书籍
- Learning from Data: A Short Course, Abu-Mostafa, Magdon-Ismail, Lin, 2013.
经典文献
- F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386-408, 1958. (第二講:Perceptron 的出處)
- W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963. (第四講:Hoeffding‘s Inequality)
- Y. S. Abu-Mostafa, X. Song , A. Nicholson, M. Magdon-ismail. The bin model, 1995. (第四講:bin model 的出處)
- V. Vapnik. The nature of statistical learning theory, 2nd edition, 2000. (第五到八講:VC dimension 與 VC bound 的完整數學推導及延伸)
- Y. S. Abu-Mostafa. The Vapnik-Chervonenkis dimension: information versus complexity in learning. Neural Computation, 1(3):312-317, 1989. (第七講:VC Dimension 的概念與重要性)
參考文獻
- A. Sadilek, S. Brennan, H. Kautz, and V. Silenzio. nEmesis: Which restaurants should you avoid today? First AAAI Conference on Human Computation and Crowdsourcing, 2013. (第一講:ML 在「食」的應用)
- Y. S. Abu-Mostafa. Machines that think for themselves. Scientific American, 289(7):78-81, 2012. (第一講:ML 在「衣」的應用)
- A. Tsanas, A. Xifara. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49: 560-567, 2012. (第一講:ML 在「住」的應用)
- J. Stallkamp, M. Schlipsing, J. Salmen, C. Igel. Introduction to the special issue on machine learning for traffic sign recognition. IEEE Transactions on Intelligent Transportation Systems 13(4): 1481-1483, 2012. (第一講:ML 在「行」的應用)
- R. Bell, J. Bennett, Y. Koren, and C. Volinsky. The million dollar programming prize. IEEE Spectrum, 46(5):29-33, 2009. (第一講:Netflix 大賽)
- S. I. Gallant. Perceptron-based learning algorithms. IEEE Transactions on Neural Networks, 1(2):179-191, 1990. (第二講:pocket 的出處,注意到實際的 pocket 演算法比我們介紹的要複雜)
- R. Xu, D. Wunsch II. Survey of clustering algorithms. IEEE Transactions on Neural Networks 16(3), 645-678, 2005. (第三講:Clustering)
- X. Zhu. Semi-supervised learning literature survey. University of Wisconsin Madison, 2008. (第三講:Semi-supervised)
- Z. Ghahramani. Unsupervised learning. In Advanced Lectures in Machine Learning (MLSS ’03), pages 72–112, 2004. (第三講:Unsupervised)
- L. Kaelbling, M. Littman, A. Moore. reinforcement learning: a survey. Journal of Artificial Intelligence Research, 4: 237-285. (第三講:Reinforcement)
- A. Blum. On-Line algorithms in machine learning. Carnegie Mellon University,1998. (第三講:Online)
- B. Settles. Active learning literature survey. University of Wisconsin Madison, 2010. (第三講:Active)
- D. Wolpert. The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7): 1341-1390. (第四講:No free lunch 的正式版)
- T. M. Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, 14(3):326–334, 1965. (第五到六講:Growth Function)
- B. Zadrozny, J. Langford, N. Abe. Cost sensitive learning by cost-proportionate example weighting. IEEE International Conference on Data Mining, 2003. (第八講:Weighted Classification)
- G. Sever, A. Lee. Linear Regression Analysis, 2nd Edition, Wiley, 2003. (第九講:Linear Regression 由統計學的角度來分析;第十二到十三講:Polynomial Transform 後再做 Linear Regression)
- D. C. Hoaglin, R. E. Welsch. The hat matrix in regression and ANOVA. American Statistician, 32:17–22, 1978. (第九講:Linear Regression 的 Hat Matrix)
- D. W. Hosmer, Jr., S. Lemeshow, R. X. Sturdivant. Applied Logistic Regression, 3rd Edition, Wiley, 2013 (第十講:Logistic Regression 由統計學的角度來分析)
- T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. International Conference on Machine Learning, (第十一講:Stochastic Gradient Descent 用在線性模型的理論分析)
- R. Rifkin, A. Klautau. In Defense of One-Vs-All Classification. Journal of Machine Learning Research, 5: 101-141, 2004. (第十一講:One-versus-all)
- J. Fürnkranz. Round Robin Classification. Journal of Machine Learning Research, 2: 721-747, 2002. (第十一講:One-versus-one)
- L. Li, H.-T. Lin. Optimizing 0/1 loss for perceptrons by random coordinate descent. In Proceedings of the 2007 International Joint Conference on Neural Networks (IJCNN ’07), pages 749–754, 2007. (第十一講:一個由最佳化角度出發的 Perceptron Algorithm)
- G.-X. Yuan, C.-H. Ho, C.-J. Lin. Recent advances of large-scale linear classification. Proceedings of IEEE, 2012. (第十一講:更先進的線性分類方法)
- Y.-W. Chang, C.-J. Hsieh, K.-W. Chang, M. Ringgaard, C.-J. Lin. Training and testing low-degree polynomial data mappings via linear SVM. Journal of Machine Learning Research, 11(2010), 1471-1490. (第十二講:一個使用多項式轉換加上線性分類模型的方法)
- M. Magdon-Ismail, A. Nicholson, Y. S. Abu-Mostafa. Learning in the presence of noise. In Intelligent Signal Processing. IEEE Press, 2001. (第十三講:Noise 和 Learning)
- A. Neumaier, Solving ill-conditioned and singular linear systems: A tutorial on regularization, SIAM Review 40 (1998), 636-666. (第十四講:Regularization)
- T. Poggio, S. Smale. The mathematics of learning: Dealing with data. Notices of the American Mathematical Society, 50(5):537–544, 2003. (第十四講:Regularization)
- P. Burman. A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika, 76(3): 503–514, 1989. (第十五講:Cross Validation)
- R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial intelligence (IJCAI ’95), volume 2, 1137–1143, 1995. (第十五講:Cross Validation)
- A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Occam’s razor. Information Processing Letters, 24(6):377–380, 1987. (第十六講:Occam‘s Razor)
关于Machine Learning更多讨论与交流,敬请关注本博客和新浪微博songzi_tea.
时间: 2024-10-12 18:56:44