MATHEMATICAL FOUNDATIONS OF MACHINE LEARNING
Associate Prof. Dr.Sc. Hông Vân Lê
DOWNLOAD PDF of the flyer of the course
DOWNLOAD PDF of the LECTURE NOTES of the course
Machine learning is an interdisciplinary field at the intersection of mathematical statistics and computer science. It studies statistical models and algorithms for deriving predictors or meaningful patterns from empirical data. Machine learning techniques are applied in search engines, speech recognition and natural language processing, image detection, robotics, and other areas. In our course we address the following questions:
What is the mathematical model of learning? How can we quantify the difficulty (hardness, complexity) of a learning problem? How should we choose a learning model and a learning algorithm? How can we measure the success of machine learning?
The syllabus of our course:
Supervised learning, unsupervised learning
Generalization ability of machine learning
Support vector machines, kernel machines
Neural networks and deep learning
Bayesian machine learning and Bayesian networks
Recommended Literature.
S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective, Elsevier, 2015.
M. Mohri, A. Rostamizadeh, A. Talwalkar, Foundations of Machine Learning, MIT Press, 2012.
H. V. Lê, Mathematical foundations of machine learning, lecture notes, http://users.math.cas.cz/hvle/MFML.pdf
During the course we shall discuss topics for a term paper assignment, which may count as the exam.
The first meeting shall take place at 10:40 AM on Thursday, October 3, 2019, in the seminar room MU MFF UK (3rd floor). Anyone interested in the lecture course is welcome to contact me by email at hvle [at] math.cas.cz to arrange a more suitable lecture time.
Location: Institute of Mathematics of the Czech Academy of Sciences, Zitna 25, 11567 Praha 1, Czech Republic
CALENDAR OF FUTURE COURSE
Lecture course (NMAG 469, Fall term 2019-2020)
Mathematical foundations of machine learning. The first meeting: Thursday, October 03, 10.40-12.10, in the seminar room MU MFF UK (3rd floor).
CONTENTS
Learning, machine learning and artificial intelligence
1.1. Learning, inductive learning and machine learning
1.2. A brief history of machine learning
1.3. Current tasks and types of machine learning
1.4. Basic questions in mathematical foundations of machine learning
1.5. Conclusion
Statistical models and frameworks for supervised learning
2.1. Discriminative model of supervised learning
2.2. Generative model of supervised learning
2.3. Empirical Risk Minimization and overfitting
2.4. Conclusion
Statistical models and frameworks for unsupervised learning and reinforcement learning
3.1. Statistical models and frameworks for density estimation
3.2. Statistical models and frameworks for clustering
3.3. Statistical models and frameworks for dimension reduction and manifold learning
3.4. Statistical model and framework for reinforcement learning
3.5. Conclusion
Fisher metric and maximum likelihood estimator
4.1. The space of all probability measures and total variation norm
4.2. Fisher metric on a statistical model
4.3. The Fisher metric, MSE and Cramér-Rao inequality
4.4. Efficient estimators and MLE
4.5. Consistency of MLE
4.6. Conclusion
Consistency of a learning algorithm
5.1. Consistent learning algorithm and its sample complexity
5.2. Uniformly consistent learning and VC-dimension
5.3. Fundamental theorem of binary classification
5.4. Conclusions
Generalization ability of a learning machine and model selection
6.1. Covering number and sample complexity
6.2. Rademacher complexities and sample complexity
6.3. Model selection
6.4. Conclusion
Support vector machines
7.1. Linear classifier and hard SVM
7.2. Soft SVM
7.3. Sample complexities of SVM
7.4. Conclusion
Kernel based SVMs
8.1. Kernel trick
8.2. PSD kernels and reproducing kernel Hilbert spaces
8.3. Kernel based SVMs and their generalization ability
8.4. Conclusion
Neural networks
9.1. Neural networks as computing devices
9.2. The expressive power of neural networks
9.3. Sample complexities of neural networks
9.4. Conclusion
Training neural networks
10.1. Gradient and subgradient descent
10.2. Stochastic gradient descent (SGD)
10.3. Online gradient descent and online learnability
10.4. Conclusion
Bayesian machine learning
11.1. Bayesian concept of learning
11.2. Estimating decisions using posterior distributions
11.3. Bayesian model selection
11.4. Conclusion
Appendix A. Some basic notions in probability theory
A.1. Dominating measures and the Radon-Nikodym theorem
A.2. Conditional expectation and regular conditional measure
A.3. Joint distribution and Bayes’ theorem
A.4. Transition measure, Markov kernel, and parameterized statistical model
Appendix B. Concentration-of-measure inequalities
B.1. Markov’s inequality
B.2. Hoeffding’s inequality
B.3. Bernstein’s inequality
B.4. McDiarmid’s inequality
References