MATHEMATICAL FOUNDATIONS OF MACHINE LEARNING

Associate Prof. Dr.Sc. Hông Vân Lê


Machine learning is an interdisciplinary field at the intersection of mathematical statistics and computer science. It studies statistical models and algorithms for deriving predictors, or meaningful patterns, from empirical data. Machine learning techniques are applied in search engines, speech recognition, natural language processing, image detection, robotics, etc. In our course we address the following questions:

What is a mathematical model of learning? How can the difficulty/hardness/complexity of a learning problem be quantified? How should a learning model and a learning algorithm be chosen? How can the success of machine learning be measured?
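One standard way to make the last question concrete is the empirical risk: the average loss of a candidate predictor on a sample of labelled data. The following is a minimal illustrative sketch (the function and data names are invented for this example and do not come from the course materials), using the 0-1 loss for binary classification:

```python
# Illustrative sketch: empirical risk (average 0-1 loss) of a candidate
# predictor h on a labelled sample S = ((x_1, y_1), ..., (x_n, y_n)).

def empirical_risk(h, sample):
    """Fraction of sample points on which predictor h disagrees with the label."""
    return sum(1 for x, y in sample if h(x) != y) / len(sample)

# A toy binary classification sample and a simple threshold predictor.
sample = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.5, 0)]
h = lambda x: 1 if x >= 0.5 else 0

print(empirical_risk(h, sample))  # prints 0.2: one of five points is misclassified
```

Much of the course (empirical risk minimization, sample complexity, generalization bounds) concerns when and how well this empirical quantity approximates the true, unknown risk of the predictor.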

The syllabus of our course:

Supervised learning, unsupervised learning
Generalization ability of machine learning
Support vector machine, Kernel machine
Neural networks and deep learning
Bayesian machine learning and Bayesian networks.

Recommended Literature.

S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective, Elsevier, 2015.
M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of Machine Learning, MIT Press, 2012.
H. V. Lê, Mathematical Foundations of Machine Learning, lecture notes, http://users.math.cas.cz/hvle/MFML.pdf
During the course we shall discuss topics for a term paper assignment, which may count as the exam.

The first meeting will take place at 10:40 AM on Thursday, October 3, 2019, in the seminar room of MU MFF UK (3rd floor). Anybody interested in the lecture course, please contact me by email at hvle [at] math.cas.cz to arrange a more suitable lecture time.

Location: Institute of Mathematics of the Czech Academy of Sciences, Žitná 25, 115 67 Praha 1, Czech Republic

CALENDAR OF THE COURSE

Lecture course (NMAG 469, Fall term 2019-2020)

Mathematical foundations of machine learning. The first meeting: Thursday, October 03, 10:40-12:10, in the seminar room of MU MFF UK (3rd floor).

CONTENTS

1. Learning, machine learning and artificial intelligence
1.1. Learning, inductive learning and machine learning
1.2. A brief history of machine learning
1.3. Current tasks and types of machine learning
1.4. Basic questions in mathematical foundations of machine learning
1.5. Conclusion

2. Statistical models and frameworks for supervised learning
2.1. Discriminative model of supervised learning
2.2. Generative model of supervised learning
2.3. Empirical Risk Minimization and overfitting
2.4. Conclusion

3. Statistical models and frameworks for unsupervised learning and reinforcement learning
3.1. Statistical models and frameworks for density estimation
3.2. Statistical models and frameworks for clustering
3.3. Statistical models and frameworks for dimension reduction and manifold learning
3.4. Statistical model and framework for reinforcement learning
3.5. Conclusion

4. Fisher metric and maximum likelihood estimator
4.1. The space of all probability measures and total variation norm
4.2. Fisher metric on a statistical model
4.3. The Fisher metric, MSE and Cramér-Rao inequality
4.4. Efficient estimators and MLE
4.5. Consistency of MLE
4.6. Conclusion

5. Consistency of a learning algorithm
5.1. Consistent learning algorithm and its sample complexity
5.2. Uniformly consistent learning and VC-dimension
5.3. Fundamental theorem of binary classification
5.4. Conclusions

6. Generalization ability of a learning machine and model selection
6.1. Covering number and sample complexity
6.2. Rademacher complexities and sample complexity
6.3. Model selection
6.4. Conclusion

7. Support vector machines
7.1. Linear classifier and hard SVM
7.2. Soft SVM
7.3. Sample complexities of SVM
7.4. Conclusion

8. Kernel based SVMs
8.1. Kernel trick
8.2. PSD kernels and reproducing kernel Hilbert spaces
8.3. Kernel based SVMs and their generalization ability
8.4. Conclusion

9. Neural networks
9.1. Neural networks as computing devices
9.2. The expressive power of neural networks
9.3. Sample complexities of neural networks
9.4. Conclusion

10. Training neural networks
10.1. Gradient and subgradient descent
10.2. Stochastic gradient descent (SGD)
10.3. Online gradient descent and online learnability
10.4. Conclusion

11. Bayesian machine learning
11.1. Bayesian concept of learning
11.2. Estimating decisions using posterior distributions
11.3. Bayesian model selection
11.4. Conclusion

Appendix A. Some basic notions in probability theory
A.1. Dominating measures and the Radon-Nikodym theorem
A.2. Conditional expectation and regular conditional measure
A.3. Joint distribution and Bayes' theorem
A.4. Transition measure, Markov kernel, and parameterized statistical model

Appendix B. Concentration-of-measure inequalities
B.1. Markov's inequality
B.2. Hoeffding's inequality
B.3. Bernstein's inequality
B.4. McDiarmid's inequality

References