**MATHEMATICAL FOUNDATIONS OF MACHINE LEARNING**

**Associate Prof. Dr.Sc. Hông Vân Lê**

**DOWNLOAD PDF of the flyer of the course**

**DOWNLOAD PDF of the LECTURE NOTES of the course**

Machine learning is an interdisciplinary field at the intersection of mathematical statistics and computer science. It studies statistical models and algorithms for deriving predictors or meaningful patterns from empirical data. Machine learning techniques are applied in search engines, speech recognition and natural language processing, image detection, robotics, etc. In our course we address the following questions:

What is the mathematical model of learning? How can we quantify the difficulty/hardness/complexity of a learning problem? How should we choose a learning model and a learning algorithm? How do we measure the success of machine learning?

**The syllabus of our course:**

- Supervised learning, unsupervised learning
- Generalization ability of machine learning
- Support vector machines, kernel machines
- Neural networks and deep learning
- Bayesian machine learning and Bayesian networks.

**Recommended Literature.**

- S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.
- S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective, Elsevier, 2015.
- M. Mohri, A. Rostamizadeh, A. Talwalkar, Foundations of Machine Learning, MIT Press, 2012.
- H. V. Lê, Mathematical foundations of machine learning, lecture notes, http://users.math.cas.cz/hvle/MFML.pdf

During the course we shall discuss topics for a term paper, which can count as the exam.

The first meeting shall take place at 10:40 AM on Thursday, October 3, 2019, in the seminar room MU MFF UK (3rd floor). Anyone interested in the lecture course is welcome to contact me by email at hvle [at] math.cas.cz to arrange a more suitable lecture time.

**Location:** Institute of Mathematics of the Czech Academy of Sciences, Zitna 25, 11567 Praha 1, Czech Republic

Lecture course (NMAG 469, Fall term 2019-2020)

- Mathematical foundations of machine learning. The first meeting: October 03, Thursday, 10.40-12.10, in the seminar room MU MFF UK (3rd floor).

**CONTENTS**

- Learning, machine learning and artificial intelligence

1.1. Learning, inductive learning and machine learning

1.2. A brief history of machine learning

1.3. Current tasks and types of machine learning

1.4. Basic questions in mathematical foundations of machine learning

1.5. Conclusion

- Statistical models and frameworks for supervised learning

2.1. Discriminative model of supervised learning

2.2. Generative model of supervised learning

2.3. Empirical Risk Minimization and overfitting

2.4. Conclusion

- Statistical models and frameworks for unsupervised learning and reinforcement learning

3.1. Statistical models and frameworks for density estimation

3.2. Statistical models and frameworks for clustering

3.3. Statistical models and frameworks for dimension reduction and manifold learning

3.4. Statistical model and framework for reinforcement learning

3.5. Conclusion

- Fisher metric and maximum likelihood estimator

4.1. The space of all probability measures and total variation norm

4.2. Fisher metric on a statistical model

4.3. The Fisher metric, MSE and Cramér-Rao inequality

4.4. Efficient estimators and MLE

4.5. Consistency of MLE

4.6. Conclusion

- Consistency of a learning algorithm

5.1. Consistent learning algorithm and its sample complexity

5.2. Uniformly consistent learning and VC-dimension

5.3. Fundamental theorem of binary classification

5.4. Conclusions

- Generalization ability of a learning machine and model selection

6.1. Covering number and sample complexity

6.2. Rademacher complexities and sample complexity

6.3. Model selection

6.4. Conclusion

- Support vector machines

7.1. Linear classifier and hard SVM

7.2. Soft SVM

7.3. Sample complexities of SVM

7.4. Conclusion

- Kernel based SVMs

8.1. Kernel trick

8.2. PSD kernels and reproducing kernel Hilbert spaces

8.3. Kernel based SVMs and their generalization ability

8.4. Conclusion

- Neural networks

9.1. Neural networks as computing devices

9.2. The expressive power of neural networks

9.3. Sample complexities of neural networks

9.4. Conclusion

- Training neural networks

10.1. Gradient and subgradient descent

10.2. Stochastic gradient descent (SGD)

10.3. Online gradient descent and online learnability

10.4. Conclusion

- Bayesian machine learning

11.1. Bayesian concept of learning

11.2. Estimating decisions using posterior distributions

11.3. Bayesian model selection

11.4. Conclusion

Appendix A. Some basic notions in probability theory

A.1. Dominating measures and the Radon-Nikodym theorem

A.2. Conditional expectation and regular conditional measure

A.3. Joint distribution and Bayes’ theorem

A.4. Transition measure, Markov kernel, and parameterized statistical model

Appendix B. Concentration-of-measure inequalities

B.1. Markov’s inequality

B.2. Hoeffding’s inequality

B.3. Bernstein’s inequality

B.4. McDiarmid’s inequality

References
