Control Theory: MATH4406 / MATH7406

Taught by: Yoni Nazarathy, with Julia Kuhn (tutor) and Brendan Patch (tutor)

Semester 2, 2014.

This is the 2014 course web-site.
The current course web-site (2016) is here.


This course introduces several aspects of mathematical control theory, with a focus on Markov Decision Processes (MDP), also known as discrete stochastic dynamic programming. In this edition of the course (2014), the course mostly follows selected parts of Martin Puterman's book, "Markov Decision Processes". After covering basic ideas of dynamic programming and control theory in general, the emphasis shifts to the mathematical detail associated with MDP. The course begins with a background unit, ensuring students have sufficient knowledge of probability and Markov chains. This unit is followed by 9 further units, of which 6 focus on MDP and the other 3 aim to give the student a broader view of mathematical problems associated with control theory.

This web-page contains a detailed plan of the course as well as links to homework (HW) assignments and other resources. The course profile page, available to UQ students, can be accessed here. Some HW assignments may require computation; MATLAB is a natural tool for control theory, yet other software packages may be used as well (see software).


Key books and resources:

· [Put94] Puterman ML. Markov Decision Processes: Discrete Stochastic Dynamic Programming; 1994. UQ Library link (online).

· [AstKum14] Åström KJ, Kumar PR. "Control: A perspective." Automatica 50.1 (2014): 3-43. Link to paper.

· [BaurRied11] Bäuerle N, Rieder U. Markov Decision Processes with Applications to Finance; 2011. UQ Library link (online).

· [Ber00] Bertsekas DP. Dynamic Programming and Optimal Control, Vol 1 and Vol 2. Belmont, Mass: Athena Scientific; 2000. UQ Library link.

· [KaeLitCass98] Kaelbling LP, Littman ML, Cassandra AR. "Planning and acting in partially observable stochastic domains." Artificial Intelligence 101.1 (1998): 99-134. Link to paper.

· [Lue79] Luenberger DG. Introduction to Dynamic Systems: Theory, Models, and Applications. New York: Wiley; 1979. UQ Library link.

· [Mon82] Monahan GE. "A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms." Management Science 28.1 (1982): 1-16. Link to paper.

· [PerMar10] Albertos Pérez P, Mareels I. Feedback and Control for Everyone. Berlin: Springer Berlin Heidelberg; 2010. UQ Library link (online).


Other useful books and resources:

· The 2012 edition of this course.

· [AntMich07] Antsaklis PJ, Michel AN. A Linear Systems Primer. Boston, Mass: Birkhäuser Boston; 2007. UQ Library link (online).

· [Kir70] Kirk DE. Optimal Control Theory: An Introduction. Englewood Cliffs, N.J: Prentice-Hall; 1970. UQ Library link.

· [KwaSiv91] Kwakernaak H, Sivan R. Modern Signals and Systems; 1991. UQ Library link.

· [PolWil98] Polderman JW, Willems JC. Introduction to Mathematical Systems Theory, Vol 26. New York; Berlin: Springer-Verlag; 1998. UQ Library link.

· [SivKwa72] Sivan R, Kwakernaak H. Linear Optimal Control Systems. New York: Wiley Interscience; 1972. UQ Library link. Online through IEEE.

· [Son90] Sontag ED. Mathematical Control Theory: Deterministic Finite Dimensional Systems, Vol 6. New York; Berlin: Springer-Verlag; 1990. UQ Library link.



· The course is split into 10 units (0–9). Unit 0 is probability background; the other units are about control.

· There are 13 weeks with 4 hourly meetings per week, referenced as x.y, where x is in {1,…,13} and y is in {1,2,3,4}. Note that units are NOT directly mapped to weeks: the HW and/or quiz associated with a unit is often handled a few weeks later.

· There are 4 in-class quizzes (45 min each), 7 HW assignments, and an additional "course summary" assignment.

· Meetings of the form x.4 are typically (but not always) tutorial meetings, in which students may either ask questions about the HW or take a quiz.

· The teaching format varies with the material and the speed and depth at which it is to be covered.



Course Plan

Each unit below lists: Teaching Format and Key Resources; Relevant Literature; Lecture Meetings; Home Assignments and Additional Resources.


Unit 0: Probability and Markov Chain Background

Teaching format: Lecture notes, worked on board. First 3 hours by Dirk Kroese, because Yoni is away. Students who did not previously study probability will each receive an hour of one-on-one support from Brendan Patch.

Key resources: Basic Probability and Markov Chains Notes (now at v8). From class: SimpleDiscreteQueue.nb (Mathematica file); photos from 2.3.

Lecture meetings: Kroese: 1.1, 1.2, 1.3; Nazarathy: 2.1, 2.2, 2.3.

Home assignments: HW1 pre-submit: 1.4, 2.4; HW1 post-submit: 3.4; Quiz 1: 4.4 (Aug 19). HW1 due: Aug 12. Partial solutions by Brendan: BrendanHW1partialSolutions.pdf. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also). Quiz 1 with solution.

Additional resources: First (probability) chapters of a book by Kroese and Chan; illustrated probability notes and exercises by Kroese; probability notes by Richard Weber (Cambridge); Markov chain notes by Richard Weber (Cambridge); Phil Pollett's STAT3004 (intro to stochastic processes) course at UQ.
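As a small illustration of the background material (and of the kind of model in SimpleDiscreteQueue.nb), the sketch below computes the stationary distribution of a simple discrete-time queue with a finite buffer. The arrival and service probabilities are invented for illustration and are not taken from the course notes.

```python
import numpy as np

# Hypothetical parameters: arrival prob p, service prob q (per time slot).
p, q = 0.3, 0.5
n = 4  # queue lengths 0..3

# Birth-death transition matrix over queue lengths.
P = np.zeros((n, n))
for i in range(n):
    up = p * (1 - q) if i < n - 1 else 0.0   # arrival without service
    down = q * (1 - p) if i > 0 else 0.0     # service without arrival
    if i < n - 1:
        P[i, i + 1] = up
    if i > 0:
        P[i, i - 1] = down
    P[i, i] = 1.0 - up - down                # queue length unchanged

# Stationary distribution pi solves pi P = pi; for this small regular
# chain, plain power iteration converges quickly.
pi = np.full(n, 1.0 / n)
for _ in range(10_000):
    pi = pi @ P
```

Since arrivals are less likely than services here, the stationary mass concentrates on short queues.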


Unit 1: Introduction and Overview of the Various Concepts of Control Theory

Teaching format: Lecture slides; also the slides from class.

Relevant literature: [AstKum14]: all; [PerMar10]: all; [Put94]: 1.

Lecture meetings: 3.1, 3.2.

Assessment: Assessed only through part of the course summary.

Additional resources: Flyball governor video; Wolfram demo: PID inverted pendulum controls; Wolfram demo: PID spring-mass.



Unit 2: MDP Model Formulation, Basic Examples and Computation

Teaching format: On board, NOT following references closely. From class: Lects4.pdf (big ugly file).

Relevant literature: [Put94]: 2.1, 2.2, 3; [Ber00]: 1.1, 1.2.

Lecture meetings: 3.3, 4.1, 4.2.

Home assignments: HW2: 5.4. Due: Sep 2. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also).
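To make the (S, A, P, r) formulation of [Put94] Ch. 2–3 concrete, here is a minimal sketch of a two-state "machine maintenance" MDP and the expected total reward of a deterministic policy over a short horizon. All numbers and names (states, actions, rewards) are invented for illustration, not taken from the book.

```python
import numpy as np

S = ["working", "broken"]
A = ["run", "repair"]

# P[a][s, s'] = transition probability under action a; rows are distributions.
P = {
    "run":    np.array([[0.9, 0.1],    # a working machine may break
                        [0.0, 1.0]]),  # a broken machine stays broken
    "repair": np.array([[1.0, 0.0],
                        [0.8, 0.2]]),  # repair usually succeeds
}
# r[a][s] = expected one-step reward in state s under action a.
r = {
    "run":    np.array([10.0, 0.0]),   # reward only while working
    "repair": np.array([5.0, -2.0]),   # repairing costs money
}

# Sanity check of the model: each transition row is a distribution.
for a in A:
    assert np.allclose(P[a].sum(axis=1), 1.0)

def expected_total_reward(policy, horizon, start=0):
    """Expected total reward of a deterministic policy (state -> action),
    computed by propagating the distribution over states."""
    dist = np.zeros(len(S)); dist[start] = 1.0
    total = 0.0
    for _ in range(horizon):
        step_r = np.array([r[policy[s]][i] for i, s in enumerate(S)])
        total += dist @ step_r
        new_dist = np.zeros(len(S))
        for i, s in enumerate(S):
            new_dist += dist[i] * P[policy[s]][i]
        dist = new_dist
    return total

always_run = {"working": "run", "broken": "run"}
fix_when_broken = {"working": "run", "broken": "repair"}
```

Comparing the two policies over, say, 10 steps shows why ignoring a broken machine is a poor policy: under `always_run` the chain is absorbed in the zero-reward "broken" state.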









Unit 3: Finite Horizon MDP

Teaching format: On board, following [Put94] closely.

Relevant literature: [Put94]: 4.1.2, 4.2, 4.3, 4.4, 4.5, 4.6.1, 4.6.4.

Lecture meetings: 4.3, 5.1, 5.2, 5.3.

Home assignments: HW3: 6.4; Quiz 2: 7.4 (Sep 9). HW3 due: Sep 9. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also). Quiz 2 with solution.

Additional resources: Richard Weber's Optimization and Control notes.
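The core algorithm of this unit, backward induction ([Put94] Ch. 4), iterates u_t(s) = max_a { r(s,a) + sum_{s'} p(s'|s,a) u_{t+1}(s') } from the terminal stage back to t = 0. A minimal sketch on randomly generated illustrative data (zero terminal reward assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, T = 3, 2, 5

r = rng.uniform(0, 1, size=(n_states, n_actions))      # r[s, a] in [0, 1)
P = rng.uniform(size=(n_actions, n_states, n_states))  # P[a, s, s']
P /= P.sum(axis=2, keepdims=True)                      # rows -> distributions

u = np.zeros(n_states)                     # terminal values u_T = 0 (assumed)
policy = np.zeros((T, n_states), dtype=int)
for t in reversed(range(T)):
    # Q[s, a] = immediate reward + expected continuation value
    Q = r + np.einsum("asj,j->sa", P, u)
    policy[t] = Q.argmax(axis=1)           # optimal decision rule at stage t
    u = Q.max(axis=1)                      # optimal value-to-go u_t
```

After the loop, `u` holds the optimal expected total reward from each initial state, and `policy[t]` is the optimal (generally nonstationary) decision rule at stage t.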


Unit 4: Infinite Horizon MDP

Teaching format: On board, following [Put94] closely.

Relevant literature: [Put94]: 5.1, 5.2, 5.3, 5.4.1, 5.4.3, 5.5.

Lecture meetings: 6.1, 6.2, 6.3.
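For the infinite-horizon discounted criterion of [Put94] Ch. 5, the value of a fixed stationary policy satisfies v = r + lam P v, so it can be found by solving the linear system (I - lam P) v = r rather than by iteration. A sketch with invented two-state numbers:

```python
import numpy as np

lam = 0.9                          # discount factor
P_pi = np.array([[0.5, 0.5],
                 [0.2, 0.8]])      # transitions under a fixed policy
r_pi = np.array([1.0, 0.0])        # one-step rewards under that policy

# v_pi = (I - lam P_pi)^{-1} r_pi; the inverse exists since lam < 1.
v_pi = np.linalg.solve(np.eye(2) - lam * P_pi, r_pi)
```

With rewards in [0, 1], the discounted values are bounded by 1/(1 - lam) = 10, a useful sanity check.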




Unit 5: Discounted Rewards

Teaching format: On board, following [Put94] closely.

Relevant literature: [Put94]: 5.6, Appendix C, 6.1, 6.2, 6.3, 6.4, 6.9.

Lecture meetings: 7.1, 7.2, 7.3, 8.1, 8.2, 8.3.

Home assignments: HW4: 8.4; HW5: 9.4 (Yoni); Quiz 3: 10.4 (Oct 7). HW4 due: Sep 26; HW5 due: Oct 7. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also). Quiz 3 with solution.
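The workhorse algorithm of this unit is value iteration ([Put94] Sec. 6.3): iterate v <- max_a { r_a + lam P_a v } until the sup-norm change drops below eps(1 - lam)/(2 lam), which guarantees an eps-optimal greedy policy. A sketch on randomly generated illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
n_s, n_a, lam, eps = 4, 3, 0.9, 1e-6

r = rng.uniform(0, 1, size=(n_a, n_s))  # r[a, s] in [0, 1)
P = rng.uniform(size=(n_a, n_s, n_s))
P /= P.sum(axis=2, keepdims=True)       # rows -> distributions

v = np.zeros(n_s)
while True:
    # Bellman operator: (Tv)(s) = max_a { r(s,a) + lam E[v(s')] }
    Tv = (r + lam * np.einsum("asj,j->as", P, v)).max(axis=0)
    done = np.abs(Tv - v).max() < eps * (1 - lam) / (2 * lam)
    v = Tv
    if done:
        break

# Greedy (eps-optimal) stationary policy with respect to the final v.
policy = (r + lam * np.einsum("asj,j->as", P, v)).argmax(axis=0)
```

The stopping rule is the standard span/contraction bound for the discounted Bellman operator, which is a lam-contraction in the sup norm.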



Unit 6: Average Rewards

Teaching format: On board, based on parts of [Put94]. Also lecture notes: Advanced Regular Finite Markov Chains.

Relevant literature: [Put94]: 8.1, 8.2, 8.3, 8.4, 8.6.

Lecture meetings: 9.2, 9.3; 10.1, 10.2, 10.3 (10.1–10.3 were cancelled); Oct 14: 11.1, 11.2, 11.3; Oct 21: 12.1.

Home assignments: Oct 14: during lecture (no tutorial). Quiz 4: Oct 28, 13.2 (13:00–13:50). Due: Oct 28. Solution to Quiz 4 on Average Rewards.
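For a fixed policy on a regular finite Markov chain ([Put94] Ch. 8 and the Advanced Regular Finite Markov Chains notes), the long-run average reward (the gain) is g = pi . r, where pi is the stationary distribution of the policy's transition matrix. A sketch with invented two-state numbers, computing pi by solving pi(P - I) = 0 together with the normalization sum(pi) = 1:

```python
import numpy as np

P_pi = np.array([[0.7, 0.3],
                 [0.4, 0.6]])      # transitions under a fixed policy
r_pi = np.array([2.0, -1.0])       # one-step rewards under that policy

# Stack the stationarity equations (P - I)^T pi = 0 with 1^T pi = 1 and
# solve the (consistent) overdetermined system by least squares.
A = np.vstack([(P_pi - np.eye(2)).T, np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

gain = pi @ r_pi                   # long-run average reward per step
```

Here pi = (4/7, 3/7), so the gain is 2·4/7 - 1·3/7 = 5/7, independent of the initial state (the chain is regular).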



Unit 7: Partially Observable MDP

Teaching format: NOT following references closely. Taught by Julia Kuhn. Also a guest lecture by Hanna Kurniawati.

Relevant literature: [BaurRied11]: 5.

Lecture meetings: Oct 21: 12.2, 12.3; Oct 28: 13.1 (guest lecture).

Home assignments: Oct 21: 12.4. Due: Oct 31.
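The central object of this unit is the belief (information) state, as in [Mon82] and [KaeLitCass98]: after taking action a and seeing observation o, the belief updates by b'(s') ∝ P(o | s') · sum_s P(s' | s, a) b(s). A minimal sketch with invented two-state transition and observation matrices:

```python
import numpy as np

P_a = np.array([[0.8, 0.2],
                [0.3, 0.7]])       # P(s' | s, a) for the chosen action a
O = np.array([[0.9, 0.1],
              [0.2, 0.8]])         # O[s', o] = P(o | s')

def belief_update(b, obs):
    """Bayes update of the belief state after action a and observation obs."""
    pred = b @ P_a                 # predicted distribution over next states
    post = O[:, obs] * pred        # reweight by observation likelihood
    return post / post.sum()       # normalize (assumes P(obs) > 0)

b0 = np.array([0.5, 0.5])          # uniform prior belief
b1 = belief_update(b0, obs=0)      # observation 0 is likelier in state 0
```

The POMDP is then an MDP on these belief states, which is what makes exact solution hard and motivates the approximation methods surveyed in the references.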




Unit 8: Deterministic Continuous Time and State Optimal Control

Relevant literature: [Ber00]: 3.

Lecture meetings: Cancelled; to be merged with Unit 9.

Assessment: Assessed only through part of the course summary. Due: Nov 7.





Unit 9: Summary and Outlook (Other Aspects of Control Theory)

Teaching format: Lecture slides.

Lecture meetings: 13.3, 13.4.