Control Theory: MATH4406 / MATH7406

Taught by: Yoni Nazarathy, with Julia Kuhn (tutor) and Brendan Patch (tutor)

Semester 2, 2014.

This is the 2014 course web-site.
The current course web-site (2016) is here.


This course introduces several aspects of mathematical control theory, with a focus on Markov Decision Processes (MDP), also known as discrete stochastic dynamic programming. In this edition of the course (2014), the course mostly follows selected parts of Martin Puterman's book, "Markov Decision Processes". After covering basic ideas of dynamic programming and control theory in general, the emphasis shifts to the mathematical detail associated with MDP. The course begins with a background unit, ensuring students have sufficient knowledge of probability and Markov chains. This unit is followed by 9 further units, of which 6 focus on MDP and the other 3 aim to give the student a broader view of mathematical problems associated with control theory.

This web-page contains a detailed plan of the course as well as links to homework (HW) assignments and other resources. The course profile page, available to UQ students, can be accessed here. Some HW assignments may require computation; MATLAB is a natural tool for control theory, yet other software packages may be used as well (see software).


Key books and resources:

· [Put94] Puterman ML. Markov Decision Processes: Discrete Stochastic Dynamic Programming; 1994. UQ Library link (online).

· [AstKum14] Åström KJ, Kumar PR. "Control: A perspective." Automatica 50.1 (2014): 3-43. Link to paper.

· [BaurRied11] Bäuerle N, Rieder U. Markov Decision Processes with Applications to Finance; 2011. UQ Library link (online).

· [Ber00] Bertsekas DP. Dynamic Programming and Optimal Control, Vol 1 and Vol 2. Belmont, Mass: Athena Scientific; 2000. UQ Library link.

· [KaeLitCass98] Kaelbling LP, Littman ML, Cassandra AR. "Planning and acting in partially observable stochastic domains." Artificial Intelligence 101.1 (1998): 99-134. Link to paper.

· [Lue79] Luenberger DG. Introduction to Dynamic Systems: Theory, Models, and Applications. New York: Wiley; 1979. UQ Library link.

· [Mon82] Monahan GE. "A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms." Management Science 28.1 (1982): 1-16. Link to paper.

· [PerMar10] Albertos Pérez P, Mareels I. Feedback and Control for Everyone. Berlin: Springer Berlin Heidelberg; 2010. UQ Library link (online).


Other useful books and resources:

· The 2012 edition of this course.

· [AntMich07] Antsaklis PJ, Michel AN. A Linear Systems Primer. Boston, Mass: Birkhäuser Boston; 2007. UQ Library link (online).

· [Kir70] Kirk DE. Optimal Control Theory: An Introduction. Englewood Cliffs, N.J: Prentice-Hall; 1970. UQ Library link.

· [KwaSiv91] Kwakernaak H, Sivan R. Modern Signals and Systems; 1991. UQ Library link.

· [PolWil98] Polderman JW, Willems JC. Introduction to Mathematical Systems Theory, Vol 26. New York; Berlin: Springer-Verlag; 1998. UQ Library link.

· [SivKwa72] Sivan R, Kwakernaak H. Linear Optimal Control Systems. New York: Wiley Interscience; 1972. UQ Library link. Online through IEEE.

· [Son90] Sontag ED. Mathematical Control Theory: Deterministic Finite Dimensional Systems, Vol 6. New York; Berlin: Springer-Verlag; 1990. UQ Library link.



· The course is split into 10 units (0–9). Unit 0 is probability background; the other units are about control.

· There are 13 weeks with 4 hourly meetings per week, referenced as x.y, where x is in {1,…,13} and y is in {1,2,3,4}. Note that units are NOT directly mapped to weeks: the HW and/or quiz associated with a unit is often handled a few weeks later.

· There are 4 in-class quizzes (45 min each), 7 HW assignments, and an additional "course summary" assignment.

· Meetings of the form x.4 are typically (but not always) tutorial meetings, in which students may either ask questions about the HW or take a quiz.

· The teaching format varies with the material and the speed and depth at which it is to be covered.



Course Plan

Each unit below lists: Teaching Format and Key Resources; Relevant Literature; Lecture Meetings; Home Assignments and Additional Resources.


Unit 0: Probability and Markov Chain Background

Teaching format: Lecture notes, worked on board. First 3 hours by Dirk Kroese, because Yoni is away. Students who did not previously study probability will each receive an hour of one-on-one support from Brendan Patch.

Key resources: Basic Probability and Markov Chains Notes (now at v8). From class: SimpleDiscreteQueue.nb (Mathematica file); photos from 2.3.

Lecture meetings: Kroese: 1.1, 1.2, 1.3; Nazarathy: 2.1, 2.2, 2.3.

Home assignments: HW1 pre-submit: 1.4, 2.4; HW1 post-submit: 3.4; Quiz 1: 4.4 (Aug 19). HW1 due: Aug 12. Partial solutions by Brendan: BrendanHW1partialSolutions.pdf. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also). Quiz 1 with solution.

Additional resources: First (probability) chapters of a book by Kroese and Chan; illustrated probability notes and exercises by Kroese; probability notes by Richard Weber (Cambridge); Markov chain notes by Richard Weber (Cambridge); Phil Pollett's STAT3004 (intro to stochastic processes) course at UQ.
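As a small illustration of the background material (and of the kind of model in SimpleDiscreteQueue.nb), the sketch below computes the stationary distribution of a simple discrete-time queue with a finite buffer. The arrival and service probabilities are invented for illustration and are not taken from the course notes.

```python
import numpy as np

# Hypothetical parameters: arrival prob p, service prob q (per time slot).
p, q = 0.3, 0.5
n = 4  # queue lengths 0..3

# Birth-death transition matrix over queue lengths.
P = np.zeros((n, n))
for i in range(n):
    up = p * (1 - q) if i < n - 1 else 0.0   # arrival without service
    down = q * (1 - p) if i > 0 else 0.0     # service without arrival
    if i < n - 1:
        P[i, i + 1] = up
    if i > 0:
        P[i, i - 1] = down
    P[i, i] = 1.0 - up - down                # queue length unchanged

# Stationary distribution pi solves pi P = pi; for this small regular
# chain, plain power iteration converges quickly.
pi = np.full(n, 1.0 / n)
for _ in range(10_000):
    pi = pi @ P
```

Since arrivals are less likely than services here, the stationary mass concentrates on short queues.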


Unit 1: Introduction and Overview of the Various Concepts of Control Theory

Teaching format: Lecture slides; also the slides from class.

Relevant literature: [AstKum14]: all; [PerMar10]: all; [Put94]: 1.

Lecture meetings: 3.1, 3.2.

Assessment: Assessed only through part of the course summary.

Additional resources: Flyball governor video; Wolfram demo: PID inverted pendulum controls; Wolfram demo: PID spring-mass.



Unit 2: MDP Model Formulation, Basic Examples and Computation

Teaching format: On board, NOT following references closely. From class: Lects4.pdf (big ugly file).

Relevant literature: [Put94]: 2.1, 2.2, 3; [Ber00]: 1.1, 1.2.

Lecture meetings: 3.3, 4.1, 4.2.

Home assignments: HW2: 5.4. Due: Sep 2. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also).
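To make the (S, A, P, r) formulation of [Put94] Ch. 2–3 concrete, here is a minimal sketch of a two-state "machine maintenance" MDP and the expected total reward of a deterministic policy over a short horizon. All numbers and names (states, actions, rewards) are invented for illustration, not taken from the book.

```python
import numpy as np

S = ["working", "broken"]
A = ["run", "repair"]

# P[a][s, s'] = transition probability under action a; rows are distributions.
P = {
    "run":    np.array([[0.9, 0.1],    # a working machine may break
                        [0.0, 1.0]]),  # a broken machine stays broken
    "repair": np.array([[1.0, 0.0],
                        [0.8, 0.2]]),  # repair usually succeeds
}
# r[a][s] = expected one-step reward in state s under action a.
r = {
    "run":    np.array([10.0, 0.0]),   # reward only while working
    "repair": np.array([5.0, -2.0]),   # repairing costs money
}

# Sanity check of the model: each transition row is a distribution.
for a in A:
    assert np.allclose(P[a].sum(axis=1), 1.0)

def expected_total_reward(policy, horizon, start=0):
    """Expected total reward of a deterministic policy (state -> action),
    computed by propagating the distribution over states."""
    dist = np.zeros(len(S)); dist[start] = 1.0
    total = 0.0
    for _ in range(horizon):
        step_r = np.array([r[policy[s]][i] for i, s in enumerate(S)])
        total += dist @ step_r
        new_dist = np.zeros(len(S))
        for i, s in enumerate(S):
            new_dist += dist[i] * P[policy[s]][i]
        dist = new_dist
    return total

always_run = {"working": "run", "broken": "run"}
fix_when_broken = {"working": "run", "broken": "repair"}
```

Comparing the two policies over, say, 10 steps shows why ignoring a broken machine is a poor policy: under `always_run` the chain is absorbed in the zero-reward "broken" state.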









Unit 3: Finite Horizon MDP

Teaching format: On board, following [Put94] closely.

Relevant literature: [Put94]: 4.1.2, 4.2, 4.3, 4.4, 4.5, 4.6.1, 4.6.4.

Lecture meetings: 4.3, 5.1, 5.2, 5.3.

Home assignments: HW3: 6.4; Quiz 2: 7.4 (Sep 9). HW3 due: Sep 9. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also). Quiz 2 with solution.

Additional resources: Richard Weber's Optimization and Control notes.
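The core algorithm of this unit, backward induction ([Put94] Ch. 4), iterates u_t(s) = max_a { r(s,a) + sum_{s'} p(s'|s,a) u_{t+1}(s') } from the terminal stage back to t = 0. A minimal sketch on randomly generated illustrative data (zero terminal reward assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, T = 3, 2, 5

r = rng.uniform(0, 1, size=(n_states, n_actions))      # r[s, a] in [0, 1)
P = rng.uniform(size=(n_actions, n_states, n_states))  # P[a, s, s']
P /= P.sum(axis=2, keepdims=True)                      # rows -> distributions

u = np.zeros(n_states)                     # terminal values u_T = 0 (assumed)
policy = np.zeros((T, n_states), dtype=int)
for t in reversed(range(T)):
    # Q[s, a] = immediate reward + expected continuation value
    Q = r + np.einsum("asj,j->sa", P, u)
    policy[t] = Q.argmax(axis=1)           # optimal decision rule at stage t
    u = Q.max(axis=1)                      # optimal value-to-go u_t
```

After the loop, `u` holds the optimal expected total reward from each initial state, and `policy[t]` is the optimal (generally nonstationary) decision rule at stage t.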


Unit 4: Infinite Horizon MDP

Teaching format: On board, following [Put94] closely.

Relevant literature: [Put94]: 5.1, 5.2, 5.3, 5.4.1, 5.4.3, 5.5.

Lecture meetings: 6.1, 6.2, 6.3.
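For the infinite-horizon discounted criterion of [Put94] Ch. 5, the value of a fixed stationary policy satisfies v = r + lam P v, so it can be found by solving the linear system (I - lam P) v = r rather than by iteration. A sketch with invented two-state numbers:

```python
import numpy as np

lam = 0.9                          # discount factor
P_pi = np.array([[0.5, 0.5],
                 [0.2, 0.8]])      # transitions under a fixed policy
r_pi = np.array([1.0, 0.0])        # one-step rewards under that policy

# v_pi = (I - lam P_pi)^{-1} r_pi; the inverse exists since lam < 1.
v_pi = np.linalg.solve(np.eye(2) - lam * P_pi, r_pi)
```

With rewards in [0, 1], the discounted values are bounded by 1/(1 - lam) = 10, a useful sanity check.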




Unit 5: Discounted Rewards

Teaching format: On board, following [Put94] closely.

Relevant literature: [Put94]: 5.6, Appendix C, 6.1, 6.2, 6.3, 6.4, 6.9.

Lecture meetings: 7.1, 7.2, 7.3, 8.1, 8.2, 8.3.

Home assignments: HW4: 8.4; HW5: 9.4 (Yoni); Quiz 3: 10.4 (Oct 7). HW4 due: Sep 26; HW5 due: Oct 7. Here are some solutions by students (none are perfect, but all are very nice; there were other good solutions by other students also). Quiz 3 with solution.
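The workhorse algorithm of this unit is value iteration ([Put94] Sec. 6.3): iterate v <- max_a { r_a + lam P_a v } until the sup-norm change drops below eps(1 - lam)/(2 lam), which guarantees an eps-optimal greedy policy. A sketch on randomly generated illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
n_s, n_a, lam, eps = 4, 3, 0.9, 1e-6

r = rng.uniform(0, 1, size=(n_a, n_s))  # r[a, s] in [0, 1)
P = rng.uniform(size=(n_a, n_s, n_s))
P /= P.sum(axis=2, keepdims=True)       # rows -> distributions

v = np.zeros(n_s)
while True:
    # Bellman operator: (Tv)(s) = max_a { r(s,a) + lam E[v(s')] }
    Tv = (r + lam * np.einsum("asj,j->as", P, v)).max(axis=0)
    done = np.abs(Tv - v).max() < eps * (1 - lam) / (2 * lam)
    v = Tv
    if done:
        break

# Greedy (eps-optimal) stationary policy with respect to the final v.
policy = (r + lam * np.einsum("asj,j->as", P, v)).argmax(axis=0)
```

The stopping rule is the standard span/contraction bound for the discounted Bellman operator, which is a lam-contraction in the sup norm.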



Unit 6: Average Rewards

Teaching format: On board, based on parts of [Put94]. Also lecture notes: Advanced Regular Finite Markov Chains.

Relevant literature: [Put94]: 8.1, 8.2, 8.3, 8.4, 8.6.

Lecture meetings: 9.2, 9.3; 10.1, 10.2, 10.3 (10.1–10.3 were cancelled); Oct 14: 11.1, 11.2, 11.3; Oct 21: 12.1.

Home assignments: Oct 14: during lecture (no tutorial). Quiz 4: Oct 28, 13.2 (13:00–13:50). Due: Oct 28. Solution to Quiz 4 on Average Rewards.
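For a fixed policy on a regular finite Markov chain ([Put94] Ch. 8 and the Advanced Regular Finite Markov Chains notes), the long-run average reward (the gain) is g = pi . r, where pi is the stationary distribution of the policy's transition matrix. A sketch with invented two-state numbers, computing pi by solving pi(P - I) = 0 together with the normalization sum(pi) = 1:

```python
import numpy as np

P_pi = np.array([[0.7, 0.3],
                 [0.4, 0.6]])      # transitions under a fixed policy
r_pi = np.array([2.0, -1.0])       # one-step rewards under that policy

# Stack the stationarity equations (P - I)^T pi = 0 with 1^T pi = 1 and
# solve the (consistent) overdetermined system by least squares.
A = np.vstack([(P_pi - np.eye(2)).T, np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

gain = pi @ r_pi                   # long-run average reward per step
```

Here pi = (4/7, 3/7), so the gain is 2·4/7 - 1·3/7 = 5/7, independent of the initial state (the chain is regular).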



Unit 7: Partially Observable MDP

Teaching format: NOT following references closely. Taught by Julia Kuhn. Also a guest lecture by Hanna Kurniawati.

Relevant literature: [BaurRied11]: 5.

Lecture meetings: Oct 21: 12.2, 12.3; Oct 28: 13.1 (guest lecture).

Home assignments: Oct 21: 12.4. Due: Oct 31.
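The central object of this unit is the belief (information) state, as in [Mon82] and [KaeLitCass98]: after taking action a and seeing observation o, the belief updates by b'(s') ∝ P(o | s') · sum_s P(s' | s, a) b(s). A minimal sketch with invented two-state transition and observation matrices:

```python
import numpy as np

P_a = np.array([[0.8, 0.2],
                [0.3, 0.7]])       # P(s' | s, a) for the chosen action a
O = np.array([[0.9, 0.1],
              [0.2, 0.8]])         # O[s', o] = P(o | s')

def belief_update(b, obs):
    """Bayes update of the belief state after action a and observation obs."""
    pred = b @ P_a                 # predicted distribution over next states
    post = O[:, obs] * pred        # reweight by observation likelihood
    return post / post.sum()       # normalize (assumes P(obs) > 0)

b0 = np.array([0.5, 0.5])          # uniform prior belief
b1 = belief_update(b0, obs=0)      # observation 0 is likelier in state 0
```

The POMDP is then an MDP on these belief states, which is what makes exact solution hard and motivates the approximation methods surveyed in the references.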




Unit 8: Deterministic Continuous Time and State Optimal Control

Relevant literature: [Ber00]: 3.

Lecture meetings: Cancelled; to be merged with Unit 9.

Assessment: Assessed only through part of the course summary. Due: Nov 7.





Unit 9: Summary and Outlook (Other Aspects of Control Theory)

Teaching format: Lecture slides.

Lecture meetings: 13.3, 13.4.