Semi-Markov Decision Processes:
M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, Chapter 11. John Wiley & Sons, Inc. 1994.
Linear Programming for Constrained Problems:
M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, Sections 6.9, 8.8, 9.3. John Wiley & Sons, Inc. 1994.
Approximate Linear Programming and Case Studies:
D. P. de Farias and B. Van Roy, "The Linear Programming Approach to Approximate Dynamic Programming," Operations Research, Vol. 51, No. 6, pp. 850-865, 2003.
Carlos Guestrin, Daphne Koller, Ronald Parr and Shobha Venkataraman. "Efficient Solution Algorithms for Factored MDPs," Journal of Artificial Intelligence Research (JAIR), Volume 19, pp. 399-468, 2003.
V. F. Farias and B. Van Roy. "Tetris: A Study of Randomized
Constraint Sampling," in Probabilistic and Randomized Methods for
Design Under Uncertainty, G. Calafiore and F. Dabbene, eds., Springer-Verlag,
2006.
Dynamic Games and Robust Control:
A. Nilim and L. El Ghaoui. "Robust Solutions to Markov Decision Problems with Uncertain Transition Matrices," EECS ERL Memo MO 04/26, January 2004.
D. P. Bertsekas and J. N. Tsitsiklis. Neuro-Dynamic Programming. Chapter 7. Athena Scientific, 1996.
Simulation-Based Methods and Reinforcement Learning:
J. N. Tsitsiklis and B. Van Roy, "An Analysis of Temporal-Difference
Learning with Function Approximation," IEEE Transactions on Automatic
Control, Vol. 42, No. 5, May 1997, pp. 674-690.
(and other papers on the same site)
Andrew Y. Ng and Stuart Russell. "Algorithms for Inverse Reinforcement
Learning," In Proceedings of the Seventeenth International Conference
on Machine Learning, 2000.
Partially Observable Markov Decision Processes:
W. S. Lovejoy. "Computationally Feasible Bounds for
Partially Observed Markov Decision Processes," Operations Research,
Vol. 39, pp. 162-175, 1991.
D. Aberdeen and J. Baxter. "Internal-state Policy-gradient Algorithms
for Infinite-horizon POMDPs," Technical report, RSISE, Australian
National University, 2002.