GOFAI: GOOD OLD-FASHIONED ARTIFICIAL INTELLIGENCE
Supplementary reading:
K. Gurney, An Introduction to Neural Networks, CRC (August 5, 1997), ISBN-10: 1857285034.
M.A. Arbib, 1989, The Metaphorical Brain 2: Neural Networks and Beyond, Wiley-Interscience.
M.A. Arbib, Ed., 1995, The Handbook of Brain Theory and Neural Networks, MIT Press (paperback).
M.A. Arbib and J. Grethe, Eds., and the Project Team of the University of Southern
California Brain Project, 2001, Computing the Brain: A Guide to Neuroinformatics,
San Diego: Academic Press.
A. Weitzenfeld, M.A. Arbib and A. Alexander, 2000, NSL Neural Simulation
Language, MIT Press (in press). [http://www-hbp.usc.edu/_Documentation/NSL/Book/TOC.htm]
Baxter, J., Tridgell, A., Weaver, L. (1998). KnightCap:
A chess program that learns by combining TD(λ) with game-tree search. Proceedings
of the Fifteenth International Conference on Machine Learning, pp. 28-36.
Bertsekas, D. P., and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific, Belmont, MA.
Crites, R. H., and Barto, A. G. (1996). Improving elevator performance using
reinforcement learning. In Advances in Neural Information Processing Systems
9, pp. 1017-1023. MIT Press, Cambridge, MA.
McCallum, A. K. (1995). Reinforcement Learning with Selective Perception and Hidden State. Ph.D. thesis, University of Rochester.
Nie, J., and Haykin, S. (1996). A dynamic channel assignment policy through
Q-learning. CRL Report 334. Communications Research Laboratory, McMaster
University, Hamilton, Ontario.
Precup, D., Sutton, R.S. (1998). Multi-time models for temporally abstract
planning. Advances in Neural Information Processing Systems 11. MIT Press,
Cambridge, MA.
Singh, S. P., and Bertsekas, D. (1997). Reinforcement learning for dynamic
channel allocation in cellular telephone systems. In Advances in Neural Information
Processing Systems 10, pp. 974-980. MIT Press, Cambridge, MA.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44.
Sutton, R. S., and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA.
Sutton, R. S., Precup, D., Singh, S. (1998). Between MDPs and semi-MDPs:
Learning, planning, and representing knowledge at multiple temporal scales.
Technical Report 98-74, Department of Computer Science, University of Massachusetts.
Tesauro, G. J. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38:58-68.
Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. Ph.D. thesis, Cambridge University.
Zhang, W., and Dietterich, T. G. (1996). High-performance job-shop scheduling
with a time-delay TD network. In Advances in Neural Information Processing
Systems 9, pp. 1024-1030. MIT Press, Cambridge, MA.