From: saswss@unx.sas.com (Warren Sarle)
Newsgroups: comp.ai.neural-nets,comp.answers,news.answers
Subject: comp.ai.neural-nets FAQ, Part 4 of 7: Books, data, etc.
Supersedes: <nn4.posting_1027964396@hotellng.unx.sas.com>
Followup-To: comp.ai.neural-nets
Date: 30 Dec 2002 21:40:01 GMT
Organization: SAS Institute Inc., Cary, NC, USA
Lines: 2900
Approved: news-answers-request@MIT.EDU
Expires: 3 Feb 2003 21:40:00 GMT
Message-ID: <nn4.posting_1041284400@hotellng.unx.sas.com>
Reply-To: saswss@unx.sas.com (Warren Sarle)
NNTP-Posting-Host: hotellng.unx.sas.com
X-Trace: license1.unx.sas.com 1041284401 6115 10.28.2.188 (30 Dec 2002 21:40:01 GMT)
X-Complaints-To: usenet@unx.sas.com
NNTP-Posting-Date: 30 Dec 2002 21:40:01 GMT
Keywords: frequently asked questions, answers
Originator: saswss@hotellng.unx.sas.com
Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!newsfeed.utk.edu!news-hog.berkeley.edu!ucberkeley!newshub.sdsu.edu!news-xfer.cox.net!news.lightlink.com!vienna7.his.com!attws1!ip.att.net!lamb.sas.com!newshost!hotellng.unx.sas.com!saswss
Xref: senator-bedfellow.mit.edu comp.ai.neural-nets:64339 comp.answers:52357 news.answers:243496

Archive-name: ai-faq/neural-nets/part4
Last-modified: 2002-11-24
URL: ftp://ftp.sas.com/pub/neural/FAQ4.html
Maintainer: saswss@unx.sas.com (Warren S. Sarle)

Copyright 1997, 1998, 1999, 2000, 2001, 2002 by Warren S. Sarle, Cary, NC,
USA. Reviews provided by other authors as cited below are copyrighted by
those authors, who by submitting the reviews for the FAQ give permission for
the review to be reproduced as part of the FAQ in any of the ways specified
in part 1 of the FAQ. 

This is part 4 (of 7) of a monthly posting to the Usenet newsgroup
comp.ai.neural-nets. See part 1 of this posting for full information on
what it is all about.

========== Questions ========== 
********************************

Part 1: Introduction
Part 2: Learning
Part 3: Generalization
Part 4: Books, data, etc.

   Books and articles about Neural Networks?
      The Best
         The best of the best
         The best popular introduction to NNs
         The best introductory book for business executives
         The best elementary textbooks
         The best books on using and programming NNs
         The best intermediate textbooks on NNs
         The best advanced textbook covering NNs
         The best book on neurofuzzy systems
         The best comparison of NNs with other classification methods
      Other notable books
         Introductory
         Bayesian learning
         Biological learning and neurophysiology
         Collections
         Combining networks
         Connectionism
         Feedforward networks
         Fuzzy logic and neurofuzzy systems
         General (including SVMs and Fuzzy Logic)
         History
         Knowledge, rules, and expert systems
         Learning theory
         Object oriented programming
         On-line and incremental learning
         Optimization
         Pulsed/Spiking networks
         Recurrent
         Reinforcement learning
         Speech recognition
         Statistics
         Time-series forecasting
         Unsupervised learning
      Books for the Beginner
      Not-quite-so-introductory Literature
      Books with Source Code (C, C++)
      The Worst
   Journals and magazines about Neural Networks?
   Conferences and Workshops on Neural Networks?
   Neural Network Associations?
   Mailing lists, BBS, CD-ROM?
   How to benchmark learning methods?
   Databases for experimentation with NNs?
      UCI machine learning database
      UCI KDD Archive
      The neural-bench Benchmark collection
      Proben1
      Delve: Data for Evaluating Learning in Valid Experiments
      Bilkent University Function Approximation Repository
      NIST special databases of the National Institute Of Standards And
      Technology:
      CEDAR CD-ROM 1: Database of Handwritten Cities, States, ZIP Codes,
      Digits, and Alphabetic Characters
      AI-CD-ROM
      Time series
      Financial data
      USENIX Faces
      Linguistic Data Consortium
      Otago Speech Corpus
      Astronomical Time Series
      Miscellaneous Images
      StatLib

Part 5: Free software
Part 6: Commercial software
Part 7: Hardware and miscellaneous

------------------------------------------------------------------------

Subject: Books and articles about Neural Networks?
==================================================

The following search engines will search many bookstores for new and used
books and return information on availability, price, and shipping charges:

   AddAll: http://www.addall.com/
   Bookfinder: http://www.bookfinder.com/

Clicking on the author and title of most of the books listed in the "Best"
and "Notable" sections will do a search using AddAll.

There are many on-line bookstores, such as:

   Amazon: http://www.amazon.com/
   Amazon, UK: http://www.amazon.co.uk/
   Amazon, Germany: http://www.amazon.de/
   Barnes & Noble: http://www.bn.com/
   Bookpool: http://www.bookpool.com/
   Borders: http://www.borders.com/
   Fatbrain: http://www.fatbrain.com/

The neural networks reading group at the University of Illinois at
Urbana-Champaign, the Artificial Neural Networks and Computational Brain
Theory (ANNCBT) forum, has compiled a large number of book and paper reviews
at http://anncbt.ai.uiuc.edu/, with more emphasis on cognitive science than
on practical applications of NNs. 

The Best
++++++++

The best of the best
--------------------

Bishop (1995) is clearly the single best book on artificial NNs. This book
excels in organization and choice of material, and is a close runner-up to
Ripley (1996) for accuracy. If you are new to the field, read it from cover
to cover. If you have lots of experience with NNs, it's an excellent
reference. If you don't know calculus, take a class. I hope a second edition
comes out soon! For more information, see The best intermediate textbooks on
NNs below. 

If you have questions on feedforward nets that aren't answered by Bishop,
try Masters (1993) or Reed and Marks (1999) for practical issues or Ripley
(1996) for theoretical issues, all of which are reviewed below. 

The best popular introduction to NNs
------------------------------------

Hinton, G.E. (1992), "How Neural Networks Learn from Experience", Scientific
American, 267 (September), 144-151 (page numbers are for the US edition).
Author's Webpage: http://www.cs.utoronto.ca/DCS/People/Faculty/hinton.html
(official)
and http://www.cs.toronto.edu/~hinton (private)
Journal Webpage: http://www.sciam.com/
Additional Information: Unfortunately that article is not available there.

The best introductory book for business executives
--------------------------------------------------

Bigus, J.P. (1996), Data Mining with Neural Networks: Solving Business
Problems--from Application Development to Decision Support, NY:
McGraw-Hill, ISBN 0-07-005779-6, xvii+221 pages.
The stereotypical business executive (SBE) does not want to know how or why
NNs work--he (SBEs are usually male) just wants to make money. The SBE may
know what an average or percentage is, but he is deathly afraid of
"statistics". He understands profit and loss but does not want to waste his
time learning things involving complicated math, such as high-school
algebra. For further information on the SBE, see the "Dilbert" comic strip. 

Bigus has written an excellent introduction to NNs for the SBE. Bigus says
(p. xv), "For business executives, managers, or computer professionals, this
book provides a thorough introduction to neural network technology and the
issues related to its application without getting bogged down in complex
math or needless details. The reader will be able to identify common
business problems that are amenable to the neural network approach and will
be sensitized to the issues that can affect successful completion of such
applications." Bigus succeeds in explaining NNs at a practical, intuitive,
and necessarily shallow level without formulas--just what the SBE needs.
This book is far better than Caudill and Butler (1990), a popular but
disastrous attempt to explain NNs without formulas. 

Chapter 1 introduces data mining and data warehousing, and sketches some
applications thereof. Chapter 2 is the semi-obligatory
philosophico-historical discussion of AI and NNs and is well-written,
although the SBE in a hurry may want to skip it. Chapter 3 is a very useful
discussion of data preparation. Chapter 4 describes a variety of NNs and
what they are good for. Chapter 5 goes into practical issues of training and
testing NNs. Chapters 6 and 7 explain how to use the results from NNs.
Chapter 8 discusses intelligent agents. Chapters 9 through 12 contain case
histories of NN applications, including market segmentation, real-estate
pricing, customer ranking, and sales forecasting. 

Bigus provides generally sound advice. He briefly discusses overfitting and
overtraining without going into much detail, although I think his advice on
p. 57 to have at least two training cases for each connection is somewhat
lenient, even for noise-free data. I do not understand his claim on pp. 73
and 170 that RBF networks have advantages over backprop networks for
nonstationary inputs--perhaps he is using the word "nonstationary" in a
sense different from the statistical meaning of the term. There are other
things in the book that I would quibble with, but I did not find any of the
flagrant errors that are common in other books on NN applications such as
Swingler (1996). 
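
To make the rule of thumb on p. 57 concrete, the following sketch counts the
connections in a fully connected feedforward network and the number of
training cases that a given cases-per-connection ratio would imply. The
function names, the option to count bias weights as connections, and the
example layer sizes are assumptions made for illustration; none of them is
taken from Bigus.

```python
def count_connections(layer_sizes, include_biases=True):
    """Count weights in a fully connected feedforward net.

    layer_sizes lists the number of units per layer, inputs first,
    e.g. [10, 5, 1] for 10 inputs, 5 hidden units, 1 output.
    Counting bias weights as connections is an assumption.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out          # weights between adjacent layers
        if include_biases:
            total += n_out             # one bias per receiving unit
    return total


def min_training_cases(layer_sizes, cases_per_connection=2,
                       include_biases=True):
    """Training cases implied by a cases-per-connection rule of thumb."""
    return cases_per_connection * count_connections(layer_sizes,
                                                    include_biases)


if __name__ == "__main__":
    sizes = [10, 5, 1]                 # hypothetical architecture
    print(count_connections(sizes), min_training_cases(sizes))
```

Even a small network like this one has dozens of connections, so the
required number of cases grows quickly with the number of inputs and hidden
units, whatever ratio one considers adequate.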

The one serious drawback of this book is that it is more than one page long
and may therefore tax the attention span of the SBE. But any SBE who
succeeds in reading the entire book should learn enough to be able to hire a
good NN expert to do the real work. 

The best elementary textbooks
-----------------------------

Fausett, L. (1994), Fundamentals of Neural Networks: Architectures,
Algorithms, and Applications, Englewood Cliffs, NJ: Prentice Hall, ISBN
0-13-334186-0. Also published as a Prentice Hall International Edition, ISBN
0-13-042250-9. Sample software (source code listings in C and Fortran) is
included in an Instructor's Manual. 
Book Webpage (Publisher): http://www.prenhall.com/books/esm_0133341860.html
Additional Information: The programs and additional support mentioned are
not available. Contents:
Ch. 1 Introduction, 1.1 Why Neural Networks and Why Now?, 1.2 What Is a
Neural Net?, 1.3 Where Are Neural Nets Being Used?, 1.4 How Are Neural
Networks Used?, 1.5 Who Is Developing Neural Networks?, 1.6 When Neural Nets
Began: the McCulloch-Pitts Neuron;
Ch. 2 Simple Neural Nets for Pattern Classification, 2.1 General Discussion,
2.2 Hebb Net, 2.3 Perceptron, 2.4 Adaline;
Ch. 3 Pattern Association, 3.1 Training Algorithms for Pattern Association,
3.2 Heteroassociative Memory Neural Network, 3.3 Autoassociative Net, 3.4
Iterative Autoassociative Net, 3.5 Bidirectional Associative Memory (BAM);
Ch. 4 Neural Networks Based on Competition, 4.1 Fixed-Weight Competitive
Nets, 4.2 Kohonen Self-Organizing Maps, 4.3 Learning Vector Quantization,
4.4 Counterpropagation;
Ch. 5 Adaptive Resonance Theory, 5.1 Introduction, 5.2 Art1, 5.3 Art2; 
Ch. 6 Backpropagation Neural Net, 6.1 Standard Backpropagation, 6.2
Variations, 6.3 Theoretical Results;
Ch. 7 A Sampler of Other Neural Nets, 7.1 Fixed Weight Nets for Constrained
Optimization, 7.2 A Few More Nets that Learn, 7.3 Adaptive Architectures,
7.4 Neocognitron; Glossary. 

Review by Ian Cresswell: 

   What a relief! As a broad introductory text this is without any doubt
   the best currently available in its area. It doesn't include source
   code of any kind (normally this is badly written and compiler
   specific). The algorithms for many different kinds of simple neural
   nets are presented in a clear step by step manner in plain English. 

   Equally, the mathematics is introduced in a relatively gentle manner.
   There are no unnecessary complications or diversions from the main
   theme. 

   The examples that are used to demonstrate the various algorithms are
   detailed but (perhaps necessarily) simple. 

   There are bad things that can be said about most books. There are
   only a small number of minor criticisms that can be made about this
   one. More space should have been given to backprop and its variants
   because of the practical importance of such methods. And while the
   author discusses early stopping in one paragraph, the treatment of
   generalization is skimpy compared to the books by Weiss and
   Kulikowski or Smith listed below. 

   If you're new to neural nets and you don't want to be swamped by
   bogus ideas, huge amounts of intimidating looking mathematics, a
   programming language that you don't know etc. etc. then this is the
   book for you. 

   In summary, this is the best starting point for the outsider and/or
   beginner... a truly excellent text. 

Smith, M. (1996). Neural Networks for Statistical Modeling, NY: Van Nostrand
Reinhold, ISBN 0-442-01310-8. 
Apparently there is a new edition I haven't seen yet:
Smith, M. (1996). Neural Networks for Statistical Modeling, Boston:
International Thomson Computer Press, ISBN 1-850-32842-0.
Book Webpage (Publisher): http://www.thompson.com/
Publisher's address: 20 Park Plaza, Suite 1001, Boston, MA 02116, USA.
Smith is not a statistician, but he has made an impressive effort to convey
statistical fundamentals applied to neural networks. The book has entire
brief chapters on overfitting and validation (early stopping and
split-sample validation, which he incorrectly calls cross-validation),
putting it a rung above most other introductions to NNs. There are also
brief chapters on data preparation and diagnostic plots, topics usually
ignored in elementary NN books. Only feedforward nets are covered in any
detail.
Chapter headings: Mapping Functions; Basic Concepts; Error Derivatives;
Learning Laws; Weight Initialization; The Course of Learning: An Example;
Overfitting; Cross Validation; Preparing the Data; Representing Variables;
Using the Model. 
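
The distinction between split-sample validation and cross-validation noted
above can be sketched as follows. In split-sample validation a single subset
of the cases is held out once; in k-fold cross-validation every case is held
out exactly once across k separate fits. The function names, the default
holdout fraction, and the shuffling scheme are illustrative assumptions, not
anything prescribed by Smith.

```python
import random


def split_sample(n_cases, holdout_fraction=0.25, seed=0):
    """Split-sample validation: one training set, one holdout set."""
    rng = random.Random(seed)
    indices = list(range(n_cases))
    rng.shuffle(indices)
    n_holdout = int(round(n_cases * holdout_fraction))
    return indices[n_holdout:], indices[:n_holdout]


def k_fold(n_cases, k=5, seed=0):
    """k-fold cross-validation: k (train, holdout) pairs in which
    each case is held out exactly once."""
    rng = random.Random(seed)
    indices = list(range(n_cases))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k disjoint subsets
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


if __name__ == "__main__":
    train, holdout = split_sample(20)
    print(len(train), len(holdout))
    for train, holdout in k_fold(20, k=5):
        print(len(train), len(holdout))
```

With split-sample validation the model is fitted once and the holdout cases
never contribute to training; with cross-validation the model is fitted k
times and the k holdout errors are combined.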

Weiss, S.M. and Kulikowski, C.A. (1991), Computer Systems That Learn, 
Morgan Kaufmann. ISBN 1-55860-065-5. 
Author's Webpage: Kulikowski: 
http://ruccs.rutgers.edu/faculty/kulikowski.html
Book Webpage (Publisher): http://www.mkp.com/books_catalog/1-55860-065-5.asp
Additional Information: Information on Weiss, S.M. is not available.
Briefly covers at a very elementary level feedforward nets, linear and
nearest-neighbor discriminant analysis, trees, and expert systems,
emphasizing practical applications. For a book at this level, it has an
unusually good chapter on estimating generalization error, including
bootstrapping.

1 Overview of Learning Systems 
    1.1 What is a Learning System? 
    1.2 Motivation for Building Learning Systems 
    1.3 Types of Practical Empirical Learning Systems 
        1.3.1 Common Theme: The Classification Model 
        1.3.2 Let the Data Speak
    1.4 What's New in Learning Methods 
        1.4.1 The Impact of New Technology
    1.5 Outline of the Book 
    1.6 Bibliographical and Historical Remarks

2 How to Estimate the True Performance of a Learning System 
    2.1 The Importance of Unbiased Error Rate Estimation 
    2.2 What is an Error? 
        2.2.1 Costs and Risks
    2.3 Apparent Error Rate Estimates 
    2.4 Too Good to Be True: Overspecialization 
    2.5 True Error Rate Estimation 
        2.5.1 The Idealized Model for Unlimited Samples 
        2.5.2 Train-and-Test Error Rate Estimation 
        2.5.3 Resampling Techniques 
        2.5.4 Finding the Right Complexity Fit
    2.6 Getting the Most Out of the Data 
    2.7 Classifier Complexity and Feature Dimensionality 
        2.7.1 Expected Patterns of Classifier Behavior
    2.8 What Can Go Wrong? 
        2.8.1 Poor Features, Data Errors, and Mislabeled Classes 
        2.8.2 Unrepresentative Samples
    2.9 How Close to the Truth? 
    2.10 Common Mistakes in Performance Analysis 
    2.11 Bibliographical and Historical Remarks

3 Statistical Pattern Recognition 
    3.1 Introduction and Overview 
    3.2 A Few Sample Applications 
    3.3 Bayesian Classifiers 
        3.3.1 Direct Application of the Bayes Rule
    3.4 Linear Discriminants 
        3.4.1 The Normality Assumption and Discriminant Functions 
        3.4.2 Logistic Regression
    3.5 Nearest Neighbor Methods 
    3.6 Feature Selection 
    3.7 Error Rate Analysis 
    3.8 Bibliographical and Historical Remarks

4 Neural Nets 
    4.1 Introduction and Overview 
    4.2 Perceptrons 
        4.2.1 Least Mean Square Learning Systems 
        4.2.2 How Good Is a Linear Separation Network?
    4.3 Multilayer Neural Networks 
        4.3.1 Back-Propagation 
        4.3.2 The Practical Application of Back-Propagation
    4.4 Error Rate and Complexity Fit Estimation 
    4.5 Improving on Standard Back-Propagation 
    4.6 Bibliographical and Historical Remarks

5 Machine Learning: Easily Understood Decision Rules 
    5.1 Introduction and Overview 
    5.2 Decision Trees 
        5.2.1 Finding the Perfect Tree 
        5.2.2 The Incredible Shrinking Tree 
        5.2.3 Limitations of Tree Induction Methods
    5.3 Rule Induction 
        5.3.1 Predictive Value Maximization
    5.4 Bibliographical and Historical Remarks

6 Which Technique is Best? 
    6.1 What's Important in Choosing a Classifier? 
        6.1.1 Prediction Accuracy 
        6.1.2 Speed of Learning and Classification 
        6.1.3 Explanation and Insight
    6.2 So, How Do I Choose a Learning System? 
    6.3 Variations on the Standard Problem 
        6.3.1 Missing Data 
        6.3.2 Incremental Learning
    6.4 Future Prospects for Improved Learning Methods 
    6.5 Bibliographical and Historical Remarks

7 Expert Systems 
    7.1 Introduction and Overview 
        7.1.1 Why Build Expert Systems? New vs. Old Knowledge
    7.2 Estimating Error Rates for Expert Systems 
    7.3 Complexity of Knowledge Bases 
        7.3.1 How Many Rules Are Too Many?
    7.4 Knowledge Base Example 
    7.5 Empirical Analysis of Knowledge Bases 
    7.6 Future: Combined Learning and Expert Systems 
    7.7 Bibliographical and Historical Remarks
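
As a rough illustration of the kind of resampling estimate of generalization
error that Weiss and Kulikowski's chapter 2 is concerned with, the sketch
below implements one common bootstrap variant: train on a sample drawn with
replacement, test on the cases left out of that sample, and average over
replicates. The choice of this variant, the toy one-dimensional
nearest-neighbor classifier, and all names are assumptions for illustration;
this is not code or notation from the book.

```python
import random


def nearest_neighbor_predict(train, x):
    """Label of the training case whose input is closest to x.
    train is a list of (input, label) pairs with numeric inputs."""
    return min(train, key=lambda case: abs(case[0] - x))[1]


def bootstrap_error(cases, n_replicates=200, seed=0):
    """Average error rate on cases left out of each bootstrap sample."""
    rng = random.Random(seed)
    n = len(cases)
    rates = []
    for _ in range(n_replicates):
        picked = [rng.randrange(n) for _ in range(n)]   # with replacement
        picked_set = set(picked)
        left_out = [i for i in range(n) if i not in picked_set]
        if not left_out:
            continue                    # nothing to test on; skip replicate
        train = [cases[i] for i in picked]
        errors = sum(
            nearest_neighbor_predict(train, cases[i][0]) != cases[i][1]
            for i in left_out
        )
        rates.append(errors / len(left_out))
    return sum(rates) / len(rates) if rates else float("nan")


if __name__ == "__main__":
    # Hypothetical data: two well-separated classes on a line.
    data = [(i / 10.0, 0) for i in range(10)]
    data += [(5.0 + i / 10.0, 1) for i in range(10)]
    print(bootstrap_error(data))
```

Because each replicate tests only on cases that were not used for training,
the estimate avoids the optimism of the apparent (training-set) error rate.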

Reed, R.D., and Marks, R.J., II (1999), Neural Smithing: Supervised Learning
in Feedforward Artificial Neural Networks, Cambridge, MA: The MIT Press,
ISBN 0-262-18190-8.
Author's Webpage: Marks: http://cialab.ee.washington.edu/Marks.html
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=0262181908
After you have read Smith (1996) or Weiss and Kulikowski (1991), consult
Reed and Marks for practical details on training MLPs (other types of neural
nets such as RBF networks are barely even mentioned). They provide extensive
coverage of backprop and its variants, and they also survey conventional
optimization algorithms. Their coverage of initialization methods,
constructive networks, pruning, and regularization methods is unusually
thorough. Unlike the vast majority of books on neural nets, this one has
lots of really informative graphs. The chapter on generalization assessment
is slightly weak, which is why you should read Smith (1996) or Weiss and
Kulikowski (1991) first. Also, there is little information on data
preparation, for which Smith (1996) and Masters (1993; see below) should be
consulted. There is some elementary calculus, but not enough that it should
scare off anybody. Many second-rate books treat neural nets as mysterious
black boxes, but Reed and Marks open up the box and provide genuine insight
into the way neural nets work. 

One problem with the book is that the terms "validation set" and "test set"
are used inconsistently. 

Chapter headings: Supervised Learning; Single-Layer Networks; MLP
Representational Capabilities; Back-Propagation; Learning Rate and Momentum;
Weight-Initialization Techniques; The Error Surface; Faster Variations of
Back-Propagation; Classical Optimization Techniques; Genetic Algorithms and
Neural Networks; Constructive Methods; Pruning Algorithms; Factors
Influencing Generalization; Generalization Prediction and Assessment;
Heuristics for Improving Generalization; Effects of Training with Noisy
Inputs; Linear Regression; Principal Components Analysis; Jitter
Calculations; Sigmoid-like Nonlinear Functions 

The best books on using and programming NNs
-------------------------------------------

Masters, T. (1993), Practical Neural Network Recipes in C++, Academic
Press, ISBN 0-12-479040-2, US $45 incl. disks.
Book Webpage (Publisher): 
http://www.apcatalog.com/cgi-bin/AP?ISBN=0124790402&LOCATION=US&FORM=FORM2
Masters has written three exceptionally good books on NNs (the two others
are listed below). He combines generally sound practical advice with some
basic statistical knowledge to produce a programming text that is far
superior to the competition (see "The Worst" below). Not everyone likes his
C++ code (the usual complaint is that the code is not sufficiently OO) but,
unlike the code in some other books, Masters's code has been successfully
compiled and run by some readers of comp.ai.neural-nets. Masters's books are
well worth reading even for people who have no interest in programming. 
Chapter headings: Foundations; Classification; Autoassociation; Time-Series
Prediction; Function Approximation; Multilayer Feedforward Networks; Eluding
Local Minima I: Simulated Annealing; Eluding Local Minima II: Genetic
Optimization; Regression and Neural Networks; Designing Feedforward Network
Architectures; Interpreting Weights: How Does This Thing Work; Probabilistic
Neural Networks; Functional Link Networks; Hybrid Networks; Designing the
Training Set; Preparing Input Data; Fuzzy Data and Processing; Unsupervised
Training; Evaluating Performance of Neural Networks; Confidence Measures;
Optimizing the Decision Threshold; Using the NEURAL Program. 

Masters, T. (1995) Advanced Algorithms for Neural Networks: A C++
Sourcebook, NY: John Wiley and Sons, ISBN 0-471-10588-0
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search the publisher's site for the book.
Clear explanations of conjugate gradient and Levenberg-Marquardt
optimization algorithms, simulated annealing, kernel regression (GRNN) and
discriminant analysis (PNN), Gram-Charlier networks, dimensionality
reduction, cross-validation, and bootstrapping. 

Masters, T. (1994), Signal and Image Processing with Neural Networks: A
C++ Sourcebook, NY: Wiley, ISBN 0-471-04963-8.
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search the publisher's site for the book.

The best intermediate textbooks on NNs
--------------------------------------

Bishop, C.M. (1995). Neural Networks for Pattern Recognition, Oxford:
Oxford University Press. ISBN 0-19-853849-9 (hardback) or 0-19-853864-2
(paperback), xvii+482 pages.
Book Webpage (Author): http://research.microsoft.com/~cmbishop/nnpr.htm
Book Webpage (Publisher): http://www.oup.co.uk/isbn/0-19-853864-2
This is definitely the best book on feedforward neural nets for readers
comfortable with calculus. The book is exceptionally well organized,
presenting topics in a logical progression ideal for conceptual
understanding. 

Geoffrey Hinton writes in the foreword:
"Bishop is a leading researcher who has a deep understanding of the material
and has gone to great lengths to organize it in a sequence that makes sense.
He has wisely avoided the temptation to try to cover everything and has
therefore omitted interesting topics like reinforcement learning, Hopfield
networks, and Boltzmann machines in order to focus on the types of neural
networks that are most widely used in practical applications. He assumes
that the reader has the basic mathematical literacy required for an
undergraduate science degree, and using these tools he explains everything
from scratch. Before introducing the multilayer perceptron, for example, he
lays a solid foundation of basic statistical concepts. So the crucial
concept of overfitting is introduced using easily visualized examples of
one-dimensional polynomials and only later applied to neural networks. An
impressive aspect of this book is that it takes the reader all the way from
the simplest linear models to the very latest Bayesian multilayer neural
networks without ever requiring any great intellectual leaps." 

Chapter headings: Statistical Pattern Recognition; Probability Density
Estimation; Single-Layer Networks; The Multi-layer Perceptron; Radial Basis
Functions; Error Functions; Parameter Optimization Algorithms;
Pre-processing and Feature Extraction; Learning and Generalization; Bayesian
Techniques; Symmetric Matrices; Gaussian Integrals; Lagrange Multipliers;
Calculus of Variations; Principal Components. 

Hertz, J., Krogh, A., and Palmer, R. (1991). Introduction to the Theory of
Neural Computation. Redwood City, CA: Addison-Wesley, ISBN 0-201-50395-6
(hardbound) and 0-201-51560-1 (paperbound)
Book Webpage (Publisher): http://www2.awl.com/gb/abp/sfi/computer.html
This is an excellent classic work on neural nets from the perspective of
physics covering a wide variety of networks. Comments from readers of
comp.ai.neural-nets: "My first impression is that this one is by far the
best book on the topic. And it's below $30 for the paperback."; "Well
written, theoretical (but not overwhelming)"; "It provides a good balance of
model development, computational algorithms, and applications. The
mathematical derivations are especially well done"; "Nice mathematical
analysis on the mechanism of different learning algorithms"; "It is NOT for
mathematical beginner. If you don't have a good grasp of higher level math,
this book can be really tough to get through."

The best advanced textbook covering NNs
---------------------------------------

Ripley, B.D. (1996) Pattern Recognition and Neural Networks, Cambridge:
Cambridge University Press, ISBN 0-521-46086-7 (hardback), xii+403 pages.
Author's Webpage: http://www.stats.ox.ac.uk/~ripley/
Book Webpage (Publisher): http://www.cup.cam.ac.uk/
Additional Information: The Webpage includes errata and additional
information that was not available at publication time for this book.
Brian Ripley's book is an excellent sequel to Bishop (1995). Ripley picks
up where Bishop left off, with Bayesian inference and statistical decision
theory, and then covers some of the same material on NNs as Bishop but at a
higher mathematical level. Ripley also covers a variety of methods that are
not discussed, or discussed only briefly, by Bishop, such as tree-based
methods and belief networks. While Ripley is best appreciated by people with
a background in mathematical statistics, the numerous realistic examples in
his book will be of interest even to beginners in neural nets.
Chapter headings: Introduction and Examples; Statistical Decision Theory;
Linear Discriminant Analysis; Flexible Discriminants; Feed-forward Neural
Networks; Non-parametric Methods; Tree-structured Classifiers; Belief
Networks; Unsupervised Methods; Finding Good Pattern Features; Statistical
Sidelines. 

Devroye, L., Györfi, L., and Lugosi, G. (1996), A Probabilistic Theory of
Pattern Recognition, NY: Springer, ISBN 0-387-94618-7, vii+636 pages.
This book has relatively little material explicitly about neural nets, but
what it has is very interesting and much of it is not found in other texts.
The emphasis is on statistical proofs of universal consistency for a wide
variety of methods, including histograms, (k) nearest neighbors, kernels
(PNN), trees, generalized linear discriminants, MLPs, and RBF networks.
There is also considerable material on validation and cross-validation. The
authors say, "We did not scar the pages with backbreaking simulations or
quick-and-dirty engineering solutions" (p. 7). The formula-to-text ratio is
high, but the writing is quite clear, and anyone who has had a year or two
of mathematical statistics should be able to follow the exposition.
Chapter headings: The Bayes Error; Inequalities and Alternate Distance
Measures; Linear Discrimination; Nearest Neighbor Rules; Consistency; Slow
Rates of Convergence; Error Estimation; The Regular Histogram Rule; Kernel
Rules; Consistency of the k-Nearest Neighbor Rule; Vapnik-Chervonenkis
Theory; Combinatorial Aspects of Vapnik-Chervonenkis Theory; Lower Bounds
for Empirical Classifier Selection; The Maximum Likelihood Principle;
Parametric Classification; Generalized Linear Discrimination; Complexity
Regularization; Condensed and Edited Nearest Neighbor Rules; Tree
Classifiers; Data-Dependent Partitioning; Splitting the Data; The
Resubstitution Estimate; Deleted Estimates of the Error Probability;
Automatic Kernel Rules; Automatic Nearest Neighbor Rules; Hypercubes and
Discrete Spaces; Epsilon Entropy and Totally Bounded Sets; Uniform Laws of
Large Numbers; Neural Networks; Other Error Estimates; Feature Extraction. 

The best book on neurofuzzy systems
-----------------------------------

Brown, M., and Harris, C. (1994), Neurofuzzy Adaptive Modelling and
Control, NY: Prentice Hall, ISBN 0-13-134453-6.
Author's Webpage: http://www.isis.ecs.soton.ac.uk/people/m_brown.html
and http://www.ecs.soton.ac.uk/~cjh/
Book Webpage (Publisher): http://www.prenhall.com/books/esm_0131344536.html
Additional Information: Additional page at: 
http://www.isis.ecs.soton.ac.uk/publications/neural/mqbcjh94e.html and an
abstract can be found at: 
http://www.isis.ecs.soton.ac.uk/publications/neural/mqb93.html
Brown and Harris rely on the fundamental insight that a fuzzy system is
a nonlinear mapping from an input space to an output space that can be
parameterized in various ways and therefore can be adapted to data using the
usual neural training methods (see "What is backprop?") or conventional
numerical optimization algorithms (see "What are conjugate gradients,
Levenberg-Marquardt, etc.?"). Their approach makes clear the intimate
connections between fuzzy systems, neural networks, and statistical methods
such as B-spline regression. 
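
The idea that a fuzzy system is a parameterized nonlinear mapping that can
be fitted to data may be easier to see in a small example. The sketch below
uses triangular membership functions, which coincide with linear (order-2)
B-spline basis functions, takes the output as a membership-weighted sum of
adjustable weights, and fits those weights by simple gradient descent (the
LMS rule). It is a minimal sketch under those assumptions, not Brown and
Harris's implementation; all names, the learning rate, and the target
function are hypothetical.

```python
def triangular_memberships(x, centers):
    """Membership of x in triangular fuzzy sets centered at `centers`
    (sorted ascending). Inside the range, at most two sets are active
    and the memberships sum to 1."""
    mu = [0.0] * len(centers)
    if x <= centers[0]:
        mu[0] = 1.0
        return mu
    if x >= centers[-1]:
        mu[-1] = 1.0
        return mu
    for i in range(len(centers) - 1):
        left, right = centers[i], centers[i + 1]
        if left <= x <= right:
            t = (x - left) / (right - left)
            mu[i] = 1.0 - t
            mu[i + 1] = t
            break
    return mu


def fuzzy_output(x, centers, weights):
    """System output: membership-weighted sum of the rule weights."""
    mu = triangular_memberships(x, centers)
    return sum(m * w for m, w in zip(mu, weights))


def train_lms(data, centers, epochs=200, rate=0.2):
    """Fit the weights by per-case gradient descent on squared error."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, target in data:
            mu = triangular_memberships(x, centers)
            error = target - sum(m * w for m, w in zip(mu, weights))
            for i, m in enumerate(mu):
                weights[i] += rate * error * m
    return weights


if __name__ == "__main__":
    centers = [0.0, 0.25, 0.5, 0.75, 1.0]
    data = [(k / 20.0, (k / 20.0) ** 2) for k in range(21)]  # y = x^2
    weights = train_lms(data, centers)
    print([round(w, 3) for w in weights])
    print(round(fuzzy_output(0.6, centers, weights), 3))
```

Because the output is linear in the weights, the same model could equally be
fitted by least squares, which is the link to B-spline regression.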

The best comparison of NNs with other classification methods
------------------------------------------------------------

Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994), Machine Learning,
Neural and Statistical Classification, Ellis Horwood.
Author's Webpage: Donald Michie: http://www.aiai.ed.ac.uk/~dm/dm.html
Additional Information: This book is out of print but available online at 
http://www.amsta.leeds.ac.uk/~charles/statlog/ 

Other notable books
+++++++++++++++++++

Introductory
------------

Anderson, J.A. (1995), An Introduction to Neural Networks, Cambridge, MA:
The MIT Press, ISBN 0-262-01144-1. 
Author's Webpage: http://www.cog.brown.edu/~anderson
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=0262510812 or
http://mitpress.mit.edu/book-home.tcl?isbn=0262011441 (hardback)
Additional Information: Programs and additional information can be found at:
ftp://mitpress.mit.edu/pub/Intro-to-NeuralNets/
Anderson provides an accessible introduction to the AI and
neurophysiological sides of NN research, although the book is weak regarding
practical aspects of using NNs.
Chapter headings: Properties of Single Neurons; Synaptic Integration and
Neuron Models; Essential Vector Operations; Lateral Inhibition and Sensory
Processing; Simple Matrix Operations; The Linear Associator: Background and
Foundations; The Linear Associator: Simulations; Early Network Models: The
Perceptron; Gradient Descent Algorithms; Representation of Information;
Applications of Simple Associators: Concept Formation and Object Motion;
Energy and Neural Networks: Hopfield Networks and Boltzmann Machines;
Nearest Neighbor Models; Adaptive Maps; The BSB Model: A Simple Nonlinear
Autoassociative Neural Network; Associative Computation; Teaching Arithmetic
to a Neural Network. 

Hagan, M.T., Demuth, H.B., and Beale, M. (1996), Neural Network Design, 
Boston: PWS, ISBN 0-534-94332-2. 
It doesn't really say much about design, but this book provides formulas and
examples in excruciating detail for a wide variety of networks. It also
includes some mathematical background material.
Chapter headings: Neuron Model and Network Architectures; An Illustrative
Example; Perceptron Learning Rule; Signal and Weight Vector Spaces; Linear
Transformations for Neural Networks; Supervised Hebbian Learning;
Performance Surfaces and Optimum Points; Performance Optimization;
Widrow-Hoff Learning; Backpropagation; Variations on Backpropagation;
Associative Learning; Competitive Networks; Grossberg Network; Adaptive
Resonance Theory; Stability; Hopfield Network. 

Abdi, H., Valentin, D., and Edelman, B. (1999), Neural Networks, Sage
University Papers Series on Quantitative Applications in the Social
Sciences, 07-124, Thousand Oaks, CA: Sage, ISBN 0-7619-1440-4.
Inexpensive, brief (89 pages) but very detailed explanations of linear
networks and the basics of backpropagation.
Chapter headings: 1. Introduction 2. The Perceptron 3. Linear
Autoassociative Memories 4. Linear Heteroassociative Memories 5. Error
Backpropagation 6. Useful References. 

Bayesian learning
-----------------

Neal, R. M. (1996) Bayesian Learning for Neural Networks, New York:
Springer-Verlag, ISBN 0-387-94724-8. 

Biological learning and neurophysiology
---------------------------------------

Koch, C., and Segev, I., eds. (1998) Methods in Neuronal Modeling: From
Ions to Networks, 2nd ed., Cambridge, MA: The MIT Press, ISBN
0-262-11231-0.
Book Webpage: http://goethe.klab.caltech.edu/MNM/

Rolls, E.T., and Treves, A. (1997), Neural Networks and Brain Function, 
Oxford: Oxford University Press, ISBN: 0198524323.
Chapter headings: Introduction; Pattern association memory; Autoassociation
memory; Competitive networks, including self-organizing maps;
Error-correcting networks: perceptrons, the delta rule, backpropagation of
error in multilayer networks, and reinforcement learning algorithms; The
hippocampus and memory; Pattern association in the brain: amygdala and
orbitofrontal cortex; Cortical networks for invariant pattern recognition;
Motor systems: cerebellum and basal ganglia; Cerebral neocortex. 

Schmajuk, N.A. (1996) Animal Learning and Cognition: A Neural Network
Approach, Cambridge: Cambridge University Press, ISBN 0521456967.
Chapter headings: Neural networks and associative learning; Classical
conditioning: data and theories; Cognitive mapping; Attentional processes;
Storage and retrieval processes; Configural processes; Timing; Operant
conditioning and animal communication: data, theories, and networks; Animal
cognition: data and theories; Place learning and spatial navigation; Maze
learning and cognitive mapping; Learning, cognition, and the hippocampus:
data and theories; Hippocampal modulation of learning and cognition; The
character of the psychological law. 

Collections
-----------

Orr, G.B., and Mueller, K.-R., eds. (1998), Neural Networks: Tricks of the
Trade, Berlin: Springer, ISBN 3-540-65311-2.
Articles: Efficient BackProp; Early Stopping - But When?; A Simple Trick for
Estimating the Weight Decay Parameter; Controlling the Hyperparameter Search
in MacKay's Bayesian Neural Network Framework; Adaptive Regularization in
Neural Network Modeling; Large Ensemble Averaging; Square Unit Augmented,
Radially Extended, Multilayer Perceptrons; A Dozen Tricks with Multitask
Learning; Solving the Ill-Conditioning in Neural Network Learning; Centering
Neural Network Gradient Factors; Avoiding Roundoff Error in Backpropagating
Derivatives; Transformation Invariance in Pattern Recognition - Tangent
Distance and Tangent Propagation; Combining Neural Networks and
Context-Driven Search for On-Line, Printed Handwriting Recognition in the
Newton; Neural Network Classification and Prior Class Probabilities;
Applying Divide and Conquer to Large Scale Pattern Recognition Tasks;
Forecasting the Economy with Neural Nets: A Survey of Challenges and
Solutions; How to Train Neural Networks. 

Arbib, M.A., ed. (1995), The Handbook of Brain Theory and Neural
Networks, Cambridge, MA: The MIT Press, ISBN 0-262-51102-9.
From The Publisher: The heart of the book, part III, comprises of 267
original articles by leaders in the various fields, arranged alphabetically
by title. Parts I and II, written by the editor, are designed to help
readers orient themselves to this vast range of material. Part I,
Background, introduces several basic neural models, explains how the present
study of brain theory and neural networks integrates brain theory,
artificial intelligence, and cognitive psychology, and provides a tutorial
on the concepts essential for understanding neural networks as dynamic,
adaptive systems. Part II, Road Maps, provides entry into the many articles
of part III through an introductory "Meta-Map" and twenty-three road maps,
each of which tours all the Part III articles on the chosen theme. 

Touretzky, D., Hinton, G., and Sejnowski, T., eds., (1989) Proceedings of the
1988 Connectionist Models Summer School, San Mateo, CA: Morgan Kaufmann,
ISBN: 1558600337 

NIPS:

1. Touretzky, D.S., ed. (1989), Advances in Neural Information Processing
   Systems 1, San Mateo, CA: Morgan Kaufmann, ISBN: 1558600159 
2. Touretzky, D. S., ed. (1990), Advances in Neural Information Processing
   Systems 2, San Mateo, CA: Morgan Kaufmann, ISBN: 1558601007 
3. Lippmann, R.P., Moody, J.E., and Touretzky, D. S., eds. (1991) Advances
   in Neural Information Processing Systems 3, San Mateo, CA: Morgan
   Kaufmann, ISBN: 1558601848 
4. Moody, J.E., Hanson, S.J., and Lippmann, R.P., eds. (1992) Advances in
   Neural Information Processing Systems 4, San Mateo, CA: Morgan Kaufmann,
   ISBN: 1558602224 
5. Hanson, S.J., Cowan, J.D., and Giles, C.L. eds. (1993) Advances in
   Neural Information Processing Systems 5, San Mateo, CA: Morgan Kaufmann,
   ISBN: 1558602747 
6. Cowan, J.D., Tesauro, G., and Alspector, J., eds. (1994) Advances in
   Neural Information Processing Systems 6, San Mateo, CA: Morgan Kaufmann,
   ISBN: 1558603220 
7. Tesauro, G., Touretzky, D., and Leen, T., eds. (1995) Advances in Neural
   Information Processing Systems 7, Cambridge, MA: The MIT Press, ISBN:
   0262201046 
8. Touretzky, D. S., Mozer, M.C., and Hasselmo, M.E., eds. (1996) Advances
   in Neural Information Processing Systems 8, Cambridge, MA: The MIT Press,
   ISBN: 0262201070 
9. Mozer, M.C., Jordan, M.I., and Petsche, T., eds. (1997) Advances in
   Neural Information Processing Systems 9, Cambridge, MA: The MIT Press,
   ISBN: 0262100657 
10. Jordan, M.I., Kearns, M.S., and Solla, S.A., eds. (1998) Advances in
   Neural Information Processing Systems 10, Cambridge, MA: The MIT Press,
   ISBN: 0262100762 
11. Kearns, M.S., Solla, S.A., and Cohn, D.A., eds. (1999) Advances in
   Neural Information Processing Systems 11, Cambridge, MA: The MIT Press,
   ISBN: 0262112450 
12. Solla, S.A., Leen, T., and Müller, K.-R., eds. (2000) Advances in Neural
   Information Processing Systems 12, Cambridge, MA: The MIT Press, ISBN:
   0-262-19450-3 

Combining networks
------------------

Sharkey, A.J.C. (1999), Combining Artificial Neural Nets: Ensemble and
Modular Multi-Net Systems, London: Springer, ISBN: 185233004X 

Connectionism
-------------

Elman, J.L., Bates, E.A., Johnson, M.H., Karmiloff-Smith, A., and Parisi, D.
(1996) Rethinking Innateness: A Connectionist Perspective on Development, 
Cambridge, MA: The MIT Press, ISBN: 026255030X.
Chapter headings: New perspectives on development; Why connectionism?
Ontogenetic development: A connectionist synthesis; The shape of change;
Brain development; Interactions, all the way down; Rethinking innateness. 

Plunkett, K., and Elman, J.L. (1997), Exercises in Rethinking Innateness: A
Handbook for Connectionist Simulations, Cambridge, MA: The MIT Press, ISBN:
0262661055.
Chapter headings: Introduction and overview; The methodology of simulations;
Learning to use the simulator; Learning internal representations;
Autoassociation; Generalization; Translation invariance; Simple recurrent
networks; Critical points in learning; Modeling stages in cognitive
development; Learning the English past tense; The importance of starting
small. 

Feedforward networks
--------------------

Fine, T.L. (1999) Feedforward Neural Network Methodology, NY: Springer,
ISBN 0-387-98745-2. 

Husmeier, D. (1999), Neural Networks for Conditional Probability
Estimation: Forecasting Beyond Point Predictions, Berlin: Springer Verlag,
ISBN 185233095. 

Fuzzy logic and neurofuzzy systems
----------------------------------

See also "General (including SVMs and Fuzzy Logic)".

Kosko, B. (1997), Fuzzy Engineering, Upper Saddle River, NJ: Prentice Hall,
ISBN 0-13-124991-6.
Kosko's new book is a big improvement over his older neurofuzzy book and
makes an excellent sequel to Brown and Harris (1994). 

Nauck, D., Klawonn, F., and Kruse, R. (1997), Foundations of Neuro-Fuzzy
Systems, Chichester: Wiley, ISBN 0-471-97151-0.
Chapter headings: Historical and Biological Aspects; Neural Networks; Fuzzy
Systems; Modelling Neuro-Fuzzy Systems; Cooperative Neuro-Fuzzy Systems;
Hybrid Neuro-Fuzzy Systems; The Generic Fuzzy Perceptron; NEFCON -
Neuro-Fuzzy Control; NEFCLASS - Neuro-Fuzzy Classification; NEFPROX -
Neuro-Fuzzy Function Approximation; Neural Networks and Fuzzy Prolog; Using
Neuro-Fuzzy Systems. 

General (including SVMs and Fuzzy Logic)
----------------------------------------

Many books on neural networks, machine learning, etc., present various
methods as miscellaneous tools without any conceptual framework relating
different methods. The best of such neural net "cookbooks" is probably
Haykin's (1999) second edition.

Among conceptually-integrated books, there are two excellent books that use
the Vapnik-Chervonenkis theory as a unifying theme, and provide strong
coverage of support vector machines and fuzzy logic, as well as neural nets.
Of these two, Kecman (2001) provides clearer explanations and better
diagrams, but Cherkassky and Mulier (1998) are better organized and have an
excellent section on unsupervised learning, especially self-organizing maps.
I have been tempted to add both of these books to the "best" list, but I
have not done so because I think VC theory is of doubtful practical utility
for neural nets. However, if you are especially interested in VC theory and
support vector machines, then both of these books can be highly recommended.
To help you choose between them, a detailed table of contents is provided
below for each book. 

Haykin, S. (1999), Neural Networks: A Comprehensive Foundation, 2nd ed.,
Upper Saddle River, NJ: Prentice Hall, ISBN 0-13-273350-1.
The second edition is much better than the first, which has been described
as a core-dump of Haykin's brain. The second edition covers more topics, is
easier to understand, and has better examples. 
Chapter headings: Introduction; Learning Processes; Single Layer
Perceptrons; Multilayer Perceptrons; Radial-Basis Function Networks; Support
Vector Machines; Committee Machines; Principal Components Analysis;
Self-Organizing Maps; Information-Theoretic Models; Stochastic Machines And
Their Approximates Rooted in Statistical Mechanics; Neurodynamic
Programming; Temporal Processing Using Feedforward Networks; Neurodynamics;
Dynamically Driven Recurrent Networks. 

Kecman, V. (2001), Learning and Soft Computing: Support Vector Machines,
Neural Networks, and Fuzzy Logic Models, Cambridge, MA: The MIT Press;
ISBN: 0-262-11255-8.
URL: http://www.support-vector.ws/

Detailed Table of Contents:

1. Learning and Soft Computing: Rationale, Motivations, Needs, Basics
   1.1 Examples of Applications in Diverse Fields
   1.2 Basic Tools of Soft Computing: Neural Networks, Fuzzy Logic Systems, and Support Vector Machines
       1.2.1 Basics of Neural Networks
       1.2.2 Basics of Fuzzy Logic Modeling
   1.3 Basic Mathematics of Soft Computing
       1.3.1 Approximation of Multivariate Functions
       1.3.2 Nonlinear Error Surface and Optimization
   1.4 Learning and Statistical Approaches to Regression and Classification
       1.4.1 Regression
       1.4.2 Classification
   Problems
   Simulation Experiments

2. Support Vector Machines
   2.1 Risk Minimization Principles and the Concept of Uniform Convergence
   2.2 The VC Dimension
   2.3 Structural Risk Minimization
   2.4 Support Vector Machine Algorithms
       2.4.1 Linear Maximal Margin Classifier for Linearly Separable Data
       2.4.2 Linear Soft Margin Classifier for Overlapping Classes
       2.4.3 The Nonlinear Classifier
       2.4.4 Regression by Support Vector Machines
   Problems
   Simulation Experiments

3. Single-Layer Networks
   3.1 The Perceptron
       3.1.1 The Geometry of Perceptron Mapping
       3.1.2 Convergence Theorem and Perceptron Learning Rule
   3.2 The Adaptive Linear Neuron (Adaline) and the Least Mean Square Algorithm
       3.2.1 Representational Capabilities of the Adaline
       3.2.2 Weights Learning for a Linear Processing Unit
   Problems
   Simulation Experiments

4. Multilayer Perceptrons
   4.1  The Error Backpropagation Algorithm
   4.2  The Generalized Delta Rule
   4.3  Heuristics or Practical Aspects of the Error Backpropagation Algorithm
        4.3.1 One, Two, or More Hidden Layers?
        4.3.2 Number of Neurons in a Hidden Layer, or the Bias-Variance Dilemma
        4.3.3 Type of Activation Functions in a Hidden Layer and the Geometry of Approximation
        4.3.4 Weights Initialization
        4.3.5 Error Function for Stopping Criterion at Learning
        4.3.6 Learning Rate and the Momentum Term
   Problems
   Simulation Experiments

5. Radial Basis Function Networks
   5.1 Ill-Posed Problems and the Regularization Technique 
   5.2 Stabilizers and Basis Functions
   5.3 Generalized Radial Basis Function Networks
       5.3.1 Moving Centers Learning
       5.3.2 Regularization with Nonradial Basis Functions
       5.3.3 Orthogonal Least Squares
       5.3.4 Optimal Subset Selection by Linear Programming
   Problems
   Simulation Experiments 

6. Fuzzy Logic Systems
   6.1 Basics of Fuzzy Logic Theory
       6.1.1 Crisp (or Classic) and Fuzzy Sets
       6.1.2 Basic Set Operations
       6.1.3 Fuzzy Relations
       6.1.4 Composition of Fuzzy Relations
       6.1.5 Fuzzy Inference
       6.1.6 Zadeh's Compositional Rule of Inference
       6.1.7 Defuzzification
   6.2 Mathematical Similarities between Neural Networks and Fuzzy Logic Models
   6.3 Fuzzy Additive Models
   Problems
   Simulation Experiments

7. Case Studies
   7.1 Neural Networks-Based Adaptive Control
       7.1.1 General Learning Architecture, or Direct Inverse Modeling
       7.1.2 Indirect Learning Architecture
       7.1.3 Specialized Learning Architecture
       7.1.4 Adaptive Backthrough Control
   7.2 Financial Time Series Analysis
   7.3 Computer Graphics
       7.3.1 One-Dimensional Morphing
       7.3.2 Multidimensional Morphing
       7.3.3 Radial Basis Function Networks for Human Animation
       7.3.4 Radial Basis Function Networks for Engineering Drawings

8. Basic Nonlinear Optimization Methods
   8.1 Classical Methods
       8.1.1 Newton-Raphson Method
       8.1.2 Variable Metric or Quasi-Newton Methods
       8.1.3 Davidon-Fletcher-Powell Method
       8.1.4 Broyden-Fletcher-Goldfarb-Shanno Method
       8.1.5 Conjugate Gradient Methods
       8.1.6 Fletcher-Reeves Method
       8.1.7 Polak-Ribiere Method
       8.1.8 Two Specialized Algorithms for a Sum-of-Error-Squares Error Function
               Gauss-Newton Method
               Levenberg-Marquardt Method
   8.2 Genetic Algorithms and Evolutionary Computing
       8.2.1 Basic Structure of Genetic Algorithms
       8.2.2 Mechanism of Genetic Algorithms

9. Mathematical Tools of Soft Computing
   9.1 Systems of Linear Equations
   9.2 Vectors and Matrices
   9.3 Linear Algebra and Analytic Geometry
   9.4 Basics of Multivariable Analysis
   9.5 Basics from Probability Theory

Cherkassky, V.S., and Mulier, F.M. (1998), Learning from Data : Concepts,
Theory, and Methods, NY: John Wiley & Sons; ISBN: 0-471-15493-8.

Detailed Table of Contents:

1 Introduction

  1.1 Learning and Statistical Estimation
  1.2 Statistical Dependency and Causality
  1.3 Characterization of Variables
  1.4 Characterization of Uncertainty
  References

2 Problem Statement, Classical Approaches, and Adaptive Learning

  2.1 Formulation of the Learning Problem
    2.1.1 Role of the Learning Machine
    2.1.2 Common Learning Tasks
    2.1.3 Scope of the Learning Problem Formulation
  2.2 Classical Approaches
    2.2.1 Density Estimation
    2.2.2 Classification (Discriminant Analysis)
    2.2.3 Regression
    2.2.4 Stochastic Approximation
    2.2.5 Solving Problems with Finite Data
    2.2.6 Nonparametric Methods
  2.3 Adaptive Learning: Concepts and Inductive Principles
    2.3.1 Philosophy, Major Concepts, and Issues
    2.3.2 A priori Knowledge and Model Complexity
    2.3.3 Inductive Principles
  2.4 Summary
  References

3 Regularization Framework

  3.1 Curse and Complexity of Dimensionality
  3.2 Function Approx. and Characterization of Complexity
  3.3 Penalization
    3.3.1 Parametric Penalties
    3.3.2 Nonparametric Penalties
  3.4 Model Selection (Complexity Control)
    3.4.1 Analytical Model Selection Criteria
    3.4.2 Model Selection via Resampling
    3.4.3 Bias-variance Trade-off
    3.4.4 Example of Model Selection
  3.5 Summary
  References

4 Statistical Learning Theory

  4.1 Conditions for Consistency and Convergence of ERM
  4.2 Growth Function and VC-Dimension
    4.2.1 VC-Dimension of the Set of Real-Valued Functions
    4.2.2 VC-Dim. for Classification and Regression Problems
    4.2.3 Examples of Calculating VC-Dimension
  4.3 Bounds on the Generalization
    4.3.1 Classification
    4.3.2 Regression
    4.3.3 Generalization Bounds and Sampling Theorem
  4.4 Structural Risk Minimization
  4.5 Case Study: Comparison of Methods for Model Selection
  4.6 Summary
  References

5 Nonlinear Optimization Strategies

  5.1 Stochastic Approximation Methods
    5.1.1 Linear Parameter Estimation
    5.1.2 Backpropagation Training of MLP Networks
  5.2 Iterative Methods
    5.2.1 Expectation-Maximization Methods for Density Est.
    5.2.2 Generalized Inverse Training of MLP Networks
  5.3 Greedy Optimization
    5.3.1 Neural Network Construction Algorithms
    5.3.2 Classification and Regression Trees (CART)
  5.4 Feature Selection, Optimization, and Stat. Learning Th.
  5.5 Summary
  References

6 Methods for Data Reduction and Dim. Reduction

  6.1 Vector Quantization
    6.1.1 Optimal Source Coding in Vector Quantization
    6.1.2 Generalized Lloyd Algorithm
    6.1.3 Clustering and Vector Quantization
    6.1.4 EM Algorithm for VQ and Clustering
  6.2 Dimensionality Reduction: Statistical Methods
    6.2.1 Linear Principal Components
    6.2.2 Principal Curves and Surfaces
  6.3 Dimensionality Reduction: Neural Network Methods
    6.3.1 Discrete Principal Curves and Self-org. Map Alg.
    6.3.2 Statistical Interpretation of the SOM Method
    6.3.3 Flow-through Version of the SOM and Learning Rate Schedules
    6.3.4 SOM Applications and Modifications
    6.3.5 Self-supervised MLP
  6.4 Summary
  References

7 Methods for Regression

  7.1 Taxonomy: Dictionary versus Kernel Representation
  7.2 Linear Estimators
    7.2.1 Estimation of Linear Models and Equivalence of Representations
    7.2.2 Analytic Form of Cross-validation
    7.2.3 Estimating Complexity of Penalized Linear Models
  7.3 Nonadaptive Methods
    7.3.1 Local Polynomial Estimators and Splines
    7.3.2 Radial Basis Function Networks
    7.3.3 Orthogonal Basis Functions and Wavelets
  7.4 Adaptive Dictionary Methods
    7.4.1 Additive Methods and Projection Pursuit Regression
    7.4.2 Multilayer Perceptrons and Backpropagation
    7.4.3 Multivariate Adaptive Regression Splines
  7.5 Adaptive Kernel Methods and Local Risk Minimization
    7.5.1 Generalized Memory-Based Learning
    7.5.2 Constrained Topological Mapping
  7.6 Empirical Comparisons
    7.6.1 Experimental Setup
    7.6.2 Summary of Experimental Results
  7.7 Combining Predictive Models
  7.8 Summary
  References

8 Classification

  8.1 Statistical Learning Theory formulation
  8.2 Classical Formulation
  8.3 Methods for Classification
    8.3.1 Regression-Based Methods
    8.3.2 Tree-Based Methods
    8.3.3 Nearest Neighbor and Prototype Methods
    8.3.4 Empirical Comparisons
  8.4 Summary
  References

9 Support Vector Machines

  9.1 Optimal Separating Hyperplanes
  9.2 High Dimensional Mapping and Inner Product Kernels
  9.3 Support Vector Machine for Classification
  9.4 Support Vector Machine for Regression
  9.5 Summary
  References

10 Fuzzy Systems

  10.1 Terminology, Fuzzy Sets, and Operations
  10.2 Fuzzy Inference Systems and Neurofuzzy Systems
    10.2.1 Fuzzy Inference Systems
    10.2.2 Equivalent Basis Function Representation
    10.2.3 Learning Fuzzy Rules from Data
  10.3 Applications in Pattern Recognition
    10.3.1 Fuzzy Input Encoding and Fuzzy Postprocessing
    10.3.2 Fuzzy Clustering
  10.4 Summary
  References

Appendix A: Review of Nonlinear Optimization

Appendix B: Eigenvalues and Singular Value Decomposition

History
-------

Hebb, D.O. (1949), The Organization of Behavior, NY: Wiley. Out of print. 

Rosenblatt, F. (1962), Principles of Neurodynamics, NY: Spartan Books. Out
of print. 

Anderson, J.A., and Rosenfeld, E., eds. (1988), Neurocomputing:
Foundations of Research, Cambridge, MA: The MIT Press, ISBN 0-262-01097-6.
Author's Webpage: http://www.cog.brown.edu/~anderson
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=0262510480
43 articles of historical importance, ranging from William James to
Rumelhart, Hinton, and Williams. 

Anderson, J. A., Pellionisz, A. and Rosenfeld, E. (Eds). (1990). 
Neurocomputing 2: Directions for Research. The MIT Press: Cambridge, MA. 
Author's Webpage: http://www.cog.brown.edu/~anderson
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=0262510758

Carpenter, G.A., and Grossberg, S., eds. (1991), Pattern Recognition by
Self-Organizing Neural Networks, Cambridge, MA: The MIT Press, ISBN
0-262-03176-0
Articles on ART, BAM, SOMs, counterpropagation, etc. 

Nilsson, N.J. (1965/1990), Learning Machines, San Mateo, CA: Morgan
Kaufmann, ISBN 1-55860-123-6. 

Minsky, M.L., and Papert, S.A. (1969/1988) Perceptrons, Cambridge, MA: The
MIT Press, 1st ed. 1969, expanded edition 1988, ISBN 0-262-63111-3. 

Werbos, P.J. (1994), The Roots of Backpropagation, NY: John Wiley & Sons,
ISBN: 0471598976. Includes Werbos's 1974 Harvard Ph.D. thesis, Beyond
Regression. 

Kohonen, T. (1984/1989), Self-organization and Associative Memory, 1st ed.
1984, 3rd ed. 1989, NY: Springer. 
Author's Webpage: http://www.cis.hut.fi/nnrc/teuvo.html
Book Webpage (Publisher): http://www.springer.de/
Additional Information: Book is out of print. 

Rumelhart, D. E. and McClelland, J. L. (1986), Parallel Distributed
Processing: Explorations in the Microstructure of Cognition, Volumes 1 & 2,
Cambridge, MA: The MIT Press ISBN 0-262-63112-1. 
Author's Webpage: 
http://www-med.stanford.edu/school/Neurosciences/faculty/rumelhart.html
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=0262631121 

Hecht-Nielsen, R. (1990), Neurocomputing, Reading, MA: Addison-Wesley,
ISBN 0-201-09355-3.
Book Webpage (Publisher): http://www.awl.com/

Anderson, J.A., and Rosenfeld, E., eds. (1998), Talking Nets: An Oral
History of Neural Networks, Cambridge, MA: The MIT Press, ISBN
0-262-51111-8. 

Knowledge, rules, and expert systems
------------------------------------

Gallant, S.I. (1995), Neural Network Learning and Expert Systems, 
Cambridge, MA: The MIT Press, ISBN 0-262-07145-2.
Chapter headings: Introduction and Important Definitions; Representation
Issues; Perceptron Learning and the Pocket Algorithm; Winner-Take-All Groups
or Linear Machines; Autoassociators and One-Shot Learning; Mean Squared
Error (MSE) Algorithms; Unsupervised Learning; The Distributed Method and
Radial Basis Functions; Computational Learning Theory and the BRD Algorithm;
Constructive Algorithms; Backpropagation; Backpropagation: Variations and
Applications; Simulated Annealing and Boltzmann Machines; Expert Systems and
Neural Networks; Details of the MACIE System; Noise, Redundancy, Fault
Detection, and Bayesian Decision Theory; Extracting Rules from Networks;
Appendix: Representation Comparisons. 

Cloete, I., and Zurada, J.M. (2000), Knowledge-Based Neurocomputing, 
Cambridge, MA: The MIT Press, ISBN 0-262-03274-0.
Articles: Knowledge-Based Neurocomputing: Past, Present, and Future;
Architectures and Techniques for Knowledge-Based Neurocomputing; Symbolic
Knowledge Representation in Recurrent Neural Networks: Insights from
Theoretical Models of Computation; A Tutorial on Neurocomputing of
Structures; Structural Learning and Rule Discovery; VL1ANN:
Transformation of Rules to Artificial Neural Networks; Integrations of
Heterogeneous Sources of Partial Domain Knowledge; Approximation of
Differential Equations Using Neural Networks; Fynesse: A Hybrid Architecture
for Self-Learning Control; Data Mining Techniques for Designing Neural
Network Time Series Predictors; Extraction of Decision Trees from Artificial
Neural Networks; Extraction of Linguistic Rules from Data via Neural
Networks and Fuzzy Approximation; Neural Knowledge Processing in Expert
Systems. 

Learning theory
---------------

Wolpert, D.H., ed. (1995) The Mathematics of Generalization: The
Proceedings of the SFI/CNLS Workshop on Formal Approaches to Supervised
Learning, Santa Fe Institute Studies in the Sciences of Complexity, Volume
XX, Reading, MA: Addison-Wesley, ISBN: 0201409836.
Articles: The Status of Supervised Learning Science circa 1994 - The Search
for a Consensus; Reflections After Refereeing Papers for NIPS; The Probably
Approximately Correct (PAC) and Other Learning Models; Decision Theoretic
Generalizations of the PAC Model for Neural Net and Other Learning
Applications; The Relationship Between PAC, the Statistical Physics
Framework, the Bayesian Framework, and the VC Framework; Statistical Physics
Models of Supervised Learning; On Exhaustive Learning; A Study of
Maximal-Coverage Learning Algorithms; On Bayesian Model Selection; Soft
Classification, a.k.a. Risk Estimation, via Penalized Log Likelihood and
Smoothing Spline Analysis of Variance; Current Research; Preface to
Simplifying Neural Networks by Soft Weight Sharing; Simplifying Neural
Networks by Soft Weight Sharing; Error-Correcting Output Codes: A General
Method for Improving Multiclass Inductive Learning Programs; Image
Segmentation and Recognition. 

Anthony, M., and Bartlett, P.L. (1999), Neural Network Learning:
Theoretical Foundations, Cambridge: Cambridge University Press, ISBN
0-521-57353-X. 

Vapnik, V.N. (1998) Statistical Learning Theory, NY: Wiley, ISBN: 0471030031.
This book is much better than Vapnik's The Nature of Statistical Learning
Theory.
Chapter headings:
0. Introduction: The Problem of Induction and Statistical Inference;
1. Two Approaches to the Learning Problem;
Appendix: Methods for Solving Ill-Posed Problems;
2. Estimation of the Probability Measure and Problem of Learning;
3. Conditions for Consistency of Empirical Risk Minimization Principle;
4. Bounds on the Risk for Indicator Loss Functions;
Appendix: Lower Bounds on the Risk of the ERM Principle;
5. Bounds on the Risk for Real-Valued Loss Functions;
6. The Structural Risk Minimization Principle;
Appendix: Estimating Functions on the Basis of Indirect Measurements;
7. Stochastic Ill-Posed Problems;
8. Estimating the Values of Functions at Given Points;
9. Perceptrons and Their Generalizations;
10. The Support Vector Method for Estimating Indicator Functions;
11. The Support Vector Method for Estimating Real-Valued Functions;
12. SV Machines for Pattern Recognition; (includes examples of digit
recognition)
13. SV Machines for Function Approximations, Regression Estimation, and
Signal Processing; (includes an example of positron emission tomography)
14. Necessary and Sufficient Conditions for Uniform Convergence of
Frequencies to Their Probabilities;
15. Necessary and Sufficient Conditions for Uniform Convergence of Means to
Their Expectations;
16. Necessary and Sufficient Conditions for Uniform One-Sided Convergence of
Means to Their Expectations;
Comments and Bibliographical Remarks. 

Object oriented programming
---------------------------

The FAQ maintainer is an old-fashioned C programmer and has no expertise in
object oriented programming, so he must rely on the readers of
comp.ai.neural-nets regarding the merits of books on OOP for NNs. 

There are many excellent books about NNs by Timothy Masters (listed
elsewhere in the FAQ) that provide C++ code for NNs. If you simply want code
that works, these books should satisfy your needs. If you want code that
exemplifies the highest standards of object oriented design, you will be
disappointed by Masters. 

The one book on OOP for NNs that seems to be consistently praised is:

Rogers, Joey (1996), Object-Oriented Neural Networks in C++, Academic
Press, ISBN 0125931158.
Contents:
1. Introduction
2. Object-Oriented Programming Review
3. Neural-Network Base Classes
4. ADALINE Network
5. Backpropagation Neural Network
6. Self-Organizing Neural Network
7. Bidirectional Associative Memory
Appendix A Support Classes
Appendix B Listings
References and Suggested Reading

However, you will learn very little about NNs other than elementary
programming techniques from Rogers. To quote a customer review at the Barnes
& Noble web site (http://www.bn.com): 

   A reviewer, a scientific programmer, July 19, 2000, **** Long
   explaination of neural net code - not of neural nets Good OO code for
   simple 'off the shelf' implementation, very open & fairly extensible
   for further cusomization. A complete & lucid explanation of the code
   but pretty weak on the principles, theory, and application of neural
   networks. Great as a code source, disappointing as a neural network
   tutorial. 

On-line and incremental learning
--------------------------------

Saad, D., ed. (1998), On-Line Learning in Neural Networks, Cambridge:
Cambridge University Press, ISBN 0-521-65263-4.
Articles: Introduction; On-line Learning and Stochastic Approximations;
Exact and Perturbation Solutions for the Ensemble Dynamics; A Statistical
Study of On-line Learning; On-line Learning in Switching and Drifting
Environments with Application to Blind Source Separation; Parameter
Adaptation in Stochastic Optimization; Optimal On-line Learning in
Multilayer Neural Networks; Universal Asymptotics in Committee Machines with
Tree Architecture; Incorporating Curvature Information into On-line
Learning; Annealed On-line Learning in Multilayer Neural Networks; On-line
Learning of Prototypes and Principal Components; On-line Learning with
Time-Correlated Examples; On-line Learning from Finite Training Sets;
Dynamics of Supervised Learning with Restricted Training Sets; On-line
Learning of a Decision Boundary with and without Queries; A Bayesian
Approach to On-line Learning; Optimal Perceptron Learning: an On-line
Bayesian Approach. 

Optimization
------------

Cichocki, A. and Unbehauen, R. (1993). Neural Networks for Optimization
and Signal Processing. NY: John Wiley & Sons, ISBN 0-471-93010-5
(hardbound), 526 pages, $57.95. 
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search the publisher's site for the book.
Chapter headings: Mathematical Preliminaries of Neurocomputing;
Architectures and Electronic Implementation of Neural Network Models;
Unconstrained Optimization and Learning Algorithms; Neural Networks for
Linear, Quadratic Programming and Linear Complementarity Problems; A Neural
Network Approach to the On-Line Solution of a System of Linear Algebraic
Equations and Related Problems; Neural Networks for Matrix Algebra Problems;
Neural Networks for Continuous, Nonlinear, Constrained Optimization
Problems; Neural Networks for Estimation, Identification and Prediction;
Neural Networks for Discrete and Combinatorial Optimization Problems. 

Pulsed/Spiking networks
-----------------------

Maass, W., and Bishop, C.M., eds. (1999) Pulsed Neural Networks, 
Cambridge, MA: The MIT Press, ISBN: 0262133504.
Articles: Spiking Neurons; Computing with Spiking Neurons; Pulse-Based
Computation in VLSI Neural Networks; Encoding Information in Neuronal
Activity; Building Silicon Nervous Systems with Dendritic Tree Neuromorphs;
A Pulse-Coded Communications Infrastructure; Analog VLSI Pulsed Networks for
Perceptive Processing; Preprocessing for Pulsed Neural VLSI Systems; Digital
Simulation of Spiking Neural Networks; Populations of Spiking Neurons;
Collective Excitation Phenomena and Their Applications; Computing and
Learning with Dynamic Synapses; Stochastic Bit-Stream Neural Networks;
Hebbian Learning of Pulse Timing in the Barn Owl Auditory System. 

Recurrent
---------

Medsker, L.R., and Jain, L.C., eds. (2000), Recurrent Neural Networks:
Design and Applications, Boca Raton, FL: CRC Press, ISBN 0-8493-7181-3.
Articles:
Introduction;
Recurrent Neural Networks for Optimization: The State of the Art;
Efficient Second-Order Learning Algorithms for Discrete-Time Recurrent
Neural Networks;
Designing High Order Recurrent Networks for Bayesian Belief Revision;
Equivalence in Knowledge Representation: Automata, Recurrent Neural
Networks, and Dynamical Fuzzy Systems;
Learning Long-Term Dependencies in NARX Recurrent Neural Networks;
Oscillation Responses in a Chaotic Recurrent Network;
Lessons from Language Learning;
Recurrent Autoassociative Networks: Developing Distributed Representations
of Hierarchically Structured Sequences by Autoassociation;
Comparison of Recurrent Neural Networks for Trajectory Generation;
Training Algorithms for Recurrent Neural Nets that Eliminate the Need for
Computation of Error Gradients with Application to Trajectory Production
Problem;
Training Recurrent Neural Networks for Filtering and Control;
Remembering How to Behave: Recurrent Neural Networks for Adaptive Robot
Behavior 

Reinforcement learning
----------------------

Sutton, R.S., and Barto, A.G. (1998), Reinforcement Learning: An
Introduction, The MIT Press, ISBN: 0-262-19398-1.
Author's Webpage: http://envy.cs.umass.edu/~rich/sutton.html and 
http://www-anw.cs.umass.edu/People/barto/barto.html
Book Webpage (Publisher):
http://mitpress.mit.edu/book-home.tcl?isbn=0262193981
Additional Information: http://www-anw.cs.umass.edu/~rich/book/the-book.html
Chapter headings: The Problem; Introduction; Evaluative Feedback; The
Reinforcement Learning Problem; Elementary Solution Methods; Dynamic
Programming; Monte Carlo Methods; Temporal-Difference Learning; A Unified
View; Eligibility Traces; Generalization and Function Approximation;
Planning and Learning; Dimensions of Reinforcement Learning; Case Studies. 

Bertsekas, D. P. and Tsitsiklis, J. N. (1996), Neuro-Dynamic
Programming, Belmont, MA: Athena Scientific, ISBN 1-886529-10-8.
Author's Webpage: http://www.mit.edu:8001/people/dimitrib/home.html and 
http://web.mit.edu/jnt/www/home.html
Book Webpage (Publisher): http://world.std.com/~athenasc/ndpbook.html

Speech recognition
------------------

Bourlard, H.A., and Morgan, N. (1994), Connectionist Speech Recognition: A
Hybrid Approach, Boston: Kluwer Academic Publishers, ISBN: 0792393961.
From The Publisher: Describes the theory and implementation of a method to
incorporate neural network approaches into state-of-the-art continuous
speech recognition systems based on Hidden Markov Models (HMMs) to improve
their performance. In this framework, neural networks (and in particular,
multilayer perceptrons or MLPs) have been restricted to well-defined
subtasks of the whole system, i.e., HMM emission probability estimation and
feature extraction. The book describes a successful five year international
collaboration between the authors. The lessons learned form a case study
that demonstrates how hybrid systems can be developed to combine neural
networks with more traditional statistical approaches. The book illustrates
both the advantages and limitations of neural networks in the framework of a
statistical system. Using standard databases and comparing with some
conventional approaches, it is shown that MLP probability estimation can
improve recognition performance. Other approaches are discussed, though
there is no such unequivocal experimental result for these methods.
Connectionist Speech Recognition: A Hybrid Approach is of use to anyone
intending to use neural networks for speech recognition or within the
framework provided by an existing successful statistical approach. This
includes research and development groups working in the field of speech
recognition, both with standard and neural network approaches, as well as
other pattern recognition and/or neural network researchers. This book is
also suitable as a text for advanced courses on neural networks or speech
processing. 

Statistics
----------

Cherkassky, V., Friedman, J.H., and Wechsler, H., eds. (1991) From
Statistics to Neural Networks: Theory and Pattern Recognition Applications, 
NY: Springer, ISBN 0-387-58199-5. 

Kay, J.W., and Titterington, D.M. (1999) Statistics and Neural Networks:
Advances at the Interface, Oxford: Oxford University Press, ISBN
0-19-852422-6.
Articles: Flexible Discriminant and Mixture Models; Neural Networks for
Unsupervised Learning Based on Information Theory; Radial Basis Function
Networks and Statistics; Robust Prediction in Many-parameter Models; Density
Networks; Latent Variable Models and Data Visualisation; Analysis of Latent
Structure Models with Multidimensional Latent Variables; Artificial Neural
Networks and Multivariate Statistics. 

White, H. (1992b), Artificial Neural Networks: Approximation and Learning
Theory, Blackwell, ISBN: 1557863296.
Articles: There Exists a Neural Network That Does Not Make Avoidable
Mistakes; Multilayer Feedforward Networks Are Universal Approximators;
Universal Approximation Using Feedforward Networks with Non-sigmoid Hidden
Layer Activation Functions; Approximating and Learning Unknown Mappings
Using Multilayer Feedforward Networks with Bounded Weights; Universal
Approximation of an Unknown Mapping and Its Derivatives; Neural Network
Learning and Statistics; Learning in Artificial Neural Networks: a
Statistical Perspective; Some Asymptotic Results for Learning in Single
Hidden Layer Feedforward Networks; Connectionist Nonparametric Regression:
Multilayer Feedforward Networks Can Learn Arbitrary Mappings; Nonparametric
Estimation of Conditional Quantiles Using Neural Networks; On Learning the
Derivatives of an Unknown Mapping with Multilayer Feedforward Networks;
Consequences and Detection of Misspecified Nonlinear Regression Models;
Maximum Likelihood Estimation of Misspecified Models; Some Results for Sieve
Estimation with Dependent Observations. 

Time-series forecasting
-----------------------

Weigend, A.S. and Gershenfeld, N.A., eds. (1994) Time Series Prediction:
Forecasting the Future and Understanding the Past, Reading, MA:
Addison-Wesley, ISBN 0201626020. Book Webpage (Publisher): 
http://www2.awl.com/gb/abp/sfi/complexity.html

Unsupervised learning
---------------------

Kohonen, T. (1995/1997), Self-Organizing Maps, 1st ed. 1995, 2nd ed. 1997,
Berlin: Springer-Verlag, ISBN 3540620176. 

Deco, G. and Obradovic, D. (1996), An Information-Theoretic Approach to
Neural Computing, NY: Springer-Verlag, ISBN 0-387-94666-7. 

Diamantaras, K.I., and Kung, S.Y. (1996) Principal Component Neural
Networks: Theory and Applications, NY: Wiley, ISBN 0-471-05436-4. 

Van Hulle, M.M. (2000), Faithful Representations and Topographic Maps:
From Distortion- to Information-Based Self-Organization, NY: Wiley, ISBN
0-471-34507-5. 

Books for the Beginner
++++++++++++++++++++++

Caudill, M. and Butler, C. (1990). Naturally Intelligent Systems. MIT Press:
Cambridge, Massachusetts. (ISBN 0-262-03156-6). 
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=0262531135
The authors try to translate mathematical formulas into English. The results
are likely to disturb people who appreciate either mathematics or English.
Have the authors never heard that "a picture is worth a thousand words"?
What few diagrams they have (such as the one on p. 74) tend to be confusing.
Their jargon is peculiar even by NN standards; for example, they refer to
target values as "mentor inputs" (p. 66). The authors do not understand
elementary properties of error functions and optimization algorithms. For
example, in their discussion of the delta rule, the authors seem oblivious
to the differences between batch and on-line training, and they attribute
magical properties to the algorithm (p. 71): 

   [The on-line delta] rule always takes the most efficient route from
   the current position of the weight vector to the "ideal" position,
   based on the current input pattern. The delta rule not only minimizes
   the mean squared error, it does so in the most efficient fashion
   possible--quite an achievement for such a simple rule. 
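
The difference between the two training modes is easy to check numerically.
The following is a minimal sketch in Python with NumPy, not anything taken
from the book: the two training cases, the learning rate, and the function
names are all made up for illustration. It applies one on-line delta-rule
step and one batch step to a single linear unit, starting from the weight
that already minimizes the mean squared error over both cases.

```python
import numpy as np

def mse(w, X, t):
    """Mean squared error of a linear unit over all training cases."""
    return float(np.mean((t - X @ w) ** 2))

def online_delta_step(w, x, target, lr):
    """Delta rule applied to a single case (on-line training)."""
    return w + lr * (target - x @ w) * x

def batch_delta_step(w, X, t, lr):
    """Delta rule averaged over all cases (batch training)."""
    return w + lr * (X.T @ (t - X @ w)) / len(t)

# Hypothetical data: two cases with the same input and conflicting targets,
# so the weight that minimizes the mean squared error is w = 0.
X = np.array([[1.0], [1.0]])
t = np.array([1.0, -1.0])
w0 = np.array([0.0])
lr = 0.5

w_online = online_delta_step(w0, X[0], t[0], lr)
w_batch = batch_delta_step(w0, X, t, lr)

print("MSE at start:        ", mse(w0, X, t))
print("MSE after on-line step:", mse(w_online, X, t))
print("MSE after batch step:  ", mse(w_batch, X, t))
```

In this example the on-line step reduces the error on the current case but
moves the weight away from the minimum of the mean squared error, while the
batch step leaves the weight where it is. The on-line rule follows the
gradient of the error for one case, not of the mean squared error.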

While the authors realize that backpropagation networks can suffer from
local minima, they mistakenly think that counterpropagation has some kind of
global optimization ability (p. 202): 

   Unlike the backpropagation network, a counterpropagation network
   cannot be fooled into finding a local minimum solution. This means
   that the network is guaranteed to find the correct response (or the
   nearest stored response) to an input, no matter what. 

But even though they acknowledge the problem of local minima, the authors
are ignorant of the importance of initial weight values (p. 186): 

   To teach our imaginary network something using backpropagation, we
   must start by setting all the adaptive weights on all the neurodes in
   it to random values. It won't matter what those values are, as long
   as they are not all the same and not equal to 1. 

Like most introductory books, this one neglects the difficulties of getting
good generalization--the authors simply declare (p. 8) that "A neural
network is able to generalize"! 

Chester, M. (1993). Neural Networks: A Tutorial, Englewood Cliffs, NJ: PTR
Prentice Hall. 
Book Webpage (Publisher): http://www.prenhall.com/
Additional Information: Seems to be out of print.
Shallow, sometimes confused, especially with regard to Kohonen networks. 

Dayhoff, J. E. (1990). Neural Network Architectures: An Introduction. Van
Nostrand Reinhold: New York. 
Comments from readers of comp.ai.neural-nets: "Like Wasserman's book,
Dayhoff's book is also very easy to understand".

Freeman, James (1994). Simulating Neural Networks with Mathematica,
Addison-Wesley, ISBN: 0-201-56629-X. Book Webpage (Publisher): 
http://cseng.aw.com/bookdetail.qry?ISBN=0-201-56629-X&ptype=0
Additional Information: Source code available under: 
ftp://ftp.mathsource.com/pub/Publications/BookSupplements/Freeman-1993
Helps the reader make his own NNs. The Mathematica code for the programs in
the book is also available through the Internet: send mail to 
MathSource@wri.com or try http://www.wri.com/ on the World Wide Web.

Freeman, J.A. and Skapura, D.M. (1991). Neural Networks: Algorithms,
Applications, and Programming Techniques, Reading, MA: Addison-Wesley. 
Book Webpage (Publisher): http://www.awl.com/
Additional Information: Seems to be out of print.
A good book for beginning programmers who want to learn how to write NN
programs while avoiding any understanding of what NNs do or why they do it. 

Gately, E. (1996). Neural Networks for Financial Forecasting. New York:
John Wiley and Sons, Inc.
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search.
Franco Insana comments:

* Decent book for the neural net beginner
* Very little devoted to statistical framework, although there 
    is some formulation of backprop theory
* Some food for thought
* Nothing here for those with any neural net experience

McClelland, J. L. and Rumelhart, D. E. (1988). Explorations in Parallel
Distributed Processing: Computational Models of Cognition and Perception
(software manual). The MIT Press. 
Book Webpage (Publisher): 
http://mitpress.mit.edu/book-home.tcl?isbn=026263113X (IBM version) and
http://mitpress.mit.edu/book-home.tcl?isbn=0262631296 (Macintosh)
Comments from readers of comp.ai.neural-nets: "Written in a tutorial style,
and includes 2 diskettes of NN simulation programs that can be compiled on
MS-DOS or Unix (and they do too !)"; "The programs are pretty reasonable as
an introduction to some of the things that NNs can do."; "There are *two*
editions of this book. One comes with disks for the IBM PC, the other comes
with disks for the Macintosh".

McCord Nelson, M. and Illingworth, W.T. (1990). A Practical Guide to Neural
Nets. Addison-Wesley Publishing Company, Inc. (ISBN 0-201-52376-0). 
Book Webpage (Publisher): 
http://cseng.aw.com/bookdetail.qry?ISBN=0-201-63378-7&ptype=1174
Lots of applications without technical details, lots of hype, lots of goofs,
no formulas.

Muller, B., Reinhardt, J., Strickland, M. T. (1995). Neural Networks: An
Introduction (2nd ed.). Berlin, Heidelberg, New York: Springer-Verlag. ISBN
3-540-60207-0. (DOS 3.5" disk included.) 
Book Webpage (Publisher): 
http://www.springer.de/catalog/html-files/deutsch/phys/3540602070.html
Comments from readers of comp.ai.neural-nets: "The book was developed out of
a course on neural-network models with computer demonstrations that was
taught by the authors to Physics students. The book comes together with a
PC-diskette. The book is divided into three parts: (1) Models of Neural
Networks; describing several architectures and learning rules, including the
mathematics. (2) Statistical Physics of Neural Networks; "hard-core" physics
section developing formal theories of stochastic neural networks. (3)
Computer Codes; explanation about the demonstration programs. First part
gives a nice introduction into neural networks together with the formulas.
Together with the demonstration programs a 'feel' for neural networks can be
developed." 

Orchard, G.A. & Phillips, W.A. (1991). Neural Computation: A Beginner's
Guide. Lawrence Erlbaum Associates: London. 
Comments from readers of comp.ai.neural-nets: "Short user-friendly
introduction to the area, with a non-technical flavour. Apparently
accompanies a software package, but I haven't seen that yet".

Rao, V.B, and Rao, H.V. (1993). C++ Neural Networks and Fuzzy Logic.
MIS:Press, ISBN 1-55828-298-x, US $45 incl. disks. 
Covers a wider variety of networks than Masters (1993), but is shallow and
lacks Masters's insight into practical issues of using NNs.

Wasserman, P. D. (1989). Neural Computing: Theory & Practice. Van Nostrand
Reinhold: New York. (ISBN 0-442-20743-3) 
This is not as bad as some books on NNs. It provides an elementary account
of the mechanics of a variety of networks. But it provides no insight into
why various methods behave as they do, or under what conditions a method
will or will not work well. It has no discussion of efficient training
methods such as RPROP or conventional numerical optimization techniques.
And, most egregiously, it has no explanation of overfitting and
generalization beyond the patently false statement on p. 2 that "It is
important to note that the artificial neural network generalizes
automatically as a result of its structure"! There is no mention of
training, validation, and test sets, or of other methods for estimating
generalization error. There is no practical advice on the important issue of
choosing the number of hidden units. There is no discussion of early
stopping or weight decay. The reader will come away from this book with a
grossly oversimplified view of NNs and no concept whatsoever of how to use
NNs for practical applications. 

Comments from readers of comp.ai.neural-nets: "Wasserman flatly enumerates
some common architectures from an engineer's perspective ('how it works')
without ever addressing the underlying fundamentals ('why it works') -
important basic concepts such as clustering, principal components or
gradient descent are not treated. It's also full of errors, and unhelpful
diagrams drawn with what appears to be PCB board layout software from the
'70s. For anyone who wants to do active research in the field I consider it
quite inadequate"; "Okay, but too shallow"; "Quite easy to understand"; "The
best bedtime reading for Neural Networks. I have given this book to numerous
colleagues who want to know NN basics, but who never plan to implement
anything. An excellent book to give your manager."

Not-quite-so-introductory Literature
++++++++++++++++++++++++++++++++++++

Kung, S.Y. (1993). Digital Neural Networks, Prentice Hall, Englewood
Cliffs, NJ.
Book Webpage (Publisher): http://www.prenhall.com/books/ptr_0136123260.html

Levine, D. S. (2000). Introduction to Neural and Cognitive Modeling. 2nd
ed., Lawrence Erlbaum: Hillsdale, N.J. 
Comments from readers of comp.ai.neural-nets: "Highly recommended".

Maren, A., Harston, C. and Pap, R., (1990). Handbook of Neural Computing
Applications. Academic Press. ISBN: 0-12-471260-6. (451 pages) 
Comments from readers of comp.ai.neural-nets: "They cover a broad area";
"Introductory with suggested applications implementation".

Pao, Y. H. (1989). Adaptive Pattern Recognition and Neural Networks.
Addison-Wesley Publishing Company, Inc. (ISBN 0-201-12584-6) 
Book Webpage (Publisher): http://www.awl.com/
Comments from readers of comp.ai.neural-nets: "An excellent book that ties
together classical approaches to pattern recognition with Neural Nets. Most
other NN books do not even mention conventional approaches."

Refenes, A. (Ed.) (1995). Neural Networks in the Capital Markets.
Chichester, England: John Wiley and Sons, Inc.
Book Webpage (Publisher): http://www.wiley.com/
Additional Information: One has to search.
Franco Insana comments:

* Not for the beginner
* Excellent introductory material presented by editor in first 5 
  chapters, which could be a valuable reference source for any 
  practitioner
* Very thought-provoking
* Mostly backprop-related
* Most contributors lay good statistical foundation
* Overall, a wealth of information and ideas, but the reader has to 
  sift through it all to come away with anything useful

Simpson, P. K. (1990). Artificial Neural Systems: Foundations, Paradigms,
Applications and Implementations. Pergamon Press: New York. 
Comments from readers of comp.ai.neural-nets: "Contains a very useful 37
page bibliography. A large number of paradigms are presented. On the
negative side the book is very shallow. Best used as a complement to other
books".

Wasserman, P.D. (1993). Advanced Methods in Neural Computing. Van
Nostrand Reinhold: New York (ISBN: 0-442-00461-3). 
Comments from readers of comp.ai.neural-nets: "Several neural network topics
are discussed e.g. Probabilistic Neural Networks, Backpropagation and beyond,
neural control, Radial Basis Function Networks, Neural Engineering.
Furthermore, several subjects related to neural networks are mentioned e.g.
genetic algorithms, fuzzy logic, chaos. Just the functionality of these
subjects is described; enough to get you started. Lots of references are
given to more elaborate descriptions. Easy to read, no extensive
mathematical background necessary." 

Zeidenberg, M. (1990). Neural Networks in Artificial Intelligence. Ellis
Horwood, Ltd., Chichester. 
Comments from readers of comp.ai.neural-nets: "Gives the AI point of view".

Zornetzer, S. F., Davis, J. L. and Lau, C. (1990). An Introduction to Neural
and Electronic Networks. Academic Press. (ISBN 0-12-781881-2) 
Comments from readers of comp.ai.neural-nets: "Covers quite a broad range of
topics (collection of articles/papers )."; "Provides a primer-like
introduction and overview for a broad audience, and employs a strong
interdisciplinary emphasis".

Zurada, Jacek M. (1992). Introduction To Artificial Neural Systems.
Hardcover, 785 Pages, 317 Figures, ISBN 0-534-95460-X, 1992, PWS Publishing
Company, Price: $56.75 (includes shipping, handling, and the ANS software
diskette). Solutions Manual available.
Comments from readers of comp.ai.neural-nets: "Cohesive and comprehensive
book on neural nets; as an engineering-oriented introduction, but also as a
research foundation. Thorough exposition of fundamentals, theory and
applications. Training and recall algorithms appear in boxes showing steps
of algorithms, thus making programming of learning paradigms easy. Many
illustrations and intuitive examples. Winner among NN textbooks at a senior
UG/first year graduate level-[175 problems]." Contents: Intro, Fundamentals
of Learning, Single-Layer & Multilayer Perceptron NN, Assoc. Memories,
Self-organizing and Matching Nets, Applications, Implementations, Appendix. 

Books with Source Code (C, C++)
+++++++++++++++++++++++++++++++

Blum, Adam (1992), Neural Networks in C++, Wiley.
-------------------------------------------------

Review by Ian Cresswell. (For a review of the text, see "The Worst" below.) 

Mr Blum has not only contributed a masterpiece of NN inaccuracy but also
seems to lack a fundamental understanding of Object Orientation. 

The excessive use of virtual methods (see page 32 for example), the
inclusion of unnecessary 'friend' relationships (page 133) and a penchant
for operator overloading (pick a page!) demonstrate inability in C++ and/or
OO. 

The introduction to OO that is provided trivialises the area and
demonstrates a distinct lack of direction and/or understanding. 

The public interfaces to classes are overspecified and the design relies
upon the flawed neuron/layer/network model. 

There is a notable disregard for any notion of a robust class hierarchy
which is demonstrated by an almost total lack of concern for inheritance and
associated reuse strategies. 

The attempt to rationalise differing types of Neural Network into a single
very shallow but wide class hierarchy is naive. 

The general use of the 'float' data type would cause serious hassle if this
software could possibly be extended to use some of the more sensitive
variants of backprop on more difficult problems. It is a matter of great
fortune that such software is unlikely to be reusable and will therefore,
like all good dinosaurs, disappear with the passage of time. 

The irony is that there is a card in the back of the book asking the
unfortunate reader to part with a further $39.95 for a copy of the software
(already included in print) on a 5.25" disk. 

The author claims that his work provides an 'Object Oriented Framework ...'.
This can best be put in his own terms (Page 137): 

... garble(float noise) ...

Swingler, K. (1996), Applying Neural Networks: A Practical Guide, London:
-------------------------------------------------------------------------
Academic Press. 
----------------

Review by Ian Cresswell. (For a review of the text, see "The Worst" below.) 

Before attempting to review the code associated with this book it should be
clearly stated that it is supplied as an extra--almost as an afterthought.
This may be a wise move. 

Although not as bad as other (even commercial) implementations, the code
provided lacks proper OO structure and is typical of C++ written in a C
style. 

Style criticisms include: 

1. The use of public data fields within classes (loss of encapsulation). 
2. Classes with no protected or private sections. 
3. Little or no use of inheritance and/or run-time polymorphism. 
4. Use of floats not doubles (a common mistake) to store values for
   connection weights. 
5. Overuse of classes and public methods. The network class has 59 methods
   in its public section. 
6. Lack of planning is evident for the construction of a class hierarchy. 
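
Regarding item 4, one way single precision can hurt is when weight updates
become small relative to the weights themselves. The following is a minimal
sketch in Python with NumPy (not the book's C++ code); the weight value 1.0
and the update size 1e-8 are assumptions chosen only to make the effect
visible.

```python
import numpy as np

update = 1e-8      # hypothetical small weight change per training step
n_steps = 1000

w32 = np.float32(1.0)   # weight stored as a C "float"
w64 = np.float64(1.0)   # weight stored as a C "double"

for _ in range(n_steps):
    # Each sum is rounded to the precision of the stored weight.
    w32 = np.float32(w32 + np.float32(update))
    w64 = w64 + update

print("float32 weight after training:", w32)
print("float64 weight after training:", w64)
```

The spacing between adjacent float32 values just above 1.0 is about 1.2e-7,
so each update of 1e-8 is rounded away and the single-precision weight never
moves, while the double-precision weight accumulates all of the updates.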

This code is without doubt written by a rushed C programmer. Whilst it would
require a C++ compiler to be successfully used, it lacks the tight
(optimised) nature of good C and the high level of abstraction of good C++. 

In a generous sense the code is free and the author doesn't claim any
expertise in software engineering. It works in a limited sense but would be
difficult to extend and/or reuse. It's fine for demonstration purposes in a
stand-alone manner and for use with the book concerned. 

If you're serious about nets you'll end up rewriting the whole lot (or
getting something better). 

The Worst
+++++++++

How not to use neural nets in any programming language
------------------------------------------------------

   Blum, Adam (1992), Neural Networks in C++, NY: Wiley. 

   Welstead, Stephen T. (1994), Neural Network and Fuzzy Logic
   Applications in C/C++, NY: Wiley. 

(For a review of Blum's source code, see "Books with Source Code" above.) 

Both Blum and Welstead contribute to the dangerous myth that any idiot can
use a neural net by dumping in whatever data are handy and letting it train
for a few days. They both have little or no discussion of generalization,
validation, and overfitting. Neither provides any valid advice on choosing
the number of hidden nodes. If you have ever wondered where these stupid
"rules of thumb" that pop up frequently come from, here's a source for one
of them: 

   "A rule of thumb is for the size of this [hidden] layer to be
   somewhere between the input layer size ... and the output layer size
   ..." Blum, p. 60. 

(John Lazzaro tells me he recently "reviewed a paper that cited this rule of
thumb--and referenced this book! Needless to say, the final version of that
paper didn't include the reference!") 

Blum offers some profound advice on choosing inputs: 

   "The next step is to pick as many input factors as possible that
   might be related to [the target]." 

Blum also shows a deep understanding of statistics: 

   "A statistical model is simply a more indirect way of learning
   correlations. With a neural net approach, we model the problem
   directly." p. 8. 

Blum at least mentions some important issues, however simplistic his advice
may be. Welstead just ignores them. What Welstead gives you is code--vast
amounts of code. I have no idea how anyone could write that much code for a
simple feedforward NN. Welstead's approach to validation, in his chapter on
financial forecasting, is to reserve two cases for the validation set! 

My comments apply only to the text of the above books. I have not examined
or attempted to compile the code. 

An impractical guide to neural nets
-----------------------------------

   Swingler, K. (1996), Applying Neural Networks: A Practical Guide, 
   London: Academic Press. 

(For a review of the source code, see "Books with Source Code" above.) 

This book has lots of good advice liberally sprinkled with errors, incorrect
formulas, some bad advice, and some very serious mistakes. Experts will
learn nothing, while beginners will be unable to separate the useful
information from the dangerous. For example, there is a chapter on "Data
encoding and re-coding" that would be very useful to beginners if it were
accurate, but the formula for the standard deviation is wrong, and the
description of the softmax function is of something entirely different than
softmax (see What is a softmax activation function?). Even more dangerous is
the statement on p. 28 that "Any pair of variables with high covariance are
dependent, and one may be chosen to be discarded." Although high
correlations can be used to identify redundant inputs, it is incorrect to
use high covariances for this purpose, since a covariance can be high simply
because one of the inputs has a high standard deviation. 
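
The point about covariance can be checked with a few lines of code. The
following is a minimal sketch in Python with NumPy; the two pairs of inputs,
their scales, and the sample size are invented for illustration. One pair
has large standard deviations but only weak dependence; the other pair has
tiny standard deviations but is nearly redundant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000

# Pair 1: large scale, weak dependence (population correlation 0.2).
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
a = 1000.0 * z1
b = 1000.0 * (0.2 * z1 + np.sqrt(1.0 - 0.2 ** 2) * z2)

# Pair 2: small scale, nearly redundant (population correlation about 0.995).
z3, z4 = rng.standard_normal(n), rng.standard_normal(n)
c = 0.01 * z3
d = c + 0.001 * z4

def cov_and_corr(u, v):
    """Sample covariance and sample correlation of two inputs."""
    return float(np.cov(u, v)[0, 1]), float(np.corrcoef(u, v)[0, 1])

cov_ab, corr_ab = cov_and_corr(a, b)
cov_cd, corr_cd = cov_and_corr(c, d)

print("pair 1: covariance", cov_ab, "correlation", corr_ab)
print("pair 2: covariance", cov_cd, "correlation", corr_cd)
```

Ranking the pairs by covariance would flag pair 1 as the redundant one,
which is backwards: only pair 2 has a correlation close to 1.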

The most ludicrous thing I've found in the book is the claim that
Hecht-Nielsen used Kolmogorov's theorem to show that "you will never require
more than twice the number of hidden units as you have inputs" (p. 53) in an
MLP with one hidden layer. Actually, Hecht-Nielsen says "the direct
usefulness of this result is doubtful, because no constructive method for
developing the [output activation] functions is known." Then Swingler
implies that V. Kurkova (1991, "Kolmogorov's theorem is relevant," Neural
Computation, 3, 617-622) confirmed this alleged upper bound on the number of
hidden units, saying that, "Kurkova was able to restate Kolmogorov's theorem
in terms of a set of sigmoidal functions." If Kolmogorov's theorem, or
Hecht-Nielsen's adaptation of it, could be restated in terms of known
sigmoid activation functions in the (single) hidden and output layers, then
Swingler's alleged upper bound would be correct, but in fact no such
restatement of Kolmogorov's theorem is possible, and Kurkova did not claim
to prove any such restatement. Swingler omits the crucial details that
Kurkova used two hidden layers, staircase-like activation functions (not
ordinary sigmoidal functions such as the logistic) in the first hidden
layer, and a potentially large number of units in the second hidden layer.
Kurkova later estimated the number of units required for uniform
approximation within an error epsilon as nm(m+1) in the first hidden
layer and m^2(m+1)^n in the second hidden layer, where n is the number
of inputs and m "depends on epsilon/||f|| as well as on the rate with
which f increases distances." In other words, Kurkova says nothing to
support Swingler's advice (repeated on p. 55), "Never choose h to be more
than twice the number of input units." Furthermore, constructing a
counterexample to Swingler's advice is trivial: use one input and one
output, where
the output is the sine of the input, and the domain of the input extends
over many cycles of the sine wave; it is obvious that many more than two
hidden units are required. For some sound information on choosing the number
of hidden units, see How many hidden units should I use? 
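
The sine-wave counterexample can be illustrated numerically. The following
is a rough sketch in Python with NumPy, under loud assumptions: instead of
training an MLP by backpropagation, it draws the hidden-unit weights at
random, keeps them fixed, and fits only the bias and output weights by
linear least squares. The ranges for the random weights, the number of
cycles, and all names are made up for illustration. It therefore shows only
how the fit changes with the number of tanh hidden units for these
particular hidden weights; it is not a proof about trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# One input, one output: the sine of the input over ten full cycles.
x = np.linspace(0.0, 20.0 * np.pi, 2000)
y = np.sin(x)

# Random, fixed hidden units tanh(slope * (x - center)); an assumption
# standing in for trained hidden weights.
n_hidden_max = 200
centers = rng.uniform(x.min(), x.max(), n_hidden_max)
slopes = rng.uniform(0.5, 2.0, n_hidden_max)
H = np.tanh(slopes * (x[:, None] - centers))   # shape (2000, 200)

def fit_mse(h):
    """Least-squares fit of bias and output weights using the first h units."""
    A = np.column_stack([np.ones_like(x), H[:, :h]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ coef) ** 2))

for h in (2, 20, 200):
    print(h, "hidden units: MSE =", fit_mse(h))
```

With one input, Swingler's rule allows at most two hidden units. The
variance of the target is about 0.5, so a mean squared error near 0.5 means
the net has learned essentially nothing about the sine wave.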

Choosing the number of hidden units is one important aspect of getting good
generalization, which is the most crucial issue in neural network training.
There are many other considerations involved in getting good generalization,
and Swingler makes several more mistakes in this area: 

 o There is dangerous misinformation on p. 55, where Swingler says, "If a
   data set contains no noise, then there is no risk of overfitting as there
   is nothing to overfit." It is true that overfitting is more common with
   noisy data, but severe overfitting can occur with noise-free data, even
   when there are more training cases than weights. There is an example of
   such overfitting under How many hidden layers should I use? 

 o Regarding the use of added noise (jitter) in training, Swingler says on
   p. 60, "The more noise you add, the more general your model becomes."
   This statement makes no sense as it stands (it would make more sense if
   "general" were changed to "smooth"), but it could certainly encourage a
   beginner to use far too much jitter--see What is jitter? (Training with
   noise). 

 o On p. 109, Swingler describes leave-one-out cross-validation, which he
   ascribes to Hecht-Nielsen. But Swingler concludes, "the method provides
   you with L minus 1 networks to choose from; none of which has been
   validated properly," completely missing the point that cross-validation
   provides an estimate of the generalization error of a network trained on
   the entire training set of L cases--see What are cross-validation and
   bootstrapping? Also, there are L leave-one-out networks, not L-1. 
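
The purpose of leave-one-out cross-validation can be made concrete with a
short sketch in Python with NumPy. To keep it small and deterministic, a
linear least-squares model stands in for the neural net, and the data are
simulated; the functions fit, predict, and leave_one_out are hypothetical
names, not part of any library.

```python
import numpy as np

def fit(X, t):
    """Train the stand-in model: linear least squares with a bias term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def leave_one_out(X, t):
    """Return the number of models trained and the leave-one-out MSE."""
    L = len(t)
    sq_errors = []
    for i in range(L):
        keep = np.arange(L) != i            # all cases except case i
        coef = fit(X[keep], t[keep])        # model trained on L-1 cases
        pred = predict(coef, X[i:i + 1])[0]
        sq_errors.append((t[i] - pred) ** 2)
    return len(sq_errors), float(np.mean(sq_errors))

# Simulated data: L cases from a noisy linear relationship.
rng = np.random.default_rng(0)
L = 20
X = rng.uniform(-1.0, 1.0, size=(L, 1))
t = 2.0 * X[:, 0] + 0.5 + rng.normal(0.0, 0.1, L)

n_models, loo_mse = leave_one_out(X, t)
final_model = fit(X, t)   # the model actually used: trained on all L cases

print("models trained during cross-validation:", n_models)
print("estimated generalization error (MSE):  ", loo_mse)
```

The L models trained inside the loop are discarded. What is kept is
loo_mse, which serves as the estimate of the generalization error of
final_model, the model trained on all L cases.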

While Swingler has some knowledge of statistics, his expertise is not
sufficient for him to detect that certain articles on neural nets are
statistical nonsense. For example, on pp. 139-140 he uncritically reports
a method that allegedly obtains error bars by doing a simple linear
regression on the target vs. output scores. To a trained statistician, this
method is obviously wrong (and, as usual in this book, the formula for
variance given for this method on p. 150 is wrong). On p. 110, Swingler
reports an article that attempts to apply bootstrapping to neural nets, but
this article is also obviously wrong to anyone familiar with bootstrapping.
While Swingler cannot be blamed entirely for accepting these articles at
face value, such misinformation provides yet more hazards for beginners. 

Swingler addresses many important practical issues, and often provides good
practical advice. But the peculiar combination of much good advice with some
extremely bad advice, a few examples of which are provided above, could
easily seduce a beginner into thinking that the book as a whole is reliable.
It is this danger that earns the book a place in "The Worst" list. 

Bad science writing
-------------------

   Dewdney, A.K. (1997), Yes, We Have No Neutrons: An Eye-Opening Tour
   through the Twists and Turns of Bad Science, NY: Wiley. 

This book, allegedly an expose of bad science, contains only one chapter of
19 pages on "the neural net debacle" (p. 97). Yet this chapter is so
egregiously misleading that the book has earned a place on "The Worst" list.
A detailed criticism of this chapter, along with some other sections of the
book, can be found at ftp://ftp.sas.com/pub/neural/badscience.html. Other
chapters of the book are reviewed in the November, 1997, issue of Scientific
American. 

------------------------------------------------------------------------

Subject: Journals and magazines about Neural Networks?
======================================================

[to be added: comments on speed of reviewing and publishing,
              whether they accept TeX format or ASCII by e-mail, etc.]

A. Dedicated Neural Network Journals:
+++++++++++++++++++++++++++++++++++++

Title:   Neural Networks
Publish: Pergamon Press
Address: Pergamon Journals Inc., Fairview Park, Elmsford,
         New York 10523, USA and Pergamon Journals Ltd.
         Headington Hill Hall, Oxford OX3 0BW, England
Freq.:   10 issues/year (vol. 1 in 1988)
Cost/Yr: Free with INNS or JNNS or ENNS membership ($45?),
         Individual $65, Institution $175
ISSN #:  0893-6080
URL:     http://www.elsevier.nl/locate/inca/841
Remark:  Official Journal of International Neural Network Society (INNS),
         European Neural Network Society (ENNS) and Japanese Neural
         Network Society (JNNS).
         Contains Original Contributions, Invited Review Articles, Letters
         to Editor, Book Reviews, Editorials, Announcements, Software Surveys.

Title:   Neural Computation
Publish: MIT Press
Address: MIT Press Journals, 55 Hayward Street Cambridge,
         MA 02142-9949, USA, Phone: (617) 253-2889
Freq.:   Quarterly (vol. 1 in 1989)
Cost/Yr: Individual $45, Institution $90, Students $35; Add $9 Outside USA
ISSN #:  0899-7667
URL:     http://mitpress.mit.edu/journals-legacy.tcl
Remark:  Combination of Reviews (10,000 words), Views (4,000 words)
         and Letters (2,000 words).  I have found this journal to be of
         outstanding quality.
         (Note: Remarks supplied by Mike Plonski "plonski@aero.org")

Title:   NEURAL COMPUTING SURVEYS 
Publish: Lawrence Erlbaum Associates 
Address: 10 Industrial Avenue, Mahwah, NJ  07430-2262, USA
Freq.:   Yearly
Cost/Yr: Free on-line
ISSN #:  1093-7609
URL:     http://www.icsi.berkeley.edu/~jagota/NCS/
Remark:  One way to cope with the exponential increase in the number
         of articles published in recent years is to ignore most of
         them. A second, perhaps more satisfying, approach is to
         provide a forum that encourages the regular production --
         and perusal -- of high-quality survey articles. This is
         especially useful in an inter-disciplinary, evolving field
         such as neural networks. This journal aims to bring the
         second view-point to bear. It is intended to

         * encourage researchers to write good survey papers. 
         * motivate researchers to look here first to check
           what's known on an unfamiliar topic. 

Title:   IEEE Transactions on Neural Networks
Publish: Institute of Electrical and Electronics Engineers (IEEE)
Address: IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ,
         08855-1331 USA. Tel: (201) 981-0060
Cost/Yr: $10 for Members belonging to participating IEEE societies
Freq.:   Quarterly (vol. 1 in March 1990)
URL:     http://www.ieee.org/nnc/pubs/transactions.html
Remark:  Devoted to the science and technology of neural networks
         which disclose significant  technical knowledge, exploratory
         developments and applications of neural networks from biology to
         software to hardware.  Emphasis is on artificial neural networks.
         Specific aspects include self organizing systems, neurobiological
         connections, network dynamics and architecture, speech recognition,
         electronic and photonic implementation, robotics and controls.
         Includes Letters concerning new research results.
         (Note: Remarks are from journal announcement)

Title:   IEEE Transactions on Evolutionary Computation
Publish: Institute of Electrical and Electronics Engineers (IEEE)
Address: IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ,
         08855-1331 USA. Tel: (201) 981-0060
Cost/Yr: $10 for Members belonging to participating IEEE societies
Freq.:   Quarterly (vol. 1 in May 1997)
URL:     http://engine.ieee.org/nnc/pubs/transactions.html
Remark:  The IEEE Transactions on Evolutionary Computation will publish archival
         journal quality original papers in evolutionary computation and related
         areas, with particular emphasis on the practical application of the
         techniques to solving real problems in industry, medicine, and other
         disciplines.  Specific techniques include but are not limited to
         evolution strategies, evolutionary programming, genetic algorithms, and
         associated methods of genetic programming and classifier systems.  Papers
         emphasizing mathematical results should ideally seek to put these results
         in the context of algorithm design, however purely theoretical papers will
         be considered.  Other papers in the areas of cultural algorithms, artificial
         life, molecular computing, evolvable hardware, and the use of simulated
         evolution to gain a better understanding of naturally evolved systems are
         also encouraged.
         (Note: Remarks are from journal CFP)

Title:   International Journal of Neural Systems
Publish: World Scientific Publishing
Address: USA: World Scientific Publishing Co., 1060 Main Street, River Edge,
         NJ 07666. Tel: (201) 487 9655; Europe: World Scientific Publishing
         Co. Ltd., 57 Shelton Street, London WC2H 9HE, England.
         Tel: (0171) 836 0888; Asia: World Scientific Publishing Co. Pte. Ltd.,
         1022 Hougang Avenue 1 #05-3520, Singapore 1953, Rep. of Singapore
         Tel: 382 5663.
Freq.:   Quarterly (Vol. 1 in 1990)
Cost/Yr: Individual $122, Institution $255 (plus $15-$25 for postage)
ISSN #:  0129-0657 (IJNS)
Remark:  The International Journal of Neural Systems is a quarterly
         journal which covers information processing in natural
         and artificial neural systems. Contributions include research papers,
         reviews, and Letters to the Editor - communications under 3,000
         words in length, which are published within six months of receipt.
         Other contributions are typically published within nine months.
         The journal presents a fresh undogmatic attitude towards this
         multidisciplinary field and aims to be a forum for novel ideas and
         improved understanding of collective and cooperative phenomena with
         computational capabilities.
         Papers should be submitted to World Scientific's UK office. Once a
         paper is accepted for publication, authors are invited to e-mail
         the LaTeX source file of their paper in order to expedite publication.

Title:   International Journal of Neurocomputing
Publish: Elsevier Science Publishers, Journal Dept.; PO Box 211;
         1000 AE Amsterdam, The Netherlands
Freq.:   Quarterly (vol. 1 in 1989)
URL:     http://www.elsevier.nl/locate/inca/505628

Title:   Neural Processing Letters
Publish: Kluwer Academic Publishers
Address: P.O. Box 322, 3300 AH Dordrecht, The Netherlands
Freq:    6 issues/year (vol. 1 in 1994)
Cost/Yr: Individuals $198, Institution $400 (including postage)
ISSN #:  1370-4621
URL:     http://www.wkap.nl/journalhome.htm/1370-4621
Remark:  The aim of the journal is to rapidly publish new ideas, original
         developments and work in progress.  Neural Processing Letters
         covers all aspects of the Artificial Neural Networks field.
         Publication delay is about 3 months.

Title:   Neural Network News
Publish: AIWeek Inc.
Address: Neural Network News, 2555 Cumberland Parkway, Suite 299,
         Atlanta, GA 30339 USA. Tel: (404) 434-2187
Freq.:   Monthly (beginning September 1989)
Cost/Yr: USA and Canada $249, Elsewhere $299
Remark:  Commercial Newsletter

Title:   Network: Computation in Neural Systems
Publish: IOP Publishing Ltd
Address: Europe: IOP Publishing Ltd, Techno House, Redcliffe Way, Bristol
         BS1 6NX, UK; USA: American Institute of Physics, Subscriber
         Services, 500 Sunnyside Blvd., Woodbury, NY  11797-2999
Freq.:   Quarterly (1st issue 1990)
Cost/Yr: USA: $180,  Europe: 110 pounds
URL:     http://www.iop.org/Journals/ne
Remark:  Description: "a forum for integrating theoretical and experimental
         findings across relevant interdisciplinary boundaries."  Contents:
         Submitted articles reviewed by two technical referees  paper's
         interdisciplinary format and accessibility."  Also Viewpoints and
         Reviews commissioned by the editors, abstracts (with reviews) of
         articles published in other journals, and book reviews.
         Comment: While the price discourages me (my comments are based
         upon a free sample copy), I think that the journal succeeds
         very well.  The highest density of interesting articles I
         have found in any journal.
         (Note: Remarks supplied by kehoe@csufres.CSUFresno.EDU)

Title:   Connection Science: Journal of Neural Computing,
         Artificial Intelligence and Cognitive Research
Publish: Carfax Publishing
Address: Europe: Carfax Publishing Company, PO Box 25, Abingdon, Oxfordshire
         OX14 3UE, UK.
         USA: Carfax Publishing Company, PO Box 2025, Dunnellon, Florida
         34430-2025, USA
         Australia: Carfax Publishing Company, Locked Bag 25, Deakin,
         ACT 2600, Australia
Freq.:   Quarterly (vol. 1 in 1989)
Cost/Yr: Personal rate:
         48 pounds (EC) 66 pounds (outside EC) US$118 (USA and Canada)
         Institutional rate:
         176 pounds (EC) 198 pounds (outside EC) US$340 (USA and Canada)

Title:   International Journal of Neural Networks
Publish: Learned Information
Freq.:   Quarterly (vol. 1 in 1989)
Cost/Yr: 90 pounds
ISSN #:  0954-9889
Remark:  The journal contains articles, a conference report (at least the
         issue I have), news and a calendar.
         (Note: remark provided by J.R.M. Smits "anjos@sci.kun.nl")

Title:   Sixth Generation Systems (formerly Neurocomputers)
Publish: Gallifrey Publishing
Address: Gallifrey Publishing, PO Box 155, Vicksburg, Michigan, 49097, USA
         Tel: (616) 649-3772, 649-3592 fax
Freq.:   Monthly (1st issue January, 1987)
ISSN #:  0893-1585
Editor:  Derek F. Stubbs
Cost/Yr: $79 (USA, Canada), US$95 (elsewhere)
Remark:  Runs eight to 16 pages monthly. In 1995 will go to floppy
         disc-based publishing with databases +, "the equivalent to 50
         pages per issue are planned." Often focuses on specific topics:
         e.g., August, 1994 contains two articles: "Economics, Times
         Series and the Market," and "Finite Particle Analysis - [part]
         II."  Stubbs also directs the company Advanced Forecasting
         Technologies. (Remark by Ed Rosenfeld: ier@aol.com)

Title:   JNNS Newsletter (Newsletter of the Japan Neural Network Society)
Publish: The Japan Neural Network Society
Freq.:   Quarterly (vol. 1 in 1989)
Remark:  (IN JAPANESE LANGUAGE) Official Newsletter of the Japan Neural
         Network Society (JNNS)
         (Note: remarks by Osamu Saito "saito@nttica.NTT.JP")

Title:   Neural Networks Today
Remark:  I found this title on a bulletin board in October of last year.
         It was in a message from Tim Pattison, timpatt@augean.OZ
         (Note: remark provided by J.R.M. Smits "anjos@sci.kun.nl")

Title:   Computer Simulations in Brain Science

Title:   International Journal of Neuroscience

Title:   Neural Network Computation
Remark:  Possibly the same as "Neural Computation"

Title:   Neural Computing and Applications
Freq.:   Quarterly
Publish: Springer Verlag
Cost/yr: 120 Pounds
Remark:  Is the journal of the Neural Computing Applications Forum.
         Publishes original research and other information
         in the field of practical applications of neural computing.

B. NN Related Journals:
+++++++++++++++++++++++

Title:   Biological Cybernetics (Kybernetik)
Publish: Springer Verlag
Remark:  Monthly (vol. 1 in 1961)

Title:   Various IEEE Transactions and Magazines
Publish: IEEE
Remark:  Primarily see IEEE Trans. on System, Man and Cybernetics;
         Various Special Issues: April 1990 IEEE Control Systems
         Magazine.; May 1989 IEEE Trans. Circuits and Systems.;
         July 1988 IEEE Trans. Acoust. Speech Signal Process.

Title:   The Journal of Experimental and Theoretical Artificial Intelligence
Publish: Taylor & Francis, Ltd.
Address: London, New York, Philadelphia
Freq.:   ? (1st issue Jan 1989)
Remark:  For submission information, please contact either of the editors:
         Eric Dietrich                        Chris Fields
         PACSS - Department of Philosophy     Box 30001/3CRL
         SUNY Binghamton                      New Mexico State University
         Binghamton, NY 13901                 Las Cruces, NM 88003-0001
         dietrich@bingvaxu.cc.binghamton.edu  cfields@nmsu.edu

Title:   The Behavioral and Brain Sciences
Publish: Cambridge University Press
Remark:  (Remarks by Don Wunsch)
         This is a delightful journal that encourages discussion on a
         variety of controversial topics.  I have especially enjoyed
         reading some papers in there by Dana Ballard and Stephen
         Grossberg (separate papers, not collaborations) a few years
         back.  They have a really neat concept: they get a paper,
         then invite a number of noted scientists in the field to
         praise it or trash it.  They print these commentaries, and
         give the author(s) a chance to make a rebuttal or
         concurrence.  Sometimes, as I'm sure you can imagine, things
         get pretty lively. Their reviewers are called something like
         Behavioral and Brain Associates, and I believe they have to
         be nominated by current associates, and should be fairly
         well established in the field. The main thing is that I liked
         the articles I read. 

Title:   International Journal of Applied Intelligence
Publish: Kluwer Academic Publishers
Remark:  first issue in 1990(?)

Title:   International Journal of Modern Physics C
Publish: USA: World Scientific Publishing Co., 1060 Main Street, River Edge,
         NJ 07666. Tel: (201) 487 9655; Europe: World Scientific Publishing
         Co. Ltd., 57 Shelton Street, London WC2H 9HE, England.
         Tel: (0171) 836 0888; Asia: World Scientific Publishing Co. Pte. Ltd.,
         1022 Hougang Avenue 1 #05-3520, Singapore 1953, Rep. of Singapore
         Tel: 382 5663.
Freq:    bi-monthly
Eds:     H. Herrmann, R. Brower, G.C. Fox and S Nose

Title:   Machine Learning
Publish: Kluwer Academic Publishers
Address: Kluwer Academic Publishers
         P.O. Box 358
         Accord Station
         Hingham, MA 02018-0358 USA
Freq.:   Monthly (8 issues per year; increasing to 12 in 1993)
Cost/Yr: Individual $140 (1992); Member of AAAI or CSCSI $88
Remark:  Description: Machine Learning is an international forum for
         research on computational approaches to learning.  The journal
         publishes articles reporting substantive research results on a
         wide range of learning methods applied to a variety of task
         domains.  The ideal paper will make a theoretical contribution
         supported by a computer implementation.
         The journal has published many key papers in learning theory,
         reinforcement learning, and decision tree methods.  Recently
         it has published a special issue on connectionist approaches
         to symbolic reasoning.  The journal regularly publishes
         issues devoted to genetic algorithms as well.

Title:   INTELLIGENCE - The Future of Computing
Publish: Intelligence
Address: INTELLIGENCE, P.O. Box 20008, New York, NY 10025-1510, USA,
         212-222-1123 voice & fax; email: ier@aol.com, CIS: 72400,1013
Freq.:   Monthly plus four special reports each year (1st issue: May, 1984)
ISSN #:  1042-4296
Editor:  Edward Rosenfeld
Cost/Yr: $395 (USA), US$450 (elsewhere)
Remark:  Has absorbed several other newsletters, like Synapse/Connection
         and Critical Technology Trends (formerly AI Trends).
         Covers NN, genetic algorithms, fuzzy systems, wavelets, chaos
         and other advanced computing approaches, as well as molecular
         computing and nanotechnology.

Title:   Journal of Physics A: Mathematical and General
Publish: Inst. of Physics, Bristol
Freq:    24 issues per year.
Remark:  Statistical mechanics aspects of neural networks
         (mostly Hopfield models).

Title:   Physical Review A: Atomic, Molecular and Optical Physics
Publish: The American Physical Society (Am. Inst. of Physics)
Freq:    Monthly
Remark:  Statistical mechanics of neural networks.

Title:   Information Sciences
Publish: North Holland (Elsevier Science)
Freq.:   Monthly
ISSN:    0020-0255
Editor:  Paul P. Wang; Department of Electrical Engineering; Duke University;
         Durham, NC 27706, USA

------------------------------------------------------------------------

Subject: Conferences and Workshops on Neural
============================================
Networks?
=========

 o The journal "Neural Networks" has a list of conferences, workshops and
   meetings in each issue. 
 o NEuroNet maintains a list of Neural Network Events at 
   http://www.kcl.ac.uk/neuronet/events/index.html 
 o The IEEE Neural Network Council maintains a list of conferences at 
   http://www.ieee.org/nnc. 
 o Conferences, workshops, and other events concerned with neural networks,
   inductive learning, genetic algorithms, data mining, agents, applications
   of AI, pattern recognition, vision, and related fields are listed at
   Georg Thimm's web page http://www.drc.ntu.edu.sg/users/mgeorg/enter.epl 

------------------------------------------------------------------------

Subject: Neural Network Associations?
=====================================

1. International Neural Network Society (INNS).
+++++++++++++++++++++++++++++++++++++++++++++++

   INNS membership includes subscription to "Neural Networks", the official
   journal of the society. Membership is $55 for non-students and $45 for
   students per year. Address: INNS Membership, P.O. Box 491166, Ft.
   Washington, MD 20749. 

2. International Student Society for Neural Networks
++++++++++++++++++++++++++++++++++++++++++++++++++++
   (ISSNNets).
   +++++++++++

   Membership is $5 per year. Address: ISSNNet, Inc., P.O. Box 15661,
   Boston, MA 02215 USA 

3. Women In Neural Network Research and technology
++++++++++++++++++++++++++++++++++++++++++++++++++
   (WINNERS).
   ++++++++++

   Address: WINNERS, c/o Judith Dayhoff, 11141 Georgia Ave., Suite 206,
   Wheaton, MD 20902. Phone: 301-933-9000. 

4. European Neural Network Society (ENNS)
+++++++++++++++++++++++++++++++++++++++++

   ENNS membership includes subscription to "Neural Networks", the official
   journal of the society. Membership is currently (1994) 50 UK pounds (35
   UK pounds for students) per year. Address: ENNS Membership, Centre for
   Neural Networks, King's College London, Strand, London WC2R 2LS, United
   Kingdom. 

5. Japanese Neural Network Society (JNNS)
+++++++++++++++++++++++++++++++++++++++++

   Address: Japanese Neural Network Society; Department of Engineering,
   Tamagawa University; 6-1-1, Tamagawa Gakuen, Machida City, Tokyo; 194
   JAPAN; Phone: +81 427 28 3457, Fax: +81 427 28 3597 

6. Association des Connexionnistes en THese (ACTH)
++++++++++++++++++++++++++++++++++++++++++++++++++

   (the French Student Association for Neural Networks); Membership is 100
   FF per year; Activities: newsletter, conference (every year), list of
   members, electronic forum; Journal 'Valgo' (ISSN 1243-4825); WWW page: 
   http://www.supelec-rennes.fr/acth/welcome.html ; Contact: acth@loria.fr 

7. Neurosciences et Sciences de l'Ingenieur (NSI)
+++++++++++++++++++++++++++++++++++++++++++++++++

   Biology & Computer Science; Activity: conference (every year); Address:
   NSI - TIRF / INPG 46 avenue Felix Viallet 38031 Grenoble Cedex FRANCE 

8. IEEE Neural Networks Council
+++++++++++++++++++++++++++++++

   Web page at http://www.ieee.org/nnc 

9. SNN (Foundation for Neural Networks)
+++++++++++++++++++++++++++++++++++++++

   The Foundation for Neural Networks (SNN) is a university based non-profit
   organization that stimulates basic and applied research on neural
   networks in the Netherlands. Every year SNN organizes a symposium on Neural
   Networks. See http://www.mbfys.kun.nl/SNN/. 

You can find nice lists of NN societies on the WWW at 
http://www.emsl.pnl.gov:2080/proj/neuron/neural/societies.html and at 
http://www.ieee.org:80/nnc/research/othernnsoc.html. 

------------------------------------------------------------------------

Subject: Mailing lists, BBS, CD-ROM?
====================================

   See also "Other NN links?" in Part 7 of the FAQ.

1. Machine Learning mailing list
++++++++++++++++++++++++++++++++

   http://groups.yahoo.com/group/machine-learning/ 

   The Machine Learning mailing list is an unmoderated mailing list intended
   for people in Computer Sciences, Statistics, Mathematics, and other areas
   or disciplines with interests in Machine Learning. Researchers,
   practitioners, and users of Machine Learning in academia, industry, and
   government are encouraged to join the list to discuss and exchange ideas
   regarding any aspect of Machine Learning, e.g., various learning
   algorithms, data pre-processing, variable selection mechanism, instance
   selection, and applications to real-world problems. 

   You can post, read, and reply to messages on the Web. Or you can choose to
   receive messages as individual emails, daily summaries, daily full-text
   digest, or read them on the Web only. 

2. The Connectionists Mailing List 
+++++++++++++++++++++++++++++++++++

   http://www.cnbc.cmu.edu/other/connectionists.html 

   CONNECTIONISTS is a moderated mailing list for discussion of technical
   issues relating to neural computation, and dissemination of professional
   announcements such as calls for papers, book announcements, and
   electronic preprints. CONNECTIONISTS is focused on meeting the needs of
   active researchers in the field, not on answering questions from
   beginners. 

3. Central Neural System Electronic Bulletin Board
++++++++++++++++++++++++++++++++++++++++++++++++++

      URL: ftp://www.centralneuralsystem.com/pub/CNS/bbs
      Supported by: Wesley R. Elsberry
                    3027 Macaulay Street
                    San Diego, CA 92106
      Email: welsberr@inia.cls.org
      Alternative URL: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/ai-repository/ai/areas/neural/cns/0.html
      

   Many MS-DOS PD and shareware simulations, source code, benchmarks,
   demonstration packages, information files; some Unix, Macintosh, Amiga
   related files. Also available are files on AI, AI Expert listings
   1986-1991, fuzzy logic, genetic algorithms, artificial life, evolutionary
   biology, and many Project Gutenberg and Wiretap e-texts. 

4. AI CD-ROM
++++++++++++

   Network Cybernetics Corporation produces the "AI CD-ROM". It is an
   ISO-9660 format CD-ROM and contains a large assortment of software
   related to artificial intelligence, artificial life, virtual reality, and
   other topics. Programs for OS/2, MS-DOS, Macintosh, UNIX, and other
   operating systems are included. Research papers, tutorials, and other
   text files are included in ASCII, RTF, and other universal formats. The
   files have been collected from AI bulletin boards, Internet archive
   sites, University computer departments, and other government and
   civilian AI research organizations. Network Cybernetics Corporation
   intends to release annual revisions to the AI CD-ROM to keep it up to
   date with current developments in the field. The AI CD-ROM includes
   collections of files that address many specific AI/AL topics including
   Neural Networks (Source code and executables for many different platforms
   including Unix, DOS, and Macintosh. ANN development tools, example
   networks, sample data, tutorials. A complete collection of Neural Digest
   is included as well.) The AI CD-ROM may be ordered directly by check,
   money order, bank draft, or credit card from: 

           Network Cybernetics Corporation;
           4201 Wingren Road Suite 202;
           Irving, TX 75062-2763;
           Tel 214/650-2002;
           Fax 214/650-1929;

   The cost is $129 per disc + shipping ($5/disc domestic or $10/disc
   foreign) (See the comp.ai FAQ for further details) 

------------------------------------------------------------------------

Subject: How to benchmark learning methods?
===========================================

The NN benchmarking resources page at 
http://wwwipd.ira.uka.de/~prechelt/NIPS_bench.html was created after a NIPS
1995 workshop on NN benchmarking. The page contains pointers to various
papers on proper benchmarking methodology and to various sources of
datasets. 

Benchmark studies require some familiarity with the statistical design and
analysis of experiments. There are many textbooks on this subject, of which
Cohen (1995) will probably be of particular interest to researchers in
neural nets and machine learning (see also the review of Cohen's book by Ron
Kohavi in the International Journal of Neural Systems, which can be found
on-line at http://robotics.stanford.edu/users/ronnyk/ronnyk-bib.html). 

Reference: 

   Cohen, P.R. (1995), Empirical Methods for Artificial Intelligence,
   Cambridge, MA: The MIT Press. 
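
As one small illustration of the kind of analysis meant above, the
following Python sketch computes a paired t statistic for two learning
methods evaluated on the same data sets. It is a minimal sketch under
stated assumptions, not a recommended procedure: the paired t statistic is
just one common choice, the function name paired_t_statistic is
hypothetical, and the error rates are invented numbers used only to show
the arithmetic.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for per-data-set error rates of two methods.

    errors_a[i] and errors_b[i] must come from the same data set (or the
    same train/test split), so that the differences are paired.
    Returns (t, degrees_of_freedom).
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("both methods must be run on the same data sets")
    if len(errors_a) < 2:
        raise ValueError("need at least two paired results")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff / standard_error, len(diffs) - 1


# Invented test-set error rates on four data sets (illustration only).
errors_method_a = [0.20, 0.25, 0.30, 0.35]
errors_method_b = [0.10, 0.20, 0.20, 0.30]

t, df = paired_t_statistic(errors_method_a, errors_method_b)
print("t =", t, "with", df, "degrees of freedom")
```

The statistic would then be compared with a t distribution on the stated
degrees of freedom. With only a handful of data sets the comparison has
little power, and results from overlapping training sets (as in
cross-validation folds) are not independent, so such a test should be read
with caution.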

------------------------------------------------------------------------

Subject: Databases for experimentation with NNs?
================================================

1. UCI machine learning database
++++++++++++++++++++++++++++++++

   A large collection of data sets accessible via anonymous FTP at
   ftp.ics.uci.edu [128.195.1.1] in directory 
   /pub/machine-learning-databases or via web browser at 
   http://www.ics.uci.edu/~mlearn/MLRepository.html 

2. UCI KDD Archive
++++++++++++++++++

   The UC Irvine Knowledge Discovery in Databases (KDD) Archive at 
   http://kdd.ics.uci.edu/ is an online repository of large datasets which
   encompasses a wide variety of data types, analysis tasks, and application
   areas. The primary role of this repository is to serve as a benchmark
   testbed to enable researchers in knowledge discovery and data mining to
   scale existing and future data analysis algorithms to very large and
   complex data sets. This archive is supported by the Information and Data
   Management Program at the National Science Foundation, and is intended to
   expand the current UCI Machine Learning Database Repository to datasets
   that are orders of magnitude larger and more complex. 

3. The neural-bench Benchmark collection
++++++++++++++++++++++++++++++++++++++++

   Accessible at http://www.boltz.cs.cmu.edu/ or via anonymous FTP at 
   ftp://ftp.boltz.cs.cmu.edu/pub/neural-bench/. In case of problems or if
   you want to donate data, email contact is "neural-bench@cs.cmu.edu". The
   data sets in this repository include the 'nettalk' data, 'two spirals',
   protein structure prediction, vowel recognition, sonar signal
   classification, and a few others. 

4. Proben1
++++++++++

   Proben1 is a collection of 12 learning problems consisting of real data.
   The datafiles all share a single simple common format. Along with the
   data comes a technical report describing a set of rules and conventions
   for performing and reporting benchmark tests and their results.
   Accessible via anonymous FTP on ftp.cs.cmu.edu [128.2.206.173] as 
   /afs/cs/project/connect/bench/contrib/prechelt/proben1.tar.gz and also
   on ftp.ira.uka.de as /pub/neuron/proben1.tar.gz. The file is about 1.8 MB
   and unpacks into about 20 MB. 

5. Delve: Data for Evaluating Learning in Valid Experiments
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

   Delve is a standardised, copyrighted environment designed to evaluate the
   performance of learning methods. Delve makes it possible for users to
   compare their learning methods with other methods on many datasets. The
   Delve learning methods and evaluation procedures are well documented,
   such that meaningful comparisons can be made. The data collection
   includes not only isolated data sets, but "families" of data sets in
   which properties of the data, such as number of inputs and degree of
   nonlinearity or noise, are systematically varied. The Delve web page is
   at http://www.cs.toronto.edu/~delve/ 

6. Bilkent University Function Approximation Repository
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

   A repository of data sets collected mainly by searching resources on the
   web can be found at http://funapp.cs.bilkent.edu.tr/DataSets/ Most of the
   data sets are used for the experimental analysis of function
   approximation techniques and for training and demonstration by machine
   learning and statistics community. The original sources of most data sets
   can be accessed via associated links. A compressed tar file containing
   all data sets is available. 

7. NIST special databases of the National Institute Of Standards
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   And Technology:
   +++++++++++++++

   Several large databases, each delivered on a CD-ROM. Here is a quick
   list. 
    o NIST Binary Images of Printed Digits, Alphas, and Text 
    o NIST Structured Forms Reference Set of Binary Images 
    o NIST Binary Images of Handwritten Segmented Characters 
    o NIST 8-bit Gray Scale Images of Fingerprint Image Groups 
    o NIST Structured Forms Reference Set 2 of Binary Images 
    o NIST Test Data 1: Binary Images of Hand-Printed Segmented Characters 
    o NIST Machine-Print Database of Gray Scale and Binary Images 
    o NIST 8-Bit Gray Scale Images of Mated Fingerprint Card Pairs 
    o NIST Supplemental Fingerprint Card Data (SFCD) for NIST Special
      Database 9 
    o NIST Binary Image Databases of Census Miniforms (MFDB) 
    o NIST Mated Fingerprint Card Pairs 2 (MFCP 2) 
    o NIST Scoring Package Release 1.0 
    o NIST FORM-BASED HANDPRINT RECOGNITION SYSTEM 
   Here are example descriptions of two of these databases: 

   NIST special database 2: Structured Forms Reference Set (SFRS)
   --------------------------------------------------------------

   The NIST database of structured forms contains 5,590 full page images of
   simulated tax forms completed using machine print. THERE IS NO REAL TAX
   DATA IN THIS DATABASE. The structured forms used in this database are 12
   different forms from the 1988 IRS 1040 Package X. These include Forms
   1040, 2106, 2441, 4562, and 6251 together with Schedules A, B, C, D, E, F
   and SE. Eight of these forms contain two pages or form faces making a
   total of 20 form faces represented in the database. Each image is stored
   in bi-level black and white raster format. The images in this database
   appear to be real forms prepared by individuals but the images have been
   automatically derived and synthesized using a computer and contain no
   "real" tax data. The entry field values on the forms have been
   automatically generated by a computer in order to make the data available
   without the danger of distributing privileged tax information. In
   addition to the images the database includes 5,590 answer files, one for
   each image. Each answer file contains an ASCII representation of the data
   found in the entry fields on the corresponding image. Image format
   documentation and example software are also provided. The uncompressed
   database totals approximately 5.9 gigabytes of data. 

   NIST special database 3: Binary Images of Handwritten Segmented
   ---------------------------------------------------------------
   Characters (HWSC)
   -----------------

   Contains 313,389 isolated character images segmented from the 2,100
   full-page images distributed with "NIST Special Database 1": 223,125
   digits, 44,951 upper-case, and 45,313 lower-case character images. Each
   character image has been centered in a separate 128 by 128 pixel region.
   The error rate of the segmentation and assigned classification is less
   than 0.1%. The uncompressed database totals approximately 2.75 gigabytes
   of image data and includes image format documentation and example
   software.
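
   As a quick sanity check on the figures quoted above, the following
   sketch confirms that the three class counts add up to the stated total
   and prints the share of each class. Only the four numbers come from the
   description; the variable names are illustrative.

```python
# Class counts quoted in the description of NIST Special Database 3.
counts = {"digits": 223_125, "upper-case": 44_951, "lower-case": 45_313}
stated_total = 313_389

total = sum(counts.values())
assert total == stated_total, (total, stated_total)

# Share of each class in the whole database.
shares = {name: n / total for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name:<11s} {counts[name]:>7d}  {share:6.1%}")
```

   Digits make up well over half of the images, which is worth keeping in
   mind when comparing error rates across the three classes.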

   The system requirements for all databases are a 5.25" CD-ROM drive with
   software to read ISO-9660 format. Contact: Darrin L. Dimmick;
   dld@magi.ncsl.nist.gov; (301)975-4147

   The prices of the databases are between US$ 250 and US$ 1895. If you wish to
   order a database, please contact: Standard Reference Data; National
   Institute of Standards and Technology; 221/A323; Gaithersburg, MD 20899;
   Phone: (301)975-2208; FAX: (301)926-0416

   Samples of the data can be found by ftp on sequoyah.ncsl.nist.gov in
   directory /pub/data. A more complete description of the available
   databases can be obtained from the same host as 
   /pub/databases/catalog.txt 

8. CEDAR CD-ROM 1: Database of Handwritten Cities, States,
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   ZIP Codes, Digits, and Alphabetic Characters
   ++++++++++++++++++++++++++++++++++++++++++++

   The Center of Excellence for Document Analysis and Recognition (CEDAR)
   at the State University of New York at Buffalo announces the availability
   of CEDAR CDROM 1: USPS Office of Advanced Technology. The database
   contains handwritten words and ZIP Codes in high resolution grayscale
   (300 ppi 8-bit) as well as binary handwritten digits and alphabetic
   characters (300 ppi 1-bit). This database is intended to encourage
   research in off-line handwriting recognition by providing access to
   handwriting samples digitized from envelopes in a working post office. 

        Specifications of the database include:
        +    300 ppi 8-bit grayscale handwritten words (cities,
             states, ZIP Codes)
             o    5632 city words
             o    4938 state words
             o    9454 ZIP Codes
        +    300 ppi binary handwritten characters and digits:
             o    27,837 mixed alphas and numerics segmented
                  from address blocks
             o    21,179 digits segmented from ZIP Codes
        +    every image supplied with a manually determined
             truth value
        +    extracted from live mail in a working U.S. Post
             Office
        +    word images in the test set supplied with
             dictionaries of postal words that simulate partial
             recognition of the corresponding ZIP Code.
        +    digit images included in test set that simulate
             automatic ZIP Code segmentation. Results on these
             data can be projected to overall ZIP Code
             recognition performance.
        +    image format documentation and software included

   System requirements are a 5.25" CD-ROM drive with software to read
   ISO-9660 format. For further information, see 
   http://www.cedar.buffalo.edu/Databases/CDROM1/ or send email to Ajay
   Shekhawat at <ajay@cedar.Buffalo.EDU> 

   There is also a CEDAR CDROM-2, a database of machine-printed Japanese
   character images. 

9. AI-CD-ROM (see question "Other sources of information")
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

10. Time series
+++++++++++++++

   Santa Fe Competition
   --------------------

   Various datasets of time series (to be used for prediction learning
   problems) are available by anonymous ftp from ftp.santafe.edu in 
   /pub/Time-Series. Data sets include:
    o Fluctuations in a far-infrared laser 
    o Physiological data of patients with sleep apnea 
    o High frequency currency exchange rate data 
    o Intensity of a white dwarf star 
    o J.S. Bach's final (unfinished) fugue from "Die Kunst der Fuge" 

   Some of the datasets were used in a prediction contest and are described
   in detail in the book "Time series prediction: Forecasting the future and
   understanding the past", edited by Weigend and Gershenfeld, Proceedings
   Volume XV in the Santa Fe Institute Studies in the Sciences of Complexity
   series, Addison-Wesley (1994). 

   M3 Competition
   --------------

   3003 time series from the M3 Competition can be found at 
   http://forecasting.cwru.edu/Data/index.html 

   The numbers of series of various types are given in the following table: 

   Interval  Micro Industry    Macro  Finance    Demog    Other    Total
   Yearly      146      102       83       58      245       11      645
   Quarterly   204       83      336       76       57        0      756
   Monthly     474      334      312      145      111       52     1428
   Other         4        0        0       29        0      141      174
   Total       828      519      731      308      413      204     3003
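
   The margins of the table above can be checked mechanically. The sketch
   below re-enters the cell counts by hand (the variable names are
   illustrative and not part of the M3 distribution) and confirms that the
   row totals, column totals, and grand total of 3003 agree with the
   table.

```python
# Cell counts copied from the table: rows are intervals, columns are types.
categories = ["Micro", "Industry", "Macro", "Finance", "Demog", "Other"]
m3_counts = {
    "Yearly":    [146, 102,  83,  58, 245,  11],
    "Quarterly": [204,  83, 336,  76,  57,   0],
    "Monthly":   [474, 334, 312, 145, 111,  52],
    "Other":     [  4,   0,   0,  29,   0, 141],
}

# Margins as printed in the table.
stated_row_totals = {"Yearly": 645, "Quarterly": 756,
                     "Monthly": 1428, "Other": 174}
stated_col_totals = [828, 519, 731, 308, 413, 204]
stated_grand_total = 3003

# Recompute the margins from the cells.
row_totals = {interval: sum(row) for interval, row in m3_counts.items()}
col_totals = [sum(col) for col in zip(*m3_counts.values())]
grand_total = sum(row_totals.values())

assert row_totals == stated_row_totals
assert col_totals == stated_col_totals
assert grand_total == stated_grand_total

for name, n in zip(categories, col_totals):
    print(f"{name:<9s}{n:>5d}")
```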

   Rob Hyndman's Time Series Data Library
   --------------------------------------

   A collection of over 500 time series on subjects including agriculture,
   chemistry, crime, demography, ecology, economics & finance, health,
   hydrology & meteorology, industry, physics, production, sales, simulated
   series, sport, transport & tourism, and tree-rings can be found at 
   http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/ 

11. Financial data
++++++++++++++++++

   http://chart.yahoo.com/d?s=

   http://www.chdwk.com/data/index.html

12. USENIX Faces
++++++++++++++++

   The USENIX faces archive is a public database, accessible by ftp, that
   can be of use to people working in the fields of human face recognition,
   classification and the like. It currently contains 5592 different faces
   (taken at USENIX conferences) and is updated twice each year. The images
   are mostly 96x128 greyscale frontal images and are stored in ASCII files
   in a way that makes it easy to convert them to any usual graphic format
   (GIF, PCX, PBM etc.). Source code for viewers, filters, etc. is provided.
   Each image file takes approximately 25K. 

   For further information, see http://facesaver.usenix.org/

   According to the archive administrator, Barbara L. Dijker
   (barb.dijker@labyrinth.com), there is no restriction on their use.
   However, the image files are stored in separate directories corresponding
   to the Internet site to which the person represented in the image
   belongs, with each directory containing a small number of images (two on
   average). This makes it difficult to retrieve even a small part of the
   database by ftp, as you have to get each image individually.
   A solution, as Barbara suggested to me, would be to compress the whole
   set of images (in separate files of, say, 100 images) and maintain them
   as a specific archive for research on face processing, similar to the
   ones that already exist for fingerprints and others. The whole compressed
   database would take some 30 megabytes of disk space. I encourage anyone
   willing to host this database on his/her site, available for anonymous
   ftp, to contact her for details (unfortunately I don't have the resources
   to set up such a site). 

   Please consider that UUNET has graciously provided the ftp server for the
   FaceSaver archive and may discontinue that service if it becomes a
   burden. This means that people should not download more than maybe 10
   faces at a time from uunet. 

   A last remark: each file represents a different person (except for
   isolated cases). This makes the database quite unsuitable for training
   neural networks, since for proper generalisation several instances of the
   same subject are required. However, it is still useful as a test set for
   a trained network. 

13. Linguistic Data Consortium
++++++++++++++++++++++++++++++

   The Linguistic Data Consortium (URL: 
   http://www.ldc.upenn.edu/ldc/noframe.html) is an open consortium of
   universities, companies and government research laboratories. It creates,
   collects and distributes speech and text databases, lexicons, and other
   resources for research and development purposes. The University of
   Pennsylvania is the LDC's host institution. The LDC catalog includes
   pronunciation lexicons, varied lexicons, broadcast speech, microphone
   speech, mobile-radio speech, telephone speech, broadcast text,
   conversation text, newswire text, parallel text, and varied text, at
   widely varying fees. 

      Linguistic Data Consortium 
      University of Pennsylvania 
      3615 Market Street, Suite 200 
      Philadelphia, PA 19104-2608 
      Tel (215) 898-0464 Fax (215) 573-2175
      Email: ldc@ldc.upenn.edu 
      

14. Otago Speech Corpus
+++++++++++++++++++++++

   The Otago Speech Corpus contains speech samples in RIFF WAVE format that
   can be downloaded from 
   http://divcom.otago.ac.nz/infosci/kel/software/RICBIS/hyspeech_main.html 
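
   RIFF WAVE files holding uncompressed PCM audio can be inspected with
   the Python standard library. The following is a minimal sketch, not
   something supplied with the corpus: the helper names are hypothetical,
   and the sample rate, sample width, and channel count of the Otago
   files are not stated here, so the demonstration writes and reads back
   a synthetic test tone instead of a corpus file.

```python
import math
import os
import struct
import tempfile
import wave


def describe_wav(path):
    """Return basic header fields of an uncompressed PCM RIFF WAVE file."""
    with wave.open(path, "rb") as wav:
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "n_frames": n_frames,
            "duration_s": n_frames / rate,
        }


def write_test_tone(path, rate=8000, seconds=0.5, freq=440.0):
    """Write a mono 16-bit sine tone, used only to exercise describe_wav."""
    n_frames = int(rate * seconds)
    samples = (
        int(0.5 * 32767 * math.sin(2.0 * math.pi * freq * i / rate))
        for i in range(n_frames)
    )
    # "<h" packs each sample as a little-endian signed 16-bit integer.
    frames = b"".join(struct.pack("<h", s) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_test_tone(demo_path)
    print(describe_wav(demo_path))
```

   To examine a downloaded sample, pass its path to describe_wav. The
   wave module handles only uncompressed PCM data, so a file using any
   other encoding would need a different reader.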

15. Astronomical Time Series
++++++++++++++++++++++++++++

   Prepared by Paul L. Hertz (Naval Research Laboratory) & Eric D. Feigelson
   (Pennsylvania State University): 
    o Detection of variability in photon counting observations 1
      (QSO1525+337) 
    o Detection of variability in photon counting observations 2 (H0323+022)
    o Detection of variability in photon counting observations 3 (SN1987A) 
    o Detecting orbital and pulsational periodicities in stars 1 (binaries) 
    o Detecting orbital and pulsational periodicities in stars 2 (variables)
    o Cross-correlation of two time series 1 (Sun) 
    o Cross-correlation of two time series 2 (OJ287) 
    o Periodicity in a gamma ray burster (GRB790305) 
    o Solar cycles in sunspot numbers (Sun) 
    o Deconvolution of sources in a scanning operation (HEAO A-1) 
    o Fractal time variability in a Seyfert galaxy (NGC5506) 
    o Quasi-periodic oscillations in X-ray binaries (GX5-1) 
    o Deterministic chaos in an X-ray pulsar? (Her X-1) 
   URL: http://xweb.nrl.navy.mil/www_hertz/timeseries/timeseries.html 

16. Miscellaneous Images
++++++++++++++++++++++++

   The USC-SIPI Image Database: 
   http://sipi.usc.edu/services/database/Database.html

   CityU Image Processing Lab: 
   http://www.image.cityu.edu.hk/images/database.html

   Center for Image Processing Research: http://cipr.rpi.edu/

   Computer Vision Test Images: 
   http://www.cs.cmu.edu:80/afs/cs/project/cil/ftp/html/v-images.html

   Lenna 97: A Complete Story of Lenna: 
   http://www.image.cityu.edu.hk/images/lenna/Lenna97.html

17. StatLib
+++++++++++

   The StatLib repository at http://lib.stat.cmu.edu/ at Carnegie Mellon
   University has a large collection of data sets, many of which can be used
   with NNs. 

------------------------------------------------------------------------

Next part is part 5 (of 7). Previous part is part 3. 

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.