TY - BOOK
AU - Haykin, Simon
TI - Neural networks : a comprehensive foundation
SN - 0132733501
PY - 1999///
CY - Upper Saddle River, New Jersey
PB - Prentice-Hall
KW - NEURAL NETWORKS
N1 - CONTENTS:
1. Introduction 1 -- What Is a Neural Network? 1 -- Human Brain 6 -- Models of a Neuron 10 -- Neural Networks Viewed as Directed Graphs 15 -- Feedback 18 -- Network Architectures 21 -- Knowledge Representation 23 -- Artificial Intelligence and Neural Networks 34 -- Historical Notes 38
2. Learning Processes 50 -- Error-Correction Learning 51 -- Memory-Based Learning 53 -- Hebbian Learning 55 -- Competitive Learning 58 -- Boltzmann Learning 60 -- Credit Assignment Problem 62 -- Learning with a Teacher 63 -- Learning without a Teacher 64 -- Learning Tasks 66 -- Memory 75 -- Adaptation 83 -- Statistical Nature of the Learning Process 84 -- Statistical Learning Theory 89 -- Probably Approximately Correct Model of Learning 102
3. Single Layer Perceptrons 117 -- Adaptive Filtering Problem 118 -- Unconstrained Optimization Techniques 121 -- Linear Least-Squares Filters 126 -- Least-Mean-Square Algorithm 128 -- Learning Curves 133 -- Learning Rate Annealing Techniques 134 -- Perceptron 135 -- Perceptron Convergence Theorem 137 -- Relation Between the Perceptron and Bayes Classifier for a Gaussian Environment 143
4. Multilayer Perceptrons 156 -- Some Preliminaries 159 -- Back-Propagation Algorithm 161 -- Summary of the Back-Propagation Algorithm 173 -- XOR Problem 175 -- Heuristics for Making the Back-Propagation Algorithm Perform Better 178 -- Output Representation and Decision Rule 184 -- Computer Experiment 187 -- Feature Detection 199 -- Back-Propagation and Differentiation 202 -- Hessian Matrix 204 -- Generalization 205 -- Approximations of Functions 208 -- Cross-Validation 213 -- Network Pruning Techniques 218 -- Virtues and Limitations of Back-Propagation Learning 226 -- Accelerated Convergence of Back-Propagation Learning 233 -- Supervised Learning Viewed as an Optimization Problem 234 -- Convolutional Networks 245
5. Radial-Basis Function Networks 256 -- Cover's Theorem on the Separability of Patterns 257 -- Interpolation Problem 262 -- Supervised Learning as an Ill-Posed Hypersurface Reconstruction Problem 265 -- Regularization Theory 267 -- Regularization Networks 277 -- Generalized Radial-Basis Function Networks 278 -- XOR Problem (Revisited) 282 -- Estimation of the Regularization Parameter 284 -- Approximation Properties of RBF Networks 290 -- Comparison of RBF Networks and Multilayer Perceptrons 293 -- Kernel Regression and Its Relation to RBF Networks 294 -- Learning Strategies 298 -- Computer Experiment 305
6. Support Vector Machines 318 -- Optimal Hyperplane for Linearly Separable Patterns 319 -- Optimal Hyperplane for Nonseparable Patterns 326 -- How to Build a Support Vector Machine for Pattern Recognition 329 -- Example: XOR Problem (Revisited) 335 -- Computer Experiment 337 -- ε-Insensitive Loss Function 339 -- Support Vector Machines for Nonlinear Regression 340
7. Committee Machines 351 -- Ensemble Averaging 353 -- Computer Experiment I 355 -- Boosting 357 -- Computer Experiment II 364 -- Associative Gaussian Mixture Model 366 -- Hierarchical Mixture of Experts Model 372 -- Model Selection Using a Standard Decision Tree 374 -- A Priori and A Posteriori Probabilities 377 -- Maximum Likelihood Estimation 378 -- Learning Strategies for the HME Model 380 -- EM Algorithm 382 -- Application of the EM Algorithm to the HME Model 383
8. Principal Components Analysis 392 -- Some Intuitive Principles of Self-Organization 393 -- Principal Components Analysis 396 -- Hebbian-Based Maximum Eigenfilter 404 -- Hebbian-Based Principal Components Analysis 413 -- Computer Experiment: Image Coding 419 -- Adaptive Principal Components Analysis Using Lateral Inhibition 422 -- Two Classes of PCA Algorithms 430 -- Batch and Adaptive Methods of Computation 430 -- Kernel-Based Principal Components Analysis 432
9. Self-Organizing Maps 443 -- Two Basic Feature-Mapping Models 444 -- Self-Organizing Map 446 -- Summary of the SOM Algorithm 453 -- Properties of the Feature Map 454 -- Computer Simulations 461 -- Learning Vector Quantization 466 -- Computer Experiment: Adaptive Pattern Classification 468 -- Hierarchical Vector Quantization 470 -- Contextual Maps 474
10. Information-Theoretic Models 484 -- Entropy 485 -- Maximum Entropy Principle 490 -- Mutual Information 492 -- Kullback-Leibler Divergence 495 -- Mutual Information as an Objective Function to Be Optimized 498 -- Maximum Mutual Information Principle 499 -- Infomax and Redundancy Reduction 503 -- Spatially Coherent Features 506 -- Spatially Incoherent Features 508 -- Independent Components Analysis 510 -- Computer Experiment 523 -- Maximum Likelihood Estimation 525 -- Maximum Entropy Method 529
11. Stochastic Machines and Their Approximates Rooted in Statistical Mechanics 545 -- Statistical Mechanics 546 -- Markov Chains 548 -- Metropolis Algorithm 556 -- Simulated Annealing 558 -- Gibbs Sampling 561 -- Boltzmann Machine 562 -- Sigmoid Belief Networks 569 -- Helmholtz Machine 574 -- Mean-Field Theory 576 -- Deterministic Boltzmann Machine 578 -- Deterministic Sigmoid Belief Networks 579 -- Deterministic Annealing 586
12. Neurodynamic Programming 603 -- Markovian Decision Processes 604 -- Bellman's Optimality Criterion 607 -- Policy Iteration 610 -- Value Iteration 612 -- Neurodynamic Programming 617 -- Approximate Policy Iteration 618 -- Q-Learning 622 -- Computer Experiment 627
13. Temporal Processing Using Feedforward Networks 635 -- Short-Term Memory Structures 636 -- Network Architectures for Temporal Processing 640 -- Focused Time Lagged Feedforward Networks 643 -- Computer Experiment 645 -- Universal Myopic Mapping Theorem 646 -- Spatio-Temporal Models of a Neuron 648 -- Distributed Time Lagged Feedforward Networks 651 -- Temporal Back-Propagation Algorithm 652
14. Neurodynamics 664 -- Dynamical Systems 666 -- Stability of Equilibrium States 669 -- Attractors 674 -- Neurodynamical Models 676 -- Manipulation of Attractors as a Recurrent Network Paradigm 680 -- Hopfield Models 680 -- Computer Experiment I 696 -- Cohen-Grossberg Theorem 701 -- Brain-State-in-a-Box Model 703 -- Computer Experiment II 709 -- Strange Attractors and Chaos 709 -- Dynamic Reconstruction of a Chaotic Process 714 -- Computer Experiment III 718
15. Dynamically Driven Recurrent Networks 732 -- Recurrent Network Architectures 733 -- State-Space Model 739 -- Nonlinear Autoregressive with Exogenous Inputs Model 746 -- Computational Power of Recurrent Networks 747 -- Learning Algorithms 750 -- Back-Propagation Through Time 751 -- Real-Time Recurrent Learning 756 -- Kalman Filters 762 -- Decoupled Extended Kalman Filters 765 -- Computer Experiment 770 -- Vanishing Gradients in Recurrent Networks 773 -- System Identification 776 -- Model-Reference Adaptive Control 780
Epilogue 790 -- Bibliography 796 -- Index 837
ER -