About Neural Networks and Learning Machines (《神经网络与机器学习》):
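Neural Networks and Learning Machines (English edition, 3rd edition) is highly readable. The author explores and analyzes the basic models and main learning theories of neural networks in depth yet with a light touch, and uses a large number of experiment reports, worked examples, and exercises to help readers learn the subject. Neural networks are an important branch of computational intelligence and machine learning, and they have been applied with great success in many fields. Among the many books on neural networks, the most influential is Simon Haykin's Neural Networks: A Comprehensive Foundation (《神经网络原理》), which was renamed Neural Networks and Learning Machines in its third edition. In this edition, the author draws on recent advances in neural networks and machine learning and, working from both theory and practical application, comprehensively and systematically introduces the basic models, methods, and techniques of neural networks, integrating neural networks and machine learning into a coherent whole. The book emphasizes mathematical analysis and theory, and it also pays close attention to applications of neural networks in practical engineering problems such as pattern recognition, signal processing, and control systems.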
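This edition has been extensively revised from the previous one and provides an up-to-date treatment of neural networks and machine learning, two increasingly important subjects.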
Contents of Neural Networks and Learning Machines:
Preface
Acknowledgements
Abbreviations and Symbols
Glossary

Introduction
1. What is a Neural Network?
2. The Human Brain
3. Models of a Neuron
4. Neural Networks Viewed As Directed Graphs
5. Feedback
6. Network Architectures
7. Knowledge Representation
8. Learning Processes
9. Learning Tasks
10. Concluding Remarks
Notes and References

Chapter 1 Rosenblatt's Perceptron
1.1 Introduction
1.2 Perceptron
1.3 The Perceptron Convergence Theorem
1.4 Relation Between the Perceptron and Bayes Classifier for a Gaussian Environment
1.5 Computer Experiment: Pattern Classification
1.6 The Batch Perceptron Algorithm
1.7 Summary and Discussion
Notes and References
Problems

Chapter 2 Model Building through Regression
2.1 Introduction
2.2 Linear Regression Model: Preliminary Considerations
2.3 Maximum a Posteriori Estimation of the Parameter Vector
2.4 Relationship Between Regularized Least-Squares Estimation and MAP Estimation
2.5 Computer Experiment: Pattern Classification
2.6 The Minimum-Description-Length Principle
2.7 Finite Sample-Size Considerations
2.8 The Instrumental-Variables Method
2.9 Summary and Discussion
Notes and References
Problems

Chapter 3 The Least-Mean-Square Algorithm
3.1 Introduction
3.2 Filtering Structure of the LMS Algorithm
3.3 Unconstrained Optimization: a Review
3.4 The Wiener Filter
3.5 The Least-Mean-Square Algorithm
3.6 Markov Model Portraying the Deviation of the LMS Algorithm from the Wiener Filter
3.7 The Langevin Equation: Characterization of Brownian Motion
3.8 Kushner's Direct-Averaging Method
3.9 Statistical LMS Learning Theory for Small Learning-Rate Parameter
3.10 Computer Experiment I: Linear Prediction
3.11 Computer Experiment II: Pattern Classification
3.12 Virtues and Limitations of the LMS Algorithm
3.13 Learning-Rate Annealing Schedules
3.14 Summary and Discussion
Notes and References
Problems

Chapter 4 Multilayer Perceptrons
4.1 Introduction
4.2 Some Preliminaries
4.3 Batch Learning and On-Line Learning
4.4 The Back-Propagation Algorithm
4.5 XOR Problem
4.6 Heuristics for Making the Back-Propagation Algorithm Perform Better
4.7 Computer Experiment: Pattern Classification
4.8 Back Propagation and Differentiation
4.9 The Hessian and Its Role in On-Line Learning
4.10 Optimal Annealing and Adaptive Control of the Learning Rate
4.11 Generalization
4.12 Approximations of Functions
4.13 Cross-Validation
4.14 Complexity Regularization and Network Pruning
4.15 Virtues and Limitations of Back-Propagation Learning
4.16 Supervised Learning Viewed as an Optimization Problem
4.17 Convolutional Networks
4.18 Nonlinear Filtering
4.19 Small-Scale Versus Large-Scale Learning Problems
4.20 Summary and Discussion
Notes and References
Problems

Chapter 5 Kernel Methods and Radial-Basis Function Networks
5.1 Introduction
5.2 Cover's Theorem on the Separability of Patterns
5.3 The Interpolation Problem
5.4 Radial-Basis-Function Networks
5.5 K-Means Clustering
5.6 Recursive Least-Squares Estimation of the Weight Vector
5.7 Hybrid Learning Procedure for RBF Networks
5.8 Computer Experiment: Pattern Classification
5.9 Interpretations of the Gaussian Hidden Units
5.10 Kernel Regression and Its Relation to RBF Networks
5.11 Summary and Discussion
Notes and References
Problems

Chapter 6 Support Vector Machines
6.1 Introduction
6.2 Optimal Hyperplane for Linearly Separable Patterns
6.3 Optimal Hyperplane for Nonseparable Patterns
6.4 The Support Vector Machine Viewed as a Kernel Machine
6.5 Design of Support Vector Machines
6.6 XOR Problem
6.7 Computer Experiment: Pattern Classification
6.8 Regression: Robustness Considerations
6.9 Optimal Solution of the Linear Regression Problem
6.10 The Representer Theorem and Related Issues
6.11 Summary and Discussion
Notes and References
Problems

Chapter 7 Regularization Theory
7.1 Introduction
7.2 Hadamard's Conditions for Well-Posedness
7.3 Tikhonov's Regularization Theory
7.4 Regularization Networks
7.5 Generalized Radial-Basis-Function Networks
7.6 The Regularized Least-Squares Estimator: Revisited
7.7 Additional Notes of Interest on Regularization
7.8 Estimation of the Regularization Parameter
7.9 Semisupervised Learning
7.10 Manifold Regularization: Preliminary Considerations
7.11 Differentiable Manifolds
7.12 Generalized Regularization Theory
7.13 Spectral Graph Theory
7.14 Generalized Representer Theorem
7.15 Laplacian Regularized Least-Squares Algorithm
7.16 Experiments on Pattern Classification Using Semisupervised Learning
7.17 Summary and Discussion
Notes and References
Problems

Chapter 8 Principal-Components Analysis
8.1 Introduction
8.2 Principles of Self-Organization
8.3 Self-Organized Feature Analysis
8.4 Principal-Components Analysis: Perturbation Theory
8.5 Hebbian-Based Maximum Eigenfilter
8.6 Hebbian-Based Principal-Components Analysis
8.7 Case Study: Image Coding
8.8 Kernel Principal-Components Analysis
8.9 Basic Issues Involved in the Coding of Natural Images
8.10 Kernel Hebbian Algorithm
8.11 Summary and Discussion
Notes and References
Problems

Chapter 9 Self-Organizing Maps
9.1 Introduction
9.2 Two Basic Feature-Mapping Models
9.3 Self-Organizing Map
9.4 Properties of the Feature Map
9.5 Computer Experiments I: Disentangling Lattice Dynamics Using SOM
9.6 Contextual Maps
9.7 Hierarchical Vector Quantization
9.8 Kernel Self-Organizing Map
9.9 Computer Experiment II: Disentangling Lattice Dynamics Using Kernel SOM
9.10 Relationship Between Kernel SOM and Kullback-Leibler Divergence
9.11 Summary and Discussion
Notes and References
Problems

Chapter 10 Information-Theoretic Learning Models
10.1 Introduction
10.2 Entropy
10.3 Maximum-Entropy Principle
10.4 Mutual Information
10.5 Kullback-Leibler Divergence
10.6 Copulas
10.7 Mutual Information as an Objective Function to be Optimized
10.8 Maximum Mutual Information Principle
10.9 Infomax and Redundancy Reduction
10.10 Spatially Coherent Features
10.11 Spatially Incoherent Features
10.12 Independent-Components Analysis
10.13 Sparse Coding of Natural Images and Comparison with ICA Coding
10.14 Natural-Gradient Learning for Independent-Components Analysis
10.15 Maximum-Likelihood Estimation for Independent-Components Analysis
10.16 Maximum-Entropy Learning for Blind Source Separation
10.17 Maximization of Negentropy for Independent-Components Analysis
10.18 Coherent Independent-Components Analysis
10.19 Rate Distortion Theory and Information Bottleneck
10.20 Optimal Manifold Representation of Data
10.21 Computer Experiment: Pattern Classification
10.22 Summary and Discussion
Notes and References
Problems

Chapter 11 Stochastic Methods Rooted in Statistical Mechanics
11.1 Introduction
11.2 Statistical Mechanics
11.3 Markov Chains
11.4 Metropolis Algorithm
11.5 Simulated Annealing
11.6 Gibbs Sampling
11.7 Boltzmann Machine
11.8 Logistic Belief Nets
11.9 Deep Belief Nets
11.10 Deterministic Annealing
11.11 Analogy of Deterministic Annealing with Expectation-Maximization Algorithm
11.12 Summary and Discussion
Notes and References
Problems

Chapter 12 Dynamic Programming
12.1 Introduction
12.2 Markov Decision Process
12.3 Bellman's Optimality Criterion
12.4 Policy Iteration
12.5 Value Iteration
12.6 Approximate Dynamic Programming: Direct Methods
12.7 Temporal-Difference Learning
12.8 Q-Learning
12.9 Approximate Dynamic Programming: Indirect Methods
12.10 Least-Squares Policy Evaluation
12.11 Approximate Policy Iteration
12.12 Summary and Discussion
Notes and References
Problems

Chapter 13 Neurodynamics
13.1 Introduction
13.2 Dynamic Systems
13.3 Stability of Equilibrium States
13.4 Attractors
13.5 Neurodynamic Models
13.6 Manipulation of Attractors as a Recurrent Network Paradigm
13.7 Hopfield Model
13.8 The Cohen-Grossberg Theorem
13.9 Brain-State-In-A-Box Model
13.10 Strange Attractors and Chaos
13.11 Dynamic Reconstruction of a Chaotic Process
13.12 Summary and Discussion
Notes and References
Problems

Chapter 14 Bayesian Filtering for State Estimation of Dynamic Systems
14.1 Introduction
14.2 State-Space Models
14.3 Kalman Filters
14.4 The Divergence-Phenomenon and Square-Root Filtering
14.5 The Extended Kalman Filter
14.6 The Bayesian Filter
14.7 Cubature Kalman Filter: Building on the Kalman Filter
14.8 Particle Filters
14.9 Computer Experiment: Comparative Evaluation of Extended Kalman and Particle Filters
14.10 Kalman Filtering in Modeling of Brain Functions
14.11 Summary and Discussion
Notes and References
Problems

Chapter 15 Dynamically Driven Recurrent Networks
15.1 Introduction
15.2 Recurrent Network Architectures
15.3 Universal Approximation Theorem
15.4 Controllability and Observability
15.5 Computational Power of Recurrent Networks
15.6 Learning Algorithms
15.7 Back Propagation Through Time
15.8 Real-Time Recurrent Learning
15.9 Vanishing Gradients in Recurrent Networks
15.10 Supervised Training Framework for Recurrent Networks Using Nonlinear Sequential State Estimators
15.11 Computer Experiment: Dynamic Reconstruction of Mackey-Glass Attractor
15.12 Adaptivity Considerations
15.13 Case Study: Model Reference Applied to Neurocontrol
15.14 Summary and Discussion
Notes and References
Problems

Bibliography
Index