The four-volume set LNCS 9947, LNCS 9948, LNCS 9949, and LNCS 9950 constitutes the proceedings of the 23rd International Conference on Neural Information Processing, ICONIP 2016, held in Kyoto, Japan, in October 2016. The 296 full papers presented were carefully reviewed and selected from 431 submissions. The four volumes are organized in topical sections on deep and reinforcement learning; big data analysis; neural data analysis; robotics and control; bio-inspired/energy-efficient information processing; whole brain architecture; neurodynamics; bioinformatics; biomedical engineering; data mining and cybersecurity workshop; machine learning; neuromorphic hardware; sensory perception; pattern recognition; social networks; brain-machine interface; computer vision; time series analysis; data-driven approach for extracting latent features; topological and graph-based clustering methods; computational intelligence; data mining; deep neural networks; computational and cognitive neurosciences; theory and algorithms.

Namely, the hidden layers are shared by the actor and the critic. In the learning of the critic, the Temporal Difference (TD) error $\hat{r}_t$ is used:

$$\hat{r}_t = r_t + \gamma P(s_t) - P(s_{t-1}) \tag{1}$$

where $r_t$ is a reward, $\gamma$ is a discount factor, $s_t$ is a state vector, and $P(s_t)$ denotes a state value. The state value at $t-1$, $P(s_{t-1})$, is trained by

$$P^T(s_{t-1}) = P(s_{t-1}) + \hat{r}_t = r_t + \gamma P(s_t) \tag{2}$$

where $P^T(s_{t-1})$ denotes the training signal for the state value. On the other hand, the actor output vector $a(s_{t-1})$ is trained by

$$a^T(s_{t-1}) = a(s_{t-1}) + \hat{r}_t \, \mathrm{rnd}_{t-1} \tag{3}$$
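As a rough illustration, the three training signals in Eqs. (1) to (3) can be computed as in the following minimal Python sketch. It only evaluates the targets for a single time step and does not model the shared network itself. The function names and the numeric values are hypothetical, and the excerpt does not define rnd_{t-1}; the sketch assumes it is a noise vector with the same dimension as the actor output.

```python
def td_error(reward, gamma, value_now, value_prev):
    """Eq. (1): r_hat_t = r_t + gamma * P(s_t) - P(s_{t-1})."""
    return reward + gamma * value_now - value_prev


def critic_target(value_prev, r_hat):
    """Eq. (2): P^T(s_{t-1}) = P(s_{t-1}) + r_hat_t."""
    return value_prev + r_hat


def actor_target(action_prev, r_hat, noise_prev):
    """Eq. (3): a^T(s_{t-1}) = a(s_{t-1}) + r_hat_t * rnd_{t-1}, element-wise.

    noise_prev stands in for rnd_{t-1}; treating it as a per-output
    noise vector is an assumption, not something the excerpt states.
    """
    return [a + r_hat * n for a, n in zip(action_prev, noise_prev)]


# Illustrative values only (not taken from the paper).
reward, gamma = 1.0, 0.5
value_prev, value_now = 0.5, 2.0   # P(s_{t-1}), P(s_t)
action_prev = [0.25, -0.5]         # a(s_{t-1})
noise_prev = [0.5, -0.25]          # assumed rnd_{t-1}

r_hat = td_error(reward, gamma, value_now, value_prev)
p_target = critic_target(value_prev, r_hat)
a_target = actor_target(action_prev, r_hat, noise_prev)

print(r_hat, p_target, a_target)
```

Note that the critic target reduces to r_t + γP(s_t), as the second equality in Eq. (2) states, so the previous state value cancels out of the training signal. With a positive TD error, the actor target moves in the direction of the noise that was applied; with a negative one, it moves away from it.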



