NLP: References for "The 2010s: Our Decade of Deep Learning / Outlook on the 2020s", a ten-year history of deep learning as seen by the father of LSTM

2024-05-10 11:58:05

The 2010s: Our Decade of Deep Learning / Outlook on the 2020s

References Beyond Those in Reference [MIR]

[MIR] J. Schmidhuber (2019). Deep Learning: Our Miraculous Year 1990-1991. Containing most references cited above. For convenience also appended below. Compare reddit posts [R2-R8] influenced by ref [MIR] (although my name is frequently misspelled).

[BW] H. Bourlard, C. J. Wellekens (1989). Links between Markov models and multilayer perceptrons. NIPS 1989, p. 502-510.

[BRI] Bridle, J.S. (1990). Alpha-Nets: A Recurrent "Neural" Network Architecture with a Hidden Markov Model Interpretation, Speech Communication, vol. 9, no. 1, pp. 83-92.

[BOU] H Bourlard, N Morgan (1993). Connectionist speech recognition. Kluwer, 1993.

[HYB12] Hinton, G. E., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., and Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag., 29(6):82-97.

[LSTM14] S. Fernandez, A. Graves, J. Schmidhuber. Sequence labelling in structured domains with hierarchical recurrent neural networks. In Proc. IJCAI 07, p. 774-779, Hyderabad, India, 2007 (talk). PDF.

[LSTM15] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. Advances in Neural Information Processing Systems 22, NIPS'22, p 545-552, Vancouver, MIT Press, 2009. PDF.

[LSTM16] M. Stollenga, W. Byeon, M. Liwicki, J. Schmidhuber. Parallel Multi-Dimensional LSTM, With Application to Fast Biomedical Volumetric Image Segmentation. Advances in Neural Information Processing Systems (NIPS), 2015. Preprint: arxiv:1506.07452.

[TR1] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin (2017). Attention is all you need. NIPS 2017, pp. 5998-6008.

[TR2] J. Devlin, M. W. Chang, K. Lee, K. Toutanova (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. Preprint arXiv:1810.04805.

[SLG] S. Le Grand. Medium (2019). TLDR: Schmidhuber's Lab did it first. Link.

[AC18] Y. Burda, H. Edwards, D. Pathak, A. Storkey, T. Darrell, and A. A. Efros. Large-scale study of curiosity-driven learning. Preprint arXiv:1808.04355, 2018.

[T94] G. Tesauro. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation 6:2, p 215-219, 1994.

[DM4] Mastering the game of Go with deep neural networks and tree search. D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis. Nature 529:7587, p 484-489, 2016.

[CAR1] Prof. Schmidhuber's highlights of robot car history (2007, updated 2011). Link.

[NAT2] D. Heaven. Why deep-learning AIs are so easy to fool. Nature 574, 163-166 (2019). Link. ["A baby doesn't learn by downloading data from Facebook," says Schmidhuber.]

[SV1] S. Zuboff (2019). The age of surveillance capitalism. The Fight for a Human Future at the New Frontier of Power. NY: PublicAffairs.

[SV2] Facial recognition changes China. Twitter discussion @hardmaru.

[META10] T. Schaul and J. Schmidhuber. Metalearning. Scholarpedia, 5(6):4650, 2010.

[META17] R. Miikkulainen, Q. Le, K. Stanley, C. Fernando. NIPS 2017 Metalearning Symposium.

[LIP1] M. Wand, J. Koutnik, J. Schmidhuber. Lipreading with Long Short-Term Memory. Proc. ICASSP, p 6115-6119, 2016.

[DR16] A Giusti, J Guzzi, DC Ciresan, F He, JP Rodriguez, F Fontana, M Faessler, C Forster, J Schmidhuber, G Di Caro, D Scaramuzza, LM Gambardella (2016): First drone with onboard vision based on deep neural nets learns to navigate in the forest. Youtube video (Feb 2016).

[DNC2] R. Csordas, J. Schmidhuber. Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control. International Conference on Learning Representations (ICLR 2019). PDF.

[UDRL] Upside Down Reinforcement Learning (2019). Google it.

[K96] Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 237-285.

[H13] M. Hausknecht, J. Lehman, R. Miikkulainen, P. Stone. A Neuroevolution Approach to General Atari Game Playing. IEEE Transactions on Computational Intelligence and AI in Games, 16 Dec. 2013.

[LOC] S. Hochreiter and J. Schmidhuber. Feature extraction through LOCOCODE. PDF. Neural Computation 11(3): 679-714, 1999

[OBJ1] Greff, K., Rasmus, A., Berglund, M., Hao, T., Valpola, H., Schmidhuber, J. (2016). Tagger: Deep unsupervised perceptual grouping. NIPS 2016, pp. 4484-4492.

[OBJ2] Greff, K., Van Steenkiste, S., Schmidhuber, J. (2017). Neural expectation maximization. NIPS 2017, pp. 6691-6701.

[OBJ3] van Steenkiste, S., Chang, M., Greff, K., Schmidhuber, J. (2018). Relational neural expectation maximization: Unsupervised discovery of objects and their interactions. ICLR 2018.

[IG] X Chen, Y Duan, R Houthooft, J Schulman, I Sutskever, P Abbeel (2016). Infogan: Interpretable representation learning by information maximizing generative adversarial nets. NIPS 2016, pp. 2172-2180.

[WAV1] van Steenkiste, S., Koutnik, J., Driessens, K., Schmidhuber, J. (July 2016). A wavelet-based encoding for neuroevolution. GECCO 2016, pp. 517-524.

[OAI3] Salimans, T., Ho, J., Chen, X., Sidor, S., Sutskever, I. (2017). Evolution strategies as a scalable alternative to reinforcement learning. Preprint arXiv:1703.03864.

[MIR]-related discussions (2019) with many comments at reddit/ml (the largest machine learning forum with over 800k subscribers), ranked by votes (my name is often misspelled):

[R2] Reddit/ML, 2019. JS really had GANs in 1990. Link.

[R3] Reddit/ML, 2019. NeurIPS 2019 Bengio Schmidhuber Meta-Learning Fiasco. Link.

[R4] Reddit/ML, 2019. Five major deep learning papers by G. Hinton did not cite similar earlier work by JS. Link.

[R5] Reddit/ML, 2019. The 1997 LSTM paper by Hochreiter & Schmidhuber has become the most cited deep learning research paper of the 20th century. Link.

[R6] Reddit/ML, 2019. DanNet, the CUDA CNN of Dan Ciresan in JS' team, won 4 image recognition challenges prior to AlexNet. Link.

[R7] Reddit/ML, 2019. JS on Seppo Linnainmaa, inventor of backpropagation in 1970. Link.

[R8] Reddit/ML, 2019. JS on Alexey Ivakhnenko, godfather of deep learning 1965. Link.

Below are a few selected interviews of the 2010s in newspapers and magazines (use DeepL or Google Translate (Sec. 1) to translate German texts). Hundreds of additional interviews and news articles (mostly in English or German) can be found through search engines.

[ACM16] ACM interview by S. Ibaraki (2016). Chat with J. Schmidhuber: Artificial Intelligence & Deep Learning - Now & Future. Link.

[INV16] J. Carmichael. AI gained consciousness in 1991... J. Schmidhuber is convinced the ultimate breakthrough already happened. Inverse, Dec 2016. Link.

[SR18] JS interviewed by Swiss Re (2018): The intelligence behind artificial intelligence. Link.

[CNNTV2] JS interviewed by CNNmoney (2019): Part 2 on a healthcare data market where every patient can become a micro-entrepreneur. (Part 1 is more general.)

[FA15] Intelligente Roboter werden vom Leben fasziniert sein. (Intelligent robots will be fascinated by life.) FAZ, 1 Dec 2015. Link.

[SP16] JS interviewed by C. Stoecker: KI wird das All erobern. (AI will conquer the universe.) SPIEGEL, 6 Feb 2016. Link.

[FA18] KI ist eine Riesenchance für Deutschland. (AI is a huge chance for Germany.) FAZ, 2018. Link.

[SPE17] JS interviewed by P. Hummel: Ein Wettrüsten wird sich nicht verhindern lassen. (An AI arms race is inevitable.) Spektrum, 28 Aug 2017. Link.

[CAR2] Interview with J. Schmidhuber at the Geneva Motor Show 2019: KI wird die Autobranche revolutionieren. (AI will revolutionise the car industry.) Blick, 11/03/2019. Link.

[FATV] AI & Economy. Public Night Talk with J. Schmidhuber, organised by FAZ and Hertie Stiftung (2019, in German). Youtube link.

Selected References from Reference [MIR]

[DL1] J. Schmidhuber, 2015. Deep Learning in neural networks: An overview. Neural Networks, 61, 85-117. More.

[DL2] J. Schmidhuber, 2015. Deep Learning. Scholarpedia, 10(11):32832.

[DL4] J. Schmidhuber, 2017. Our impact on the world's most valuable public companies: 1. Apple, 2. Alphabet (Google), 3. Microsoft, 4. Facebook, 5. Amazon ... HTML.

[DLC] J. Schmidhuber, 2015. Critique of Paper by "Deep Learning Conspiracy" (Nature 521 p 436). June 2015. HTML.

[AV1] A. Vance. Google Amazon and Facebook Owe Jürgen Schmidhuber a Fortune - This Man Is the Godfather the AI Community Wants to Forget. Business Week, Bloomberg, May 15, 2018.

[KO0] J. Schmidhuber. Discovering problem solutions with low Kolmogorov complexity and high generalization capability. Technical Report FKI-194-94, Fakultät für Informatik, Technische Universität München, 1994. PDF. Also at ICML'95.

[KO2] J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5):857-873, 1997. PDF.

[CO1] J. Koutnik, F. Gomez, J. Schmidhuber (2010). Evolving Neural Networks in Compressed Weight Space. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2010), Portland, 2010. PDF.

[CO2] J. Koutnik, G. Cuccu, J. Schmidhuber, F. Gomez. Evolving Large-Scale Neural Networks for Vision-Based Reinforcement Learning. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), Amsterdam, July 2013. PDF.

[CO3] R. K. Srivastava, J. Schmidhuber, F. Gomez. Generalized Compressed Network Search. Proc. GECCO 2012. PDF.

[DM1] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller. Playing Atari with Deep Reinforcement Learning. Tech Report, 19 Dec. 2013, arxiv:1312.5602.

[DM2] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, D. Hassabis. Human-level control through deep reinforcement learning. Nature, vol. 518, p 529, 26 Feb. 2015. Link.

[DM3] S. Stanford. DeepMind's AI, AlphaStar Showcases Significant Progress Towards AGI. Medium ML Memoirs, 2019. [Alphastar has a "deep LSTM core."]

[OAI1] G. Powell, J. Schneider, J. Tobin, W. Zaremba, A. Petron, M. Chociej, L. Weng, B. McGrew, S. Sidor, A. Ray, P. Welinder, R. Jozefowicz, M. Plappert, J. Pachocki, M. Andrychowicz, B. Baker. Learning Dexterity. OpenAI Blog, 2018.

[OAI2] OpenAI et al. (Dec 2019). Dota 2 with Large Scale Deep Reinforcement Learning. Preprint arxiv:1912.06680. [An LSTM composes 84% of the model's total parameter count.]

[OAI2a] J. Rodriguez. The Science Behind OpenAI Five that just Produced One of the Greatest Breakthrough in the History of AI. Towards Data Science, 2018. [An LSTM was the core of OpenAI Five.]

[MC43] W. S. McCulloch, W. Pitts. A Logical Calculus of Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics, Vol. 5, p. 115-133, 1943.

[K56] S.C. Kleene. Representation of Events in Nerve Nets and Finite Automata. Automata Studies, Editors: C.E. Shannon and J. McCarthy, Princeton University Press, p. 3-42, Princeton, N.J., 1956.

[VAN1] S. Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, TUM, 1991 (advisor J.S.) PDF. [More on the Fundamental Deep Learning Problem.]

[VAN3] S. Hochreiter, Y. Bengio, P. Frasconi, J. Schmidhuber. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In S. C. Kremer and J. F. Kolen, eds., A Field Guide to Dynamical Recurrent Neural Networks. IEEE press, 2001. PDF.

[LSTM0] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. TR FKI-207-95, TUM, August 1995. PDF.

[LSTM1] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. PDF. Based on [LSTM0]. More.

[LSTM2] F. A. Gers, J. Schmidhuber, F. Cummins. Learning to Forget: Continual Prediction with LSTM. Neural Computation, 12(10):2451-2471, 2000. PDF. [The "vanilla LSTM architecture" that everybody is using today, e.g., in Google's Tensorflow.]

[LSTM3] A. Graves, J. Schmidhuber. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18:5-6, pp. 602-610, 2005. PDF.

[LSTM4] S. Fernandez, A. Graves, J. Schmidhuber. An application of recurrent neural networks to discriminative keyword spotting. Intl. Conf. on Artificial Neural Networks ICANN'07, 2007. PDF.

[LSTM5] A. Graves, M. Liwicki, S. Fernandez, R. Bertolami, H. Bunke, J. Schmidhuber. A Novel Connectionist System for Improved Unconstrained Handwriting Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 5, 2009. PDF.

[LSTM6] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. NIPS'22, p 545-552, Vancouver, MIT Press, 2009. PDF.

[LSTM7] J. Bayer, D. Wierstra, J. Togelius, J. Schmidhuber. Evolving memory cell structures for sequence learning. Proc. ICANN-09, Cyprus, 2009. PDF.

[LSTM8] A. Graves, A. Mohamed, G. E. Hinton. Speech Recognition with Deep Recurrent Neural Networks. ICASSP 2013, Vancouver, 2013. PDF.

[LSTM9] O. Vinyals, L. Kaiser, T. Koo, S. Petrov, I. Sutskever, G. Hinton. Grammar as a Foreign Language. Preprint arXiv:1412.7449 [cs.CL].

[LSTM10] A. Graves, D. Eck and N. Beringer, J. Schmidhuber. Biologically Plausible Speech Recognition with LSTM Neural Nets. In J. Ijspeert (Ed.), First Intl. Workshop on Biologically Inspired Approaches to Advanced Information Technology, Bio-ADIT 2004, Lausanne, Switzerland, p. 175-184, 2004. PDF.

[LSTM11] N. Beringer and A. Graves and F. Schiel and J. Schmidhuber. Classifying unprompted speech by retraining LSTM Nets. In W. Duch et al. (Eds.): Proc. Intl. Conf. on Artificial Neural Networks ICANN'05, LNCS 3696, pp. 575-581, Springer-Verlag Berlin Heidelberg, 2005.

[LSTM12] D. Wierstra, F. Gomez, J. Schmidhuber. Modeling systems with internal state using Evolino. In Proc. of the 2005 conference on genetic and evolutionary computation (GECCO), Washington, D. C., pp. 1795-1802, ACM Press, New York, NY, USA, 2005. Got a GECCO best paper award.

[LSTM13] F. A. Gers and J. Schmidhuber. LSTM Recurrent Networks Learn Simple Context Free and Context Sensitive Languages. IEEE Transactions on Neural Networks 12(6):1333-1340, 2001. PDF.

[NAS] B. Zoph, Q. V. Le. Neural Architecture Search with Reinforcement Learning. Preprint arXiv:1611.01578 (PDF), 2017.

[S2S] I. Sutskever, O. Vinyals, Quoc V. Le. Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems (NIPS), 2014, 3104-3112.

[CTC] A. Graves, S. Fernandez, F. Gomez, J. Schmidhuber. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. ICML 06, Pittsburgh, 2006. PDF.

[GSR15] Dramatic improvement of Google's speech recognition through LSTM: Alphr Technology, Jul 2015, or 9to5google, Jul 2015

[META1] J. Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-... hook. Diploma thesis, Tech Univ. Munich, 1987. HTML.

[FASTMETA1] J. Schmidhuber. Steps towards `self-referential' learning. Technical Report CU-CS-627-92, Dept. of Comp. Sci., University of Colorado at Boulder, November 1992.

[FASTMETA2] J. Schmidhuber. A self-referential weight matrix. In Proceedings of the International Conference on Artificial Neural Networks, Amsterdam, pages 446-451. Springer, 1993. PDF.

[FASTMETA3] J. Schmidhuber. An introspective network that can learn to run its own weight change algorithm. In Proc. of the Intl. Conf. on Artificial Neural Networks, Brighton, pages 191-195. IEE, 1993.

[FAST0] J. Schmidhuber. Learning to control fast-weight memories: An alternative to recurrent nets. Technical Report FKI-147-91, Institut für Informatik, Technische Universität München, March 1991. PDF.

[FAST1] J. Schmidhuber. Learning to control fast-weight memories: An alternative to recurrent nets. Neural Computation, 4(1):131-139, 1992. PDF. HTML. Pictures (German).

[FAST2] J. Schmidhuber. Reducing the ratio between learning complexity and number of time-varying variables in fully recurrent nets. In Proceedings of the International Conference on Artificial Neural Networks, Amsterdam, pages 460-463. Springer, 1993. PDF.

[FAST3] I. Schlag, J. Schmidhuber. Gated Fast Weights for On-The-Fly Neural Program Generation. Workshop on Meta-Learning, @NIPS 2017, Long Beach, CA, USA.

[FAST3a] I. Schlag, J. Schmidhuber. Learning to Reason with Third Order Tensor Products. Advances in Neural Information Processing Systems (NIPS), Montreal, 2018. Preprint: arXiv:1811.12143. PDF.

[DNC] Hybrid computing using a neural network with dynamic external memory. A. Graves, G. Wayne, M. Reynolds, T. Harley, I. Danihelka, A. Grabska-Barwinska, S. G. Colmenarejo, E. Grefenstette, T. Ramalho, J. Agapiou, A. P. Badia, K. M. Hermann, Y. Zwols, G. Ostrovski, A. Cain, H. King, C. Summerfield, P. Blunsom, K. Kavukcuoglu, D. Hassabis. Nature, 538:7626, p 471, 2016.

[PDA1] G.Z. Sun, H.H. Chen, C.L. Giles, Y.C. Lee, D. Chen. Neural Networks with External Memory Stack that Learn Context - Free Grammars from Examples. Proceedings of the 1990 Conference on Information Science and Systems, Vol.II, pp. 649-653, Princeton University, Princeton, NJ, 1990.

[PDA2] M. Mozer, S. Das. A connectionist symbol manipulator that discovers the structure of context-free languages. Proc. NIPS 1993.

[WU] Y. Wu et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Preprint arXiv:1609.08144 (PDF), 2016.

[GT16] Google's dramatically improved Google Translate of 2016 is based on LSTM, e.g., WIRED, Sep 2016, or siliconANGLE, Sep 2016

[FB17] By 2017, Facebook used LSTM to handle over 4 billion automatic translations per day (The Verge, August 4, 2017); see also Facebook blog by J.M. Pino, A. Sidorov, N.F. Ayan (August 3, 2017).

[LSTM-RL] B. Bakker, F. Linaker, J. Schmidhuber. Reinforcement Learning in Partially Observable Mobile Robot Domains Using Unsupervised Event Extraction. In Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), Lausanne, 2002. PDF.

[HW1] Srivastava, R. K., Greff, K., Schmidhuber, J. Highway networks. Preprints arXiv:1505.00387 (May 2015) and arXiv:1507.06228 (July 2015). Also at NIPS 2015. The first working very deep feedforward nets with over 100 layers. Let g, t, h denote non-linear differentiable functions. Each non-input layer of a highway net computes g(x)x + t(x)h(x), where x is the data from the previous layer. (Like LSTM with forget gates [LSTM2] for RNNs.) Resnets [HW2] are a special case of this where g(x)=t(x)=const=1. More.

[HW2] He, K., Zhang, X., Ren, S., Sun, J. Deep residual learning for image recognition. Preprint arXiv:1512.03385 (Dec 2015). Residual nets are a special case of highway nets [HW1], with g(x)=1 (a typical highway net initialization) and t(x)=1. More.

[HW3] K. Greff, R. K. Srivastava, J. Schmidhuber. Highway and Residual Networks learn Unrolled Iterative Estimation. Preprint arxiv:1612.07771 (2016). Also at ICLR 2017.

[JOU17] Jouppi et al. (2017). In-Datacenter Performance Analysis of a Tensor Processing Unit. Preprint arXiv:1704.04760

[CNN1] K. Fukushima: Neural network model for a mechanism of pattern recognition unaffected by shift in position - Neocognitron. Trans. IECE, vol. J62-A, no. 10, pp. 658-665, 1979. [More in Scholarpedia.]

[CNN1a] A. Waibel. Phoneme Recognition Using Time-Delay Neural Networks. Meeting of IEICE, Tokyo, Japan, 1987.

[CNN2] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, 1989. PDF.
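
Entry [HW1] above states that each non-input layer of a highway net computes g(x)x + t(x)h(x), and that residual nets [HW2] are the special case g(x)=t(x)=1. The following is a minimal sketch of that formula in plain Python, not the authors' implementation. The particular choices of h (a tanh), t (a sigmoid), and g (the coupled carry gate 1 - t) are assumptions made only for illustration, as are all function names.

```python
import math


def sigmoid(z):
    """Logistic function, used here as a toy gate non-linearity."""
    return 1.0 / (1.0 + math.exp(-z))


def highway_layer(x, g, t, h):
    """Elementwise g(x)*x + t(x)*h(x) for one layer; x is a list of floats."""
    gx, tx, hx = g(x), t(x), h(x)
    return [gi * xi + ti * hi for gi, xi, ti, hi in zip(gx, x, tx, hx)]


# Hypothetical toy functions (assumptions, not taken from the reference list).
def h(x):
    """Toy transform."""
    return [math.tanh(2.0 * xi) for xi in x]


def t(x):
    """Toy transform gate with values in (0, 1)."""
    return [sigmoid(xi - 1.0) for xi in x]


def g(x):
    """Toy carry gate, coupled to the transform gate as 1 - t(x)."""
    return [1.0 - ti for ti in t(x)]


def ones(x):
    """Constant gate equal to 1, giving the residual special case."""
    return [1.0] * len(x)


x = [0.5, -1.0, 2.0]
highway_out = highway_layer(x, g, t, h)         # gated mix of x and h(x)
residual_out = highway_layer(x, ones, ones, h)  # reduces to x + h(x)
```

With both gates fixed to 1, `residual_out` equals `x + h(x)` elementwise, which is the residual connection described in [HW2].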
