Here I’ll post all the papers, articles, books, courses, YouTube videos, Reels, and TikToks that I’ve consumed and consider important enough to remember.
All my posts will share this one common reference/bibliography section.
— Papers (Code “PA”) —
- Turing Paper (1936): Turing, A. M. (1937). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, s2-42(1), 230-265. https://doi.org/10.1112/plms/s2-42.1.230
- A Logical Calculus (McCulloch & Pitts): McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5(4), 115-133. https://doi.org/10.1007/BF02478259
- Long Short-Term Memory (LSTM original): Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
- Learning Phrase Representations using RNN Encoder-Decoder: Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1724-1734). Association for Computational Linguistics. https://doi.org/10.3115/v1/D14-1179
- Neural Machine Translation (Attention Mechanism): Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations, ICLR 2015. arXiv:1409.0473. https://arxiv.org/abs/1409.0473
- Sequence to Sequence Learning: Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (Vol. 27, pp. 3104-3112). Curran Associates, Inc. https://arxiv.org/abs/1409.3215
- Generative Adversarial Networks: Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems (Vol. 27, pp. 2672-2680). Curran Associates, Inc. https://arxiv.org/abs/1406.2661
- A Decomposable Attention Model: Parikh, A., Täckström, O., Das, D., & Uszkoreit, J. (2016). A decomposable attention model for natural language inference. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 2249-2255). Association for Computational Linguistics. https://doi.org/10.18653/v1/D16-1244
- Attention Is All You Need (Transformers): Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (Vol. 30, pp. 6000-6010). Curran Associates, Inc. https://arxiv.org/abs/1706.03762
- LSTM: A Search Space Odyssey: Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., & Schmidhuber, J. (2017). LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2222-2232. https://doi.org/10.1109/TNNLS.2016.2582924
- From Turing to Transformers: Cheok, A. D., & Zhang, E. Y. (2023). From Turing to transformers: A comprehensive review and tutorial on the evolution and applications of generative transformer models. Sci, 5(4), 46. https://doi.org/10.3390/sci5040046
- Hierarchical Reasoning Model: HRM Team. (2025). Hierarchical reasoning model. arXiv:2506.21734. https://arxiv.org/abs/2506.21734
— Books (Code “BO”) —
- Roger Penrose – The Emperor’s New Mind
- Roger Penrose – Shadows of the Mind
- Aldous Huxley – The Doors of Perception
- J. Storrs Hall, PhD – Beyond AI
- Erwin Schrödinger – Mind and Matter
- Nassim Nicholas Taleb – The Black Swan
— Articles (Code “AR”) —
— Courses (Code “CO”) —
