Signs and Supersigns in Deep Learning
DOI:
https://doi.org/10.15837/ijccc.2024.1.6392
Keywords:
deep learning, computational semiotics, neural network explainability, neural architecture optimization
Abstract
Semiotics is the study of signs and sign-using behavior. Computational semiotics is an interdisciplinary field that proposes a new approach to intelligent systems, in which an explicit account of the notion of sign is central. Our fundamental thesis is that information concentration processes appear in successive layers of deep learning models: each layer aggregates information from the previous layer of the network. In computational semiotics, this information concentration is known as superization, and it is accompanied by a decrease in entropy: signs are aggregated into supersigns. Our interdisciplinary approach enables us to depict superization processes within deep learning models. This is a novel semantic interpretation of deep learning. We use concepts from computational semiotics to explain decision processes in deep learning. Semiotic tools can also be used to optimize the architecture of deep neural networks. Interpretability/explainability and architecture optimization of neural models are currently among the most active topics in machine learning. We illustrate our semiotic approach with several applications. Our contribution can be seen as an initial step toward a cohesive semiotic framework for deep learning models.
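The claim that aggregating signs into supersigns lowers entropy can be illustrated with a toy calculation. The sketch below is not the authors' method: it assumes a discrete probability distribution over signs and models one "layer" as merging adjacent outcomes by summing their probabilities. The function names `shannon_entropy` and `aggregate` and the uniform starting distribution are hypothetical choices made only for this illustration.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def aggregate(probs, group_size):
    """Merge consecutive groups of outcomes into one outcome each.

    Each merged outcome (a toy 'supersign') receives the summed
    probability of the outcomes ('signs') it replaces.
    """
    return [sum(probs[i:i + group_size]) for i in range(0, len(probs), group_size)]


# Assumed toy input: eight equally likely signs.
signs = [0.125] * 8

# Two successive aggregation steps, standing in for two layers.
layer1 = aggregate(signs, 2)   # four supersigns
layer2 = aggregate(layer1, 2)  # two supersigns

entropies = [shannon_entropy(d) for d in (signs, layer1, layer2)]
print(entropies)
```

Merging outcomes can never increase Shannon entropy, so the sequence of values is non-increasing at every step. Entropy measured on the activations or saliency maps of a real network behaves less simply than in this toy setting.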
License
Copyright (c) 2023 Razvan ANDONIE; Bogdan Musat
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free under the Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
You are free to:
- Share: copy and redistribute the material in any medium or format;
- Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.