Extract Executable Action Sequences from Natural Language Instructions Based on DQN for Medical Service Robots
Keywords:
medical robots, human-robot interaction, DQN agent, attention mechanism, long short-term memory
Abstract
The emergence and popularization of medical robots bring great convenience to doctors in treating patients. The core of a medical robot is the interaction and cooperation between doctor and robot, so it is crucial to design a simple and stable human-robot interaction system. Language is the most convenient way for people to communicate, so in this paper a DQN agent based on long short-term memory (LSTM) and an attention mechanism is proposed to enable robots to extract executable action sequences from doctors' natural language instructions. To do so, the agent must complete two related tasks: 1) extracting action names from instructions, and 2) extracting action arguments for the extracted action names. We evaluate our agent on three datasets composed of texts with average lengths of 49.95, 209.34, and 417.17 words, respectively. The results show that our agent outperforms similar agents and handles long texts better than previous works.
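To make the first task concrete, the following is a minimal, illustrative sketch of how per-word Q-values over "action name / not action name" labels could be produced from a recurrent encoding with attention pooling. It is not the paper's implementation: the weights are random and untrained, a simple tanh recurrent cell stands in for the LSTM, and the vocabulary and example sentence are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary for illustration only.
VOCAB = {"<pad>": 0, "take": 1, "the": 2, "syringe": 3, "and": 4,
         "disinfect": 5, "wound": 6}
EMBED_DIM, HIDDEN_DIM, N_LABELS = 8, 8, 2  # labels: 0 = not action, 1 = action name

# Randomly initialized parameters; DQN training updates are omitted here.
E = rng.normal(0, 0.1, (len(VOCAB), EMBED_DIM))      # word embeddings
W_h = rng.normal(0, 0.1, (HIDDEN_DIM, EMBED_DIM))    # recurrent cell (stand-in for LSTM)
U_h = rng.normal(0, 0.1, (HIDDEN_DIM, HIDDEN_DIM))
w_a = rng.normal(0, 0.1, HIDDEN_DIM)                 # attention scoring vector
W_q = rng.normal(0, 0.1, (N_LABELS, 2 * HIDDEN_DIM)) # Q-value head

def encode(token_ids):
    """Recurrent encoding of the instruction (tanh RNN in place of an LSTM)."""
    h = np.zeros(HIDDEN_DIM)
    states = []
    for t in token_ids:
        h = np.tanh(W_h @ E[t] + U_h @ h)
        states.append(h)
    return np.stack(states)

def attention_context(states):
    """Softmax attention pooling over all hidden states."""
    scores = states @ w_a
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ states  # weighted sum -> context vector

def q_values(token_ids):
    """Per-word Q-values for labelling each word as action name or not."""
    states = encode(token_ids)
    ctx = attention_context(states)
    feats = np.concatenate([states, np.tile(ctx, (len(states), 1))], axis=1)
    return feats @ W_q.T  # shape: (seq_len, N_LABELS)

def greedy_labels(token_ids):
    """Greedy action selection over the Q-values (exploration omitted)."""
    return q_values(token_ids).argmax(axis=1)

sentence = ["take", "the", "syringe", "and", "disinfect", "the", "wound"]
ids = [VOCAB[w] for w in sentence]
print(list(zip(sentence, greedy_labels(ids).tolist())))
```

In a trained agent, the second task (argument extraction) would condition a second, analogous Q-network on the action names selected here.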
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free of charge under the Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
You are free to:
- Share: copy and redistribute the material in any medium or format;
- Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.