A New Semantic-Based Tool Detection Method for Robots
Keywords:
Functional Components, Mask-R-CNN Network, Tool Classification, Functional Semantics
Abstract
Home helper robots have become more widely accepted owing to their strong image recognition abilities. However, many common household tools remain difficult for robots to recognize, classify, and use. We designed a detection method for the functional components of common household tools based on the Mask Regional Convolutional Neural Network (Mask-R-CNN). The method is a multitask, multibranch target detection algorithm comprising tool classification, target box regression, and semantic segmentation, and it accurately identifies the functional components of tools. Compared with existing algorithms on the UMD Part Affordance dataset, it achieves effective instance segmentation and key point detection, with higher accuracy and robustness than two traditional algorithms. The proposed method therefore helps robots understand and use household tools better than traditional object detection algorithms.
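The three branches named in the abstract (tool classification, target box regression, and semantic segmentation) correspond to the standard heads of a Mask-R-CNN detector. The following is a minimal, hypothetical PyTorch/torchvision sketch of re-heading such a detector for tool functional-component labels; it is not the authors' implementation, and the class count (NUM_CLASSES) and input size are illustrative assumptions (torchvision >= 0.13 is assumed for the weights argument).

    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    # Assumption: background + 7 functional-part labels, roughly matching the
    # affordance categories of the UMD Part Affordance dataset; the paper's
    # exact label set may differ.
    NUM_CLASSES = 8

    def build_part_detector(num_classes=NUM_CLASSES):
        # COCO-pretrained Mask R-CNN backbone (torchvision >= 0.13 API).
        model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
        # Classification + box-regression branch: swap in a new predictor
        # sized for the tool-part label set.
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
        # Segmentation branch: replace the mask predictor the same way.
        in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
        model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
        return model

    model = build_part_detector()
    model.eval()
    with torch.no_grad():
        # Dummy RGB image; a real pipeline would feed UMD Part Affordance frames.
        prediction = model([torch.rand(3, 480, 640)])[0]
    # Each prediction dict carries the outputs of all three branches.
    print(prediction["labels"], prediction["boxes"].shape, prediction["masks"].shape)

Fine-tuning such a re-headed model on part-level masks would train the three branches jointly, end to end, which is the multitask setup the abstract describes.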