Fault Triggers in the TensorFlow Framework: An Experience Report

Abstract

TensorFlow is one of the most popular frameworks for developing machine learning algorithms. Because of its popularity and large-scale use, even a single bug may lead to severe consequences and affect a large number of users. With a growing number of safety-critical systems built upon TensorFlow, its reliability is becoming increasingly important. An essential step toward ensuring TensorFlow’s reliability is to understand the characteristics of the bugs that have occurred in it. This paper presents the first comprehensive empirical study of fault triggering conditions in TensorFlow. We collect 2,285 bug reports from TensorFlow’s GitHub repository and classify the bugs by their fault triggering conditions. We then analyze the frequency distribution of the different bug types and how each type evolves over time. The relationship between bug type and fixing time is also investigated. In addition, the root causes of Bohrbugs and Mandelbugs are studied, and five root causes are identified. Furthermore, regression bugs in TensorFlow are analyzed. Our empirical results reveal 10 important findings, from which we derive 8 implications for developers and users.

Publication
2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE)