Reusing Interactive Analysis Workflows
Abstract
Interactive visual analysis has many advantages, but an important disadvantage is that analysis processes and workflows cannot be easily stored and reused. This is in contrast to code-based analysis workflows, which can simply be rerun on updated datasets and adapted when necessary. In this paper, we introduce methods to capture workflows in interactive visualization systems for different interactions such as selections, filters, categorizing/grouping, labeling, and aggregation. These workflows can then be applied to updated datasets, making interactive visualization sessions reusable. We demonstrate this specification using an interactive visualization system that tracks interaction provenance and allows generating workflows from the recorded actions. The system can then be used to compare different versions of datasets and apply workflows to them. Finally, we introduce a Python library that can load workflows and apply them to updated datasets directly in a computational notebook, providing a seamless bridge between computational workflows and interactive visualization tools.
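To make the idea of a stored, replayable workflow concrete, the following is a minimal sketch in plain Python. It is not the library described in the abstract: the step format, the `apply_workflow` function, and the sample data are all hypothetical and chosen only for illustration. The sketch assumes that each recorded interaction is stored as a declarative rule (for example, a threshold on a column) rather than as a list of row identifiers, so that the same step remains meaningful when rows are added to the dataset.

```python
import json

# Hypothetical workflow format: an ordered list of declarative steps.
# Steps are plain data, so a workflow can be saved and reloaded as JSON.
workflow = [
    {"op": "filter", "column": "cases", "min": 100},
    {"op": "categorize", "column": "cases", "threshold": 1000,
     "into": "severity", "above": "high", "below": "low"},
    {"op": "aggregate", "by": "severity", "column": "cases"},
]


def apply_workflow(rows, steps):
    """Replay recorded steps, in order, on a list of row dicts."""
    rows = [dict(r) for r in rows]  # work on copies; the input stays untouched
    for step in steps:
        if step["op"] == "filter":
            # Keep rows whose value is at least the recorded minimum.
            rows = [r for r in rows if r[step["column"]] >= step["min"]]
        elif step["op"] == "categorize":
            # Assign each row to one of two named groups.
            for r in rows:
                high = r[step["column"]] >= step["threshold"]
                r[step["into"]] = step["above"] if high else step["below"]
        elif step["op"] == "aggregate":
            # Sum the column within each group, keeping first-seen group order.
            totals = {}
            for r in rows:
                key = r[step["by"]]
                totals[key] = totals.get(key, 0) + r[step["column"]]
            rows = [{step["by"]: k, step["column"]: v} for k, v in totals.items()]
        else:
            raise ValueError(f"unknown step: {step['op']}")
    return rows


# Made-up data: an original dataset and an updated version with new rows.
v1 = [
    {"region": "A", "cases": 50},
    {"region": "A", "cases": 200},
    {"region": "B", "cases": 1500},
]
v2 = v1 + [
    {"region": "B", "cases": 500},
    {"region": "C", "cases": 120},
]

# The workflow survives a JSON round trip and runs unchanged on both versions.
restored = json.loads(json.dumps(workflow))
result_v1 = apply_workflow(v1, restored)
result_v2 = apply_workflow(v2, restored)
print(result_v1)
print(result_v2)
```

Because the steps are rules rather than fixed sets of rows, rows that appear only in the updated dataset are filtered, categorized, and aggregated in the same way as the original ones.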
Supporting Information
Filename | Size | Description |
---|---|---|
cgf14528-sup-0001-S1.pdf | 2.6 MB | Supporting Information |
cgf14528-sup-0001-S2.mp4 | 111.9 MB | Supporting Information |
cgf14528-sup-0001-S3.pdf | 60.6 KB | Supporting Information |
cgf14528-sup-0001-S4.pdf | 2.6 MB | Supporting Information |
cgf14528-sup-0001-S5.pdf | 225.4 KB | Supporting Information |