---
_id: '36522'
abstract:
- lang: eng
  text: "Jupyter notebooks enable developers to interleave code snippets with rich-text
    and in-line visualizations. Data scientists use Jupyter notebooks as the de-facto
    standard for creating and sharing machine-learning based solutions, primarily
    written in Python. Recent studies have demonstrated, however, that a large portion
    of Jupyter notebooks available on public platforms are undocumented and lack
    a narrative structure. This reduces the readability of these notebooks. To address
    this shortcoming, this paper presents HeaderGen, a novel tool-based approach that
    automatically annotates code cells with categorical markdown headers based on
    a taxonomy of machine-learning operations, and classifies and displays function
    calls according to this taxonomy. For this functionality to be realized, HeaderGen
    enhances an existing call graph analysis in PyCG. To improve precision, HeaderGen
    extends PyCG's analysis with support for handling external library code and flow-sensitivity.
    The former is realized by facilitating the resolution of function return-types.
    Furthermore, HeaderGen uses type information to perform pattern matching on code
    syntax to annotate code cells.\r\nThe evaluation on 15 real-world Jupyter notebooks
    from Kaggle shows that HeaderGen's underlying call graph analysis yields high
    accuracy (96.4% precision and 95.9% recall). This is because HeaderGen can resolve
    return-types of external libraries where existing type inference tools such as
    pytype (by Google), pyright (by Microsoft), and Jedi fall short. The header generation
    has a precision of 82.2% and a recall rate of 96.8% with regard to headers created
    manually by experts. In a user study, HeaderGen helps participants finish comprehension
    and navigation tasks faster. All participants clearly perceive HeaderGen as useful
    to their task."
author:
- first_name: Ashwin Prasad
  full_name: Shivarpatna Venkatesh, Ashwin Prasad
  id: '66637'
  last_name: Shivarpatna Venkatesh
- first_name: Jiawei
  full_name: Wang, Jiawei
  last_name: Wang
- first_name: Li
  full_name: Li, Li
  last_name: Li
- first_name: Eric
  full_name: Bodden, Eric
  id: '59256'
  last_name: Bodden
  orcid: 0000-0003-3470-3647
citation:
  ama: 'Shivarpatna Venkatesh AP, Wang J, Li L, Bodden E. Enhancing Comprehension
    and Navigation in Jupyter Notebooks with Static Analysis. In: IEEE SANER 2023
    (International Conference on Software Analysis, Evolution and Reengineering);
    2023. doi:<a href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>'
  apa: Shivarpatna Venkatesh, A. P., Wang, J., Li, L., &#38; Bodden, E. (2023). <i>Enhancing
    Comprehension and Navigation in Jupyter Notebooks with Static Analysis</i>. IEEE
    SANER 2023 (International Conference on Software Analysis, Evolution and Reengineering).
    <a href="https://doi.org/10.48550/ARXIV.2301.04419">https://doi.org/10.48550/ARXIV.2301.04419</a>
  bibtex: '@inproceedings{Shivarpatna Venkatesh_Wang_Li_Bodden_2023, title={Enhancing
    Comprehension and Navigation in Jupyter Notebooks with Static Analysis}, DOI={<a
    href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>},
    publisher={IEEE SANER 2023 (International Conference on Software Analysis, Evolution
    and Reengineering)}, author={Shivarpatna Venkatesh, Ashwin Prasad and Wang, Jiawei
    and Li, Li and Bodden, Eric}, year={2023} }'
  chicago: Shivarpatna Venkatesh, Ashwin Prasad, Jiawei Wang, Li Li, and Eric Bodden.
    “Enhancing Comprehension and Navigation in Jupyter Notebooks with Static Analysis.”
    IEEE SANER 2023 (International Conference on Software Analysis, Evolution and
    Reengineering), 2023. <a href="https://doi.org/10.48550/ARXIV.2301.04419">https://doi.org/10.48550/ARXIV.2301.04419</a>.
  ieee: 'A. P. Shivarpatna Venkatesh, J. Wang, L. Li, and E. Bodden, “Enhancing Comprehension
    and Navigation in Jupyter Notebooks with Static Analysis,” presented at the IEEE
    SANER 2023 (International Conference on Software Analysis, Evolution and Reengineering),
    2023, doi: <a href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>.'
  mla: Shivarpatna Venkatesh, Ashwin Prasad, et al. <i>Enhancing Comprehension and
    Navigation in Jupyter Notebooks with Static Analysis</i>. IEEE SANER 2023 (International
    Conference on Software Analysis, Evolution and Reengineering), 2023, doi:<a href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>.
  short: 'A.P. Shivarpatna Venkatesh, J. Wang, L. Li, E. Bodden, in: IEEE SANER 2023
    (International Conference on Software Analysis, Evolution and Reengineering),
    2023.'
conference:
  name: IEEE SANER 2023 (International Conference on Software Analysis, Evolution
    and Reengineering)
date_created: 2023-01-13T08:03:26Z
date_updated: 2025-04-07T10:18:03Z
ddc:
- '000'
doi: 10.48550/ARXIV.2301.04419
file:
- access_level: open_access
  content_type: application/pdf
  creator: ashwin
  date_created: 2023-01-26T10:48:40Z
  date_updated: 2023-01-26T10:48:40Z
  file_id: '40304'
  file_name: 2301.04419.pdf
  file_size: 1862440
  relation: main_file
file_date_updated: 2023-01-26T10:48:40Z
has_accepted_license: '1'
keyword:
- static analysis
- python
- code comprehension
- annotation
- literate programming
- jupyter notebook
language:
- iso: eng
oa: '1'
publisher: IEEE SANER 2023 (International Conference on Software Analysis, Evolution
  and Reengineering)
status: public
title: Enhancing Comprehension and Navigation in Jupyter Notebooks with Static Analysis
type: conference
user_id: '15249'
year: '2023'
...
---
_id: '23388'
abstract:
- lang: eng
  text: As one of the most popular programming languages, PYTHON has become a relevant
    target language for static analysis tools. The primary data structure for performing
    an inter-procedural static analysis is the call-graph (CG), which links call sites
    to potential call targets in a program. There exist multiple algorithms for constructing
    callgraphs, tailored to specific languages. However, comparatively few implementations
    target PYTHON. Moreover, there is still a lack of empirical evidence as to how these
    few algorithms perform in terms of precision and recall. This paper thus presents
    EVAL_CG, an extensible framework for comparative analysis of Python call-graphs.
    We conducted two experiments which run the CG algorithms on different Python programming
    constructs and real-world applications. In both experiments, we evaluate three
    CG generation frameworks, namely Code2flow, Pyan, and Wala. We record precision,
    recall, and running time, and identify sources of unsoundness of each framework.
    Our evaluation shows that none of the current CG construction frameworks produce
    a sound CG. Moreover, the static CGs contain many spurious edges. Code2flow is
    also comparatively slow. Hence, further research is needed to support CG generation
    for Python programs.
author:
- first_name: Sriteja
  full_name: Kummita, Sriteja
  id: '72582'
  last_name: Kummita
- first_name: Goran
  full_name: Piskachev, Goran
  id: '41936'
  last_name: Piskachev
  orcid: 0000-0003-4424-5838
- first_name: Johannes
  full_name: Spaeth, Johannes
  last_name: Spaeth
- first_name: Eric
  full_name: Bodden, Eric
  id: '59256'
  last_name: Bodden
  orcid: 0000-0003-3470-3647
citation:
  ama: 'Kummita S, Piskachev G, Spaeth J, Bodden E. Qualitative and Quantitative Analysis
    of Callgraph Algorithms for PYTHON. In: <i>Proceedings of the 2021 International
    Conference on Code Quality (ICCQ)</i>. ; 2021. doi:<a href="https://doi.org/10.1109/ICCQ51190.2021.9392986">10.1109/ICCQ51190.2021.9392986</a>'
  apa: Kummita, S., Piskachev, G., Spaeth, J., &#38; Bodden, E. (2021). Qualitative
    and Quantitative Analysis of Callgraph Algorithms for PYTHON. In <i>Proceedings
    of the 2021 International Conference on Code Quality (ICCQ)</i>. Virtual. <a href="https://doi.org/10.1109/ICCQ51190.2021.9392986">https://doi.org/10.1109/ICCQ51190.2021.9392986</a>
  bibtex: '@inproceedings{Kummita_Piskachev_Spaeth_Bodden_2021, title={Qualitative
    and Quantitative Analysis of Callgraph Algorithms for PYTHON}, DOI={<a href="https://doi.org/10.1109/ICCQ51190.2021.9392986">10.1109/ICCQ51190.2021.9392986</a>},
    booktitle={Proceedings of the 2021 International Conference on Code Quality (ICCQ)},
    author={Kummita, Sriteja and Piskachev, Goran and Spaeth, Johannes and Bodden,
    Eric}, year={2021} }'
  chicago: Kummita, Sriteja, Goran Piskachev, Johannes Spaeth, and Eric Bodden. “Qualitative
    and Quantitative Analysis of Callgraph Algorithms for PYTHON.” In <i>Proceedings
    of the 2021 International Conference on Code Quality (ICCQ)</i>, 2021. <a href="https://doi.org/10.1109/ICCQ51190.2021.9392986">https://doi.org/10.1109/ICCQ51190.2021.9392986</a>.
  ieee: S. Kummita, G. Piskachev, J. Spaeth, and E. Bodden, “Qualitative and Quantitative
    Analysis of Callgraph Algorithms for PYTHON,” in <i>Proceedings of the 2021 International
    Conference on Code Quality (ICCQ)</i>, Virtual, 2021.
  mla: Kummita, Sriteja, et al. “Qualitative and Quantitative Analysis of Callgraph
    Algorithms for PYTHON.” <i>Proceedings of the 2021 International Conference on
    Code Quality (ICCQ)</i>, 2021, doi:<a href="https://doi.org/10.1109/ICCQ51190.2021.9392986">10.1109/ICCQ51190.2021.9392986</a>.
  short: 'S. Kummita, G. Piskachev, J. Spaeth, E. Bodden, in: Proceedings of the 2021
    International Conference on Code Quality (ICCQ), 2021.'
conference:
  location: Virtual
  name: International Conference on Code Quality (ICCQ)
  start_date: 2021-03-27
date_created: 2021-08-12T14:00:54Z
date_updated: 2022-01-06T06:55:52Z
doi: 10.1109/ICCQ51190.2021.9392986
keyword:
- Static Analysis
- Callgraph Analysis
- Python
- Qualitative Analysis
- Quantitative Analysis
- Empirical Evaluation
language:
- iso: eng
main_file_link:
- url: https://ieeexplore.ieee.org/document/9392986
publication: Proceedings of the 2021 International Conference on Code Quality (ICCQ)
publication_identifier:
  isbn:
  - 978-1-7281-8477-7
publication_status: published
status: public
title: Qualitative and Quantitative Analysis of Callgraph Algorithms for PYTHON
type: conference
user_id: '72582'
year: '2021'
...
