---
_id: '36522'
abstract:
- lang: eng
  text: "Jupyter notebooks enable developers to interleave code snippets with rich-text
    and in-line visualizations. Data scientists use Jupyter notebooks as the de facto
    standard for creating and sharing machine-learning based solutions, primarily
    written in Python. Recent studies have demonstrated, however, that a large portion
    of Jupyter notebooks available on public platforms are undocumented and lack
    a narrative structure. This reduces the readability of these notebooks. To address
    this shortcoming, this paper presents HeaderGen, a novel tool-based approach that
    automatically annotates code cells with categorical markdown headers based on
    a taxonomy of machine-learning operations, and classifies and displays function
    calls according to this taxonomy. For this functionality to be realized, HeaderGen
    enhances an existing call graph analysis in PyCG. To improve precision, HeaderGen
    extends PyCG's analysis with support for handling external library code and flow-sensitivity.
    The former is realized by facilitating the resolution of function return-types.
    Furthermore, HeaderGen uses type information to perform pattern matching on code
    syntax to annotate code cells.\r\nThe evaluation on 15 real-world Jupyter notebooks
    from Kaggle shows that HeaderGen's underlying call graph analysis yields high
    accuracy (96.4% precision and 95.9% recall). This is because HeaderGen can resolve
    return-types of external libraries where existing type inference tools such as
    pytype (by Google), pyright (by Microsoft), and Jedi fall short. The header generation
    has a precision of 82.2% and a recall rate of 96.8% with regard to headers created
    manually by experts. In a user study, HeaderGen helps participants finish comprehension
    and navigation tasks faster. All participants clearly perceive HeaderGen as useful
    to their task."
author:
- first_name: Ashwin Prasad
  full_name: Shivarpatna Venkatesh, Ashwin Prasad
  id: '66637'
  last_name: Shivarpatna Venkatesh
- first_name: Jiawei
  full_name: Wang, Jiawei
  last_name: Wang
- first_name: Li
  full_name: Li, Li
  last_name: Li
- first_name: Eric
  full_name: Bodden, Eric
  id: '59256'
  last_name: Bodden
  orcid: 0000-0003-3470-3647
citation:
  ama: 'Shivarpatna Venkatesh AP, Wang J, Li L, Bodden E. Enhancing Comprehension
    and Navigation in Jupyter Notebooks with Static Analysis. In: IEEE SANER 2023
    (International Conference on Software Analysis, Evolution and Reengineering);
    2023. doi:<a href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>'
  apa: Shivarpatna Venkatesh, A. P., Wang, J., Li, L., &#38; Bodden, E. (2023). <i>Enhancing
    Comprehension and Navigation in Jupyter Notebooks with Static Analysis</i>. IEEE
    SANER 2023 (International Conference on Software Analysis, Evolution and Reengineering).
    <a href="https://doi.org/10.48550/ARXIV.2301.04419">https://doi.org/10.48550/ARXIV.2301.04419</a>
  bibtex: '@inproceedings{Shivarpatna Venkatesh_Wang_Li_Bodden_2023, title={Enhancing
    Comprehension and Navigation in Jupyter Notebooks with Static Analysis}, DOI={<a
    href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>},
    publisher={IEEE SANER 2023 (International Conference on Software Analysis, Evolution
    and Reengineering)}, author={Shivarpatna Venkatesh, Ashwin Prasad and Wang, Jiawei
    and Li, Li and Bodden, Eric}, year={2023} }'
  chicago: Shivarpatna Venkatesh, Ashwin Prasad, Jiawei Wang, Li Li, and Eric Bodden.
    “Enhancing Comprehension and Navigation in Jupyter Notebooks with Static Analysis.”
    IEEE SANER 2023 (International Conference on Software Analysis, Evolution and
    Reengineering), 2023. <a href="https://doi.org/10.48550/ARXIV.2301.04419">https://doi.org/10.48550/ARXIV.2301.04419</a>.
  ieee: 'A. P. Shivarpatna Venkatesh, J. Wang, L. Li, and E. Bodden, “Enhancing Comprehension
    and Navigation in Jupyter Notebooks with Static Analysis,” presented at the IEEE
    SANER 2023 (International Conference on Software Analysis, Evolution and Reengineering),
    2023, doi: <a href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>.'
  mla: Shivarpatna Venkatesh, Ashwin Prasad, et al. <i>Enhancing Comprehension and
    Navigation in Jupyter Notebooks with Static Analysis</i>. IEEE SANER 2023 (International
    Conference on Software Analysis, Evolution and Reengineering), 2023, doi:<a href="https://doi.org/10.48550/ARXIV.2301.04419">10.48550/ARXIV.2301.04419</a>.
  short: 'A.P. Shivarpatna Venkatesh, J. Wang, L. Li, E. Bodden, in: IEEE SANER 2023
    (International Conference on Software Analysis, Evolution and Reengineering),
    2023.'
conference:
  name: IEEE SANER 2023 (International Conference on Software Analysis, Evolution
    and Reengineering)
date_created: 2023-01-13T08:03:26Z
date_updated: 2025-04-07T10:18:03Z
ddc:
- '000'
doi: 10.48550/ARXIV.2301.04419
file:
- access_level: open_access
  content_type: application/pdf
  creator: ashwin
  date_created: 2023-01-26T10:48:40Z
  date_updated: 2023-01-26T10:48:40Z
  file_id: '40304'
  file_name: 2301.04419.pdf
  file_size: 1862440
  relation: main_file
file_date_updated: 2023-01-26T10:48:40Z
has_accepted_license: '1'
keyword:
- static analysis
- python
- code comprehension
- annotation
- literate programming
- jupyter notebook
language:
- iso: eng
oa: '1'
publisher: IEEE SANER 2023 (International Conference on Software Analysis, Evolution
  and Reengineering)
status: public
title: Enhancing Comprehension and Navigation in Jupyter Notebooks with Static Analysis
type: conference
user_id: '15249'
year: '2023'
...
