# Memory Leak in Jupyter Notebooks and How to Fix It

Nima Sarang

March 7, 2026
I've been using Jupyter notebooks on and off for about a decade, and more often than not the session ends up sluggish after a few hours. I sometimes had the feeling there must be a memory leak somewhere, but I brushed it off: it seemed reasonable to assume a tool this popular wouldn't have such an obvious issue. That changed during a recent project working with big data, where I consistently hit out-of-memory (OOM) errors. To my frustration, even when I did my best to free up memory and invoke the garbage collector immediately, it usually had no effect. It finally annoyed me enough to investigate :)
To my surprise, the memory leak actually comes from a “feature” that is enabled by default, and there’s no easy way to opt out, as far as I understand.
As you likely know, there's an easy way to quickly inspect an object in Jupyter: put it at the end of a notebook cell. Jupyter attempts to print or render the object depending on which methods it defines, e.g. `_repr_html_`, `_repr_png_`, `_repr_latex_`, etc. Most often you use this to check the value of a variable, display a pandas DataFrame, show a plot, and so on. This is a common pattern. Here are some examples:
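For instance, a small DataFrame like the one rendered below (a sketch of the pattern; the variable name `df` is my assumption):

```python
import pandas as pd

# A small DataFrame matching the table below; leaving `df` as the
# last expression in the cell makes Jupyter render it via `_repr_html_`.
df = pd.DataFrame({"A": range(5), "B": range(100, 105)})
df
```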
|   | A | B |
|---|---|---|
| 0 | 0 | 100 |
| 1 | 1 | 101 |
| 2 | 2 | 102 |
| 3 | 3 | 103 |
| 4 | 4 | 104 |
```
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
```
The memory leak is that each of the objects above is kept in a Jupyter namespace. Why? So you can reference the output objects through the `Out` dictionary.
```
<class 'pandas.core.frame.DataFrame'>
```

|   | A | B |
|---|---|---|
| 0 | 0 | 100 |
| 1 | 1 | 101 |
| 2 | 2 | 102 |
| 3 | 3 | 103 |
| 4 | 4 | 104 |
This is actually bad. In the previous example, I defined a 10,000 × 10,000 `float64` array, which consumes around 800 MB. Even though the displayed output was truncated, Jupyter still keeps a reference to the full array. As a result, running `del arr` will not free the memory.
To validate the hypothesis, let's design an experiment. We'll create a NumPy array like before, display it, and try to remove all references to it.
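The setup cell might look like this (a sketch — the array size follows from the 800 MB figure above, and displaying `arr` as the last expression is what triggers the caching):

```python
import numpy as np

# 10_000 * 10_000 float64 values at 8 bytes each => ~800 MB
arr = np.zeros((10_000, 10_000))
arr  # displayed as the cell's output => Jupyter caches a reference
```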
```
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
```
Now, we'll use `weakref` to keep a handle on the `arr` object. A weak reference doesn't prevent garbage collection, which lets us determine whether an object was truly freed without creating an additional strong reference in the process. For example:
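A minimal sketch of the check (the names `a` and `a_ref` mirror the output below; since built-in ints and lists can't be weakly referenced, I use a trivial class as a stand-in):

```python
import weakref

class Box:
    """A trivial class; plain ints and lists don't support weak references."""
    pass

a = Box()
a_ref = weakref.ref(a)
print("1. Is `a_ref` pointing to `a`?", a_ref() is a)   # True

del a  # drop the only strong reference
print("2. Is `a` still alive?", a_ref() is not None)    # False
```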
```
1. Is `a_ref` pointing to `a`? True
2. Is `a` still alive? False
```
We can see `a` was cleared by the garbage collector once its reference count reached zero. Applying the same test to `arr`, which was displayed at the end of a cell, we get:
```
Is `arr` still alive? True
```
As expected, the object is still alive. Maybe it'll go away if we `del Out[5]`?
Will you look at that! Just how many other references are made to the object?
3 other references, made by various Jupyter sub-components. This means it's not straightforward to free these objects once they've been added to Jupyter's internals, since without a reference you wouldn't even know where to start.
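One way to enumerate who is still holding an object is `gc.get_referrers` — a sketch with stand-in dicts, since the post doesn't show the exact tooling it used:

```python
import gc

class Payload:
    pass

payload = Payload()
cache = {"Out[5]": payload}   # stand-in for IPython's Out cache
extra = {"_": payload}        # stand-in for the `_` convenience variable

# gc.get_referrers lists every container still holding `payload`,
# including this module's globals dict.
holders = gc.get_referrers(payload)
print(any(h is cache for h in holders))  # True
print(any(h is extra for h in holders))  # True
```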
IPython's displayhook has a method called `update_user_ns`. This is the single place where all those references get written, under keys such as `_`, `__`, `___`, `_n`, and `Out[n]`. So the solution is to simply no-op the method:
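A sketch of the patch (the method name comes from the text; the lambda's single `result` argument is my assumption about how the hook is called):

```python
try:
    from IPython import get_ipython
except ImportError:                      # not an IPython environment
    get_ipython = lambda: None

def disable_output_caching() -> None:
    """Stop IPython from caching cell outputs under Out, _, __, ___ and _n."""
    ip = get_ipython()
    if ip is None:                       # plain Python interpreter: nothing to patch
        return
    # Replace the hook that writes each result into the user namespace.
    ip.displayhook.update_user_ns = lambda result: None

disable_output_caching()
```

Run this early in a session and displayed outputs should no longer be retained, so `del` behaves as you'd expect. Note that the `Out` dictionary (and `_`) stop working from that point on, which is the trade-off.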
And just like that, it’s gone. It took me five hours to debug this, but I’m glad there’s at least a viable solution. You’re welcome ;)
```bibtex
@online{sarang2026,
  author = {Sarang, Nima},
  title  = {Memory {Leak} in {Jupyter} {Notebooks} and {How} to {Fix} {It}},
  date   = {2026-03-07},
  url    = {https://www.nimasarang.com/blog/2026-03-07-jupyter-memory-leak/},
  langid = {en}
}
```