Ticket #49 (closed defect: fixed)

Opened 14 months ago

Last modified 12 months ago

Reference cycle in HDUs causing memory leak

Reported by: stsci_embray Owned by: stsci_embray
Priority: high Milestone: 3.0.0
Version: 2.4 Severity: major
Keywords: Cc: zonca@…

Description (last modified by stsci_embray) (diff)

The following sample sent in by a user (Andrea Zonca; e-mail address in the CC list) illustrates the problem:

import objgraph
import pyfits
import gc
import numpy as np

print("Create table")
#a=pyfits.Column(name='',format='K',array=[1,2,3]) # Column instead correctly frees up memory
a=pyfits.new_table([pyfits.Column(name='',format='K',array=[1,2,3])])

print("Delete table")
del a

print("Garbage collect")
gc.collect()
gc.collect()
gc.collect()

print("Still there")
print(objgraph.by_type('recarray')[0])

output:

output:

Create table
Delete table
Garbage collect
Still there
[(1L,) (2L,) (3L,)]

Change History

  Changed 14 months ago by stsci_embray

  • status changed from new to assigned

After some exploring I found that the ColDefs object is still in memory as well.

The problem, I think, is this: When the HDU object's data object is created, an additional reference to the HDU's columns object is made in data._coldefs (i.e. data._coldefs = self.columns).

When the HDU is deleted, the data and columns objects are referencing each other, and are not deleted.

  Changed 14 months ago by stsci_embray

The problem goes deeper than I thought:

The way new_table() works, it actually makes a copy of the data in the data arrays of all the input columns. I believe this is by design.

It does this by first creating an empty recarray (actually wrapped by a FITS_rec) large enough to hold the data in all the columns. It then basically copies the column arrays into the new FITS_rec.

Finally, and this is the part that causes problems, it loops through the column arrays and overrides them like so: column.array = hdu.data.field(idx). This ensures that accessing column.array is equivalent to data.field(idx), and that all the column arrays are using the same underlying ndarray as the FITS_rec.

Seems reasonable, right? The problem is, if you look in the rec.recarray.field() implementation, you see that it does something like:

obj = self.getfield(...)
return obj.view(numpy.ndarray)

Here obj is a FITS_rec object, since it's just a view of self, which in this case is the FITS_rec in hdu.data. So the returned ndarray has a FITS_rec as its base, which in turn has the original FITS_rec as its base. This creates a nasty reference cycle.

I have a few possible workarounds to this, none of which I'm completely satisfied with yet...

follow-up: ↓ 4   Changed 14 months ago by stsci_embray

Should be fixed in r49, but I would like to get feedback from the user who reported it.

in reply to: ↑ 3   Changed 14 months ago by stsci_embray

Replying to stsci_embray:

Should be fixed in r49, but I would like to get feedback from the user who reported it.

That should be r844.

  Changed 14 months ago by stsci_embray

  • cc zonca@… added
  • description modified (diff)

  Changed 14 months ago by stsci_embray

  • status changed from assigned to closed
  • resolution set to fixed

Andrea confirmed that the memory leak was fixed for him. He also pointed out a problem with pyfits.new_table() copying data [which I believe is by design] but I opened an issue for it in #51.

  Changed 14 months ago by stsci_embray

  • status changed from closed to reopened
  • resolution fixed deleted

Reopening.

Nadia pointed out a test in stsci_python that was failing; specifically, the problem was with stsci_python.stistools.mktrace, which contains code that keeps a reference to an HDU's data, and then allows the HDU to fall out of scope. This causes the data._coldefs attribute to resolve to None, since it was a weakref to the HDU object's .columns.

Using weakrefs is fine for internal implementation details. But in this case it touches users' code too closely, and thus probably shouldn't be used since they cause too many surprises.

I will have to implement a solution that does not rely on weakrefs.

  Changed 14 months ago by stsci_embray

I implemented a new fix for this in r846. It does not use weakrefs, and should eliminate the reference cycle altogether.

Andrea: you might want to double-check that this new version works with your code.

Thanks.

  Changed 12 months ago by stsci_embray

  • version set to 2.4
  • milestone set to 3.0.0

  Changed 12 months ago by stsci_embray

  • status changed from reopened to closed
  • resolution set to fixed

Reports seem to suggest that the new fix works.

Note: See TracTickets for help on using tickets.