Revamp of the I3Calorimetry extractor. by Aske-Rosted · Pull Request #866 · graphnet-team/graphnet

Aske-Rosted · 2026-01-27T06:37:18Z

The changes to the I3Calorimetry extractor introduced by #819 were developed using low energy events. While these changes did improve the correctness of the labels, they unfortunately were not written in a way that made them suitable for high energy files.

This PR rethinks the approach by adding a hierarchy to the energy counting: track energies take priority over cascade energies.

If a track is found, its deposited energy inside the detector volume is recorded, and the particle along with its sub-tree is then removed using frame[self.mctree].erase(track.id). The cascades are then counted by looking only at the leaf nodes in the mctree, which ensures that we are not double counting inside the cascades themselves and that we only count cascades terminating inside the detector volume.

This should ensure that we are eliminating both of the double counting scenarios explained in #816.
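The counting order described above can be illustrated with a small, self-contained sketch. It does not use icetray: `ToyParticle`, `ToyTree`, `count_energies` and all numbers are hypothetical stand-ins chosen only to show the idea of "tracks first, erase the sub-tree, then sum the remaining cascade leaves".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyParticle:
    """Toy stand-in for a tree particle (not the icetray I3Particle)."""

    pid: int
    kind: str  # "primary", "track" or "cascade"
    energy: float
    deposited: float = 0.0  # energy a track deposits inside the volume
    in_volume: bool = True  # whether a cascade lies inside the volume


class ToyTree:
    """Toy stand-in for the MC tree; erase() drops a whole sub-tree."""

    def __init__(self) -> None:
        self.particles: Dict[int, ToyParticle] = {}
        self.children: Dict[int, List[int]] = {}
        self.parent: Dict[int, Optional[int]] = {}

    def add(self, p: ToyParticle, parent: Optional[int] = None) -> None:
        self.particles[p.pid] = p
        self.children[p.pid] = []
        self.parent[p.pid] = parent
        if parent is not None:
            self.children[parent].append(p.pid)

    def erase(self, pid: int) -> None:
        # Erase descendants first, then detach and delete the node itself.
        for child in list(self.children[pid]):
            self.erase(child)
        parent = self.parent.pop(pid)
        if parent is not None:
            self.children[parent].remove(pid)
        del self.children[pid]
        del self.particles[pid]

    def leaves(self) -> List[ToyParticle]:
        return [p for pid, p in self.particles.items() if not self.children[pid]]


def count_energies(tree: ToyTree) -> Tuple[float, float]:
    """Return (track deposit, cascade energy) with tracks taking priority."""
    e_track = 0.0
    track_ids = [pid for pid, p in tree.particles.items() if p.kind == "track"]
    for pid in track_ids:
        if pid not in tree.particles:
            continue  # already erased together with an ancestor track
        e_track += tree.particles[pid].deposited
        tree.erase(pid)
    # Only leaf cascades inside the volume are counted after the erase.
    e_cascade = sum(
        p.energy for p in tree.leaves() if p.kind == "cascade" and p.in_volume
    )
    return e_track, e_cascade


def build_example() -> ToyTree:
    tree = ToyTree()
    tree.add(ToyParticle(0, "primary", 100.0))
    tree.add(ToyParticle(1, "track", 60.0, deposited=20.0), parent=0)
    tree.add(ToyParticle(2, "cascade", 5.0), parent=1)  # loss along the track
    tree.add(ToyParticle(3, "cascade", 40.0), parent=0)
    tree.add(ToyParticle(4, "cascade", 7.0, in_volume=False), parent=0)
    return tree
```

In this toy event the 5.0 loss along the track is removed with the track's sub-tree, so it is not counted a second time as a cascade, and the cascade outside the volume is skipped.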

sevmag · 2026-04-20T10:04:29Z

Hi @Aske-Rosted, I'm currently reviewing the changes. Just a small question. What do you mean by

were not written in a way that made them suitable for high energy files.

Are you talking about the correctness of the labels or performance issues?

sevmag · 2026-04-20T11:01:16Z

I have some concerns regarding whether the current track handling produces the intended outcome. 🧐

Hypothetical Scenario 🧪

Consider a NuTau CC interaction that creates a tau. The tau enters the detector volume and decays into a muon mid-detector, which then exits the volume.

This results in a track_list like this:

track_list = [tau, mu]

In this case, the mu is a child in the tau's subtree.

When iterating through the list as seen here, the result seems highly dependent on the ordering of track_list:

Scenario 1: The tau is processed first 📍

In this case, we only iterate over the tau. Because the muon is in the tau's subtree, it is erased after the tau is processed.

e_deposited: Only includes energy from the tau (entry to decay). It misses the energy deposited by the muon. ❌
e_entrance: Correctly identifies the tau energy upon entering the detector. ✅

Scenario 2: The muon is processed first 📍

Here, we iterate over the muon first, then the tau.

e_deposited: Correct; it sums the energy of both the muon and the tau. ✅
e_entrance: Incorrect; the entrance energy of both particles is counted, even though only the tau should be considered the primary entering particle. ❌

Since the behavior changes based on list order, we might need to implement a more robust check for parent-child relationships before processing. What do you think?
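The order dependence in the two scenarios can be reproduced with a toy model. This is only a sketch: `ToyTrack`, `process_with_erase`, `order_independent` and the energies are invented for illustration, and `parent` is assumed to name the nearest ancestor that is itself in the track list.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class ToyTrack:
    """Toy stand-in for a track entry; values are made up."""

    name: str
    parent: Optional[str]  # nearest ancestor in the track list, if any
    e_entrance: float
    e_deposited: float


def descendants(name: str, tracks: List[ToyTrack]) -> Set[str]:
    found: Set[str] = set()
    frontier = [name]
    while frontier:
        current = frontier.pop()
        for t in tracks:
            if t.parent == current and t.name not in found:
                found.add(t.name)
                frontier.append(t.name)
    return found


def process_with_erase(track_list: List[ToyTrack]) -> Tuple[float, float]:
    """Mimic 'count the track, then erase its sub-tree'."""
    erased: Set[str] = set()
    e_entrance = 0.0
    e_deposited = 0.0
    for track in track_list:
        if track.name in erased:
            continue
        e_entrance += track.e_entrance
        e_deposited += track.e_deposited
        erased |= descendants(track.name, track_list)
    return e_entrance, e_deposited


def order_independent(track_list: List[ToyTrack]) -> Tuple[float, float]:
    """One possible parent-child check: entrance energy only from tracks
    without an ancestor in the list, deposits from every track."""
    names = {t.name for t in track_list}
    e_entrance = 0.0
    e_deposited = 0.0
    for track in track_list:
        e_deposited += track.e_deposited
        if track.parent not in names:
            e_entrance += track.e_entrance
    return e_entrance, e_deposited


tau = ToyTrack("tau", None, e_entrance=100.0, e_deposited=10.0)
mu = ToyTrack("mu", "tau", e_entrance=30.0, e_deposited=15.0)
```

With these toy numbers, `process_with_erase([tau, mu])` misses the muon deposit and `process_with_erase([mu, tau])` double counts the entrance energy, while `order_independent` gives the same pair for both orderings.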

sevmag

I really like the direction this is going, and I think we can increase its speed quite a bit by leveraging the tree-like structure. However, I do still have some concerns about the handling of the track particles (see my comment). Let me know if there are any questions!

sevmag · 2026-04-20T10:13:31Z

+            primary_energy = sum(
+                [
+                    p.energy
+                    for p in self.check_primary_energy(


You use the check_primary_energy function from the I3Extractor. Could you explain in which scenarios the energy of the primary particle is NaN? The remedy in these scenarios is to take the daughters; are we entirely sure that the daughters always have a non-NaN energy? Might be worth refining the docstring for this function in

graphnet/src/graphnet/data/extractors/icecube/i3extractor.py

Line 112 in 4bcc607

def check_primary_energy(

since I actually don't know why

In some rare instances the energy of the particle is just not set. This is something that happens upstream in the simulation process; I don't know exactly why. It has been some time since I looked at it, but I believe that there will be a daughter of the same type as the parent particle with an actual energy defined.

Maybe let's just be sure to check that all of the energies of the daughters that are set as the new primaries, after the line below, do have a non-nan energy, and otherwise, raise an error.

graphnet/src/graphnet/data/extractors/icecube/i3extractor.py

Line 163 in 4bcc607

primary.append(daughter)
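A check along the lines suggested above could look like the following sketch. It is independent of icetray: `ToyParticle` and `require_finite_energies` are hypothetical names, and the choice of `ValueError` is an assumption, not something taken from the graphnet code.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class ToyParticle:
    """Toy stand-in for a primary or daughter particle."""

    name: str
    energy: float


def require_finite_energies(primaries: List[ToyParticle]) -> List[ToyParticle]:
    """Raise if any particle selected as a primary still has a NaN energy."""
    bad = [p.name for p in primaries if math.isnan(p.energy)]
    if bad:
        raise ValueError(f"Primaries with NaN energy after substitution: {bad}")
    return primaries
```

Calling such a helper on the list right after the daughters have been appended would turn a silent NaN into an explicit failure.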

sevmag · 2026-04-20T10:24:55Z

+                    p.energy
+                    for p in self.check_primary_energy(
+                        frame,
+                        self.get_primaries(


self.get_primaries is called at three locations. Maybe let's call it once at the beginning of __call__, save the result in a variable, and pass it to total_track_energy and total_cascade_energy.

sevmag · 2026-04-20T10:25:43Z

+        """Get the total energy of track particles on entrance."""
+        e_entrance = 0
+        e_deposited = 0
+        primaries = self.get_primaries(


See comment above for get_primaries usage

sevmag · 2026-04-20T10:25:49Z

+    ) -> float:
+        """Get the total energy of cascade particles on entrance."""
+        particles = deque(
+            self.get_primaries(


See comment above for get_primaries usage

sevmag · 2026-04-20T10:33:31Z

+        # Sanity check ensuring no double counting
+        if self.daughters:
+            assert e_entrance <= sum(
+                [p.energy for p in primaries]


The sum of primary energies has already been calculated on line 70. Reuse this value for consistency.

sevmag · 2026-04-20T10:33:49Z

+            assert e_entrance <= sum(
+                [p.energy for p in primaries]
+            ), "Energy on entrance is greater than primary energy"
+            assert e_deposited <= sum(


The sum of primary energies has already been calculated on line 70. Reuse this value for consistency.

sevmag · 2026-04-20T10:37:22Z

+                except RuntimeError as e:
+                    if "particle not found" in str(e):
+                        # log warning with event header
+                        self.warning(


How can it be that the primary particle is not found anymore? Are we sure it is safe to keep this as a warning?

sevmag · 2026-04-20T10:40:01Z


-        return e_cascade, e_dep_track, e_ent_track
+            # Check if the track actually enters the volume
+            if not (


Maybe add a comment on why this check also works for starting events. (Then the intersection.first is negative, correct?)
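If the intersection distances are measured along the track direction from the particle vertex, as the question above assumes, the check reduces to an interval overlap between the track segment [0, length] and the volume segment [first, second]. The sketch below shows that reasoning only. `overlaps_volume` is a hypothetical helper, and treating NaN as "no intersection" is an assumption made for this toy, not a statement about the MuonGun API.

```python
import math


def overlaps_volume(first: float, second: float, length: float) -> bool:
    """True if the segment [0, length] overlaps the segment [first, second].

    Distances are assumed to be measured from the track vertex along its
    direction, so a starting track has first < 0 < second.
    """
    if math.isnan(first) or math.isnan(second):
        return False  # assumed convention for "no intersection" in this toy
    return second > 0.0 and first < length
```

Under this assumption a starting event passes the same test as a through-going one, because a negative `first` is always smaller than the track length.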

sevmag · 2026-04-20T10:41:52Z

-        pos = pos + direc * length
+            except RuntimeError as e:
+                if (
+                    "sum of losses is smaller than "


I'm still not sure what the reason for this MuonGun error is, but I think we should not just ignore it and throw a warning, but rather set the energies to NaN to make sure that we are not just using a corrupt ground truth that might not resemble reality at all. What do you think @Aske-Rosted ?

I think it is okay to work with what we have; this also only happens very rarely. I also believe that this should be fixed in newer versions of IceTray: https://github.com/icecube/icetray/pull/3052

ok perfect! I still think we shouldn't just ignore that track and move on to the next. In my experience, that can lead to very strange energy labels, especially when one increases the padding of the hull to large labels. I think we should just put the energy labels to NaN, signaling that we cannot give the correct label for these events. When it is super rare, this won't affect your sample statistics anyway, and we will be on the safe side. What do you think?
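The NaN fallback proposed here could be sketched as follows. The wrapper `energies_or_nan` and the callable it receives are hypothetical. The label order follows the `e_cascade, e_dep_track, e_ent_track` return seen in the diff, and only the message prefix quoted in the diff is matched.

```python
import math
from typing import Callable, Tuple

Labels = Tuple[float, float, float]  # (e_cascade, e_dep_track, e_ent_track)

# Prefix of the error message as it appears in the diff above.
KNOWN_ERROR_PREFIX = "sum of losses is smaller than "


def energies_or_nan(compute: Callable[[], Labels]) -> Labels:
    """Run the label computation; return NaN labels on the known error."""
    try:
        return compute()
    except RuntimeError as e:
        if KNOWN_ERROR_PREFIX in str(e):
            # Signal that no trustworthy label exists for this event.
            return (math.nan, math.nan, math.nan)
        raise  # unknown runtime errors are not swallowed
```

Downstream code can then drop or mask events whose labels are NaN instead of training on a partially summed energy.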

sevmag · 2026-04-20T11:05:54Z

+        )

-        return self.hull.point_in_hull(pos)
+        if len(particles) == 0:


This scenario should never happen, should it? Maybe we should throw an error here

This could happen for a single through-going track event since we have removed the track particle and all the sub-particles from the mctree.

Aske-Rosted · 2026-05-27T06:05:38Z

I have some concerns regarding whether the current track handling produces the intended outcome. 🧐

Hypothetical Scenario 🧪

Consider a NuTau CC interaction that creates a tau. The tau enters the detector volume and decays into a muon mid-detector, which then exits the volume.

This results in a track_list like this:

track_list = [tau, mu]

In this case, the mu is a child in the tau's subtree.

When iterating through the list as seen here, the result seems highly dependent on the ordering of track_list:

Scenario 1: The tau is processed first 📍

In this case, we only iterate over the tau. Because the muon is in the tau's subtree, it is erased after the tau is processed.

e_deposited: Only includes energy from the tau (entry to decay). It misses the energy deposited by the muon. ❌
e_entrance: Correctly identifies the tau energy upon entering the detector. ✅

Scenario 2: The muon is processed first 📍

Here, we iterate over the muon first, then the tau.

e_deposited: Correct; it sums the energy of both the muon and the tau. ✅
e_entrance: Incorrect; the entrance energy of both particles is counted, even though only the tau should be considered the primary entering particle. ❌

Since the behavior changes based on list order, we might need to implement a more robust check for parent-child relationships before processing. What do you think?

The ordering of the MCTree is not random: if the muon is a product of the tau, then it will be in the subtree of the tau, i.e. the tau is always processed first. The Muon.Track.Harvest iterates over the MCTree, so this ordering should be conserved. https://github.com/icecube/icetray/blob/7d7982c84148d7e66541e0f953be01f80b061859/ddddr/private/ddddr/MuonGunTrack.cxx#L136

I think you might be correct in terms of stopping tracks: I think we are then missing the energy of the resulting cascade. However, when fixing this we also have to consider the effects of dark (non-light-producing) particles produced in the cascade, which might remove some energy from the deposit.

Aske-Rosted · 2026-05-27T06:46:14Z

+            e_entrance += e0
+            # get descendant ids
+            # erase particle and children from mctree
+            frame[self.mctree].erase(track.id)


The below alteration I believe should fix the missing energies for stopping particles.

Suggested change

-            frame[self.mctree].erase(track.id)
+            if (e1 == 0) or (intersections.second < particle.length):
+                frame[self.mctree].erase(track.id)

Particles are not simulated once they leave the detector (+ some margin), so these tracks probably won't have children anyway, most of the time. So in that case, deleting the subtree is probably unnecessary anyway, and we would probably be safer just to never do it.

sevmag · 2026-05-27T16:23:20Z

                    if (
-                        particle.shape
-                        != dataclasses.I3Particle.ParticleShape.Dark
+                        frame[self.mctree].get_primary(track.GetI3Particle())


Actually, this check is a problem.

The primaries variable that we compare against here does not necessarily need to be at the top level of the tree. This is because our get_primaries sometimes goes down the tree to find the first in-ice neutrino. The get_primary method of the icetray MCTree only gives you the primary of the very top of the tree. This is problematic since we check whether this particle is in our primaries list. This means we will ignore these tracks, which is a big problem.

we should stick to the code from main

graphnet/src/graphnet/data/extractors/icecube/i3calorimetry.py

Lines 92 to 103 in 368927d

MMCTrackList = frame[self.mmctracklist]
# Filter tracks that are not daughters of the desired
if self.daughters:
    temp_MMCTrackList = []
    for track in MMCTrackList:
        for p in primaries:
            if frame[self.mctree].is_in_subtree(
                p.id, track.GetI3Particle().id
            ):
                temp_MMCTrackList.append(track)
                break
    MMCTrackList = temp_MMCTrackList

it addresses this problem
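The difference between the two checks can be seen on a toy tree in which the selected primary is not the top-level particle. Everything below is a hypothetical stand-in: the `PARENT` mapping, `top_level` and `is_in_subtree` only mimic the behaviour described in this comment and are not the icetray methods. The toy `is_in_subtree` is assumed to count the root itself as part of its subtree.

```python
from typing import Dict, List, Optional

# Toy tree: the selected primary ("nu_in_ice") sits below the top-level node.
PARENT: Dict[str, Optional[str]] = {
    "top": None,
    "nu_in_ice": "top",
    "mu": "nu_in_ice",
}


def top_level(parent: Dict[str, Optional[str]], pid: str) -> str:
    """Walk up to the very top of the tree."""
    while parent[pid] is not None:
        pid = parent[pid]
    return pid


def is_in_subtree(parent: Dict[str, Optional[str]], root: str, pid: str) -> bool:
    """True if pid is root or a descendant of root (toy convention)."""
    current: Optional[str] = pid
    while current is not None:
        if current == root:
            return True
        current = parent[current]
    return False


def filter_by_top_level(tracks: List[str], primaries: List[str]) -> List[str]:
    return [t for t in tracks if top_level(PARENT, t) in primaries]


def filter_by_subtree(tracks: List[str], primaries: List[str]) -> List[str]:
    return [
        t for t in tracks if any(is_in_subtree(PARENT, p, t) for p in primaries)
    ]
```

With `primaries = ["nu_in_ice"]`, the top-level comparison drops the muon because its top-level ancestor is `"top"`, whereas the subtree test keeps it.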

sevmag · 2026-05-27T18:14:51Z

The ordering of the MCTree is not random: if the muon is a product of the tau, then it will be in the subtree of the tau, i.e. the tau is always processed first. The Muon.Track.Harvest iterates over the MCTree, so this ordering should be conserved. https://github.com/icecube/icetray/blob/7d7982c84148d7e66541e0f953be01f80b061859/ddddr/private/ddddr/MuonGunTrack.cxx#L136

I think you might be correct in terms of stopping tracks: I think we are then missing the energy of the resulting cascade. However, when fixing this we also have to consider the effects of dark (non-light-producing) particles produced in the cascade, which might remove some energy from the deposit.

But as I showed in my issue before, even if the ordering is always the tau first, the labels will still be incorrect. I think we can use this approach for e_on_entrance, but for e_deposited we cannot take the shortcut of deleting the subtree after a track.

Aske-Rosted and others added 10 commits January 13, 2026 01:05

revert calor fix

6b421d3

boolean logic fix

b9b565c

alternative fix

858c180

move assert and check intersection

79adb2b

big remake

4fdd013

Merge branch 'main' into revert_calo

93024c9

speed improvements

8096569

move DARK instantiation

14831e8

Merge branch 'graphnet-team:main' into revert_calo

30f795f

catch runtime errors

f24b6ec

Aske-Rosted requested a review from sevmag March 17, 2026 08:36

allow for mass to kinetic conversion

fe26e91

sevmag requested changes Apr 20, 2026

Aske-Rosted commented May 27, 2026

sevmag reviewed May 27, 2026
