Skip to content

fix(serializer): traverse full MRO when serialising objects with __slots__#1673

Open
rmotgi1227 wants to merge 1 commit into
langfuse:mainfrom
rmotgi1227:fix/slots-mro-traversal
Open

fix(serializer): traverse full MRO when serialising objects with __slots__#1673
rmotgi1227 wants to merge 1 commit into
langfuse:mainfrom
rmotgi1227:fix/slots-mro-traversal

Conversation

@rmotgi1227
Copy link
Copy Markdown

@rmotgi1227 rmotgi1227 commented May 27, 2026

Bug

EventSerializer._default_inner iterated obj.__slots__ when serialising an object that has __slots__. obj.__slots__ is a class attribute that contains only the slots of the immediate class — slots defined on parent classes live on each respective base class and are not included.

As a result, any attribute stored in a parent's __slots__ was silently dropped from the serialised event payload, causing silent data loss whenever a slotted class hierarchy was used (e.g., protobuf messages, dataclasses-like objects, custom slotted models).

Minimal repro

class Base:
    __slots__ = ('base_attr',)

class Child(Base):
    __slots__ = ('child_attr',)

from langfuse._utils.serializer import EventSerializer
import json

obj = Child()
obj.base_attr = 'from_base'
obj.child_attr = 'from_child'

result = json.loads(EventSerializer().encode(obj))
# Before fix: {'child_attr': 'from_child'}   ← base_attr is missing!
# After fix:  {'child_attr': 'from_child', 'base_attr': 'from_base'}

Note: This bug is distinct from the cycle-detection fix in PR #1657, which guards the __slots__ branch against infinite recursion on cyclic graphs. This PR corrects the scope of slot traversal itself.

Fix

Use the standard Python idiom for collecting all slots across an MRO:

# Before
return {slot: self.default(getattr(obj, slot, None)) for slot in obj.__slots__}

# After
all_slots = [
    slot
    for cls in type(obj).__mro__
    for slot in getattr(cls, '__slots__', ())
]
return {slot: self.default(getattr(obj, slot, None)) for slot in all_slots}

getattr(cls, '__slots__', ()) handles:

  • Classes in the MRO that don't define __slots__ (returns the empty tuple default)
  • object at the end of every MRO (no __slots__)

Tests

Three regression tests added to tests/unit/test_serializer.py:

Test Verifies
test_slots_immediate_class_serialised Child-class slots still appear
test_slots_inherited_from_parent_serialised Parent-class slots are no longer dropped (fails before fix)
test_slots_deep_inheritance_chain 3-level MRO: grandparent, parent, and child slots all present (fails before fix)

Greptile Summary

This PR fixes a silent data-loss bug in EventSerializer._default_inner where slotted objects only had their immediate class's slots serialised; slots defined on any ancestor class were silently dropped. The fix correctly traverses the full MRO via type(obj).__mro__ to collect all slots before serialising.

  • langfuse/_utils/serializer.py: Replaces obj.__slots__ iteration with a two-level list comprehension over type(obj).__mro__, using getattr(cls, '__slots__', ()) to handle classes that don't define __slots__ (including object). The fix is minimal and idiomatic.
  • tests/unit/test_serializer.py: Adds three targeted regression tests — immediate-class slots, parent-class slots, and a three-level deep inheritance chain — each of which would have failed against the pre-fix code.

Confidence Score: 4/5

Safe to merge; the core fix is correct and well-tested. One minor follow-up worth considering: filtering weakref and dict from the collected slot list prevents those implementation-detail entries from leaking into serialised payloads.

The MRO traversal fix is idiomatic and closes the described data-loss gap. The only open edge case is that special pseudo-slots (weakref, dict) that appear in ancestor slots definitions will now show up in serialised output — harmless for weakref (serialises to null) but potentially confusing for dict (nests the whole instance dict under a dict key). This was a pre-existing quirk for the immediate class and is now slightly more reachable via inheritance.

langfuse/_utils/serializer.py — specifically the slot-filtering step in the new MRO loop.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_default_inner called with obj] --> B{hasattr obj __slots__?}
    B -- Yes --> C[Collect all_slots via MRO traversal]
    C --> D[getattr each cls for __slots__]
    D --> E[Build dict slot to self.default value]
    B -- No --> F{hasattr obj __dict__?}
    F -- Yes --> G[Cycle check via self.seen]
    G --> H[Serialize vars as dict]
    F -- No --> I[Return type name string]
Loading
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
langfuse/_utils/serializer.py:158-162
Special-purpose slots like `__weakref__` and `__dict__` can appear in a class's (or ancestor's) `__slots__` definition. With the full MRO traversal these slots are now included in the serialised output for every slotted object that inherits from such a base class. `__weakref__` serialises to `None` (harmless but noisy), while `__dict__` serialises to the entire instance attribute dict nested under a `"__dict__"` key — that conflicts with the existing `__dict__`-branch logic and can produce duplicate or mis-shaped data. Filtering them out is the standard idiom.

```suggestion
                _SKIP_SLOTS = frozenset({"__dict__", "__weakref__"})
                all_slots = [
                    slot
                    for cls in type(obj).__mro__
                    for slot in getattr(cls, "__slots__", ())
                    if slot not in _SKIP_SLOTS
                ]
```

Reviews (1): Last reviewed commit: "fix(serializer): traverse full MRO when ..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

EventSerializer._default_inner iterated obj.__slots__ when encoding an
object that has __slots__.  obj.__slots__ returns only the slots of the
*immediate* class; slots defined on parent classes are not included.

As a result, any attribute stored in a parent's __slots__ was silently
omitted from the serialised payload.

Example before fix:
    class Base:
        __slots__ = ('base_attr',)

    class Child(Base):
        __slots__ = ('child_attr',)

    serializer.encode(Child())
    # -> {"child_attr": ...}   # base_attr missing!

Fix: collect slots from every class in type(obj).__mro__ using
getattr(cls, '__slots__', ()).  This is the standard Python idiom for
gathering all slots across an inheritance hierarchy:

    all_slots = [
        slot
        for cls in type(obj).__mro__
        for slot in getattr(cls, '__slots__', ())
    ]

Regression tests added in tests/unit/test_serializer.py covering a
two-level and a three-level inheritance chain.
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Comment on lines +158 to +162
all_slots = [
slot
for cls in type(obj).__mro__
for slot in getattr(cls, "__slots__", ())
]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Special-purpose slots like __weakref__ and __dict__ can appear in a class's (or ancestor's) __slots__ definition. With the full MRO traversal these slots are now included in the serialised output for every slotted object that inherits from such a base class. __weakref__ serialises to None (harmless but noisy), while __dict__ serialises to the entire instance attribute dict nested under a "__dict__" key — that conflicts with the existing __dict__-branch logic and can produce duplicate or mis-shaped data. Filtering them out is the standard idiom.

Suggested change
all_slots = [
slot
for cls in type(obj).__mro__
for slot in getattr(cls, "__slots__", ())
]
_SKIP_SLOTS = frozenset({"__dict__", "__weakref__"})
all_slots = [
slot
for cls in type(obj).__mro__
for slot in getattr(cls, "__slots__", ())
if slot not in _SKIP_SLOTS
]
Prompt To Fix With AI
This is a comment left during a code review.
Path: langfuse/_utils/serializer.py
Line: 158-162

Comment:
Special-purpose slots like `__weakref__` and `__dict__` can appear in a class's (or ancestor's) `__slots__` definition. With the full MRO traversal these slots are now included in the serialised output for every slotted object that inherits from such a base class. `__weakref__` serialises to `None` (harmless but noisy), while `__dict__` serialises to the entire instance attribute dict nested under a `"__dict__"` key — that conflicts with the existing `__dict__`-branch logic and can produce duplicate or mis-shaped data. Filtering them out is the standard idiom.

```suggestion
                _SKIP_SLOTS = frozenset({"__dict__", "__weakref__"})
                all_slots = [
                    slot
                    for cls in type(obj).__mro__
                    for slot in getattr(cls, "__slots__", ())
                    if slot not in _SKIP_SLOTS
                ]
```

How can I resolve this? If you propose a fix, please make it concise.

@hassiebp hassiebp self-requested a review May 29, 2026 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants