fix(serializer): traverse full MRO when serialising objects with __slots__#1673
fix(serializer): traverse full MRO when serialising objects with __slots__#1673rmotgi1227 wants to merge 1 commit into
Conversation
EventSerializer._default_inner iterated obj.__slots__ when encoding an
object that has __slots__. obj.__slots__ returns only the slots of the
*immediate* class; slots defined on parent classes are not included.
As a result, any attribute stored in a parent's __slots__ was silently
omitted from the serialised payload.
Example before fix:
class Base:
__slots__ = ('base_attr',)
class Child(Base):
__slots__ = ('child_attr',)
serializer.encode(Child())
# -> {"child_attr": ...} # base_attr missing!
Fix: collect slots from every class in type(obj).__mro__ using
getattr(cls, '__slots__', ()). This is the standard Python idiom for
gathering all slots across an inheritance hierarchy:
all_slots = [
slot
for cls in type(obj).__mro__
for slot in getattr(cls, '__slots__', ())
]
Regression tests added in tests/unit/test_serializer.py covering a
two-level and a three-level inheritance chain.
|
|
| all_slots = [ | ||
| slot | ||
| for cls in type(obj).__mro__ | ||
| for slot in getattr(cls, "__slots__", ()) | ||
| ] |
There was a problem hiding this comment.
Special-purpose slots like
__weakref__ and __dict__ can appear in a class's (or ancestor's) __slots__ definition. With the full MRO traversal these slots are now included in the serialised output for every slotted object that inherits from such a base class. __weakref__ serialises to None (harmless but noisy), while __dict__ serialises to the entire instance attribute dict nested under a "__dict__" key — that conflicts with the existing __dict__-branch logic and can produce duplicate or mis-shaped data. Filtering them out is the standard idiom.
| all_slots = [ | |
| slot | |
| for cls in type(obj).__mro__ | |
| for slot in getattr(cls, "__slots__", ()) | |
| ] | |
| _SKIP_SLOTS = frozenset({"__dict__", "__weakref__"}) | |
| all_slots = [ | |
| slot | |
| for cls in type(obj).__mro__ | |
| for slot in getattr(cls, "__slots__", ()) | |
| if slot not in _SKIP_SLOTS | |
| ] |
Prompt To Fix With AI
This is a comment left during a code review.
Path: langfuse/_utils/serializer.py
Line: 158-162
Comment:
Special-purpose slots like `__weakref__` and `__dict__` can appear in a class's (or ancestor's) `__slots__` definition. With the full MRO traversal these slots are now included in the serialised output for every slotted object that inherits from such a base class. `__weakref__` serialises to `None` (harmless but noisy), while `__dict__` serialises to the entire instance attribute dict nested under a `"__dict__"` key — that conflicts with the existing `__dict__`-branch logic and can produce duplicate or mis-shaped data. Filtering them out is the standard idiom.
```suggestion
_SKIP_SLOTS = frozenset({"__dict__", "__weakref__"})
all_slots = [
slot
for cls in type(obj).__mro__
for slot in getattr(cls, "__slots__", ())
if slot not in _SKIP_SLOTS
]
```
How can I resolve this? If you propose a fix, please make it concise.
Bug
EventSerializer._default_inneriteratedobj.__slots__when serialising an object that has__slots__.obj.__slots__is a class attribute that contains only the slots of the immediate class — slots defined on parent classes live on each respective base class and are not included.As a result, any attribute stored in a parent's
__slots__was silently dropped from the serialised event payload, causing silent data loss whenever a slotted class hierarchy was used (e.g., protobuf messages, dataclasses-like objects, custom slotted models).Minimal repro
Note: This bug is distinct from the cycle-detection fix in PR #1657, which guards the
__slots__branch against infinite recursion on cyclic graphs. This PR corrects the scope of slot traversal itself.Fix
Use the standard Python idiom for collecting all slots across an MRO:
getattr(cls, '__slots__', ())handles:__slots__(returns the empty tuple default)objectat the end of every MRO (no__slots__)Tests
Three regression tests added to
tests/unit/test_serializer.py:test_slots_immediate_class_serialisedtest_slots_inherited_from_parent_serialisedtest_slots_deep_inheritance_chainGreptile Summary
This PR fixes a silent data-loss bug in
EventSerializer._default_innerwhere slotted objects only had their immediate class's slots serialised; slots defined on any ancestor class were silently dropped. The fix correctly traverses the full MRO viatype(obj).__mro__to collect all slots before serialising.langfuse/_utils/serializer.py: Replacesobj.__slots__iteration with a two-level list comprehension overtype(obj).__mro__, usinggetattr(cls, '__slots__', ())to handle classes that don't define__slots__(includingobject). The fix is minimal and idiomatic.tests/unit/test_serializer.py: Adds three targeted regression tests — immediate-class slots, parent-class slots, and a three-level deep inheritance chain — each of which would have failed against the pre-fix code.Confidence Score: 4/5
Safe to merge; the core fix is correct and well-tested. One minor follow-up worth considering: filtering weakref and dict from the collected slot list prevents those implementation-detail entries from leaking into serialised payloads.
The MRO traversal fix is idiomatic and closes the described data-loss gap. The only open edge case is that special pseudo-slots (weakref, dict) that appear in ancestor slots definitions will now show up in serialised output — harmless for weakref (serialises to null) but potentially confusing for dict (nests the whole instance dict under a dict key). This was a pre-existing quirk for the immediate class and is now slightly more reachable via inheritance.
langfuse/_utils/serializer.py — specifically the slot-filtering step in the new MRO loop.
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[_default_inner called with obj] --> B{hasattr obj __slots__?} B -- Yes --> C[Collect all_slots via MRO traversal] C --> D[getattr each cls for __slots__] D --> E[Build dict slot to self.default value] B -- No --> F{hasattr obj __dict__?} F -- Yes --> G[Cycle check via self.seen] G --> H[Serialize vars as dict] F -- No --> I[Return type name string]Prompt To Fix All With AI
Reviews (1): Last reviewed commit: "fix(serializer): traverse full MRO when ..." | Re-trigger Greptile