Hello all,

I've been meta-reflecting a lot lately: reflecting on reflection.

My recent post on __slots__ not being picklable (and the resounding lack of response to it) inspired me to try my hand at channeling Guido and reverse-engineer some of the design decisions that went into the new-style class system. Unfortunately, the more I dug into the code, the more philosophical my questions became. So, I've written up some questions that help lay bare some of the basic design questions that I've been asking myself and that you should be aware of.

While there are several subtle issues I could raise, I do want some feedback on some simple and fundamental ones first. Please don't disqualify yourself from commenting because you haven't read the code or used the new features yet. I've written my examples assuming only a basic and cursory understanding of the new Python 2.2 features.

[In this discussion I am only going to talk about native Python classes, not C-extension or native Python types (e.g., ints, lists, tuples, strings, cStringIO, etc.)]

1) Should class instances explicitly/directly know all of their attributes?

Before Python 2.2, all object instances contained a __dict__ attribute that mapped attribute names to their values. This made pickling and some other reflection tasks fairly easy.

e.g.:

    class Foo:
        def __init__(self):
            self.a = 1
            self.b = 2

    class Bar(Foo):
        def __init__(self):
            Foo.__init__(self)
            self.c = 3

    bar = Bar()
    print bar.__dict__
    > {'a': 1, 'c': 3, 'b': 2}

I am aware that there are situations where this simple case does not hold (e.g., when implementing __setattr__ or __getattr__), but let's ignore those for now. Rather, I will concentrate on how this classical Python idiom interacts with the new slots mechanism.
Here is the above example using slots:

    class Foo(object):
        __slots__ = ['a','b']
        def __init__(self):
            self.a = 1
            self.b = 2

    class Bar(Foo):
        __slots__ = ['c']
        def __init__(self):
            Foo.__init__(self)
            self.c = 3

    bar = Bar()
    print bar.__dict__
    > AttributeError: 'Bar' object has no attribute '__dict__'

We can see that the class instance 'bar' has no __dict__ attribute. This is because the slots mechanism allocates space for attribute storage directly inside the object, and thus does not use (or need) a per-object instance dictionary to store attributes. Of course, it is possible to request a per-instance dictionary by deriving a new-style class that does not list any slots. e.g., continuing from above:

    class Baz(Bar):
        def __init__(self):
            Bar.__init__(self)
            self.d = 4
            self.e = 5

    baz = Baz()
    print baz.__dict__
    > {'e': 5, 'd': 4}

We have now created a class that has __dict__, but it only contains the attributes not stored in slots! So, should class instances explicitly know their attributes? Or more precisely, should class instances always have a __dict__ attribute that contains their attributes? Don't worry, this does not mean that we cannot also have slots, though it does have some other implications. Keep reading...

2) Should attribute access follow the same resolution order rules as methods?

    class Foo(object):
        __slots__ = ['a']
        def __init__(self):
            self.a = 1

    class Bar(Foo):
        __slots__ = ('a',)
        def __init__(self):
            Foo.__init__(self)
            self.a = 2

    bar = Bar()
    print bar.a
    > 2
    print super(Bar,bar).a   # this doesn't actually work
    > 2 or 1?

Don't worry -- this isn't a proposal and no, this doesn't actually work. However, the current implementation only narrowly escapes this trap:

    print bar.__class__.a.__get__(bar)
    > 2
    print bar.__class__.__base__.a.__get__(bar)
    > AttributeError: a

Ok, let me explain what just happened. Slots are implemented via the new descriptor interface. In short, descriptor objects are properties and support __get__ and __set__ methods.
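To make the descriptor idea concrete, here is a minimal sketch of an object that supports __get__ and __set__ and forwards them to a fixed position in per-instance storage. It is written for a current Python 3 interpreter rather than 2.2, and the names IndexedSlot, Point, _storage and _UNSET are hypothetical. Real slot descriptors are implemented in C inside the interpreter, so this only imitates their visible behavior.

```python
# Sentinel marking a storage position that has never been assigned.
_UNSET = object()


class IndexedSlot(object):
    """Hypothetical descriptor that proxies one position in obj._storage."""

    def __init__(self, name, offset):
        self.name = name
        self.offset = offset

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: hand back the descriptor itself.
            return self
        value = obj._storage[self.offset]
        if value is _UNSET:
            # Mimic an uninitialized slot.
            raise AttributeError(self.name)
        return value

    def __set__(self, obj, value):
        obj._storage[self.offset] = value


class Point(object):
    x = IndexedSlot('x', 0)
    y = IndexedSlot('y', 1)

    def __init__(self):
        # Stand-in for the space a real slot reserves inside the object.
        self._storage = [_UNSET, _UNSET]


p = Point()
p.x = 3                     # routed through IndexedSlot.__set__
print(p.x)                  # 3, routed through IndexedSlot.__get__
print(Point.x.__get__(p))   # 3, the same call made explicitly
```

Reading p.y at this point raises AttributeError, because that position was never assigned.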
The slot descriptors are told the offset within an object instance where the PyObject* lives, and they proxy operations on it. So getting and setting slots involves:

    # print bar.a
    a_descr = bar.__class__.a
    print a_descr.__get__(bar)

    # bar.a = 1
    a_descr = bar.__class__.a
    a_descr.__set__(bar, 1)

So, above we get an attribute error when trying to access Foo's 'a' slot on the Bar instance, since it was never initialized. However, with a little ugliness you can do the following:

    # Get the descriptors for Foo.a and Bar.a
    a_foo_descr = bar.__class__.__base__.a
    a_bar_descr = bar.__class__.a
    a_foo_descr.__set__(bar,1)
    a_bar_descr.__set__(bar,2)
    print bar.a
    > 2
    print a_foo_descr.__get__(bar)
    > 1
    print a_bar_descr.__get__(bar)
    > 2

In other words, the namespace for slots is not really flat, although there is no simple way to access these hidden attributes since method resolution order rules are not invoked by default.

3) Should __slots__ be immutable?

The __slots__ attribute of a new-style class lists all of the slots defined by that class. It is represented as whatever sequence type was given when the class was declared:

    print Foo.__slots__
    > ['a']
    print Bar.__slots__
    > ('a',)

This allows us to do things like:

    Foo.__slots__.append('b')
    foo = Foo()
    foo.b = 42
    > AttributeError: 'Foo' object has no attribute 'b'

So modifying the slots does not do what one may expect. This is because slot descriptors and the space for slots are only allocated when the classes are created (i.e., when they are inherited from 'object', or from an object that descends from 'object').

4) Should __slots__ be flat?

Bar.__slots__ only lists the slots specifically requested in Bar, even though it inherits from Foo, which has its own slots. Which would be the preferable behavior?

    class Foo(object):
        __slots__ = ('a','b')

    class Bar(Foo):
        __slots__ = ('c','d')

    print Bar.__slots__
    > ('c','d')           # current behavior
    or
    > ('a','b','c','d')   # alternate behavior

Clearly, this issue goes back to the ideas addressed in question 1.
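For comparison, the "alternate behavior" can be approximated without changing the language by walking the class's method resolution order and collecting each class's own __slots__. This is only a sketch, written for a current Python 3 interpreter; all_slots is a hypothetical helper, not an existing API.

```python
def all_slots(cls):
    """Return the slot names declared by cls and its bases, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        # Look only at what this class declared itself, not at inherited values.
        declared = klass.__dict__.get('__slots__', ())
        if isinstance(declared, str):
            # A bare string declares a single slot.
            declared = (declared,)
        for name in declared:
            if name not in names:
                names.append(name)
    return tuple(names)


class Foo(object):
    __slots__ = ('a', 'b')


class Bar(Foo):
    __slots__ = ('c', 'd')


print(Bar.__slots__)    # ('c', 'd')
print(all_slots(Bar))   # ('a', 'b', 'c', 'd')
```

Note that collapsing duplicate names, as this helper does, hides exactly the shadowed slots discussed in question 2; a flat listing cannot represent two distinct 'a' slots.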
If slot descriptors are not stored in a per-instance dictionary, then the assumptions on how to do object reflection must change. However, which version of the following code do you prefer to print all attributes of a given object?

Old style, or if descriptors are stored in obj.__dict__:

    if hasattr(obj,'__dict__'):
        print ''.join([ '%s=%s' % nameval for nameval in obj.__dict__.items() ])

Currently in Python 2.2 (and still not quite correct):

    def print_slot_attrs(obj,cls=None):
        if not cls:
            cls = obj.__class__
        # 'descr' must not reuse the name 'obj', or the instance is lost
        for name,descr in cls.__dict__.items():
            if str(type(descr)) == "<type 'member_descriptor'>":
                if hasattr(obj, name):
                    print "%s=%s" % (name,getattr(obj, name))
        for base in cls.__bases__:
            print_slot_attrs(obj,base)

    if hasattr(obj,'__dict__'):
        print [ '%s=%s' % nameval for nameval in obj.__dict__.items() ]
    print_slot_attrs(obj)

Flat and immutable slot namespace:

    a  = [ '%s=%s' % nameval for nameval in obj.__dict__.items() ]
    a += [ '%s=%s' % (name,getattr(obj, name)) for name in obj.__slots__ \
               if hasattr(obj, name) ]
    print ''.join(a)

So, which one of these do you want to support or explain to a new user?

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com