__getitem__

12,395 words with 6 Comments; publish: Sun, 04 May 2008 21:19:00 GMT

I understand that you can use __getitem__ as a hook to
modify indexing behavior in a class. That's why
__getitem__ not only affects [] but also for loops,
map calls, list comprehensions, etc. For loops, etc.
work by indexing a sequence from zero to a higher
index until out-of-bounds is reached. But why does
this work?

>>> class stepper:
...     def __getitem__(self, i):
...         return self.data[i]
...
>>> 'p' in X
True

What does 'in' have to do with indexing?

Tutor maillist - Tutor (AT) python (DOT) org

All Comments

  • 6 Comments
    • At 12:12 PM 1/16/2006, Christopher Spears wrote:

      >I understand that you can use __getitem__ as a hook to
      >modify indexing behavior in a class. That's why
      >__getitem__ not only affects [] but also for loops,
      >map calls, list comprehensions, etc. For loops, etc.
      >work by indexing a sequence from zero to a higher
      >index until out-of-bounds is reached. But why does
      >this work?
      >
      >class stepper:
      >    def __getitem__(self, i):
      >        return self.data[i]
      >
      >'p' in X
      >True
      >
      >What does 'in' have to do with indexing?

      What does X have to do with stepper?

      #1; Sun, 04 May 2008 21:21:00 GMT
    • Christopher Spears wrote:

      >I understand that you can use __getitem__ as a hook to
      >modify indexing behavior in a class. That's why
      >__getitem__ not only affects [] but also for loops,
      >map calls, list comprehensions, etc. For loops, etc.
      >work by indexing a sequence from zero to a higher
      >index until out-of-bounds is reached. But why does
      >this work?
      >
      >class stepper:
      >    def __getitem__(self, i):
      >        return self.data[i]
      >
      >'p' in X
      >True
      >
      >What does 'in' have to do with indexing?

      How do you suppose 'in' works? To see if something is in a list, for

      example, you have to iterate over each element of the list and check if

      it is the item you expect.

      Under the hood, Python will use the __contains__() or __getitem__()

      special method of a class to evaluate 'x in y'.

      Loosely speaking, if y is an instance of a class that implements

      __getitem__() but not __contains__(), 'x in y' is more or less the same

      as this:

      def contains(x, y):   # 'in' is a reserved word, so the function needs another name
          for i in y:
              if i == x:
                  return True
          return False

      From the language reference:

      "For user-defined classes which do not define __contains__() and do

      define __getitem__(), x in y is true if and only if there is a

      non-negative integer index i such that x == y[i], and all lower integer

      indices do not raise IndexError exception. (If any other exception is

      raised, it is as if in raised that exception)."
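A minimal sketch of the rule quoted above, written for current Python 3. The class name `Letters` and its `seen` list are made up for illustration; they only record which indices `in` asks for.

```python
class Letters:
    """Defines __getitem__ only: no __contains__ and no __iter__."""

    def __init__(self, data):
        self.data = data
        self.seen = []          # indices requested so far (illustration only)

    def __getitem__(self, i):
        self.seen.append(i)
        return self.data[i]     # raises IndexError once i runs past the end


x = Letters("spam")

found = 'p' in x                # stops as soon as a match is found
indices_hit = list(x.seen)

x.seen.clear()
missing = 'z' in x              # keeps going until __getitem__ raises IndexError
indices_miss = list(x.seen)

print(found, indices_hit)       # True [0, 1]
print(missing, indices_miss)    # False [0, 1, 2, 3, 4]
```

The recorded indices start at 0 and go up one at a time, which matches the "non-negative integer index i" wording in the reference.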

      Kent

      #2; Sun, 04 May 2008 21:22:00 GMT
    • >map calls, list comprehensions, etc. For loops, etc.
      >work by indexing a sequence from zero to a higher
      >index until out-of-bounds is reached.

      What makes you think that?

      So far as I know, for loops work by calling next on
      an iterator until nothing gets returned, no indexes
      involved. (At least not in the for loop.) But they could
      just as well work by calling the len() function and
      iterating that number of times. And len() could be
      stored as part of the data structure, a la Pascal arrays.

      The point being that it is dangerous to assume how
      a language feature works internally; it can change from
      version to version.

      In this case the iterator solution means that the for

      loop can work on any iterable entity - like files for

      instance.

      >But why does this work?
      >
      >class stepper:
      >    def __getitem__(self, i):
      >        return self.data[i]
      >
      >'p' in X
      >True
      >
      >What does 'in' have to do with indexing?

      Nothing, unless its implementation uses a while loop
      and index, but that's unlikely.

      But your code doesn't show what X is; I assume it's
      an instance of stepper? (The convention is for uppercase
      class names and lower case object names so I'm slightly
      unsure about that assumption!) But we don't even have
      the whole of stepper since there is no self.data definition.
      It's kind of hard to say.

      I'm slightly confused by the question, sorry.

      Alan G

      Author of the learn to program web tutor

      #3; Sun, 04 May 2008 21:23:00 GMT
    • Alan Gauld wrote:

      >>map calls, list comprehension, etc. For loops, etc.

      >>work by indexing a sequences from zero to a higher

      >>index until out-of-bounds is reached.

      >What makes you think that?
      >
      >So far as I know, for loops work by calling next on
      >an iterator until nothing gets returned, no indexes
      >involved. (At least not in the for loop.) But they could
      >just as well work by calling the len() function and
      >iterating that number of times. And len() could be
      >stored as part of the data structure, a la Pascal arrays.

      Hmm. From the language ref description of 'for':

      "The expression list is evaluated once; it should yield an iterable object."

      which raises the question: what is an iterable object? The iterator
      protocol was introduced in Python 2.2; the "What's New" document gives a
      good description of the old and new methods of iterating. Prior to
      Python 2.2, the _only_ way to make an object iterable was to give it a
      __getitem__() method. With Python 2.2 you can alternatively define
      __iter__().

      From the language ref description of __getitem__():

      "Note: for loops expect that an IndexError will be raised for illegal

      indexes to allow proper detection of the end of the sequence."

      In fact a class that just defines __getitem__() can be iterated in a for

      loop:

      >>> class stepper:
      ...     def __getitem__(self, i):
      ...         if i < 5: return i
      ...         raise IndexError
      ...
      >>> for i in stepper(): print i
      ...
      0
      1
      2
      3
      4
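The same fallback can still be observed on Python 3. The sketch below uses two made-up classes, `OldStyle` and `Both`, to show that `iter()` builds an iterator from `__getitem__` alone, and that `__iter__` takes precedence when both are defined.

```python
class OldStyle:
    """Iterable through __getitem__ only (the pre-2.2 protocol)."""

    def __getitem__(self, i):
        if i < 5:
            return i
        raise IndexError        # signals the end of the sequence


class Both(OldStyle):
    """Adds __iter__, which iter() prefers over __getitem__."""

    def __iter__(self):
        return iter("abc")


old_items = list(OldStyle())    # driven by __getitem__(0), __getitem__(1), ...
both_items = list(Both())       # driven by __iter__, __getitem__ is ignored

print(old_items)                # [0, 1, 2, 3, 4]
print(both_items)               # ['a', 'b', 'c']
```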

      >The point being that it is dangerous to assume how
      >a language feature works internally; it can change from
      >version to version.

      Dangerous to assume, maybe, but special methods are there to be used,

      and the usage is generally well understood if not always well documented.

      >In this case the iterator solution means that the for
      >loop can work on any iterable entity - like files for
      >instance.

      Yes, and for backward compatibility it also works on anything

      implementing __getitem__(). In fact strings have no __iter__() method,

      they use __getitem__():

      >>> ''.__iter__
      Traceback (most recent call last):
        File "<stdin>", line 1, in ?
      AttributeError: 'str' object has no attribute '__iter__'
      >>> ''.__getitem__
      <method-wrapper object at 0x00A32E50>

      >>But why does this work?

      >>

      >>

      >>class stepper:
      >>    def __getitem__(self, i):
      >>        return self.data[i]
      >>
      >>'p' in X
      >>True
      >>
      >>What does 'in' have to do with indexing?
      >
      >Nothing unless its implementation uses a while loop
      >and index, but that's unlikely.

      But that is pretty close to what actually happens, according to the

      language ref docs for 'in' (see my previous post).

      Kent

      #4; Sun, 04 May 2008 21:24:00 GMT
    • Kent Johnson wrote:

      >Alan Gauld wrote:
      >>>What does 'in' have to do with indexing?
      >>
      >>Nothing unless its implementation uses a while loop
      >>and index, but that's unlikely.
      >
      >But that is pretty close to what actually happens, according to the
      >language ref docs for 'in' (see my previous post).

      I'm curious enough about this (OK, I admit it, I like to be right, too
      ;) to dig in to the details, if anyone is interested. One of the
      benefits of Python being open-source is you can find out how it works.

      First step, look at the bytecodes:

      >>> import dis
      >>> def f(x, y):
      ...     return x in y
      ...
      >>> dis.dis(f)
        2           0 LOAD_FAST                0 (x)
                    3 LOAD_FAST                1 (y)
                    6 COMPARE_OP               6 (in)
                    9 RETURN_VALUE
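Opcode names and offsets differ between interpreter versions, so the listing above will not match every Python. As a hedged sketch, the snippet below collects the opcode names for the same function on whatever interpreter runs it. On CPython 3.9 and later the membership test shows up as `CONTAINS_OP`, while older versions report `COMPARE_OP` as in the listing.

```python
import dis


def f(x, y):
    return x in y


# Collect just the opcode names; the exact list depends on the Python version.
opnames = [ins.opname for ins in dis.get_instructions(f)]
print(opnames)

# The membership test is one of these two, depending on the interpreter.
membership_ops = [name for name in opnames if name in ("CONTAINS_OP", "COMPARE_OP")]
print(membership_ops)
```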

      So 'in' is implemented as a COMPARE_OP. Looking in ceval.c for
      COMPARE_OP, it has some optimizations for a few fast compares, then
      calls cmp_outcome() which, for 'in', calls PySequence_Contains().

      PySequence_Contains() is implemented in abstract.c. If the container

      implements __contains__, that is called, otherwise

      _PySequence_IterSearch() is used.

      _PySequence_IterSearch() calls PyObject_GetIter() to construct an
      iterator on the sequence, then goes into an infinite loop (for (;;))
      calling PyIter_Next() on the iterator until the item is found or the
      call to PyIter_Next() returns an error.

      PyObject_GetIter() is also in abstract.c. If the object has an
      __iter__() method, that is called, otherwise PySeqIter_New() is called
      to construct an iterator.

      PySeqIter_New() is implemented in iterobject.c. Its next() method is in
      iter_iternext(). This method calls __getitem__() on its wrapped object
      and increments an index for next time.

      So, though the details are complex, I think it is pretty fair to say
      that the implementation uses a while loop (in _PySequence_IterSearch())
      and a counter (wrapped in PySeqIter_Type) to implement 'in' on a
      container that defines __getitem__ but not __iter__.

      By the way, the implementation of 'for' also calls PyObject_GetIter(), so
      it uses the same mechanism to generate an iterator for a sequence that
      defines __getitem__().
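The chain of C calls described above can be approximated in pure Python. This is only a rough sketch under simplifying assumptions, not the CPython source. The names `SeqIter`, `get_iter`, `contains` and `Stepper` are invented here and stand in loosely for PySeqIter_New()/iter_iternext(), PyObject_GetIter(), PySequence_Contains() and the class from the question.

```python
class SeqIter:
    """Stand-in for the sequence iterator: it is the object that holds the index."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.seq[self.index]     # calls seq.__getitem__(index)
        except IndexError:
            raise StopIteration from None   # end of sequence
        self.index += 1                     # remember the position for next time
        return item


def get_iter(obj):
    """Rough analogue of PyObject_GetIter(): prefer __iter__, else wrap __getitem__."""
    if hasattr(type(obj), "__iter__"):
        return type(obj).__iter__(obj)
    if hasattr(type(obj), "__getitem__"):
        return SeqIter(obj)
    raise TypeError("object is not iterable")


def contains(container, value):
    """Rough analogue of PySequence_Contains() and its iterator search fallback."""
    if hasattr(type(container), "__contains__"):
        return bool(type(container).__contains__(container, value))
    it = get_iter(container)
    while True:                             # plays the role of the for (;;) loop
        try:
            item = next(it)
        except StopIteration:
            return False
        if item is value or item == value:
            return True


class Stepper:
    def __init__(self, data):
        self.data = data

    def __getitem__(self, i):
        return self.data[i]


print(contains(Stepper("spam"), "p"))   # True
print(contains(Stepper("spam"), "z"))   # False
print(contains([1, 2, 3], 2))           # True, via list.__contains__
```

Note that the search loop in `contains` never sees an index. The counter lives entirely inside `SeqIter`.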

      Kent

      #5; Sun, 04 May 2008 21:25:00 GMT
    • >>Nothing unless its implementation uses a while loop
      >>and index, but that's unlikely.
      >
      >But that is pretty close to what actually happens, according to the
      >language ref docs for 'in' (see my previous post).

      In certain cases. The point I was making (or trying to) is
      that both loops actually depend on how iterators work - and
      they currently use indexes, but the loops themselves don't. And
      it's quite possible, likely even, that generic iterator code could
      appear that doesn't even store an index at all.
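To illustrate that last point, here is a small sketch of an iterable with no integer index anywhere. The `Node` and `Chain` classes are hypothetical: `Chain` defines only `__iter__` and walks a linked list, yet both `in` and `for` work on it.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


class Chain:
    """Iterable through __iter__ only: no __getitem__ and no counter."""

    def __init__(self, head):
        self.head = head

    def __iter__(self):
        node = self.head
        while node is not None:     # follow the links until the chain ends
            yield node.value
            node = node.next_node


chain = Chain(Node("s", Node("p", Node("a", Node("m")))))

found = "p" in chain
missing = "z" in chain
letters = [c for c in chain]

print(found, missing, letters)      # True False ['s', 'p', 'a', 'm']
```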

      >PySequence_Contains() is implemented in abstract.c. If the container
      >implements __contains__, that is called, otherwise
      >_PySequence_IterSearch() is used.

      And at this point we are out of the loop code and into iterator code,
      which is where the index is stored.

      >So, though the details are complex, I think it is pretty fair to say
      >that the implementation uses a while loop (in _PySequence_IterSearch())
      >and a counter (wrapped in PySeqIter_Type) to implement 'in' on a
      >container that defines __getitem__ but not __iter__.

      I'd say the implementation uses a while loop which uses an iterator
      and no counter - it relies on the iterator throwing an exception to
      detect the end; the loop code has no index and neither does the 'in'
      code. The 'in' is two levels of abstraction away from the index.

      >By the way, the implementation of 'for' also calls PyObject_GetIter(), so
      >it uses the same mechanism to generate an iterator for a sequence that
      >defines __getitem__().

      Which is what I said, it relies on the iterator. (But I didn't know
      about the legacy __getitem__() branch!) In each case, if the way iterators
      work changes, the loops will carry on as they are. I'm actually
      surprised the C implementation uses an index - I thought it would
      just manipulate pointers - but given the __getitem__ mechanism maybe
      it's not so surprising.

      Thanks for doing the research - I was too lazy to do it myself ;-)

      Alan G.

      #6; Sun, 04 May 2008 21:26:00 GMT