Tags: adventure, becausei, cant, compare, engine, function, itto, programming, python, similar, strings, text, xor

Why cant I xor strings?

On Programmer » Python

84,921 words with 50 Comments; publish: Wed, 26 Dec 2007 23:50:00 GMT; (200156.25, « »)

I wrote a function to compare whether two strings are "similar" because

I'm using python to make a small text adventure engine and I want to it

to be responsive to slight mispellings, like "inevtory" and the like. To

save time the first thing my function does is check if I'm comparing an

empty vs. a non-empty string, because they are to be never considered

similar. Right now I have to write out the check like this:

if str1 and str2:

if not str1 or not str2:

return 0

Because python won't let me do str1 ^ str2. Why does this only work for

numbers? Just treat empty strings as 0 and nonempty as 1.

Regardless of whether this is the best implementation for detecting if

two strings are similar, I don't see why xor for strings shouldn't be

supported. Am I missing something? Inparticular, I think it'd be cool to

have "xor" as opposed to "^". The carrot would return the resulting

value, while "xor" would act like and/or do and return the one that was

true (if any).

All Comments

Leave a comment...

  • 50 Comments
    • dataangel <k04jg02.python.todaysummary.com.kzoo.edu> writes:

      > To save time the first thing my function does is check if

      > I'm comparing an empty vs. a non-empty string, because they are to be

      > never considered similar.

      > Right now I have to write out the check like this:

      > if str1 and str2:

      > if not str1 or not str2:

      > return 0

      How about:

      def nonempty(s):

      return (len(s) > 0)

      and then

      if nonempty(str1) != nonempty(str2):

      return 0

      Of course there's other ways to write the same thing. Python 2.3 has

      a "bool" function which when called on a string, returns true iff the

      string is nonempty.

      #1; Wed, 26 Dec 2007 23:51:00 GMT
    • Hi DataAngel,

      Welcome to the Python community!

      From reading your post, it sounds like you are wanting to use XOR

      encryption because there are other ways to quickly and easily compare

      strings to see if they are equal.

      To compare a string, use the following:

      name = "Steven"

      name2 = "Steven"

      print name == name2

      The result will be "True." However, if my initial guess is correct that

      you are wanting to use XOR for strings, it probably because you are

      wanting a quick way of encrypting data so that basic users won't be able

      to view the information. If this is what you are wanting, I might have

      something for you...

      Let me know,

      Byron

      --

      dataangel wrote:

      > I wrote a function to compare whether two strings are "similar" because

      > I'm using python to make a small text adventure engine and I want to it

      > to be responsive to slight mispellings, like "inevtory" and the like. To

      > save time the first thing my function does is check if I'm comparing an

      > empty vs. a non-empty string, because they are to be never considered

      > similar. Right now I have to write out the check like this:

      > if str1 and str2:

      > if not str1 or not str2:

      > return 0

      > Because python won't let me do str1 ^ str2. Why does this only work for

      > numbers? Just treat empty strings as 0 and nonempty as 1.

      > Regardless of whether this is the best implementation for detecting if

      > two strings are similar, I don't see why xor for strings shouldn't be

      > supported. Am I missing something? Inparticular, I think it'd be cool to

      > have "xor" as opposed to "^". The carrot would return the resulting

      > value, while "xor" would act like and/or do and return the one that was

      > true (if any).

      >

      #2; Wed, 26 Dec 2007 23:52:00 GMT
    • Hi DataAngel,

      Welcome to the Python community!

      From reading your post, it sounds like you are wanting to use XOR for

      "encryption." The reason why I say this is because there are many other

      ways of comparing the content of two strings to see if they are exactly

      the same. An example of how to do such a thing is listed below:

      # How to compare two strings to see if they are exact matches.

      name = "Steven"

      name2 = "Steven"

      print name == name2

      The result will be "True."

      However, the only reason that one might want to use XOR for a string is

      if he or she wanted to use XOR based encryption in order to keep data

      semi-private.

      Let me know,

      Byron

      --

      dataangel wrote:

      > I wrote a function to compare whether two strings are "similar" because

      > I'm using python to make a small text adventure engine and I want to it

      > to be responsive to slight mispellings, like "inevtory" and the like. To

      > save time the first thing my function does is check if I'm comparing an

      > empty vs. a non-empty string, because they are to be never considered

      > similar. Right now I have to write out the check like this:

      > if str1 and str2:

      > if not str1 or not str2:

      > return 0

      > Because python won't let me do str1 ^ str2. Why does this only work for

      > numbers? Just treat empty strings as 0 and nonempty as 1.

      > Regardless of whether this is the best implementation for detecting if

      > two strings are similar, I don't see why xor for strings shouldn't be

      > supported. Am I missing something? Inparticular, I think it'd be cool to

      > have "xor" as opposed to "^". The carrot would return the resulting

      > value, while "xor" would act like and/or do and return the one that was

      > true (if any).

      >

      #3; Wed, 26 Dec 2007 23:53:00 GMT
    • On 2004-10-08, Byron <DesertLinux.python.todaysummary.com.netscape.net> wrote:

      > Hi DataAngel,

      > Welcome to the Python community!

      > From reading your post, it sounds like you are wanting to use XOR

      > encryption

      No, he wants to do an exclusive or of the boolean values of the

      strings.

      > The result will be "True." However, if my initial guess is correct that

      > you are wanting to use XOR for strings, it probably because you are

      > wanting a quick way of encrypting data so that basic users won't be able

      > to view the information.

      No, he explained exactly what he was trying to do, and it had

      nothing to do with encryption. He wants to know if exactly one

      (1) of the strings is the empty string.

      --

      Grant Edwards grante Yow! I hope something GOOD

      at came in the mail today so

      visi.com I have a REASON to live!!

      #4; Wed, 26 Dec 2007 23:54:00 GMT
    • Grant Edwards wrote:

      > No, he explained exactly what he was trying to do, and it had

      > nothing to do with encryption. He wants to know if exactly one

      > (1) of the strings is the empty string.

      Hi Grant,

      Opps, that's what happens when on skims through a post a light speed...

      lol :-)

      Byron

      --

      #5; Wed, 26 Dec 2007 23:55:00 GMT
    • Grant Edwards wrote:

      > No, he explained exactly what he was trying to do, and it had

      > nothing to do with encryption. He wants to know if exactly one

      > (1) of the strings is the empty string.

      Hi Grant,

      Opps, that's what happens when one skims through a post a light speed...

      lol :-)

      Byron

      --

      #6; Wed, 26 Dec 2007 23:57:00 GMT
    • Grant Edwards wrote:

      > No, he explained exactly what he was trying to do, and it had

      > nothing to do with encryption. He wants to know if exactly one

      > (1) of the strings is the empty string.

      Hi Grant,

      Opps, that's what happens when one skims through a post at light

      speed... lol :-)

      Byron

      --

      #7; Wed, 26 Dec 2007 23:58:00 GMT
    • Grant Edwards:

      > No, he explained exactly what he was trying to do, and it had

      > nothing to do with encryption. He wants to know if exactly one

      > (1) of the strings is the empty string.

      BTW, another way to get that is

      if bool(str1) + bool(str2) == 1:

      print "one and only one of them was empty"

      Andrew

      dalke.python.todaysummary.com.dalkescientific.com

      #8; Wed, 26 Dec 2007 23:58:00 GMT
    • Grant Edwards <grante <at> visi.com> writes:

      > No, he wants to do an exclusive or of the boolean values of the

      > strings.

      I'm guessing the OP has already guessed this solution from the variety already

      provided, but the direct translation of this statement would be:

      bool(x) ^ bool(y)

      for example:

      >>> bool("") ^ bool("a")

      True

      >>> bool("") ^ bool("")

      False

      >>> bool("a") ^ bool("a")

      False

      >>> bool("a") ^ bool("")

      True

      Steve

      #9; Thu, 27 Dec 2007 00:00:00 GMT
    • On 2004-10-09, Steven Bethard <steven.bethard.python.todaysummary.com.gmail.com> wrote:

      > Grant Edwards <grante <at> visi.com> writes:

      >> No, he wants to do an exclusive or of the boolean values of the

      >> strings.

      > I'm guessing the OP has already guessed this solution from the

      > variety already provided, but the direct translation of this

      > statement would be:

      > bool(x) ^ bool(y)

      :)

      It dawned on me later that perhaps the jump from what I wrote

      to the code you wrote wasn't as obvious as I thought.

      --

      Grant Edwards grante Yow!

      at TAILFINS!!... click...

      visi.com

      #10; Thu, 27 Dec 2007 00:00:00 GMT
    • On Fri, 08 Oct 2004 16:19:22 -0400, dataangel wrote:

      > Regardless of whether this is the best implementation for detecting if

      > two strings are similar, I don't see why xor for strings shouldn't be

      > supported. Am I missing something?

      The basic problem is that there is no obvious "xor on string" operation

      *in general*, even if you stipulate "bitwise". In particular, what does it

      mean to bitwise xor two different length strings?

      "In general" is the key, here. You actually don't have strings,

      conceptually, you have "tokens" or "commands" or something that happen to

      be strings. I recommend creating a class to match this concept by

      subclassing str... or even just write it directly without subclassing

      string. You can then make __xor__(self, other) for that class do whatever

      makes sense with your concept.

      http://docs.python.org/ref/numeric-types.html

      I can almost guarantee that you will later find other things to put into

      that class. It may very well clean up a lot of code.

      #11; Thu, 27 Dec 2007 00:02:00 GMT
    • On 2004-10-09, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >> Regardless of whether this is the best implementation for

      >> detecting if two strings are similar, I don't see why xor for

      >> strings shouldn't be supported. Am I missing something?

      > The basic problem is that there is no obvious "xor on string" operation

      Sure there is. Strings have a boolean value, and the xor

      operation on boolean values is well-defined.

      --

      Grant Edwards grante Yow! Are you mentally here

      at at Pizza Hut??

      visi.com

      #12; Thu, 27 Dec 2007 00:03:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> writes:

      > On 2004-10-09, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      > >> Regardless of whether this is the best implementation for

      > >> detecting if two strings are similar, I don't see why xor for

      > >> strings shouldn't be supported. Am I missing something?

      > > The basic problem is that there is no obvious "xor on string" operation

      > Sure there is. Strings have a boolean value, and the xor

      > operation on boolean values is well-defined.

      That's an operation, but I'm not sure that's the obvious one. For my

      part, if I saw "string1 ^ string2" I'd probably expect a byte by byte

      xor with the result being a new string.

      It doesn't feel natural to me to have my strings suddenly interpreted

      as a new data type based on the operation at hand. Logical operators

      work that way but not numerics (it would be in the same vein as string

      + number interpreting the string as a number - that way lies Perl :-))

      -- David

      #13; Thu, 27 Dec 2007 00:04:00 GMT
    • On 2004-10-09, David Bolen <db3l.python.todaysummary.com.fitlinxx.com> wrote:

      >>> The basic problem is that there is no obvious "xor on string"

      >>> operation

      >>

      >> Sure there is. Strings have a boolean value, and the xor

      >> operation on boolean values is well-defined.

      > That's an operation, but I'm not sure that's the obvious one.

      > For my part, if I saw "string1 ^ string2" I'd probably expect

      > a byte by byte xor with the result being a new string.

      Only because Python lacks a logical xor operator, so you're

      used to thinking of ^ as a bitwise operator. What if you saw

      string1 xor string2?

      Wouldn't you expect it to be equivalent to

      (string1 and (not string2)) or ((not string1) and string2)

      > It doesn't feel natural to me to have my strings suddenly

      > interpreted as a new data type based on the operation at hand.

      > Logical operators work that way but not numerics

      I don't know what you mean by that. Nobody seems to have a

      problem with "and" "or" and "not" operators using the truth

      values of strings. What is there about "xor" that precludes it

      from behaving similarly?

      > (it would be in the same vein as string + number interpreting

      > the string as a number - that way lies Perl :-))

      I don't see that at all.

      --

      Grant Edwards grante Yow! Let's go to CHURCH!

      at

      visi.com

      #14; Thu, 27 Dec 2007 00:05:00 GMT
    • David Bolen wrote:

      > Grant Edwards <grante.python.todaysummary.com.visi.com> writes:

      >>On 2004-10-09, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >>>

      >>>The basic problem is that there is no obvious "xor on string" operation

      >>

      >>Sure there is. Strings have a boolean value, and the xor

      >>operation on boolean values is well-defined.

      > That's an operation, but I'm not sure that's the obvious one. For my

      > part, if I saw "string1 ^ string2" I'd probably expect a byte by byte

      > xor with the result being a new string.

      ... but you'd get a traceback. ;) As pointed out earlier in this

      thread, what works is "bool(string1) ^ bool(string2)", which

      certainly doesn't violate the law of least astonishment.

      Why would anything else be needed?

      #15; Thu, 27 Dec 2007 00:06:00 GMT
    • >>>>> "Grant" == Grant Edwards <grante.python.todaysummary.com.visi.com> writes:

      Grant> I don't know what you mean by that. Nobody seems to have a

      Grant> problem with "and" "or" and "not" operators using the truth

      Grant> values of strings. What is there about "xor" that

      Grant> precludes it from behaving similarly?

      It's just that logical xor is an extremely rare beast. I would

      probably prefer to see the operation expanded to the more typical

      and-or-nots in real code.

      --

      Ville Vainio http://tinyurl.com/2prnb

      #16; Thu, 27 Dec 2007 00:07:00 GMT
    • Ville Vainio <ville.python.todaysummary.com.spammers.com> writes:

      > It's just that logical xor is an extremely rare beast. I would

      > probably prefer to see the operation expanded to the more typical

      > and-or-nots in real code.

      I think != is appropriate in this situation.

      if bool(x) != bool(y): ...

      #17; Thu, 27 Dec 2007 00:08:00 GMT
    • Ville Vainio wrote:

      > Grant> I don't know what you mean by that. Nobody seems to have a

      > Grant> problem with "and" "or" and "not" operators using the truth

      > Grant> values of strings. What is there about "xor" that

      > Grant> precludes it from behaving similarly?

      > It's just that logical xor is an extremely rare beast. I would

      > probably prefer to see the operation expanded to the more typical

      > and-or-nots in real code.

      "imp" and "eqv", anyone?

      </F

      #18; Thu, 27 Dec 2007 00:09:00 GMT
    • Grant Edwards wrote:

      > What if you saw

      > string1 xor string2?

      > Wouldn't you expect it to be equivalent to

      > (string1 and (not string2)) or ((not string1) and string2)

      I would expect it to give a syntax error.

      If not, and 'xor' did become a new boolean operator

      in Python I would expect it to act more like

      xor_f(x, y)

      where 'xor_f' the function is defined as

      def xor_f(x, y):

      x = bool(x)

      y = bool(y)

      return (x and not y) or (not x and y)

      Why the distinction? In your code you call bool on

      an object at least once and perhaps twice. The

      truth of an object should only be checked once. You

      also have asymmetric return values. Consider

      s1 s2 s1 xor s2

      "A" "B" False

      "A" "" True

      "" "B" "B"

      "" "" False

      Esthetics suggest that either "A"/"" return "A" or that

      ""/"B" return True. Mine does the latter. Yours does

      neither. Probably the Pythonic way, assuming 'xor'

      can be considered Pythonic, is to return the object

      which gave the True value, like this

      def xor_f(x, y):

      bx = bool(x)

      by = bool(y)

      if bx:

      if not by:

      return bx

      return False

      else:

      if by:

      return by

      return False

      In any case, 'xor' the binary operator is rarely

      needed and as you've shown can be interpreted in

      a couple different ways. Each alone weighs against

      it. Both together make it almost certainly a bad

      idea for Python.

      Andrew

      dalke.python.todaysummary.com.dalkescientific.com

      #19; Thu, 27 Dec 2007 00:10:00 GMT
    • Paul Rubin wrote:

      > I think != is appropriate in this situation.

      > if bool(x) != bool(y): ...

      while I prefer my already mentioned

      if bool(x) + bool(y) == 1:

      print "One and only one given"

      That's because the pure logic ones, both != and

      xor, confuse me while arithmetic doesn't.

      It's also extensible to, say, "all three

      parameters are either empty strings or non-empty

      strings".

      if 0 < bool(x) + bool(y) + bool(z) < 3:

      print "that's not allowed"

      BTW, I've used an idiom like this with some

      success

      def blah(filename = None, infile = None, url = None):

      if (bool(filename is None) + bool(infile is None) +

      bool(url is None)) != 2:

      raise TypeError("Must specify one and only one of "

      "'filename', 'infile', or 'url'")

      if filename is not None:

      infile = open(filename)

      elif url is not None:

      infile = urllib.urlopen(url)

      ...

      Try that with an xor.

      Andrew

      dalke.python.todaysummary.com.dalkescientific.com

      #20; Thu, 27 Dec 2007 00:11:00 GMT
    • On 2004-10-10, Ville Vainio <ville.python.todaysummary.com.spammers.com> wrote:

      >>>>>> "Grant" == Grant Edwards <grante.python.todaysummary.com.visi.com> writes:

      > Grant> I don't know what you mean by that. Nobody seems to have a

      > Grant> problem with "and" "or" and "not" operators using the truth

      > Grant> values of strings. What is there about "xor" that

      > Grant> precludes it from behaving similarly?

      > It's just that logical xor is an extremely rare beast.

      Probably so, but that doesn't support the arguement that

      there's something wrong with a logical xor argument coercing

      it's operands to boolean values unless one also argues that

      the logical "and", "or" and "not" operators should also not

      coerce their operands to booleans.

      > I would probably prefer to see the operation expanded to the

      > more typical and-or-nots in real code.

      If logical operators shouldn't coerce operands, then what you

      should see is:

      if (bool(string1) and (not bool(string2)) or ((not bool(string1)) and bool(string2))

      --

      Grant Edwards grante Yow! A shapely CATHOLIC

      at SCHOOLGIRL is FIDGETING

      visi.com inside my costume...

      #21; Thu, 27 Dec 2007 00:12:00 GMT
    • On 2004-10-10, Andrew Dalke <adalke.python.todaysummary.com.mindspring.com> wrote:

      > Why the distinction? In your code you call bool on

      > an object at least once and perhaps twice. The

      > truth of an object should only be checked once. You

      > also have asymmetric return values. Consider

      > s1 s2 s1 xor s2

      > "A" "B" False

      > "A" "" True

      > "" "B" "B"

      > "" "" False

      Hey, _that_ particular bug isn't my fault. Somebody else

      decided that "and" and "or" don't return boolean values like

      they should. ;)

      > Esthetics suggest that either "A"/"" return "A" or that

      > ""/"B" return True. Mine does the latter. Yours does

      > neither. Probably the Pythonic way, assuming 'xor'

      > can be considered Pythonic, is to return the object

      > which gave the True value, like this

      > def xor_f(x, y):

      > bx = bool(x)

      > by = bool(y)

      > if bx:

      > if not by:

      > return bx

      > return False

      > else:

      > if by:

      > return by

      > return False

      That's ugly. But then again, "A" or "B" returning "A" is ugly.

      While I agree with your points, they're immaterial to the

      argument I was making. The poster to which I responded was

      arguing that "xor" didn't make sense because having it coerce

      it's arguments to booleans was "wrong".

      --

      Grant Edwards grante Yow! Zippy's brain cells

      at are straining to bridge

      visi.com synapses...

      #22; Thu, 27 Dec 2007 00:13:00 GMT
    • On Sun, 10 Oct 2004 15:30:32 +0000, Grant Edwards wrote:

      > While I agree with your points, they're immaterial to the

      > argument I was making. The poster to which I responded was

      > arguing that "xor" didn't make sense because having it coerce

      > it's arguments to booleans was "wrong".

      I didn't say "wrong", I said "non-obvious".

      And, with respect, the fact that you have to argue your point is evidence

      in my favor. Of course "proof" would require a comprehensive survey, and I

      think we can all agree this point isn't worth such effort :-)

      But I do think that a bitwise operator should silently transform to a

      logical operator for strings is not obvious.

      What is 388488839405842L ^ "hello"?

      Python says that's an "unsupported operand type(s) for ^: 'long' and

      'str'". Why is it wrong?

      This would also work if Python were more like C++ and we could define

      xor(string, string)

      xor(int, int)

      and be done with it, but in Python, there should be an obvious meaning for

      int ^ string, and there isn't.

      It is also true that I recommended the OP consider subclassing string to

      make ^ do what he wants. But it seems to be reasonably well expected that

      while user classes can do what they like with operators (as long as they

      are willing to pay the sometimes-subtle prices, especially the ones

      involved with poorly defining __cmp__), that the core language should not

      do such things.

      #23; Thu, 27 Dec 2007 00:14:00 GMT
    • On 2004-10-10, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      > But I do think that a bitwise operator should silently

      > transform to a logical operator for strings is not obvious.

      > What is 388488839405842L ^ "hello"?

      No, I wouldn't want that either. What I was talking about was

      the possibility of an "xor" logical operator that would behave

      in a manner similar to the other logical operators in that it

      coerces it's parameters to boolean values.

      --

      Grant Edwards grante Yow! I am covered with

      at pure vegetable oil and I am

      visi.com writing a best seller!

      #24; Thu, 27 Dec 2007 00:15:00 GMT
    • On Sun, 10 Oct 2004 19:25:16 GMT, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >On Sun, 10 Oct 2004 15:30:32 +0000, Grant Edwards wrote:

      >> While I agree with your points, they're immaterial to the

      >> argument I was making. The poster to which I responded was

      >> arguing that "xor" didn't make sense because having it coerce

      >> it's arguments to booleans was "wrong".

      >I didn't say "wrong", I said "non-obvious".

      >And, with respect, the fact that you have to argue your point is evidence

      >in my favor. Of course "proof" would require a comprehensive survey, and I

      >think we can all agree this point isn't worth such effort :-)

      >But I do think that a bitwise operator should silently transform to a

      >logical operator for strings is not obvious.

      >What is 388488839405842L ^ "hello"?

      >Python says that's an "unsupported operand type(s) for ^: 'long' and

      >'str'". Why is it wrong?

      I agree it is not wrong. The user is reminded that s/he should be explicit

      and create a type that behaves as desired. However (;-) ...

      What's right about accepting 3^7 ? Why should xor be defined for integers?

      IMO we have implicit subtyping of integers as vectors or column matrices

      of bools and operations element by element and implicit re-presentation

      of the result as integer.

      This dual interpretation of course reflects CPU hardware functionality, and

      I would argue that familiarity (not to mention the expediency of conciseness)

      has bred acceptance of operator notation without explicit cruft such as

      int(boolvec(3)^boolvec(7)) instead of 3^7. The trouble is that a boolvec

      should have a length, and boolvec(3) really hides implicit determination of length

      (by dropping sign bits of unsigned value or extending signed integer sign bits infinitely),

      and the ^ operation between boolvecs of different length hides implicit normalization to

      the length of the longer (or infinity with compressed representation)). Again,

      CPU hardware legacy comes into play, with the hidden imposition of length=32, sometimes

      64 now, and we are forced (happily IMO) into defining what we mean in terms of

      physical-representation-independent abstractions.

      So the question becomes

      What is boolvec(388488839405842L) ^ boolvec("hello")?

      and boolvec("hello") is the more ambiguous one. If we wrote

      boolvec("hello", as_bytes=True)

      I think most would have an idea of what it meant -- until they

      started to think about endianness ;-) I.e., what is the value of the following?

      (boolvec("hello", asbytes=True) & boolvec(0xffff)).as_string() #note '&' for simpler example

      should it be

      "he\x00\x00\x00"

      or do you see it as big-endian

      "\x00\x00\x00lo"

      and should "sign" bits be dropped, so the result would be "he" or "lo" ?

      Or -- should boolvec("hello") default to boolvec(bool("hello"), length=1) ?

      >This would also work if Python were more like C++ and we could define

      >xor(string, string)

      >xor(int, int)

      >and be done with it, but in Python, there should be an obvious meaning for

      >int ^ string, and there isn't.

      Notice that no one complains about intgr^intgr

      not being defined as int(bool(intgr)^bool(intgr)) ?;-)

      >It is also true that I recommended the OP consider subclassing string to

      >make ^ do what he wants. But it seems to be reasonably well expected that

      >while user classes can do what they like with operators (as long as they

      >are willing to pay the sometimes-subtle prices, especially the ones

      >involved with poorly defining __cmp__), that the core language should not

      >do such things.

      Except where there is an accepted legacy of such things being done already ;-)

      BTW, what is the rationale behind this:

      >>> ['',(),[],0, 0.0, 0L].count(0)

      3

      >>> ['',(),[],0, 0.0, 0L].count(())

      1

      >>> ['',(),[],0, 0.0, 0L].count([])

      1

      >>> ['',(),[],0, 0.0, 0L].count('')

      1

      >>> ['',(),[],0, 0.0, 0L].count(0.0)

      3

      >>> ['',(),[],0, 0.0, 0L].count(0L)

      3

      BTW2, just had the thought that ',' could be generalized as an operator. E.g.,

      obj, x

      would mean

      type(obj).__dict__['__comma__'](obj, x)

      unless __comma__ was undefined. And you could have __rcomma__ for x, obj.

      Returning the same type object would support multiple commas as in obj, x, y

      Of course, you would get interesting effects in such contexts as print obj,x,y

      and foo(obj,x,y) or even (obj,x,y) vs (obj,x,y,) vs ((obj,x,y),) ;-)

      Probably sequence-building should have priority and be overridden by parenthesized

      expression. So print obj,x would be effectively be print str(obj),x and

      print (obj,x,y) would be print str(type(obj).__dict__['__comma__'](obj,x)).

      Similarly for arg list formation.

      Regards,

      Bengt Richter

      #25; Thu, 27 Dec 2007 00:16:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> writes:

      > Only because Python lacks a logical xor operator, so you're

      > used to thinking of ^ as a bitwise operator. What if you saw

      > string1 xor string2?

      > Wouldn't you expect it to be equivalent to

      > (string1 and (not string2)) or ((not string1) and string2)

      Yes, no problem. I was definitely working with the bitwise operator

      which is what I thought the OP was originally desiring.

      > > It doesn't feel natural to me to have my strings suddenly

      > > interpreted as a new data type based on the operation at hand.

      > > Logical operators work that way but not numerics

      > I don't know what you mean by that. Nobody seems to have a

      > problem with "and" "or" and "not" operators using the truth

      > values of strings. What is there about "xor" that precludes it

      > from behaving similarly?

      I think we're just talking about different items. I was referring to

      the bitwise (^) xor operator (numeric), but not to a logical xor

      operator. I'd have no problem with a separate logical operator that

      behaved like the other logical operators.

      The fact that saying "xor" can imply either isn't helping :-)

      -- David

      #26; Thu, 27 Dec 2007 00:17:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> writes:

      > While I agree with your points, they're immaterial to the

      > argument I was making. The poster to which I responded was

      > arguing that "xor" didn't make sense because having it coerce

      > it's arguments to booleans was "wrong".

      Well, actually I said:

      "Logical operators work that way but not numerics ..."

      so I'm not arguing against a logical "xor" working that way, but not

      the existing xor (^) operation in Python, which is bitwise.

      -- David

      #27; Thu, 27 Dec 2007 00:18:00 GMT
    • On Sun, 10 Oct 2004 22:03:08 +0000, Bengt Richter wrote:

      > What's right about accepting 3^7 ? Why should xor be defined for integers?

      > IMO we have implicit subtyping of integers as vectors or column matrices

      > of bools and operations element by element and implicit re-presentation

      > of the result as integer.

      I'm intrigued but torn by your arguments.

      For positive numbers, a number really is its vector of bits. In the

      mathematical sense of "equal" (which I usually express in English as "two

      equal things are fully substitutable with each other in all relevant

      contexts"), where all base numbers are written in base 10:

      10 base 10 = 11 base 9 = 22 base 4 = 1010 base 2

      The fact that it happens to really be a bit vector in the computer in this

      case actually doesn't count for anything, because just as that bit vector

      *is*, in every conceivable mathematical way, a base 10 number, it is also

      a base 3 number. I can and have defined ternary math before, and

      experimental computers have been built on ternary bases and ternary logic,

      and 44 in a binary computer is 44 in a ternary computer.

      Only our choice of negative numbers, practically speaking, makes a

      difference between the number-as-bitstring and number-qua-number, and

      something as high level as Python can conceptually use an extra "negative"

      bit so even that distinction goes away. In fact, playing with xor'ing

      longs makes me wonder if Python doesn't *already* do that.

      > What is boolvec(388488839405842L) ^ boolvec("hello")?

      I'd rather see this as the fairly meaningless "388488839405842L base 2",

      meaningless because the base of a number does not affect its value, and

      bitwise-xor, as the name implies, operates on the number base two.

      Since I reject the need to cast ^ in terms of boolvec, I don't feel

      compelled to try to define "boolvec" for strings. Cast it into a number

      explicitly, refusing the temptation to guess.

      > Except where there is an accepted legacy of such things being done already ;-)

      > BTW, what is the rationale behind this:

      > >>> ['',(),[],0, 0.0, 0L].count(0)

      > 3

      > >>> ['',(),[],0, 0.0, 0L].count(())

      > 1

      > >>> ['',(),[],0, 0.0, 0L].count([])

      > 1

      > >>> ['',(),[],0, 0.0, 0L].count('')

      > 1

      > >>> ['',(),[],0, 0.0, 0L].count(0.0)

      > 3

      > >>> ['',(),[],0, 0.0, 0L].count(0L)

      > 3

      Again, considered abstractly as numbers, 0 is zero no matter how you slice

      it. Abstractly, even floats have a binary representation, just as they

      have a decimal representation, and we could even xor them. Realistically,

      it is much less useful and there is the infinite-length problem of floats

      to deal with.

      Theoretically, no float should ever equal an int or a long, since an Int

      or Long conceptually identifies exactly one point and a float should be

      seen as representing a range of numbers depending on the precision that

      can be made arbitrarily small but never truly identifies one point. This

      is another one of those "practicality beats purity" things.

      #28; Thu, 27 Dec 2007 00:19:00 GMT
    • On Sun, 10 Oct 2004 22:24:57 GMT, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >On Sun, 10 Oct 2004 22:03:08 +0000, Bengt Richter wrote:

      >> What's right about accepting 3^7 ? Why should xor be defined for integers?

      >> IMO we have implicit subtyping of integers as vectors or column matrices

      >> of bools and operations element by element and implicit re-presentation

      >> of the result as integer.

      >I'm intrigued but torn by your arguments.

      >For positive numbers, a number really is its vector of bits. In the

      ^^^^^[1] [2]^^ ^^^[3] ^^^^[4]

      [1] Abstract, non-negative integers?

      [2] "is" as in "is identical with"? I doubt it.

      [3] In what sense does an abstract number posess (its) a vector of bits?

      I take it 'its' is in the sense of a 1:1 mapping to the corresponding

      (unique) vector of "bits".

      [4] A two's complement representation of an (abstract) integer, has "bits"

      as _numerical_ values in the sum series with corresponding numerical

      coefficients that are powers of two. But "bits" in the context of

      bitwise ^ or & or | do not represent abstract _numerical_ values or 0 or 1,

      they represent logical values False or True (and 0<->False, 1<->True is

      just a convention that we're used to.

      >mathematical sense of "equal" (which I usually express in English as "two

      >equal things are fully substitutable with each other in all relevant

      >contexts"), where all base numbers are written in base 10:

      >10 base 10 = 11 base 9 = 22 base 4 = 1010 base 2

      >The fact that it happens to really be a bit vector in the computer in this

      [1]^^ [2]^^^^^^^^^ ^^^[3]

      [1] The abstract integer value?

      [2] have 1:1 mapping to?

      [3] bit is short for binary _digit_, not binary logic-value -- i.e., it's numeric,

      and ties in with your base 2 representation.

      >case actually doesn't count for anything, because just as that bit vector

      >*is*, in every conceivable mathematical way, a base 10 number, it is also

      1:1 correspondence is not identity. IOW, there's no such thing as "a base 10 number"

      There is a set of abstract integers, and each can be put into 1:1 correpondence with

      a set of symbols and rules. A "base 10 number" is composed with a set of digits [0-9]

      and a rule for ordered composition and a rule for an interpretation to map back to

      the abstract value. We can discuss each part of this process, but the bottom line is

      that when we say that represented values are equal, we are saying that representations

      map back to the same unique abstract value, not that representations per se are identical.

      >a base 3 number. I can and have defined ternary math before, and

      >experimental computers have been built on ternary bases and ternary logic,

      >and 44 in a binary computer is 44 in a ternary computer.

      Sure, but again we are talking unique abstract integers vs various representations

      and the rules for composing them and interpreting them. BTW, did you ever hear

      of bi-quinary? That was a 4-bit representation of a decimal digit where the msb

      had a "coefficient" of 5 and the 3 lsb's counted in binary modulo 5 ;-)

      >Only our choice of negative numbers, practically speaking, makes a

      >difference between the number-as-bitstring and number-qua-number, and

      >something as high level as Python can conceptually use an extra "negative"

      >bit so even that distinction goes away. In fact, playing with xor'ing

      >longs makes me wonder if Python doesn't *already* do that.

      Yes, it plays the as-if-extended-infinitely-to-the-left game with sign bits,

      but the hex representation of negative longs is not much good for seeing what

      the bits are in a negative number >;-(

      >> What is boolvec(388488839405842L) ^ boolvec("hello")?

      >I'd rather see this as the fairly meaningless "388488839405842L base 2",

      >meaningless because the base of a number does not affect its value, and

      >bitwise-xor, as the name implies, operates on the number base two.

      Well, to be nit-picky, I still think it does not really operate on a number

      at all. It converts a number to an ordered set of bools, by way of an intermediate

      two's complement _representation_ whose _numeric_ bit values are mapped to bool values.

      Then, after the operating with the ordered sets of bools, the resulting set is mapped

      back to a two's complement integer representation again, which can be interpreted as

      an abstract integer value again ;-)

      >Since I reject the need to cast ^ in terms of boolvec, I don't feel

      >compelled to try to define "boolvec" for strings. Cast it into a number

      >explicitly, refusing the temptation to guess.

      >> Except where there is an accepted legacy of such things being done already ;-)

      >>

      >> BTW, what is the rationale behind this:

      >>

      >> >>> ['',(),[],0, 0.0, 0L].count(0)

      >> 3

      >> >>> ['',(),[],0, 0.0, 0L].count(())

      >> 1

      >> >>> ['',(),[],0, 0.0, 0L].count([])

      >> 1

      >> >>> ['',(),[],0, 0.0, 0L].count('')

      >> 1

      >> >>> ['',(),[],0, 0.0, 0L].count(0.0)

      >> 3

      >> >>> ['',(),[],0, 0.0, 0L].count(0L)

      >> 3

      I guess this shows what's happening

      >>> class joker(object):

      ... def __cmp__(self, other): return 0

      ...

      >>> ['',(),[],0, 0.0, 0L, joker()].count('')

      2

      >>> ['',(),[],0, 0.0, 0L, joker()].count(())

      2

      >>> ['',(),[],0, 0.0, 0L, joker()].count(0)

      4

      >>> ['',(),[],0, 0.0, 0L, joker()].count(0.0)

      4

      >>> ['',(),[],0, 0.0, 0L, joker()].count(0L)

      4

      >>> ['',(),[],0, 0.0, 0L, joker()].count(joker())

      7

      >Again, considered abstractly as numbers, 0 is zero no matter how you slice

      >it. Abstractly, even floats have a binary representation, just as they

      I think of "float" as indicating representation technique, as opposed to "real"

      which I think of as a point in the continuous abstract -+infinity interval

      of real numbers.

      >have a decimal representation, and we could even xor them. Realistically,

      >it is much less useful and there is the infinite-length problem of floats

      >to deal with.

      Infinite-length problem...must...not...go...there... ;-)

      >Theoretically, no float should ever equal an int or a long, since an Int

      >or Long conceptually identifies exactly one point and a float should be

      >seen as representing a range of numbers depending on the precision that

      IMO a float should be seen as what it is (a _representation_), which may be

      interpreted variously, just like integers ;-)

      IOW, a typical float is a 64-bit IEEE-754 double, so there are 2**64

      possible states of the representation machinery. Not all of

      those states are legal floating point number representations, but each one

      that _is_ legal corresponds to an _exact_ abstract real value. It _can_ be

      the exact value you wanted to represent, or not, depending on your programming

      problem. Inexactness in not a property of floating point representations, it is

      a property of their _relation_ to the exact abstract values the programmer is trying

      to represent. 0.0 represents the exact same abstract value as integer 0

      (perhaps by way of mapping integers conceived as separate to a subset of reals).

      >can be made arbitrarily small but never truly identifies one point. This

      I disagree. 0.0 truly identifies one point. So does

      3.141592653589793115997963468544185161590576171875

      which is the _exact_ value of math.pi in decimal representation.

      But this is _not_ the exact value of the abstract mathematical pi.

      But that is not a problem with floating point numbers' not representing

      exact values. It's that the set of available exact values is finite, and

      you have to choose one to approximate (including right on sometimes) an abstract value.

      (Or you might choose two to represent _exact_ bounds on a value, for interval math).

      >is another one of those "practicality beats purity" things.

      I think the fact that repr(math.pi) is not printed as above is

      a case of "practicality beats purity" ;-)

      >>> import math

      >>> repr(math.pi)

      '3.1415926535897931'

      >>> float(repr(math.pi))

      3.1415926535897931

      >>> float(repr(math.pi)) == float(3.141592653589793115997963468544185161590576 171875)

      True

      IMO floats have their own kind of purity, besides being practical ;-)

      Regards,

      Bengt Richter

      #29; Thu, 27 Dec 2007 00:20:00 GMT
    • On Mon, 11 Oct 2004 02:19:03 +0000, Bengt Richter wrote:

      > On Sun, 10 Oct 2004 22:24:57 GMT, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >>For positive numbers, a number really is its vector of bits. In the

      > ^^^^^[1] [2]^^ ^^^[3] ^^^^[4]

      > [1] Abstract, non-negative integers?

      Integers is more convenient here, but any one dimensional number will do.

      It so happens that computers do not encode "reals" in the naive way, to

      allow for more variation in magnitude.

      > [2] "is" as in "is identical with"? I doubt it.

      With respect, this isn't something to doubt or not doubt. There is one,

      and only one, way to represent any positive number in base two, since

      encoding sign is not an issue. Assuming an extra bit to show sign, there

      is one and only one way to represent any negative number, too. (Zero gets

      to be the exception since then you can have positive and negative zero,

      which are theoretically identical, but on some machines that implement

      them have practical differences.)

      Endianness is just some wackiness in the ordering, but does not really

      change that. 60000 + 50000 = 110000 no matter what endianess you have.

      > [3] In what sense does an abstract number posess (its) a vector of bits?

      It does not. It *is* a series of ones and zeros, in base two. It *is* a

      series of ones, zeros, and twos, in base three. It *is* a series of [0-9]

      in base ten. It is all of these things at once, and an infinite number of

      other things besides.

      1 + 1 fully *is* 2. There is no difference mathematically. If there is,

      your math is broken.

      > [4] A two's complement representation of an (abstract) integer, has

      > "bits"

      > as _numerical_ values in the sum series with corresponding numerical

      > coefficients that are powers of two. But "bits" in the context of

      > bitwise ^ or & or | do not represent abstract _numerical_ values or

      > 0 or 1, they represent logical values False or True (and 0<->False,

      > 1<->True is just a convention that we're used to.

      No. They represent both of those things at once. 1 *is* True, and 0 *is*

      false, in the domain of bitwise operators.

      There is no "either/or" here, only "both/and".

      > IOW, there's no such thing as "a base 10 number"... A "base 10 number" is composed with a set of

      > digits [0-9] and a rule for ordered composition and a rule for an

      > interpretation to map back to the abstract value.

      Well, I thank you for supplying the missing definition, I suppose.

      > We can discuss each

      > part of this process, but the bottom line is that when we say that

      > represented values are equal, we are saying that representations map

      > back to the same unique abstract value, not that representations per se

      > are identical.

      Yep, that is part of my point, see below.

      >>> What is boolvec(388488839405842L) ^ boolvec("hello")?

      >>

      >>I'd rather see this as the fairly meaningless "388488839405842L base 2",

      >>meaningless because the base of a number does not affect its value, and

      >>bitwise-xor, as the name implies, operates on the number base two.

      > Well, to be nit-picky, I still think it does not really operate on a

      > number at all. It converts a number to an ordered set of bools, by way

      > of an intermediate two's complement _representation_ whose _numeric_ bit

      > values are mapped to bool values. Then, after the operating with the

      > ordered sets of bools, the resulting set is mapped back to a two's

      > complement integer representation again, which can be interpreted as an

      > abstract integer value again ;-)

      In the domain of binary logic, I do not concede a difference between "1"

      and "True", on the grounds that there is no way in the domain of binary

      logic to distinguish the two.

      > IOW, a typical float is a 64-bit IEEE-754 double, so there are 2**64

      > possible states of the representation machinery. Not all of those states

      > are legal floating point number representations, but each one that _is_

      > legal corresponds to an _exact_ abstract real value. It _can_ be the

      > exact value you wanted to represent, or not, depending on your

      > programming problem. Inexactness in not a property of floating point

      > representations, it is a property of their _relation_ to the exact

      > abstract values the programmer is trying to represent. 0.0 represents

      > the exact same abstract value as integer 0 (perhaps by way of mapping

      > integers conceived as separate to a subset of reals).

      This is a matter of definition. My problem with your definition mostly

      lies in the fact that it is not as practically useful as mine. Someone

      operating on the "floats are almost, but not quite, exact values" is

      *much* less well equipped to handle the way floats bite you than someone

      who operates with "floats represent a value and have an error range around

      that value", because then they should understand they need to compute not

      just with the values, but track the buildup of that error range as well.

      It's a point-of-view thing.

      >>can be made arbitrarily small but never truly identifies one point. This

      > I disagree. 0.0 truly identifies one point. So does

      > 3.141592653589793115997963468544185161590576171875

      > which is the _exact_ value of math.pi in decimal representation. But

      > this is _not_ the exact value of the abstract mathematical pi.

      Python 2.3.4 (#1, Sep 28 2004, 20:15:25)

      [GCC 3.3.3 20040412 (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6)] on linux2

      Type "help", "copyright", "credits" or "license" for more information.

      >>> 3.141592653589793115997963468544185161590576171875 8327502398752309 == 3.141592653589793115997963468544185161590576171875 00000000000

      True

      Look at the last digits. (Python is just a convenience here, this is

      fundamental to all float representation schemes, as opposed to things like

      Decimal.)

      This is why I say they represent a range. In set theory, we'd call this an

      equivalence class for equality. There is *no* way to distinguish between

      3.141592653589793115997963468544185161590576171875 8327502398752309 and

      3.141592653589793115997963468544185161590576171875 00000000000 with floats

      (of the size Python uses, add more digits for some other bit count), not

      even in theory. They are the same in every way. That's because the closest

      float represents a range that covers them both. You can shrink that range

      arbitrarily small but you can not get to a unique point, such that only

      one true real will compare equal to that number and all others do not, not

      even in theory.

      #30; Thu, 27 Dec 2007 00:21:00 GMT
    • On 2004-10-11, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      > With respect, this isn't something to doubt or not doubt.

      > There is one, and only one, way to represent any positive

      > number in base two, since encoding sign is not an issue.

      > Assuming an extra bit to show sign, there is one and only one

      > way to represent any negative number, too.

      That's news to me. I've used three different base-2

      representations for negative numbers in the past week, and I

      can think of at least one other one I've used in the past.

      > (Zero gets to be the exception since then you can have

      > positive and negative zero,

      That depends on which base-2 representation you've chosen. In

      two's compiliment and excess-N representations, there is only

      one zero value. In signed-magnitude there may be two.

      --

      Grant Edwards grante Yow! I've been WRITING

      at to SOPHIA LOREN every 45

      visi.com MINUTES since JANUARY 1ST!!

      #31; Thu, 27 Dec 2007 00:22:00 GMT
    • Andrew Dalke <adalke.python.todaysummary.com.mindspring.com> wrote:

      > Grant Edwards:

      > > No, he explained exactly what he was trying to do, and it had

      > > nothing to do with encryption. He wants to know if exactly one

      > > (1) of the strings is the empty string.

      > BTW, another way to get that is

      > if bool(str1) + bool(str2) == 1:

      > print "one and only one of them was empty"

      If this is getting into a many-ways-to-skin-the-cat context, I want to

      make sure I get my entry in:

      [str1, str2].count('') == 1

      Other possibilities (I think -- untested, and it's late here):

      len([x for x in (str1, str2) if x]) == 1

      '' == min(str1, str2) != max(str1, str2)

      len(str1+str2) == max(len(str1),len(str2)) != 0

      sum(map(bool, (str1, str2))) == 1

      not str1 ^ not str2

      (str1 or str2) and not min(str1, str2)

      not str1 != not str2

      str1 != str2 and (str1+str2) == (str2+str1)

      I guess I'd better stop here because I can't actually _prove_ the last

      one of these actually implies one of the strings is empty but I can't

      find counterexamples either...!-)

      Alex

      #32; Thu, 27 Dec 2007 00:23:00 GMT
    • Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      > On Fri, 08 Oct 2004 16:19:22 -0400, dataangel wrote:

      > > Regardless of whether this is the best implementation for detecting if

      > > two strings are similar, I don't see why xor for strings shouldn't be

      > > supported. Am I missing something?

      > The basic problem is that there is no obvious "xor on string" operation

      > *in general*, even if you stipulate "bitwise". In particular, what does it

      > mean to bitwise xor two different length strings?

      What I, personally, would expect here, would be to leave the trailing

      part of the longer string "untouched". I.e., just like:

      def bitwisexor(s1, s2):

      result = [ chr(ord(a) ^ ord(b)) for a, b in zip(s1, s2) ]

      # at most one of the following 2 stmts actually does anything:

      result.extend(s1[len(result):])

      result.extend(s2[len(result):])

      return ''.join(result)

      I realize this is the same as imagining the shorter string to be

      conceptually padded with as many '\x0' characters as needed -- that just

      seems the only "intuitively natural" way to extend "sequence xor'ing" to

      sequences of different length...

      Alex

      #33; Thu, 27 Dec 2007 00:24:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> wrote:

      ...

      > used to thinking of ^ as a bitwise operator. What if you saw

      > string1 xor string2?

      > Wouldn't you expect it to be equivalent to

      > (string1 and (not string2)) or ((not string1) and string2)

      Personally, I would not expect 'xor' to be non-commutative, and yet the

      expression YOU apparently would expect it to be equivalent to obviousy

      is...:

      'foo' xor ''

      would be True, while

      '' xor 'foo'

      would be 'foo'. At least, that's how the "equivalent to" expression you

      give evaluates, I believe. The problem is that "not x" is a bool for

      any x, while "x and y" isn't.

      If I had to design the least-astonishing definition of xor in Python, I

      would probably have to settle for a mix of the type-producing behavior

      of and/or (always return one of the two operands, a wonderful thing) vs

      not (always return a bool, and it's hard to imagine it doing otherwise).

      "a xor b" would return one of the two operands if exactly one of them

      evaluates to true in a boolean context, but False if both or neither do

      (sigh).

      E.g.,

      def xor(a, b):

      if (a and b) or (not a and not b): return False

      else: return a or b

      Unfortunately, it seems to me that his unholy admixture of return types

      (hard to avoid...), as well as the inevitable lack of short-circuiting,

      makes such a hypothetical 'xor' _WAY_ less useful than the existing

      'and'/'or' operators are:-(

      Alex

      #34; Thu, 27 Dec 2007 00:25:00 GMT
    • On Mon, 11 Oct 2004 04:08:49 GMT, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >On Mon, 11 Oct 2004 02:19:03 +0000, Bengt Richter wrote:

      >> On Sun, 10 Oct 2004 22:24:57 GMT, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >>>For positive numbers, a number really is its vector of bits. In the

      >> ^^^^^[1] [2]^^ ^^^[3] ^^^^[4]

      >> [1] Abstract, non-negative integers?

      >Integers is more convenient here, but any one dimensional number will do.

      >It so happens that computers do not encode "reals" in the naive way, to

      >allow for more variation in magnitude.

      True. But my point (just about everywhere in the post) was to draw attention

      to the distinction between abstract entities (e.g. numbers) and concrete representations.

      >> [2] "is" as in "is identical with"? I doubt it.

      >With respect, this isn't something to doubt or not doubt. There is one,

      >and only one, way to represent any positive number in base two, since

      I think we are focusing on different things. When you say "a number really is

      its vector of bits" I think wait: a number is a number and a vector of bits

      is a vector of bits, but you can't say one _is_ the other. You can say they

      correspond in some way. When you say of a number "its vector of bits" I presume

      that "its" refers to "its" partner in the number<->bit vector relationship,

      not in the sense that an aggregate has its components. To me, a (one-dimensional)

      number is a primitive entity, and "has" no parts, though it may correspond to

      a representation that does have parts and order etc.

      I am hard put to think of anything that is _identical_ with any of the

      possible representations that it may be put in correspondence with.

      >encoding sign is not an issue. Assuming an extra bit to show sign, there

      >is one and only one way to represent any negative number, too. (Zero gets

      >to be the exception since then you can have positive and negative zero,

      >which are theoretically identical, but on some machines that implement

      >them have practical differences.)

      >Endianness is just some wackiness in the ordering, but does not really

      >change that. 60000 + 50000 = 110000 no matter what endianess you have.

      Wackiness must be accounted for ;-)

      >> [3] In what sense does an abstract number posess (its) a vector of bits?

      >It does not. It *is* a series of ones and zeros, in base two. It *is* a

      >series of ones, zeros, and twos, in base three. It *is* a series of [0-9]

      >in base ten. It is all of these things at once, and an infinite number of

      >other things besides.

      Depends on what the meaning of "is" is ;-) Your "is" seems to mean

      "can be represented by" in my terms. That's not "is" in my book ;-)

      >1 + 1 fully *is* 2. There is no difference mathematically. If there is,

      >your math is broken.

      You are apparently ignoring representation and just thinking in terms of

      what is indicated. So yes, '1 + 1' represents something and '2' represents

      something, and for some getabstractvalue, getabstractvalue('1 + 1') _is_

      getabstractvalue('2'). But whatever your representation, there is that

      function that maps to the abstract mathematical value.

      >> [4] A two's complement representation of an (abstract) integer, has

      >> "bits"

      >> as _numerical_ values in the sum series with corresponding numerical

      >> coefficients that are powers of two. But "bits" in the context of

      >> bitwise ^ or & or | do not represent abstract _numerical_ values or

      >> 0 or 1, they represent logical values False or True (and 0<->False,

      >> 1<->True is just a convention that we're used to.

      >No. They represent both of those things at once. 1 *is* True, and 0 *is*

      >false, in the domain of bitwise operators.

      ISTM we have different concepts of "is" ;-)

      >There is no "either/or" here, only "both/and".

      >> IOW, there's no such thing as "a base 10 number"... A "base 10 number" is composed with a set of

      >> digits [0-9] and a rule for ordered composition and a rule for an

      >> interpretation to map back to the abstract value.

      >Well, I thank you for supplying the missing definition, I suppose.

      Well, I made up an ad hoc definition to try to express my thinking on it ;-)

      >> We can discuss each

      >> part of this process, but the bottom line is that when we say that

      >> represented values are equal, we are saying that representations map

      >> back to the same unique abstract value, not that representations per se

      >> are identical.

      >Yep, that is part of my point, see below.

      >>>> What is boolvec(388488839405842L) ^ boolvec("hello")?

      >>>

      >>>I'd rather see this as the fairly meaningless "388488839405842L base 2",

      >>>meaningless because the base of a number does not affect its value, and

      >>>bitwise-xor, as the name implies, operates on the number base two.

      >> Well, to be nit-picky, I still think it does not really operate on a

      >> number at all. It converts a number to an ordered set of bools, by way

      >> of an intermediate two's complement _representation_ whose _numeric_ bit

      >> values are mapped to bool values. Then, after the operating with the

      >> ordered sets of bools, the resulting set is mapped back to a two's

      >> complement integer representation again, which can be interpreted as an

      >> abstract integer value again ;-)

      >In the domain of binary logic, I do not concede a difference between "1"

      >and "True", on the grounds that there is no way in the domain of binary

      >logic to distinguish the two.

      It's a matter of interpretation. I view 1 and True as different, sort of like

      >>> type(1), type(bool(1))

      (<type 'int'>, <type 'bool'>)

      >> IOW, a typical float is a 64-bit IEEE-754 double, so there are 2**64

      >> possible states of the representation machinery. Not all of those states

      >> are legal floating point number representations, but each one that _is_

      >> legal corresponds to an _exact_ abstract real value. It _can_ be the

      >> exact value you wanted to represent, or not, depending on your

      >> programming problem. Inexactness in not a property of floating point

      >> representations, it is a property of their _relation_ to the exact

      >> abstract values the programmer is trying to represent. 0.0 represents

      >> the exact same abstract value as integer 0 (perhaps by way of mapping

      >> integers conceived as separate to a subset of reals).

      >This is a matter of definition. My problem with your definition mostly

      >lies in the fact that it is not as practically useful as mine. Someone

      >operating on the "floats are almost, but not quite, exact values" is

      I didn't say that. I said they do represent _exact_ values, and that you

      can use the available exact values as you like, including intepreting them

      as tokens representing an interval on the real axis containing all the values

      that will round to the exact value in question.

      >*much* less well equipped to handle the way floats bite you than someone

      >who operates with "floats represent a value and have an error range around

      >that value", because then they should understand they need to compute not

      >just with the values, but track the buildup of that error range as well.

      >It's a point-of-view thing.

      >>>can be made arbitrarily small but never truly identifies one point. This

      >> I disagree. 0.0 truly identifies one point. So does

      >>

      >> 3.141592653589793115997963468544185161590576171875

      >>

      >> which is the _exact_ value of math.pi in decimal representation. But

      >> this is _not_ the exact value of the abstract mathematical pi.

      >Python 2.3.4 (#1, Sep 28 2004, 20:15:25)

      >[GCC 3.3.3 20040412 (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6)] on linux2

      >Type "help", "copyright", "credits" or "license" for more information.

      >>>> 3.141592653589793115997963468544185161590576171875 8327502398752309 == 3.141592653589793115997963468544185161590576171875 00000000000

      >True

      Two different numbers were rounded to the same exact number ;-)

      >Look at the last digits. (Python is just a convenience here, this is

      >fundamental to all float representation schemes, as opposed to things like

      >Decimal.)

      >This is why I say they represent a range. In set theory, we'd call this an

      I agree they can be used to _represent_ a range. The range (actually a half-open

      continuous interval I think) will be defined in terms of the _exact_ value

      and the neighboring _exact_ values available in the set of values with concrete

      representations in the particular hardware.

      >equivalence class for equality. There is *no* way to distinguish between

      >3.141592653589793115997963468544185161590576171875 8327502398752309 and

      >3.141592653589793115997963468544185161590576171875 00000000000 with floats

      >(of the size Python uses, add more digits for some other bit count), not

      >even in theory. They are the same in every way. That's because the closest

      >float represents a range that covers them both. You can shrink that range

      Now tell me _exactly_ what that "range" is. I know it's a messy algorithm,

      but the intevals are _exactly_ defined, so where will the _exact_ mathematical

      values of the limits come from? ;-) (we'll ignore fpu rounding modes ;-)

      >arbitrarily small but you can not get to a unique point, such that only

      >one true real will compare equal to that number and all others do not, not

      >even in theory.

      Well, yes, we can only create finite sets of representations ;-)

      But you can "get to a unique point" by switching interpretation. I.e.,

      if you take the floating point number's exact value as such, instead of

      using that value and neighbors as a means to define an interval, then

      you do have a unique point. E.g., all the exact values of a 53-bit integer

      can be represented _exactly_ by a typical 64-bit double. The floating point

      representation is no less exact than the corresponding int or long.

      Regards,

      Bengt Richter

      #35; Thu, 27 Dec 2007 00:26:00 GMT
    • aleaxit.python.todaysummary.com.yahoo.com (Alex Martelli) writes:

      >> Grant Edwards:

      >> > No, he explained exactly what he was trying to do, and it had

      >> > nothing to do with encryption. He wants to know if exactly one

      >> > (1) of the strings is the empty string.

      [...]

      > not str1 ^ not str2

      [...]

      > not str1 != not str2

      These give syntax errors, actually. I wouldn't have expected that

      either, though :)

      > str1 != str2 and (str1+str2) == (str2+str1)

      Counter example:

      >>> str1 = "x"; str2 = "xx"

      >>> str1 != str2 and (str1+str2) == (str2+str1)

      True

      Bernhard

      --

      Intevation GmbH http://intevation.de/

      Skencil http://sketch.sourceforge.net/

      Thuban http://thuban.intevation.org/

      #36; Thu, 27 Dec 2007 00:27:00 GMT
    • Bernhard Herzog <bh.python.todaysummary.com.intevation.de> wrote:

      ...

      > > not str1 ^ not str2

      > > not str1 != not str2

      > These give syntax errors, actually. I wouldn't have expected that

      > either, though :)

      Yep, you need parentheses, as in (not str1) &c.

      > > str1 != str2 and (str1+str2) == (str2+str1)

      > Counter example:

      > >>> str1 = "x"; str2 = "xx"

      Yep, excellent counterexample, thanks. So let's try instead:

      (str1+str2==str2!=str1) or (str2+str1==str1!=str2)

      I _do_ really want something based on concatenation equaling one of the

      strings iff the other's empty...

      Alex

      #37; Thu, 27 Dec 2007 00:28:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> wrote:

      ...

      > Probably so, but that doesn't support the arguement that

      > there's something wrong with a logical xor argument coercing

      > it's operands to boolean values unless one also argues that

      > the logical "and", "or" and "not" operators should also not

      > coerce their operands to booleans.

      One of the greatest sources of usefulness for 'and' and 'or' is exactly

      that these operators _don't_ coerce their operands -- they always return

      one of the objects that are given as their operands, never any

      'coercion' of it. This lets one code, e.g.,

      k, v = (onedict or another).popitem()

      instead of

      if onedict:

      k, v = onedict.popitem()

      else:

      k, v = another.popitem()

      Operator 'xor' couldn't possibly ensure this useful behavior, alas.

      Alex

      #38; Thu, 27 Dec 2007 00:29:00 GMT
    • On 2004-10-11, Alex Martelli <aleaxit.python.todaysummary.com.yahoo.com> wrote:

      > Grant Edwards <grante.python.todaysummary.com.visi.com> wrote:

      > ...

      >> Probably so, but that doesn't support the arguement that

      >> there's something wrong with a logical xor argument coercing

      >> it's operands to boolean values unless one also argues that

      >> the logical "and", "or" and "not" operators should also not

      >> coerce their operands to booleans.

      > One of the greatest sources of usefulness for 'and' and 'or' is exactly

      > that these operators _don't_ coerce their operands

      Surely they must coerce the operands for the "test" part if not

      for the "return" part.

      --

      Grant Edwards grante Yow! Being a BALD HERO

      at is almost as FESTIVE as a

      visi.com TATTOOED KNOCKWURST.

      #39; Thu, 27 Dec 2007 00:30:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> wrote:

      > On 2004-10-11, Alex Martelli <aleaxit.python.todaysummary.com.yahoo.com> wrote:

      > > Grant Edwards <grante.python.todaysummary.com.visi.com> wrote:

      > > ...

      > >> Probably so, but that doesn't support the arguement that

      > >> there's something wrong with a logical xor argument coercing

      > >> it's operands to boolean values unless one also argues that

      > >> the logical "and", "or" and "not" operators should also not

      > >> coerce their operands to booleans.

      > > One of the greatest sources of usefulness for 'and' and 'or' is exactly

      > > that these operators _don't_ coerce their operands

      > Surely they must coerce the operands for the "test" part if not

      > for the "return" part.

      We may have different conceptions about the meaning of the word

      "coerce", perhaps...? For example, Python may be doing len(x) when I

      use x in a boolean context, but I sure don't see computing len(x) as

      "coercing" anything. To me, Coercion, in Python, is what's documented

      at <http://docs.python.org/ref/coercion-rules.html> and is quite a

      different concept from Truth Value Testing, which is documented at

      <http://docs.python.org/lib/truth.html>.

      Alex

      #40; Thu, 27 Dec 2007 00:31:00 GMT
    • On 2004-10-11, Alex Martelli <aleaxit.python.todaysummary.com.yahoo.com> wrote:

      >> Surely they must coerce the operands for the "test" part if not

      >> for the "return" part.

      > We may have different conceptions about the meaning of the word

      > "coerce", perhaps...? For example, Python may be doing len(x) when I

      > use x in a boolean context,

      I guess I was assuming Python was doing the equivalent of

      bool(x) when x was used as an operand of a logical operator.

      That's coercion to me.

      > but I sure don't see computing len(x) as "coercing" anything.

      > To me, Coercion, in Python, is what's documented at

      > <http://docs.python.org/ref/coercion-rules.html> and is quite

      > a different concept from Truth Value Testing, which is

      > documented at <http://docs.python.org/lib/truth.html>.

      I guess I'm misusing the term coercion (at least in a Python

      context). To _me_ operands of logical operators are being

      coerced to boolean.

      --

      Grant Edwards grante Yow! I'm gliding over a

      at NUCLEAR WASTE DUMP near

      visi.com ATLANTA, Georgia!!

      #41; Thu, 27 Dec 2007 00:32:00 GMT
    • Grant Edwards <grante.python.todaysummary.com.visi.com> wrote:

      > On 2004-10-11, Alex Martelli <aleaxit.python.todaysummary.com.yahoo.com> wrote:

      > >> Surely they must coerce the operands for the "test" part if not

      > >> for the "return" part.

      > > We may have different conceptions about the meaning of the word

      > > "coerce", perhaps...? For example, Python may be doing len(x) when I

      > > use x in a boolean context,

      > I guess I was assuming Python was doing the equivalent of

      > bool(x) when x was used as an operand of a logical operator.

      > That's coercion to me.

      Well, Python is doing just the same thing it was doing before 'bool' was

      introduced (in 2.2.1, I think) -- Truth Value Testing. I believe a very

      similar transition happened, for example, in the C++ language: any

      number or pointer was and still is acceptable in a "condition" context

      (e.g., for C, 'if(x) ...'). That used to be quite different from

      coercions. Then a 'bool' type was introduced -- the language still kept

      doing exactly the same thing for 'if(x)' and the like, but even though

      nothing was changed, it ``feels'' taxonomically different (in C++ the

      case is stronger, because a new 'operator bool' was also introduced; and

      with it new and important pitfalls, cfr

      <http://www.artima.com/cppsource/safebool.html> ).

      > > but I sure don't see computing len(x) as "coercing" anything.

      > > To me, Coercion, in Python, is what's documented at

      > > <http://docs.python.org/ref/coercion-rules.html> and is quite

      > > a different concept from Truth Value Testing, which is

      > > documented at <http://docs.python.org/lib/truth.html>.

      > I guess I'm misusing the term coercion (at least in a Python

      > context). To _me_ operands of logical operators are being

      > coerced to boolean.

      Didactically, I still prefer to say they're "being truth-value tested",

      because the concept of a hypothetical coercion which happens during the

      testing then "un-happens" to determine the result is just impossible to

      get across -- even if I thought it somehow more correct or preferable,

      which I don't. Note, again, that in C++ the case to consider

      truth-value testing as coercion is stronger, because && and ||

      (shortcircuiting logical operators) _DO_ undertake to always return a

      bool, when applied to C++'s built-in types. I consider this a

      suboptimal decision, prompting people to code 'x?x:y' or 'x?y:x' where

      they might have coded, more simply, x||y or x&&y respectively, were it

      not for the coercion which the latter forms imply.

      BTW, if you consider a form such as 'x?x:y', which is a better match for

      the semantics of Python's 'x or y' than C++'s || operator, you will see

      why speaking of operandS (plural) as being 'coerced' is surely

      misleading. Under _no_ conceivable circumstances is the RHS operand

      subject to any operation whatsoever by 'x or y'. Whatever "coercion"

      is, it definitely cannot be happening here on y. Similarly for the

      analogous 'x and y' -- never is any operation at all done on y.

      Truth Value Testing (which you want to call coercion) happens on the

      first (LHS) operand, but not on the second (RHS) one, ever...

      Alex

      #42; Thu, 27 Dec 2007 00:33:00 GMT
    • On Mon, 11 Oct 2004 04:24:28 +0000, Grant Edwards wrote:

      > On 2004-10-11, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >> With respect, this isn't something to doubt or not doubt.

      >> There is one, and only one, way to represent any positive

      >> number in base two, since encoding sign is not an issue.

      >> Assuming an extra bit to show sign, there is one and only one

      >> way to represent any negative number, too.

      > That's news to me. I've used three different base-2

      > representations for negative numbers in the past week, and I

      > can think of at least one other one I've used in the past.

      I am aware of only one encoding that uses a single bit to represent sign,

      as I stipulated, and discarding endianness issues I'm having a hard time

      imagining what reasonable alternatives there are.

      >> (Zero gets to be the exception since then you can have

      >> positive and negative zero,

      > That depends on which base-2 representation you've chosen. In

      > two's compiliment and excess-N representations, there is only

      > one zero value. In signed-magnitude there may be two.

      I explicitly only discussed signed-magnitude: "Assuming an extra bit to

      show sign".

      Normally I'd exhort you to read more carefully but Bengt pointed out some

      places I wasn't rigorous with my terms. :-) So I earned this.

      #43; Thu, 27 Dec 2007 00:34:00 GMT
    • On 2004-10-11, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      >>> With respect, this isn't something to doubt or not doubt.

      >>> There is one, and only one, way to represent any positive

      >>> number in base two, since encoding sign is not an issue.

      >>> Assuming an extra bit to show sign, there is one and only one

      >>> way to represent any negative number, too.

      >>

      >> That's news to me. I've used three different base-2

      >> representations for negative numbers in the past week, and I

      >> can think of at least one other one I've used in the past.

      > I am aware of only one encoding that uses a single bit to

      > represent sign, as I stipulated, and discarding endianness

      > issues I'm having a hard time imagining what reasonable

      > alternatives there are.

      * Two's compliment.

      * One's compliment.

      * Signed-magnitude with a "1" sign bit being positive.

      * Signed-magnitude with a "1" sign bit being negative.

      * Excess-N notation.

      Four of the five are in use in the software I'm working on

      today.

      >>> (Zero gets to be the exception since then you can have

      >>> positive and negative zero,

      >>

      >> That depends on which base-2 representation you've chosen. In

      >> two's compiliment and excess-N representations, there is only

      >> one zero value. In signed-magnitude there may be two.

      > I explicitly only discussed signed-magnitude: "Assuming an

      > extra bit to show sign".

      I don't know what you mean. Two's compliment, one's

      compliment, signed-magnitude, and excess-N _all_ use a single

      bit for sign.

      --

      Grant Edwards grante Yow! Kids, don't gross me

      at off... "Adventures with

      visi.com MENTAL HYGIENE" can be

      carried too FAR!

      #44; Thu, 27 Dec 2007 00:35:00 GMT
    • On Mon, 11 Oct 2004 19:35:55 +0000, Grant Edwards wrote:

      > I don't know what you mean. Two's compliment, one's

      > compliment, signed-magnitude, and excess-N _all_ use a single

      > bit for sign.

      I meant a single bit exclusively for sign, such that

      1 00001001

      and

      0 00001001

      are the same absolute value. The complements don't work that way. I

      guess that does leave a question for whether 1 is positive or negative,

      but for the purposes of my point, that doesn't matter much.

      #45; Thu, 27 Dec 2007 00:36:00 GMT
    • On 2004-10-11, Jeremy Bowers <jerf.python.todaysummary.com.jerf.org> wrote:

      > On Mon, 11 Oct 2004 19:35:55 +0000, Grant Edwards wrote:

      >> I don't know what you mean. Two's compliment, one's

      >> compliment, signed-magnitude, and excess-N _all_ use a single

      >> bit for sign.

      > I meant a single bit exclusively for sign, such that

      > 1 00001001

      > and

      > 0 00001001

      > are the same absolute value. The complements don't work that

      > way. I guess that does leave a question for whether 1 is

      > positive or negative, but for the purposes of my point, that

      > doesn't matter much.

      OK, if you meant signed-magnitude, you should have said that. ;)

      AFAICT, the only people that use signed-magnitude for _integer_

      quantities [IEEE FP representations are signed-magnitude] are

      the analog guys who design A/D converters [and somebody needs

      to slap them and tell them that out in the real world we want

      2's compliment].

      I haven't _ever_ seen a signed-magnitude integer ALU, but I'm

      sure some existed back before I started doing this sort of

      stuff 25 years ago.

      --

      Grant Edwards grante Yow! My TOYOTA is built

      at like a... BAGEL with CREAM

      visi.com CHEESE!!

      #46; Thu, 27 Dec 2007 00:37:00 GMT
    • On Mon, 11 Oct 2004 22:25:26 +0000, Grant Edwards wrote:

      > OK, if you meant signed-magnitude, you should have said that. ;)

      > AFAICT, the only people that use signed-magnitude for _integer_

      > quantities [IEEE FP representations are signed-magnitude] are

      I never said they existed :-)

      #47; Thu, 27 Dec 2007 00:38:00 GMT
    • Grant Edwards wrote:

      >>One of the greatest sources of usefulness for 'and' and 'or' is exactly

      >>that these operators _don't_ coerce their operands

      >

      > Surely they must coerce the operands for the "test" part if not

      > for the "return" part.

      Is calling a method cosidered coercing? The "test" part must call the

      object's __nonzero__ method but no Python boolean object is ever created.

      Cheers,

      Brian

      #48; Thu, 27 Dec 2007 00:39:00 GMT
    • I suggest you have a look at the following:

      http://aspn.activestate.com/ASPN/Co...n/Recipe/252237

      I believe it does what you want (and has little to do with the

      follow-up discussion so far!)

      Andre Roberge

      dataangel <k04jg02.python.todaysummary.com.kzoo.edu> wrote in message news:<mailman.4604.1097266756.5135.python-list.python.todaysummary.com.python.org>...

      > I wrote a function to compare whether two strings are "similar" because

      > I'm using python to make a small text adventure engine and I want to it

      > to be responsive to slight mispellings, like "inevtory" and the like. To

      > save time the first thing my function does is check if I'm comparing an

      > empty vs. a non-empty string, because they are to be never considered

      > similar. Right now I have to write out the check like this:

      > if str1 and str2:

      > if not str1 or not str2:

      > return 0

      > Because python won't let me do str1 ^ str2. Why does this only work for

      > numbers? Just treat empty strings as 0 and nonempty as 1.

      > Regardless of whether this is the best implementation for detecting if

      > two strings are similar, I don't see why xor for strings shouldn't be

      > supported. Am I missing something? Inparticular, I think it'd be cool to

      > have "xor" as opposed to "^". The carrot would return the resulting

      > value, while "xor" would act like and/or do and return the one that was

      > true (if any).

      #49; Thu, 27 Dec 2007 00:40:00 GMT
    • Alex

      > str1 != str2 and (str1+str2) == (str2+str1)

      > I guess I'd better stop here because I can't actually _prove_ the last

      > one of these actually implies one of the strings is empty but I can't

      > find counterexamples either...!-)

      str1 = "a"

      str2 = "aa"

      Andrew

      dalke.python.todaysummary.com.dalkescientific.com

      #50; Thu, 27 Dec 2007 00:41:00 GMT