Windows, subprocess.Popen & encoding
For a long time, I have had problems with the strings returned, on Windows, by
subprocess.Popen / stdout.read()
Last night I found, by chance, that checking whether the second byte equals 0
is, perhaps, the solution.
With a code like this:
p=subprocess.Popen(u850("cmd /u/c ...
Different scripts seem to run OK. I tried with:
- common dir
- dir on unicode-named-files
- various commands
But I haven't found anything about this in any documentation.
Can somebody confirm? Am I misled? Am I right?
* sorry for my bad English *
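A rough sketch of the check described above (this is not the truncated code from the post). The `/u` switch of cmd.exe makes internal commands such as `dir` write Unicode (UTF-16-LE) to a pipe, so for mostly-ASCII output the second byte is 0. The helper names `looks_like_utf16le` and `run_cmd_unicode` are hypothetical, the cp850 fallback is only a guess at the OEM code page, and the Popen call is only meaningful on Windows.

```python
import subprocess
import sys


def looks_like_utf16le(data):
    """Heuristic from the post: a zero second byte suggests UTF-16-LE."""
    return len(data) >= 2 and data[1] == 0


def run_cmd_unicode(command):
    """Run an internal cmd.exe command with /u and return its decoded output.

    Assumption: the command is a cmd.exe built-in (e.g. "dir"); external
    programs started by cmd choose their own output encoding.
    """
    p = subprocess.Popen(["cmd", "/u", "/c", command], stdout=subprocess.PIPE)
    raw, _ = p.communicate()
    if looks_like_utf16le(raw):
        return raw.decode("utf-16-le", errors="replace")
    # Fallback is a guess: cp850 is one common OEM code page, not a universal one.
    return raw.decode("cp850", errors="replace")


if __name__ == "__main__" and sys.platform == "win32":
    print(run_cmd_unicode("dir"))
```

The heuristic can misfire when the first character of UTF-16 output lies outside Latin-1, because its second byte is then not 0.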
- 2 Comments
- > But I haven't found anything about this in any documentation.
> Can somebody confirm? Am I misled? Am I right?
You are right, and you are misled. The encoding of the data
that you get from Popen.read is not under the control of Python:
i.e. not only do you not know it, but Python doesn't know it either.
The operating system simply has no mechanism of indicating
what encoding is used on a pipe.
So different processes may choose different encodings. Some
may produce UTF-16, others may produce CP-850, yet others
UTF-8, and so on. There really is no way to tell other than
reading the documentation *of the program you run*, and,
failing that, reading the source code of the program you run.
On Windows, many programs will indeed use one of the
two system code pages, or UTF-16. It's true that
UTF-16 can be quite reliably detected by looking at the
first two bytes. However, the two system code pages
(OEM CP and ANSI CP) are not so easy to tell apart.
Martin#1; Sat, 26 Apr 2008 22:52:00 GMT
- Thank you.
Michel Claveau#2; Sat, 26 Apr 2008 22:53:00 GMT
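To illustrate the reply in comment #1, here is a minimal sketch, assuming Python 3 and a hypothetical helper named `decode_child_output`. It guesses UTF-16 from the first two bytes (a byte order mark, or one zero byte next to a non-zero byte) and otherwise decodes with a code page the caller supplies. The default `cp850` is an assumption; the right value has to come from the documentation of the program being run.

```python
import codecs


def decode_child_output(data, fallback="cp850"):
    """Decode bytes read from a child's stdout pipe.

    UTF-16 is guessed from the first two bytes; anything else is decoded
    with `fallback`, which the caller must choose, because the pipe itself
    carries no encoding information.
    """
    if data[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
        return data.decode("utf-16", errors="replace")
    if len(data) >= 2 and data[0] != 0 and data[1] == 0:
        return data.decode("utf-16-le", errors="replace")
    if len(data) >= 2 and data[0] == 0 and data[1] != 0:
        return data.decode("utf-16-be", errors="replace")
    return data.decode(fallback, errors="replace")


# The same bytes read differently under an OEM and an ANSI code page,
# which is why those two cannot be told apart by inspection alone.
sample = b"\x82t\x82"
as_oem = decode_child_output(sample, fallback="cp850")
as_ansi = decode_child_output(sample, fallback="cp1252")
```

Here `as_oem` and `as_ansi` both decode without error but give different text, so nothing in the bytes themselves says which code page was intended.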