You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is a follow-on from #289 and also encompasses some or all of #241, #274, #262, #332, #242, #321, and #338. My goal is to make this general story better, ASAP, while leaving remaining work clearly defined (regardless of who ends up taking a swing at it).
I need to doublecheck some of those linked issues that I've not spent a lot of time on, but the bulk of discussion is in #289 and #274, much of it driven by @pfmoore who (as a Python 3 on Windows user) serves as a very competent & useful counterweight to myself (a default-Python-2 on Unix developer). Any solution that works for both of us is likely one that serves most users.
The basics that fall out of what's been said there so far:
Where & how bytes/strings are decoded/encoded, needs to be better defined/documented/contracted;
Encoding/decoding should be pushed as far out as possible, to the "edges" of the code, to limit how often it happens & becomes an issue;
But also centralized further - the code handling it needs to be clearly labeled & no faffing about should happen outside that one area.
The default behavior should encompass as many use cases as is reasonable to do, and should degrade as gracefully as it can otherwise (re: exceptions, vs replacements, vs ???);
But users should also be given the ability to work around that default cleanly & fully whenever it doesn't function for their use case.
Ideally, we'll balance these two things well instead of using "the user can twiddle the knobs we provided" as an excuse to let the default behavior suffer. But this is hard and time is short, so...
TODO
Triple check existing locations where we explicitly or implicitly encode/decode, and move them all somewhere centralized & obvious (without otherwise altering them).
Right now the only explicit mentions of (encode|decode)\( are a bare handful in runner.py, so there's likely others that are implicit instead.
Not sure what the "most obvious" place is, but given most of this is directly tied to run() and friends, it may still just live in the Runner class or runner.py generally.
Identify & implement additional tests (integration or regular) for use cases not represented by the funky_characters_in_stdout integration tests
Those tests should have caught some of the basic e.g. cat tree -> kaboom bugs that snuck in during 0.11.x, but did not.
I identified the basic reason why & have updated it so they fail again, in a branch.
But they only represent a single slice of the problem - look at the discussion in Fixed a number of encoding issues (in run context) #274, as well as the other top level issues above, and see what other tests can be added that currently fail.
Ideally at least some of these would go in the regular, not integration, suite.
If not implied in above: probably makes sense to add more encoding config toggles (maybe up to "one for each stream") so users can override them granularly; the single encoding is nice and all, but won't work in cases where e.g. one's invoke-invoking terminal is Encoding A but the environment the subprocess is operating in is using Encoding B.
Handle, or surface into this or other tickets, any TODOs added in the branch left undone
See if @pfmoore has time for immediate feedback to those changes (sadly I wasn't able to give him a thumbs up for his offer of making more PRs on his end, back in Feb) just in case.
Re: comments below, update read_proc_output so we always expect bytes:
Enhance docstring(s) to make it clear that any file-like objects given explicitly, need to yield bytes
Remove the "if" statement
Update our test suites to use BytesIO instead of StringIO (& any related string literal changes, or maybe a wrapper if lazy).
Update changelog entry for any fully-closed tickets & their authors
Background
This is a follow-on from #289 and also encompasses some or all of #241, #274, #262, #332, #242, #321, and #338. My goal is to make this general story better, ASAP, while leaving remaining work clearly defined (regardless of who ends up taking a swing at it).
I need to doublecheck some of those linked issues that I've not spent a lot of time on, but the bulk of discussion is in #289 and #274, much of it driven by @pfmoore who (as a Python 3 on Windows user) serves as a very competent & useful counterweight to myself (a default-Python-2 on Unix developer). Any solution that works for both of us is likely one that serves most users.
The basics that fall out of what's been said there so far:
TODO
(encode|decode)\(are a bare handful inrunner.py, so there's likely others that are implicit instead.run()and friends, it may still just live in theRunnerclass orrunner.pygenerally.funky_characters_in_stdoutintegration testscat tree -> kaboombugs that snuck in during 0.11.x, but did not.runcontext) #274, as well as the other top level issues above, and see what other tests can be added that currently fail.runcontext) #274 or with something newencodingis nice and all, but won't work in cases where e.g. one's invoke-invoking terminal is Encoding A but the environment the subprocess is operating in is using Encoding B.read_proc_outputso we always expect bytes:BytesIOinstead ofStringIO(& any related string literal changes, or maybe a wrapper if lazy).