<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jun 9, 2016, at 7:25 AM, Larry Hastings <<a href="mailto:larry@hastings.org" class="">larry@hastings.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); float: none; display: inline !important;" class="">A problem has surfaced just this week in 3.5.1. Obviously this is a good time to fix it for 3.5.2. But there's a big argument over what is "broken" and what is an appropriate "fix".</span></div></blockquote></div><br class=""><div class=""><br class="webkit-block-placeholder"></div><div class=""><div class="">Couple clarifications:</div><div class=""><br class=""></div><div class="">random.py</div><div class="">---------</div><div class=""><br class=""></div><div class="">In the abstract it doesn't hurt to seed MT with a CSPRNG, it just doesn't</div><div class="">provide much (if any) benefit and in this case it is hurting us because of the</div><div class="">cost on import (which will exist on other platforms as well no matter what we</div><div class="">do here for Linux). 
There are a couple of solutions to this problem:</div><div class=""><br class=""></div><div class="">* Use getrandom(GRND_NONBLOCK) for random.Random since it doesn't matter if we</div><div class=""> get cryptographically secure random numbers or not.</div><div class=""><br class=""></div><div class="">* Switch it to use something other than a CSPRNG by default since it doesn't</div><div class=""> need that.</div><div class=""><br class=""></div><div class="">* Instead of seeding itself from os.urandom on import, have it lazily do that</div><div class=""> the first time one of the random.rand* functions is called.</div><div class=""><br class=""></div><div class="">* Do nothing, and say that ``import random`` relies on having the kernel's</div><div class=""> urandom pool initialized.</div><div class=""><br class=""></div><div class="">Between these options, I have a slight preference for switching it to use a</div><div class="">non-CSPRNG, but I really don't care that much which of these options we pick. Using</div><div class="">random.Random is not secure and none of the above options meaningfully change</div><div class="">the security posture of something that accidentally uses it.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">SipHash and the Interpreter Startup</div><div class="">-----------------------------------</div><div class=""><br class=""></div><div class="">I have complicated thoughts on what SipHash should do. For something like a</div><div class="">Django process, we never want it to be initialized with "bad" entropy, however</div><div class="">reading straight from /dev/urandom, or getrandom(GRND_NONBLOCK) means that we</div><div class="">might get that if we start the process early enough in the boot process. 
The</div><div class="">rub here is that I cannot think of a situation where by the time you're at the</div><div class="">point you're starting up something like Django, you're even remotely likely to</div><div class="">not have an initialized random pool. The other side of this issue is that we</div><div class="">have Python scripts which do not need a secure random being passed to SipHash</div><div class="">running early enough in the boot process with systemd that we need to be able</div><div class="">to have SipHash initialization not block waiting for /dev/urandom.</div><div class=""><br class=""></div><div class="">So I'm torn between the "Practicality beats Purity" mindset, which says we</div><div class="">should just let SipHash seed itself with whatever quality of random from the</div><div class="">urandom pool is currently available and the "Special cases aren't special</div><div class="">enough to break the rules" mindset which says that we should just make it</div><div class="">easier for scripts in this edge case to declare they don't care about hash</div><div class="">randomization to remove the need for it (in other words, a CLI flag that</div><div class="">matches PYTHONHASHSEED in functionality). An additional wrinkle in the mix is</div><div class="">that we cannot get non-blocking random on many (any?) modern OS besides Linux,</div><div class="">so we're going to run into this same problem if, say, FreeBSD decides to put a</div><div class="">Python script early enough in the boot sequence.</div><div class=""><br class=""></div><div class="">In the end, both of these choices make me happy and unhappy in different ways</div><div class="">but I would lean towards adding a CLI flag for the special case and letting the</div><div class="">systemd script that caused this problem invoke Python with that flag. 
I</div><div class="">think this because:</div><div class=""><br class=""></div><div class="">* It leaves the interpreter secure by default, but provides the</div><div class=""> relevant knobs to turn off this default in cases where a user doesn't need</div><div class=""> or want it.</div><div class="">* It solves the problem in a cross-platform way that doesn't rely on the</div><div class=""> nuances of the CSPRNG interface on one particular supported platform.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">os.urandom</div><div class="">----------</div><div class=""><br class=""></div><div class="">There have been a lot of proposals thrown around, and people pointing to</div><div class="">different sections of the documentation to justify different opinions. This is</div><div class="">easily the most contentious question we have here.</div><div class=""><br class=""></div><div class="">It is my belief that reading from urandom is the right thing to do for</div><div class="">generating cryptographically secure random numbers. This is a viewpoint held</div><div class="">by every major security expert and cryptographer that I'm aware of. 
Most (all?)</div><div class="">major platforms besides Linux do not allow reading from their equivalent of</div><div class="">/dev/urandom until it has been successfully initialized and it is widely held</div><div class="">by all security experts and cryptographers that I'm aware of that this property</div><div class="">is a good one, and the Linux behavior of /dev/urandom is a wart/footgun but</div><div class="">that prior to getrandom() there simply wasn't a better option on Linux.</div><div class=""><br class=""></div><div class="">With that in mind, I think that we should, to the best of our ability given the</div><div class="">platform we're on, ensure that os.urandom does not return bytes that the OS</div><div class="">does not think are cryptographically secure.</div><div class=""><br class=""></div><div class="">In practice this means that os.urandom should do one of two things in the very</div><div class="">early boot process on Linux:</div><div class=""><br class=""></div><div class="">* Block waiting on the kernel to initialize the urandom pool, and then return</div><div class=""> the now-secure random bytes given to us.</div><div class="">* Raise an exception saying that the pool has not been initialized and thus</div><div class=""> os.urandom is not ready yet.</div><div class=""><br class=""></div><div class="">The key point in both of these options is that os.urandom never [1] returns</div><div class="">bytes prior to the OS believing that it can give us cryptographically secure</div><div class="">random bytes.</div><div class=""><br class=""></div><div class="">I believe I have a preference for blocking while waiting for the kernel to initialize</div><div class="">the urandom pool, because that makes Linux behave similarly to the other</div><div class="">platforms that I'm aware of.</div><div class=""><br class=""></div><div class="">I do not believe that adding additional public functions, as some other people</div><div class="">have proposed, is a good 
option. I think they muddy the waters and I think</div><div class="">that they force us to try and convince people that "no really, yes everyone</div><div class="">says you should use urandom, but you actually want getrandom". Particularly</div><div class="">since the outcome of these two functions would be exactly the same in all but</div><div class="">a very narrow edge case on Linux.</div><div class=""><br class=""></div><div class="">Larry has suggested that os.py should only ever be a thin shell around OS</div><div class="">provided functionality and thus os.urandom should simply mimic whatever the</div><div class="">behavior of /dev/urandom is on that OS. For os.urandom in particular this is</div><div class="">already not the case since it calls CryptGenRandom on Windows, but putting that</div><div class="">aside since that's a Windows vs POSIX difference, we're not talking about</div><div class="">adding a great amount of functionality around something provided by the OS.</div><div class="">We're only talking about using a different interface to access the same</div><div class="">underlying functionality. In this case, an interface that better suits the</div><div class="">actual use of os.urandom in the wild and provides better properties all around.</div><div class=""><br class=""></div><div class="">He's also pointed out that the documentation does not guarantee that the result</div><div class="">of os.urandom will be cryptographically strong in the following quote:</div><div class=""><br class=""></div><div class=""> This function returns random bytes from an OS-specific randomness source.</div><div class=""> The returned data should be unpredictable enough for cryptographic</div><div class=""> applications, though its exact quality depends on the OS implementation. 
</div><div class=""><br class=""></div><div class="">My read of this quote is that this is a hedge against operating systems that</div><div class="">have implemented their urandom pool in such a way that it does not return</div><div class="">cryptographically secure random numbers, so that you don't come back and yell at</div><div class="">Python for it. In other words, it's a hedge against /dev/urandom being</div><div class=""><a href="https://xkcd.com/221/" class="">https://xkcd.com/221/</a>. I do not think this documentation excuses us for using</div><div class="">a weaker interface to the OS-specific randomness source simply because its</div><div class="">name happens to match the name of the function. Particularly since earlier on</div><div class="">in that documentation it states:</div><div class=""><br class=""></div><div class=""> Return a string of n random bytes suitable for cryptographic use.</div><div class=""><br class=""></div><div class="">and the Python standard library, and the entire ecosystem as I know it, as well</div><div class="">as all security experts and crypto experts believe you should treat it as such.</div><div class="">This is largely because if your urandom pool is implemented in such a way that, in</div><div class="">the general case, it provides insecure random values, then you're beyond the</div><div class="">pale and there's nothing that Python, or anyone but your OS vendor, can do to</div><div class="">help you.</div><div class=""><br class=""></div><div class="">Furthermore, I think that the behavior I want (that os.urandom is secure by</div><div class="">default to the best of our abilities) is trickier to get right, and requires</div><div class="">interfacing with C code. 
However, getting the exact semantics of /dev/urandom</div><div class="">on Linux is trivial to do with a single line of Python code:</div><div class=""><br class=""></div><div class=""> def urandom(amt): return open("/dev/urandom", "rb").read(amt)</div><div class=""><br class=""></div><div class="">So if you're someone who is depending on the Linux urandom behavior in an edge</div><div class="">case that almost nobody is going to hit, you can trivially get the old behavior</div><div class="">back. Even better, if you're someone depending on this, you're going to get an</div><div class="">*obvious* failure rather than silently getting insecure bytes. On top of all of</div><div class="">that, this only matters in a small edge case, most likely to only ever be hit</div><div class="">by OS vendors themselves, who are in the best position to make informed</div><div class="">decisions about how to work around the fact that the urandom entropy pool hasn't</div><div class="">yet been initialized rather than expecting every other user to have to try</div><div class="">and ensure that they don't start their Python script too early.</div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">[1] To the best of our ability, given the interfaces and implementation</div><div class=""> provided to us by the OS.</div></div><div class=""><br class=""></div><div class=""><br class=""></div><div class="">
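<div class="">As a rough sketch of the two early-boot behaviors described above for os.urandom (block until the pool is ready, or raise), assuming Python 3.6 or later, where os.getrandom is exposed on Linux. The name urandom_strict and both fallbacks to os.urandom are illustrative assumptions, not a proposed API:</div>
<pre class="">
```python
import os


def urandom_strict(n, block=True):
    """Return n random bytes without reading an uninitialized pool.

    Hypothetical helper for illustration; not part of the os module.
    """
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: on platforms without getrandom(), os.urandom
        # already refuses to hand out bytes before initialization.
        return os.urandom(n)

    # flags=0 blocks until the pool is initialized; GRND_NONBLOCK
    # makes the call fail immediately instead of blocking.
    flags = 0 if block else os.GRND_NONBLOCK
    chunks = []
    remaining = n
    while remaining:
        try:
            # getrandom() may return fewer bytes than requested.
            chunk = getrandom(remaining, flags)
        except BlockingIOError:
            raise RuntimeError("kernel random pool is not initialized yet") from None
        except OSError:
            # Assumption: e.g. a kernel without the syscall; fall back.
            return os.urandom(n)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```
</pre>
<div class="">Calling urandom_strict(32) corresponds to the blocking option, and urandom_strict(32, block=False) to the option that raises instead.</div>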
<div style="color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class="">--<br class="">Donald Stufft<br class=""></div><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""></div><br class="Apple-interchange-newline">
</div>
<br class=""></body></html>