Improve hash() builtin docstring with caveats.#125229
Improve hash() builtin docstring with caveats.#125229gpshead wants to merge 3 commits intopython:mainfrom
hash() builtin docstring with caveats.#125229Conversation
Mention its return type and that the value can be expected to change between processes (hash randomization). Why? The `hash` builtin gets reached for and used by a lot of people whether it is the right tool or not. IDEs surface docstrings and people use pydoc and `help(hash)`. There are more possible caveats we could go into here such as classes implementing their own dunder methods like `__eq__` or `__hash__` naturally being able to violate the constraint stated in this docstring. But _that_ feels like too much for a beginner friendly docstring.
sobolevn
left a comment
There was a problem hiding this comment.
Should we explicitly say that hash should not be used for hashing in crypto operations?
This is something I've noticed several times.
However, sometimes adding a note to not do something actually make people do this even more 🙈
ncoghlan
left a comment
There was a problem hiding this comment.
General idea seems sound, wordsmithing is tricky :(
|
I like the "within this process" language, and while I can see that it is also covered by something in another paragraph... I also think this is a thing that is empirically very easy for people to miss or get confused by, and I think the extra couple of words to make sure that someone who reads one paragraph and stops there because they think they know what the function does are probably a very good investment in reduced user pain. |
|
This PR is stale because it has been open for 30 days with no activity. |
Merges current upstream main into the branch (the PR had gone stale since Oct 2024) and rewords the hash() builtin docstring to incorporate review comments from @JelleZijlstra, @ncoghlan, @nedbat, and others: - Drop "within this process" from the first sentence (Jelle, ncoghlan): the process caveat is still stated plainly in the body, so the first-line qualifier is redundant. - Drop the "dict and set hash tables" sentence (Jelle, ncoghlan): implementation details of built-in containers don't belong in the user- facing docstring for hash(). - Add the "Not all objects are hashable; ... raises TypeError" note (Jelle's suggestion, phrasing informed by ncoghlan's rewrite). - Keep the original "equal => same hash, but not the reverse" formulation, which is what the Python data model documents. - Do not add a cryptographic warning (sobolevn raised the question but also flagged that such warnings can encourage misuse). https://claude.ai/code/session_01UumhMLqHB8fqUuNWsC8RCe
Mention its return type and that the value can be expected to change between processes (hash randomization).
Why? The
hashbuiltin gets reached for and used by a lot of people whether it is the right tool or not. IDEs surface docstrings and people use pydoc andhelp(hash).There are more possible caveats we could go into here such as classes implementing their own dunder methods like
__eq__or__hash__naturally being able to violate the constraint stated in this docstring. But that feels like too much for a beginner friendly docstring.