Crash in _interpreters.create() when config string has an unpaired surrogate #148798

@rajkripal

Crash Report

Crash description

_interpreters.create() segfaults when the config object exposes a string attribute (e.g. gil) containing an unpaired surrogate. The C helper _config_dict_copy_str calls PyUnicode_AsUTF8() and passes the result straight to strncpy() without a NULL check. When the string can't be UTF-8 encoded, PyUnicode_AsUTF8() returns NULL and sets UnicodeEncodeError, but the NULL then reaches strncpy and the interpreter crashes.
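The failing condition can be checked from Python without calling _interpreters at all. This is a minimal sketch: the helper name utf8_encodable is hypothetical, and it assumes a strict UTF-8 encode in Python rejects the same strings that make PyUnicode_AsUTF8() return NULL.

```python
def utf8_encodable(s: str) -> bool:
    """Return True if s can be encoded as strict UTF-8."""
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates (U+D800..U+DFFF) are not valid in strict UTF-8.
        return False
    return True


print(utf8_encodable("own"))        # True
print(utf8_encodable("own\udc80"))  # False
```

A string for which this returns False is the kind of input that should surface as UnicodeEncodeError rather than a crash.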

Lone surrogates are reachable from pure Python ('\udc80', chr(0xDC80)), and also show up naturally via surrogateescape — e.g. filenames, env vars, or argv with non-UTF-8 bytes that get forwarded into a config dict.
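To illustrate the surrogateescape path, the sketch below decodes a byte string that is not valid UTF-8. The sample bytes are made up for the example; real cases would come from filenames, environment variables, or argv.

```python
# A byte sequence that is not valid UTF-8 (0x80 is a bare continuation byte).
raw = b"own\x80"

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
decoded = raw.decode("utf-8", errors="surrogateescape")
print(ascii(decoded))  # 'own\udc80'

# The mapping round-trips when the same error handler is used for encoding...
assert decoded.encode("utf-8", errors="surrogateescape") == raw

# ...but a strict encode, which is what a plain UTF-8 conversion does, fails.
try:
    decoded.encode("utf-8")
except UnicodeEncodeError as exc:
    print(type(exc).__name__)  # UnicodeEncodeError
```

The decoded value is the same string used for gil in the reproducer, so no hand-written escape is needed to end up with one.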

Related precedent: gh-126221 (same module, same crash class from pure Python input).

Reproducer

import _interpreters

class BadConfig:
    use_main_obmalloc = False
    allow_fork = False
    allow_exec = False
    allow_threads = False
    allow_daemon_threads = False
    check_multi_interp_extensions = False
    own_gil = True
    gil = 'own\udc80'

_interpreters.create(BadConfig())

Expected: UnicodeEncodeError.
Actual: segfault (exit 139).

Reproduced on 3.14.3 and current main.
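Since the reproducer takes down the process that runs it, it can help to run it in a child interpreter and inspect the exit status. This is a rough sketch, not part of the report: run_isolated and describe are hypothetical helper names, and the signal reporting assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_isolated(code: str) -> int:
    """Run code in a fresh interpreter and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with {returncode}"


print(describe(run_isolated("raise SystemExit(3)")))  # exited with 3
```

Passing the reproducer source as code would let a test distinguish a clean exception (a non-negative exit status) from a crash. On POSIX a segfault shows up here as -11, whereas the shell reports the same event as 139 (128 + 11).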

Error messages

zsh: segmentation fault  python3 repro.py

Linked PRs

(fix ready, will link once this is filed)

Labels: interpreter-core, topic-subinterpreters, type-crash
