As in the earlier mktemp thread, I would request a threat model: which use cases are supposed to be protected (in this case, by reporting rather than preventing) and from which threats. It should appear in the discussion and, eventually, in the PEP. Without one, any claims about whether something would be an effective security measure are pointless, because you would never know whether you had accounted for everything, and would not even have a definition of that "everything".

On 29.03.2019 1:35, Steve Dower wrote:
> Hi all > > Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into Python 3.8. Here's the current text for review and comment before I > submit to the Steering Council. > > The formatted text is at https://www.python.org/dev/peps/pep-0578/ (update just pushed, so give it an hour or so, but it's fundamentally > the same as what's there) > > No Discourse post, because we don't have a python-dev equivalent there yet, so please reply here for this one. > > Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and my backport to 3.7 > (https://github.com/zooba/cpython/tree/pep-578-3.7/) is already getting some real use (though this will not be added to 3.7, unless people > *really* want it, so the backport is just for reference). > > Cheers, > Steve > > ===== > > PEP: 578 > Title: Python Runtime Audit Hooks > Version: $Revision$ > Last-Modified: $Date$ > Author: Steve Dower <steve.dower at python.org> > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 16-Jun-2018 > Python-Version: 3.8 > Post-History: > > Abstract > ======== > > This PEP describes additions to the Python API and specific behaviors > for the CPython implementation that make actions taken by the Python > runtime visible to auditing tools. Visibility into these actions > provides opportunities for test frameworks, logging frameworks, and > security tools to monitor and optionally limit actions taken by the > runtime. 
> > This PEP proposes adding two APIs to provide insights into a running > Python application: one for arbitrary events, and another specific to > the module import system. The APIs are intended to be available in all > Python implementations, though the specific messages and values used > are unspecified here to allow implementations the freedom to determine > how best to provide information to their users. Some examples likely > to be used in CPython are provided for explanatory purposes. > > See PEP 551 for discussion and recommendations on enhancing the > security of a Python runtime making use of these auditing APIs. > > Background > ========== > > Python provides access to a wide range of low-level functionality on > many common operating systems. While this is incredibly useful for > "write-once, run-anywhere" scripting, it also makes monitoring of > software written in Python difficult. Because Python uses native system > APIs directly, existing monitoring tools either suffer from limited > context or auditing bypass. > > Limited context occurs when system monitoring can report that an > action occurred, but cannot explain the sequence of events leading to > it. For example, network monitoring at the OS level may be able to > report "listening started on port 5678", but may not be able to > provide the process ID, command line, parent process, or the local > state in the program at the point that triggered the action. Firewall > controls to prevent such an action are similarly limited, typically > to process names or some global state such as the current user, and > in any case rarely provide a useful log file correlated with other > application messages. > > Auditing bypass can occur when the typical system tool used for an > action would ordinarily report its use, but accessing the APIs via > Python does not trigger this. 
For example, invoking "curl" to make HTTP > requests may be specifically monitored in an audited system, but > Python's "urlretrieve" function is not. > > Within a long-running Python application, particularly one that > processes user-provided information such as a web app, there is a risk > of unexpected behavior. This may be due to bugs in the code, or > deliberately induced by a malicious user. In both cases, normal > application logging may be bypassed resulting in no indication that > anything out of the ordinary has occurred. > > Additionally, and somewhat unique to Python, it is very easy to affect > the code that is run in an application by manipulating either the > import system's search path or placing files earlier on the path than > intended. This is often seen when developers create a script with the > same name as the module they intend to use - for example, a > ``random.py`` file that attempts to import the standard library > ``random`` module. > > This is not sandboxing, as this proposal does not attempt to prevent > malicious behavior (though it enables some new options to do so). > See the `Why Not A Sandbox`_ section below for further discussion. > > Overview of Changes > =================== > > The aim of these changes is to enable both application developers and > system administrators to integrate Python into their existing > monitoring systems without dictating how those systems look or behave. > > We propose two API changes to enable this: an Audit Hook and Verified > Open Hook. Both are available from Python and native code, allowing > applications and frameworks written in pure Python code to take > advantage of the extra messages, while also allowing embedders or > system administrators to deploy builds of Python where auditing is > always enabled. > > Only CPython is bound to provide the native APIs as described here. 
> Other implementations should provide the pure Python APIs, and > may provide native versions as appropriate for their underlying > runtimes. Auditing events are likewise considered implementation > specific, but are bound by normal feature compatibility guarantees. > > Audit Hook > ---------- > > In order to observe actions taken by the runtime (on behalf of the > caller), an API is required to raise messages from within certain > operations. These operations are typically deep within the Python > runtime or standard library, such as dynamic code compilation, module > imports, DNS resolution, or use of certain modules such as ``ctypes``. > > The following new C APIs allow embedders and CPython implementors to > send and receive audit hook messages:: > > # Add an auditing hook > typedef int (*hook_func)(const char *event, PyObject *args, > void *userData); > int PySys_AddAuditHook(hook_func hook, void *userData); > > # Raise an event with all auditing hooks > int PySys_Audit(const char *event, PyObject *args); > > # Internal API used during Py_Finalize() - not publicly accessible > void _Py_ClearAuditHooks(void); > > The new Python APIs for receiving and raising audit hooks are:: > > # Add an auditing hook > sys.addaudithook(hook: Callable[[str, tuple]]) > > # Raise an event with all auditing hooks > sys.audit(str, *args) > > > Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time, > including before ``Py_Initialize()``, or by calling > ``sys.addaudithook()`` from Python code. Hooks cannot be removed or > replaced. > > When events of interest are occurring, code can either call > ``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The > string argument is the name of the event, and the tuple contains > arguments. A given event name should have a fixed schema for arguments, > which should be considered a public API (for each x.y version release), > and thus should only change between feature releases with updated > documentation. 
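The Python-level half of the API quoted above can be tried out with a minimal sketch like the following. It assumes an interpreter that provides ``sys.addaudithook()`` and ``sys.audit()`` as the PEP proposes (CPython 3.8 and later do). The event name ``example.greet`` and the ``recorded`` list are hypothetical, chosen only for illustration.

```python
import sys

# Collected (event, args) pairs; a stand-in for a real log sink.
recorded = []


def record_hook(event, args):
    # The hook receives the event name and a tuple of arguments.
    # Only our hypothetical event is recorded, so interpreter-raised
    # events (import, open, compile, ...) do not flood the list.
    if event == "example.greet":
        recorded.append((event, args))


# Hooks cannot be removed once added, so this stays active
# for the rest of the process.
sys.addaudithook(record_hook)

# Raise an event: the positional arguments arrive in the hook as a tuple.
sys.audit("example.greet", "world", 42)

print(recorded)
```

Filtering on the event name inside the hook matters in practice, because every registered hook sees every event raised anywhere in the process.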
> > For maximum compatibility, events using the same name as an event in > the reference interpreter CPython should make every attempt to use > compatible arguments. Including the name or an abbreviation of the > implementation in implementation-specific event names will also help > prevent collisions. For example, a ``pypy.jit_invoked`` event is clearly > distinguished from an ``ipy.jit_invoked`` event. > > When an event is audited, each hook is called in the order it was added > with the event name and tuple. If any hook returns with an exception > set, later hooks are ignored and *in general* the Python runtime should > terminate. This is intentional to allow hook implementations to decide > how to respond to any particular event. The typical responses will be to > log the event, abort the operation with an exception, or to immediately > terminate the process with an operating system exit call. > > When an event is audited but no hooks have been set, the ``audit()`` > function should impose minimal overhead. Ideally, each argument is a > reference to existing data rather than a value calculated just for the > auditing call. > > As hooks may be Python objects, they need to be freed during > ``Py_Finalize()``. To do this, we add an internal API > ``_Py_ClearAuditHooks()`` that releases any Python hooks and any > memory held. This is an internal function with no public export, and > we recommend it raise its own audit event for all current hooks to > ensure that unexpected calls are observed. > > Below in `Suggested Audit Hook Locations`_, we recommend some important > operations that should raise audit events. > > Python implementations should document which operations will raise > audit events, along with the event schema. It is intentional that > ``sys.addaudithook(print)`` be a trivial way to display all messages. 
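The "abort the operation with an exception" response described in the quoted text above might look like the sketch below. It assumes the ``sys.audit()`` behaviour found in CPython 3.8 and later, where an exception raised by a hook propagates to the caller of ``sys.audit()``. The ``Blocked`` exception, the ``example.forbidden`` event, and the ``guarded()`` function are hypothetical names introduced for illustration.

```python
import sys


class Blocked(RuntimeError):
    """Hypothetical exception used by the hook to veto an operation."""


def deny_hook(event, args):
    # Veto only the hypothetical event; everything else passes through.
    if event == "example.forbidden":
        raise Blocked(f"operation denied: {args!r}")


sys.addaudithook(deny_hook)


def guarded():
    # The audit call comes first, so a raising hook aborts the
    # operation before any of the real work happens.
    sys.audit("example.forbidden", "payload")
    return "ran"


try:
    result = guarded()
except Blocked:
    result = "blocked"

print(result)
```

Whether aborting, logging, or terminating the process is the right response depends on the threat model the hook is written for; this sketch shows only the mechanism.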
> > Verified Open Hook > ------------------ > > Most operating systems have a mechanism to distinguish between files > that can be executed and those that can not. For example, this may be an > execute bit in the permissions field, a verified hash of the file > contents to detect potential code tampering, or file system path > restrictions. These are an important security mechanism for preventing > execution of data or code that is not approved for a given environment. > Currently, Python has no way to integrate with these when launching > scripts or importing modules. > > The new public C API for the verified open hook is:: > > # Set the handler > typedef PyObject *(*hook_func)(PyObject *path, void *userData) > int PyImport_SetOpenForImportHook(hook_func handler, void *userData) > > # Open a file using the handler > PyObject *PyImport_OpenForImport(const char *path) > > The new public Python API for the verified open hook is:: > > # Open a file using the handler > importlib.util.open_for_import(path : str) -> io.IOBase > > > The ``importlib.util.open_for_import()`` function is a drop-in > replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is > to open a file for raw, binary access. To change the behaviour a new > handler should be set. Handler functions only accept ``str`` arguments. > The C API ``PyImport_OpenForImport`` function assumes UTF-8 encoding. > > A custom handler may be set by calling ``PyImport_SetOpenForImportHook()`` > from C at any time, including before ``Py_Initialize()``. However, if a > hook has already been set then the call will fail. When > ``open_for_import()`` is called with a hook set, the hook will be passed > the path and its return value will be returned directly. The returned > object should be an open file-like object that supports reading raw > bytes. This is explicitly intended to allow a ``BytesIO`` instance if > the open handler has already read the entire file into memory. 
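To make the handler contract in the quoted text above concrete, here is a standalone sketch of what such a handler could do: read the whole file, check it against an allow-list of SHA-256 digests, and return a ``BytesIO``. This is only an illustration of the idea. The function ``verified_open`` and its ``allowed_sha256`` parameter are hypothetical, the function is not registered with any import machinery, and it uses the built-in ``open()`` rather than the ``_io.open()`` the PEP mentions for real hooks.

```python
import hashlib
import io
import os
import tempfile


def verified_open(path, allowed_sha256):
    """Hypothetical handler: return file contents only if the hash is approved."""
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest not in allowed_sha256:
        # The quoted text leaves the exception type up to the hook.
        raise PermissionError(f"{path!r} is not approved for import")
    # The file is already in memory, so hand back an in-memory buffer.
    return io.BytesIO(data)


# Usage: approve one file by hash, then try again with an empty allow-list.
with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "approved.py")
    with open(module_path, "wb") as f:
        f.write(b"VALUE = 1\n")

    allowed = {hashlib.sha256(b"VALUE = 1\n").hexdigest()}
    source = verified_open(module_path, allowed).read()

    try:
        verified_open(module_path, set())
        rejected = False
    except PermissionError:
        rejected = True

print(source, rejected)
```

Reading the file once and returning the bytes that were actually hashed avoids a gap between verification and use, which is presumably why the quoted text explicitly allows a ``BytesIO`` result.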
> > Note that these hooks can import and call the ``_io.open()`` function on > CPython without triggering themselves. They can also use ``_io.BytesIO`` > to return a compatible result using an in-memory buffer. > > If the hook determines that the file should not be loaded, it should > raise an exception of its choice, as well as performing any other > logging. > > All import and execution functionality involving code from a file will > be changed to use ``open_for_import()`` unconditionally. It is important > to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go > through this function - an audit hook that includes the code from these > calls is the best opportunity to validate code that is read from the > file. Given the current decoupling between import and execution in > Python, most imported code will go through both ``open_for_import()`` > and the log hook for ``compile``, and so care should be taken to avoid > repeating verification steps. > > There is no Python API provided for changing the open hook. To modify > import behavior from Python code, use the existing functionality > provided by ``importlib``. > > API Availability > ---------------- > > While all the functions added here are considered public and stable API, > the behavior of the functions is implementation specific. Most > descriptions here refer to the CPython implementation, and while other > implementations should provide the functions, there is no requirement > that they behave the same. > > For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but > may do nothing. This allows code to make calls to ``sys.audit()`` > without having to test for existence, but it should not assume that its > call will have any effect. (Including existence tests in > security-critical code allows another vector to bypass auditing, so it > is preferable that the function always exist.) 
> > ``importlib.util.open_for_import(path)`` should at a minimum always > return ``_io.open(path, 'rb')``. Code using the function should make no > further assumptions about what may occur, and implementations other than > CPython are not required to let developers override the behavior of this > function with a hook. > > Suggested Audit Hook Locations > ============================== > > The locations and parameters in calls to ``sys.audit()`` or > ``PySys_Audit()`` are to be determined by individual Python > implementations. This is to allow maximum freedom for implementations > to expose the operations that are most relevant to their platform, > and to avoid or ignore potentially expensive or noisy events. > > Table 1 acts as both suggestions of operations that should trigger > audit events on all implementations, and examples of event schemas. > > Table 2 provides further examples that are not required, but are > likely to be available in CPython. > > Refer to the documentation associated with your version of Python to > see which operations provide audit events. > > .. csv-table:: Table 1: Suggested Audit Hooks > :header: "API Function", "Event Name", "Arguments", "Rationale" > :widths: 2, 2, 3, 6 > > ``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new > audit hooks are being added. > " > ``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", " > Detects any attempt to set the ``open_for_import`` hook. > " > "``compile``, ``exec``, ``eval``, ``PyAst_CompileString``, > ``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", " > Detect dynamic code compilation, where ``code`` could be a string or > AST. Note that this will be called for regular imports of source > code, including those that were opened with ``open_for_import``. > " > "``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", " > Detect dynamic execution of code objects. 
This only occurs for > explicit calls, and is not raised for normal function invocation. > " > ``import``, ``import``, "``(module, filename, sys.path, > sys.meta_path, sys.path_hooks)``", "Detect when modules are > imported. This is raised before the module name is resolved to a > file. All arguments other than the module name may be ``None`` if > they are not used or available. > " > "``open``", ``open``, "``(path, mode, flags)``", "Detect when a file > is about to be opened. *path* and *mode* are the usual parameters to > ``open`` if available, while *flags* is provided instead of *mode* > in some cases. > " > ``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is > injecting trace functions. Because of the implementation, exceptions > raised from the hook will abort the operation, but will not be > raised in Python code. Note that ``threading.setprofile`` eventually > calls this function, so the event will be audited for each thread. > " > ``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is > injecting trace functions. Because of the implementation, exceptions > raised from the hook will abort the operation, but will not be > raised in Python code. Note that ``threading.settrace`` eventually > calls this function, so the event will be audited for each thread. > " > "``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``, > ``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``"," > ``object.__setattr__``","``(object, attr, value)``","Detect monkey > patching of types and objects. This event > is raised for the ``__class__`` attribute and any attribute on > ``type`` objects. > " > "``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object, > attr)``","Detect deletion of object attributes. This event is raised > for any attribute on ``type`` objects. > " > "``Unpickler.find_class``",``pickle.find_class``,"``(module_name, > global_name)``","Detect imports and global name lookup when > unpickling. > " > > > .. 
csv-table:: Table 2: Potential CPython Audit Hooks > :header: "API Function", "Event Name", "Arguments", "Rationale" > :widths: 2, 2, 3, 6 > > ``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies > hooks they are being cleaned up, mainly in case the event is > triggered unexpectedly. This event cannot be aborted. > " > ``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", " > Detect dynamic creation of code objects. This only occurs for > direct instantiation, and is not raised for normal compilation. > " > ``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect > dynamic creation of function objects. This only occurs for direct > instantiation, and is not raised for normal compilation. > " > "``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, " > ``(module_or_path,)``", "Detect when native modules are used. > " > ``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", " > Collect information about specific symbols retrieved from native > modules. > " > ``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect > when code is accessing arbitrary memory using ``ctypes``. > " > "``new_mmap_object``",``mmap.__new__``,"``(fileno, map_size, access, > offset)``", "Detects creation of mmap objects. On POSIX, access may > have been calculated from the ``prot`` and ``flags`` arguments. > " > ``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect > when code is accessing frames directly. > " > ``sys._current_frames``, ``sys._current_frames``, "", "Detect when > code is accessing frames directly. > " > "``socket.bind``, ``socket.connect``, ``socket.connect_ex``, > ``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``, > ``socket.sendto``", ``socket.address``, "``(address,)``", "Detect > access to network resources. The address is unmodified from the > original call. 
> " > "``member_get``, ``func_get_code``, ``func_get_[kw]defaults`` > ",``object.__getattr__``,"``(object, attr)``","Detect access to > restricted attributes. This event is raised for any built-in > members that are marked as restricted, and members that may allow > bypassing imports. > " > "``urllib.urlopen``",``urllib.Request``,"``(url, data, headers, > method)``", "Detects URL requests. > " > > Performance Impact > ================== > > The important performance impact is the case where events are being > raised but there are no hooks attached. This is the unavoidable case - > once a developer has added audit hooks they have explicitly chosen to > trade performance for functionality. Performance impact with hooks added > is not of interest here, since this is opt-in functionality. > > Analysis using the Python Performance Benchmark Suite [1]_ shows no > significant impact, with the vast majority of benchmarks showing > between 1.05x faster to 1.05x slower. > > In our opinion, the performance impact of the set of auditing points > described in this PEP is negligible. > > Rejected Ideas > ============== > > Separate module for audit hooks > ------------------------------- > > The proposal is to add a new module for audit hooks, hypothetically > ``audit``. This would separate the API and implementation from the > ``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and > ``PyAudit_Audit`` rather than the current variations. > > Any such module would need to be a built-in module that is guaranteed to > always be present. The nature of these hooks is that they must be > callable without condition, as any conditional imports or calls provide > opportunities to intercept and suppress or modify events. > > Given it is one of the most core modules, the ``sys`` module is somewhat > protected against module shadowing attacks. 
Replacing ``sys`` with a > sufficiently functional module that the application can still run is a > much more complicated task than replacing a module with only one > function of interest. An attacker that has the ability to shadow the > ``sys`` module is already capable of running arbitrary code from files, > whereas an ``audit`` module could be replaced with a single line in a > ``.pth`` file anywhere on the search path:: > > import sys; sys.modules['audit'] = type('audit', (object,), > {'audit': lambda *a: None, 'addhook': lambda *a: None}) > > Multiple layers of protection already exist for monkey patching attacks > against either ``sys`` or ``audit``, but assignments or insertions to > ``sys.modules`` are not audited. > > This idea is rejected because it makes it trivial to suppress all calls > to ``audit``. > > Flag in sys.flags to indicate "audited" mode > -------------------------------------------- > > The proposal is to add a value in ``sys.flags`` to indicate when Python > is running in a "secure" or "audited" mode. This would allow > applications to detect when some features are enabled or when hooks > have been added and modify their behaviour appropriately. > > Currently, we are not aware of any legitimate reasons for a program to > behave differently in the presence of audit hooks. > > Both application-level APIs ``sys.audit`` and > ``importlib.util.open_for_import`` are always present and functional, > regardless of whether the regular ``python`` entry point or some > alternative entry point is used. Callers cannot determine whether any > hooks have been added (except by performing side-channel analysis), nor > do they need to. The calls should be fast enough that callers do not > need to avoid them, and the program is responsible for ensuring that > any added hooks are fast enough to not affect application performance. > > The argument that this is "security by obscurity" is valid, but > irrelevant. 
Security by obscurity is only an issue when there are no > other protective mechanisms; obscurity as the first step in avoiding > attack is strongly recommended (see `this article > <https://danielmiessler.com/study/security-by-obscurity/>`_ for > discussion). > > This idea is rejected because there are no appropriate reasons for an > application to change its behaviour based on whether these APIs are in > use. > > Why Not A Sandbox > ================= > > Sandboxing CPython has been attempted many times in the past, and each > past attempt has failed. Fundamentally, the problem is that certain > functionality has to be restricted when executing the sandboxed code, > but otherwise needs to be available for normal operation of Python. For > example, completely removing the ability to compile strings into > bytecode also breaks the ability to import modules from source code, and > if it is not completely removed then there are too many ways to get > access to that functionality indirectly. There is not yet any feasible > way to generically determine whether a given operation is "safe" or not. > Further information and references available at [2]_. > > This proposal does not attempt to restrict functionality, but simply > exposes the fact that the functionality is being used. Particularly for > intrusion scenarios, detection is significantly more important than > early prevention (as early prevention will generally drive attackers to > use an alternate, less-detectable, approach). The availability of audit > hooks alone does not change the attack surface of Python in any way, but > they enable defenders to integrate Python into their environment in ways > that are currently not possible. > > Since audit hooks have the ability to safely prevent an operation > occurring, this feature does enable the ability to provide some level of > sandboxing. In most cases, however, the intention is to enable logging > rather than creating a sandbox. 
> > Relationship to PEP 551 > ======================= > > This API was originally presented as part of > `PEP 551 <https://www.python.org/dev/peps/pep-0551/>`_ Security > Transparency in the Python Runtime. > > For simpler review purposes, and due to the broader applicability of > these APIs beyond security, the API design is now presented separately. > > PEP 551 is an informational PEP discussing how to integrate Python into > a secure or audited environment. > > References > ========== > > .. [1] Python Performance Benchmark Suite `<https://github.com/python/performance>`_ > > .. [2] Python Security model - Sandbox `<https://python-security.readthedocs.io/security.html#sandbox>`_ > > Copyright > ========= > > Copyright (c) 2019 by Microsoft Corporation. This material may be > distributed only subject to the terms and conditions set forth in the > Open Publication License, v1.0 or later (the latest version is presently > available at http://www.opencontent.org/openpub/). > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/vano%40mail.mipt.ru -- Regards, Ivan