SoS can interact with any Jupyter kernel. As shown in the SoS notebook tutorial, SoS can use the %expand magic to prepare input before sending it to the kernel, the %capture magic to capture the output from the kernel, and the %render magic to render output from the kernel, all without knowing what the kernel does.
However, if the kernel supports the concept of variables (not all kernels do), a language module for the kernel allows SoS to work more efficiently with the kernel. More specifically, SoS can then support the following magics for that kernel: %cd; %put, %get, and %with; %expand --in; %preview; and %sessioninfo.
Whereas data exchange among subkernels is really powerful, it is important to understand that SoS does not transfer any variables among kernels. Instead, it creates independent homonymous variables of similar types that are native to the destination language. For example, if you have two variables, a and b, in R and execute a magic in a SoS cell to request them, SoS actually executes statements in the background to create variables a and b in Python.
These variables are independent, so changing the value of a or b in one kernel will not affect the other. We also note that a and b are of different types in Python although they are of the same numeric type in R (a is, technically speaking, an array of size 1).
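The idea can be sketched in plain Python, with no kernel involved. The helper r_numeric_to_python, the conversion rule (a length-1 vector becomes a scalar, a longer one becomes a list), and the sample values are all assumptions made for illustration; this is not the conversion code SoS uses.

```python
import copy

def r_numeric_to_python(values):
    """Hypothetical rule: a length-1 vector becomes a scalar, a longer one a list."""
    values = copy.deepcopy(list(values))  # the result shares nothing with the source
    return values[0] if len(values) == 1 else values

# Made-up stand-ins for two R numeric vectors
r_side = {'a': [1.0], 'b': [1.0, 2.0, 3.0]}
py_side = {name: r_numeric_to_python(vec) for name, vec in r_side.items()}

# Changing the Python copy leaves the "R" values untouched
py_side['b'].append(4.0)
```

Here py_side['a'] is a float while py_side['b'] is a list, even though both started out as the same kind of vector.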
The best way to start a new language module is to read the source code of an existing language module and adapt it to your language. Our GitHub organization has a number of language modules. Module sos-r is a good choice, and you should try to match the corresponding items with the code in kernel.py when going through this tutorial.
To support a new language, you will need to write a Python package that defines a class, say mylanguage, that provides the following class attributes:
supported_kernels
supported_kernels should be a dictionary that maps each language to the names of the kernels that support it. For example, ir is the name of the kernel for language R, so this attribute should be defined as:
supported_kernels = {'R': ['ir']}
If multiple kernels are supported, SoS will look for a kernel with a matching name in the order specified. This is the case for JavaScript, where multiple kernels are available:
supported_kernels = {'JavaScript': ['ijavascript', 'inodejs']}
Multiple languages can be specified if a language module supports multiple languages. For example, MATLAB and Octave share the same language module:
supported_kernels = {'MATLAB': ['imatlab', 'matlab'], 'Octave': ['octave']}
Wildcard characters are allowed in kernel names, which is useful for kernels whose names contain version numbers:
supported_kernels = {'Julia': ['julia-?.?']}
Finally, if SoS cannot find any kernel that it recognizes, it will look into the language information of the kernelspec.
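The ordered lookup with wildcards can be illustrated with the standard fnmatch module. This is only a sketch: find_kernel is a hypothetical helper, the list of installed kernels is made up, and it assumes that SoS wildcards behave like shell-style patterns.

```python
from fnmatch import fnmatch

def find_kernel(patterns, installed):
    """Return the first installed kernel that matches, honoring the order of patterns."""
    for pattern in patterns:
        for name in installed:
            if fnmatch(name, pattern):
                return name
    return None

# Made-up list of kernels available on a system
installed = ['python3', 'julia-1.6', 'inodejs']

julia = find_kernel(['julia-?.?'], installed)
javascript = find_kernel(['ijavascript', 'inodejs'], installed)
missing = find_kernel(['ir'], installed)
```

With this list, the Julia pattern resolves to julia-1.6, the JavaScript lookup falls through to inodejs because ijavascript is not installed, and the R lookup finds nothing.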
background_color
background_color should be a name or #XXXXXX value for a color that will be used in the prompt area of cells that are executed by the subkernel. An empty string can be used to keep the default notebook color. If the language module defines multiple languages, a dictionary {language: color} can be used to specify different colors for the supported languages. For example,
background_color = {'MATLAB': '#8ee7f1', 'Octave': '#dff8fb'}
is used for MATLAB and Octave.
cd_command
cd_command is a command to change the current working directory, specified with {dir} interpolated with the option of magic %cd. For example, the command for R is
cd_command = 'setwd({dir!r})'
where !r quotes the provided dir. Note that { } are used as in a Python f-string, but no f prefix should be used.
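To see what the !r conversion does, the template can be filled in by hand. The sketch below assumes the placeholder is substituted with something equivalent to str.format; the directory name is made up.

```python
# A plain string template: no f prefix, so {dir!r} is left for later substitution
cd_command = 'setwd({dir!r})'

# Assumption: substitution behaves like str.format, where !r applies repr() to the value
statement = cd_command.format(dir='/tmp/my project')
print(statement)  # setwd('/tmp/my project')
```

Because repr() adds the quotes, directory names that contain spaces still reach the subkernel as a single quoted string.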
options
A Python dictionary with options that will be passed to the frontend. Currently two options, variable_pattern and assignment_pattern, are supported. Both options should be regular expressions in JS style.
Option variable_pattern is used to identify whether a statement is a simple variable (and nothing else). If this option is defined and the input text (if executed at the side panel) matches the pattern, SoS will prepend %preview to the code. This option is useful only when %preview var displays more information than var.
Option assignment_pattern is used to identify whether a statement is an assignment operation. If this option is defined and the input text matches the pattern, SoS will prepend %preview var to the code, where var should be the first matched portion of the pattern (use ( )). This mechanism allows SoS to automatically display the result of an assignment when you step through the code.
Both options are optional.
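The following sketch shows how two such patterns could classify input. The patterns are illustrative guesses for an R-like language, not copied from an actual module, and preview_prefix is a hypothetical helper. The real matching happens in the JavaScript frontend; Python's re module is used here only because these simple patterns behave the same way in both engines.

```python
import re

# Illustrative patterns for an R-like language (assumptions, not from a real module)
options = {
    'variable_pattern': r'^\s*[_A-Za-z][_A-Za-z0-9.]*\s*$',
    'assignment_pattern': r'^\s*([_A-Za-z][_A-Za-z0-9.]*)\s*(=|<-).*$',
}

def preview_prefix(code):
    """Return the %preview line that would be prepended, or None if nothing matches."""
    if re.match(options['variable_pattern'], code):
        return '%preview ' + code.strip()
    matched = re.match(options['assignment_pattern'], code)
    if matched:
        # The first group of the pattern names the variable being assigned
        return '%preview ' + matched.group(1)
    return None

for_variable = preview_prefix('df')
for_assignment = preview_prefix('x <- c(1, 2)')
for_call = preview_prefix('print(x)')
```

A bare variable and an assignment both get a %preview line, while an ordinary function call is left alone.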
__version__
This attribute, if provided, will be included in the debug message when the language module is loaded. This helps you, for example, to check if the correct version of the language module has been loaded if you have multiple instances of python, sos, and/or language module available.
An instance of the class will be initialized with the SoS kernel and the name of the subkernel, which does not have to be one of the supported_kernels (it could be self-defined), and should provide the following attributes and functions. Because these attributes are instantiated with the kernel name, they can vary (slightly) from kernel to kernel.
init_statements
init_statements is a statement that will be executed by the subkernel when the kernel starts. This statement usually defines a number of utility functions.
get_vars(self, names)
Function get_vars should be a Python function that transfers the specified Python variables to the subkernel. We will discuss this in detail in the next section.
put_vars(self, items, to_kernel=None)
Function put_vars should be a Python function that puts one or more variables in the subkernel to SoS or another subkernel. We will discuss this in detail in the next section.
expand(self, text, sigil)
(new in SoS Notebook 0.20.8)
Function expand should be a Python function that accepts text (most likely in Markdown format) with inline expressions, evaluates the expressions in the subkernel, and returns the expanded text. This can be used by the markdown kernel for the execution of inline expressions of, for example, R markdown text.
preview(self, item)
Function preview accepts a name, which should be the name of a variable in the subkernel. This function should return a tuple of two items (desc, preview), where
desc should be a text (can be empty) that describes the type, size, dimension, or other general information of the variable, which will be displayed after the variable name.
preview can be a str that is printed as stdout; a dictionary with keys such as text/plain, text/html, and image/png and the corresponding data, which will be sent directly as display_data and allows you to return different types of preview result, even images; or a pair of items, the first being the data dictionary and the second being the metadata dictionary for a display_data message.
sessioninfo(self)
Function sessioninfo should be a Python function that returns information about the running kernel, usually including the version of the language, the kernel, and the currently used packages and their versions. For R, this means a call to the sessionInfo() function. The return value of this function can take one of several forms, for example (key, value) pairs. The function will be called by the %sessioninfo magic of SoS.
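Putting the class attributes and instance members together, a language module might start from a skeleton like the one below. Every name and value here is a placeholder chosen for illustration (the class name sos_MyLanguage, the kernel name mylanguage, the color, the cd command, the version), and the method bodies are stubs rather than working implementations.

```python
class sos_MyLanguage:
    # Class attributes, available before any subkernel is started
    supported_kernels = {'MyLanguage': ['mylanguage']}
    background_color = '#dff8fb'
    cd_command = 'cd({dir!r})'      # placeholder command for a hypothetical language
    options = {}
    __version__ = '0.1.0'

    def __init__(self, sos_kernel, kernel_name='mylanguage'):
        self.sos_kernel = sos_kernel
        self.kernel_name = kernel_name
        # Statement run by the subkernel at startup, usually defining helper functions
        self.init_statements = ''

    def get_vars(self, names):
        raise NotImplementedError('transfer of variables to the subkernel')

    def put_vars(self, items, to_kernel=None):
        raise NotImplementedError('transfer of variables from the subkernel')

    def preview(self, item):
        # (desc, preview): a short description and a str printed as stdout
        return '', f'no preview available for {item}'

    def sessioninfo(self):
        # One possible return form: a list of (key, value) pairs
        return [('Kernel', self.kernel_name)]

# The SoS kernel is not needed by the stubs, so None stands in for it here
lang = sos_MyLanguage(None, 'mylanguage')
info = lang.sessioninfo()
```

Starting from stubs like these lets the module be registered and loaded early, with get_vars and put_vars filled in afterwards.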
The get_vars function should be defined as
def get_vars(self, var_names)
where
self is the language instance with access to the SoS kernel, and
var_names are names in the sos dictionary.
This function is responsible for probing the type of each Python variable and creating a similar object in the subkernel.
For example, to create a Python object b = [1, 2] in R (magic %get), this function could generate an R expression that creates the variable (e.g. b <- c(1, 2)) and execute it in the subkernel to create variable b in it.
Note that the function get_vars can change the variable name because a valid variable name in Python might not be a valid variable name in another language. The function should give a warning (call self.sos_kernel.warn()) if this happens.
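A minimal sketch of such a get_vars for an R-like subkernel follows. It is not the code of any real module: FakeSoSKernel is a stand-in that records calls instead of talking to a subkernel, a plain dict replaces the SoS dictionary, to_r_expr is a hypothetical converter that handles only a few simple types, and the renaming rule (a leading underscore becomes a dot) is an assumption.

```python
class FakeSoSKernel:
    """Stand-in for the SoS kernel that records calls instead of running them."""
    def __init__(self):
        self.statements, self.warnings = [], []

    def run_cell(self, statement, background, silent, on_error=''):
        self.statements.append(statement)

    def warn(self, msg):
        self.warnings.append(msg)

def to_r_expr(obj):
    """Translate a few simple Python values into R source code."""
    if isinstance(obj, bool):
        return 'TRUE' if obj else 'FALSE'
    if isinstance(obj, (int, float)):
        return repr(obj)
    if isinstance(obj, str):
        return '"' + obj.replace('\\', '\\\\').replace('"', '\\"') + '"'
    if isinstance(obj, (list, tuple)):
        return 'c(' + ', '.join(to_r_expr(x) for x in obj) + ')'
    raise TypeError(f'unsupported type {type(obj).__name__}')

class sos_MyR:
    def __init__(self, sos_kernel, sos_dict):
        self.sos_kernel = sos_kernel
        self.sos_dict = sos_dict  # assumption: a plain dict stands in for the sos dictionary

    def get_vars(self, var_names):
        for name in var_names:
            new_name = name
            if name.startswith('_'):
                # R names cannot start with an underscore, so rename and warn
                new_name = '.' + name[1:]
                self.sos_kernel.warn(f'Variable {name} is passed to R as {new_name}')
            statement = f'{new_name} <- {to_r_expr(self.sos_dict[name])}'
            self.sos_kernel.run_cell(statement, True, False,
                                     on_error=f'Failed to get variable {name}')

kernel = FakeSoSKernel()
sos_MyR(kernel, {'b': [1, 2], '_tmp': 'x'}).get_vars(['b', '_tmp'])
```

After the call, kernel.statements holds the R assignments that would have been executed and kernel.warnings holds one message about the renamed variable.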
The put_vars function should be defined as
def put_vars(self, var_names, to_kernel=None)
where
self is the language instance with access to the SoS kernel,
var_names is a list of variables that should exist in the subkernel, and
to_kernel is the destination kernel to which the variables should be passed.
What this function does can depend on the destination kernel. A language can basically start with an implementation of put_vars(to_kernel='sos') and let SoS handle the rest. If the need arises, it can later handle other destination kernels itself.
NOTE: SoS Notebook before version 0.20.5 supported a feature called automatic variable transfer, which automatically transferred variables with names starting with sos between kernels. This feature has been deprecated (#253).
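For example, to send an R object b <- c(1, 2) from subkernel R to SoS (magic %put), this function can have the subkernel produce a Python-readable representation of the variable, such as "{'b': [1, 2]}", and convert it to a Python dictionary.
The R sos extension provides a good example to get you started.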
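A sketch of that round trip, under several assumptions: FakeSoSKernel returns a canned reply instead of running R, the message data is assumed to be the content of a Jupyter stream message (with its text under the text key), py_repr_of is a hypothetical R helper that would print variables as a Python dictionary literal, and returning a dictionary to SoS is assumed to be an acceptable result for to_kernel='sos'.

```python
import ast

class FakeSoSKernel:
    """Stand-in that returns a canned reply instead of running R."""
    def get_response(self, statement, msg_types, name=()):
        # Assumption: each item is (msg_type, msg_data) with stream text under 'text'
        return [('stream', {'name': 'stdout', 'text': "{'b': [1, 2]}"})]

class sos_MyR:
    def __init__(self, sos_kernel):
        self.sos_kernel = sos_kernel

    def put_vars(self, var_names, to_kernel=None):
        # py_repr_of is a hypothetical R function defined by the module's init statements
        quoted = ', '.join(f'"{name}"' for name in var_names)
        statement = f'cat(py_repr_of(c({quoted})))'
        reply = self.sos_kernel.get_response(statement, ('stream',), name=('stdout',))
        text = reply[0][1]['text']
        # literal_eval only accepts Python literals, so arbitrary code is not executed
        return ast.literal_eval(text)

result = sos_MyR(FakeSoSKernel()).put_vars(['b'])
```

Using ast.literal_eval rather than eval is a design choice of this sketch: it restricts what the subkernel output can contain to plain literals.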
NOTE: Unlike other language extension mechanisms in which the Python module can get hold of the "engine" of the interpreter (e.g. saspy and MATLAB's Python extension start the interpreter for direct communication) or have access to a lower-level API of the language (e.g. rpy2), SoS only has access to the interface of the language and performs all conversions by executing commands in the subkernels and intercepting their responses.
Also, although it can be more efficient to save large datasets to disk files and load in another kernel, this method does not work for kernels that do not share the same filesystem. We currently ignore this issue and assume all kernels have access to the same file system.
With access to an instance of SoS kernel, you can call various functions of this kernel. However, the SoS kernel does not provide a stable API yet so you are advised to use only the following functions:
sos_kernel.warn(msg)
This function produces a warning message.
sos_kernel.run_cell(statement, True, False, on_error='msg')
Execute a statement in the current subkernel, with True, False indicating that the execution should be done in the background and no output should be displayed. A message on_error will be displayed if the statement fails to execute.
sos_kernel.get_response(statement, msg_type, name)
This function executes the statement and collects messages sent back from the subkernel. Only messages of the specified msg_type are kept (e.g. stream, display_data), and name can be one or both of stdout and stderr when stream is specified.
The returned value is a list of
msg_type, msg_data
msg_type, msg_data
...
so
self.sos_kernel.get_response('ls()', ('stream', ), name=('stdout', ))[0][1]
runs a function ls() in the subkernel, collects stdout, and gets the content of the first message.
If you are having trouble figuring out what messages have been returned from subkernels (e.g. display_data and stream can look alike), you can use the %capture magic to show them in the console panel.
You can also define the environment variable SOS_DEBUG=MESSAGE (or MESSAGE,KERNEL etc.) before starting the notebook server. This will cause SoS to, among other things, log messages processed by the get_response function to ~/.sos/sos_debug.log.
If you would like to add your own debug messages to the log file, you can
from sos.utils import env
env.log_to_file('VARIABLE', f'Processing {var} of type {var.__class__.__name__}.')
If the log message can be expensive to format, you can check if SOS_DEBUG is defined before logging to the log file:
if 'VARIABLE' in env.config['SOS_DEBUG'] or 'ALL' in env.config['SOS_DEBUG']:
    env.log_to_file('VARIABLE', f'Processing {var} of type {var.__class__.__name__}.')
Although you can test your language module in many ways, it is highly recommended that you adopt a standard set of selenium-based tests that are executed by pytest. To create and run these tests, you should install selenium and pytest; set the environment variable JUPYTER_TEST_BROWSER to live if you would like to see the tests running (otherwise the tests will be run in a virtual Chrome browser without display); and take the tests from sos-r and adapt them for your language.
The test suite contains three files:
conftest.py
This is the configuration file for pytest that defines how to start a Jupyter server with the notebook and the right kernel. You can simply copy this file for your purpose.
test_interface.py
This file contains tests on the interface of the language module, including the %cd magic, the %put and %get magics, sos variables (variables with names starting with sos), the %preview magic, and the %sessioninfo magic.
test_data_exchange.py
This file should contain tests for data exchange between SoS (Python) and the language, and optionally between subkernels. The tests should be separated by data type and direction of data transfer.
All tests should be derived from NotebookTest, which is imported from sos_notebook.test_utils, and use a pytest fixture notebook as follows:
from sos_notebook.test_utils import NotebookTest

class TestDataExchange(NotebookTest):
    def test_something(self, notebook):
        pass
The notebook fixture
The notebook fixture that is passed to each test function contains a notebook instance that you can operate on. Although there are a large number of functions, you most likely only need to learn two of them for your tests:
notebook.call(statement, kernel, expect_error=False)
This function appends a new cell to the end of the notebook, inserts the specified statement as its content, changes the kernel of the cell to kernel, and executes the cell. It automatically dedents statement so you can indent multiple statements and call
notebook.call('''\
    %put df --to R
    import pandas as pd
    import numpy as np
    arr = np.random.randn(1000)
    arr[::10] = np.nan
    df = pd.DataFrame({'column_{0}'.format(i): arr for i in range(10)})
    ''', kernel='SoS')
This function returns the index of the cell so that you can call notebook.get_cell_output(idx) if needed. If you are supposed to see some warning messages, use expect_error=True. Otherwise the function will raise an exception that fails the test.
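The effect of dedenting can be reproduced with the standard textwrap module. This sketch assumes the dedent step behaves like textwrap.dedent, which is not confirmed by the text above; the statement itself is a shortened, made-up cell.

```python
import textwrap

# An indented multi-line statement, as it would appear inside a test method
statement = '''\
    %put df --to R
    import pandas as pd
    df = pd.DataFrame({'x': [1, 2]})
    '''

# Assumption: the cell content is produced by something equivalent to textwrap.dedent
cell_source = textwrap.dedent(statement)
lines = cell_source.splitlines()
```

After dedenting, the magic starts at the first column of the first line, which is where a cell magic needs to be.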
notebook.check_output(statement, kernel, expect_error=False, selector=None, attribute=None)
This function calls notebook.call(statement, kernel) and then notebook.get_cell_output(idx, selector, attribute) to get the output. The output contains all the text of the output, and additional text from non-text elements. For example, selector='img', attribute='src' would return the text in <img src="blah"> output. Using this function, most of your unit tests can look like the following:
def test_sessioninfo(self, notebook):
    assert 'R version' in notebook.check_output('%sessioninfo', kernel="SoS")
Registering the new language module
To register a language module with SoS, you will need to add your module to an entry point under section sos_language. This can be done by adding something like the following to your setup.py:
entry_points='''
[sos_language]
Perl = sos_perl.kernel:sos_Perl
'''
With the installation of this package, sos will be able to import a class sos_Perl from module sos_perl.kernel and use it to work with the Perl language.
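The value of an entry point has the form module:attribute. The sketch below shows roughly how such a value maps to a class. The resolve helper is hypothetical; the real lookup is done by the Python packaging machinery, not by code like this, and a standard-library target is used so the example runs without sos_perl installed.

```python
from importlib import import_module

def resolve(entry_value):
    """Split 'package.module:ClassName' and import the named attribute."""
    module_name, _, attribute = entry_value.partition(':')
    return getattr(import_module(module_name), attribute)

# 'sos_perl.kernel:sos_Perl' would be resolved the same way once that package is installed
cls = resolve('collections:OrderedDict')
```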