%cd: changes directory of all subkernels
%expand: expands expressions in the cell input before sending it to subkernels
%capture: captures output from subkernels and saves it into Python variables
%render: renders output from subkernels in different formats
A SoS kernel is a master kernel that can start, stop, and interact with any Jupyter kernel; SoS calls these subkernels. Although most of the time SoS just passes user input to subkernels and sends output from subkernels to the frontend (notebook), you can use a few SoS magics to modify user input before it is sent to the subkernel, and to process output before it is sent to the frontend.
Change working directory (magic %cd)
When working with multiple kernels in a notebook, it can be confusing if kernels have different working directories. The rules of thumb in SoS are as follows:
First, subkernels start in the current working directory of the SoS kernel
Out[1]:
'/Users/bpeng1/sos/sos-docs/src/user_guide'
'/Users/bpeng1/sos/sos-docs/src/user_guide'
Second, changing the working directory in one subkernel does not affect other live kernels
'/Users/bpeng1/sos/sos-docs/src'
Out[5]:
'/Users/bpeng1/sos/sos-docs/src/user_guide'
Third, the %cd magic changes the working directory of all subkernels
/Users/bpeng1/sos/sos-docs
Out[7]:
'/Users/bpeng1/sos/sos-docs'
'/Users/bpeng1/sos/sos-docs'
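The first two rules mirror how operating system processes behave in general. The following is only an analogy using plain Python child processes, not SoS kernels; the helper name child_cwd is made up for this sketch.

```python
import os
import subprocess
import sys
import tempfile


def child_cwd(code):
    """Run a short Python snippet in a child process and return what it prints."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


parent_before = os.getcwd()

# Rule 1 (analogy): a newly started process inherits the parent's working directory.
started_in = child_cwd("import os; print(os.getcwd())")

# Rule 2 (analogy): changing directory inside one process leaves the others untouched.
tmp = tempfile.gettempdir()
child_after = child_cwd(f"import os; os.chdir({tmp!r}); print(os.getcwd())")
parent_after = os.getcwd()

print(started_in)
print(child_after)
print(parent_after)
```

The third rule has no counterpart here: a process cannot normally change the working directory of its siblings, which is why a dedicated magic is needed to do so for all subkernels.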
Expand input (magic %expand)
Scripts in SoS cells are by default sent verbatim to SoS or the subkernels. For example,
A parameter {par} is specified.
With variable par defined in the SoS kernel, the %expand magic treats the content of the following cell as a Python f-string and expands expressions inside braces:
[1] "A parameter 100 is specified."
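Since the expansion follows f-string rules, its effect can be previewed in plain Python. This is a minimal sketch of the string handling only; the value of par matches the output above, and nothing here calls SoS itself.

```python
par = 100

# Without expansion, the braces reach the subkernel unchanged.
literal = "A parameter {par} is specified."

# With expansion, the same text behaves like an f-string.
expanded = f"A parameter {par} is specified."

# Any Python expression is allowed inside the braces.
computed = f"Half of the parameter is {par // 2}."

print(literal)
print(expanded)
print(computed)
```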
If the script itself contains braces ({ }), which is quite common in R, you can double the braces so that they are kept as literal braces
[1] "A parameter 100 greater than 50 is specified."
If the script contains many braces, it is better to use a different sigil, such as ${ }, to interpolate the script
[1] "A parameter 100 greater than 50 is specified."
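To make the idea of a sigil concrete, here is a rough sketch of sigil-based interpolation written with a regular expression. It is not the SoS implementation; the function expand, its parameters, and the R-like script are all hypothetical, and eval should not be used this way on untrusted text.

```python
import re


def expand(text, sigil="{ }", namespace=None):
    """Replace every sigil-delimited expression in text with its evaluated value."""
    left, right = sigil.split(" ")
    pattern = re.compile(re.escape(left) + r"(.+?)" + re.escape(right))
    return pattern.sub(
        lambda match: str(eval(match.group(1), {}, namespace or {})), text
    )


# A hypothetical R-like script: only ${ } is interpolated, plain braces are kept.
script = 'if (${par} > 50) { cat("A parameter ${par} greater than 50 is specified.") }'
print(expand(script, sigil="${ }", namespace={"par": 100}))
```

With the ${ } sigil, the braces that belong to the R code need no doubling because they are not preceded by a dollar sign.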
Although not the topic of this tutorial, it is worth mentioning that the %expand magic is used in the same way as the expand option of SoS actions, so you can convert the above script, which was executed in an R session, to an R action in a SoS workflow as follows:
[1] "A parameter 100 greater than 50 is specified."
Expand input in subkernels (magic %expand --in)
If a language module supports the expand interface (not all of them do), you can expand the content of a cell in a specified kernel. This is most naturally used with a markdown kernel, where the expanded text is treated as markdown text.
By default the markdown kernel processes the input text literally:
With the %expand magic, you can expand expressions with variables in the SoS kernel (the default).
Hello, the value of a is 5
If you have variables in R,
you can expand the content of the cell using variables (and expressions) in R:
Hello, the value of a**2 is 10000
If you are more familiar with delimiters in RMarkdown, you can specify the delimiter as an option of %expand.
Hello, the value of a**2 is 10000
Capture cell output (magic %capture)
The %capture magic captures all or part of the output of a cell to a SoS variable. To understand how this magic works, you need to understand how Jupyter works. Briefly, when a cell is executed, the kernel sends one or more stream and display_data messages, along with other control messages, and may send an execute_result message that carries the result of the cell. A stream message can contain standard output (stdout) or standard error (stderr), whereas display_data messages can contain much more complex data in a dictionary with keys such as text/html, text/plain, and text/markdown; the frontend decides how to display these messages.
The %capture magic can capture the following types of information:
stdout: stdout of stream messages
stderr: stderr of stream messages
text: text/plain of display_data or execute_result messages
html: text/html of display_data or execute_result messages
markdown: text/markdown of display_data or execute_result messages
error: evalue of error messages
raw: all of the above messages
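The mapping above can be pictured as a filter over a list of (message type, content) pairs. The sketch below is an illustration of that mapping only, not the code SoS uses; the function select is hypothetical, the second sample message is invented, and joining the pieces into one string is an assumption.

```python
def select(messages, kind):
    """Return the parts of messages that a given capture type would keep."""
    if kind == "raw":
        return messages
    mime = {"text": "text/plain", "html": "text/html", "markdown": "text/markdown"}
    picked = []
    for msg_type, content in messages:
        if kind in ("stdout", "stderr"):
            # stream messages carry a name (stdout or stderr) and the text
            if msg_type == "stream" and content["name"] == kind:
                picked.append(content["text"])
        elif kind in mime:
            # rich output is stored per MIME type under the "data" key
            if msg_type in ("display_data", "execute_result"):
                data = content.get("data", {})
                if mime[kind] in data:
                    picked.append(data[mime[kind]])
        elif kind == "error" and msg_type == "error":
            picked.append(content["evalue"])
    return "".join(picked)


messages = [
    ("stream", {"name": "stdout", "text": "I am from Bash\n"}),
    # hypothetical rich output with two representations of the same value
    ("display_data", {"data": {"text/plain": "42", "text/html": "<b>42</b>"}, "metadata": {}}),
]

print(repr(select(messages, "stdout")))
print(repr(select(messages, "html")))
```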
The first step to capture output from a cell is to determine what types of messages are sent by the cell. If you are uncertain, you can open the console panel (right click and select New Console for Notebook if you are using Jupyter Lab), and use the %capture magic without options (or with the raw option).
The messages returned by the cell will be displayed in the console window
[('stream', {'name': 'stdout', 'text': 'I am from Bash\n'})]
for this cell, from which you can see that the message is a stream message with name stdout. You can then specify the stdout type,
The captured result is by default saved to a variable __captured in the SoS kernel:
You can use option -t (--to) to specify the name of the variable
As a more complex example, the following cell runs a SPARQL query and returns multiple messages.
Return format: JSON
Display: table
Endpoint set to: http://dbpedia.org/sparql
The __captured variable shows all returned messages
Out[27]:
[('display_data', {'data': {'text/html': '<div class="krn-spql"><div class="magic">Return format: JSON</div><div class="magic">Display: table</div><div class="magic">Endpoint set to: http://dbpedia.org/sparql</div></div>', 'text/plain': 'Return format: JSON\nDisplay: table\nEndpoint set to: http://dbpedia.org/sparql\n'}, 'metadata': {}}), ('display_data', {'data': {'text/html': '<div class="krn-spql"><table><tr class=hdr><th>person</th>\n<th>name</th></tr><tr class=odd><td class=val><a href="http://dbpedia.org/resource/Paul_Da_Vinci" target="_other">http://dbpedia.org/resource/Paul_Da_Vinci</a></td>\n<td class=val>Paul Da Vinci</td></tr><tr class=even><td class=val><a href="http://dbpedia.org/resource/Leonardo_da_Vinci" target="_other">http://dbpedia.org/resource/Leonardo_da_Vinci</a></td>\n<td class=val>Leonardo da Vinci</td></tr></table><div class="tinfo">Total: 2, Shown: 2</div></div>'}, 'metadata': {}})]
You can then save the content of text/html to a variable html_table
Return format: JSON
Display: table
Endpoint set to: http://dbpedia.org/sparql
which contains the text/html data of two messages
<div class="krn-spql"><div class="magic">Return format: JSON</div><div class="magic">Display: table</div><div class="magic">Endpoint set to: http://dbpedia.org/sparql</div></div><div class="krn-spql"><table><tr class=hdr><th>person</th> <th>name</th></tr><tr class=odd><td class=val><a href="http://dbpedia.org/resource/Paul_Da_Vinci" target="_other">http://dbpedia.org/resource/Paul_Da_Vinci</a></td> <td class=val>Paul Da Vinci</td></tr><tr class=even><td class=val><a href="http://dbpedia.org/resource/Leonardo_da_Vinci" target="_other">http://dbpedia.org/resource/Leonardo_da_Vinci</a></td> <td class=val>Leonardo da Vinci</td></tr></table><div class="tinfo">Total: 2, Shown: 2</div></div>
Then, if you would like to process the output programmatically, you can use one of the many powerful Python modules
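The cell that produced the output below is not reproduced here. As one possibility, the following sketch collects the link targets using only the standard library html.parser module; it yields plain URL strings rather than the tag objects shown in the output. The class name LinkCollector is made up, and html_table is a shortened stand-in for the captured variable.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag that is fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Shortened stand-in for the captured text/html content.
html_table = (
    '<table><tr class=odd><td class=val>'
    '<a href="http://dbpedia.org/resource/Paul_Da_Vinci" target="_other">Paul</a></td></tr>'
    '<tr class=even><td class=val>'
    '<a href="http://dbpedia.org/resource/Leonardo_da_Vinci" target="_other">Leonardo</a></td></tr>'
    '</table>'
)

collector = LinkCollector()
collector.feed(html_table)
print(collector.links)
```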
Out[30]:
[<a href="http://dbpedia.org/resource/Paul_Da_Vinci" target="_other">http://dbpedia.org/resource/Paul_Da_Vinci</a>, <a href="http://dbpedia.org/resource/Leonardo_da_Vinci" target="_other">http://dbpedia.org/resource/Leonardo_da_Vinci</a>]
Capture formatted content
If the output of a cell is well-formatted, it is possible to capture the output as variables of a type other than str.
For example, suppose you would like to collect the sizes of a few notebook files. Instead of using Python scripts, you could use a shell command as follows
> !ls -l ex*.ipynb | awk '{printf("%s,%d\n", $10, $6)}'
ls: ex*.ipynb: No such file or directory
The output is well formatted so you can capture it in csv format as follows
> !ls -l ex*.ipynb | awk '{printf("%s,%d\n", $10, $6)}'
ls: ex*.ipynb: No such file or directory
The resulting variable is a Pandas DataFrame, but unfortunately the first data line is treated as the header, which is not correct here.
Out[33]:
ls: ex*.ipynb: No such file or directory
The %capture magic can capture data in text (default), json, csv, and tsv format, and can append to an existing variable instead of replacing it (option -a). Please refer to the SoS Magics reference or the command %capture -h for a complete list of options.
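When the captured text has no header line, as in the file-size example above, one way to avoid the header problem is to capture it as plain text and parse it yourself. The sketch below uses the standard csv module; the content of captured is invented sample data, since the shell command above found no matching files.

```python
import csv
import io

# Hypothetical captured text: one "name,size" pair per line, no header line.
captured = "ex1.ipynb,1024\nex2.ipynb,2048\n"

# Every line is treated as data, so the first record is not lost to a header.
reader = csv.reader(io.StringIO(captured))
sizes = {name: int(size) for name, size in reader}

print(sizes)
```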
Render output (magic %render)
The %render magic intercepts the output of a cell and converts it to a certain format before displaying it in the notebook. The format can be any format supported by the IPython.display module and defaults to Markdown.
For example, if you have a dataset
you can format it as HTML
<table> <tr> <th>Firstname</th> <th>Lastname</th> <th>Age</th> </tr> <tr> <td>John</td> <td>Smith</td> <td>50</td> </tr> <tr> <td>Eve</td> <td>Jackson</td> <td>35</td> </tr> </table>
and render it as an HTML table.
Out[36]:
Firstname Lastname Age
John Smith 50
Eve Jackson 35
Currently %render only renders stdout (of stream messages, the default) and text (text/plain of display_data messages) content, so you should probably use %capture raw to check the type of output before you %render.
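Because stdout is what gets rendered by default, a cell only needs to print the HTML text. As a sketch, the table in the example above could be produced from a small dataset like this; the function to_html_table and the variable names are assumptions made for illustration.

```python
from html import escape

header = ("Firstname", "Lastname", "Age")
rows = [("John", "Smith", 50), ("Eve", "Jackson", 35)]


def to_html_table(header, rows):
    """Build an HTML table string from a header tuple and a list of row tuples."""
    lines = ["<table>"]
    lines.append("<tr>" + "".join(f"<th>{escape(str(h))}</th>" for h in header) + "</tr>")
    for row in rows:
        lines.append("<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Printing sends the HTML to stdout, which is the stream that is rendered by default.
print(to_html_table(header, rows))
```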
The %render magic accepts any renderer that is defined in the IPython.display module. The following cell lists all renderers,
Options of magic %render
* DisplayObject
* TextDisplayObject
* Pretty
* HTML
* Markdown
* Math
* Latex
* SVG
* ProgressBar
* JSON
* GeoJSON
* Javascript
* Image
* Video
* Audio
* Code
and of course a %render magic would treat the output as markdown format and display the items as bullet points:
Options of magic %render
The ability to render text output as markdown alleviates a problem with Jupyter notebooks: markdown cells cannot contain variables, so you cannot mix results with their descriptions as easily as with R Markdown inline expressions. However, with the %render magic, you can write markdown text as a string in any kernel and use the %render magic to display it.
For example, if you have res obtained from some analysis
you can report the result by generating markdown text programmatically and using the %render magic to render it
Array of size 5
This is less intuitive than writing markdown text directly, but you have the flexibility to generate the markdown text using any language, and you can use conditions and loops to automate the output of long reports.
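As a small illustration of using loops and conditions, the following sketch builds a markdown report and prints it so that it could be rendered. The values in res and the wording of the report are invented for this example.

```python
# Hypothetical analysis result
res = [0.8, 1.2, 0.5, 2.1, 1.7]

lines = [f"### Array of size {len(res)}", ""]
for i, value in enumerate(res, start=1):
    # A condition decides whether an item gets an extra remark.
    flag = " (above 1)" if value > 1 else ""
    lines.append(f"* item {i}: {value}{flag}")

report = "\n".join(lines)
print(report)
```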