Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
Documentation for the latest release is hosted on readthedocs.
Here are some good things about gokart.

- `pkl` file with hash value
- `pandas.DataFrame` type and column checking during I/O

All the functions above are created for constructing machine learning batches. gokart provides an excellent environment for reproducibility and team development.
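To make the "`pkl` file with hash value" idea concrete, here is a minimal standard-library sketch of a parameter-hashed pickle cache. It is an illustration only, not gokart's implementation: `cache_path`, `run_cached`, the `Scale` task name, and the choice of MD5 over JSON-encoded parameters are all assumptions made for this example.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def cache_path(workspace: Path, task_name: str, params: dict) -> Path:
    """Hypothetical helper: derive a file name from a hash of the parameters."""
    # Sorting the keys makes the hash independent of dict ordering.
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.md5(encoded).hexdigest()
    return workspace / f"{task_name}_{digest}.pkl"


def run_cached(workspace: Path, task_name: str, params: dict, compute):
    """Hypothetical helper: return (result, was_cached)."""
    path = cache_path(workspace, task_name, params)
    if path.exists():
        # Same parameters as an earlier run: reuse the stored result.
        with path.open("rb") as f:
            return pickle.load(f), True
    result = compute(**params)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result, False


def scale(factor: int, value: int) -> int:
    return factor * value


workspace = Path(tempfile.mkdtemp())
first = run_cached(workspace, "Scale", {"factor": 2, "value": 21}, scale)
second = run_cached(workspace, "Scale", {"factor": 2, "value": 21}, scale)
third = run_cached(workspace, "Scale", {"factor": 3, "value": 21}, scale)
print(first, second, third)
```

In this sketch, changing a parameter changes the hash, so the changed run gets a new file and is recomputed while unchanged runs are read back from disk.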
Here are some non-goals and downsides of gokart.
Within the activated Python environment, run `pip install gokart` to install gokart.
A minimal gokart task looks something like this:
```python
import gokart


class Example(gokart.TaskOnKart):
    def run(self):
        self.dump('Hello, world!')


task = Example()
output = gokart.build(task)
print(output)
```
`gokart.build` returns the result dumped by the `gokart.TaskOnKart`, so the example prints `Hello, world!`.
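The relationship between `run`, `dump`, and `build` can be pictured with a toy model written in plain Python. This is only a sketch under stated assumptions: `MiniTask` and this `build` function are hypothetical stand-ins that borrow gokart's method names, and they say nothing about how gokart or luigi actually schedule tasks or store outputs.

```python
class MiniTask:
    """Hypothetical stand-in for a task; this is not gokart.TaskOnKart."""

    def __init__(self):
        self._dumped = None
        self._inputs = {}

    def requires(self):
        # Mapping of name -> upstream task; empty means no dependencies.
        return {}

    def load(self, name):
        # Read the value dumped by the upstream task registered under `name`.
        return self._inputs[name]

    def dump(self, value):
        self._dumped = value

    def run(self):
        raise NotImplementedError


def build(task):
    """Toy build: run upstream tasks first, then return what `task` dumped."""
    task._inputs = {name: build(dep) for name, dep in task.requires().items()}
    task.run()
    return task._dumped


class Hello(MiniTask):
    def run(self):
        self.dump("Hello")


class Greet(MiniTask):
    def requires(self):
        return {"greeting": Hello()}

    def run(self):
        self.dump(self.load("greeting") + ", world!")


output = build(Greet())
print(output)
```

The toy keeps everything in memory and rebuilds every task on each call.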
gokart supports type annotations to make pipelines more robust. The following example shows how to use them. Before using this feature, make sure the mypy plugin is enabled in your project.
```python
import gokart


# `gokart.TaskOnKart[str]` means that the task dumps `str`
class StrDumpTask(gokart.TaskOnKart[str]):
    def run(self):
        self.dump('Hello, world!')


# `gokart.TaskOnKart[int]` means that the task dumps `int`
class OneDumpTask(gokart.TaskOnKart[int]):
    def run(self):
        self.dump(1)


# `gokart.TaskOnKart[int]` means that the task dumps `int`
class TwoDumpTask(gokart.TaskOnKart[int]):
    def run(self):
        self.dump(2)


class AddTask(gokart.TaskOnKart[int]):
    # `a` requires a task to dump `int`
    a: gokart.TaskOnKart[int] = gokart.TaskInstanceParameter()
    # `b` requires a task to dump `int`
    b: gokart.TaskOnKart[int] = gokart.TaskInstanceParameter()

    def requires(self):
        return dict(a=self.a, b=self.b)

    def run(self):
        # loading by instance parameter,
        # `a` and `b` are treated as `int`
        # because they are declared as `gokart.TaskOnKart[int]`
        a = self.load(self.a)
        b = self.load(self.b)
        self.dump(a + b)


valid_task = AddTask(a=OneDumpTask(), b=TwoDumpTask())

# the next line will show type error by mypy
# because `StrDumpTask` dumps `str` and `AddTask` requires `int`
invalid_task = AddTask(a=OneDumpTask(), b=StrDumpTask())
```
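The general mechanism behind an annotation like `TaskOnKart[int]` is a generic class parameterized by its output type. The sketch below shows that mechanism using only the standard `typing` module. `Task` and `add` are hypothetical names for this illustration, and it does not reproduce gokart's classes or its mypy plugin.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Task(Generic[T]):
    """Hypothetical stand-in for a task that produces a value of type T."""

    def __init__(self, value: T) -> None:
        self._value = value

    def output(self) -> T:
        return self._value


def add(a: Task[int], b: Task[int]) -> int:
    # Both arguments are declared to produce `int`, so `+` is well typed.
    return a.output() + b.output()


total = add(Task(1), Task(2))
print(total)

# A static checker such as mypy is expected to reject the call below,
# because a task producing `str` is passed where `Task[int]` is required.
# Without the checker, the mistake only surfaces at runtime.
mismatch_failed = False
try:
    add(Task(1), Task("Hello, world!"))
except TypeError:
    mismatch_failed = True
print(mismatch_failed)
```

The point of the annotation is to move this failure from runtime, possibly deep inside a long pipeline run, to the type-checking step.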
This is an introduction to some of gokart's features. There are many more useful features.
Please see the documentation.
Have a good gokart life.
Gokart is a proven product.
gokart is a wrapper for luigi. Thanks to luigi and dependent projects!