composer PyPI package!
Composer v0.13.1 is released!
Composer can now also be installed using the new composer PyPI package via pip:
pip install composer==0.13.1
The legacy package name still works via pip:
pip install mosaicml==0.13.1
Note: The mosaicml==0.13.0 PyPI package was yanked due to minor packaging issues discovered after release. The package was re-released as Composer v0.13.1, so these release notes cover both v0.13.0 and v0.13.1.
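To confirm which release ended up in an environment, a small check like the following may help. It is a minimal sketch that assumes the distribution names match the two pip commands above (composer and mosaicml); the helper name installed_composer_version is hypothetical and not part of Composer.

```python
from importlib import metadata


def installed_composer_version(names=("composer", "mosaicml")):
    """Return (distribution_name, version) for the first installed name, else None.

    The default names are assumed from the pip commands in these notes.
    """
    for name in names:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed; try the next candidate.
            continue
    return None


print(installed_composer_version())
```

If neither distribution is installed, the helper returns None instead of raising.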
New and Updated Callbacks
New HealthChecker Callback (#2002)
The callback will log a warning if the GPUs on a given node appear to be in poor health (low utilization). The callback can also be configured to send a Slack message!
from composer import Trainer
from composer.callbacks import HealthChecker

# Warn if GPU utilization difference drops below 10%
health_checker = HealthChecker(
    threshold=10,
)

# Construct Trainer
trainer = Trainer(
    ...,
    callbacks=health_checker,
)

# Train!
trainer.fit()
Updated MemoryMonitor to use gigabyte (GB) units (#1940)
New RuntimeEstimator Callback (#1991)
Estimate the remaining runtime of your job! The callback approximates the time remaining by observing the throughput and comparing it to the number of batches remaining.
from composer import Trainer
from composer.callbacks import RuntimeEstimator

# Construct trainer with RuntimeEstimator callback
trainer = Trainer(
    ...,
    callbacks=RuntimeEstimator(),
)

# Train!
trainer.fit()
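The idea behind the estimate can be sketched in a few lines of plain Python. This is only an illustration of "throughput versus batches remaining", not the callback's actual implementation; the function name and arguments are hypothetical.

```python
def estimate_remaining_seconds(batches_done, total_batches, elapsed_seconds):
    """Estimate remaining wall-clock time from the throughput observed so far.

    Returns None when no throughput can be measured yet.
    """
    if batches_done <= 0 or elapsed_seconds <= 0:
        return None
    # Observed throughput in batches per second.
    batches_per_sec = batches_done / elapsed_seconds
    # Batches still to process (never negative).
    batches_remaining = max(total_batches - batches_done, 0)
    return batches_remaining / batches_per_sec


# 200 batches in 100 s is 2 batches/s; 800 batches remain.
print(estimate_remaining_seconds(200, 1000, 100.0))  # 400.0
```

An estimate like this assumes throughput stays roughly constant, so it will drift if evaluation or checkpointing pauses are not accounted for.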
Updated SpeedMonitor throughput metrics (#1987)
Expands throughput metrics to track relative to several different time units and per device:
throughput/batches_per_sec and throughput/device/batches_per_sec
throughput/tokens_per_sec and throughput/device/tokens_per_sec
throughput/flops_per_sec and throughput/device/flops_per_sec
throughput/device/samples_per_sec
Also adds the throughput/device/mfu metric to compute per-device model FLOPs utilization (MFU). Simply enable the SpeedMonitor callback as usual to log these new metrics! Please see the SpeedMonitor documentation for more information.
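As a rough illustration of how the per-device metrics relate to the totals, the sketch below divides a total throughput by the number of devices and expresses the result as a fraction of a device's peak FLOPs. The function names and the peak figure are assumptions chosen for the example; the callback's exact formula may differ, so the SpeedMonitor documentation remains the reference.

```python
def per_device(total_per_sec, world_size):
    """Split a total throughput evenly across devices."""
    return total_per_sec / world_size


def model_flops_utilization(device_flops_per_sec, peak_device_flops_per_sec):
    """MFU: achieved FLOPs per second as a fraction of the device's peak."""
    return device_flops_per_sec / peak_device_flops_per_sec


# Hypothetical numbers: 8 devices, 1.248e15 total FLOPs/s,
# and an illustrative peak of 312e12 FLOPs/s per device.
device_flops = per_device(1_248_000_000_000_000, 8)
mfu = model_flops_utilization(device_flops, 312_000_000_000_000)
print(device_flops, mfu)
```

With these made-up inputs each device sustains 1.56e14 FLOPs/s, which is half of the assumed peak.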
⣿ FSDP Sharded Checkpoints (#1902)
Users can now specify the state_dict_type in the fsdp_config dictionary to enable sharded checkpoints. For example:
from composer import Trainer

fsdp_config = {
    'sharding_strategy': 'FULL_SHARD',
    'state_dict_type': 'local',
}

trainer = Trainer(
    ...,
    fsdp_config=fsdp_config,
    save_folder='checkpoints',
    save_filename='ba{batch}_rank{rank}.pt',
    save_interval='10ba',
)
Please see the PyTorch FSDP docs and Composer's Distributed Training notes for more information.
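With sharded checkpoints each rank writes its own file, which is why the save_filename in the example includes {rank}. The sketch below shows how such a template expands for one save point. It uses plain str.format purely for illustration (Composer performs its own filename formatting), and the helper name is hypothetical.

```python
SAVE_FILENAME = "ba{batch}_rank{rank}.pt"


def sharded_checkpoint_names(batch, world_size, template=SAVE_FILENAME):
    """List the per-rank filenames one save point would produce."""
    return [template.format(batch=batch, rank=rank) for rank in range(world_size)]


print(sharded_checkpoint_names(batch=10, world_size=4))
# ['ba10_rank0.pt', 'ba10_rank1.pt', 'ba10_rank2.pt', 'ba10_rank3.pt']
```

Without {rank} in the template, every rank would resolve to the same filename.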
🤗 HuggingFace Improvements
[...] HuggingFaceModel class to support encoder-decoder batches without decoder_input_ids (#1950)
[...] HuggingFaceModel directly (#1971)
[...] HuggingFaceModel and write out the expected config.json and pytorch_model.bin in the HuggingFace pretrained folder (#1974)
Nvidia H100 Alpha Support - Added amp_fp8 data type
In preparation for H100's arrival, we've added the amp_fp8 precision type. Currently, setting amp_fp8 specifies a new precision context using transformer_engine.pytorch.fp8_autocast.
For more details, please see Nvidia's new Transformer Engine and the specific fp8 recipe we utilize.
from composer import Trainer

trainer = Trainer(
    ...,
    precision='amp_fp8',
)
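Since amp_fp8 relies on the transformer_engine package, a script shared across machines might choose the precision string conditionally. The sketch below is one possible approach, not something Composer provides: the helper name and the 'amp_bf16' fallback are assumptions, and finding the package does not by itself guarantee FP8-capable hardware.

```python
import importlib.util


def choose_precision(fallback="amp_bf16"):
    """Return 'amp_fp8' when transformer_engine is importable, else the fallback.

    This only checks that the package can be found; it does not verify
    that the GPU actually supports FP8.
    """
    has_transformer_engine = importlib.util.find_spec("transformer_engine") is not None
    return "amp_fp8" if has_transformer_engine else fallback


precision = choose_precision()
print(precision)
# The chosen value could then be passed as Trainer(..., precision=precision).
```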
The torchmetrics package has been upgraded to 0.11.x.
The torchmetrics.Accuracy metric now requires a task argument, which can take a value of binary, multiclass, or multilabel. Please see the Torchmetrics Accuracy docs for details.
Additionally, since specifying task='multiclass' requires an additional num_classes field, we've updated ComposerClassifier to accept a num_classes argument. Please see PRs #2017 and #2025 for additional details.
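When migrating existing metric construction, it can help to gather the newly required arguments in one place. The helper below is a hypothetical sketch (it is not part of Composer or torchmetrics) that validates the task value and insists on num_classes for the multiclass case, mirroring the requirement described above.

```python
VALID_TASKS = ("binary", "multiclass", "multilabel")


def accuracy_kwargs(task, num_classes=None):
    """Build keyword arguments for an Accuracy metric under torchmetrics 0.11.x."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    kwargs = {"task": task}
    if task == "multiclass":
        if num_classes is None:
            raise ValueError("task='multiclass' requires num_classes")
        kwargs["num_classes"] = num_classes
    return kwargs


kwargs = accuracy_kwargs("multiclass", num_classes=10)
print(kwargs)  # {'task': 'multiclass', 'num_classes': 10}
# With torchmetrics 0.11.x installed, these could be passed as
# torchmetrics.Accuracy(**kwargs).
```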
Surgery algorithms used in functional form return a value of None (#1543)
[...] ProgressBarLogger and ConsoleLogger to loggers (#1846)
[...] HuggingFaceModel crashes if config.return_dict = False (#1948)
[...] epoch metric name to trainer/epoch (#1986)
[...] mosaicml/pytorch:1.12.1*, mosaicml/pytorch:1.11.0*, mosaicml/pytorch_vision:1.12.1* and mosaicml/pytorch_vision:1.11.0* images are impacted and currently supported for legacy use cases. We recommend users upgrade to images with PyTorch >1.13. The affected images will be removed in the next Composer release.
Full Changelog: v0.12.1...v0.13.1