Hugging Face Accelerate
DVCLive allows you to add experiment tracking capabilities to your Hugging Face Accelerate projects.
Usage
If you have dvclive
installed, the DVCLive Tracker
will be used for
tracking experiments and logging metrics, parameters, and plots for
accelerate>=0.25.0
. Unlike with Hugging Face Transformers callback, you must
specify what to log. The following snippet shows a full example:
from accelerate import Accelerator
# optional, `log_with` defaults to "all"
accelerator = Accelerator(log_with="dvclive")
# log hyperparameters
hps = {"num_iterations": 5, "learning_rate": 1e-2}
accelerator.init_trackers("my_project", config=hps)
# log metrics
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8})
# log model
accelerator.save_state("checkpoint_dir")
if accelerator.is_main_process:
live = accelerator.get_tracker("dvclive", unwrap=True)
live.log_artifact("checkpoint_dir")
# end logging
accelerator.end_training()
Initialize
from accelerate import Accelerator
# optional, `log_with` defaults to "all"
accelerator = Accelerator(log_with="dvclive")
accelerator.init_trackers(project_name="my_project")
To customize tracking, include arguments to be passed to the Live
instance
using init_kwargs
like:
accelerator.init_trackers(
project_name="my_project",
init_kwargs={"dvclive": {"dir": "my_directory"}}
)
Log parameters
To log hyperparameters, add them using config
like:
hps = {"num_iterations": 5, "learning_rate": 1e-2}
accelerator.init_trackers("my_project", config=hps)
Log metrics
To log metrics:
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8})
Optionally pass the step (if not passed, it will be auto-incremented):
accelerator.log({"train_loss": 1.12, "valid_loss": 0.8}, step=1)
Additional logging
To log models, other artifacts, or images, retrieve the Live
instance from the
tracker:
accelerator.save_state("checkpoint_dir")
if accelerator.is_main_process:
live = accelerator.get_tracker("dvclive", unwrap=True)
live.log_artifact("checkpoint_dir")
accelerator.is_main_process
ensures it is only called on the main process.
This is handled automatically by the Accelerate Tracker
in the other methods
above, but in this example we are directly calling Live
instead.
unwrap=True
returns the Live
instance instead of the Tracker
, so that
you can call any Live
methods.
End
Finally, end the experiment to trigger Live.end()
:
accelerator.end_training()