Troubleshooting
Here we provide help for some of the problems that DVC user might stumble upon.
Failed to pull data from the cloud
Users may encounter errors when running dvc pull
and dvc fetch
, like
WARNING: Cache 'xxxx' not found.
or
ERROR: failed to pull data from the cloud
. The most common cause is changes
pushed to Git without the corresponding data being uploaded to the DVC
remote. Make sure to dvc push
from the original project, and try
again.
Too many open files error
A known problem some users run into with the dvc pull
, dvc fetch
and
dvc push
commands is [Errno 24] Too many open files
(most common for S3
remotes on macOS). The more --jobs
specified, the more file descriptors need
to be open on the host file system for each download thread, and the limit may
be reached, causing this error.
To solve this, it's often possible to increase the open file descriptors limit,
with ulimit
on UNIX-like system (for example ulimit -n 1024
), or
increasing Handles limit
on Windows. Otherwise, try using a lower JOBS
value.
Unable to find credentials
Make sure that you have your AWS credentials setup either through
usual AWS configuration
or by setting access_key_id
and secret_access_key
with dvc remote modify
.
Unable to connect
Make sure you are online and able to access your
AWS S3 endpoint, or
endpointurl
if explicitly set with dvc remote modify
.
Bucket does not exist
Make sure your bucket
exists
in the correct region
and/or endpointurl
(see dvc remote modify
).
Unable to detect cache type
Unable to detect supported link types, as the
cache directory doesn't
exist. It is usually created automatically by DVC commands that need it, but you
can create it manually (e.g. mkdir .dvc/cache
) to enable this check.
Unable to acquire lock
You may encounter an error message saying Unable to acquire lock
if you have
another DVC process running in the project. If that is not the case, it usually
means that DVC was terminated abruptly and manually removing the lock file in
.dvc/tmp/lock
should resolve the issue.
If the issue still persists then it may be the case that you are running DVC on
some network filesystem like NFS, Lustre, etc. If so, the solution is to enable
core.hardlink_lock
which can be done by running following command:
$ dvc config core.hardlink_lock true
Cannot add files in symlinked directory
DVC only supports symlinked files as
valid targets for dvc add
. If the target path is a directory symlink, or if
the target path contains any intermediate directory symlinks, dvc add
will
fail.
No possible cache types
You may encounter this error if DVC cannot find a valid file link type to use when linking data files from cache into your workspace. To resolve the issue, you may need to reconfigure DVC to use alternative link types which are supported on your machine.
After reconfiguring cache types, you can re-link data files in your workspace using:
$ dvc checkout --relink
DVC cannot authenticate to Git remote
Some commands require authenticating to a Git remote (for example,
experiment sharing).
You may need write (dvc exp push
) or read (dvc exp list
, dvc exp pull
)
permissions.
Using SSH to authenticate is recommended. If you use HTTP, you can configure a Git credential helper (like Git Credential Manager) to authenticate to private HTTP Git remotes.
Commands like dvc import
will clone a different Git repo than the one in which
you are working. In order for these commands to read from a credential helper,
it must be configured using git config --global
or git config --system
.
If your SSH configuration includes UseKeychain
on macOS, you
will need to decrypt the SSH key and add it to the SSH agent for DVC to be able
to use it:
ssh-add --apple-load-keychain ~/.ssh/ed255
Could not open pickled 'index/md5/links' cache
You may encounter this error when using DVC on different Python versions with the same DVC project directory, for example having created the project on Python 3.8. in one environment and later attempting to update it from a Python 3.7 env. This is due to temporary internal directories that can be incompatible with older Python versions once created.
In these rare situations, it is safe to remove the corresponding tmp directory and retry the DVC command. Specifically, one of:
.dvc/tmp/index
.dvc/tmp/md5s
.dvc/tmp/links
DVC Experiments may fail in Git shallow clones
In dvc<=2.45.1
, dvc exp` commands use internal Git operations which may not
work properly in
shallow clones.
Upgrade to a newer version of DVC to resolve these errors.