How to Run DVC on Windows

Several issues can arise when running DVC on Microsoft Windows, most of them related to system performance. Some stem from NTFS file system characteristics, others from Windows built-in security mechanisms. The workarounds below can help avoid these problems:

Did you know that DVC is available for Microsoft Visual Studio Code? More details here!

POSIX-like command line experience

The regular Command Prompt (cmd) in Windows will most likely not help you use DVC effectively, nor help you follow the examples in our docs. Here are some alternatives:

Line endings

If you are using Windows and you are working with people or environments (e.g. production) that are not, you’ll probably run into line-ending issues at some point. This is because Windows uses both a carriage-return character and a linefeed character for newlines in its files (CRLF), whereas macOS and Linux systems use only the linefeed character (LF).

DVC uses content-based checksums for your pipeline dependencies. Depending on your Git configuration (see the core.autocrlf and core.eol config options), the same Git-tracked file can be checked out with different line endings on different systems, so DVC may see it as changed and dvc repro may re-run a pipeline on one system but not on another. We therefore strongly recommend sticking with LF line endings when doing cross-platform work.
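
To see why line endings matter for content-based checksums, here is a minimal sketch in Python. It hashes raw bytes with SHA-256 purely as a stand-in for a content hash; the content_hash helper and the sample text are illustrative assumptions, not DVC's internal code.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a hex digest of the raw bytes (stand-in for a content checksum)."""
    return hashlib.sha256(data).hexdigest()


# The same two-line script, saved once with LF and once with CRLF endings.
lf_version = b"import sys\nprint('train')\n"
crlf_version = lf_version.replace(b"\n", b"\r\n")

# The text looks identical in an editor, but the bytes (and the hash) differ.
print(content_hash(lf_version) == content_hash(crlf_version))  # False
```

Any hash computed over raw file bytes behaves this way, which is why a silent LF-to-CRLF conversion is enough to make a dependency look modified.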

Configure your editor to use LF line endings

Many editors on Windows will use CRLF line endings by default or even replace existing LF with CRLF. It is recommended that you configure your editor to always stick to LF line endings.

For VS Code, add

{
  "files.eol": "\n"
}

to your global settings.json or to your project's .vscode/settings.json.

Set up LF line endings with .gitattributes

To keep line endings consistent across platforms, use a .gitattributes file with the eol attribute.

Add the following line to your .gitattributes:

* text=auto eol=lf

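This configuration lets Git detect which files are text and normalizes those files to LF line endings, both in the repository and on checkout, regardless of the platform. Files that Git detects as binary are left untouched.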

Configure Git for LF line endings

Set core.autocrlf to false and core.eol to lf:

$ git config --global core.autocrlf false
$ git config --global core.eol lf

With these settings, Git no longer converts LF to CRLF automatically on checkout, and files marked as text are checked out with LF line endings.
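
To confirm that the two options took effect, the values can be read back with git config --get. The sketch below wraps that call in Python; git_config_value is a hypothetical helper name, and it assumes git is on PATH (it returns None otherwise, or when the option is unset).

```python
import subprocess
from typing import Optional


def git_config_value(key: str) -> Optional[str]:
    """Return the effective Git config value for key, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "config", "--get", key],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # git is not installed or not on PATH.
        return None
    if result.returncode != 0:
        # A non-zero exit code means the option is not set.
        return None
    return result.stdout.strip()


# Compare the effective values with the ones recommended above.
for key, expected in (("core.autocrlf", "false"), ("core.eol", "lf")):
    actual = git_config_value(key)
    status = "ok" if actual == expected else "check"
    print(f"{key}: {actual!r} (expected {expected!r}) -> {status}")
```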

Use a pre-commit hook to check and fix line endings

Add this hook (from the pre-commit-hooks repository) to the hooks list in your .pre-commit-config.yaml:

- id: mixed-line-ending
  args: [--fix=lf]

to make pre-commit check and automatically replace all line endings with LF.
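
Conceptually, fixing line endings means rewriting CRLF (and stray CR) sequences as LF. The sketch below shows that idea on raw bytes; has_crlf and to_lf are hypothetical helper names, and this is not the hook's actual implementation.

```python
def has_crlf(data: bytes) -> bool:
    """Report whether the data contains at least one CRLF sequence."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Rewrite CRLF and lone CR line endings as LF."""
    # Replace CRLF first so that a CRLF pair does not turn into two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


sample = b"stages:\r\n  train:\r\n    cmd: python train.py\n"
print(has_crlf(sample))         # True
print(has_crlf(to_lf(sample)))  # False
```

A conversion like this is only safe for text files; binary data must be left untouched.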

Enable symbolic links

Symlinks are one of the possible file link types that DVC can use for optimization purposes. They're available on Windows, but the Create symbolic links user privilege is needed. It's granted to the Administrators group by default, so running dvc in an admin terminal is a good option for occasional usage. For regular users, the privilege can be granted using the Local policy settings.

DVC's Windows installer grants this privilege automatically, but you may want to grant it manually after any other installation method (choco, conda, pip).
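
A quick way to find out whether the current terminal session is allowed to create symlinks is simply to try it. The following Python sketch does so in a temporary directory; can_create_symlinks is a hypothetical helper name, not part of DVC.

```python
import os
import tempfile


def can_create_symlinks() -> bool:
    """Try to create a symlink in a temporary directory and report success."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w") as f:
            f.write("data")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            # On Windows this typically means the required privilege is missing.
            return False
        return os.path.islink(link)


print("symlinks available:", can_create_symlinks())
```

Run it from the same terminal you use for dvc, since privileges can differ between a regular and an admin session.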

Whitelist in Windows Security

Windows 10 includes the Windows Security antivirus. To improve performance, you can exclude specific folders or files from antivirus scans by whitelisting them in Windows Security, as per this guide. For example, whitelisting the DVC binary files can speed up DVC processes.

Enable long folder/file paths

DVC commands (e.g. dvc pull, dvc repro) may fail when a folder path is longer than 260 characters, typically with the error [Errno 2] No such file or directory. Starting in Windows 10, path length limitations have been removed from common file and directory functions, but you must opt in to the new behavior. You can enable long paths explicitly, either by editing Group Policy or by editing registry keys, following this guide.
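
To check whether a project is close to this limit, you can scan it for over-long paths. The Python sketch below is one way to do that; find_long_paths is a hypothetical helper, and it simply treats the 260-character figure above as the threshold.

```python
import os
from typing import List

WINDOWS_MAX_PATH = 260  # the limit mentioned above


def find_long_paths(root: str, limit: int = WINDOWS_MAX_PATH) -> List[str]:
    """Return absolute paths under root whose length is at least `limit`."""
    long_paths = []
    for dirpath, dirnames, filenames in os.walk(os.path.abspath(root)):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                long_paths.append(full)
    return long_paths


if __name__ == "__main__":
    # Scan the current directory and print each offending path with its length.
    for path in find_long_paths("."):
        print(len(path), path)
```

On a system where long paths are still disabled, the walk itself may be unable to descend into over-long directories, so an empty result is not a guarantee.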

Fix or disable Search Indexing

Search Indexing can also slow down file I/O operations on Windows. Try fixing or disabling this feature if you don't need it.

Disable short-file name generation

On NTFS, consider disabling 8dot3 short file name generation, as described in this article. This matters most for performance when a single directory holds more than 300K files.

Avoid directories with a large number of files

The performance of NTFS degrades while handling large volumes of files in a directory, as explained in this issue.
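
To find out which directories in a project are affected, you can count the files directly inside each one. The Python sketch below does this; crowded_dirs is a hypothetical helper, and its default threshold merely reuses the 300K figure mentioned earlier as an assumption, not a documented NTFS limit.

```python
import os
from typing import Dict


def crowded_dirs(root: str, threshold: int = 300_000) -> Dict[str, int]:
    """Map each directory under root to its file count, if at or above threshold."""
    crowded = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # Count only the files that sit directly in this directory.
        if len(filenames) >= threshold:
            crowded[dirpath] = len(filenames)
    return crowded


if __name__ == "__main__":
    # Report directories in the current project that hold many files.
    for directory, count in crowded_dirs(".", threshold=10_000).items():
        print(count, directory)
```

Splitting a crowded directory into subdirectories (for example, by a prefix of the file name) keeps each one small.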

Enabling paging with less

By default, DVC tries to use Less as the pager for the output of dvc dag. However, Windows doesn't have the less command available. Fortunately, there is an easy way of installing it via Chocolatey. After installing Chocolatey, run:

$ choco install less

less can be installed in other ways; just make sure it's available in the command line environment where you run dvc. (This usually means adding the directory where less is installed to the PATH environment variable.)
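
To verify that less is reachable from the environment where you run dvc, you can look it up on PATH. A minimal Python sketch, where find_pager is a hypothetical helper name:

```python
import shutil
from typing import Optional


def find_pager(name: str = "less") -> Optional[str]:
    """Return the full path of the pager executable, or None if not on PATH."""
    return shutil.which(name)


pager = find_pager()
if pager is None:
    print("less was not found on PATH")
else:
    print("less found at", pager)
```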