Why should you track everything?

DVC

Install DVC on Fedora/RHEL/CentOS

sudo wget https://dvc.org/rpm/dvc.repo -O /etc/yum.repos.d/dvc.repo
sudo yum update
sudo yum install dvc
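
After the install it can help to confirm that the dvc executable is actually on the PATH. The sketch below is one way to do that; require_cmd is a hypothetical helper written for this note, not something DVC provides.

```shell
# require_cmd: hypothetical helper, not part of DVC.
# Succeeds if the named command is on PATH, fails with a message otherwise.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found on PATH" >&2
    return 1
}

# Print the installed DVC version only when dvc is available.
if require_cmd dvc; then
    dvc --version
fi
```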

Add the DVC files to Git. This is not required, but it is good to keep the project and the DVC config in the same directory.

Workflow for Model Packaging

Pushing model files

dvc init
dvc remote add -d gss-rdu-remote ssh://msivanes@gss-rdu-repo.usersys.redhat.com:/var/www/html/repo/config/ulmfit

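dvc add cases_small_sbr_08-06-2020.pkl
git add cases_small_sbr_08-06-2020.pkl.dvc .gitignore # dvc add creates both; they are untracked until staged
git commit -am "Add ulmfit model to project"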
dvc push -v
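
Why a small .dvc file in Git is enough to identify a large model file: dvc add records a content hash (MD5 by default) of the file, and that hash is what ties the Git commit to the data in the cache and on the remote. The sketch below illustrates the idea with plain md5sum on a hypothetical stand-in file (demo_model.pkl is made up for the example). It does not run DVC, and DVC's own hashing and cache layout can differ between versions.

```shell
# Sketch of content hashing, the idea behind DVC pointer files.
# demo_model.pkl is a hypothetical stand-in, not the real model.
workdir=$(mktemp -d)
printf 'fake model weights v1' > "$workdir/demo_model.pkl"

# Hash of the original content.
hash_v1=$(md5sum "$workdir/demo_model.pkl" | cut -d' ' -f1)

# A byte-for-byte copy has the same hash, whatever its name.
cp "$workdir/demo_model.pkl" "$workdir/copy.pkl"
hash_copy=$(md5sum "$workdir/copy.pkl" | cut -d' ' -f1)

# Changing the content changes the hash.
printf 'fake model weights v2' > "$workdir/demo_model.pkl"
hash_v2=$(md5sum "$workdir/demo_model.pkl" | cut -d' ' -f1)

echo "v1:   $hash_v1"
echo "copy: $hash_copy"
echo "v2:   $hash_v2"

rm -rf "$workdir"
```

Because the hash changes whenever the model changes, committing the updated .dvc file after each dvc add gives every Git revision a pointer to exactly one version of the model.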

Pulling the model

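git clone $REPO # First time only; run the remaining commands from inside the clone
git pull # On later runs, gets the latest .dvc files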
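dvc pull # Downloads the data from remote storage. Equivalent to dvc fetch followed by dvc checkout
dvc checkout # Updates the model files in the workspace. Redundant right after dvc pull; needed after dvc fetch or after switching Git revisions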
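
A packaging script may want to confirm that the model file really arrived before going on. The following is a minimal sketch of such a check; model_ready is a hypothetical helper, and it only tests that the file exists and is non-empty, not that its content is correct.

```shell
# model_ready: hypothetical helper, not part of DVC.
# Succeeds only if the given file exists and is non-empty.
model_ready() {
    [ -s "$1" ]
}

# Possible use after pulling (assumes the commands above were run in the clone):
#   dvc pull && model_ready cases_small_sbr_08-06-2020.pkl || echo "model missing" >&2
```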

Food for thought

References