Best practices
Keep your workspace organised
The idea is that dataset : configuration : script should be 1 : 1 : 1, i.e. each dataset has exactly one configuration file and one script:
Keep your raw data (i.e. full images) in /data/my_imgs.
Make a new configuration file with your settings in /configs by copying /configs/mzb_example_config.yaml and using it as a template.
Make a new workflow file with your running parameters in /workflows, using one of the examples as a template.
For your output folders, put your processed images in /data/my_imgs/derived, and the model predictions in /results/my_imgs.
Name all the files related to one dataset in a similar way: configs/MZB_is_awesome_config.yml, data/MZB_is_awesome, data/derived/MZB_is_awesome and so on.
If you change parameters, name your output folders differently so that you don’t mix the outputs of your experiments!
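The layout above can be sketched as a small helper that derives every location from a single dataset name, which makes it harder to mix up files from different datasets. This is only an illustration: the function names `workspace_paths` and `create_workspace` are hypothetical and not part of the project, and the sketch assumes the `data/<name>/derived` layout and the `.yaml` extension of the example configuration.

```python
from pathlib import Path


def workspace_paths(dataset: str, root: str = ".") -> dict:
    """Derive all per-dataset locations from one dataset name.

    Hypothetical helper: the folder names follow the conventions
    described in this section, everything else is an assumption.
    """
    base = Path(root)
    return {
        "raw": base / "data" / dataset,                  # full images
        "derived": base / "data" / dataset / "derived",  # processed images
        "config": base / "configs" / f"{dataset}_config.yaml",
        "workflows": base / "workflows",                 # workflow files
        "results": base / "results" / dataset,           # model predictions
    }


def create_workspace(dataset: str, root: str = ".") -> dict:
    """Create the folders for a dataset; the config file itself is not written."""
    paths = workspace_paths(dataset, root)
    for key in ("raw", "derived", "workflows", "results"):
        paths[key].mkdir(parents=True, exist_ok=True)
    paths["config"].parent.mkdir(parents=True, exist_ok=True)
    return paths
```

Calling `create_workspace("my_imgs")` from the project root would create the folders and return the path where the copied configuration template is expected to live.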
If you are (re-)training ML models, it’s important to keep track of (hyper)parameters and logs; see Logging your model’s training.
Hint
Read more about project organisation here
Logging your model’s training
To be able to tell whether a model is learning properly and/or is overfitting, it’s necessary to log its progress while training. We support two loggers for this: Weights & Biases and TensorBoard.
For Weights & Biases, you will need to create a (free) account and install the necessary dependencies; refer to the documentation here. After installing all requirements, run wandb login and provide your credentials when prompted.
For TensorBoard, please follow the installation instructions here.
You will also need to specify which logger to use in the model_logger parameter in the configuration file (see Configuration).
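Before starting a long training run, it can help to check that the logger named in the configuration is one of the two supported ones and that its package is installed. The sketch below is a hedged illustration, not the project's own code: the function `check_logger` is hypothetical, and the accepted values `"wandb"` and `"tensorboard"` are assumptions (see Configuration for the values the project actually expects). Only the parameter name `model_logger` comes from the text above.

```python
import importlib.util

# Assumed mapping from config value to the Python package that must be
# importable; the accepted values are an assumption for illustration.
SUPPORTED_LOGGERS = {"wandb": "wandb", "tensorboard": "tensorboard"}


def check_logger(config: dict) -> tuple:
    """Return (logger_name, is_installed) for the configured logger.

    Raises ValueError if model_logger is missing or not supported.
    """
    name = str(config.get("model_logger", "")).strip().lower()
    if name not in SUPPORTED_LOGGERS:
        raise ValueError(
            f"model_logger must be one of {sorted(SUPPORTED_LOGGERS)}, got {name!r}"
        )
    # find_spec returns None when the package cannot be found.
    installed = importlib.util.find_spec(SUPPORTED_LOGGERS[name]) is not None
    return name, installed
```

For example, `check_logger({"model_logger": "wandb"})` returns the normalised name together with a flag telling you whether the package can be imported in the current environment.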