lab

Concepts

Lab is centred around three core concepts: Reproducibility, Logging, and Model Persistence. Lab is designed to integrate with your existing training scripts while imposing as few constraints as possible.

Reproducibility

Lab Projects are designed to be shared and re-used. This feature makes heavy use of Python’s virtualenv package, enabling users to precisely define the modules and environments required to run the associated experiments.

Every Project is initialised from a requirements.txt file, which lists the packages to install in the Project’s environment.
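To make the idea concrete, here is a minimal sketch of building an isolated environment next to a requirements.txt file. It is not Lab's own mechanism: it uses the standard-library venv module as a stand-in for virtualenv, and the function names (create_project_env, install_command) and the ".venv" directory name are hypothetical.

```python
import os
import venv
from pathlib import Path


def create_project_env(project_dir, env_name=".venv", with_pip=True):
    """Create a virtual environment inside a project directory.

    Hypothetical helper: refuses to run unless the project directory
    already contains a requirements.txt file.
    """
    project_dir = Path(project_dir)
    requirements = project_dir / "requirements.txt"
    if not requirements.is_file():
        raise FileNotFoundError(f"no requirements.txt in {project_dir}")

    env_dir = project_dir / env_name
    # Standard-library stand-in for virtualenv (an assumption of this sketch).
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


def install_command(env_dir, requirements_path):
    """Build (but do not run) the pip command for the environment."""
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    python = Path(env_dir) / bin_dir / "python"
    return [str(python), "-m", "pip", "install", "-r", str(requirements_path)]
```

The command returned by install_command can be passed to subprocess.run to install the pinned packages. Pinning exact versions in requirements.txt (for example "package==1.2.3") is what makes the resulting environment repeatable on another machine.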

Logging

Lab was designed to benchmark multiple predictive models and hyperparameters. To accomplish this, it implements a simple API that stores:

  • Feature names
  • Hyperparameters
  • Performance metrics
  • Model files
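The four items above map naturally onto a single record per experiment. The sketch below is not Lab's API: it is a hypothetical illustration, using only the standard library, of what storing such a record could look like. The function names, the record.json file name, and the field names are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def log_experiment(run_dir, feature_names, hyperparameters, metrics, model_file):
    """Write one experiment record to <run_dir>/record.json (hypothetical layout)."""
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "feature_names": list(feature_names),
        "hyperparameters": dict(hyperparameters),
        "metrics": dict(metrics),
        # Only the path to the model file is stored, not the model itself.
        "model_file": str(model_file),
    }
    record_path = run_dir / "record.json"
    record_path.write_text(json.dumps(record, indent=2))
    return record_path


def load_experiment(record_path):
    """Read a record written by log_experiment back into a dict."""
    return json.loads(Path(record_path).read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = log_experiment(
            Path(tmp) / "run_001",
            feature_names=["age", "income"],
            hyperparameters={"max_depth": 3},
            metrics={"accuracy": 0.5},
            model_file="model.joblib",
        )
        print(load_experiment(path)["hyperparameters"])
```

Keeping one record per run makes it straightforward to compare models and hyperparameters side by side, which is the benchmarking use described above.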

Model Persistence

Models are persisted using the joblib module, for both sklearn and keras experiments. This simple structure allows quick performance assessment and deployment of the chosen model into production.
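As a rough illustration of joblib-based persistence, the sketch below saves an object to disk and loads it back with joblib.dump and joblib.load. The helpers save_model and load_model are hypothetical wrappers, not part of Lab, and a plain dictionary stands in for a trained model so the example has no further dependencies.

```python
import tempfile
from pathlib import Path

import joblib


def save_model(model, path):
    """Serialise a model object to disk with joblib (hypothetical wrapper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, str(path))
    return path


def load_model(path):
    """Load a model object previously written by save_model."""
    return joblib.load(str(path))


if __name__ == "__main__":
    # Stand-in for a fitted estimator; any picklable object works here.
    model = {"coefficients": [0.25, -1.5], "intercept": 0.5}
    with tempfile.TemporaryDirectory() as tmp:
        model_path = save_model(model, Path(tmp) / "models" / "model.joblib")
        restored = load_model(model_path)
        print(restored == model)
```

Files written this way should only be loaded from trusted sources, since joblib relies on pickle and loading a pickle can execute arbitrary code.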

Example Use Cases

At Bering, we use Lab for a number of use cases:

Data Scientists track individual experiments locally on their machines, consistently organising all files and artefacts for reproducibility. By agreeing on a naming schema, teams can work together on the same datasets to benchmark the performance of novel ML algorithms.

Production Engineers assess model performance and decide on the best model to serve in production environments. Lab’s strict model versioning serves as a link between the research and development environment and evolving production components.

ML Researchers can publish code to GitHub as a Lab Project, making it easy for others to reproduce findings.