.. _concepts:

Concepts
========

Lab is centred around three core concepts: *Reproducibility*, *Logging*, and
*Model Persistence*. Lab is designed to integrate with your existing training
scripts while imposing as few constraints as possible.

Reproducibility
---------------

Lab Projects are designed to be shared and re-used. This feature makes heavy
use of the ``virtualenv`` package, enabling users to precisely define the
modules and environments required to run the associated experiments. Every
Project is initialised from a ``requirements.txt`` file.

Logging
-------

Lab was designed to benchmark multiple predictive models and hyperparameters.
To accomplish this, it implements a simple API that stores:

- Feature names
- Hyperparameters
- Performance metrics
- Model files

Model Persistence
-----------------

Models are logged using the ``joblib`` module. This applies to both
``sklearn`` and ``keras`` experiments. Because every experiment is stored in
the same simple structure, models can be compared quickly and the chosen model
deployed into production.

Example Use Cases
-----------------

At Bering, we use Lab for a number of use cases:

**Data Scientists** track individual experiments locally on their machine,
consistently organising all files and artefacts for reproducibility. By
agreeing on a naming schema, teams can work together on the same datasets to
benchmark the performance of novel ML algorithms.

**Production Engineers** assess model performance and decide which model
should be served in production environments. Lab's strict model versioning
serves as a link between research and development environments and evolving
production components.

**ML Researchers** can publish code to GitHub as a Lab Project, making it easy
for others to reproduce findings.
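
To make the Logging and Model Persistence concepts more concrete, the sketch below stores the four items listed under Logging (feature names, hyperparameters, performance metrics, and a model file) for each run, then picks the best run by a metric. It is not Lab's actual API: the function names ``log_experiment``, ``load_experiment`` and ``best_run``, the file names ``model.pkl`` and ``meta.json``, and the directory layout are all assumptions made for illustration. The standard-library ``pickle`` module stands in for ``joblib`` so the example runs without extra dependencies.

```python
import json
import pickle
import tempfile
from pathlib import Path


def log_experiment(run_dir, model, feature_names, hyperparameters, metrics):
    """Write one experiment record: a metadata file plus a serialised model."""
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical file names, chosen for this sketch only.
    model_path = run_dir / "model.pkl"
    meta_path = run_dir / "meta.json"

    # pickle stands in for joblib here to keep the sketch dependency-free.
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)

    record = {
        "feature_names": list(feature_names),
        "hyperparameters": dict(hyperparameters),
        "metrics": dict(metrics),
        "model_file": model_path.name,
    }
    meta_path.write_text(json.dumps(record, indent=2))
    return record


def load_experiment(run_dir):
    """Read back the metadata and the model stored by log_experiment."""
    run_dir = Path(run_dir)
    record = json.loads((run_dir / "meta.json").read_text())
    with (run_dir / record["model_file"]).open("rb") as fh:
        model = pickle.load(fh)
    return record, model


def best_run(records, metric):
    """Return the record with the highest value for the given metric."""
    return max(records, key=lambda r: r["metrics"][metric])


# Usage: plain dicts stand in for trained models.
with tempfile.TemporaryDirectory() as tmp:
    run_a = log_experiment(
        Path(tmp) / "run_a", {"coef": [0.5, 1.5]},
        ["age", "dose"], {"C": 1.0}, {"accuracy": 0.81},
    )
    run_b = log_experiment(
        Path(tmp) / "run_b", {"coef": [0.7, 1.1]},
        ["age", "dose"], {"C": 10.0}, {"accuracy": 0.86},
    )
    restored_record, restored_model = load_experiment(Path(tmp) / "run_b")

winner = best_run([run_a, run_b], "accuracy")
print(winner["hyperparameters"])
```

Keeping the metadata in a plain text file next to the serialised model means runs can be compared without loading any model, which is the kind of quick assessment the Production Engineers use case relies on.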