Metadata-Version: 2.1
Name: logexp
Version: 0.1.3
Summary: simple experiment manager for machine learning.
Home-page: https://github.com/altescy/logexp
Author: altescy
Author-email: altescy@fastmail.com
License: MIT License
Description: # logexp
        [![Actions Status](https://github.com/altescy/logexp/workflows/build/badge.svg)](https://github.com/altescy/logexp)
        [![Python version](https://img.shields.io/pypi/pyversions/logexp)](https://github.com/altescy/logexp)
        [![pypi version](https://img.shields.io/pypi/v/logexp)](https://pypi.org/project/logexp/)
        [![license](https://img.shields.io/github/license/altescy/logexp)](https://github.com/altescy/logexp/blob/master/LICENSE)
        
        ## Quick Links
        
        - [Installation](#installation)
        - [Tutorial](#tutorial)
        - [scikit-learn example](https://github.com/altescy/logexp/tree/master/examples/scikit-learn)
        - [PyTorch example](https://github.com/altescy/logexp/tree/master/examples/pytorch)
        
        ## Introduction
        
        `logexp` is a simple experiment manager for machine learning.
        You can manage your experiments and executions from the command-line interface.
        
        - Features
          - **track experiments**: `logexp` tracks experiments and their environment.
          - **manage parameters**: Import / export worker parameters in JSON format.
          - **capture stdout / stderr**: Automatically capture stdout / stderr during execution.
          - **search logs**: Search your runs with the [`jq`](https://stedolan.github.io/jq/) command.
          - **written in pure Python**: `logexp` has no external dependencies.
        
        
        ## Installation
        
        Install the library with `pip`:
        ```
        pip install logexp
        ```
        
        ## Tutorial
        
        In this tutorial, we'll implement a simple machine learning worker with [`scikit-learn`](https://scikit-learn.org/),
        then walk through the operations for managing experiments and executions.
        
        ### 1. Create worker
        
        This worker trains a `RandomForestClassifier` and saves the trained model.
        
        A worker must inherit from `logexp.BaseWorker`.
        In the `config` method, define the worker parameters; they are logged automatically.
        Write your task in the `run` method and, if needed, return a `logexp.Report` that summarizes the result.
        
        `BaseWorker.storage` is the artifact storage.
        Use it to save any files that a run produces.
        
        ```
        $ cat << 'EOF' > iris.py
        import logexp
        import numpy as np
        import pickle
        from sklearn.datasets import load_iris
        from sklearn.model_selection import train_test_split
        from sklearn.ensemble import RandomForestClassifier
        
        ex = logexp.Experiment("sklearn-iris")
        
        @ex.worker("train-rfc")
        class TrainRandomForest(logexp.BaseWorker):
            def config(self):
                self.rfc_params = {
                    "n_estimators": 100,
                    "min_samples_leaf": 1,
                    "random_state": 0,
                }
                self.test_size = 0.3
                self.random_seed = 0
        
            def run(self):
                np.random.seed(self.random_seed)
        
                X, y = load_iris(return_X_y=True)
        
                X_train, X_valid, y_train, y_valid = \
                    train_test_split(X, y, test_size=self.test_size)
        
                model = RandomForestClassifier(**self.rfc_params)
                model.fit(X_train, y_train)
        
                with self.storage.open("rfc.pkl", "wb") as f:
                    pickle.dump(model, f)
        
                train_accuracy = model.score(X_train, y_train)
                valid_accuracy = model.score(X_valid, y_valid)
        
                report = logexp.Report()
                report["train_size"] = len(X_train)
                report["valid_size"] = len(X_valid)
                report["train_accuracy"] = train_accuracy
                report["valid_accuracy"] = valid_accuracy
        
                return report
        EOF
        ```
        
        
        ### 2. Initialize experiment
        
        The following command creates the log-store directory (`./.logexp` by default) and prints the `experiment_id`.
        
        ```
        $ logexp init -m iris -e sklearn-iris
        experiment id: 0
        ```
        
        
        ### 3. Edit parameters
        
        Export the default parameters in JSON format:
        ```
        $ logexp params -m iris -e sklearn-iris -w train-rfc > params.json
        $ cat params.json
        {
          "rfc_params": {
            "n_estimators": 100,
            "min_samples_leaf": 1,
            "random_state": 0
          },
          "test_size": 0.3,
          "random_seed": 0
        }
        ```
        
        You can also export the parameters of a specific run:
        
        ```
        $ logexp params -r [ RUN_ID ]
        ```
        
        Edit `params.json` as needed.
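        
        If you prefer to change the exported file from a script rather than by hand, the following is a minimal sketch that uses only the standard library. The helper name `override_params` is hypothetical and not part of `logexp`; it assumes `params.json` has the shape shown above.
        
        ```python
        import json
        from pathlib import Path
        
        
        def override_params(path, overrides):
            """Load a params JSON file, apply overrides, and write it back.
        
            Hypothetical helper, not part of logexp. Top-level keys are replaced;
            nested dicts (such as "rfc_params") are merged one level deep so that
            keys you do not mention keep their exported values.
            """
            path = Path(path)
            params = json.loads(path.read_text())
            for key, value in overrides.items():
                if isinstance(value, dict) and isinstance(params.get(key), dict):
                    params[key].update(value)
                else:
                    params[key] = value
            path.write_text(json.dumps(params, indent=2) + "\n")
            return params
        ```
        
        For example, `override_params("params.json", {"rfc_params": {"n_estimators": 200}, "test_size": 0.2})` changes two values and leaves `min_samples_leaf`, `random_state`, and `random_seed` untouched.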
        
        
        ### 4. Run worker
        
        Run the worker with the `logexp run` command; it prints a quick report like the one below:
        
        ```
        $ logexp run -m iris -e 0 -w train-rfc -p params.json
        ** WORKER REPORT **
        {
          "train_size": 105,
          "valid_size": 45,
          "train_accuracy": 1.0,
          "valid_accuracy": 0.9777777777777777
        }
        
        ** SUMMARY **
        run_id     : 7fcd37ef38104715ad60bd55b7e1023d
        name       :
        module     : iris
        experiment : sklearn-iris
        worker     : train-rfc
        status     : finished
        artifacts  : {'rootdir': '/src/.logexp/0/train-rfc/7fcd37ef38104715ad60bd55b7e1023d/artifacts'}
        start_time : 2020-01-19 05:14:05.246681
        end_time   : 2020-01-19 05:14:05.430199
        ```
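        
        The `artifacts` entry in the summary points at the directory where files written through `self.storage` end up. As a rough sketch, the trained model could be loaded back as follows. This assumes that `rfc.pkl` is stored directly under `rootdir`, which the summary above suggests but does not confirm; `load_artifact` is a hypothetical helper, not part of `logexp`.
        
        ```python
        import pickle
        from pathlib import Path
        
        
        def load_artifact(rootdir, filename="rfc.pkl"):
            """Unpickle a file from a run's artifacts directory.
        
            Hypothetical helper: assumes the artifact sits directly under
            `rootdir`, the path printed in the run summary.
            """
            path = Path(rootdir) / filename
            with path.open("rb") as f:
                return pickle.load(f)
        ```
        
        Pass the `rootdir` value from the summary, for example `model = load_artifact("/src/.logexp/0/train-rfc/7fcd37ef38104715ad60bd55b7e1023d/artifacts")`. Unpickling the model requires `scikit-learn` to be installed in the loading environment.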
        
        ### 5. View logs
        
        The following command lists executions:
        
        ```
        $ logexp list -e 0 --sort start_time
        run_id                           name exp_id exp_name     worker    status   start_time          end_time            note
        ================================ ==== ====== ============ ========= ======== =================== =================== ====
        7fcd37ef38104715ad60bd55b7e1023d      0      sklearn-iris train-rfc finished 2020-01-19 05:14:05 2020-01-19 05:14:05
        5300f7fc32b949bba6775c5899e09ae9      0      sklearn-iris train-rfc finished 2020-01-19 05:44:04 2020-01-19 05:44:04
        ```
        
        The `logexp logs` command exports all logs in JSON format.
        With the [`jq`](https://stedolan.github.io/jq/) command, you can run more complex searches.
        
        ```
        $ logexp logs -e 0 | jq '
          map(select(.status == "finished"))
            | sort_by(.report.valid_accuracy)
            | reverse
            | .[]
            | {run_id: .uuid, valid_accuracy: .report.valid_accuracy}'
        {
          "run_id": "7fcd37ef38104715ad60bd55b7e1023d",
          "valid_accuracy": 0.9777777777777777
        }
        {
          "run_id": "5300f7fc32b949bba6775c5899e09ae9",
          "valid_accuracy": 0.9555555555555556
        }
        ```
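        
        If `jq` is not available, the same ranking can be sketched in Python. The field names `status`, `uuid`, and `report` are taken from the `jq` filter above; the sketch assumes that `logexp logs` emits a JSON array and that every finished run reports the metric. Both function names are hypothetical and not part of `logexp`.
        
        ```python
        import json
        
        
        def load_logs(path):
            """Read logs saved with, e.g., `logexp logs -e 0 > logs.json`."""
            with open(path) as f:
                return json.load(f)
        
        
        def rank_finished_runs(logs, metric="valid_accuracy"):
            """Keep finished runs and order them by `metric`, best first."""
            finished = [run for run in logs if run.get("status") == "finished"]
            # Sort ascending, then reverse, mirroring `sort_by(...) | reverse`.
            ranked = sorted(finished, key=lambda run: run["report"][metric])[::-1]
            return [{"run_id": run["uuid"], metric: run["report"][metric]} for run in ranked]
        ```
        
        After saving the logs to a file, `rank_finished_runs(load_logs("logs.json"))` returns a list of dictionaries with the same keys as the `jq` output.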
        
Keywords: machine learning experiment manager
Platform: UNKNOWN
Classifier: Intended Audience :: Science/Research
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.7.3
Description-Content-Type: text/markdown
