Metadata-Version: 2.1
Name: rocks-classifier
Version: 0.0.8
Summary: Rock classifier deployed on railway and monitored using Weights and Biases!
Home-page: https://github.com/udaylunawat/Whats-this-rock
Author: udaylunawat
Author-email: udaylunawat@gmail.com
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Provides-Extra: dev
Provides-Extra: gpu
Provides-Extra: cpu
License-File: LICENSE

Whats-this-rock
================

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
<p>
This project deploys a Telegram bot that classifies rock images into one
of seven types.<br>
<img src="https://i.imgur.com/cDrrfqF.jpg" alt="What's my name?" width=20% align="right"/>
</p>

![GitHub Workflow
Status](https://img.shields.io/github/workflow/status/udaylunawat/Whats-this-rock/Lint%20Code%20Base.png)
![GitHub
issues](https://img.shields.io/github/issues-raw/udaylunawat/Whats-this-rock.png)
[![GitHub
Super-Linter](https://github.com/nvuillam/npm-groovy-lint/workflows/Lint%20Code%20Base/badge.svg)](https://github.com/marketplace/actions/super-linter)

![code-size](https://img.shields.io/github/languages/code-size/udaylunawat/Whats-this-rock.png)
![repo-size](https://img.shields.io/github/repo-size/udaylunawat/Whats-this-rock.png)
![top-language](https://img.shields.io/github/languages/top/udaylunawat/Whats-this-rock.png)

![Python](https://img.shields.io/badge/python-v3.8.0+-success.svg)
![Tensorflow](https://img.shields.io/badge/tensorflow-v2.9.0+-success.svg)

[![contributions
welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/dwyl/esta/issues)
[![HitCount](https://hits.dwyl.com/udaylunawat/Whats-this-rock.svg?style=flat)](http://hits.dwyl.com/udaylunawat/Whats-this-rock)

![](https://img.shields.io/twitter/follow/udaylunawat?style=social.png)

This package uses [tensorflow](https://github.com/tensorflow/tensorflow)
to accelerate deep learning experimentation.

The MLOps workflow, including

- Experiment Tracking
- Model Management
- Hyperparameter Tuning

was handled with [Weights & Biases](https://wandb.ai).

Additionally, [nbdev](https://github.com/fastai/nbdev) was used to

- develop the package
- produce documentation from a series of notebooks
- run CI
- publish to [PyPI](https://pypi.org/project/rocks-classifier/)

# Inspiration

> [The common complaint that you need massive amounts of data to do deep
> learning can be a very long way from the
> truth!](https://youtu.be/J6XcP4JOHmk?t=2029)

> You very often don’t need much data at all. A lot of people are
> looking for ways to share data and aggregate data, but that’s
> unnecessary. They assume they need more data than they do, ’cause
> they’re not familiar with the basics of transfer learning, which is
> this critical technique for needing orders of magnitude less data.

> [Jeremy
> Howard](https://en.wikipedia.org/wiki/Jeremy_Howard_(entrepreneur))

## Installation & Training Steps

### Install

To install, use `pip`:

    pip install git+https://github.com/udaylunawat/Whats-this-rock.git

### Use the Telegram Bot

You can try the bot [here](https://t.me/test7385_bot) on Telegram.

> Type `/help` to get instructions in chat.

### Deploy Telegram Bot

``` bash
rocks_deploy_bot
```

### Download and process data

``` bash
rocks_process_data remove_bad=True \
                   remove_misclassified=True \
                   remove_duplicates=True \
                   remove_corrupted=True \
                   remove_unsupported=True \
                   sampling=None \
                   train_split=0.8
```
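
The flags above are easier to reason about with a concrete picture. The following is a minimal sketch, not the package's own implementation, of what `remove_duplicates=True` and `train_split=0.8` could amount to: byte-identical files are found by hashing their contents, and the remaining paths are shuffled with a fixed seed and cut at 80%. The helper names `find_duplicates` and `split_paths`, and the `seed` parameter, are hypothetical.

```python
import hashlib
import random
from pathlib import Path


def find_duplicates(paths):
    """Return the paths whose bytes match an earlier file (hypothetical helper)."""
    seen = {}
    duplicates = []
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)  # same content as seen[digest]
        else:
            seen[digest] = path
    return duplicates


def split_paths(paths, train_split=0.8, seed=42):
    """Shuffle deterministically, then cut into train and validation lists."""
    shuffled = sorted(paths)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_split)
    return shuffled[:cut], shuffled[cut:]
```

Hashing only catches exact copies. Resized or re-encoded duplicates would need a perceptual comparison, which this sketch does not attempt.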

### Train Model

Run these commands

``` bash
# Quote the list override so the shell does not treat [1,2] as a glob pattern.
rocks_train_model wandb.project=Whats-this-rock \
                  wandb.mode=offline \
                  wandb.use=False \
                  'dataset_id=[1,2]' \
                  epochs=30 \
                  lr=0.005 \
                  augmentation=None \
                  monitor=val_loss \
                  loss=categorical_crossentropy \
                  backbone=resnet \
                  lr_schedule=cosine_decay_restarts \
                  lr_decay_steps=300 \
                  trainable=False
```
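
To make the `lr`, `lr_schedule` and `lr_decay_steps` values less abstract, here is a small sketch of a cosine schedule with warm restarts. It assumes that `cosine_decay_restarts` means the learning rate follows half a cosine wave from `lr` down towards zero and then jumps back up every `lr_decay_steps` steps. This is a simplified, fixed-period variant written for illustration, not the package's code; the function name is hypothetical.

```python
import math


def cosine_with_restarts(step, initial_lr=0.005, decay_steps=300):
    """Learning rate at `step` for a cosine decay that restarts every `decay_steps`."""
    # Progress within the current cycle, in [0, 1).
    position = (step % decay_steps) / decay_steps
    return initial_lr * 0.5 * (1.0 + math.cos(math.pi * position))


for step in (0, 150, 299, 300):
    print(step, cosine_with_restarts(step))
```

Keras ships a related schedule, `tf.keras.optimizers.schedules.CosineDecayRestarts`, whose default `t_mul=2.0` doubles the length of each successive cycle, so its curve differs from this fixed-period sketch after the first restart.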

You can try different models and parameters by editing `config.json`.

With Hydra, it is now much easier to override parameters, like
this:

``` bash
rocks_train_model wandb.project=Whats-this-rockv \
                  'dataset_id=[1,2]' \
                  epochs=50 \
                  backbone=resnet
```
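
Conceptually, each `key=value` argument replaces one entry in a nested configuration, with dots selecting nested sections. The sketch below illustrates that idea in plain Python. It is not Hydra's parser and does not follow Hydra's full override grammar; `apply_overrides`, `parse_value` and the base config shown are hypothetical.

```python
import ast
import copy


def parse_value(raw):
    """Turn '50' into 50 and '[1,2]' into [1, 2]; leave plain words as strings."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b=value` style overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})  # walk into (or create) the section
        node[leaf] = parse_value(raw_value)
    return result


# Hypothetical defaults, standing in for the project's config file.
base = {"wandb": {"project": "demo", "mode": "online"}, "epochs": 30}
merged = apply_overrides(base, ["wandb.project=Whats-this-rock", "epochs=50"])
print(merged)
```

Keys that are not overridden, such as `wandb.mode` here, keep their default values.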

<p align="left">
<img src="https://i.imgur.com/1nBpPC5.png" alt="result" width=100%/>
</p>

### Wandb Sweeps (Hyperparameter Tuning)

Edit `configs/sweeps.yaml`, then create the sweep:

``` bash
wandb sweep \
--project Whats-this-rock \
--entity udaylunawat \
configs/sweeps.yaml
```

This prints a `wandb agent` command containing the sweep ID. Run it, substituting the ID for `$sweepid`:

``` bash
wandb agent udaylunawat/Whats-this-rock/$sweepid
```
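
Before launching agents, it can help to estimate how many runs a sweep implies. The sketch below assumes a grid-style search and enumerates every combination locally. The parameter names and values are illustrative only and are not taken from the project's sweep file; the dictionary merely mirrors the `parameters` / `values` layout that W&B sweep configurations use.

```python
import itertools

# Hypothetical search space, laid out like the `parameters` section of a sweep file.
sweep_parameters = {
    "lr": {"values": [0.005, 0.001]},
    "backbone": {"values": ["resnet", "efficientnet"]},
    "epochs": {"values": [30, 50]},
}


def grid_runs(parameters):
    """Yield one dict of hyperparameters per grid combination."""
    names = list(parameters)
    value_lists = [parameters[name]["values"] for name in names]
    for combo in itertools.product(*value_lists):
        yield dict(zip(names, combo))


runs = list(grid_runs(sweep_parameters))
print(len(runs))  # 2 * 2 * 2 = 8 combinations
```

The count grows multiplicatively with every added parameter, which is one reason random or Bayesian search is often preferred for larger spaces.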

## Demo

|                                                                                                                                                                          |                                                                                                                                              |                                                                                                                                                                                     |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ![alt colab](https://www.tensorflow.org/images/colab_logo_32px.png)[Run in Colab](https://colab.research.google.com/drive/1N1CIqdOKlJSJla5PU53Yn9KWSao47eMv?usp=sharing) | ![alt Source](https://www.tensorflow.org/images/GitHub-Mark-32px.png)[View Source on GitHub](https://github.com/udaylunawat/Whats-this-rock) | ![alt notebook](https://www.tensorflow.org/images/download_logo_32px.png)[Download Notebook](https://github.com/udaylunawat/Whats-this-rock/blob/main/notebooks/03_training.ipynb) |

## Features

<table border="0" class="left">
<tr>
<td>
<b>Features added</b>
</td>
<td>
<b>Features planned</b>
</td>
</tr>
<tr>
<td>

- Wandb

- Datasets

  - 4 Datasets

- Augmentation

  - keras-cv
  - Regular Augmentation

- Sampling

  - Oversampling
  - Undersampling
  - Class weights

- Remove Corrupted Images

- Try Multiple Optimizers (Adam, RMSProp, AdamW, SGD)

- Generators

  - TFDS datasets
  - ImageDataGenerator

- Models

  - ConvNextTiny
  - BaselineCNN
  - Efficientnet
  - Resnet101
  - MobileNetv1
  - MobileNetv2
  - Xception

- LRScheduler, LRDecay

  - Baseline without scheduler
  - Step decay
  - Cosine annealing
  - Classic cosine annealing with batch steps w/o restart

- Model Checkpoint, Resume Training

- Evaluation

  - Confusion Matrix
  - Classification Report

- Deploy Telegram Bot

  - Heroku - Deprecated
  - Railway
  - Show CM and CL in bot

- Docker

- GitHub Actions

  - Deploy Bot when bot.py is updated.
  - Lint code using GitHub super-linter

- Configuration Management

  - ml-collections
  - Hydra

- Performance improvement

  - Convert to tf.data.Dataset

- Linting & Formatting

  - Black
  - Flake8
  - isort
  - pydocstyle

- Add Badges

  - Linting

- Found the classes the model performs worst on

- nbdev

- CI

- documentation

  </td>
  <td>

- [ ] Deploy to Huggingface spaces

- [ ] Accessing the model through FastAPI (Backend)

- [ ] Streamlit (Frontend)

- [ ] convert models.py to Classes and more OOP style

- [ ] Group Runs

  - [ ] kfold cross validation

- [ ] [WandB
  Tables](https://twitter.com/ayushthakur0/status/1508962184357113856?s=21&t=VRL-ZXzznXV_Hg2h7QnjuA)

- [ ] Find the long-tail or hard examples

- [ ] Add Badges

  - [ ] Railway

  </td>

</tr>
</table>
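
Of the sampling options listed above, class weights are the cheapest to try because the data itself stays untouched. A common heuristic (the one scikit-learn calls "balanced") weights each class by `n_samples / (n_classes * class_count)`. The sketch below is an assumption about how such weights might be computed, not the package's code; the function name and the toy label list are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (the 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels: class 0 is common, class 2 is rare.
labels = [0] * 6 + [1] * 3 + [2] * 1
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary of this shape, keyed by class index, is what Keras accepts through the `class_weight` argument of `model.fit`.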

## Technologies Used

|                                                                                                                                                                                                                                                    |                                                                                                                                                                                                                 |                                                                                                                                                                                                  |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [![Google Colab](https://img.shields.io/badge/Compute-Google%20Colab-F9AB00?logo=googlecolab&logoColor=fff&style=for-the-badge.png)](https://colab.research.google.com/drive/1N1CIqdOKlJSJla5PU53Yn9KWSao47eMv?usp=sharing "Google collaboratory") | [![python-telegram-bot](https://img.shields.io/badge/ChatBot-Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=black.png)](https://github.com/python-telegram-bot/python-telegram-bot "Telegram Bot") | [![Railway](https://img.shields.io/badge/Deployment-Railway-131415?style=for-the-badge&logo=railway&logoColor=black.png)](https://railway.app "Railway")                                         |
| [![Jupyter Notebook](https://img.shields.io/badge/Coding-jupyter-%23FA0F00.svg?style=for-the-badge&logo=jupyter&logoColor=black)](https://jupyter.org "Jupyter")                                                                                   | [![Python](https://img.shields.io/badge/Language-python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54.png)](https://www.python.org/ "Python")                                                         | [![GitHub Actions](https://img.shields.io/badge/CI-github%20actions-%232671E5.svg?style=for-the-badge&logo=githubactions&logoColor=black)](https://github.com/features/actions "Github Actions") |
| [![Weights & Biases](https://img.shields.io/badge/MLOps-Weights%20%26%20Biases-FFBE00?logo=weightsandbiases&logoColor=000&style=for-the-badge.png)](http://wandb.ai "Weights & Biases")                                                            | [![TensorFlow](https://img.shields.io/badge/ML_Framework-TensorFlow-%23FF6F00.svg?style=for-the-badge&logo=TensorFlow&logoColor=black)](https://www.tensorflow.org/ "Tensorflow")                               | [![macOS](https://img.shields.io/badge/OS-mac%20os-000000?style=for-the-badge&logo=macos&logoColor=F0F0F0.png)](https://apple.com/macos "macOS")                                                 |
| [![Docker](https://img.shields.io/badge/Container-docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=black)](http://docker.com "Docker")                                                                                               | [![Git](https://img.shields.io/badge/Version_Control-git-%23F05033.svg?style=for-the-badge&logo=git&logoColor=black)](https://git-scm.com "Git")                                                                | [![Hydra](https://img.shields.io/badge/config-hydra1.1-89b8cd?style=for-the-badge&labelColor=gray)](http://hydra.cc "Hydra")                                                                     |
| [![Black](https://img.shields.io/badge/code%20style-black-black.svg?style=for-the-badge&labelColor=gray)](http://github.com/psf/black "Black")                                                                                                     |                                                                                                                                                                                                                 |                                                                                                                                                                                                  |

## Directory Tree

    ├── imgs                              <- Images for skill banner, project banner and other images
    │
    ├── configs                           <- Configuration files
    │   ├── configs.yaml                  <- config for single run
    │   └── sweeps.yaml                   <- configuration file for hyperparameter sweeps
    │
    ├── data
    │   ├── corrupted_images              <- corrupted images will be moved to this directory
    │   ├── misclassified_images          <- misclassified images will be moved to this directory
    │   ├── bad_images                    <- Bad images will be moved to this directory
    │   ├── duplicate_images              <- Duplicate images will be moved to this directory
    │   ├── sample_images                 <- Sample images for inference
    │   ├── 0_raw                         <- The original, immutable data dump.
    │   ├── 1_external                    <- Data from third party sources.
    │   ├── 2_interim                     <- Intermediate data that has been transformed.
    │   └── 3_processed                   <- The final, canonical data sets for modeling.
    │
    ├── notebooks                         <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                                        the creator's initials, and a short `-` delimited description, e.g.
    │                                        `1.0-jqp-initial-data-exploration`.
    │
    │
    ├── rocks_classifier                  <- Source code for use in this project.
    │   │
    │   ├── data                          <- Scripts to download or generate data
    │   │   ├── download.py
    │   │   ├── preprocess.py
    │   │   └── utils.py
    │   │
    │   ├── callbacks                     <- functions that are executed during training at given stages of the training procedure
    │   │   └── callbacks.py
    │   │
    │   ├── models                        <- Scripts to train models and then use trained models to make
    │   │   │                                predictions
    │   │   ├── evaluate.py
    │   │   ├── models.py
    │   │   ├── predict.py
    │   │   ├── train.py
    │   │   └── utils.py
    │   │
    │   │
    │   └── visualization                 <- Scripts for visualizations
    │
    ├── .dockerignore                     <- Docker ignore
    ├── .gitignore                        <- GitHub's excellent Python .gitignore customized for this project
    ├── LICENSE                           <- Your project's license.
    ├── README.md                         <- The top-level README for developers using this project.
    ├── CHANGELOG.md                      <- Release changes.
    ├── CODE_OF_CONDUCT.md                <- Code of conduct.
    ├── CONTRIBUTING.md                   <- Contributing Guidelines.
    ├── settings.ini                      <- configuration.
    ├── requirements.txt                  <- The requirements file for reproducing the analysis environment, e.g.
    │                                        generated with `pip freeze > requirements.txt`
    └── setup.py                          <- makes project pip installable (pip install -e .) so src can be imported

## Bug / Feature Request

If you find a bug (the bot couldn’t handle an image or gave an
undesired result), please open an issue
[here](https://github.com/udaylunawat/Whats-this-rock/issues) and
include the image you sent and the expected result.

If you’d like to request a new feature, feel free to do so by opening
an issue [here](https://github.com/udaylunawat/Whats-this-rock/issues).
Please include sample inputs and the results you expect.

<!-- CONTRIBUTING -->

## Contributing

- Contributions make the open source community such an amazing place to
  learn, inspire, and create.
- Any contributions you make are **greatly appreciated**.
- Check out our [contribution guidelines](../CONTRIBUTING.md) for more
  information.

## License

This project is licensed under the Apache License 2.0. See the
[LICENSE](LICENSE) file for details.

## Credits

- [Dataset 1 - by Mahmoud
  Alforawi](https://www.kaggle.com/datasets/mahmoudalforawi/igneous-metamorphic-sedimentary-rocks-and-minerals)
- [Dataset 2 - by
  salmaneunus](https://www.kaggle.com/datasets/salmaneunus/rock-classification)
- nbdev inspiration - [tmabraham](https://github.com/tmabraham/UPIT)

## Support

This project needs a ⭐️ from you. Don’t forget to leave a star ⭐️

<br>
<p align="center">
Walt might be the one who knocks <br> but Hank is the one who rocks.
</p>
