Metadata-Version: 2.1
Name: dcargs
Version: 0.1.1
Summary: Strongly typed, zero effort CLIs
Home-page: http://github.com/brentyi/dcargs
Author: brentyi
Author-email: brentyi@berkeley.edu
License: MIT
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Provides-Extra: testing
Provides-Extra: type-checking
License-File: LICENSE

# dcargs

![build](https://github.com/brentyi/dcargs/workflows/build/badge.svg)
![mypy](https://github.com/brentyi/dcargs/workflows/mypy/badge.svg?branch=master)
![lint](https://github.com/brentyi/dcargs/workflows/lint/badge.svg)
[![codecov](https://codecov.io/gh/brentyi/dcargs/branch/master/graph/badge.svg)](https://codecov.io/gh/brentyi/dcargs)

<!-- vim-markdown-toc GFM -->

* [Overview](#overview)
* [Examples](#examples)
  * [Functions](#functions)
  * [Dataclasses](#dataclasses)
  * [Nested dataclasses](#nested-dataclasses)
* [Serialization](#serialization)
* [Alternative tools](#alternative-tools)

<!-- vim-markdown-toc -->

## Overview

**`dcargs`** is a library for strongly-typed argument parsers and configuration
objects.

```bash
pip install dcargs
```

Our core interface generates CLIs from type-annotated callables, which
may be functions, classes, or dataclasses. The goal is a tool that's lightweight
enough for simple interactive scripts, but flexible enough to replace heavier
frameworks typically used to build hierarchical configuration systems.
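
To make the idea concrete, here is a minimal standard-library sketch of generating a parser from a function signature. It is not how `dcargs` is implemented: `cli_sketch` is a hypothetical helper written for this illustration, and it assumes every parameter is annotated with a simple type such as `str`, `int`, `float`, or `bool`.

```python
import argparse
import inspect
from typing import Any, Callable, Optional, Sequence


def cli_sketch(f: Callable[..., Any], args: Optional[Sequence[str]] = None) -> Any:
    """Hypothetical helper: build an argparse parser from f's signature, then call f."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(f))
    for name, param in inspect.signature(f).parameters.items():
        flag = "--" + name.replace("_", "-")
        has_default = param.default is not inspect.Parameter.empty
        if param.annotation is bool and has_default and param.default is False:
            # Booleans that default to False become store_true flags.
            parser.add_argument(flag, dest=name, action="store_true")
        else:
            # Assumption: the annotation is callable on a string (str, int, float).
            parser.add_argument(
                flag,
                dest=name,
                type=param.annotation,
                required=not has_default,
                default=param.default if has_default else None,
            )
    return f(**vars(parser.parse_args(args)))


def main(field1: str, field2: int, flag: bool = False) -> tuple:
    """Toy function whose arguments are populated from the command line."""
    return (field1, field2, flag)


result = cli_sketch(main, ["--field1", "hello", "--field2", "3", "--flag"])
print(result)
```

Passing a list of strings instead of reading `sys.argv` keeps the sketch easy to exercise from a test or an interactive session.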

<table><tr><td>
<details>
    <summary>
    <code><strong>dcargs.cli</strong>(f: Callable[..., T], *, description:
    Optional[str], args: Optional[Sequence[str]], default_instance: Optional[T]) -> T</code>
    </summary>

<!-- prettier-ignore-start -->
<pre><code>Call `f(...)`, with arguments populated from an automatically generated CLI
interface.

`f` should have type-annotated inputs, and can be a function, class, or dataclass.
Note that if `f` is a class, `dcargs.cli()` returns an instance.

The parser is generated by populating helptext from docstrings and types from
annotations; a broad range of core type annotations are supported...
    - Types natively accepted by `argparse`: str, int, float, pathlib.Path, etc.
    - Default values for optional parameters.
    - Booleans, which are automatically converted to flags when provided a default
      value.
    - Enums (via `enum.Enum`).
    - Various container types. Some examples:
      - `typing.ClassVar`.
      - `typing.Optional`.
      - `typing.Literal`.
      - `typing.Sequence`.
      - `typing.List`.
      - `typing.Tuple`, such as `typing.Tuple[T1, T2, T3]` or
        `typing.Tuple[T, ...]`.
      - `typing.Set`.
      - `typing.Final` and `typing.Annotated`.
      - Nested combinations of the above: `Optional[Literal[T]]`,
        `Final[Optional[Sequence[T]]]`, etc.
    - Nested dataclasses.
      - Simple nesting.
      - Unions over nested dataclasses (subparsers).
      - Optional unions over nested dataclasses (optional subparsers).
    - Generic dataclasses (including nested generics).

Args:
    f: Callable.

Keyword Args:
    description: Description text for the parser, displayed when the --help flag is
        passed in. If not specified, `f`'s docstring is used. Mirrors argument
        from `argparse.ArgumentParser()`.
    args: If set, parse arguments from a sequence of strings instead of the
        commandline. Mirrors argument from `argparse.ArgumentParser.parse_args()`.
    default_instance: An instance of `T` to use for default values; only supported
        if `T` is a dataclass type. Helpful for merging CLI arguments with values loaded
        from elsewhere (for example, a config object loaded from a YAML file).

Returns:
    The output of `f(...)`.</code></pre>
<!-- prettier-ignore-end -->

</details>
</td></tr></table>
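
The merging behavior described for `default_instance` can be pictured with a small standard-library sketch. This only illustrates the concept and says nothing about the internals of `dcargs`: `parse_with_defaults` and the `Args` dataclass are hypothetical names, and the sketch assumes flat dataclasses whose field types can be called on a string.

```python
import argparse
import dataclasses
from typing import Optional, Sequence


@dataclasses.dataclass
class Args:
    learning_rate: float
    seed: int


def parse_with_defaults(
    default_instance: Args, args: Optional[Sequence[str]] = None
) -> Args:
    """Hypothetical helper: CLI values override those stored on default_instance."""
    parser = argparse.ArgumentParser()
    for field in dataclasses.fields(default_instance):
        parser.add_argument(
            "--" + field.name.replace("_", "-"),
            dest=field.name,
            type=field.type,
            # Fall back to the value already stored on the instance.
            default=getattr(default_instance, field.name),
        )
    return dataclasses.replace(default_instance, **vars(parser.parse_args(args)))


# Stand-in for a config object loaded from elsewhere, such as a file.
loaded = Args(learning_rate=1e-3, seed=7)
merged = parse_with_defaults(loaded, ["--seed", "42"])
print(merged)
```

Fields not mentioned on the command line keep the values from the loaded instance, so no field needs to be marked as required.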

Notably, `dcargs.cli()` supports _nested_ classes and dataclasses, which enable
expressive hierarchical configuration objects built on standard Python features.

Our ultimate goal is an interface that's:

- **Low-effort.** Type annotations, docstrings, and default values can be used
  to automatically generate argument parsers with informative helptext. This
  includes bells and whistles like enums, containers, etc.
- **Strongly typed.** Unlike dynamic configuration namespaces produced by
  libraries like `argparse`, `YACS`, `abseil`, `hydra`, or `ml_collections`,
  statically typed outputs mean that IDE-assisted autocomplete, rename,
  refactor, and go-to-definition operations work out of the box, as do static
  checking tools like `mypy` and `pyright`.
- **Modular.** Most approaches to configuration objects require a centralized
  definition of all configurable fields. Supporting hierarchically nested
  configuration classes/dataclasses, however, makes it easy to distribute
  definitions, defaults, and documentation of configurable fields across modules
  or source files. A model configuration dataclass, for example, can be
  co-located in its entirety with the model implementation and dropped into any
  experiment configuration with an import — this eliminates redundancy and makes
  the entire module easy to port across codebases.

## Examples

A series of example scripts can be found in [./examples](./examples).

### Functions

```python
# examples/0_simple_function.py
import dcargs


def main(
    field1: str,
    field2: int,
    flag: bool = False,
) -> None:
    """Function, whose arguments will be populated from a CLI interface.

    Args:
        field1: First field.
        field2: Second field.
        flag: Boolean flag that we can set to true.
    """
    print(field1, field2, flag)


if __name__ == "__main__":
    dcargs.cli(main)
```

---

```console
$ python 0_simple_function.py --help
usage: 0_simple_function.py [-h] --field1 STR --field2 INT [--flag]

Function, whose arguments will be populated from a CLI interface.

required arguments:
  --field1 STR  First field.
  --field2 INT  Second field.

optional arguments:
  -h, --help    show this help message and exit
  --flag        Boolean flag that we can set to true.
```

### Dataclasses

```python
# examples/1_simple_dataclass.py
import dataclasses

import dcargs


@dataclasses.dataclass
class Args:
    """Description.
    This should show up in the helptext!"""

    field1: str  # A string field.
    field2: int  # A numeric field.
    flag: bool = False  # A boolean flag.


if __name__ == "__main__":
    args = dcargs.cli(Args)
    print(args)
```

---

```console
$ python 1_simple_dataclass.py --help
usage: 1_simple_dataclass.py [-h] --field1 STR --field2 INT [--flag]

Description.
This should show up in the helptext!

required arguments:
  --field1 STR  A string field.
  --field2 INT  A numeric field.

optional arguments:
  -h, --help    show this help message and exit
  --flag        A boolean flag.
```

### Nested dataclasses

```python
# examples/6_nested_dataclasses.py
import dataclasses
import enum

import dcargs


class OptimizerType(enum.Enum):
    ADAM = enum.auto()
    SGD = enum.auto()


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    # Gradient-based optimizer to use.
    algorithm: OptimizerType = OptimizerType.ADAM

    # Learning rate to use.
    learning_rate: float = 3e-4

    # Coefficient for L2 regularization.
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    """A nested experiment configuration. Note that the argument parser description is
    pulled from this docstring by default, but can also be overridden with
    `dcargs.cli()`'s `description=` argument."""

    # Experiment name to use.
    experiment_name: str

    # Various configurable options for our optimizer.
    optimizer: OptimizerConfig

    # Random seed. This is helpful for making sure that our experiments are all
    # reproducible!
    seed: int = 0


if __name__ == "__main__":
    config = dcargs.cli(ExperimentConfig)
    print(config)
    print(dcargs.to_yaml(config))
```

---

```console
$ python 6_nested_dataclasses.py --help
usage: 6_nested_dataclasses.py [-h] --experiment-name STR [--optimizer.algorithm {ADAM,SGD}]
                               [--optimizer.learning-rate FLOAT] [--optimizer.weight-decay FLOAT]
                               [--seed INT]

A nested experiment configuration. Note that the argument parser description is
pulled from this docstring by default, but can also be overridden with
`dcargs.cli()`'s `description=` argument.

required arguments:
  --experiment-name STR
                        Experiment name to use.

optional arguments:
  -h, --help            show this help message and exit
  --seed INT            Random seed. This is helpful for making sure that our experiments are all
                        reproducible! (default: 0)

optional optimizer arguments:
  Various configurable options for our optimizer.

  --optimizer.algorithm {ADAM,SGD}
                        Gradient-based optimizer to use. (default: ADAM)
  --optimizer.learning-rate FLOAT
                        Learning rate to use. (default: 0.0003)
  --optimizer.weight-decay FLOAT
                        Coefficient for L2 regularization. (default: 0.01)
```
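
The dotted flags in the help text above map onto fields of the nested dataclass. One way such a mapping could work is sketched below using only the standard library. This is an illustration under stated assumptions rather than the `dcargs` implementation: `add_fields`, `build`, and `parse_nested` are hypothetical helpers, the config classes are trimmed-down copies of the example (the enum field is omitted), and only simple field types and class-level defaults are handled.

```python
import argparse
import dataclasses
from typing import Any, Dict, Optional, Sequence


@dataclasses.dataclass(frozen=True)
class OptimizerConfig:
    learning_rate: float = 3e-4
    weight_decay: float = 1e-2


@dataclasses.dataclass(frozen=True)
class ExperimentConfig:
    experiment_name: str
    optimizer: OptimizerConfig
    seed: int = 0


def add_fields(parser: argparse.ArgumentParser, cls: Any, prefix: str = "") -> None:
    """Register one flag per leaf field, prefixing nested fields with 'parent.'."""
    for field in dataclasses.fields(cls):
        name = prefix + field.name
        if dataclasses.is_dataclass(field.type):
            add_fields(parser, field.type, prefix=name + ".")
            continue
        has_default = field.default is not dataclasses.MISSING
        parser.add_argument(
            "--" + name.replace("_", "-"),
            dest=name,
            type=field.type,
            required=not has_default,
            default=field.default if has_default else None,
        )


def build(cls: Any, values: Dict[str, Any], prefix: str = "") -> Any:
    """Rebuild the nested dataclass from the flat, dotted argparse namespace."""
    kwargs = {}
    for field in dataclasses.fields(cls):
        name = prefix + field.name
        if dataclasses.is_dataclass(field.type):
            kwargs[field.name] = build(field.type, values, prefix=name + ".")
        else:
            kwargs[field.name] = values[name]
    return cls(**kwargs)


def parse_nested(cls: Any, args: Optional[Sequence[str]] = None) -> Any:
    parser = argparse.ArgumentParser()
    add_fields(parser, cls)
    return build(cls, vars(parser.parse_args(args)))


config = parse_nested(
    ExperimentConfig,
    ["--experiment-name", "run1", "--optimizer.learning-rate", "0.1"],
)
print(config)
```

Because the result is an ordinary dataclass instance, attribute access such as `config.optimizer.learning_rate` stays visible to static type checkers.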

## Serialization

As a secondary feature aimed at enabling the use of `dcargs.cli()` for general
configuration use cases, we also introduce functions for human-readable
dataclass serialization:

- <code><strong>dcargs.from_yaml</strong>(cls: Type[T], stream: Union[str,
  IO[str], bytes, IO[bytes]]) -> T</code> and
  <code><strong>dcargs.to_yaml</strong>(instance: T) -> str</code> convert
  between YAML-style strings and dataclass instances.

The functions attempt to strike a balance between flexibility and robustness.
In contrast to naively dumping or loading dataclass instances (via pickle,
PyYAML, etc.), explicit type references enable custom tags that are robust
against code reorganization and refactoring, while a PyYAML backend enables
serialization of arbitrary Python objects.

## Alternative tools

The core functionality of `dcargs` --- generating argument parsers from type
annotations --- can be found as a subset of the features offered by many other
libraries. A summary of some distinguishing features:

|                                                                                                              | Choices from literals                                    | Generics | Docstrings as helptext | Nesting | Subparsers | Containers |
| ------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------- | -------- | ---------------------- | ------- | ---------- | ---------- |
| **dcargs**                                                                                                   | ✓                                                        | ✓        | ✓                      | ✓       | ✓          | ✓          |
| **[datargs](https://github.com/roee30/datargs)**                                                             | ✓                                                        |          |                        |         | ✓          | ✓          |
| **[tap](https://github.com/swansonk14/typed-argument-parser)**                                               | ✓                                                        |          | ✓                      |         | ✓          | ✓          |
| **[simple-parsing](https://github.com/lebrice/SimpleParsing)**                                               | [soon](https://github.com/lebrice/SimpleParsing/pull/86) |          | ✓                      | ✓       | ✓          | ✓          |
| **[argparse-dataclass](https://pypi.org/project/argparse-dataclass/)**                                       |                                                          |          |                        |         |            |            |
| **[argparse-dataclasses](https://pypi.org/project/argparse-dataclasses/)**                                   |                                                          |          |                        |         |            |            |
| **[dataclass-cli](https://github.com/malte-soe/dataclass-cli)**                                              |                                                          |          |                        |         |            |            |
| **[clout](https://pypi.org/project/clout/)**                                                                 |                                                          |          |                        | ✓       |            |            |
| **[hf_argparser](https://github.com/huggingface/transformers/blob/master/src/transformers/hf_argparser.py)** |                                                          |          |                        |         |            | ✓          |
| **[pyrallis](https://github.com/eladrich/pyrallis/)**                                                        |                                                          |          | ✓                      | ✓       |            | ✓          |

Note that most of these libraries are aimed specifically at _dataclasses_
rather than general typed callables, but they offer other features that you
might find useful, such as registration for custom types (`pyrallis`),
different approaches for serialization and config files (`tap`, `pyrallis`),
simultaneous parsing of multiple dataclasses (`simple-parsing`), etc.


