Metadata-Version: 2.1
Name: validify
Version: 0.0.15
Summary: validify is a rule-based validation module for assessing the structure of an xml tree, built on top of the lxml library. It currently covers a subset of the XML Schema 1.1 Definition.
Home-page: https://github.com/olivergoetze/validify
Author: Oliver Götze
Author-email: oliver.goetze@mailbox.org
License: MIT
Description: ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/validify) 
        ![PyPI - Status](https://img.shields.io/pypi/status/validify)
        ![PyPI - License](https://img.shields.io/pypi/l/validify)
        
        `validify` is a rule-based validation module for assessing the structure of an xml tree, written in Python and built on top of the [lxml](https://lxml.de/) library. It currently covers a subset of the XML Schema 1.1 Definition.
        
        ### Requirements
        - Python 3.5+
        - [lxml](https://pypi.org/project/lxml/)
        - [loguru](https://pypi.org/project/loguru/)
        
        ### Installation
        The `validify` module can be found on [PyPI](https://pypi.org/project/validify/).
        It can be installed by using pip:
        
        `pip install validify`
        
        Dependencies will be automatically fetched by pip.
        
        ### Basic usage
        ```python
        import validify
        validation_result = validify.validate(input_file="validify/test.xml", input_elementtree=None, xmlns_def=None, validation_rules=None, message_lang=None, log_to_console=True, log_debug=False)
        ```
        
        #### Parameters
        - input_file (default: `None`): path to the XML file to validate.
        - input_elementtree (default: `None`): an `etree.ElementTree` object to validate, as an alternative to `input_file`.
        - xmlns_def (default: `None`): a namespace definition, supplied as a Python dictionary (`{None: "default_namespace", "namespace_prefix": "another_namespace"}`).
        - validation_rules (default: `None`): a Python dictionary containing the validation rules (see "Defining validation rules" below for an example). If no value is supplied, an example rules dictionary is used.
        - message_lang (default: `de`): language of the validation message strings. Supported values are `en` and `de`.
        - log_to_console (default: `True`): if `True`, validation and status messages are logged to the console. If `False`, validation messages are only added to the results returned by `validify.validate`.
        - log_debug (default: `False`): if `True`, debug status messages are logged to the console.
        
        One of the parameters `input_file` or `input_elementtree` must be passed for the library to produce validation results.
        
        #### Defining validation rules
        Validation rules are defined in a dictionary object (JSON-like structure):
        ```python
        validation_rules = {}
        
        validation_rules["element"] = []
        ruleset = {}
        
        ruleset["element_content_optional"] = False
        ruleset["element_children_optional"] = False
        ruleset["optional_attributes"] = ["valid_optional_attribute_01", "valid_optional_attribute_02"]
        ruleset["obligatory_attributes"] = ["obligatory_attribute_01", "obligatory_attribute_02"]
        ruleset["optional_subelements"] = ["optional_subelement_01", "optional_subelement_02"]
        ruleset["obligatory_subelements"] = ["obligatory_subelement_01", "obligatory_subelement_02"]
        ruleset["max_occurence"] = 2
        ruleset["text_character_content_allowed"] = True
        ruleset["tail_character_content_allowed"] = False
        
        ruleset["allowed_values"] = ["valid_value_01", "valid_value_02"]
        ruleset["allowed_patterns"] = [r"^test-\d{4}$", r"^test-\d{3}$"]
        ruleset["allowed_datatypes"] = []
        ruleset["attribute_def"] = []
        ruleset["attribute_def"].append({"attribute_name": "valid_optional_attribute_01", "allowed_values": ["valid_value_01", "valid_value_02"],
                                         "allowed_patterns": [r"^test-\d{4}$", r"^test-\d{3}$"]})
        ruleset["attribute_def"].append(
            {"attribute_name": "obligatory_attribute_01", "allowed_values": ["valid_value_01", "valid_value_02"],
             "allowed_patterns": []})
        
        
        ruleset["rule_conditions"] = []
        ruleset["rule_conditions"].append(
            {"text_values": ["valid_text_value"],
             "attribute_def": [{"attribute_name": "valid_attribute_name", "allowed_values": ["valid_value"]},
                               {"attribute_name": "another_valid_attribute_name", "allowed_values": ["valid_value"]}],
             "reference_elements": [{"element_name": "reference_element", "attribute_def": [{"attribute_name": "reference_test", "allowed_values": ["valid_value"]}],"preceding_elements": 1}]})
        
        
        validation_rules["element"].append(ruleset)
        ```
        Each element can be provided with one or more rulesets.
        Use the `rule_conditions` definition when a ruleset should only be applied if the validated element contains the defined attribute(s) and attribute value(s). Besides the validated element, a reference element can also be defined and checked for attribute values. Currently, the reference element must be a parent element of the validated element; the parent level is set by `preceding_elements`.
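        
        As a rough sketch of attaching several conditional rulesets to one element, the helper below builds two rulesets that differ only in their condition and allowed values. The helper name `conditional_ruleset`, the attribute name `type` and all values are hypothetical. The sketch also assumes that a condition containing only `attribute_def` is accepted, which this document does not confirm.
        
        ```python
        def conditional_ruleset(condition_attribute, condition_value, allowed_values):
            """Build a ruleset meant to apply only when the validated element
            carries condition_attribute with condition_value (hypothetical helper)."""
            return {
                "element_content_optional": False,
                "element_children_optional": True,
                "optional_attributes": [],
                "obligatory_attributes": [condition_attribute],
                "optional_subelements": [],
                "obligatory_subelements": [],
                "max_occurence": 1,
                "text_character_content_allowed": True,
                "tail_character_content_allowed": False,
                "allowed_values": list(allowed_values),
                "allowed_patterns": [],
                "allowed_datatypes": [],
                "attribute_def": [],
                # Assumption: a condition with only "attribute_def" is sufficient.
                "rule_conditions": [
                    {"attribute_def": [{"attribute_name": condition_attribute,
                                        "allowed_values": [condition_value]}]}
                ],
            }
        
        
        # Two rulesets for the same element, selected by the value of "type".
        validation_rules = {
            "element": [
                conditional_ruleset("type", "short", ["a", "b"]),
                conditional_ruleset("type", "long", ["alpha", "beta"]),
            ]
        }
        ```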
        
        #### Validation output
        `validify.validate` returns a list containing the validation messages as dictionaries:
        ```python
        [{'message_id': '0001', 'message_text ': 'Element example_element does not contain any subelements, although one or more subelements are expected.', 'element_name': '{namespace}example_element', 'local_name': 'example_element', 'element_sourceline': '23'}]
        ```
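        
        Because the result is a plain list of dictionaries, it can be post-processed with standard Python. The following sketch groups messages by `local_name` and sorts them by source line. The helper name `group_by_element` is hypothetical, and the sample data is illustrative: it is trimmed to the keys used here, and the second message id is invented.
        
        ```python
        from collections import defaultdict
        
        
        def group_by_element(messages):
            """Group validation messages by local element name, ordered by source line."""
            grouped = defaultdict(list)
            for message in messages:
                grouped[message["local_name"]].append(message)
            for entries in grouped.values():
                # element_sourceline is a string in the sample output, so convert it.
                entries.sort(key=lambda m: int(m["element_sourceline"]))
            return dict(grouped)
        
        
        # Illustrative data in the shape shown above, not real validation output.
        sample = [
            {"message_id": "0001", "local_name": "example_element", "element_sourceline": "23"},
            {"message_id": "0001", "local_name": "example_element", "element_sourceline": "7"},
            {"message_id": "0002", "local_name": "other_element", "element_sourceline": "12"},
        ]
        
        grouped = group_by_element(sample)
        for name, entries in grouped.items():
            lines = ", ".join(entry["element_sourceline"] for entry in entries)
            print("{}: {} message(s) at line(s) {}".format(name, len(entries), lines))
        ```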
        
        ### XML Schema feature coverage
        For now, a small subset of the [XML Schema](https://www.w3.org/TR/xmlschema11-1/) features is provided:
        - Test if a ruleset applies by checking a reference element's text and attribute values
        - Define if element children and content are optional
        - Define optional and obligatory attributes
        - Define optional and obligatory subelements
        - Define maximum occurrence of an element
        - Define if character content is allowed
        - Define an element's allowed values (~ xs:enumeration)
        - Define an element's allowed patterns (~ xs:pattern)
        - Define an attribute's allowed values and patterns
        
        This module is currently used to validate data deliveries in the [EAD XML application profile](https://wiki.deutsche-digitale-bibliothek.de/pages/viewpage.action?pageId=19010180), which are processed for ingestion into the metadata portals [Deutsche Digitale Bibliothek](https://www.ddb.de) and [Archivportal-D](https://www.archivportal-d.de). The supported features are therefore still far from covering the full XML Schema standard, but feature support is intended to be expanded gradually.
        
        The following features are planned for a future release:
        - checking max/min text and attribute values
        - validating string length
        - support for pre-defined data types (e.g. `xs:ID`, `xs:NMTOKEN`)
        - rule conditions: direct support for XPath and lxml's `itersiblings` and `iterancestors` methods.
        
        ### Development status
        This package is in an early development stage. It should already work reliably for its intended use cases, but the documentation is incomplete and the API is not yet stable.
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Intended Audience :: Developers
Requires-Python: >=3.5
Description-Content-Type: text/markdown
