VALVE
VALVE is A Lightweight Validation Engine.
https://github.com/ontodev/valve.rs
Status:
- active development
- “alpha” state
Key Features:
- work with invalid data in the process of being cleaned
- work with directories of TSV files
- work with SQL databases: SQLite and Postgres
- configure with tables
- validate in batches and interactively
- support Nanobot web interface
Planned Features:
- undo/redo history
- work with Excel files
- work with Google Sheets
Standard Workflow:
- you install a single Rust binary
- you tell VALVE to read the table.tsv file and create a SQLite database
- VALVE loads and checks the configuration
- VALVE builds SQL tables and views
- VALVE loads tables into SQL, validating in batch mode
- you review the VALVE validation messages
- you edit with Nanobot web interface, validating in interactive mode
- you tell VALVE to save to TSV
  - “perfect” round-trips, important for version control
  - not relevant for large tables
- you use the SQL database with another application
A VALVE table:
- has a header row
- has a distinct header for each column
- has the same number of cells in each row
- has a single datatype for each column (like a SQL database)
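
The structural rules above can be checked with a few lines of code. The following is a minimal Python sketch, not VALVE's own implementation: the function name `check_table_shape` is hypothetical, and the datatype rule is left out because it depends on the VALVE configuration.

```python
import csv
import io


def check_table_shape(tsv_text):
    """Return a list of problems found in a TSV table's basic structure."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        return ["table has no header row"]
    header = rows[0]
    problems = []
    # Every column needs its own distinct header.
    duplicates = sorted({h for h in header if header.count(h) > 1})
    if duplicates:
        problems.append(f"duplicate headers: {', '.join(duplicates)}")
    # Every data row must have as many cells as the header row.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {number}: expected {len(header)} cells, found {len(row)}"
            )
    return problems


good_table = "id\tlabel\n1\tapple\n2\tpear\n"
bad_table = "id\tid\n1\n"
good_problems = check_table_shape(good_table)
bad_problems = check_table_shape(bad_table)
```

A table that passes returns an empty list; each violation adds one human-readable entry.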
A VALVE SQL database:
- has a “normal” table for each source table
  - SQL datatypes and constraints: primary key, unique, foreign key
  - data in the normal table may not be valid, but it meets SQL constraints
- has a “foo_conflict” table for each normal table “foo”
  - stores rows that violate SQL constraints
- has a “foo_view” view for each normal table “foo”
  - the union of the normal and conflict tables
  - joins the message table
- has a “message” table listing validation messages
  - errors, warnings, information, debugging messages
  - specifies the validation rule and a message for the user
- has a “history” table for undo/redo
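
To make the relationship between these tables concrete, here is a small sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the column names, the `unique-id` rule name, and the `insert_row` helper are hypothetical and are not VALVE's actual schema or API. The history table is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normal table: rows here always satisfy the SQL constraints.
    CREATE TABLE foo (
        row_number INTEGER PRIMARY KEY,
        id TEXT UNIQUE,
        label TEXT
    );
    -- Conflict table: same columns, without the unique constraint.
    CREATE TABLE foo_conflict (
        row_number INTEGER PRIMARY KEY,
        id TEXT,
        label TEXT
    );
    -- Message table: one row per validation message (hypothetical columns).
    CREATE TABLE message (
        table_name TEXT,
        row_number INTEGER,
        column_name TEXT,
        level TEXT,
        rule TEXT,
        message TEXT
    );
    -- View: union of normal and conflict rows, joined to their messages.
    CREATE VIEW foo_view AS
    SELECT r.row_number, r.id, r.label, m.level, m.rule, m.message
    FROM (SELECT * FROM foo UNION ALL SELECT * FROM foo_conflict) AS r
    LEFT JOIN message AS m
      ON m.table_name = 'foo' AND m.row_number = r.row_number;
    """
)


def insert_row(conn, row_number, id_, label):
    """Insert into foo, falling back to foo_conflict on a constraint error."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO foo VALUES (?, ?, ?)", (row_number, id_, label)
            )
        return "foo"
    except sqlite3.IntegrityError:
        with conn:
            conn.execute(
                "INSERT INTO foo_conflict VALUES (?, ?, ?)",
                (row_number, id_, label),
            )
            conn.execute(
                "INSERT INTO message VALUES "
                "('foo', ?, 'id', 'error', 'unique-id', 'duplicate id')",
                (row_number,),
            )
        return "foo_conflict"


first = insert_row(conn, 1, "a", "apple")
second = insert_row(conn, 2, "a", "avocado")  # duplicate id
view_rows = conn.execute(
    "SELECT row_number, id, level FROM foo_view ORDER BY row_number"
).fetchall()
```

The second row breaks the unique constraint, so it lands in the conflict table with a message attached, yet both rows remain visible together through the view.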
Standard data flow:
- load TSV file as a list of dictionaries from column name to cell value
  - all values are text
- use VALVE configuration to enrich cells as JSON:
  - value
  - datatype
  - nulltype
  - validation status
  - SQL conflict
  - list of validation messages
  - history of changes
- batch insert into SQL:
  - insert into normal table or conflict table
  - add messages to message table
- interactive edit/add/delete rows
  - validate row and dependents
  - allow invalid data
  - move row between normal and conflict tables, as required
  - add to undo/redo history
- export to TSV
  - convert rich datatypes to text cells
  - aim for perfect round-trip conversion
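
The load, enrich, and export steps of this flow can be sketched in Python as follows. This is an illustration under stated assumptions, not VALVE's code: the function names, the single `required` rule, and the exact keys of the enriched cell are hypothetical, and only a subset of the fields listed above is shown.

```python
import csv
import io


def load_rows(tsv_text):
    """Read TSV text as a list of dicts from column name to text value."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def enrich_row(row, required_columns):
    """Wrap each text value in a dict carrying validation details."""
    rich = {}
    for column, value in row.items():
        messages = []
        # Hypothetical rule: required columns must not be empty.
        if column in required_columns and value == "":
            messages.append(
                {"level": "error", "rule": "required", "message": "missing value"}
            )
        rich[column] = {
            "value": value,
            "valid": not messages,
            "messages": messages,
        }
    return rich


def export_rows(rich_rows, columns):
    """Convert rich cells back to plain TSV text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for rich in rich_rows:
        writer.writerow([rich[c]["value"] for c in columns])
    return out.getvalue()


source = "id\tlabel\n1\tapple\n2\t\n"
rich_rows = [enrich_row(r, {"label"}) for r in load_rows(source)]
exported = export_rows(rich_rows, ["id", "label"])
```

Because the original text of every cell is kept alongside its validation details, invalid data survives the trip and the exported text matches the source for this simple input.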
Nanobot:
- is a web interface for VALVE tables and ontologies
  - represents tables and trees
  - supports table filtering, paging, linking
- allows interactive validation and editing
- supports custom actions
- is installed as a single Rust binary
- runs as a local web application, or as part of a larger web application