/portaljs-check-data-quality
/portaljs-check-data-quality is the auditing skill for the dataset contract. Before you
add a file to the catalog with /portaljs-add-dataset or
describe it with /portaljs-define-schema, this skill
looks at the data itself and reports what's wrong with it — missing values, duplicate rows,
columns whose types don't line up, numbers that look like errors, and date fields that won't
sort. It reads; it never edits the file or the catalog.
It accepts one CSV or TSV at a time, by local path or http/https URL, and returns a
structured JSON report you can act on — or hand to /portaljs-define-schema to bake the
findings into a schema.
When to use it
Run it when you're about to publish a dataset other people will read or reuse, and you want to know what to fix first — or when a showcase looks wrong (blank cells, garbled numbers, a date column that won't sort) and you need to find the cause. Skip it for a quick throwaway preview. It only runs when you explicitly ask to check or audit data quality.
What it checks
- row count, and null / blank values per column
- duplicate rows, and duplicate values in likely identifier columns
- numeric ranges (min/max) and negative values where they're suspect
- likely year / date field quality
- ambiguous overlapping year columns (e.g.
calendar yearvsfiscal year) - mixed-type columns, inferred from sampled values
Inputs
| Input | Required | Notes |
|---|---|---|
| File | Yes | One CSV or TSV, as a local path or an http/https URL. |
python3 | Yes | Used to run the audit; the skill is read-only and installs nothing. |
Example
/portaljs-check-data-quality ./public/data/trash.csv
The skill profiles the file and returns a JSON report:
status—ok,warning, orcriticalfile,file_name,source_type,row_count,column_countfindings— the structured issues it foundrecommendations— suggested next stepscolumn_profiles— a per-column summary
A remote file works the same way:
/portaljs-check-data-quality https://example.com/trash.csv
Where to go next
/portaljs-define-schema— turn the findings into a typed Frictionless schema with the right constraints./portaljs-add-dataset— add the cleaned dataset to the catalog.