Data-Liberation-Front/csvlint.rb: The gem behind http://csvlint.io

A Ruby gem for validating the syntax and contents of CSV files. You can use it either within your own Ruby code or as a standalone command line application.

ruby version 3.4

The codebase includes both rspec and cucumber tests, which can be run together using:

or separately:

$ rake spec
$ rake features

When the cucumber tests are first run, a script will create tests based on the latest version of the CSV on the Web test suite, including creating a local cache of the test files. This requires an internet connection and some patience. Following that download, the tests will run locally; there's also a batch script:

$ bin/run-csvw-tests

which will run the tests from the command line.

If you need to refresh the CSV on the Web tests:

$ rm bin/run-csvw-tests
$ rm features/csvw_validation_tests.feature
$ rm -r features/fixtures/csvw

and then run the cucumber tests again or:

$ ruby features/support/load_tests.rb

Add this line to your application's Gemfile:

gem 'csvlint'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install csvlint

You can either use this gem within your own Ruby code, or as a standalone command line application

After installing the gem, you can validate a CSV on the command line like so:

csvlint myfile.csv

You may need to add the gem executable directory to your path by adding '/usr/local/lib/ruby/gems/2.6.0/bin' (or the equivalent for your Ruby version) to the PATH entry in your .bash_profile.

You will then see the validation result, together with any warnings or errors e.g.

myfile.csv is INVALID
1. blank_rows. Row: 3
1. title_row.
2. inconsistent_values. Column: 14

You can also optionally pass a schema file like so:

csvlint myfile.csv --schema=schema.json

Add to your .pre-commit-config.yaml file:

repos: # `pre-commit autoupdate` to get latest available tags

  - repo: https://github.com/Data-Liberation-Front/csvlint.rb
    rev: v1.2.0
    hooks:
      - id: csvlint

Run pre-commit install to enable it on your repository.

To force a manual run of pre-commit use the command:

Currently the gem supports retrieving a CSV accessible from a URL, File, or an IO-style object (e.g. StringIO)

require 'csvlint'

validator = Csvlint::Validator.new( "http://example.org/data.csv" )
validator = Csvlint::Validator.new( File.new("/path/to/my/data.csv" ))
validator = Csvlint::Validator.new( StringIO.new( my_data_in_a_string ) )

When validating from a URL the range of errors and warnings is wider as the library will also check HTTP headers for best practices

#invoke the validation
validator.validate

#check validation status
validator.valid?

#access array of errors, each is a Csvlint::ErrorMessage object
validator.errors

#access array of warnings
validator.warnings

#access array of information messages
validator.info_messages

#get some information about the CSV file that was validated
validator.encoding
validator.content_type
validator.extension
validator.row_count

#retrieve HTTP headers from request
validator.headers
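
The accessors above can be combined into a small report helper. The following is only a sketch under stated assumptions, not part of the gem: format_report and format_message are hypothetical names, and FakeMessage and FakeValidator are stand-in Structs so that the snippet runs without the gem or a network connection. It assumes each message exposes type, row and column readers, and it imitates the command line output shown earlier.

```ruby
# Hypothetical stand-ins so the sketch runs without csvlint installed.
# A real Csvlint::Validator is assumed to expose the same readers.
FakeMessage = Struct.new(:type, :row, :column)
FakeValidator = Struct.new(:errors, :warnings) do
  # Mirrors the rule that a validation succeeds when there are no errors.
  def valid?
    errors.empty?
  end
end

# Format one message roughly like the command line output,
# e.g. "1. blank_rows. Row: 3".
def format_message(message, index)
  line = "#{index}. #{message.type}."
  line += " Row: #{message.row}" if message.row
  line += " Column: #{message.column}" if message.column
  line
end

# Build a report for any object that responds to valid?, errors and warnings.
# Errors and warnings are numbered separately.
def format_report(name, validator)
  lines = ["#{name} is #{validator.valid? ? 'VALID' : 'INVALID'}"]
  validator.errors.each_with_index { |m, i| lines << format_message(m, i + 1) }
  validator.warnings.each_with_index { |m, i| lines << format_message(m, i + 1) }
  lines.join("\n")
end

validator = FakeValidator.new(
  [FakeMessage.new(:blank_rows, 3, nil)],
  [FakeMessage.new(:title_row, nil, nil),
   FakeMessage.new(:inconsistent_values, nil, 14)]
)
puts format_report("myfile.csv", validator)
```

With a real validator, the same helper could be called after validation, provided the message objects respond to the readers assumed here.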

The validator supports configuration of the CSV Dialect used in a data file. This is specified by passing a dialect hash to the constructor:

dialect = {
	"header" => true,
	"delimiter" => ","
}
validator = Csvlint::Validator.new( "http://example.org/data.csv", dialect )

The options should be a Hash that conforms to the CSV Dialect JSON structure.

While these options configure the parser to correctly process the file, the validator will still raise errors or warnings for CSV structure that it considers to be invalid, e.g. a missing header or different delimiters.

Note that the parser will also check for a header parameter on the Content-Type header returned when fetching a remote CSV file. As specified in RFC 4180, the value of this parameter can be present or absent, e.g.:

Content-Type: text/csv; header=present
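
To make the header parameter concrete, here is a stand-alone sketch of how such a value could be read from a Content-Type string. It is not the gem's own code (this README does not show how csvlint does it internally), and header_parameter is a hypothetical helper name.

```ruby
# Extract the optional "header" parameter from a Content-Type value.
# Returns "present", "absent", or nil when the parameter is not supplied
# or carries a value other than the two RFC 4180 options.
def header_parameter(content_type)
  return nil if content_type.nil?

  # Parameters follow the media type, separated by semicolons.
  params = content_type.split(";").drop(1)
  params.each do |param|
    key, value = param.strip.split("=", 2)
    next unless key && value
    next unless key.strip.casecmp?("header")

    value = value.strip.delete('"').downcase
    return value if %w[present absent].include?(value)
  end
  nil
end

p header_parameter("text/csv; header=present")               # => "present"
p header_parameter("text/csv; charset=utf-8; header=absent") # => "absent"
p header_parameter("text/csv")                               # => nil
```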

The validator provides feedback on a validation result using instances of Csvlint::ErrorMessage. Messages are divided into errors, warnings and information messages. A validation attempt is successful if there are no errors.

Messages provide context including:

The following types of error can be reported:

The following types of warning can be reported:

There are also information messages available:

The library supports validating data against a schema. A schema configuration can be provided as a Hash or parsed from JSON. The structure currently follows JSON Table Schema with some extensions, and there is rudimentary support for CSV on the Web Metadata.

An example JSON Table Schema schema file is:

{
	"fields": [
		{
			"name": "id",
			"constraints": {
				"required": true,
				"type": "http://www.w3.org/TR/xmlschema-2/#integer"
			}
		},
		{
			"name": "price",
			"constraints": {
				"required": true,
				"minLength": 1
			}
		},
		{
			"name": "postcode",
			"constraints": {
				"required": true,
				"pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"
			}
		}
	]
}
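
To illustrate what the required, minLength and pattern constraints in this schema mean for a single row, here is a minimal sketch in plain Ruby. It is not csvlint's implementation: the violations helper is hypothetical, the type constraint is left out, and anchoring the pattern to the whole value is an assumption of this sketch.

```ruby
require "json"

# A cut-down copy of the schema above, without the "type" constraint.
schema = JSON.parse(<<~JSON)
  {
    "fields": [
      { "name": "price",
        "constraints": { "required": true, "minLength": 1 } },
      { "name": "postcode",
        "constraints": { "required": true,
                         "pattern": "[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}" } }
    ]
  }
JSON

# Return [field_name, constraint] pairs for every constraint a row violates.
# The row is a Hash keyed by field name.
def violations(schema, row)
  schema["fields"].flat_map do |field|
    name = field["name"]
    value = row[name].to_s
    constraints = field.fetch("constraints", {})
    failed = []
    failed << "required" if constraints["required"] && value.empty?
    # The remaining checks only apply to non-empty values.
    unless value.empty?
      min = constraints["minLength"]
      failed << "minLength" if min && value.length < min
      pattern = constraints["pattern"]
      # Assumption: the pattern must match the whole value.
      failed << "pattern" if pattern && !value.match?(/\A(?:#{pattern})\z/)
    end
    failed.map { |constraint| [name, constraint] }
  end
end

p violations(schema, { "price" => "9.99", "postcode" => "SW1A 1AA" })
# => []
p violations(schema, { "price" => "", "postcode" => "not a postcode" })
# => [["price", "required"], ["postcode", "pattern"]]
```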

An equivalent CSV on the Web Metadata file is:

{
	"@context": "http://www.w3.org/ns/csvw",
	"url": "http://example.com/example1.csv",
	"tableSchema": {
		"columns": [
			{
				"name": "id",
				"required": true,
				"datatype": { "base": "integer" }
			},
			{
				"name": "price",
				"required": true,
				"datatype": { "base": "string", "minLength": 1 }
			},
			{
				"name": "postcode",
				"required": true
			}
		]
	}
}

Parsing and validating with a schema (of either kind):

schema = Csvlint::Schema.load_from_json(uri)
validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, schema )

CSV on the Web Validation Support

This gem passes all the validation tests in the official CSV on the Web test suite (though there might still be errors or parts of the CSV on the Web standard that aren't tested by that test suite).

JSON Table Schema Support

Supported constraints:

Supported data types (this is still a work in progress):

Use of an unknown data type will result in the column failing to validate.

Schema validation provides some additional types of error and warning messages:

You can also provide an optional options hash as the fourth argument to Validator#new. Supported options are:

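# limit_lines: only check this number of lines of the CSV file
options = {
  limit_lines: 100
}
validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, nil, options )

# lambda: code to be called as each line is validated; it receives the validator.
# Here it prints the current line number for every line validated.
options = {
  lambda: ->(validator) { puts validator.current_line }
}
validator = Csvlint::Validator.new( "http://example.org/data.csv", nil, nil, options )
=> 1
2
3
4
.....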
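
The interplay of a line limit and a per-line callback can be sketched without the gem. LineWalker below is a hypothetical stand-in, not a csvlint class: it only models how the two options might be handled, and whether csvlint combines them in exactly this way is an assumption.

```ruby
# Hypothetical stand-in for a validator; only the options handling is modelled.
class LineWalker
  attr_reader :current_line

  def initialize(text, options = {})
    @text = text
    @limit_lines = options[:limit_lines]
    @callback = options[:lambda]
    @current_line = 0
  end

  # Visit each line, stopping once the optional limit is reached and
  # invoking the optional callback with this object after each line.
  def walk
    @text.each_line do |_line|
      break if @limit_lines && @current_line >= @limit_lines

      @current_line += 1
      @callback&.call(self)
    end
    self
  end
end

seen = []
options = {
  limit_lines: 3,
  lambda: ->(walker) { seen << walker.current_line }
}
LineWalker.new("a,b\n1,2\n3,4\n5,6\n7,8\n", options).walk
p seen # => [1, 2, 3]
```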
