Chunks syntax: characters allowed for types, names, and ids #7

Open
tidoust opened this issue Jun 4, 2020 · 2 comments

tidoust commented Jun 4, 2020

The chunk.js implementation suggests that names are composed of letters and digits, as well as a restricted set of punctuation characters.

However, the description of @rdfmap suggests that chunk property values could be IRIs:

@rdfmap {
  dog http://example.com/ns/dog
  cat http://example.com/ns/cat
}

In practice, I wonder which characters are allowed for types, names, and ids. It seems to me that allowing IRIs (as done in JSON-LD) could also help with mapping to the semantic world, and that it would allow reasoning about things. For instance, I could have:

website https://example.org/ {
  name "An example page"
}

One problem is that commas are allowed in IRIs, which makes them problematic for use in a comma-separated list of property values. A solution is to simply use spaces as the separator between values, or to mandate escaping of commas in IRIs.
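
To make the two options concrete, here is a minimal sketch. It assumes a backslash is used as the escape character (`\,`), which is only an assumption for illustration, and `splitValues` is a hypothetical helper, not something taken from chunk.js.

```javascript
// Hypothetical helper (not part of chunk.js): split a list of property
// values on commas, treating "\," as an escaped, literal comma.
// The backslash escape convention is an assumption made for this sketch.
function splitValues(text) {
  const values = [];
  let current = "";
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "\\" && text[i + 1] === ",") {
      current += ","; // keep the comma as part of the value
      i++;            // skip over the escaped comma
    } else if (ch === ",") {
      values.push(current.trim()); // unescaped comma ends the value
      current = "";
    } else {
      current += ch;
    }
  }
  values.push(current.trim());
  return values;
}

// Option 1: whitespace as the separator, commas need no special handling.
const spaceSeparated = "http://example.com/a,b http://example.com/c".split(/\s+/);

// Option 2: comma as the separator, commas inside IRIs are escaped.
const escaped = splitValues("http://example.com/a\\,b, http://example.com/c");

console.log(spaceSeparated);
console.log(escaped);
```

Both calls yield the same two IRIs. The difference is only in who pays the cost: authors (escaping) or the list syntax (giving up commas).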

draggett commented Jun 4, 2020

The JavaScript implementation currently uses the following regular expressions:

number: /^[-+]?[0-9]+.?[0-9]*([eE][-+]?[0-9]+)?$/
name: /^(\*|(@)?[\w|\d|\.|_|-|\/|:]+)$/
iso8601: /^\d{4}(-\d\d(-\d\d(T\d\d:\d\d(:\d\d)?(.\d+)?(([+-]\d\d:\d\d)|Z)?)?)?)?$/

Chunk identifiers are names, so your example with a URL for a chunk ID is fine.

Commas are really convenient for list item separators, so to allow IRIs, any commas within them should be escaped.
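
As a quick check of both points, the sketch below runs the name pattern quoted above against a few sample strings. The leading `*` is written as `\*` here, on the assumption that the backslash was lost in formatting (an unescaped `(*` is not a valid regular expression). The sample strings are made up for illustration.

```javascript
// Name pattern as quoted in this thread; the "\*" escape is an assumption.
const name = /^(\*|(@)?[\w|\d|\.|_|-|\/|:]+)$/;

// Illustrative samples only, not taken from the chunks test suite.
const samples = [
  "dog",
  "@rdfmap",
  "*",
  "https://example.org/",
  "http://example.com/ns/dog",
  "http://example.com/a,b",    // contains a comma
  "https://example.org/?q=1",  // contains "?" and "="
  "https://example.org/#frag", // contains "#"
];

// Map each sample to whether the pattern accepts it.
const results = Object.fromEntries(samples.map((s) => [s, name.test(s)]));
console.log(results);
```

With this pattern, simple URLs made of letters, digits, `.`, `/`, and `:` are accepted as names, while IRIs containing a comma, a query string, or a fragment are rejected, so those would need either escaping or a wider character set.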

tidoust commented Jun 4, 2020

I guess we can start with a restricted set of characters and open things up later on.

FWIW, \w is equivalent to [A-Za-z0-9_] and thus already includes \d and _.
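
A small sketch to illustrate the redundancy: a character class that lists `\w`, `\d`, and `_` accepts exactly the same strings as `\w` alone (without the `u` and `i` flags combined, `\w` is ASCII-only). The sample strings are arbitrary.

```javascript
// Character class with the redundant entries, and the simplified form.
const withRedundancy = /^[\w\d_]+$/;
const simplified = /^\w+$/;

// Arbitrary samples: some match, some do not.
const samples = ["abc", "ABC_123", "42", "_", "a-b", "a.b", "é"];

// True if both patterns give the same verdict on every sample.
const agree = samples.every(
  (s) => withRedundancy.test(s) === simplified.test(s)
);
console.log(agree);
```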
