[Architecture] Standards for defining value space in Registry

Antranig Basman antranig.basman at colorado.edu
Wed Feb 20 19:10:16 EST 2013


On 20/02/2013 12:14, Christophe Strobbe wrote:
> Hi Antranig, All,
>
>
> Am Mo, 18.02.2013, 21:23 schrieb Antranig Basman:
>> (...) Our
>> architectural plan has been to
>> adopt as many of the provisions of JSON Schema as possible, and where
>> not possible, to make modest
>> specialised extensions. This standard has recently gone to a new IETF
>> draft as of Jan 30th 2013, and the
>> section which seems most relevant to your question is this:
>>
>> http://json-schema.org/latest/json-schema-validation.html
>
> Thanks for reminding me of JSON Schema.
> Is this something you recommend because much of the GPII architecture is
> in JavaScript?

I would like to think that the recommendation stems from "common cause" rather than from the mere fact 
that much of the GPII architecture is in JavaScript. I think it is genuinely the case that JSON-based 
technologies are easier to adopt from a wider family of languages, even given the huge incumbency 
advantages of XML-based technologies. There are various ways of demonstrating this. Firstly, an adequate 
JSON parser could be knocked up by a bright teenager on a "rainy afternoon", whereas you may recall that 
there were significant arguments about whether any XML parser could be considered actually compliant with 
the standard even a decade after its introduction :)

Another interesting way of measuring the "excess baggage" of an interchange format is to look at its 
meta-schema. Obviously the complete ghetto in this case is the original DTD specification for XML, which 
cannot itself be expressed as a DTD at all. The JSON Schema meta-schema is here, a reasonably tidy thing 
comprising a couple of pages of JSON:

http://json-schema.org/schema

Whereas here is the normative XML Schema schema:
http://www.w3.org/TR/xmlschema-1/#normative-schemaSchema

Embarrassingly, it *still* can't be expressed without some DTD elements, and appears to me as a resoundingly 
unmanageable garbage dump.

I have less belief than Roy Fielding (the inventor of REST) in the practical value of "meta-schemas", but 
as well as being a useful benchmark of the intrinsic "tax" of using a technology, I believe they are 
something that we will find at least some use for in the GPII. I expect that clients will from time to time 
produce schemas and that we will want to validate them. Even if we come to believe that the expressive 
power of JSON Schema is strictly less than that of an alternative, I think it is also far clearer how we 
would set about defining our own extensions and custom validation rules within the scope of that standard 
than within others that I know about.
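
As a rough illustration of that last point, here is a minimal sketch (in Python, purely for brevity) of 
how a validator for scalar values might honour a few standard keywords ("type", "enum", "minimum", 
"maximum") and then hand any extension keywords to a registry of custom checkers. Everything named here is 
an assumption made for the example: the "x-hexColour" keyword, the registry and the function names are 
hypothetical, and this is neither a conforming JSON Schema implementation nor anything that exists in the 
GPII today.

```python
import re

# Python types accepted for a few of the JSON primitive type names (scalars only).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}

# Hypothetical registry for our own extension keywords (not part of JSON Schema).
CUSTOM_KEYWORDS = {}


def custom_keyword(name):
    """Decorator that registers a checker function for an extension keyword."""
    def register(checker):
        CUSTOM_KEYWORDS[name] = checker
        return checker
    return register


@custom_keyword("x-hexColour")
def check_hex_colour(value, enabled):
    # Illustrative rule only: accept "#RRGGBB" strings when the keyword is true.
    is_hex = isinstance(value, str) and re.fullmatch(r"#[0-9A-Fa-f]{6}", value)
    if enabled and not is_hex:
        return "expected a colour of the form #RRGGBB"
    return None


def is_number(value):
    # bool is a subclass of int in Python, so it has to be excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate(value, schema):
    """Return a list of error messages; an empty list means the value is valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        matches = isinstance(value, JSON_TYPES[expected])
        if expected in ("number", "integer") and isinstance(value, bool):
            matches = False
        if not matches:
            errors.append("expected type " + expected)
    if "enum" in schema and value not in schema["enum"]:
        errors.append("value is not one of the enumerated options")
    if is_number(value):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append("value is below minimum")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append("value is above maximum")
    # Extension point: run every registered custom keyword the schema mentions.
    for keyword, checker in CUSTOM_KEYWORDS.items():
        if keyword in schema:
            message = checker(value, schema[keyword])
            if message:
                errors.append(message)
    return errors
```

With a made-up font size schema {"type": "integer", "minimum": 8, "maximum": 72}, validate(12, schema) 
returns an empty list and validate(4, schema) returns ["value is below minimum"]. The standard keywords 
and the extensions go through the same entry point, so a new value space only needs a new registered 
checker.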

>> The primitive types these rules refer to are these:
>>
>> http://json-schema.org/latest/json-schema-core.html#anchor8
>>
>> Where values are scalars it seems they can be easily accommodated in this
>> system. More complex types such as
>> languages and colours would require appropriate references to other
>> relevant specifications - for example
>> last year I believe you determined that our choice for language value
>> spaces would be covered by IETF BCP 47
>> (mailing of Oct 11 2012)
>
> Right. The kinds of data that may be stored as values is essentially
> open-ended, and I think this has implications for the level of validation
> we can expect to happen. For example, for language tags that should
> conform to IETF BCP 47:
> * a superficial check may be just checking that the tag is a string,
> * a more thorough check may be a regular expression (which dialect??) that
> checks the language tag against the syntax described in IETF BCP
> 47,
> * the "grand cru" check (on top of checking syntax) would involve checking
> that the value of each subtag in the language tag is valid, i.e. that it
> is registered at the corresponding authority.
>
> The last kind of check would be possible for certain known datatypes, but
> most implementations would probably never go that far.
>
> Any other thoughts?

Yes, I think this is quite correct. We should expect to have a rich "ecology" of different levels of 
stringency of schema and other validation depending on the purpose and context, and prepare for such an 
ecology.
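
To make the three levels you describe concrete, here is a minimal sketch in Python. It rests on loud 
assumptions: the regular expression covers only the language[-script][-region][-variant] shape and is NOT 
the full BCP 47 grammar (no extensions, private use or grandfathered tags), and the two sets are a tiny 
hand-picked sample standing in for the real subtag registry. The function names are made up for the 
example. On the "which dialect" question, as far as I read the draft, the "pattern" keyword refers to the 
ECMA 262 dialect; the expression below sticks to constructs that mean the same there and in Python.

```python
import re

# Simplified shape only: language, optional script, optional region, variants.
SIMPLE_LANGTAG = re.compile(
    r"[A-Za-z]{2,3}"                                 # primary language subtag
    r"(-[A-Za-z]{4})?"                               # optional script, e.g. Hant
    r"(-([A-Za-z]{2}|[0-9]{3}))?"                    # optional region, e.g. US or 419
    r"(-([A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*"    # zero or more variants
)

# Tiny sample standing in for the registry; a real check would load the full list.
KNOWN_LANGUAGES = {"en", "de", "fr", "es"}
KNOWN_REGIONS = {"US", "GB", "DE", "FR", "ES", "CA"}


def check_type(tag):
    """Level 1, superficial: the value is a string at all."""
    return isinstance(tag, str)


def check_syntax(tag):
    """Level 2: the string has the (simplified) shape of a language tag."""
    return check_type(tag) and SIMPLE_LANGTAG.fullmatch(tag) is not None


def check_registered(tag):
    """Level 3, "grand cru": syntax is fine and the subtags are known.

    This sketch only looks up the language and any two-letter region;
    scripts, variants and numeric regions pass unchecked.
    """
    if not check_syntax(tag):
        return False
    subtags = tag.split("-")
    if subtags[0].lower() not in KNOWN_LANGUAGES:
        return False
    for subtag in subtags[1:]:
        is_alpha_region = len(subtag) == 2 and subtag.isalpha()
        if is_alpha_region and subtag.upper() not in KNOWN_REGIONS:
            return False
    return True
```

A tag such as "xx-YY" passes the first two levels but fails the third, which is exactly the gap between 
a syntax check and a registry check. Each consumer can then pick the level that suits its purpose.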

Cheers,
Antranig
