More about OpenDJ support for JSON attribute values

In a previous post, I introduced the new JSON syntax, JSON query and matching rules that are delivered as part of the OpenDJ LDAP directory server. Today, I will give more insights on how to customise the syntax, tune the matching rules for smarter and more efficient indexing, and I will highlight some best practices with using the JSON syntax.

JSON Syntax Validation

When defining an attribute with a JSON syntax, the server will validate that the JSON value is compliant with JSON RFC.  OpenDJ offers a few options to relax some of the constraints of a valid JSON. To change the settings of the syntax, you must use dsconfig --advanced.

>>>> Configure the properties of the Core Schema

Property Value(s)
 ----------------------------------------------------------------------
 1) allow-attribute-types-with-no-sup-or-syntax true
 2) allow-zero-length-values-directory-string false
 3) disabled-matching-rule NONE
 4) disabled-syntax NONE
 5) enabled true
 6) java-class org.opends.server.schema.CoreSchemaProvider
 7) json-validation-policy strict
 8) strict-format-certificates true
 9) strict-format-country-string true
 10) strict-format-jpeg-photos false
 11) strict-format-telephone-numbers false
 12) strip-syntax-min-upper-bound-attribute-type-description false

?) help
 f) finish - apply any changes to the Core Schema
 c) cancel
 q) quit

Enter choice [f]: 7


>>>> Configuring the "json-validation-policy" property

Specifies the policy that will be used when validating JSON syntax values.

Do you want to modify the "json-validation-policy" property?

1) Keep the default value: strict
 2) Change it to the value: disabled
 3) Change it to the value: lenient

?) help
 q) quit

Enter choice [1]:

Strict is the default mode.

Disabled means that the server will not try to validate the content of a JSON value.

Lenient means that it will validate the JSON value, but tolerate comments, single quotes and unquoted control characters.

JSON Matching Rule and Indexing

Like any attribute in the OpenDJ server, attributes with a JSON syntax can be indexed.

$ dsconfig -h localhost -p 4444 \
  -D "cn=Directory Manager" -w secret12 -X -n \
 set-backend-index-prop \--backend-name userRoot \
 --index-name json --set index-type:equality

By default, the server actually indexes each field of all JSON values. If the values are large and complex, indexing will  result in many disk I/O, possibly impacting performances for write operations.

If you know which fields of the JSON values will be queried for by the client applications, you can optimise the index and specify the JSON fields that are indexed. This is by creating a new custom schema provider for the JSON query. You can choose to overwrite the default JSON query matching rules (as illustrated below), and this will affect all JSON attributes, or you can choose to create a new rule (with a new name and OID).

In the example below, the custom schema provider overwrites the default caseIgnoreJsonQueryMatch, and only indexes the JSON fields _id and name with its subfields.

$ dsconfig -h localhost -p 4444 \
  -D "cn=Directory Manager" -w secret12 -X -n \
 create-schema-provider --provider-name "Json Schema" \
 --type json-schema --set enabled:true \
 --set case-sensitive-strings:false \
 --set ignore-white-space:true \
 --set matching-rule-name:caseIgnoreJsonQueryMatch \
 --set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.1 \
 --set indexed-field:_id \
 --set "indexed-field:name/**" 

When you overwrite the default matching rule, or you define a new one, you need to rebuild the indexes for all attributes that are making use of it.

Best Practices

The support for JSON attributes in OpenDJ is very new, but yet, we can recommend how to best use them.

The first thing, is to use the JSON syntax for attributes that are single valued. Indexing is designed to associate values with entries. Because JSON query indexes are built for all fields of the JSON objects, an entry will be returned if a query matches all fields, even though they are in different objects.

The JSON syntax is handy to store complex JSON objects in a single attribute and query them, through any field. However, the larger the values, the  more impact on the directory server’s performances. As, by default, all JSON fields are indexed, the more fields, the more expensive will be indexing. Also, because the JSON objects are LDAP attributes, the only way to change a value is to replace the value with a new one (or delete the value and add a new one, which are operations with even more bytes). There is no patch operation on the value. Finally, OpenDJ stores all attributes of an entry in a single database record. So any change in the entry itself will require to write the whole entry again.

As we’ve seen above, OpenDJ proposes a way to customise the JSON queries and the JSON fields that are indexed. We suggest that you make use of this capability and optimise the indexing of JSON objects for the queries run by the client applications.

If you plan to store different kinds of JSON objects in an OpenDJ directory service, define different attributes with the JSON syntax, and use a custom JSON query per attribute. For example, lets assume you will have entries that are persons with an address attribute with a JSON syntax, and some other entries that represent OAuth2 tokens, and the token main attribute has a JSON syntax. You should define an address attribute and a token attribute, both with the JSON syntax, but their specific matching rules, like below.

attributeTypes: ( 1.3.6.1.4.1.36733.2.1.1.999 NAME 'address'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1
  EQUALITY caseIgnoreJsonAddressQueryMatch SINGLE-VALUE )

attributeTypes: ( 1.3.6.1.4.1.36733.2.1.1.999 NAME 'token'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 
  EQUALITY caseIgnoreJsonTokenQueryMatch SINGLE-VALUE )

where the matching rules are defined as such:

$ dsconfig -h localhost -p 4444 \
  -D "cn=Directory Manager" -w secret12 -X -n \
 create-schema-provider \
 --provider-name "Address Json Schema" \
 --type json-schema --set enabled:true \
 --set case-sensitive-strings:false \
 --set ignore-white-space:true \
 --set matching-rule-name:caseIgnoreJsonAddressQueryMatch \
 --set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.998

and

$ dsconfig -h localhost -p 4444 \
  -D "cn=Directory Manager" -w secret12 -X -n \
 create-schema-provider \
 --provider-name "Token Json Schema" \
 --type json-schema --set enabled:true \
 --set case-sensitive-strings:false \
 --set ignore-white-space:true \
 --set matching-rule-name:caseIgnoreJsonTokenQueryMatch \
 --set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.999 \
 --set indexed-field:token_type \
 --set indexed-field:expires_at \
 --set indexed-field:access_token

Note that there is an issue with OpenDJ 4.0.0-SNAPSHOTS (nightly builds) and when you define a new Schema Provider, you need to restart the server to have it be effective.

One thought on “More about OpenDJ support for JSON attribute values

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s