Better index troubleshooting with ForgeRock DS / OpenDJ

Many years ago, I wrote about troubleshooting indexes and search performances, explaining the magicdebugSearchIndex” operational attribute, that allows an administrator to get from the server information about the processing of indexes for a specific search query.

The returned value provides insights on the indexes that were used for a particular search, how they were used and how the resulting set of candidates was built, allowing an administrator to understand whether indexes are used optimally or need to be tailored better for specific search queries and filters, in combination with access logs and other tools such as backendstat.

In DS 6.5, we’ve made some improvements in the search filter processing and we’ve changed the format of the debugSearchIndex value to provide a better reporting of how indexes are used.

The new format is now JSON based, which allow to give it more structure and all could be processed programatically. Here are a few examples of output of the new debugSearchIndex attribute values.

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "dc=example,dc=com" "(&(cn=*Den*)(mail=user.19*))" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"intersection":[{"index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"ser.19","candidates":111,"retained":111},{"index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"user.1","candidates":1111,"retained":111},
{"filter":"(cn=*Den*)", "index":"cn.caseIgnoreSubstringsMatch:6",
"range":"[den,deo[","candidates":103,"retained":5}], "candidates":5},"final":5}

Let’s look at the debugSearchIndex value and interpret it:

{
"filter": {
"intersection": [
{
"index": "mail.caseIgnoreIA5SubstringsMatch:6",
"exact": "ser.19",
"candidates": 111,
"retained": 111
},
{
"index": "mail.caseIgnoreIA5SubstringsMatch:6",
"exact": "user.1",
"candidates": 1111,
"retained": 111
},
{
"filter": "(cn=*Den*)",
"index": "cn.caseIgnoreSubstringsMatch:6",
"range": "[den,deo[",
"candidates": 103,
"retained": 5
}
],
"candidates": 5
},
"final": 5
}

The filter had 2 components: (cn=*Den*) and (mail=user.19*). Because the whole filter is an AND, the result set is an intersection of several index lookups. Also, both substring filters, but one is a substring of 3 characters and the second one a substring of 7 characters. By default, substring indexes are built with substrings of 6 characters. So the filters are treated differently. The server optimises the processing of indexes so that it will try to first to use the queries that are the most effective. In the case above, the filter (mail=user.19*) is preferred. 2 records are read from the index, and that results in a list of 111 candidates. Then, the server use the remaining filter to narrow the result list. Because the string Den is shorter than the indexed substrings, the server scans a range of keys in the index, starting from the first key match “den” and stopping before the key that matches “deo”. This results in 103 candidates, but only 5 are retained because they were parts of the previous result set. So the result is 5 entries that are matching these filters.

Note the [den,deo[ notation is similar to mathematical Set representation where [ and ] indicate whether a set includes or excludes the boundaries.

Let’s take an example with an OR filter:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "dc=example,dc=com" "(|(cn=*Denice*)(uid=user.19))" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"union":[{"filter":"(cn=*Denice*)", "index":"cn.caseIgnoreSubstringsMatch:6","exact":"denice","candidates":1}, {"filter":"(uid=user.19)", "index":"uid.caseIgnoreMatch","exact":"user.19","candidates":1}],"candidates":2},"final":2}

As you can see, the result is now a union of 2 exact match (i.e. reads of index keys), each resulting a 1 candidate.

Finally here’s another example, where the scope is used to attempt to reduce the candidate list:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "ou=people,dc=example,dc=com" -s one "(mail=user.1)" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"filter":"(mail=user.1)","index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"user.1","candidates":1111},"scope":{"type":"one","candidates":"[NOT-INDEXED]","retained":1111},"final":1111}

You can find more information and details about the debugsearchindex attribute in the ForgeRock Directory Services 6.5 Administration Guide.

2 thoughts on “Better index troubleshooting with ForgeRock DS / OpenDJ

  1. Nicolas Seigneur 08 January 2019 / 19:32

    Thanks for sharing Ludo, this is great to see improvement on this!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s