OpenDJ: Analyzing Search Filters and Indexes

LDAP directory services greatly rely on indexes to provide fast and accurate search results.

OpenDJ, the open source LDAP directory services for the Java platform, provides a number of tools to ensure indexes are efficiently used or to optimize them for even better performances.

To start with, OpenDJ rejects by default all unindexed searches, unless the authenticated user has the privilege to perform them. Unindexed searches are rejected because they result in scanning the whole database, which consumes lots of resources and time. There are legitimate uses of unindexed search though, and OpenDJ offers a way to control who can perform them through a privilege. To learn more about privileges, how to grant them, please check the Administration Guide or some of my previous posts.

When unindexed searches are completed, OpenDJ (starting with revision 7148 of the OpenDJ trunk, and therefore OpenDJ 2.5) does logs the “Unindexed” keyword as part of the Search Response access log message. But the access log file can also be used to identify search operations that are not making an optimal use of indexes. Simply check for those search responses that have been returned with an etime (execution time) greater than the average.

The access log example below contains both an unusually high etime (expressed in ms) and the Unindexed tag.

[27/Jul/2011:20:27:27 +0200] SEARCH RES conn=0 op=1 msgID=2 result=0 nentries=10001 Unindexed etime=1846

The verify-index command let you check that no index is corrupted (i.e. no data is missing from indexes).

The rebuild-index command let you build or rebuild an index that would be corrupted or had its configuration changed.

One of the tuning parameter of indexes is the index-entry-limit (which was known in Sun DSEE as the AllIDsThreshold), the maximum size of entries kept in an index record, before the server stop maintaining that record and consider it’s more efficient to scan the whole database. For more information on the index entry limit, check the Section 7.2.4 Changing Index Entry Limits of the Indexing chapter of the Administration Guide.

OpenDJ provides a static analyzer of indexes which can help to understand how well the attributes are indexed, as well as help to tune the index entry limit. This tool is a function of the dbtest utility and is simply used as follow:

$ bin/dbtest list-index-status -n userRoot -b "dc=example,dc=com"

Index Name Index Type JE Database Name Index Valid Record Count Undefined 95% 90% 85%

---------------------------------------------------------------------------------------------------------------------------------------
id2children                Index       dc_example_dc_com_id2children                true         2             0          0    0    0
id2subtree                 Index       dc_example_dc_com_id2subtree                 true         2             0          0    0    0
uid.equality               Index       dc_example_dc_com_uid.equality               true         2000          0          0    0    0
aci.presence               Index       dc_example_dc_com_aci.presence               true         0             0          0    0    0
ds-sync-conflict.equality  Index       dc_example_dc_com_ds-sync-conflict.equality  true         0             0          0    0    0
givenName.equality         Index       dc_example_dc_com_givenName.equality         true         2000          0          0    0    0
givenName.substring        Index       dc_example_dc_com_givenName.substring        true         5777          0          0    0    0
objectClass.equality       Index       dc_example_dc_com_objectClass.equality       true         6             0          0    0    0
member.equality            Index       dc_example_dc_com_member.equality            true         0             0          0    0    0
uniqueMember.equality      Index       dc_example_dc_com_uniqueMember.equality      true         0             0          0    0    0
cn.equality                Index       dc_example_dc_com_cn.equality                true         2000          0          0    0    0
cn.substring               Index       dc_example_dc_com_cn.substring               true         19407         0          0    0    0
sn.equality                Index       dc_example_dc_com_sn.equality                true         2000          0          0    0    0
sn.substring               Index       dc_example_dc_com_sn.substring               true         8147          0          0    0    0
telephoneNumber.equality   Index       dc_example_dc_com_telephoneNumber.equality   true         2000          0          0    0    0
telephoneNumber.substring  Index       dc_example_dc_com_telephoneNumber.substring  true         16506         0          0    0    0
ds-sync-hist.ordering      Index       dc_example_dc_com_ds-sync-hist.ordering      true         1             0          0    0    0
mail.equality              Index       dc_example_dc_com_mail.equality              true         2000          0          0    0    0
mail.substring             Index       dc_example_dc_com_mail.substring             true         7235          0          0    0    0
entryUUID.equality         Index       dc_example_dc_com_entryUUID.equality         true         2002          0          0    0    0

Total: 20

If an index contains a non zero value (N) in the undefined column, it means N index keys have reached the index entry limit and are no longer maintained. This can be normal, for example with the ObjectClass equality index, where the vast majority of entries will have the same objectclasses (top, Person, organizationalPerson, inetOrgPerson). But, for other attributes, such as cn, it may indicate that the index entry limit is too low.

Finally, OpenDJ has an option to do a live analysis of search filters and how they use indexes. To enable live index analysis, simply enable it for the database backend that contains the data :

dsconfig set-backend-prop --backend-name userRoot  --set index-filter-analyzer-enabled:true \
 --set max-entries:50 -h localhost -p 4444 -D cn=Directory\ Manager -w ****** -n -X

The max-entries parameter specifies how many filter items are being analyzed and kept in memory. Only the last max-entries will be kept. If there is a huge variety of requests against the directory service, you might want to increase the number. However, keep in mind that the analysis is kept in memory, and the higher the number the largest the impact on the overall performances of the server.

We do not recommend that you leave the index analysis enabled all the time, especially in production. The index analyzer should be used to gather statistics over a flow of requests for a short period of time, and should be disabled afterwards to free the resources.

The result of the index analyzer can be retrieved under the cn=monitor suffix, more specifically as part of the database environment of the backend.

$ bin/ldapsearch -p 1389 -D cn=directory\ manager -w secret12  \
-b "cn=userRoot Database Environment,cn=monitor" '(objectclass=*)' filter-use

dn: cn=userRoot Database Environment,cn=monitor
filter-use: (uid=user.*) hits:1 maxmatches:20 message:
filter-use: (tel=*) hits:1 maxmatches:-1 message:presence index type is disabled
  for the tel attribute
filter-use: (objectClass=groupOfURLs) hits:1 maxmatches:0 message:
filter-use: (objectClass=groupOfEntries) hits:1 maxmatches:0 message:
filter-use: (objectClass=person) hits:1 maxmatches:20 message:
filter-use: (objectClass=ds-virtual-static-group) hits:1 maxmatches:0 message:
filter-use: (aci=*) hits:1 maxmatches:0 message:
filter-use: (objectClass=groupOfNames) hits:1 maxmatches:0 message:
filter-use: (objectClass=groupOfUniqueNames) hits:1 maxmatches:0 message:
filter-use: (objectClass=ldapSubentry) hits:1 maxmatches:0 message:
filter-use: (objectClass=subentry) hits:1 maxmatches:0 message:

hits represents the number of time this filter was used. the maxmatches represents the maximum number of entries that were returned for that filter.

Index analysis and tuning is not a simple task, and I recommend to play with these tools  a lot on a test environment to understand how to get the best out of them. But, as you can see, OpenDJ provides you with all the tools you need to get the best performances out of your LDAP directory.

7 thoughts on “OpenDJ: Analyzing Search Filters and Indexes

  1. Matt Swift 12 August 2011 / 13:11

    There’s also a poorly documented feature for debugging searches which is very easy to use. Simply specify the debugsearchindex attribute in your search to have OpenDJ report how the search would have been processed:

    $ ./bin/ldapsearch -h localhost -p 1389 -D cn=directory\ manager -w password \
    -b “dc=example,dc=com” -s sub -T “(uid=user.10)” debugsearchindex
    dn: cn=debugsearch
    debugsearchindex: filter=(uid=user.10)[INDEX:uid.equality][COUNT:1] final=[COUNT:1]

    Whereas:

    $ ./bin/ldapsearch -h localhost -p 1389 -D cn=directory\ manager -w password \
    -b “dc=example,dc=com” -s sub -T “(objectClass=person)” debugsearchindex
    dn: cn=debugsearch
    debugsearchindex: filter=(objectClass=person)[INDEX:objectClass.equality][LIMIT-EXCEEDED] scope=wholeSubtree[LIMIT-EXCEEDED:10002] final=[LIMIT-EXCEEDED:10002]

    The magic attribute “debugsearchindex” tells OpenDJ to not do the search but instead return a fake entry containing debug information describing how the search would have been processed. In the second example you can see that it attempted to use objectClass equality index but went allids, and then tried id2subtree which also went allids. The final outcome was that the search was allids (final=[LIMIT-EXCEEDED:10002]).

  2. Georgi 14 April 2012 / 22:23

    For some reason I get 404 when I try to click on the Admin Guide links..

    • Ludo 15 April 2012 / 21:08

      Hi Georgi, thanks for the notice. We have changed the structure of the documentation to shorten the name and this has broken many links.
      I am fixing the few links I have here.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s