Explaining index-entry-limit in ForgeRock Directory Services / OpenDJ

A few years ago, I explained the various resource limits in OpenDJ, the open source LDAP and REST directory server. A few months ago, someone read the post and asked on Twitter about the index-entry-limit:

[Screenshot of the tweet asking how index-entry-limit relates to other administrative limits]

The index-entry-limit is probably the least understood parameter in the OpenDJ directory server, just as the AllIDs threshold was in Sun Directory Server (and its siblings: Netscape Directory, Red Hat Directory, Oracle DSEE…). So before I dive into explaining what this parameter is, how it is used and how it can be tuned, let me start by answering the question: how does index-entry-limit relate to other administrative limits?

Answer: it doesn’t! The index-entry-limit is an internal limit and does not actually limit the results returned to clients. It only limits the resources consumed when processing indexes.

A directory server is a very specialized data store based on the LDAP standard, and its primary goal is to search and return user information such as email addresses, names and phone numbers, very quickly and for a large number of different clients. For that, directory servers were designed to favor reads over writes, and read optimization is achieved through the use of indexes.

In LDAP, a search request (which can be used to read a single entry or to search for one or more entries across the whole database) contains a search filter. The filter may be simple or complex, and is composed of one or more attribute value assertions.

A simple filter can be “(sn=Smith)”. Complex filters combine operators and different attributes: “(&(objectclass=Person)(|(sn=Smith)(cn=*Smith*)))” – find a person whose surname is Smith or whose common name contains Smith.
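As a quick illustration (hostname, port and bind credentials below are just placeholders), such a filter can be passed directly to the ldapsearch command line tool:

$ ldapsearch -h localhost -p 1389 -D "cn=Directory Manager" -b "dc=example,dc=com" \
  "(&(objectclass=Person)(|(sn=Smith)(cn=*Smith*)))" cn mail telephoneNumber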

When the ForgeRock Directory Server / OpenDJ receives a search request, it processes it in two phases. In the first phase, it analyzes the search filter to identify which attributes are indexed, and then uses these indexes to build a list of possible candidates to return. If no attributes are indexed, or if the list is too large, the server decides that the list is actually the whole database. Such a search request is tagged as “unindexed”, and the server verifies that the authenticated user has the “unindexed-search” privilege before continuing. In the second phase, it reads all the candidates from the database and evaluates the full filter against each entry to decide whether or not to return it to the client (subject to access controls).

ForgeRock DS / OpenDJ implements attribute indexes as reverse indexes: for a specific attribute, we keep, for each unique value, the list of entries that contain that value. Maintaining a large list of entries for each value of every indexed attribute can have a big cost, both in terms of memory usage and disk I/O (remember that when you add an entry to the directory, all of its indexed attributes need to be updated). So we introduced a limit on the number of entries that an index record can contain: the index-entry-limit. For example, if the number of entries that contain the objectClass person exceeds the limit, we mark the key as “full” and consider that the list of candidates is actually the whole set of directory entries. This saves us from updating and reading a very long record, and allocating lots of data, only to end up iterating through almost all entries.

You might ask: why have an index for objectClass at all, then? Well, in a directory server that contains millions of users, there are in fact very few entries that are not persons. These entries still have their objectClass values indexed, and searching for them is very efficient thanks to the index.

The index-entry-limit is a limit on the number of entries contained in a single index record, per value of an attribute index. Its default value is 4000, which works for most medium to large scale deployments. So, why is it a configurable parameter, and when should you change it?

Because ForgeRock DS is used in many different environments, with various use cases and a great range of entry counts (some of our customers have over 100 million entries in a directory service), we know that one size doesn’t fit all. But the default value works for most index usages. Also, the index-entry-limit can be set for each individual index, or for the whole backend (in which case the value applies to all indexes that don’t have a specific value). It is highly recommended that you only change the index-entry-limit of specific indexes, and not the backend default value.

In no case should you increase the index-entry-limit to a value close to the total number of entries in the directory. This would undermine the performance of both searches and updates, and significantly increase the footprint of the data stored on disk.

There are a few known cases where the index-entry-limit value should be changed (and equally, cases where increasing the value will only consume more resources for no performance gain). Also keep in mind that when you change the index-entry-limit, you need to rebuild the indexes for which the limit was changed. So it’s not something that you want to do too often, and definitely not something that you need to adjust constantly.

Groups. When the server starts, it issues an internal search to find all group entries and caches them for better performance. The search is based on the objectClass attribute. If there are more than 4000 groups of one kind (the search covers GroupOfNames, GroupOfUniqueNames, GroupOfEntries, DynamicGroup and ds-virtual-static-group), the search will be unindexed and can take a long time to complete. In that case, you should increase the index-entry-limit for the objectClass attribute to a value just above the number of groups.

Members (or uniqueMembers). If you have more than 4000 static groups, and you know that some users are likely to be members of more than 4000 groups, then you should also increase the index-entry-limit for the member attribute (or uniqueMember) to a value just above the maximum number of groups a user can belong to, especially if you have enabled the Referential Integrity plugin (which removes a user from groups when their entry is deleted).
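For illustration, raising the limit on the member index and rebuilding it could look like the commands below. This is only a sketch: the value 6000 is arbitrary, the dsconfig subcommand is named set-backend-index-prop in recent ForgeRock DS releases but set-local-db-index-prop in older OpenDJ versions, and rebuild-index must be run offline against a stopped server or scheduled as a task on a running one.

$ dsconfig set-backend-index-prop \
    --backend-name userRoot --index-name member \
    --set index-entry-limit:6000 \
    -h localhost -p 4444 -D "cn=Directory Manager" -w password -X -n

$ rebuild-index -b dc=example,dc=com -i member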

Another typical use case for increasing the index-entry-limit is when you have millions of entries and an attribute doesn’t have a flat distribution of values. Think about users’ surnames: in a wide range of populations, there are probably more “Smith” or “Lee” than “Washington”. Within 10M users, would there be more than 4000 “Lee”? If that is likely, and the server receives searches with filters such as “(sn=Lee)”, then you should consider increasing the limit for the sn attribute.

Backendstat is the tool to use to verify the state of the indexes and whether some records have reached the index-entry-limit. For some attributes, such as objectClass, it is normal that the limit is reached. For others, such as sn, it’s probably something you want to check regularly.

The backendstat tool requires exclusive access to the database, and thus can only run against a server that is stopped (or a backup).

To list the indexes, use backendstat list-indexes:

$ backendstat list-indexes -b dc=example,dc=com -n userRoot

Index Name Raw DB Name Type Record Count
dn2id /dc=com,dc=example/dn2id DN2ID 10002
id2entry /dc=com,dc=example/id2entry ID2Entry 10002
referral /dc=com,dc=example/referral DN2URI 0
id2childrencount /dc=com,dc=example/id2childrencount ID2ChildrenCount 3
state /dc=com,dc=example/state State 18
uniqueMember.uniqueMemberMatch /dc=com,dc=example/uniqueMember.uniqueMemberMatch MatchingRuleIndex 0
mail.caseIgnoreIA5SubstringsMatch:6 /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6 MatchingRuleIndex 31232
mail.caseIgnoreIA5Match /dc=com,dc=example/mail.caseIgnoreIA5Match MatchingRuleIndex 10000
aci.presence /dc=com,dc=example/aci.presence MatchingRuleIndex 0
member.distinguishedNameMatch /dc=com,dc=example/member.distinguishedNameMatch MatchingRuleIndex 0
givenName.caseIgnoreMatch /dc=com,dc=example/givenName.caseIgnoreMatch MatchingRuleIndex 8605
givenName.caseIgnoreSubstringsMatch:6 /dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6 MatchingRuleIndex 19629
telephoneNumber.telephoneNumberSubstringsMatch:6 /dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6 MatchingRuleIndex 73235
telephoneNumber.telephoneNumberMatch /dc=com,dc=example/telephoneNumber.telephoneNumberMatch MatchingRuleIndex 10000
ds-sync-hist.changeSequenceNumberOrderingMatch /dc=com,dc=example/ds-sync-hist.changeSequenceNumberOrderingMatch MatchingRuleIndex 0
ds-sync-conflict.distinguishedNameMatch /dc=com,dc=example/ds-sync-conflict.distinguishedNameMatch MatchingRuleIndex 0
entryUUID.uuidMatch /dc=com,dc=example/entryUUID.uuidMatch MatchingRuleIndex 10002
sn.caseIgnoreMatch /dc=com,dc=example/sn.caseIgnoreMatch MatchingRuleIndex 10000
sn.caseIgnoreSubstringsMatch:6 /dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6 MatchingRuleIndex 32217
cn.caseIgnoreMatch /dc=com,dc=example/cn.caseIgnoreMatch MatchingRuleIndex 10000
cn.caseIgnoreSubstringsMatch:6 /dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6 MatchingRuleIndex 86040
objectClass.objectIdentifierMatch /dc=com,dc=example/objectClass.objectIdentifierMatch MatchingRuleIndex 6
uid.caseIgnoreMatch /dc=com,dc=example/uid.caseIgnoreMatch MatchingRuleIndex 10000

Total: 23

To check the status of the indexes and see which keys are full (i.e. have exceeded the index-entry-limit), use backendstat show-index-status. Warning: this may take a long time.

$ backendstat show-index-status -b dc=example,dc=com -n userRoot
Index Name Raw DB Name Valid Confidential Record Count Over Entry Limit 95% 90% 85%
uniqueMember.uniqueMemberMatch /dc=com,dc=example/uniqueMember.uniqueMemberMatch true false 0 0 0 0 0
mail.caseIgnoreIA5SubstringsMatch:6 /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6 true false 31232 12 0 0 0
mail.caseIgnoreIA5Match /dc=com,dc=example/mail.caseIgnoreIA5Match true false 10000 0 0 0 0
aci.presence /dc=com,dc=example/aci.presence true false 0 0 0 0 0
member.distinguishedNameMatch /dc=com,dc=example/member.distinguishedNameMatch true false 0 0 0 0 0
givenName.caseIgnoreMatch /dc=com,dc=example/givenName.caseIgnoreMatch true false 8605 0 0 0 0
givenName.caseIgnoreSubstringsMatch:6 /dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6 true false 19629 0 0 0 0
telephoneNumber.telephoneNumberSubstringsMatch:6 /dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6 true false 73235 0 0 0 0
telephoneNumber.telephoneNumberMatch /dc=com,dc=example/telephoneNumber.telephoneNumberMatch true false 10000 0 0 0 0
ds-sync-hist.changeSequenceNumberOrderingMatch /dc=com,dc=example/ds-sync-hist.changeSequenceNumberOrderingMatch true false 0 0 0 0 0
ds-sync-conflict.distinguishedNameMatch /dc=com,dc=example/ds-sync-conflict.distinguishedNameMatch true false 0 0 0 0 0
entryUUID.uuidMatch /dc=com,dc=example/entryUUID.uuidMatch true false 10002 0 0 0 0
sn.caseIgnoreMatch /dc=com,dc=example/sn.caseIgnoreMatch true false 10000 0 0 0 0
sn.caseIgnoreSubstringsMatch:6 /dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6 true false 32217 0 0 0 0
cn.caseIgnoreMatch /dc=com,dc=example/cn.caseIgnoreMatch true false 10000 0 0 0 0
cn.caseIgnoreSubstringsMatch:6 /dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6 true false 86040 0 0 0 0
objectClass.objectIdentifierMatch /dc=com,dc=example/objectClass.objectIdentifierMatch true false 6 4 0 0 0
uid.caseIgnoreMatch /dc=com,dc=example/uid.caseIgnoreMatch true false 10000 0 0 0 0
Total: 18
Index: /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6
Over index-entry-limit keys: [.com] [@examp] [ample.] [com] [e.com] [exampl] [le.com] [m] [mple.c] [om] [ple.co] [xample]
Index: /dc=com,dc=example/objectClass.objectIdentifierMatch
Over index-entry-limit keys: [inetorgperson] [organizationalperson] [person] [top]

I hope this long article helps you better understand and tune your ForgeRock Directory Servers for search performance. Please let me know how it goes.

Better index troubleshooting with ForgeRock DS / OpenDJ

Many years ago, I wrote about troubleshooting indexes and search performance, explaining the magic “debugSearchIndex” operational attribute that allows an administrator to get information from the server about the processing of indexes for a specific search query.

The returned value provides insight into which indexes were used for a particular search, how they were used, and how the resulting set of candidates was built. Combined with access logs and tools such as backendstat, it lets an administrator understand whether indexes are used optimally or need to be tailored better for specific search queries and filters.

In DS 6.5, we’ve made some improvements to the search filter processing and we’ve changed the format of the debugSearchIndex value to provide better reporting of how indexes are used.

The new format is JSON based, which gives it more structure and allows it to be processed programmatically. Here are a few examples of the new debugSearchIndex attribute values.

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "dc=example,dc=com" "(&(cn=*Den*)(mail=user.19*))" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"intersection":[{"index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"ser.19","candidates":111,"retained":111},{"index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"user.1","candidates":1111,"retained":111},
{"filter":"(cn=*Den*)", "index":"cn.caseIgnoreSubstringsMatch:6",
"range":"[den,deo[","candidates":103,"retained":5}], "candidates":5},"final":5}

Let’s look at the debugSearchIndex value and interpret it:

{
  "filter": {
    "intersection": [
      {
        "index": "mail.caseIgnoreIA5SubstringsMatch:6",
        "exact": "ser.19",
        "candidates": 111,
        "retained": 111
      },
      {
        "index": "mail.caseIgnoreIA5SubstringsMatch:6",
        "exact": "user.1",
        "candidates": 1111,
        "retained": 111
      },
      {
        "filter": "(cn=*Den*)",
        "index": "cn.caseIgnoreSubstringsMatch:6",
        "range": "[den,deo[",
        "candidates": 103,
        "retained": 5
      }
    ],
    "candidates": 5
  },
  "final": 5
}

The filter had two components: (cn=*Den*) and (mail=user.19*). Because the whole filter is an AND, the result set is an intersection of several index lookups. Both are substring filters, but one has a 3-character substring and the other a 7-character one. By default, substring indexes are built with substrings of 6 characters, so the two filters are treated differently. The server optimises index processing so that it tries to use the most effective lookups first. In the case above, the filter (mail=user.19*) is preferred: two records are read from the index, resulting in a list of 111 candidates. Then the server uses the remaining filter to narrow the result list. Because the string Den is shorter than the indexed substrings, the server scans a range of keys in the index, starting from the first key matching “den” and stopping before the key that matches “deo”. This yields 103 candidates, but only 5 are retained because they were part of the previous result set. So the result is 5 entries that match the whole filter.

Note that the [den,deo[ notation is similar to mathematical interval notation, where the brackets indicate whether the boundaries are included or excluded.

Let’s take an example with an OR filter:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "dc=example,dc=com" "(|(cn=*Denice*)(uid=user.19))" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"union":[{"filter":"(cn=*Denice*)", "index":"cn.caseIgnoreSubstringsMatch:6","exact":"denice","candidates":1}, {"filter":"(uid=user.19)", "index":"uid.caseIgnoreMatch","exact":"user.19","candidates":1}],"candidates":2},"final":2}

As you can see, the result is now a union of two exact matches (i.e. direct reads of index keys), each resulting in 1 candidate.

Finally, here’s another example, where the scope is used to attempt to reduce the candidate list:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=directory manager" -b "ou=people,dc=example,dc=com" -s one "(mail=user.1)" debugsearchindex
Password for user 'cn=directory manager': *********

dn: cn=debugsearch
debugsearchindex: {"filter":{"filter":"(mail=user.1)","index":"mail.caseIgnoreIA5SubstringsMatch:6", "exact":"user.1","candidates":1111},"scope":{"type":"one","candidates":"[NOT-INDEXED]","retained":1111},"final":1111}

You can find more information and details about the debugsearchindex attribute in the ForgeRock Directory Services 6.5 Administration Guide.

OpenDJ: Monitoring Unindexed Searches…

OpenDJ, the open source LDAP directory service, makes use of indexes to optimise search queries. When a search query doesn’t match any index, the server will cursor through the whole database to return the entries, if any, that match the search filter. These unindexed queries can require a lot of resources: I/Os, CPU… In order to reduce resource consumption, OpenDJ rejects unindexed queries by default, except for the Root DNs (i.e. for cn=Directory Manager).

In previous articles, I’ve talked about privileges for administrative accounts, and also about Analyzing Search Filters and Indexes.

Today, I’m going to show you how to monitor for unindexed searches by keeping a dedicated log file, using the traditional access logger and filtering criteria.

First, we’re going to create a new access logger, named “Searches” that will write its messages under “logs/search”.

dsconfig -D cn=directory\ manager -w secret12 -h localhost -p 4444 -n -X \
    create-log-publisher \
    --set enabled:true \
    --set log-file:logs/search \
    --set filtering-policy:inclusive \
    --set log-format:combined \
    --type file-based-access \
    --publisher-name Searches

Then we define a filtering criteria that restricts what is logged in that file: only “search” operations that are marked as “unindexed” and take more than 5000 milliseconds.

dsconfig -D cn=directory\ manager -w secret12 -h localhost -p 4444 -n -X \
    create-access-log-filtering-criteria \
    --publisher-name Searches \
    --set log-record-type:search \
    --set search-response-is-indexed:false \
    --set response-etime-greater-than:5000 \
    --type generic \
    --criteria-name Expensive\ Searches

Voila! Now, whenever a search request is unindexed and takes more than 5 seconds, the server logs the request to logs/search (on a single line) as below:

$ tail logs/search
[12/Sep/2016:14:25:31 +0200] SEARCH conn=10 op=1 msgID=2 base="dc=example,dc=com" scope=sub filter="(objectclass=*)" attrs="+,*" result=0 nentries=10003 unindexed etime=6542

This file can be monitored and used to trigger alerts to administrators, or simply used to collect and analyse the filters that result in unindexed requests, in order to better tune the OpenDJ indexes.

Note that sometimes it is a good option to leave some requests unindexed, when the cost of maintaining the index outweighs its benefits: typically requests that are infrequent, run by specific administrators for reporting purposes, and expected to return a large number of entries. In that case, a best practice is to dedicate a replica to administration and run these expensive requests against it. It also helps if the client applications are tuned to expect these requests to take a long time.

OpenDJ presented at the LavaJUG

As I mentioned last week, I presented OpenDJ and server performance in Java at the LavaJUG on Thursday.
Ludo@LavaJUG

The session was broadcast live on Google Hangouts; unfortunately, due to a nice blue screen, it comes in two parts. You can watch them here:

Part 1:

Part 2:

The slides are available on the LavaJUG Wiki

Thanks for the great reception to the whole LavaJUG and more specifically to its leaders Olivier Coupelon, Pierre Colomb, Sylvain Desgrais and Thomas Maurel.

Upcoming events: LavaJUG & Devoxx France

I will be at the LavaJUG (the Java User Group of Clermont-Ferrand, France) this Thursday from 19:00 to 21:00, presenting our experience on the OpenDJ project building a highly scalable and high performance server in Java. The presentation is based on what I’ve already presented at a few JUGs in France (AlpesJUG, MarsJUG, PoitouCharentesJUG…) and Switzerland (JUG Lausanne), but has been updated with regard to the Garbage First GC and the most recent HotSpot JVM.

 

And next week, from Wednesday March 27th to Friday March 29th, you will find ForgeRock at the Devoxx France conference.

Come to our conference session about “Enterprise Security in a Cloudy and Mobile World” (the session is in French). The session is on Friday 29th, from 11:45 to 12:35, in the Miles Davis room. Mark it on your calendar, and if you miss it, make sure you stop by our booth (B3) to say hello and talk with some of our engineers. We will also be present at the HackerGarten on Wednesday from 14:00 to 18:00, should you want to have fun with one of our open source projects: OpenAM, OpenDJ or OpenIDM.


More secure passwords!

I received an intriguing request from a customer last week: they wanted to know whether we had benchmarked the password hashing schemes available in OpenDJ, our LDAP directory service. Their fear was that, with stronger schemes, they could not sustain a high authentication rate.

In light of the LinkedIn leak of several million passwords, hashed with simple unsalted SHA-1, I decided to run a quick and simple test.

SSHA1 (salted SHA-1) is the default hashing scheme for passwords in OpenDJ. The salt is an 8-byte (64-bit) random string and is used with the password to produce the 20-byte message digest. But the OpenDJ directory server supports a wide range of password hashing schemes, and salted SHA-512 is currently the most secure hashing algorithm we support (the salt here is also an 8-byte (64-bit) random octet string).

So for the test, I generated a sample directory data set with 10,000 users and imported it into the OpenDJ directory (a 2.5 development build) with the default settings, on my laptop (MacBook Pro, 2.2 GHz Intel Core i7).

$ ldapsearch -D "cn=directory manager" -w secret12 -p 1389 -b "dc=example,dc=com" 'uid=user.10' dn userPassword
dn: uid=user.10,ou=People,dc=example,dc=com
userPassword: {SSHA}cchzM+LrPCvbZdthOC8e62d4h7a4CfoNvl6d/w==

I then ran “authrate”, a small benchmark tool that lets you stress an LDAP server with a high number of authentications (LDAP bind requests), and let it run for 5 minutes.

authrate -h localhost -p 1389 -g 'rand(0,10000)' -D "uid=user.%d,ou=people,dc=example,dc=com" -w password -c 32 -f
-----------------------------------------------------------------
 Throughput     Response Time
 (ops/second)   (milliseconds)
 recent average recent average 99.9% 99.99% 99.999% err/sec
 -----------------------------------------------------------------
 ...
 26558.0  26148.9   1.179    1.195  10.168  19.431  156.421      0.0

I then stopped the server, changed the default password storage scheme used at import time to Salted SHA-512, and reimported the data.

$ ldapsearch -D "cn=directory manager" -w secret12 -p 1389 -b "dc=example,dc=com" 'uid=user.10' dn userPassword
 dn: uid=user.10,ou=People,dc=example,dc=com
 userPassword: {SSHA512}eTGiwtTM4niUKNkEBy/9t03UdbsyYTL1ZXhy6uFnw4X0T6Y9Zf5/dS7hDIdx3/UTlUQ/9JjNV9fOg2BkmVgBhWWu5WpWKPog

And then re-ran the “authrate”:

$ authrate -h localhost -p 1389 -g 'rand(0,10000)' -D "uid=user.%d,ou=people,dc=example,dc=com" -w password -c 32 -f
 -----------------------------------------------------------------
 Throughput     Response Time
 (ops/second)   (milliseconds)
 recent average recent average 99.9% 99.99% 99.999% err/sec
 -----------------------------------------------------------------
 ...
 25481.7 25377.6 1.222 1.227 10.470 15.473 158.234 0.0

As you can see, there is not much of a difference in throughput or response time when using the strongest algorithm to hash user passwords. So do not hesitate to change the default settings and make use of the strongest password hashing schemes with OpenDJ. It could save you from the embarrassment of, one day, having to contact each of your users or customers to ask them to change their compromised password.

The default password hashing schemes are defined in two locations:

  • The default password policy for all passwords that are changed online.
dn: cn=Default Password Policy,cn=Password Policies,cn=config
ds-cfg-default-password-storage-scheme: cn=Salted SHA-512,cn=Password Storage Schemes,cn=config
  • In the Import Password Policy
dn: cn=Password Policy Import,cn=Plugins,cn=config
ds-cfg-default-user-password-storage-scheme: cn=Salted SHA-512,cn=Password Storage Schemes,cn=config

Both properties can be changed with dsconfig while the OpenDJ server is running, and the new scheme will be used for all subsequent operations.
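For example, switching both locations to Salted SHA-512 could be done along these lines (a sketch only; the policy, plugin and property names below mirror the configuration entries above, but double-check them against your version with dsconfig in interactive mode):

$ dsconfig set-password-policy-prop \
    --policy-name "Default Password Policy" \
    --set default-password-storage-scheme:"Salted SHA-512" \
    -h localhost -p 4444 -D "cn=Directory Manager" -w secret12 -X -n

$ dsconfig set-plugin-prop \
    --plugin-name "Password Policy Import" \
    --set default-user-password-storage-scheme:"Salted SHA-512" \
    -h localhost -p 4444 -D "cn=Directory Manager" -w secret12 -X -n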

About OpenDJ and Hotspot JVM G1

Duke on a bike
courtesy of Charly Hunt

Understanding and tuning the JVM is quite important to get the best performance out of OpenDJ. We do provide some high level guidance in our documentation, and I’ve been talking about Java performance over the last few years at various Java User Groups in France and Switzerland (you can find presentations in French here or here) as well as at a major conference in Brazil: FISL, in 2009. On that occasion, I was asked to give the presentation on behalf of two prestigious names in the Sun HotSpot JVM team: Charly Hunt and Tony Printezis. I spent a few hours with them and learnt a great deal about the internals of the HotSpot JVM, memory management and all the magic parameters, in order to deliver that presentation. At that time, our directory team was interacting a lot with the HotSpot team, as we were testing a new and promising garbage collector: Garbage First, aka G1. OpenDS was even wrapped and used in one of the largest collections of tests for the Sun JVM.

During the acquisition of Sun by Oracle, the future of G1 and the HotSpot JVM was uncertain, and our interactions with the HotSpot team diminished seriously.

At ForgeRock, we continued to pay attention to Garbage First, and for a long time we noticed that it wasn’t moving along. Most of the issues that were raised after tests with OpenDS, and that were addressed in some development versions of the JVM, were not integrated into official JVM releases. It is only with Oracle JVM 1.7 update 2 that we noticed the large list of issues fixed in G1. We then resumed testing OpenDJ with G1, and found that while the promise of no full GC seems to be addressed, the performance impact of G1 is still significantly high. With our limited tests using JVM heaps under 4GB, we noticed a 10% performance degradation over CMS, corresponding to an approximate 10% increase in CPU load (on a quad core machine with hyperthreading on), but with better overall response times for OpenDJ, as the maximum response time decreased from 200ms to 80ms, as illustrated below.

LDAP Modrate with Garbage First
-------------------------------------------------------------------------------
 Throughput     Response Time 
 (ops/second)   (milliseconds) 
recent average  recent average 99.9% 99.99% 99.999% err/sec Entries/Srch
-------------------------------------------------------------------------------
16196.7 16374.1  1.972 1.951  18.886 28.129 66.933  0.0
16468.8 16374.9  1.941 1.951  18.883 28.087 66.521  0.0

LDAP Modrate with CMS
-------------------------------------------------------------------------------
 Throughput     Response Time 
 (ops/second)   (milliseconds) 
recent average  recent average 99.9% 99.99% 99.999% err/sec Entries/Srch
-------------------------------------------------------------------------------
17937.1 17487.7  1.780 1.827  18.175 30.521 116.990 0.0
17783.7 17494.3  1.796 1.826  18.145 30.320 117.017 0.0
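For reference, switching collectors between the two runs only requires changing the JVM arguments in config/java.properties, then running bin/dsjavaproperties and restarting the server. The heap sizes below are placeholders, not a recommendation:

# config/java.properties for the CMS run
start-ds.java-args=-server -Xms3g -Xmx3g -XX:+UseConcMarkSweepGC

# config/java.properties for the Garbage First run
start-ds.java-args=-server -Xms3g -Xmx3g -XX:+UseG1GC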

We need to run more tests with OpenDJ and G1, especially with very large heaps (from 4 to 32GB), but we’re not sure whether G1 will be able to deliver the performance it promised.

And today I noticed on LinkedIn that both Charly Hunt and Tony Printezis, the two main engineers behind the HotSpot JVM and Garbage First, have left Oracle for new adventures: Charly has gone to Salesforce and Tony to Adobe. This is certainly a good move for both of them, but it leaves me worried about the future of the HotSpot JVM and its ability to deliver innovation in GCs.

[Update on May 6th]

It appears that more engineers from the Sun JVM team have left in the last couple of months: John Pampuch, Igor Veresov, Paul Hohensee…

An Optimized solution for directory services?

I was recently pointed to a white paper published by Oracle called: Oracle Optimized Solution for Oracle Unified Directory — Implementation Guide.

Because Oracle Unified Directory and OpenDJ have a common root (both derive from the Sun-initiated OpenDS project), I was curious about that optimized solution, and whether there was anything in it that might be applicable to our customers. After reading the 45-page white paper, honestly, the Oracle Optimized Solution is not something I would recommend to any of our customers (*).

The white paper describes the hardware used for the solution: three SPARC T4-1 systems, each with 128GB of RAM and six 300GB internal disks. A SPARC T4-1 is an 8-core machine, with each core supporting up to 8 threads. Each T4-1 system has a 10GbE add-on network card, and each is attached to a Sun Storage 2500-M2 array (with 2540 controllers) through two Fibre Channel cards; each storage array has 12 disks.

Let’s look at the average price for this solution. The SPARC T4-1 with 128GB of RAM has an estimated public price of $24,344, but with only 2 internal disks, so add another $1,660 for the 4 additional disks. The lowest price for a 10GbE card for that system is $2,000, and the cheapest storage array with the same number of disks is roughly $27,000. That is a total cost per system of over $55,000, not including the cost of the operating system, and a total cost for the “Optimized Solution” of approximately $165,000 (estimated public price).

So what do we get in performance for this price? Well, the white paper will not tell you what the solution is optimized for. The only number that appears is the time it took to import 15 million entries on one of the systems:

[07/Jan/2012:12:19:29 +0000] category=JEB severity=NOTICE msgID=8847569 msg=Total import time was 3790 seconds. Phase one processing completed in 2868 seconds, phase two processing completed in 922 seconds
[07/Jan/2012:12:19:29 +0000] category=JEB severity=NOTICE msgID=8847454 msg=Processed 15000001 entries, imported 15000001, skipped 0, rejected 0 and migrated 0 in 3790 seconds (average rate 3957.0/sec)
[07/Jan/2012:12:19:29 +0000] category=JEB severity=NOTICE msgID=8847536 msg=Import LDIF environment close took 0 seconds

Last week, I was in Mexico with a partner of ours, demonstrating the capabilities of OpenDJ with the customer’s data (exported a week earlier, and which also contains several hundred very large static groups). We used x86-based machines with 96GB of memory, although we only used 16GB for the OpenDJ instance.

And here’s the output of the import command:

[25/Apr/2012:20:10:44 +0200] category=JEB severity=NOTICE msgID=8847538 msg=DN phase two processing completed. Processed 21654508 DNs
[25/Apr/2012:20:10:45 +0200] category=JEB severity=NOTICE msgID=8847569 msg=Total import time was 2002 seconds. Phase one processing completed in 1137 seconds, phase two processing completed in 865 seconds
[25/Apr/2012:20:10:45 +0200] category=JEB severity=NOTICE msgID=8847454 msg=Processed 21654508 entries, imported 21654508, skipped 0, rejected 0 and migrated 0 in 2002 seconds (average rate 10815.3/sec)
[25/Apr/2012:20:10:45 +0200] category=JEB severity=NOTICE msgID=8847536 msg=Import LDIF environment close took 0 seconds

I don’t have the price of the servers we used (but our partner can get in touch with you if you’re interested in the solution), but I doubt it tops half the price of the Oracle optimized solution!

So before you drink the Oracle Kool-Aid, think twice about what an optimized solution should be, and how much it should cost. Oh, by the way, there is no license cost for OpenDJ: it’s open source, it’s available now and you can try it free of charge. Of course, we do appreciate it if you subscribe to one of our support offerings to protect your investment and ensure a service level agreement.

(*) I would not recommend a directory solution on SPARC Tx machines, ever. While these machines have a good capacity for load, the performance for any write activity is really bad, especially as soon as access controls are in use. Most of our partners who have deployed directory services on these machines will agree with me. As a matter of fact, I don’t recall any recent customer mentioning SPARC or Solaris when renewing their directory service infrastructure.

Mexico, Mexiiiiico!

I’m just back from a week-long business trip to Mexico City. This was my first time in Mexico, and I had heard all the rumors about it being a very dangerous city. I must say that I saw a very, very big city, vibrant, busy, with a lot of car traffic, but at no point did I fear being robbed or assaulted.

Two things struck me during my stay. First, the city is very green. There are trees, plants and flowers everywhere; all the main avenues are bordered by trees. It’s as if Mother Nature is trying to tell us that she still exists despite the concrete and the buildings.

Trees along the avenues / A tree in flower

The other thing is that at any time of the day or night, there are people in the street trying to earn a little money, selling water, tissues or balloons.

Globero


The food was amazing. I enjoyed tacos, fresh fruits, some Argentinian bife, jalapeños… Spicy, but not “mucho picante”. As well as beers like Victoria, Bohemia, Dos Equis, Modelo… And tequila, of course!

Other photos from my trip are on Google+

By the way, we did work this week in Mexico.

Below is a photo of the screen as we finished importing the customer’s data into OpenDJ (the data includes a few hundred groups, each averaging 40,000 members). I like this kind of performance number! I will probably say more about the hardware and settings used to achieve that in a future post.

I must say a big thank you to our partner in Mexico and Latin America: NoLogin. They did everything to make my stay safe and comfortable, including the jalapeños and tequila!

I hope the few companies I visited will turn into customers. I’d like to come back to Mexico again. These 5 days have just gone too fast. And I’ve just started to get into lucha libre 😉

Mexican Wrestler

Cache strategy for OpenDJ LDAP directory server

System administrators familiar with legacy LDAP directory servers know that one of the keys to best performance is caching the data. With Sun Directory Server or OpenLDAP, there are three levels of caching: the filesystem level, the database level and the entry level. The filesystem cache is managed by the OS and cannot be controlled by the application; relying on it is fine when the directory server is the only process on the machine, and/or for initial performance. The database level cache allows faster read and write operations, and also includes the indexes. The entry cache is the highest-level cache, and usually the one that provides the best performance, as it requires the least processing from the server to return entries to applications, and it has the least contention.

OpenDJ has a different design for its database and core server, and thus the caching strategy needs to be different.

By default, OpenDJ has the database cache enabled, and three different kinds of entry caches, all disabled. The reason for the three entry caches is that they are implemented for different needs and access patterns, but they all have in common a filter to select which entries to cache and some settings controlling how much memory to use. During our stress and performance tests, we noticed that using an entry cache for all accessed entries added a lot of pressure on the garbage collector and also caused more garbage collection of the old generation, often leading to either memory fragmentation or more frequent full GCs (also known as “stop the world” GCs). This resulted in less consistent response times and overall lower throughput.

So we recommend that you favor the database cache and do not set up an entry cache, except for specific needs (and do not try to activate all three entry caches; this may lead to some really strange behavior).

The default setting in OpenDJ 2.4 is that 10% of the JVM heap space is used for the database cache. With OpenDJ 2.5 (soon to be released), we have bumped the default to 50% of the heap space. If you’re tuning the heap size and make it larger than 2GB, we recommend that you keep that 50% ratio, or even increase it if the heap size exceeds 3GB.
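If you want to change the ratio yourself, the backend property to adjust is db-cache-percent. A sketch, with an arbitrary value of 60%:

$ dsconfig set-backend-prop --backend-name userRoot \
    --set db-cache-percent:60 \
    -h localhost -p 4444 -D "cn=Directory Manager" -w secret12 -X -n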

If you do have a few very specific entries that are accessed very often, like large static groups that are constantly used for ACIs or group membership checks by applications, then the entry cache becomes handy, and you want to set a filter so that only these specific entries are cached.

For example, if you want to cache at most 5 entries that are groupOfNames, you can use the following dsconfig command:

bin/dsconfig set-entry-cache-prop --cache-name FIFO \
 --set include-filter:\(objectclass=GroupOfNames\) \
 --set max-entries:5 --set max-memory-percent:90 --set enabled:true \
 -h localhost -p 4444 -D "cn=Directory Manager" -w secret12 -X -n

Otherwise, you’re better off running with no entry cache. OpenDJ read performance is such that the directory server can respond to tens of thousands, if not hundreds of thousands, of searches per second with average response times on the order of a millisecond. This should be good enough for most applications!

OpenDJ Tips: More on troubleshooting indexes and search performances

In a previous post I talked about analyzing search filters and indexes. Matt added in a comment that OpenDJ has another means of understanding how indexes are used in a search. Here’s a detailed post.

The OpenDJ LDAP directory server supports a “magic” operational attribute that allows an administrator to get from the server information about the processing of indexes for a specific search query: debugsearchindex.

If the attribute is included in the requested attributes of a search operation, the server does not return all matching entries as expected, but a single result entry with a fixed distinguished name and a single-valued attribute, debugsearchindex, that contains the information related to the index processing: the number of candidate entries per filter component, the overall number of candidates, and whether any or all of the search is indexed.

$ bin/ldapsearch -h localhost -p 1389 -D "cn=Directory Manager" -b "dc=example,dc=com" "(&(mail=user.*)(cn=*Denice*))" debugsearchindex
Password for user 'cn=Directory Manager': *******
dn: cn=debugsearch
debugsearchindex: filter=(&(mail=user.*)[INDEX:mail.substring][COUNT:2000](cn=*Denice*)[INDEX:cn.substring][COUNT:1])[COUNT:1] final=[COUNT:1]

$ bin/ldapsearch -h localhost -p 1389 -D "cn=Directory Manager" -b "dc=example,dc=com" "objectclass=*" debugsearchindex
Password for user 'cn=Directory Manager': *********
dn: cn=debugsearch
debugsearchindex: filter=(objectClass=*)[NOT-INDEXED] scope=wholeSubtree[COUNT:2007] final=[COUNT:2007]

$ bin/ldapsearch -h localhost -p 1389 -D "cn=Directory Manager" -b "dc=example,dc=com" "mail=user.1*" debugsearchindex
Password for user 'cn=Directory Manager': *********
dn: cn=debugsearch
debugsearchindex: filter=(mail=user.1*)[INDEX:mail.substring][COUNT:1111] scope=wholeSubtree[COUNT:2007] final=[COUNT:1111]

Note that sometimes OpenDJ tries to optimize the query and uses another index than the obvious one; for example, it might use the equality index for an initial substring filter. The index actually used during the search appears in the debugsearchindex attribute. Also, once the result set has been narrowed down to very few entries, the server stops using indexes and evaluates the filter directly against the candidate entries, as in the example below:

$ bin/ldapsearch -h localhost -p 1389 -D "cn=Directory Manager" -b "dc=example,dc=com" "(&(cn=Denice*)(mail=user.9*))" debugsearchindex
Password for user 'cn=Directory Manager':
dn: cn=debugsearch
debugsearchindex: filter=(&(cn=Denice*)[INDEX:cn.equality][COUNT:1])[COUNT:1] final=[COUNT:1]

Benchmark proves OpenDJ fastest directory server!

Isode has just released a benchmark of their M-Vault R15.1 directory server, and has run some comparative tests against OpenLDAP and OpenDJ.

While the benchmark demonstrates that M-Vault is one of the best directory servers out there (the new release has some really impressive search performance), I paid more attention to write performance, and I really like the results showing that OpenDJ is the fastest directory server for write operations, even when modifications are mixed with searches.

Benchmark write performance summary

Captured from Isode benchmark white-paper.

Thanks Isode for running those tests, and making those numbers publicly available.

An important tuning flag for OpenDJ with a 64-bit JVM…

If you’re running OpenDJ on a 64-bit JVM with less than 32GB of heap, be aware that you need to explicitly set the -XX:+UseCompressedOops option (unless you want to disable it).

Compressed oops are supported and enabled by default in Java SE 6u23 and later, when running a 64-bit JVM with an -Xmx value lower than 32GB. You can find more information about compressed oops in the Java technical notes here: http://download.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html

However, OpenDJ’s internal database needs to take the compressed oops option into account in order to properly estimate the occupancy of the DB cache and tune the cache eviction threads. For this it relies on the JVM option being set explicitly. If the option is not explicitly set, the database may consider the cache full when it is not, and run cache eviction too early, resulting in suboptimal performance.

So, with a 64-bit JVM, make sure you add the -XX:+UseCompressedOops option to the start-ds line in the config/java.properties file. Then run bin/dsjavaproperties and restart OpenDJ to benefit from the new settings.
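Concretely, that means a line along these lines in config/java.properties (heap sizes here are placeholders; only the last flag matters), followed by the dsjavaproperties run:

# Excerpt from config/java.properties
start-ds.java-args=-server -Xms2g -Xmx2g -XX:+UseCompressedOops

$ bin/dsjavaproperties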

OpenDJ: Analyzing Search Filters and Indexes

LDAP directory services greatly rely on indexes to provide fast and accurate search results.

OpenDJ, the open source LDAP directory service for the Java platform, provides a number of tools to ensure indexes are used efficiently, or to optimize them for even better performance.

To start with, OpenDJ rejects all unindexed searches by default, unless the authenticated user has the privilege to perform them. Unindexed searches are rejected because they result in scanning the whole database, which consumes a lot of resources and time. There are legitimate uses of unindexed searches though, and OpenDJ offers a way to control who can perform them through a privilege. To learn more about privileges and how to grant them, please check the Administration Guide or some of my previous posts.

When an unindexed search completes, OpenDJ (starting with revision 7148 of the OpenDJ trunk, and therefore OpenDJ 2.5) logs the “Unindexed” keyword as part of the search response access log message. But the access log file can also be used to identify search operations that are not making optimal use of indexes: simply check for search responses returned with an etime (execution time) greater than the average.

The access log example below contains both an unusually high etime (expressed in ms) and the Unindexed tag.

[27/Jul/2011:20:27:27 +0200] SEARCH RES conn=0 op=1 msgID=2 result=0 nentries=10001 Unindexed etime=1846
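A quick way to spot such entries from the command line (a rough sketch, assuming the default logs/access file and the log format shown above, with a 1000 ms threshold):

$ grep Unindexed logs/access
$ awk -F'etime=' '/SEARCH RES/ && $2+0 > 1000' logs/access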

The verify-index command lets you check that no index is corrupted (i.e. no data is missing from indexes).

The rebuild-index command lets you build or rebuild an index that is corrupted or whose configuration has changed.
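For example, checking that the mail index only contains references to existing entries could look like this (a sketch; option names may vary slightly between versions):

$ verify-index -b dc=example,dc=com -i mail --clean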

One of the tuning parameters of indexes is the index-entry-limit (known in Sun DSEE as the AllIDs threshold): the maximum number of entries kept in an index record before the server stops maintaining that record and considers it more efficient to scan the whole database. For more information on the index entry limit, check Section 7.2.4, Changing Index Entry Limits, in the Indexing chapter of the Administration Guide.

OpenDJ provides a static analyzer of indexes which can help you understand how well the attributes are indexed, as well as help tune the index entry limit. This tool is a function of the dbtest utility and is simply used as follows:

$ bin/dbtest list-index-status -n userRoot -b "dc=example,dc=com"

Index Name Index Type JE Database Name Index Valid Record Count Undefined 95% 90% 85%

---------------------------------------------------------------------------------------------------------------------------------------
id2children                Index       dc_example_dc_com_id2children                true         2             0          0    0    0
id2subtree                 Index       dc_example_dc_com_id2subtree                 true         2             0          0    0    0
uid.equality               Index       dc_example_dc_com_uid.equality               true         2000          0          0    0    0
aci.presence               Index       dc_example_dc_com_aci.presence               true         0             0          0    0    0
ds-sync-conflict.equality  Index       dc_example_dc_com_ds-sync-conflict.equality  true         0             0          0    0    0
givenName.equality         Index       dc_example_dc_com_givenName.equality         true         2000          0          0    0    0
givenName.substring        Index       dc_example_dc_com_givenName.substring        true         5777          0          0    0    0
objectClass.equality       Index       dc_example_dc_com_objectClass.equality       true         6             0          0    0    0
member.equality            Index       dc_example_dc_com_member.equality            true         0             0          0    0    0
uniqueMember.equality      Index       dc_example_dc_com_uniqueMember.equality      true         0             0          0    0    0
cn.equality                Index       dc_example_dc_com_cn.equality                true         2000          0          0    0    0
cn.substring               Index       dc_example_dc_com_cn.substring               true         19407         0          0    0    0
sn.equality                Index       dc_example_dc_com_sn.equality                true         2000          0          0    0    0
sn.substring               Index       dc_example_dc_com_sn.substring               true         8147          0          0    0    0
telephoneNumber.equality   Index       dc_example_dc_com_telephoneNumber.equality   true         2000          0          0    0    0
telephoneNumber.substring  Index       dc_example_dc_com_telephoneNumber.substring  true         16506         0          0    0    0
ds-sync-hist.ordering      Index       dc_example_dc_com_ds-sync-hist.ordering      true         1             0          0    0    0
mail.equality              Index       dc_example_dc_com_mail.equality              true         2000          0          0    0    0
mail.substring             Index       dc_example_dc_com_mail.substring             true         7235          0          0    0    0
entryUUID.equality         Index       dc_example_dc_com_entryUUID.equality         true         2002          0          0    0    0

Total: 20

If an index contains a non-zero value (N) in the Undefined column, it means N index keys have reached the index entry limit and are no longer maintained. This can be normal, for example with the objectClass equality index, where the vast majority of entries have the same object classes (top, person, organizationalPerson, inetOrgPerson). But for other attributes, such as cn, it may indicate that the index entry limit is too low.

Finally, OpenDJ has an option to do a live analysis of search filters and how they use indexes. To enable live index analysis, simply enable it on the database backend that contains the data:

dsconfig set-backend-prop --backend-name userRoot  --set index-filter-analyzer-enabled:true \
 --set max-entries:50 -h localhost -p 4444 -D cn=Directory\ Manager -w ****** -n -X

The max-entries parameter specifies how many filter items are analyzed and kept in memory; only the last max-entries are kept. If there is a huge variety of requests against the directory service, you might want to increase the number. However, keep in mind that the analysis is kept in memory, and the higher the number, the larger the impact on the overall performance of the server.

We do not recommend that you leave the index analysis enabled all the time, especially in production. The index analyzer should be used to gather statistics over a flow of requests for a short period of time, and should be disabled afterwards to free the resources.

The result of the index analyzer can be retrieved under the cn=monitor suffix, more specifically as part of the database environment of the backend.

$ bin/ldapsearch -p 1389 -D cn=directory\ manager -w secret12  \
-b "cn=userRoot Database Environment,cn=monitor" '(objectclass=*)' filter-use

dn: cn=userRoot Database Environment,cn=monitor
filter-use: (uid=user.*) hits:1 maxmatches:20 message:
filter-use: (tel=*) hits:1 maxmatches:-1 message:presence index type is disabled
  for the tel attribute
filter-use: (objectClass=groupOfURLs) hits:1 maxmatches:0 message:
filter-use: (objectClass=groupOfEntries) hits:1 maxmatches:0 message:
filter-use: (objectClass=person) hits:1 maxmatches:20 message:
filter-use: (objectClass=ds-virtual-static-group) hits:1 maxmatches:0 message:
filter-use: (aci=*) hits:1 maxmatches:0 message:
filter-use: (objectClass=groupOfNames) hits:1 maxmatches:0 message:
filter-use: (objectClass=groupOfUniqueNames) hits:1 maxmatches:0 message:
filter-use: (objectClass=ldapSubentry) hits:1 maxmatches:0 message:
filter-use: (objectClass=subentry) hits:1 maxmatches:0 message:

hits represents the number of times the filter was used; maxmatches represents the maximum number of entries that were returned for that filter.

Index analysis and tuning is not a simple task, and I recommend playing with these tools a lot in a test environment to understand how to get the best out of them. But, as you can see, OpenDJ provides you with all the tools you need to get the best performance out of your LDAP directory.

1 Year Old and 1 New Architect

ForgeRock is exactly ONE year old today. As we’re a distributed and quite global company, we’re not going to blow out the candle on a cake today. But I’m sure that next time we meet, we’ll have one as nice as the one we had during our last company meeting in Faro, Portugal.

Also, today is Matthew Swift’s first day at ForgeRock, as Architect for the OpenDJ project, growing the forces at the ForgeRock Grenoble Engineering Center. Matthew comes from Sun (Oracle), where he was leading the development of the core of OpenDS as well as the LDAP Client API. He has done interesting work on the performance and reliability of the OpenDS server (he’s the one who provided me with nice numbers to present). Matthew has several years of experience building LDAP and directory-related products, as well as in Java development, for Sun, Bloomberg and Isode. He’s bringing his talent and energy back to the open source project and will help make OpenDJ an even stronger and better product.

I’m really delighted to work with Matthew again.

And what a great day today!