OpenDJ: Monitoring Unindexed Searches…

OpenDJ, the open source LDAP directory service, makes use of indexes to optimise search queries. When a search query doesn’t match any index, the server must cursor through the whole database to return the entries, if any, that match the search filter. These unindexed queries can require a lot of resources: I/Os, CPU… In order to reduce resource consumption, OpenDJ rejects unindexed queries by default, except for Root DNs (i.e. for cn=Directory Manager).

In previous articles, I’ve talked about privileges for administrative accounts, and also about Analyzing Search Filters and Indexes.
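
As a reminder, a specific administrative account can be allowed to run unindexed searches by granting it the unindexed-search privilege. A quick sketch with ldapmodify (the account DN and the LDAP port are illustrative):

$ ldapmodify -D cn=directory\ manager -w secret12 -h localhost -p 1389
dn: uid=reports.admin,ou=People,dc=example,dc=com
changetype: modify
add: ds-privilege-name
ds-privilege-name: unindexed-search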

Today, I’m going to show you how to monitor for unindexed searches by keeping a dedicated log file, using the traditional access logger and filtering criteria.

First, we’re going to create a new access logger, named “Searches” that will write its messages under “logs/search”.

dsconfig -D cn=directory\ manager -w secret12 -h localhost -p 4444 -n -X \
    create-log-publisher \
    --set enabled:true \
    --set log-file:logs/search \
    --set filtering-policy:inclusive \
    --set log-format:combined \
    --type file-based-access \
    --publisher-name Searches

Then we define the filtering criteria that restrict what is being logged in that file: only “search” operations that are marked as “unindexed” and take more than 5000 milliseconds.

dsconfig -D cn=directory\ manager -w secret12 -h localhost -p 4444 -n -X \
    create-access-log-filtering-criteria \
    --publisher-name Searches \
    --set log-record-type:search \
    --set search-response-is-indexed:false \
    --set response-etime-greater-than:5000 \
    --type generic \
    --criteria-name Expensive\ Searches

Voila! Now, whenever a search request is unindexed and takes more than 5 seconds, the server will log the request to logs/search (on a single line) as below:

$ tail logs/search
[12/Sep/2016:14:25:31 +0200] SEARCH conn=10 op=1 msgID=2 base="dc=example,
dc=com" scope=sub filter="(objectclass=*)" attrs="+,*" result=0 nentries=
10003 unindexed etime=6542

This file can be monitored and used to trigger alerts to administrators, or simply used to collect and analyse the filters that result in unindexed requests, in order to better tune the OpenDJ indexes.
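
For example, a minimal alerting loop could simply follow the file and mail each new entry to the administrators (a sketch: the mail command and address are placeholders, adapt to your alerting system):

tail -F logs/search | while read -r line; do
    echo "$line" | mail -s "OpenDJ unindexed search detected" admin@example.com
done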

Note that sometimes it is a good option to leave some requests unindexed, when the cost of maintaining the index outweighs its benefits: typically requests that are infrequent, run by specific administrators for reporting purposes, and expected to return a large number of entries. In that case, a best practice is to dedicate a replica to administrative tasks and run these expensive requests against it. It is also better if the client applications are tuned to expect these requests to take a long time.

OpenDJ presented at the LavaJUG

As I mentioned last week, I was presenting OpenDJ and server performance in Java at the LavaJUG on Thursday.
Ludo@LavaJUG

The session was broadcast live on Google Hangout, unfortunately in two parts due to a nice blue screen. You can watch them here:

Part 1:

Part 2:

The slides are available on the LavaJUG Wiki.

Thanks to the whole LavaJUG for the great reception, and more specifically to its leaders Olivier Coupelon, Pierre Colomb, Sylvain Desgrais and Thomas Maurel.

Upcoming events: LavaJUG & Devoxx France

I will be at the LavaJUG (the Java User Group of Clermont-Ferrand, France) this Thursday from 19:00 to 21:00, presenting our experience with the OpenDJ project in building a highly scalable, high-performance server in Java. The presentation is based on what I’ve already presented at a few JUGs in France (AlpesJUG, MarsJUG, PoitouCharentesJUG…) and Switzerland (JUG Lausanne), but has been updated with regard to the GarbageFirst GC and the most recent HotSpot JVM.


And next week, from Wednesday, March 27th to Friday 29th, you will find ForgeRock at the Devoxx France conference.

Come to our conference session about “Enterprise Security in a Cloudy and Mobile World” (the session is in French). The session is on Friday 29th, from 11:45 to 12:35, in the Miles Davis room. Mark it on your calendar, and if you miss it, make sure you stop by our booth (B3) to say hello and talk with some of our engineers. We will also be present at the HackerGarten on Wednesday from 14:00 to 18:00, should you want to have fun with one of our open source projects: OpenAM, OpenDJ or OpenIDM.


More secure passwords!

I received an intriguing request from a customer last week: he wanted to know if we had benchmarked the password hashing schemes that are available in OpenDJ, our LDAP directory service. Their fear was that with stronger schemes, they could not sustain a high authentication rate.

In light of the LinkedIn leak of several million passwords, hashed with a simple unsalted SHA1, I decided to run a quick and simple test.

SSHA1 is the default password hashing scheme in OpenDJ. The salt is an 8-byte (64-bit) random string that is combined with the password to produce the 20-byte message digest. But OpenDJ supports a wide range of password hashing schemes, and salted SHA512 is currently the most secure hashing algorithm we support (the salt there is also an 8-byte (64-bit) random octet string).

So for the test, I generated a sample directory data set with 10,000 users, and imported it into the OpenDJ directory (a 2.5 development build) with the default settings, on my laptop (MacBook Pro, 2.2 GHz Intel Core i7).

$ ldapsearch -D "cn=directory manager" -w secret12 -p 1389 -b "dc=example,dc=com" 'uid=user.10' dn userPassword
dn: uid=user.10,ou=People,dc=example,dc=com
userPassword: {SSHA}cchzM+LrPCvbZdthOC8e62d4h7a4CfoNvl6d/w==
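
As a side note, {SSHA} values use the conventional layout base64(SHA1(password + salt) + salt), with the 8-byte salt appended to the digest, so they can be checked from the shell. A quick sketch that recomputes the value above, assuming user.10’s password is “password” (use base64 -D on some BSD systems):

$ STORED='cchzM+LrPCvbZdthOC8e62d4h7a4CfoNvl6d/w=='
$ printf '%s' "$STORED" | base64 -d > blob.bin        # 28 bytes: 20-byte SHA1 digest + 8-byte salt
$ dd if=blob.bin of=salt.bin bs=1 skip=20 2>/dev/null # keep only the trailing 8-byte salt
$ { printf '%s' 'password'; cat salt.bin; } | openssl dgst -sha1 -binary | cat - salt.bin | base64

The last command should print the stored value again when the candidate password is correct.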

I then ran “authrate”, a small benchmark tool that stresses an LDAP server with a high number of authentications (LDAP bind requests), and let it run for 5 minutes.

authrate -h localhost -p 1389 -g 'rand(0,10000)' -D "uid=user.%d,ou=people,dc=example,dc=com" -w password -c 32 -f
-----------------------------------------------------------------
 Throughput     Response Time
 (ops/second)   (milliseconds)
 recent average recent average 99.9% 99.99% 99.999% err/sec
 -----------------------------------------------------------------
 ...
 26558.0  26148.9   1.179    1.195  10.168  19.431  156.421      0.0

I then stopped the server, changed the default password storage scheme used at import time to Salted SHA-512, and re-imported the data.

$ ldapsearch -D "cn=directory manager" -w secret12 -p 1389 -b "dc=example,dc=com" 'uid=user.10' dn userPassword
 dn: uid=user.10,ou=People,dc=example,dc=com
 userPassword: {SSHA512}eTGiwtTM4niUKNkEBy/9t03UdbsyYTL1ZXhy6uFnw4X0T6Y9Zf5/dS7hDIdx3/UTlUQ/9JjNV9fOg2BkmVgBhWWu5WpWKPog

And then re-ran the “authrate”:

$ authrate -h localhost -p 1389 -g 'rand(0,10000)' -D "uid=user.%d,ou=people,dc=example,dc=com" -w password -c 32 -f
 -----------------------------------------------------------------
 Throughput     Response Time
 (ops/second)   (milliseconds)
 recent average recent average 99.9% 99.99% 99.999% err/sec
 -----------------------------------------------------------------
 ...
 25481.7 25377.6 1.222 1.227 10.470 15.473 158.234 0.0

As you can see, there is not much of a difference in throughput or response time when using the strongest algorithm to hash user passwords. So do not hesitate to change the default settings and make use of the strongest password hashing schemes with OpenDJ. It could save you from the embarrassment of, one day, having to contact each of your users or customers to ask them to change their compromised password.

The default password hashing schemes are defined in two locations:

  • The default password policy, for all passwords that are changed online:

dn: cn=Default Password Policy,cn=Password Policies,cn=config
ds-cfg-default-password-storage-scheme: cn=Salted SHA-512,cn=Password Storage Schemes,cn=config

  • The import password policy plugin, for passwords loaded from LDIF:

dn: cn=Password Policy Import,cn=Plugins,cn=config
ds-cfg-default-user-password-storage-scheme: cn=Salted SHA-512,cn=Password Storage Schemes,cn=config

Both properties can be changed with dsconfig while the OpenDJ server is running, and the new scheme will be used for all subsequent operations.
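
For example, a sketch reusing the connection options of a default install (double-check the property and policy names against your version):

dsconfig -D cn=directory\ manager -w secret12 -h localhost -p 4444 -n -X \
    set-password-policy-prop \
    --policy-name Default\ Password\ Policy \
    --set default-password-storage-scheme:Salted\ SHA-512

dsconfig -D cn=directory\ manager -w secret12 -h localhost -p 4444 -n -X \
    set-plugin-prop \
    --plugin-name Password\ Policy\ Import \
    --set default-user-password-storage-scheme:Salted\ SHA-512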

About OpenDJ and HotSpot JVM G1

Duke on a bike, courtesy of Charly Hunt

Understanding and tuning the JVM is quite important to get the best performance out of OpenDJ. We do provide some high-level guidance in our documentation, and I’ve been talking about Java performance over the last few years at various Java User Groups in France and Switzerland (you can find presentations in French here or here), as well as at a major conference in Brazil: FISL, in 2009. On that occasion, I was asked to cover the presentation for two prestigious names from the Sun HotSpot JVM team: Charly Hunt and Tony Printezis. I spent a few hours with them and learnt a great deal about the internals of the HotSpot JVM, memory management, and all the magic parameters, in order to deliver that presentation. At that time, our directory team was interacting a lot with the HotSpot team, as we were testing a new and promising garbage collector: Garbage First, aka G1. OpenDS was even wrapped and used in one of the largest collections of tests for the Sun JVM.

During the acquisition of Sun by Oracle, the future of G1 and the HotSpot JVM was uncertain, and our interactions with the HotSpot team diminished seriously.

At ForgeRock, we continued to pay attention to Garbage First, and for a long time we noticed that it wasn’t moving along. Most of the issues that had been raised after tests with OpenDS, and addressed in development versions of the JVM, were not integrated into official JVM releases. It is only with Oracle JVM 1.7 update 2 that we noticed the large list of issues fixed in G1. We have since resumed testing OpenDJ with G1, and found that while the promise of no full GC seems to be fulfilled, the performance impact of G1 is still significant. In our limited tests with heap sizes under 4GB, we noticed a 10% throughput degradation compared to CMS, corresponding to an approximate 10% increase in CPU load (on a quad-core machine with hyperthreading on), but with better overall response times for OpenDJ, as the maximum response time decreased from 200ms to 80ms, as illustrated below.

LDAP Modrate with Garbage First
-------------------------------------------------------------------------------
 Throughput     Response Time 
 (ops/second)   (milliseconds) 
recent average  recent average 99.9% 99.99% 99.999% err/sec Entries/Srch
-------------------------------------------------------------------------------
16196.7 16374.1  1.972 1.951  18.886 28.129 66.933  0.0
16468.8 16374.9  1.941 1.951  18.883 28.087 66.521  0.0

LDAP Modrate with CMS
-------------------------------------------------------------------------------
 Throughput     Response Time 
 (ops/second)   (milliseconds) 
recent average  recent average 99.9% 99.99% 99.999% err/sec Entries/Srch
-------------------------------------------------------------------------------
17937.1 17487.7  1.780 1.827  18.175 30.521 116.990 0.0
17783.7 17494.3  1.796 1.826  18.145 30.320 117.017 0.0
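
For reference, the collector used by OpenDJ can be switched in config/java.properties (the heap sizes below are illustrative; run bin/dsjavaproperties after editing so the change takes effect):

# config/java.properties
start-ds.java-args=-server -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC
# or, to test Garbage First instead of CMS:
# start-ds.java-args=-server -Xms4g -Xmx4g -XX:+UseG1GC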

We need to run more tests with OpenDJ and G1, especially with very large heaps (from 4 to 32GB), but we’re not sure whether G1 will be able to deliver the performance it promised.

And today I noticed on LinkedIn that both Charly Hunt and Tony Printezis, the two main engineers behind the HotSpot JVM and Garbage First, had left Oracle for new adventures. Charly has gone to SalesForce and Tony to Adobe. This is certainly a good move for both of them, but it leaves me worried about the future of the HotSpot JVM and its ability to deliver innovation in GCs.

[Update on May 6th]

It appears that more engineers from the Sun JVM team have left in the last couple of months: John Pampuch, Igor Veresov, Paul Hohensee…

An Optimized solution for directory services?

I was recently pointed to a white paper published by Oracle called: Oracle Optimized Solution for Oracle Unified Directory — Implementation Guide.

Because Oracle Unified Directory and OpenDJ have a common root (both derive from the Sun-initiated OpenDS project), I was curious about that optimized solution, and whether there was anything in it that might be applicable to our customers. And after reading the 45-page white paper, honestly, the Oracle Optimized Solution is not something I would recommend to any of our customers (*).

The white paper describes the hardware used for the solution: 3 SPARC T4-1 systems, each with 128GB of RAM and 6 internal 300GB disks. A SPARC T4-1 machine has 8 cores, each core supporting up to 8 threads. Each T4-1 system has a 10GbE add-on network card, and each is attached through two fibre channel cards to a Sun Storage 2500-M2 array (with 2540 controllers) holding 12 disks.

Let’s look at the average price for this solution: the SPARC T4-1 with 128GB of RAM has an estimated public price of $24,344, but with only 2 internal disks, so add another $1,660 for the 4 additional disks. The lowest price for a 10GbE card for that system is $2,000, and the cheapest storage array with the same number of disks is roughly $27,000. That is a total cost per system of over $55,000, not including the cost of the operating system, and a total cost for the “Optimized Solution” of approximately $165,000 (estimated public price).

So what do we get in performance for this price? Well, the white paper will not tell you what the solution is optimized for. The only number that appears is the time it took to import 15 million entries on one of the systems:

[07/Jan/2012:12:19:29 +0000] category=JEB severity=NOTICE msgID=8847569 msg=Total import time was 3790 seconds. Phase one processing completed in 2868 seconds, phase two processing completed in 922 seconds
[07/Jan/2012:12:19:29 +0000] category=JEB severity=NOTICE msgID=8847454 msg=Processed 15000001 entries, imported 15000001, skipped 0, rejected 0 and migrated 0 in 3790 seconds (average rate 3957.0/sec)
[07/Jan/2012:12:19:29 +0000] category=JEB severity=NOTICE msgID=8847536 msg=Import LDIF environment close took 0 seconds

Last week, I was in Mexico with one of our partners, demonstrating the capabilities of OpenDJ with the customer’s data (exported a week earlier, and which also contains several hundred very large static groups). We used x86-based machines with 96GB of memory, although we only used 16GB for the OpenDJ instance.
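
The data was imported offline with OpenDJ’s import-ldif tool, along the lines of the following (the backend ID matches a default install; the LDIF file name is illustrative):

$ import-ldif -n userRoot -l customer-data.ldif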

And here’s the output of the import command:

[25/Apr/2012:20:10:44 +0200] category=JEB severity=NOTICE msgID=8847538 msg=DN phase two processing completed. Processed 21654508 DNs
[25/Apr/2012:20:10:45 +0200] category=JEB severity=NOTICE msgID=8847569 msg=Total import time was 2002 seconds. Phase one processing completed in 1137 seconds, phase two processing completed in 865 seconds
[25/Apr/2012:20:10:45 +0200] category=JEB severity=NOTICE msgID=8847454 msg=Processed 21654508 entries, imported 21654508, skipped 0, rejected 0 and migrated 0 in 2002 seconds (average rate 10815.3/sec)
[25/Apr/2012:20:10:45 +0200] category=JEB severity=NOTICE msgID=8847536 msg=Import LDIF environment close took 0 seconds

I don’t have the price of the servers we used (but our partner can get in touch with you if you’re interested in the solution), but I doubt that it tops half the price of the Oracle Optimized Solution!

So before you drink the Oracle Kool-Aid, think twice about what an optimized solution should be, and how much it should cost. Oh, by the way, there is no license cost for OpenDJ: it’s open source, it’s available now, and you can try it free of charge. Of course, we do appreciate it if you subscribe to one of our support offerings to protect your investment and ensure a Service Level Agreement.

(*) I would not recommend a directory solution on SPARC Tx machines, ever. While the machines have a good capacity for load, the performance of any write activity is really bad, especially as soon as access controls are in use. Most of our partners who have been deploying directory services on these machines will agree with me. As a matter of fact, I don’t recall any recent customer mentioning SPARC or Solaris when renewing their directory service infrastructure.

Mexico, Mexiiiiico!

I’m just back from a week-long business trip to Mexico City. This was my first time in Mexico, and I had heard all the rumors about it being a very dangerous city. I must say that I saw a very, very big city, vibrant, busy, with a lot of car traffic, but at no point did I fear being robbed or attacked.

Two things struck me during my stay. First, the city is very green. There are lots of trees, plants and flowers everywhere. All the main avenues are bordered by trees. It’s as if Mother Nature were trying to tell us that she still exists despite the concrete and the buildings.

Trees in the avenues; a tree in flower

The other thing is that at any time of the day or night, there are people in the street trying to earn a little money, selling water, tissues or balloons.

Globero (a balloon seller)


The food was amazing. I enjoyed tacos, fresh fruit, some Argentinian bife, jalapeños… Spicy, but not “mucho picante”. As well as beers like Victoria, Bohemia, Dos Equis, Modelo… And tequila, of course!

Other photos from my trip are on Google+

By the way, we did work this week in Mexico.

Below is a photo of the screen as we finished importing the customer’s data into OpenDJ (the data includes a few hundred groups, each averaging 40,000 members). I like this kind of performance number! And I will probably say more about the hardware and settings used to achieve it in a future post.

I must say a big thank you to our partner in Mexico and Latin America: NoLogin. They did everything to make my stay safe and comfortable, jalapeños and tequila included!

I hope the few companies I visited will turn into customers. I’d like to come back to Mexico again. These 5 days have just gone by too fast. And I’ve just started to get into lucha libre 😉

Mexican Wrestler