OpenDJ, the open source LDAP directory service built on the Java platform, offers administrators plenty of flexibility in setting up their environment. One specific area is multi-tenancy: hosting data from different companies.
In OpenDJ, data is organized in database backends. There can be many backends, each capable of hosting many separate base DNs, also known as naming contexts or suffixes (think of “dc=Coca,dc=com” and “dc=Pepsi,dc=com”).
So we are often asked about best practices for multi-tenancy, and whether it’s preferable to put all the base DNs in a single backend or to give each one its own backend.
Before we dive into the answer, it’s important to understand that a database backend is actually a whole database environment and, as such, the smallest unit for backup and restore procedures. Indexes are also configured at the backend level, so the index configuration is identical for all base DNs in a backend.
There are several good reasons for using a backend per tenant:
- Backends can be placed on different filesystems and disks, allowing better scalability and more consistent performance.
- Maintenance done on a backend does not affect the other backends and thus the other tenants. It’s also possible to define a separate recurrent backup schedule per backend.
- From a security and privacy point of view, separating each customer’s data is better. Even though ACIs are meant to prevent one tenant from seeing another’s data, separate backends ensure the same isolation when doing backups, exporting data to LDIF, and so on.
- If tenants need different index configurations, because of the applications accessing the data or because of the structure of the data itself, then they must be stored in separate backends.
All of this seems to suggest that each tenant should be in its own backend. Why support multiple base DNs in a single database backend, then? Well, when the data sets are small and consistent, it is much easier from an administration point of view to deal with a single backend than with many. It reduces the time spent on configuration and on monitoring disk space, and tends to optimize memory usage. It also simplifies database cache management, as there is a cache per backend and their combined size must not exceed the JVM memory size, nor the machine’s. As a single backend can scale to tens of millions of entries, there is no real penalty here.
In conclusion, when deploying OpenDJ for multi-tenant services, make sure you properly evaluate your requirements for performance, security and privacy before configuring the server. And of course, you can also choose to mix shared backends holding multiple tenants with separate backends for some larger, higher-value customers (tenants).
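To make the cache constraint concrete, here is a minimal sketch of the arithmetic, assuming each backend's cache is sized as a percentage of the JVM heap (as with the `db-cache-percent` backend property). The class name, the backend names and the reserved-headroom figure are hypothetical; check how your OpenDJ version actually sizes its caches before relying on numbers like these.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of OpenDJ, only illustrates the budget check.
class CacheBudget {

    /** Sums the heap bytes claimed by all backend caches, given a percentage per backend. */
    static long totalCacheBytes(long heapBytes, Map<String, Integer> cachePercentByBackend) {
        long total = 0;
        for (int percent : cachePercentByBackend.values()) {
            total += heapBytes * percent / 100;
        }
        return total;
    }

    /**
     * Returns true if the combined caches leave at least reservedPercent
     * of the heap free for the rest of the server.
     */
    static boolean fitsInHeap(long heapBytes, Map<String, Integer> cachePercentByBackend,
                              int reservedPercent) {
        long budget = heapBytes * (100 - reservedPercent) / 100;
        return totalCacheBytes(heapBytes, cachePercentByBackend) <= budget;
    }

    public static void main(String[] args) {
        // Assumed values for illustration: three tenant backends at 10% each.
        Map<String, Integer> caches = new LinkedHashMap<>();
        caches.put("cocaBackend", 10);
        caches.put("pepsiBackend", 10);
        caches.put("otherBackend", 10);
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("Caches fit: " + fitsInHeap(heap, caches, 25));
    }
}
```

With one backend per tenant, every new tenant adds an entry to that map, which is why the budget has to be revisited each time. With a single shared backend there is only one figure to tune.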
With a backend per tenant, the tenant management server needs to create a backend for each new tenant. From a coding point of view, that means the server (a Java server) will call the dsconfig admin tool to create an OpenDJ backend over port 4444.
Question: how do I call dsconfig from server-side Java code? Is there a Java SDK for dsconfig?
dsconfig is written in Java and has a main class. But it’s mostly undocumented.
It’s possible to build the list of arguments and call the dsconfig main method with them. Alternatively, we’ve seen customers calling exec() on the whole command itself.
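As a rough illustration of the exec() approach, the sketch below builds a `dsconfig create-backend` command line and runs it as an external process. The class and method names are hypothetical, and the paths and credentials are placeholders. The options shown (`--backend-name`, `--type je`, `--set base-dn:...`) are assumed to match OpenDJ 3.x; backend types and option names differ between versions, so check `dsconfig create-backend --help` on your installation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not an OpenDJ API, just a wrapper around the dsconfig CLI.
class DsconfigBackendCreator {

    /** Builds the dsconfig command line that creates one backend holding one base DN. */
    static List<String> buildCreateBackendCommand(String dsconfigPath, String host,
            int adminPort, String bindDn, String passwordFile,
            String backendName, String baseDn) {
        List<String> cmd = new ArrayList<>();
        cmd.add(dsconfigPath);
        cmd.add("create-backend");
        cmd.add("--hostname");         cmd.add(host);
        cmd.add("--port");             cmd.add(String.valueOf(adminPort));
        cmd.add("--bindDN");           cmd.add(bindDn);
        // A password file keeps the password out of the process list.
        cmd.add("--bindPasswordFile"); cmd.add(passwordFile);
        cmd.add("--backend-name");     cmd.add(backendName);
        cmd.add("--type");             cmd.add("je"); // assumed backend type; version-dependent
        cmd.add("--set");              cmd.add("base-dn:" + baseDn);
        cmd.add("--set");              cmd.add("enabled:true");
        cmd.add("--trustAll");         // accepts any server certificate; tighten for production
        cmd.add("--no-prompt");        // never wait for interactive input
        return cmd;
    }

    /** Runs the command, forwarding its output, and returns the process exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }
}
```

The tenant management server would call something like `run(buildCreateBackendCommand("/path/to/opendj/bin/dsconfig", "localhost", 4444, "cn=Directory Manager", "/path/to/pwd.txt", "pepsiBackend", "dc=Pepsi,dc=com"))` and treat a non-zero exit code as a failure. Passing the arguments as a list, rather than one concatenated string, avoids quoting problems with DNs that contain spaces or commas.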
Ultimately, dsconfig makes all of its changes to the server over LDAP, so it’s also possible (by enabling the audit log) to see those requests and replay them with a simple LDAP library.
Thanks LUDO. I will try. Do you have any plans to release a Java SDK for all administration operations (the ones on port 4444)?
The configuration framework and its libraries are already a module of the OpenDJ project. But because each version is tightly coupled with a version of the product (and all its configuration parameters), we do not plan to make a specific public Java SDK to allow people to write code for all administration operations. Note that with the current nightly builds, and moving forward, the configuration is available in read-write mode over HTTP/REST/JSON. We have started to provide a dynamic description of the API (based on Swagger), and this should become an evolving public API with OpenDJ 4.0, due at the end of this year (2016).
For multi-tenant data, a backend per tenant makes sense. What if I wanted to have custom schema extensions for only one of those tenants? I’d like to be able to have a fully customizable tenant environment, including data and schema extensions. Is that possible?
Schema extensions per tenant are not possible today, unless you dedicate a specific directory server for the tenant.