OpenDJ: Quick Replication setup

10 May 2011, Ludo

OpenDJ Servers Replication

As we develop OpenDJ, we spend a lot of time testing, whether it’s a new feature or a correction to an existing one. We usually write some unit tests to validate the code and then some functional tests to check the feature from a “user” point of view. While the unit tests are typically run with a single server, the functional or integration tests are run with configurations that match our customers’ deployments. And one given fact for every directory service deployment with OpenDJ that I’m aware of is that the service is made of two or more OpenDJ directory servers with Multi-Master Replication enabled between them.

Setting up Multi-Master Replication with OpenDJ is quite easy and I’m going to demonstrate it here:

Let’s assume we want to install 2 OpenDJ servers on the following hosts: ldap1.example.com and ldap2.example.com. For simplicity, and because we avoid running tests with root privileges, we will configure the servers to use ports 1389 and 1636 for LDAP and LDAPS respectively.

On ldap1.example.com

$ unzip OpenDJ-2.4.2.zip
$ cd OpenDJ-2.4.2
$ ./setup -i -n -b "dc=example,dc=com" -d 20 -h ldap1.example.com -p 1389 \
  --adminConnectorPort 4444 -D "cn=Directory Manager" -w "secret12" -q -Z 1636 \
  --generateSelfSignedCertificate

Do the same on ldap2.example.com with the same parameters, except for the -h option, which should be ldap2.example.com.

Now, you have 2 instances of OpenDJ configured and running with 20 sample entries in the suffix “dc=example,dc=com”. Let’s enable replication:

$ bin/dsreplication enable --host1 ldap1.example.com --port1 4444 \
  --bindDN1 "cn=directory manager" \
  --bindPassword1 secret12 --replicationPort1 8989 \
  --host2 ldap2.example.com --port2 4444 --bindDN2 "cn=directory manager" \
  --bindPassword2 secret12 --replicationPort2 8989 \
  --adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n

And now make sure they both have the same data:

$ bin/dsreplication initialize --baseDN "dc=example,dc=com" \
  --adminUID admin --adminPassword password \
  --hostSource ldap1.example.com --portSource 4444 \
  --hostDestination ldap2.example.com --portDestination 4444 -X -n
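
To check the result, a status query like the one a reader quotes in the comments below can be run against either server. The following is only a sketch: the `DSREPLICATION` variable and the `replication_status` helper are names introduced here for illustration, not part of OpenDJ, and the admin credentials are assumed to be the ones passed to the enable command above.

```shell
# DSREPLICATION is a variable introduced for this sketch so that the path
# to the tool can be overridden; it is not an OpenDJ setting.
DSREPLICATION=${DSREPLICATION:-bin/dsreplication}

# replication_status [HOST]: query the topology through HOST's admin port.
# HOST defaults to ldap1.example.com.
replication_status() {
  $DSREPLICATION status -h "${1:-ldap1.example.com}" -p 4444 \
    --adminUID admin --adminPassword password -X
}
```

Calling `replication_status` (or `replication_status ldap2.example.com`) from the OpenDJ-2.4.2 directory should print a table with one row per server; the Entries column is the one to compare.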

For my daily tests I’ve put the commands in a script that deploys 2 servers, enables replication between them and initializes them, all on a single machine (using different LDAP, LDAPS, administration and replication ports for each server).
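
A script of that kind might look like the sketch below. It is not the actual script used for the daily tests. The directory names `opendj1` and `opendj2` (one unzipped copy per instance), the LDAPS ports 1636/2636, the `LDAP_HOST` variable, the `run` helper and the `DRY_RUN` switch are all assumptions made for illustration; the LDAP, admin and replication port pairs (1389/2389, 4444/5444, 8989/9989) are the ones mentioned in the comments below. It also assumes that `setup` can be invoked from outside its own directory.

```shell
#!/bin/sh
# Sketch: deploy two replicated OpenDJ instances on a single machine.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}
LDAP_HOST=${LDAP_HOST:-localhost}
BASE_DN="dc=example,dc=com"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# setup_instance DIR LDAP_PORT LDAPS_PORT ADMIN_PORT
setup_instance() {
  run "$1/setup" -i -n -b "$BASE_DN" -d 20 -h "$LDAP_HOST" -p "$2" \
    --adminConnectorPort "$4" -D "cn=Directory Manager" -w "secret12" -q -Z "$3" \
    --generateSelfSignedCertificate
}

deploy_pair() {
  # Two instances with distinct LDAP, LDAPS and admin ports.
  setup_instance opendj1 1389 1636 4444
  setup_instance opendj2 2389 2636 5444

  # Enable replication, with distinct replication ports on the same host.
  run opendj1/bin/dsreplication enable --host1 "$LDAP_HOST" --port1 4444 \
    --bindDN1 "cn=directory manager" \
    --bindPassword1 secret12 --replicationPort1 8989 \
    --host2 "$LDAP_HOST" --port2 5444 --bindDN2 "cn=directory manager" \
    --bindPassword2 secret12 --replicationPort2 9989 \
    --adminUID admin --adminPassword password --baseDN "$BASE_DN" -X -n

  # Push the data of the first instance to the second one.
  run opendj1/bin/dsreplication initialize --baseDN "$BASE_DN" \
    --adminUID admin --adminPassword password \
    --hostSource "$LDAP_HOST" --portSource 4444 \
    --hostDestination "$LDAP_HOST" --portDestination 5444 -X -n
}

deploy_pair
```

Run as is, the script only lists the four commands for review; running it with `DRY_RUN=0` executes them.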

Now if you want to add a 3rd server to the replication topology, install and configure it like the first 2, then join it to the topology by repeating the last 2 commands above, replacing ldap2.example.com with the hostname of the 3rd server. Need a 4th one? Repeat again, keeping ldap1.example.com as the reference server.
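
Spelled out, joining one more server comes down to the same enable and initialize commands with a different second host. The sketch below wraps them in a small function; `add_replica` and the `DSREPLICATION` variable are hypothetical names introduced here, and it assumes the new server was set up with the same ports and passwords as the first two.

```shell
# DSREPLICATION is a variable introduced for this sketch so that the path
# to the tool can be overridden; it is not an OpenDJ setting.
DSREPLICATION=${DSREPLICATION:-bin/dsreplication}

# add_replica NEW_HOST: join NEW_HOST to the topology, then initialize it
# from ldap1.example.com, which stays the reference server.
add_replica() {
  $DSREPLICATION enable --host1 ldap1.example.com --port1 4444 \
    --bindDN1 "cn=directory manager" \
    --bindPassword1 secret12 --replicationPort1 8989 \
    --host2 "$1" --port2 4444 --bindDN2 "cn=directory manager" \
    --bindPassword2 secret12 --replicationPort2 8989 \
    --adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n &&
  $DSREPLICATION initialize --baseDN "dc=example,dc=com" \
    --adminUID admin --adminPassword password \
    --hostSource ldap1.example.com --portSource 4444 \
    --hostDestination "$1" --portDestination 4444 -X -n
}
```

For example, `add_replica ldap3.example.com` adds the 3rd server and `add_replica ldap4.example.com` a 4th. The initialize step only runs if enable succeeded.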

Vladimir Dzhuvinov 10 May 2011 / 13:27

If I remember properly, the OpenDJ installation wizard also has a nice screen for enabling replication at setup time.

- Ludo 10 May 2011 / 15:02
  
  You’re correct. When doing tests with several servers, I find it much faster to use command line utilities, and I even have scripts that will set up 2 or 3 replicated servers without a single keystroke!
  
- permanentr 30 October 2012 / 16:51
  
  Hi Ludovic,
  
  Your example is quite simple and to the point, I am trying to simulate your exercise on windows 7.
  I am able to replicate my initial ldap server (data etc.) multiple times, but not with the same ports.
  How do the replicas take over (automatically) when the others are down? Are there other settings for that?
  Thus far, the only way I am able to test the replicated servers is to change the port number and/or hosts ID in my client application, looking to automate this.
  
  Regards
  Sam Saba
  
  - Ludo 30 October 2012 / 17:01
    
    When testing replication on a single machine, you need to use different port numbers for each server.
    I run this everyday (and my scripts have hardcoded ports 1389/2389 for LDAP, 4444/5444 for Admin, 8989/9989 for replication).
    As for the takeover, your applications must switch over either implicitly or through a load balancer. OpenDJ uses multi-master replication, which means that any server can respond to all requests at any point.
    
    Regards,
    Ludo
Pingback: OpenDJ: Enabling the External Change Log on a single server « Ludo's Sketches
Pingback: OpenDJ: Replicating and tracking changes | Margin Notes 2.0
g2-36fab969e19d570a9fe0b41befcf4a86 24 June 2011 / 20:35

is secured replication that fast and easy?

can you provide a command line example?

- Ludo 30 June 2011 / 23:06
  
  Yes secure replication is that fast and easy:
  
  $ bin/dsreplication enable --host1 ldap1.example.com --port1 4444 \
    --bindDN1 "cn=directory manager" \
    --bindPassword1 secret12 --replicationPort1 8989 --secureReplication1 \
    --host2 ldap2.example.com --port2 4444 --bindDN2 "cn=directory manager" \
    --bindPassword2 secret12 --replicationPort2 8989 --secureReplication2 \
    --adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n
  
  That’s it !
  
Karthik 18 April 2012 / 12:26

Hi,

Actually I have a master-slave server setup where I want to achieve data sync.
I have 2 baseDNs which should be in sync, say baseDN_1 and baseDN_2.

baseDN_1 should always be enabled, whereas baseDN_2 should be enabled on demand. Initially baseDN_2 will be enabled, and immediately after the initial sync we have to disable it, but some user operations would expect baseDN_2 to sync up again. My question is: is there any way to disable specific baseDNs while leaving the rest enabled, or should we use only the dsreplication disable command?

P.S. In normal ldap, dsconf has different commands such as disable-repl-agmt, delete-repl-agmt, disable-repl, etc.

I am badly in need of help. Could you please guide me on this. Thanks in advance.

- Ludo 18 April 2012 / 16:39
  
  Hi Karthik,
  
  You can enable/disable replication for each baseDN separately. Each one is a separate replication domain.
  Check the Replication chapter of the OpenDJ Administration Guide: http://opendj.forgerock.org/doc/admin-guide/
  
  Regards,
  
  Ludovic.
  
  BTW, what you call “normal ldap” looks like Sun DS (it’s the only one I know that has a dsconf tool) and is legacy.
  
  - Karthik 19 April 2012 / 08:49
    
    Thanks a ton!!!
    
    I am trying to configure data sync between 2 servers and, as per my understanding, I have to do the following steps. Please correct me if I am wrong:
    
    1. Install OpenDJ on both servers, say S1 and S2.
    2. Run the dsreplication enable command on S2. This will make sure the data is synced between the servers.
    3. Run the initialize-all command with respect to both servers.
    4. Now I am ready with servers which are in sync.
    
    Is there anything else that has to be done to successfully establish sync between the two servers?
  - Ludo 21 April 2012 / 19:03
    
    Hi Karthik,
    No, nothing else to do for replicating 2 servers.
    Install the 2 servers, one may be loaded with the data.
    Run dsreplication enable from either server to set up replication.
    Run dsreplication initialize or initialize-all to push the data from the primary server to the replica.
    And any change you do on any server will be replicated.
    🙂
Arnljot Arntsen (@arnljot76) 14 June 2012 / 16:39

Hi Ludovic

I’ve got problems getting this to go as smooth as this using 2.4.5 on 64bit linux (Amazon EC2) using Oracle Hotspot 1.6.0_33 64bit JVM.

Client side in the shell I get:
Initializing registration information on server idp02:4444 with the contents of server idp01:4444 …..
Error during the initialization with contents from server
idp01:4444. Last log details: [14/Jun/2012:14:30:41 +0000]
severity=”NOTICE” msgCount=0 msgID=9896349 message=”Set Generation ID task
quicksetup-reset-generation-id-1 started execution”. Task state:
STOPPED_BY_ERROR. Check the error logs of idp01:4444 for more
information.
See /tmp/opends-replication-8413019362871675684.log for a detailed log of this
operation.

My set command is:
Machine ONE
sudo -u opendj ./setup -i -n -a -b "dc=opensso,dc=java,dc=net" -h idp01 -p 1389 --adminConnectorPort 4444 -D "cn=Directory Manager" -w "managerpassword" -q -Z 1636 --generateSelfSignedCertificate -x 1689

Machine TWO
sudo -u opendj ./setup -i -n -a -b "dc=opensso,dc=java,dc=net" -h idp02 -p 1389 --adminConnectorPort 4444 -D "cn=Directory Manager" -w "managerpassword" -q -Z 1636 --generateSelfSignedCertificate -x 1689

Then I enable replication:
sudo -u opendj bin/dsreplication enable --host1 idp01 --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 managerpassword --replicationPort1 8989 --host2 idp02 --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 managerpassword --replicationPort2 8989 --adminUID admin --adminPassword adminpassword --baseDN "dc=opensso,dc=java,dc=net" -X -n

The log on both machines is very chatty in the /usr/share/opendj/logs/errors, with lots of (machine ONE):
[14/Jun/2012:14:20:34 +0000] category=SYNC severity=SEVERE_ERROR msgID=14942387 msg=Replication server 27005 was attempting to connect to replication server idp01/xx.xxx.xxx.90:8989 but has disconnected in handshake phase

And this:
[14/Jun/2012:14:12:30 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811281 msg=Timed out while trying to acquire the domain lock for domain “cn=admin data”. The connection attempt from replication server RS(27005) at xx.xxx.xxx.90/xx.xxx.xxx.90:58019 to this replication server RS(27005) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt

What should I look into to get this fixed?

- Ludo 14 June 2012 / 18:29
  
  Hi Arnljot,
  
  I believe this is an issue of having an EC2 instance being able to communicate to the other one, and possibly with SSL based connections.
  I know that we’ve been able to do it, but I haven’t done it myself, and I don’t have the details at hand. I will look for them and post the notes then.
  
  - Arnljot Arntsen (@arnljot76) 15 June 2012 / 13:05
    
    Thank you! 🙂
    
    I’ve verified that the ports are open, but I haven’t enabled SSL other than what’s done in the example (generating self signed certificates, and opening LDAPS port).
    
    I’d really appreciate the help.
    
    With the setup as it is now, and using initialize-all instead of doing it “one by one”, replication seemingly works. But the log complains that the admin user has a different ID on the two boxes.
    
    [14/Jun/2012:15:56:17 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811232 msg=Directory server DS(25634) has connected to replication server RS(4956) for domain “cn=admin data” at idp01/xx.xxx.xxx.90:8989, but the generation IDs do not match, indicating that a full re-initialization is required. The local (DS) generation ID is 138543 and the remote (RS) generation ID is 67162
    
    I think a consequence of this is that on idp02 (the second box) the admin-backend.ldif in config has no userpassword entry.
  - Ludo 19 June 2012 / 22:22
    
    The message is not about the admin user, but about the “admin data” internal backend, which contains the replication information for all replicas.
    It might be worth trying to run dsreplication initialize for the “cn=Admin data” suffix.
    But I have the feeling that you might be better off disabling replication and re-enabling it fully.
    Also, I’m not sure the blog is the best place for troubleshooting issues. You might want to bring the discussion to the OpenDJ mailing list: https://lists.forgerock.org/mailman/listinfo/opendj.
Karthik S Patawardhan 26 June 2012 / 17:00

Hi Ludovic,

I have got a query regarding subtree replication. We have “o=data1” as one backend. Under “o=data1” we have “cn=subdata1, o=data1” and “cn=subdata2, o=data1” each are in separate backends on server 1 and server 2.

We have the following backends for the following DNs:
1). “o=data1”
2). “cn=subdata1, o=data1”
3). “cn=subdata2, o=data1”

With above configuration in place, if we enable replication for “o=data1” between server 1 and server 2 then the “cn=subdata1, o=data1” and “cn=subdata2, o=data1” will also be getting replicated.

Could you please guide me in finding out whether it’s possible to block replication for “cn=subdata1, o=data1” and “cn=subdata2, o=data1” while replication of “o=data1” stays enabled?

Best Regards,
Karthik S Patawardhan

Micky 28 February 2013 / 20:56

Hi Ludovic,

In your 2nd and 3rd step you suddenly start using “–adminUID admin –adminPassword password” for the enable and initialize replication options. What is the default password for this user ? How does one go about resetting it or creating it from scratch ? I don’t see the “admin” user created anywhere in the db.

Thanks in advance

_Micky

- Micky 28 February 2013 / 21:24
  
  Sorry, I take this comment back. The ‘enable’ replication command does create the ‘admin’ user with the specified password. My apologies.
  
  _Micky
  
jang 16 October 2013 / 03:23

Is it possible to have several servers in multi-master replication?
And if you use four multi-master replicas, is there a correlation with load?

- Ludo 20 October 2013 / 19:18
  
  Yes, it’s possible. We tested up to 20 servers fully connected, and some customers have even larger topologies: https://ludopoitou.wordpress.com/2013/10/01/opendj-visualizing-the-replication-topology/
  Each replicated server adds a little overhead to all the others, as changes need to be pushed to more servers. But the overhead is small. Best is to try and see.
  
php newbie 21 January 2014 / 01:06

if i have 3 servers or 4 or more….

on the 1st server i run the command:
bin/dsreplication enable --host1 ldap1.example.com --port1 4444 \
--bindDN1 "cn=directory manager" \
--bindPassword1 secret12 --replicationPort1 8989 \
--host2 ldap2.example.com --port2 4444 --bindDN2 "cn=directory manager" \
--bindPassword2 secret12 --replicationPort2 8989 \
--adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n

and then i run :
bin/dsreplication initialize --baseDN "dc=example,dc=com" \
--adminUID admin --adminPassword password \
--hostSource ldap1.example.com --portSource 4444 \
--hostDestination ldap2.example.com --portDestination 4444 -X -n

to add the 3rd server you said to replace ldap2.example.com with ldap3.example.com in the dsreplication commands
but do i not also have to replace --host2 with --host3 ?

or will all the other servers like ldap3.example.com ldap4.example.com ldap5.example.com be referred to as --host2 in the dsreplication enable and dsreplication initialize commands ?

- Ludo 03 February 2014 / 22:47
  
  The dsreplication enable command line only has the options --host1 and --host2 (we could have chosen --hostSource and --hostDestination, but that’s not really the proper semantic for enable, even though --host1 is kind of the reference server, and some data like schema may be pushed to host2).
  So to keep your example, ldap3.example.com and ldap4.example.com will be referred to as --host2 in dsreplication enable and as --hostDestination in dsreplication initialize commands.
  
  Regards
  
  - php newbie 10 February 2014 / 16:56
    
    thanking you for reply….
    
    so if ldap1.example.com is the “reference server”..
    
    will updates made directly to ldap2.example.com be pushed to ldap1.example.com and ldap3 & 4
    
    if ldap1.example.com goes down.. will updates on ldap2.example.com be pushed to ldap3.example.com and ldap4.example.com when ldap1 is down ?
    
    thank you lots in advance for reply.
  - Ludo 20 February 2014 / 14:25
    
    Yes, replication is built so that if a server is down, changes made on a server are still pushed to all other working servers. When the stopped server rejoins the replication ring, all missing changes will be sent automatically (unless the downtime was too long, the default limit being 3 days).
Chetan 19 May 2015 / 11:52

Hi Ludo, this is Chetan. I am having few queries related to Replication server clean-up. I went through lot of OpenAM documentation but couldn’t get much help.

Here is how my set-up looks:
We have multi-server OpenAM set-up where each OpenAM instance using embedded OpenDJ. When we launch second OpenAM instance, we provide first instance as replication source and provide the corresponding port, dn etc details. We are using single domain for replication. This set-up works well in normal condition. However, we have also enabled Auto Scaling in/out wherein we launch or terminate openam instances at will. Also the OpenAM instance host names are dynamic. So when our infrastructure scales in we terminate one or more openam instances but we do not perform any clean-up for removing these terminated instances from OpenAM Site or OpenDJ replication server configuration. This causes some stale hosts entries in the replication server list.

When I list the replication servers using dsconfig I get the following. However actually only 2 of these 3 instances are active.

Replication Server : replication-server-id : replication-port : replication-server
-------------------:-----------------------:------------------:-------------------
replication-server : 28648                 : 50889            : openam-1-xxxx.example.net:50889, openam-2-xxxx.example.net:50889, openam-3-xxxx.example.net:50889

Now when we scale out again, we provide any of the active OpenAM instance as replication server (host1) and new instance try to replicate the data from that active instance. So here we are not having any particular instance as a reference but we assume that over the time all servers would hold same data.

Now the questions-
1. Do you see any problem in such set-up? Is there anything we are doing wrong or not the best practice?
2. How do we clean up the old replication servers? dsreplication supports a disable command, but for that (I guess) the server must be alive and listening on the admin port (4444). In our case we don’t know when a server will be terminated (it depends on the scale-in policy). Can we remove/disable a replication server after it’s terminated?
3. Do stale or unreachable replication hosts cause any problem in data replication?

- Ludo 19 May 2015 / 16:03
  
  There are 2 sides to what you are doing.
  One is about the list of servers that are known by all other instances at the replication level. This was addressed in another post : https://ludopoitou.com/2012/01/09/disabling-replication-in-opendj-2-4/, and this should answer question 2.
  The other side is about what we call the replication vector within the data itself. The replication vector identifies the state of the data by a list of pairs: ReplicaID and ChangeSequenceNumber of the most recent change for that replica. Because this identifies the state of the data and allows the replication service to identify changes that may be missing, there is no easy way to remove references to replicas from that vector. As a result, if you regularly instantiate new replicas and remove them, the replication vector will increase in size, and this can have a negative impact on replication over time, mostly in terms of resource consumption and performance (I’ve seen a server with 60 replica identifiers whereas there were only 6 active servers, in a test environment).
  I hope this helps.
  
  - Chetan 01 June 2015 / 12:43
    
    Thanks for the reply Ludo. This helps. However I am having another question regarding setting up multimaster replication. You have mentioned that while setting up 3rd replication server (say RS3) we should run “dsreplication enable” with host1 as RS1 and host2 as RS3.
    So assuming we already have RS1 and RS2 configured, can we use RS2 instead of RS1 as host1 here?
    The reason I am asking is that in my setup the replication servers are ephemeral and we can’t rely on a single replication server as the primary server from which the rest/new servers can be initialized. Currently, as we add a new RS, we find any of the active RSs and try to replicate/initialize from that server. But with this approach we are facing an issue where not all the servers list the same replication servers (using the dsconfig list-replication-server command).
    
    Do we need to run enable replication command sequentially for all the existing replication servers as host1 and new host as host2?
  - Ludo 01 June 2015 / 19:56
    
    Yes, you can choose any existing replica to enable replication.
    However, it is a best practice to always use the same one for host1.
    
    You should not need to enable replication sequentially with all existing servers… The knowledge is distributed through replication, but may come after initialization and changes have been received.
Ajay Kumar 15 January 2016 / 12:12

Hi Ludo,

I have set up replication on version 2.6.0 for four servers.

After initializing replication, whenever I add a new entry on server 1, it is not replicated to the other servers.

Could you please suggest what could be the reason: below is output of replication.

Suffix DN           : Server       : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
--------------------:--------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
o=XXX,dc=XXX,dc=com : Server1:4444 : 54      : true                : 3496  : 24637 : 8989        : 0        :              : false
o=XXX,dc=XXX,dc=com : Server2:4444 : 54      : true                : 30630 : 30260 : 8989        : 0        :              : false
o=XXX,dc=XXX,dc=com : Server3:4444 : 55      : true                : 23803 : 16978 : 8989        : 0        :              : false
o=XXX,dc=XXX,dc=com : Server4:4444 : 54      : true                : 29669 : 2696  : 8989        : 0        :              : false

Thanks in advance.

apapap 11 April 2016 / 07:50

I am using OpenDJ-2.4.6 along with Oracle JDK 7.80, and I want to run Multi-master replication on 2 of my servers, the OS for these servers is Amazon Linux.

The OpenDJ setup runs perfectly fine; I can start the server too without any errors.

It is when I run the “dsreplication” script as follows:

./dsreplication enable --host1 server1.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 "Passw0rd" --replicationPort1 1388 --host2 server2.example.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 "Passw0rd" --replicationPort2 1388 --adminUID admin --adminPassword "Passw0rd" --baseDN "dc=example,dc=com"
the script hangs on the following step:

Initializing registration information on server server2.example.com:4444 with the contents of server server1.example.com:4444 …..

And on checking the logs, there is no error reported in there.
But, when I run the following command:

./dsreplication status -h localhost -p 4444 --adminUID admin --adminPassword "Passw0rd" -X

it throws the following error:

The displayed information might not be complete because the following errors were encountered reading the configuration of the existing servers: Error on server2.example.com:4444: An error occurred connecting to the server. Details: javax.naming.AuthenticationException: [LDAP: error code 49 – Invalid Credentials] Error on server:4444: An error occurred connecting to the server. Details: javax.naming.AuthenticationException: [LDAP: error code 49 – Invalid Credentials]

Please help me.

Thanks in advance.

apapap 11 April 2016 / 13:19

How can I change the OpenDJ configuration ?
Example: during OpenDJ setup, I mistakenly provided an incorrect host2 address. How can I edit the configuration after the setup has completed successfully?

- Ludo 12 April 2016 / 09:23
  
  It is probably faster to delete the 2 instances and re-run the commands than trying to change the host name of a server and regenerate the certificates required for administration and replication.
  
Sheeba Elizabeth John 29 November 2018 / 11:54

Hi Ludo,
I have configured OpenAM on 2 servers which are running under a load balancer i.e. both the servers are configured in a site. Both these servers make use of an external data/LDAP store, which is OpenDJ, setup on the 1st server on which OpenAM1 runs.
I have also enabled session failover for both these OpenAMs via CTS and the tokens generated from CTS are stored in the same external storage(OpenDJ).
Then I have setup another OpenDJ on the second server on which OpenAM2 is running and I have enabled replication for both the OpenDJs.
Now, what configuration changes need to be made in OpenAM in order to support the OpenDJ replication? If the primary OpenDJ goes down, the second one does not even allow me to log in to OpenAM. It displays an ‘Internal Server’ error or ‘Authentication failed’.
Kindly help me with this.

Thanks in advance.

-Sheeba John

- Ludo 29 November 2018 / 12:35
  
  Your issue is an issue with the configuration of OpenAM. There are manuals to explain how to configure AM with failover with OpenDJ. Sorry I cannot help you further with this.
  
Regis Poulin 10 January 2019 / 17:33

Hello, I am trying to use OpenDJ for a project where I need a large number of Directory Server on the same network. I have 300 machines that EACH need to have its own Directory Server. Note that the number of user in the DS is 2000-5000 users.

The reason behind this is that I want each machine to be independent from each other so that if there is a network outage, authentication can be done locally on each machines.

I also need data to be replicated on each of those 300 machines, so I need replication servers.

I was able to create such a topology on AWS (Amazon) using OpenDJ 4.2.5 (Community Edition) where I had 100 of those machines (100 DS) with a single RS. Everything works as expected. Data gets replicated to all 100 machines and if the RS is down or if the network is down, authentication can be done locally on any DS.

However, when I add more DS in my topology, in fact, when I get more than 120-130 DS, the RS starts complaining. See example errors below:

In Replication server Replication Server 8989 32424: servers Replica DS(8973) for domain “cn=admin data” and Replica DS(8973) for domain “cn=admin data” have the same ServerId : 8973

Exception when reading messages from Replica DS(8194) for domain “cn=schema”: IOException: no more data (Session.java:441 Session.java:403 ServerReader.java:86)

Replication server accepted a connection from ip-172-20-1-243.ec2.internal/172.20.1.243:38552 to local address 0.0.0.0/0.0.0.0:8989 but the SSL handshake failed. This is probably benign, but may indicate a transient network outage or a misconfigured client application connecting to this replication server. The error was: null cert chain

Replication server accepted a connection from ip-172-20-1-136.ec2.internal/172.20.1.136:53748 to local address 0.0.0.0/0.0.0.0:8989 but the SSL handshake failed. This is probably benign, but may indicate a transient network outage or a misconfigured client application connecting to this replication server. The error was: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: signature check failed

Directory server DS(26935) encountered an error while receiving changes for domain “cn=admin data” from replication server RS(21230) at 127.0.0.1:8989. The connection will be closed, and this directory server will now try to connect to another replication server.

I have tried using a second RS with 2 replication groups, but result is the same. When getting above 120 DS, the RS starts failing.

Is there a reason to believe that OpenDJ CANNOT work in my use case? Is this a configuration issue on my end, a topology issue (should I use more RS, etc.), or a code issue?

Can OpenDJ fulfill my requirements?

Thank you

- Ludo 10 January 2019 / 17:49
  
  OpenDJ 2.4.5 Community Edition is a very old release and there’s been a ton of improvement made to the server and to replication in the ForgeRock DS product, including better handling of server IDs (with 2.4 they are randomly chosen and we’ve seen collisions once in a while), better messages…
  If an RS starts failing at about 120 DS, it’s probably a resource issue, as handling that many servers requires CPU, file descriptors, threads… In our scalability tests, we haven’t gone above 100 servers (as it’s time and resource consuming to run this kind of test) but we do have some customers with close to 200 servers, all replicating. There are no hard-coded limits in OpenDJ in that regard.
  Honestly, if you want to remain using an open source version, I’m afraid the available versions might not be able to entirely meet your requirements. A lot of the work I mentioned has been done in ForgeRock DS 6.x.
  
Regis Poulin 10 January 2019 / 18:03

Thank you for the fast response!
- Are there configuration settings I could change to improve performance? My RS is running on a machine with 4 cores and 16 GB of RAM. Are there file descriptor, socket, or other network parameters I should configure?
- Do you have any idea of the pricing of the ForgeRock version?

- Ludo 10 January 2019 / 19:15
  
  Configuration: First, I would check whether any system errors are reported by the OS, and check the different limits at the OS level. The remainder would be at the JVM level, and the rest would depend on the load. But one thing is sure: if a server is issued a server ID that is already in use, it creates a mess.
  Pricing: I’m sorry, but I don’t have an idea. The pricing is usually based on identities and not the number of servers. There is an evaluation version, only restricted by license, that can be downloaded from ForgeRock.com.
  
  - Regis Poulin 11 January 2019 / 21:42
    
    Is there a resource limitation that would prevent more than 128 clients, e.g. 128 TCP connections or something like that, related either to the OS or to OpenDJ? It looks like I am having issues when getting to about 128 DS.
    
    I am running on Linux.
    
    Thank you
  - Ludo 12 January 2019 / 00:35
    
    AFAIK, there are no hard-coded limits, and especially not at 128. But as I said, you might be hitting resource limits either on the OS or in the JVM.
Regis Poulin 10 January 2019 / 18:06

I am using OpenDJ 4.2.5… 2.4.5 was a typo….

Regis Poulin 11 January 2019 / 22:32

Can OpenDJ work without DNS? In my setup, I do not have DNS and I was hoping to use IP addresses directly. Is this supported?

I have also experienced freezes when running “dsreplication enable” on 2 instances at the same time. I have found the following note on the Oracle web site (https://docs.oracle.com/cd/E22289_01/html/821-1273/configuring-data-replication-with-dsreplication.html): “Note: You cannot run more than one instance of the dsreplication enable command to set up replication between multiple servers in parallel. Rather, run the dsreplication enable command successively for each pair of replicated servers in the topology.” Is this true? Is there a workaround?

As I keep adding DS, i.e. enabling replication on new nodes, it takes longer and longer to add nodes, as replicating the configuration (“Updating registration configuration on server…”, “Updating replication configuration…”, “Updating remote references…”) takes longer and longer since it is replicated to all nodes.

- Ludo 12 January 2019 / 00:38
  
  OpenDJ can work without DNS, with direct IP addresses, but it’s not very flexible or practical in the long run.
  
  And yes both issues are known. We are working towards solving them for the next major release.
  
Giampaolo 21 July 2020 / 17:17

Hello, I am running an old version of OpenDJ (3.0). I’m having issues with a replication chain;

I have three servers: replication has been enabled between server1 and server2 and then between server2 and 3. The problem is that there is a firewall between server 1 and 3, so they cannot communicate directly.

When there is an update on server1 it is replicated to server2 but not to server3, even though when I run dsreplication on server3 I see an increasing number of missing changes. Am I wrong to expect server2 to relay updates from server1 to server3?

Thanks for your help!
g.

- Ludo 21 July 2020 / 17:39
  
  Replication has been designed so that all replication servers are fully connected. There is no option to have a chain like that.
  One option could be to deploy a DS only server on the location that is behind a firewall. The DS will connect to one of the other servers and get updated.
  
Amit Kumar 07 September 2020 / 17:01

Hi,

We have ForgeRock deployed in AWS EKS and need to achieve user store data replication across namespaces, but I don’t know where the configuration lives that I could use to add a user store from a different namespace to the replication.

Can you help me here ?

- Ludo 09 September 2020 / 10:14
  
  If you are a ForgeRock customer, you should reach out to our support organisation.
  