OpenDJ: Quick Replication setup

OpenDJ Servers Replication

As we develop OpenDJ, we spend a lot of time testing, whether it’s a new feature or a correction to an existing one. We usually write some unit tests to validate the code and then some functional tests to check the feature from a “user” point of view. While the unit tests are typically run with a single server, the functional or integration tests are run with configurations that match our customers’ deployments. And one given fact for every directory service deployment with OpenDJ that I’m aware of is that the service is made of two or more OpenDJ directory servers with Multi-Master Replication enabled between them.

Setting up Multi-Master Replication with OpenDJ is quite easy and I’m going to demonstrate it here:

Let’s assume we want to install 2 OpenDJ servers on the following hosts: ldap1.example.com and ldap2.example.com. For simplicity, and because we avoid running tests with root privileges, we will configure the servers to use ports 1389 and 1636 for LDAP and LDAPS respectively.

On ldap1.example.com

$ unzip OpenDJ-2.4.2.zip
$ cd OpenDJ-2.4.2
$ ./setup -i -n -b "dc=example,dc=com" -d 20 -h ldap1.example.com -p 1389 \
  --adminConnectorPort 4444 -D "cn=Directory Manager" -w "secret12" -q -Z 1636 \
  --generateSelfSignedCertificate

Do the same on ldap2.example.com, with the same parameters except for the -h option, which should be ldap2.example.com.

Now, you have 2 instances of OpenDJ configured and running with 20 sample entries in the suffix “dc=example,dc=com”. Let’s enable replication:

$ bin/dsreplication enable --host1 ldap1.example.com --port1 4444 \
  --bindDN1 "cn=directory manager" \
  --bindPassword1 secret12 --replicationPort1 8989 \
  --host2 ldap2.example.com --port2 4444 --bindDN2 "cn=directory manager" \
  --bindPassword2 secret12 --replicationPort2 8989 \
  --adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n

And now make sure they both have the same data:

$ bin/dsreplication initialize --baseDN "dc=example,dc=com" \
  --adminUID admin --adminPassword password \
  --hostSource ldap1.example.com --portSource 4444 \
  --hostDestination ldap2.example.com --portDestination 4444 -X -n

For my daily tests, I’ve put the commands in a script that deploys 2 servers, enables replication between them and initializes them, all on a single machine (using different LDAP and LDAPS ports for each server).
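Such a script is not shown in this post, so the following is only a minimal sketch of what it could look like. It assumes the OpenDJ zip has been unpacked twice, into ./opendj1 and ./opendj2 (hypothetical directory names), and that the second instance uses ports 2389, 2636, 5444 and 9989 (2636 is an assumption for LDAPS). The `run` helper and the `DRY_RUN` switch are illustrative: by default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch of a two-server, single-machine deployment (dry run by default).
set -e
DRY_RUN="${DRY_RUN:-1}"          # set DRY_RUN=0 to really execute the commands
HOST="localhost"                 # assumption: both instances run on this machine
BASE_DN="dc=example,dc=com"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# setup_instance DIR LDAP_PORT LDAPS_PORT ADMIN_PORT
setup_instance() {
  run "$1/setup" -i -n -b "$BASE_DN" -d 20 -h "$HOST" -p "$2" \
    --adminConnectorPort "$4" -D "cn=Directory Manager" -w "secret12" -q -Z "$3" \
    --generateSelfSignedCertificate
}

# Install both instances, enable replication, then copy the data across.
deploy_pair() {
  setup_instance ./opendj1 1389 1636 4444
  setup_instance ./opendj2 2389 2636 5444
  run ./opendj1/bin/dsreplication enable \
    --host1 "$HOST" --port1 4444 --bindDN1 "cn=directory manager" \
    --bindPassword1 secret12 --replicationPort1 8989 \
    --host2 "$HOST" --port2 5444 --bindDN2 "cn=directory manager" \
    --bindPassword2 secret12 --replicationPort2 9989 \
    --adminUID admin --adminPassword password --baseDN "$BASE_DN" -X -n
  run ./opendj1/bin/dsreplication initialize --baseDN "$BASE_DN" \
    --adminUID admin --adminPassword password \
    --hostSource "$HOST" --portSource 4444 \
    --hostDestination "$HOST" --portDestination 5444 -X -n
}

deploy_pair
```

Running it with DRY_RUN=0 from the directory that contains both instances would execute the four commands in order, stopping at the first failure because of `set -e`.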

Now, if you want to add a 3rd server to the replication topology, install and configure it like the first 2, then join it to the topology by repeating the last 2 commands above, replacing ldap2.example.com with the hostname of the 3rd server. Need a 4th one? Repeat again, keeping ldap1.example.com as the reference server.
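As a sketch, joining additional servers could be wrapped in a small function. The hostnames ldap3.example.com and ldap4.example.com, the `join_server` name and the `DRY_RUN` switch are assumptions made for the example; the dsreplication options are the ones used above. By default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: join extra servers to an existing topology (dry run by default).
DRY_RUN="${DRY_RUN:-1}"          # set DRY_RUN=0 to really execute the commands
REFERENCE="ldap1.example.com"    # reference server, always used as host1
BASE_DN="dc=example,dc=com"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# join_server NEW_HOST: enable replication between the reference server and
# NEW_HOST, then initialize NEW_HOST with the reference server's data.
join_server() {
  local new_host="$1"
  run bin/dsreplication enable \
    --host1 "$REFERENCE" --port1 4444 --bindDN1 "cn=directory manager" \
    --bindPassword1 secret12 --replicationPort1 8989 \
    --host2 "$new_host" --port2 4444 --bindDN2 "cn=directory manager" \
    --bindPassword2 secret12 --replicationPort2 8989 \
    --adminUID admin --adminPassword password --baseDN "$BASE_DN" -X -n &&
  run bin/dsreplication initialize --baseDN "$BASE_DN" \
    --adminUID admin --adminPassword password \
    --hostSource "$REFERENCE" --portSource 4444 \
    --hostDestination "$new_host" --portDestination 4444 -X -n
}

for host in ldap3.example.com ldap4.example.com; do
  join_server "$host"
done
```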

33 thoughts on “OpenDJ: Quick Replication setup”

  1. Vladimir Dzhuvinov 10 May 2011 / 13:27

    If I remember properly, the OpenDJ installation wizard also has a nice screen for enabling replication at setup time.

    • Ludo 10 May 2011 / 15:02

      You’re correct. When doing tests with several servers, I find it much faster to use command line utilities, and I even have scripts that will set up 2 or 3 replicated servers without a single keystroke!

    • permanentr 30 October 2012 / 16:51

      Hi Ludovic,

      Your example is quite simple and to the point. I am trying to reproduce your exercise on Windows 7.
      I am able to replicate my initial ldap server (data etc.) multiple times, but not with the same ports.
      How do the replicas take over (automatically) when the others are down? Are there other settings for that?
      Thus far, the only way I am able to test the replicated servers is to change the port number and/or host ID in my client application; I am looking to automate this.

      Regards
      Sam Saba

      • Ludo 30 October 2012 / 17:01

        When testing replication on a single machine, you need to use different port numbers for each server.
        I run this every day (and my scripts have hardcoded ports 1389/2389 for LDAP, 4444/5444 for Admin, 8989/9989 for replication).
        As for the takeover, your applications must switch over either by themselves or through a load balancer. OpenDJ uses multi-master replication, which means that any server can respond to any request at any point.

        Regards,
        Ludo
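A minimal sketch of a client-side switch-over, assuming the two single-machine instances listen on localhost:1389 and localhost:2389 and that the script runs from an OpenDJ instance directory so that bin/ldapsearch is available. The function names `probe` and `first_available` are illustrative; a load balancer in front of the servers would play the same role.

```shell
#!/usr/bin/env bash
# Sketch of client-side switch-over: use the first server that answers.
SERVERS="localhost:1389 localhost:2389"

# probe HOST PORT: succeeds if the server answers a base search on the root DSE.
# "1.1" asks for no attributes, so only the search result code matters.
probe() {
  bin/ldapsearch -h "$1" -p "$2" -b "" -s base "(objectclass=*)" 1.1 >/dev/null 2>&1
}

# first_available: print the first host:port for which probe succeeds,
# or return 1 if none of the servers answers.
first_available() {
  local server
  for server in $SERVERS; do
    if probe "${server%%:*}" "${server##*:}"; then
      echo "$server"
      return 0
    fi
  done
  return 1
}
```

A client script could then call `target="$(first_available)"` before each batch of operations and send its requests to whichever server is returned.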

    • Ludo 30 June 2011 / 23:06

      Yes secure replication is that fast and easy:

      $ bin/dsreplication enable --host1 ldap1.example.com --port1 4444 \
      --bindDN1 "cn=directory manager" \
      --bindPassword1 secret12 --replicationPort1 8989 --secureReplication1 \
      --host2 ldap2.example.com --port2 4444 --bindDN2 "cn=directory manager" \
      --bindPassword2 secret12 --replicationPort2 8989 --secureReplication2 \
      --adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n

      That’s it !

  2. Karthik 18 April 2012 / 12:26

    Hi,

    Actually I have a Master-slave server setup where I want to achieve data sync.
    I have 2 baseDNs which should be in sync – say baseDN_1 and baseDN_2.

    baseDN_1 should always be enabled, whereas baseDN_2 should be enabled on demand. Initially baseDN_2 will be enabled, and immediately after the initial sync we have to disable it – but some user operations would require a sync of baseDN_2. My question is: is there any way to disable specific baseDNs while leaving the rest enabled, or should we use the dsreplication disable command only?

    P.S. In normal ldap, dsconf has different commands such as disable-repl-agmt, delete-repl-agmt, disable-repl, etc.

    I am badly in need of help. Could you please guide me on this. Thanks in advance.

    • Ludo 18 April 2012 / 16:39

      Hi Karthik,

      You can enable/disable replication for each baseDN separately. Each one is a separate replication domain.
      Check the Replication chapter of the OpenDJ Administration Guide: http://opendj.forgerock.org/doc/admin-guide/

      Regards,

      Ludovic.

      BTW, what you call “normal ldap” looks like Sun DS (it’s the only one I know that has a dsconf tool) and is legacy.
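As a sketch only, disabling replication for a single base DN on one server could look like the following. The host name, base DN and credentials are the ones from the post; the `disable_domain` name and `DRY_RUN` switch are illustrative, and the option names of the disable subcommand are assumed here, so they should be confirmed with `bin/dsreplication disable --help` or the Administration Guide. By default the command is only printed.

```shell
#!/usr/bin/env bash
# Sketch: stop replicating a single base DN on one server (dry run by default).
DRY_RUN="${DRY_RUN:-1}"          # set DRY_RUN=0 to really execute the command

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# disable_domain HOST BASE_DN
# Assumption: option names follow the other dsreplication subcommands;
# check them against: bin/dsreplication disable --help
disable_domain() {
  run bin/dsreplication disable --hostname "$1" --port 4444 \
    --adminUID admin --adminPassword password --baseDN "$2" -X -n
}

disable_domain ldap2.example.com "dc=example,dc=com"
```

Other base DNs replicated by the same server are separate replication domains and are not named in the command, so they would stay enabled.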

      • Karthik 19 April 2012 / 08:49

        Thanks a ton!!!

        I am trying to configure data sync between 2 servers, and as per my understanding I have to do the following steps. Please correct me if I am wrong:

        1. Install OpenDJ on both servers – say S1 and S2
        2. Now run the dsreplication enable command on S2 – this will make sure the data is synced between the servers.
        3. Run the initialize-all command with respect to both servers.
        4. Now I am ready with servers which are in sync.

        Is there anything else that has to be done to successfully establish sync between the two servers?

      • Ludo 21 April 2012 / 19:03

        Hi Karthik,
        No, nothing else to do for replicating 2 servers.
        Install the 2 servers, one may be loaded with the data.
        Run dsreplication enable from either server, to set up replication.
        Run dsreplication initialize or initialize-all to push the data from the primary server to the replica.
        And any change you do on any server will be replicated.
        🙂
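For illustration, the initialize-all variant could be run as sketched below, pushing the data from ldap1.example.com to every other server replicating the base DN. The host name and credentials are those of the post; the `run` helper, the `init_all` name and the `DRY_RUN` switch are illustrative, and the option names should be checked with `bin/dsreplication initialize-all --help`. By default the command is only printed.

```shell
#!/usr/bin/env bash
# Sketch: push the data of one server to all other replicas (dry run by default).
DRY_RUN="${DRY_RUN:-1}"          # set DRY_RUN=0 to really execute the command

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# init_all SOURCE_HOST: SOURCE_HOST is the server whose data is pushed out.
init_all() {
  run bin/dsreplication initialize-all --hostname "$1" --port 4444 \
    --baseDN "dc=example,dc=com" --adminUID admin --adminPassword password -X -n
}

init_all ldap1.example.com
```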

  3. Arnljot Arntsen (@arnljot76) 14 June 2012 / 16:39

    Hi Ludovic

    I’ve got problems getting this to go as smooth as this using 2.4.5 on 64bit linux (Amazon EC2) using Oracle Hotspot 1.6.0_33 64bit JVM.

    Client side in the shell I get:
    Initializing registration information on server idp02:4444 with the contents of server idp01:4444 …..
    Error during the initialization with contents from server
    idp01:4444. Last log details: [14/Jun/2012:14:30:41 +0000]
    severity="NOTICE" msgCount=0 msgID=9896349 message="Set Generation ID task
    quicksetup-reset-generation-id-1 started execution”. Task state:
    STOPPED_BY_ERROR. Check the error logs of idp01:4444 for more
    information.
    See /tmp/opends-replication-8413019362871675684.log for a detailed log of this
    operation.

    My setup command is:
    Machine ONE
    sudo -u opendj ./setup -i -n -a -b "dc=opensso,dc=java,dc=net" -h idp01 -p 1389 --adminConnectorPort 4444 -D "cn=Directory Manager" -w "managerpassword" -q -Z 1636 --generateSelfSignedCertificate -x 1689

    Machine TWO
    sudo -u opendj ./setup -i -n -a -b "dc=opensso,dc=java,dc=net" -h idp02 -p 1389 --adminConnectorPort 4444 -D "cn=Directory Manager" -w "managerpassword" -q -Z 1636 --generateSelfSignedCertificate -x 1689

    Then I enable replication:
    sudo -u opendj bin/dsreplication enable --host1 idp01 --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 managerpassword --replicationPort1 8989 --host2 idp02 --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 managerpassword --replicationPort2 8989 --adminUID admin --adminPassword adminpassword --baseDN "dc=opensso,dc=java,dc=net" -X -n

    The log on both machines is very chatty in the /usr/share/opendj/logs/errors, with lots of (machine ONE):
    [14/Jun/2012:14:20:34 +0000] category=SYNC severity=SEVERE_ERROR msgID=14942387 msg=Replication server 27005 was attempting to connect to replication server idp01/xx.xxx.xxx.90:8989 but has disconnected in handshake phase

    And this:
    [14/Jun/2012:14:12:30 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811281 msg=Timed out while trying to acquire the domain lock for domain “cn=admin data”. The connection attempt from replication server RS(27005) at xx.xxx.xxx.90/xx.xxx.xxx.90:58019 to this replication server RS(27005) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt

    What should I look into to get this fixed?

    • Ludo 14 June 2012 / 18:29

      Hi Arnljot,

      I believe this is an issue of having an EC2 instance being able to communicate to the other one, and possibly with SSL based connections.
      I know that we’ve been able to do it, but I haven’t done it myself, and I don’t have the details at hand. I will look for them and post the notes then.

      • Arnljot Arntsen (@arnljot76) 15 June 2012 / 13:05

        Thank you! 🙂

        I’ve verified that the ports are open, but I haven’t enabled SSL other than what’s done in the example (generating self signed certificates, and opening LDAPS port).

        I’d really appreciate the help.

        With the setup as it is now, and using initialize-all instead of doing it “one by one”, replication seemingly works. But the log complains that the admin user has a different ID on the two boxes.

        [14/Jun/2012:15:56:17 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811232 msg=Directory server DS(25634) has connected to replication server RS(4956) for domain “cn=admin data” at idp01/xx.xxx.xxx.90:8989, but the generation IDs do not match, indicating that a full re-initialization is required. The local (DS) generation ID is 138543 and the remote (RS) generation ID is 67162

        I think a consequence of this is that on idp02 (the second box) the admin-backend.ldif in config is without a userpassword entry.

      • Ludo 19 June 2012 / 22:22

        The message is not about the admin user, but about the “admin data” internal backend, which contains the replication information for all replicas.
        It might be worth trying to run dsreplication initialize for the “cn=admin data” suffix.
        But I have the feeling that you might be better off disabling replication and re-enabling it fully.
        Also, I’m not sure the blog is the best place for troubleshooting issues. You might want to bring the discussion to the OpenDJ mailing list : https://lists.forgerock.org/mailman/listinfo/opendj.

  4. Karthik S Patawardhan 26 June 2012 / 17:00

    Hi Ludovic,

    I have got a query regarding subtree replication. We have “o=data1” as one backend. Under “o=data1” we have “cn=subdata1, o=data1” and “cn=subdata2, o=data1” each are in separate backends on server 1 and server 2.

    We have backends for the following DNs:
    1). “o=data1”
    2). “cn=subdata1, o=data1”
    3). “cn=subdata2, o=data1”

    With the above configuration in place, if we enable replication for “o=data1” between server 1 and server 2, then “cn=subdata1, o=data1” and “cn=subdata2, o=data1” will also be replicated.

    Could you please guide me in finding out whether it’s possible to block replication for “cn=subdata1, o=data1” and “cn=subdata2, o=data1” while keeping replication of “o=data1” enabled?

    Best Regards,
    Karthik S Patawardhan

  5. Micky 28 February 2013 / 20:56

    Hi Ludovic,

    In your 2nd and 3rd steps you suddenly start using “--adminUID admin --adminPassword password” for the enable and initialize replication options. What is the default password for this user? How does one go about resetting it or creating it from scratch? I don’t see the “admin” user created anywhere in the db.

    Thanks in advance

    _Micky

    • Micky 28 February 2013 / 21:24

      Sorry, I take this comment back. The ‘enable’ replication command does create the ‘admin’ user with the specified password. My apologies.

      _Micky

  6. jang 16 October 2013 / 03:23

    Is it possible to have several multi-master replicas?
    And if you are using four multi-master replicas, is there a correlation with load?

  7. php newbie 21 January 2014 / 01:06

    if i have 3 servers or 4 or more….

    on the 1st server i run the command:
    bin/dsreplication enable --host1 ldap1.example.com --port1 4444 \
    --bindDN1 "cn=directory manager" \
    --bindPassword1 secret12 --replicationPort1 8989 \
    --host2 ldap2.example.com --port2 4444 --bindDN2 "cn=directory manager" \
    --bindPassword2 secret12 --replicationPort2 8989 \
    --adminUID admin --adminPassword password --baseDN "dc=example,dc=com" -X -n

    and then i run :
    bin/dsreplication initialize --baseDN "dc=example,dc=com" \
    --adminUID admin --adminPassword password \
    --hostSource ldap1.example.com --portSource 4444 \
    --hostDestination ldap2.example.com --portDestination 4444 -X -n

    to add the 3rd server you said to replace ldap2.example.com with ldap3.example.com in the dsreplication commands
    but do i not also have to replace --host2 with --host3 ?

    or will all the other servers like ldap3.example.com ldap4.example.com ldap5.example.com be referred to as --host2 in the dsreplication enable and dsreplication initialize commands ?

    • Ludo 03 February 2014 / 22:47

      The dsreplication enable command line only has the options --host1 and --host2 (we could have chosen --hostSource and --hostDestination, but that’s not really the proper semantic for enable, even though --host1 is kind of the reference server, and some data like schema may be pushed to host2).
      So to keep your example, ldap3.example.com and ldap4.example.com will be referred to as --host2 in dsreplication enable and as --hostDestination in dsreplication initialize.

      Regards

      • php newbie 10 February 2014 / 16:56

        thanking you for reply….

        so if ldap1.example.com is the “reference server”..

        will updates made directly to ldap2.example.com be pushed to ldap1.example.com and ldap3 & 4

        if ldap1.example.com goes down.. will updates on ldap2.example.com be pushed to ldap3.example.com and ldap4.example.com when ldap1 is down ?

        thank you lots in advance for reply.

      • Ludo 20 February 2014 / 14:25

        Yes, replication is built so that if a server is down, changes made on a server are still pushed to all other working servers. When the stopped server rejoins the replication ring, all missing changes will be sent automatically (unless the downtime was too long, the default limit being 3 days).

  8. Chetan 19 May 2015 / 11:52

    Hi Ludo, this is Chetan. I am having few queries related to Replication server clean-up. I went through lot of OpenAM documentation but couldn’t get much help.

    Here is how my set-up looks:
    We have a multi-server OpenAM set-up where each OpenAM instance uses an embedded OpenDJ. When we launch the second OpenAM instance, we provide the first instance as the replication source, along with the corresponding port, DN, etc. We are using a single domain for replication. This set-up works well under normal conditions. However, we have also enabled Auto Scaling in/out, wherein we launch or terminate OpenAM instances at will. Also, the OpenAM instance host names are dynamic. So when our infrastructure scales in, we terminate one or more OpenAM instances, but we do not perform any clean-up to remove these terminated instances from the OpenAM Site or the OpenDJ replication server configuration. This leaves some stale host entries in the replication server list.

    When I list the replication servers using dsconfig I get the following. However actually only 2 of these 3 instances are active.

    Replication Server : replication-server-id : replication-port : replication-server
    -------------------:-----------------------:------------------:-------------------
    replication-server : 28648                 : 50889            : openam-1-xxxx.example.net:50889, openam-2-xxxx.example.net:50889, openam-3-xxxx.example.net:50889

    Now when we scale out again, we provide any of the active OpenAM instances as the replication server (host1), and the new instance tries to replicate the data from that active instance. So here we do not have any particular instance as a reference, but we assume that over time all servers will hold the same data.

    Now the questions-
    1. Do you see any problem in such set-up? Is there anything we are doing wrong or not the best practice?
    2. How do we clean up the old replication servers? dsreplication supports the disable command, but for that (I guess) the server must be alive and listening on the admin port (4444). In our case we don’t know when the server will be terminated (based on the scale-in policy). Can we remove/disable the replication server after it’s terminated?
    3. Do stale or unreachable replication hosts cause any problem in data replication?

    • Ludo 19 May 2015 / 16:03

      There are 2 sides to what you are doing.
      One is about the list of servers that are known by all other instances at the replication level. This was addressed in another post : https://ludopoitou.com/2012/01/09/disabling-replication-in-opendj-2-4/, and this should answer question 2.
      The other side is about what we call the replication vector within the data itself. The replication vector identifies the state of the data by a list of pairs: ReplicaID, ChangeSequenceNumber of the most recent change for that replica. Because this identifies the state of the data, and allows the replication service to identify changes that may be missing, there is no easy way to remove references to replicas from that vector. As a result, if you regularly instantiate a new replica and remove it, the replication vector will grow in size, and this can have a negative impact on replication over time, mostly in terms of resource consumption and performance (I’ve seen a server with 60 replica identifiers whereas there were only 6 active servers, in a test environment).
      I hope this helps.
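To make the idea of a replication vector concrete, here is a toy model in shell. It is not how OpenDJ stores or compares its state: the "replicaID:changeNumber" text format, the function names and the numbers are all made up for illustration, and real change sequence numbers are not plain integers.

```shell
#!/bin/sh
# Toy model of a replication vector: one "replicaID:changeNumber" pair per
# replica, separated by spaces. Format and values are invented for the example.

# last_seen VECTOR ID: print the change number recorded for ID, or 0 if the
# vector has no pair for that replica.
last_seen() {
  for pair in $1; do
    if [ "${pair%%:*}" = "$2" ]; then
      echo "${pair##*:}"
      return 0
    fi
  done
  echo 0
}

# missing_changes LOCAL REMOTE: print the IDs of the replicas for which
# REMOTE has seen more recent changes than LOCAL.
missing_changes() {
  for remote in $2; do
    id="${remote%%:*}"
    if [ "$(last_seen "$1" "$id")" -lt "${remote##*:}" ]; then
      echo "$id"
    fi
  done
}

missing_changes "3496:120 30630:87" "3496:120 30630:95 23803:4"
```

In this model, every replica that has ever made a change keeps its pair in the vector, which is why replicas that are created and thrown away regularly make the vector grow.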

      • Chetan 01 June 2015 / 12:43

        Thanks for the reply Ludo. This helps. However I am having another question regarding setting up multimaster replication. You have mentioned that while setting up 3rd replication server (say RS3) we should run “dsreplication enable” with host1 as RS1 and host2 as RS3.
        So assuming we already have RS1 and RS2 configured, can we use RS2 instead of RS1 as host1 here?
        The reason I am asking is that in my setup the replication servers are ephemeral and we can’t rely on a single replication server as the primary server from which the rest/new servers can be initialized. Currently, as we add a new RS, we find any of the active RSs and try to replicate/initialize from that server. But with this approach we are facing an issue where not all the servers list the same replication servers (using the dsconfig list-replication-server cmd).

        Do we need to run enable replication command sequentially for all the existing replication servers as host1 and new host as host2?

      • Ludo 01 June 2015 / 19:56

        Yes, you can choose any existing replica to enable replication.
        However, it is a best practice to always use the same one for host1.

        You should not need to enable replication sequentially with all existing servers… The knowledge is distributed through replication, but may come after initialization and changes have been received.

  9. Ajay Kumar 15 January 2016 / 12:12

    Hi Ludo,

    I have set up replication on version 2.6.0 for four servers.

    After initializing replication, whenever I make a new entry on server 1 it is not replicated to the other servers.

    Could you please suggest what could be the reason? Below is the replication status output.

    Suffix DN           : Server       : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    --------------------:--------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    o=XXX,dc=XXX,dc=com : Server1:4444 : 54      : true                : 3496  : 24637 : 8989        : 0        :              : false
    o=XXX,dc=XXX,dc=com : Server2:4444 : 54      : true                : 30630 : 30260 : 8989        : 0        :              : false
    o=XXX,dc=XXX,dc=com : Server3:4444 : 55      : true                : 23803 : 16978 : 8989        : 0        :              : false
    o=XXX,dc=XXX,dc=com : Server4:4444 : 54      : true                : 29669 : 2696  : 8989        : 0        :              : false

    Thanks in advance.

  10. apapap 11 April 2016 / 07:50

    I am using OpenDJ-2.4.6 along with Oracle JDK 7.80, and I want to run Multi-master replication on 2 of my servers, the OS for these servers is Amazon Linux.

    The OpenDJ setup runs perfectly fine; I can start the server too without any errors.

    It is when I run the “dsreplication” script as follows:

    ./dsreplication enable --host1 server1.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 "Passw0rd" --replicationPort1 1388 --host2 server2.example.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 "Passw0rd" --replicationPort2 1388 --adminUID admin --adminPassword "Passw0rd" --baseDN "dc=example,dc=com"
    the script hangs on the following step:

    Initializing registration information on server server2.example.com:4444 with the contents of server server1.example.com:4444 …..

    And on checking the logs, there is no error reported in there.
    But, when I run the following command:

    ./dsreplication status -h localhost -p 4444 --adminUID admin --adminPassword "Passw0rd" -X

    it throws the following error:

    The displayed information might not be complete because the following errors were encountered reading the configuration of the existing servers: Error on server2.example.com:4444: An error occurred connecting to the server. Details: javax.naming.AuthenticationException: [LDAP: error code 49 - Invalid Credentials] Error on server:4444: An error occurred connecting to the server. Details: javax.naming.AuthenticationException: [LDAP: error code 49 - Invalid Credentials]

    Please help me.

    Thanks in advance.

  11. apapap 11 April 2016 / 13:19

    How can I change the OpenDJ configuration?
    Example: during OpenDJ setup, I mistakenly provided an incorrect host2 address. How can I edit the configuration after the setup has completed successfully?

    • Ludo 12 April 2016 / 09:23

      It is probably faster to delete the 2 instances and re-run the commands than trying to change the host name of a server and regenerate the certificates required for administration and replication.
