How long can a slave be isolated from the master and still re-establish replication?

Adam Lewenberg
We recently had some network issues where some of the KDC slaves were
isolated from the master for a few hours. When the network came back,
the slaves were no longer attempting to contact the master.

Should a properly configured replication setup be able to automatically
re-establish replication even after such a long network outage? Are there
configuration options for ipropd-master/ipropd-slave that need to be set
to take long outages into account?

Thank you, Adam Lewenberg


Re: How long can a slave be isolated from the master and still re-establish replication?

Harald Barth-2

> Should a properly configured replication setup be able to
> automatically re-establish replication even after such a long network
> outage? Are there configuration options for ipropd-master/ipropd-slave
> that need to be set to take long outages into account?

Have you found the time-missing, time-gone and time-lost options in the man page?

However, I have seen that different versions had bugs which made it
not work anyway (you did not tell me your versions) and this
functionality is probably poorly tested.

My experience is that it's best to wrap the ipropd-* processes in
nanny scripts which either restart them automatically or raise an
alarm (depending on your paranoia level) if they go missing.
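
A minimal sketch of what such a nanny wrapper might look like, in Python.
This is an illustration only, not something taken from Heimdal: the
function name nanny, the restart limit and the backoff are all
assumptions, and the command to supervise is passed on the command line
rather than hard-coded. It assumes the supervised process stays in the
foreground (i.e. is not started with a detach option), otherwise the
wrapper sees an immediate exit.

```python
import subprocess
import sys
import time


def nanny(cmd, max_restarts=5, backoff=1.0, alarm=print, sleep=time.sleep):
    """Run cmd and restart it each time it exits.

    After max_restarts restarts, raise a final alarm and give up.
    Returns the number of restarts that were performed.
    All parameter names and defaults are hypothetical.
    """
    restarts = 0
    while True:
        proc = subprocess.Popen(cmd)
        code = proc.wait()  # blocks until the supervised process exits
        if restarts >= max_restarts:
            alarm(f"{cmd[0]} exited with {code}; giving up after {restarts} restarts")
            return restarts
        restarts += 1
        alarm(f"{cmd[0]} exited with {code}; restart {restarts}/{max_restarts}")
        sleep(backoff * restarts)  # wait a little longer before each retry


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is the command to supervise.
    nanny(sys.argv[1:])
```

It would be started with the full ipropd-slave command line as its
arguments. Swapping the default alarm (print) for something that sends
mail or pages someone covers the "raise an alarm" variant.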

Harald.

Re: How long can a slave be isolated from the master and still re-establish replication?

Adam Lewenberg
We will have to do some testing and see what happens.

I looked at the man page and this is what it says:

    --time-gone=time
              time of inactivity after which a slave is considered gone (default 5 min)

    --time-lost=time
              time before server is considered lost (default 5 min)

This raises more questions than it answers:

For --time-lost does "server" refer to the slave or to the master?

What does the master do when it considers a slave "gone"?

What are the time units for those parameters (seconds, minutes, hours)?



On 5/15/2018 6:30 AM, Harald Barth wrote:

>
>> Should a properly configured replication setup be able to
>> automatically re-establish replication even after such a long network
>> outage? Are there configuration options for ipropd-master/ipropd-slave
>> that need to be set to take long outages into account?
>
> Have you found the time-missing, time-gone and time-lost options in the man page?
>
> However, I have seen that different versions had bugs which made it
> not work anyway (you did not tell me your versions) and this
> functionality is probably poorly tested.
>
> My experience is that it's best to wrap the ipropd-* processes in
> nanny scripts which either restart them automatically or raise an
> alarm (depending on your paranoia level) if they go missing.
>
> Harald.
>