Some principals not replicating

Some principals not replicating

Adam Lewenberg
I have a Kerberos master running version 7.1. I am attempting to
replicate to slaves, some of which run version 7.1 and some of which run
version 1.5.2.

PROBLEM: Some of the principals will not replicate.

If I go on the master and change the password of one of these
problematic principals, I see this in the replica's logs:

--------------------------------------------------------------
(version 1.5.2)
2018-06-15T21:17:47 replaying entry 131870
2018-06-15T21:17:47 kadm5_log_replay: 131870. Lost entry entry, Database
out of sync ?: No such entry in the database (36150275)
2018-06-15T21:17:47 Ignoring command 8

(version 7.1)
2018-06-15T14:17:47.138560-07:00 kdc-test1 ipropd-slave[18033]: slave
status change: up-to-date with version: 131870 at 2018-06-15T14:17:47
--------------------------------------------------------------

In both cases, the principal is not in either replica's database. That
is, using a 'get' command returns "Principal does not exist".

On the master, the principal looks like this:
--------------------------------------------------------------
            Principal: [hidden email]
     Principal expires: never
      Password expires: 2019-06-15 21:17:47 UTC
  Last password change: 2018-06-15 21:17:47 UTC
       Max ticket life: 1 day 1 hour
    Max renewable life: 1 week
                  Kvno: 7
                 Mkvno: unknown
Last successful login: never
     Last failed login: never
    Failed login count: 0
         Last modified: 2018-06-15 21:17:47 UTC
              Modifier: kadmin/[hidden email]
            Attributes: disallow-svr, requires-pre-auth
              Keytypes: aes256-cts-hmac-sha1-96(pw-salt)[7],
aes128-cts-hmac-sha1-96(pw-salt)[7], des3-cbc-sha1(pw-salt)[7],
arcfour-hmac-md5(pw-salt)[7]
           PK-INIT ACL:
               Aliases:
--------------------------------------------------------------

One extra piece of information: the master's database was populated by
hprop'ing to it from a 1.5.2 master.


QUESTION: What could be a reason for this principal not to replicate?



Adam Lewenberg

Re: Some principals not replicating

Viktor Dukhovni-2


> On Jun 15, 2018, at 5:31 PM, Adam Lewenberg <[hidden email]> wrote:
>
> PROBLEM: Some of the principals will not replicate.

Well, updates to the principal are not replicating...

> If I go on the master and change the password of one of these problematic principals, I
> see this in the replica's logs:

That's a "modify" not a "create" and modify requires the object
to already be there.  The iprop log is "sparse", recording only
the modified data when doing "modify", so the principal can't
be created just from the latest "modify" record.
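
To illustrate the point about sparse "modify" records, here is a minimal
sketch in Python. It is a toy model, not Heimdal's iprop code: the
LogRecord type, the replay() function, the principal names, and the field
names are all hypothetical. It only shows why a record that carries just
the changed fields cannot create an entry the replica never had.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    version: int
    op: str          # "create" or "modify" (toy model, hypothetical names)
    principal: str
    fields: dict     # full entry for "create", changed fields only for "modify"


def replay(db, log):
    """Apply records in order; return the versions that could not be applied."""
    lost = []
    for rec in log:
        if rec.op == "create":
            # A create record carries the whole entry.
            db[rec.principal] = dict(rec.fields)
        elif rec.op == "modify":
            if rec.principal not in db:
                # Sparse record: there is nothing to merge the changes into.
                lost.append(rec.version)
                continue
            db[rec.principal].update(rec.fields)
    return lost


# The replica has "alice" but never received "bob".
replica = {"alice": {"kvno": 3, "max_life": 90000}}
log = [
    LogRecord(131869, "modify", "alice", {"kvno": 4}),
    LogRecord(131870, "modify", "bob", {"kvno": 7}),
]
lost = replay(replica, log)
print("lost versions:", lost)
print("principals on replica:", sorted(replica))
```

In this model the replica still ends up at the latest log version, but the
entry that was missing before the modify is still missing afterwards.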

> QUESTION: What could be a reason for this principal not to replicate?

You need to stop the slaves, blow away their database and logs,
and replicate the full database from scratch.

--
        Viktor.

Re: Some principals not replicating

Adam Lewenberg


On 6/15/2018 3:04 PM, Viktor Dukhovni wrote:

>
>
>> On Jun 15, 2018, at 5:31 PM, Adam Lewenberg <[hidden email]> wrote:
>>
>> PROBLEM: Some of the principals will not replicate.
>
> Well updates to the principal are not replicating...
>
>> If I go on the master and change the password of one of these problematic principals, I
>> see this in the replica's logs:
>
> That's a "modify" not a "create" and modify requires the object
> to already be there.  The iprop log is "sparse", recording only
> the modified data when doing "modify", so the principal can't
> be created just from the latest "modify" record.
>
>> QUESTION: What could be a reason for this principal not to replicate?
>
> You need to stop the slaves, blow away their database and logs,
> and replicate the full database from scratch.

I did this, on three different slaves. The problematic principals do not
appear in the slaves' databases. To be clear: even after initial
replication (starting from nothing on the slave), some of the principals
do not appear in the slave's database.

This (or something much like it) appears in the initial replication on
three separate 1.5.2 slaves:

2018-06-15T17:45:12 ipropd-slave started at version: 0
2018-06-15T17:45:12 receive complete database
2018-06-15T17:45:47 receive complete database, version 114134
2018-06-15T17:46:44 replaying entry 114135
2018-06-15T17:46:44 replaying entry 114136
2018-06-15T17:46:44 replaying entry 114137
2018-06-15T17:46:44 replaying entry 114138

... many lines like this until ...

2018-06-15T17:46:45 replaying entry 131686
2018-06-15T17:46:45 replaying entry 131687
2018-06-15T17:46:45 replaying entry 131688
2018-06-15T17:46:45 replaying entry 131689
2018-06-15T17:46:45 Ignoring command 8
2018-06-15T17:50:03 replaying entry 131690
2018-06-15T17:50:03 Ignoring command 8
2018-06-15T17:50:03 replaying entry 131691
2018-06-15T17:50:03 Ignoring command 8
2018-06-15T17:50:03 replaying entry 131692
2018-06-15T17:50:03 Ignoring command 8
2018-06-15T17:51:03 replaying entry 131693
2018-06-15T17:51:03 Ignoring command 8
2018-06-15T17:56:52 replaying entry 131694
2018-06-15T17:56:52 Ignoring command 8
2018-06-15T18:00:03 replaying entry 131695
2018-06-15T18:00:03 Ignoring command 8

... more lines much like this until ...

2018-06-15T20:16:57 Ignoring command 8
2018-06-15T20:18:53 replaying entry 131814
2018-06-15T20:18:53 kadm5_log_replay: 131814. Lost entry entry, Database
out of sync ?: No such entry in the database (36150275)
2018-06-15T20:18:53 Ignoring command 8
2018-06-15T20:19:23 Ignoring command 8
2018-06-15T20:20:02 replaying entry 131815



Re: Some principals not replicating

Greg Hudson
On 06/15/2018 06:29 PM, Adam Lewenberg wrote:
> I did this, on three different slaves. The problematic principals do not
> appear in the slaves' databases. To be clear: even after initial
> replication (starting from nothing on the slave), some of the principals
> do not appear in the slave's database.

What database type is the master KDC using?  If you dump the master DB
and look for one of the principals which is missing in the other
databases, is it present in the dump file?
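
One way to run that comparison is sketched below. It assumes text dumps
of the master and replica databases with one entry per line and the
principal name as the first whitespace-separated field; check that
assumption against your own dump files first. The function names, the
file names, and the sample entries are hypothetical.

```python
import os
import tempfile


def principals_in_dump(path):
    """Return the set of first fields (assumed to be principal names) in a dump file."""
    names = set()
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            parts = line.split(None, 1)
            if parts:
                names.add(parts[0])
    return names


def missing_on_replica(master_dump, replica_dump):
    """List principals present in the master dump but absent from the replica dump."""
    return sorted(principals_in_dump(master_dump) - principals_in_dump(replica_dump))


# Self-contained demo with made-up dump contents.
tmpdir = tempfile.mkdtemp()
master_path = os.path.join(tmpdir, "master.dump")
replica_path = os.path.join(tmpdir, "replica.dump")
with open(master_path, "w", encoding="utf-8") as fh:
    fh.write("alice@EXAMPLE.ORG FIELDS\nfprefect@EXAMPLE.ORG FIELDS\n")
with open(replica_path, "w", encoding="utf-8") as fh:
    fh.write("alice@EXAMPLE.ORG FIELDS\n")

missing = missing_on_replica(master_path, replica_path)
print(missing)
```

If a principal shows up in the master dump but not in the replica dump,
the problem is in propagation rather than in the master's stored entry.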
Re: Some principals not replicating

Adam Lewenberg
In reply to this post by Adam Lewenberg
I think I was not clear in my original post. Let me clarify.

I have a master KDC running Heimdal 7.1. In its database is a principal
called "fprefect" which, as far as I can tell, acts like a normal
principal. I can do "get fprefect" and the output looks normal. If I
point to this master and do a "kinit fprefect" I get a TGT.

However, if I bring up a new slave KDC (no database, no transaction log)
that points to this master, the KDC _appears_ to get the entire database
from the master, except that the principal "fprefect" is missing. This
happens if the slave KDC runs 7.1 or if it runs 1.5.2. (There are some
strange messages in the iprop log on the 1.5.2 slave; see my original
e-mail for details.)

I don't know how this principal got into this strange state on the
master, and I don't know how to reproduce this issue.

It makes me think that the database on the master is corrupted in some
subtle way.

I am hoping that someone can tell me some way to query or examine the
database on the master to get some information that might throw some
light on why this particular principal behaves this way.

Adam Lewenberg



Re: Some principals not replicating

Viktor Dukhovni-2
In reply to this post by Adam Lewenberg


> On Jun 15, 2018, at 6:29 PM, Adam Lewenberg <[hidden email]> wrote:
>
> This (or something much like it) appears in the initial replication on three separate 1.5.2 slaves:

You *really* should upgrade the slaves as soon as possible,
however:

> 2018-06-15T17:45:12 ipropd-slave started at version: 0
> 2018-06-15T17:45:12 receive complete database
> 2018-06-15T17:45:47 receive complete database, version 114134

The master has a complete database snapshot whose version is
contiguous with the log.  Therefore, instead of sending you
the full database, you're getting the snapshot + incremental
logs after that:

> 2018-06-15T17:46:44 replaying entry 114135
> 2018-06-15T17:46:44 replaying entry 114136
> 2018-06-15T17:46:44 replaying entry 114137
> 2018-06-15T17:46:44 replaying entry 114138

...

However, there's something amiss with the snapshot
or logs.  Better to delete the snapshot on the
master and let it generate a new one, then resync
the slaves.  Or there's something wrong with the
iprop code on the slaves; in any case, a truly
complete snapshot stands a better chance.

--
        Viktor.

Re: Some principals not replicating

Adam Lewenberg


On 6/15/2018 6:21 PM, Viktor Dukhovni wrote:

>
>
>> On Jun 15, 2018, at 6:29 PM, Adam Lewenberg <[hidden email]> wrote:
>>
>> This (or something much like it) appears in the initial replication on three separate 1.5.2 slaves:
>
> You *really* should upgrade the slaves as soon as possible,
> however:
>
>> 2018-06-15T17:45:12 ipropd-slave started at version: 0
>> 2018-06-15T17:45:12 receive complete database
>> 2018-06-15T17:45:47 receive complete database, version 114134
>
> The master has a complete database snapshot whose version is
> contiguous with the log.  Therefore, instead of sending you
> the full database, you're getting the snapshot + incremental
> logs after that:
>
>> 2018-06-15T17:46:44 replaying entry 114135
>> 2018-06-15T17:46:44 replaying entry 114136
>> 2018-06-15T17:46:44 replaying entry 114137
>> 2018-06-15T17:46:44 replaying entry 114138
>
> ...
>
> However, there's something amiss with the snapshot
> or logs.  Better to delete the snapshot on the
> master and let it generate a new one, then resync
> the slaves.  Or there's something wrong with the
> iprop code on the slaves, in any case a truly
> complete snapshot stands a better chance.

Thanks for your quick reply.

When you say "delete the snapshot on the master and let it generate a
new one" I assume you meant "iprop-log truncate --reset", yes?

Anyway, I did that. All the slaves re-synced and now the "bad"
principals are showing up on the slaves.

Thanks!

Adam Lewenberg


