rcache ops must be O(log N), not O(N)


Nico Williams

Any rcache design based on a linear log structure will be O(N) per
lookup.  For a busy service this is not as good as O(log N), of course.
Even with fsync() avoidance, O(N) behavior is probably damaging to the
mechanism's reputation.

Roland's concept for this is a scalable hash table data structure in a
file.  An rcache entry must be searched for in every hash table in the
file, and must be inserted in a slot occupied by an expired entry, or
else in an empty slot (adding a table if need be).

The layout is trivial: the file consists of some number of hash
tables, each of which is bigger than the previous one, and all of which
consist of some number of buckets of a single fixed size.

Cuckoo hashing can be used to improve locality of reference.

This data structure is self-cleaning: expired entries are simply
overwritten in place.  Its size is limited by peak throughput, since
the tables need only hold roughly one entry lifetime's worth of
traffic.  There is never a need to re-initialize or truncate it.

Because the entries are fixed-size, a hash (or similar; see below) of
each Authenticator is needed, which renders the rcache a probabilistic
data structure: distinct requests can collide in the stored bits and
produce a false replay detection.  This can be remedied trivially (a
subject for another post, if there's interest).

More interesting is the fact that such a structure could be used for
various caches.  Roland suggests using it as a cache of ticket
decryptions to speed up AP-REQ processing times...

...which brings me to another important optimization: there's no need to
hash the Authenticator for the rcache.  Just take however many bits are
needed from the enc-part's ciphertext (preferably the authentication
tag, but this isn't strictly necessary).  It's pseudo-random enough
already.  (A client can make some bits of the ciphertext be whatever it
wants in the case of CTS modes, but recall that we're talking about
attackers who lack credentials and are trying to replay messages.)

This last optimization can also be applied to Roland's ticket decryption
cache idea.  Hashing the ticket wouldn't yield that much of a
performance improvement otherwise.

A scalable hash table has other cache uses as well.  For example, it
could be used for speeding up a fast re-auth ticket protocol extension.
Suppose the AP-REP could carry a small "fast re-auth ticket" when a
client's ticket is too big (e.g., because of PACs).  A fast re-auth
ticket might be as small as a small integer, provided there was a fast
cache from which to retrieve the original decrypted ticket.

Heck, a scalable hash table could be used as an index for the
traditional file-based ccache.

With a simple extension to be non-probabilistic, and to allow
variable-sized entries, the scalable hash table can even be used as a
simple DB for the KDC.

IOW, here we have a single data structure that works for almost every
case where we need an index on stable storage in Kerberos
implementations!  All except keytabs, which are generally small enough
that there's no point in using anything other than a linear list of
entries.

Nico
--
_______________________________________________
krbdev mailing list             [hidden email]
https://mailman.mit.edu/mailman/listinfo/krbdev