fd (file descriptor) leak in replay cache

Parity error
Hi,

We have been using the Kerberos 1.10.3 library, and we find that it
occasionally keeps many files like the following open until they
exhaust the fd limit of the process:

lsof output
--------------
myproc 8777 nutanix  254u   REG       9,1       7048    537432 /var/tmp/krb5_RCWZN1Er (deleted)
myproc 8777 nutanix  255u   REG       9,1      19059    537404 /var/tmp/krb5_RCijDPtN (deleted)
myproc 8777 nutanix  256u   REG       9,1      19059    537404 /var/tmp/krb5_RCijDPtN (deleted)

Has an fd leak like this been fixed since then?

- Gautham
________________________________________________
Kerberos mailing list           [hidden email]
https://mailman.mit.edu/mailman/listinfo/kerberos
Re: fd (file descriptor) leak in replay cache

Robbie Harwood
Parity error <[hidden email]> writes:

> We have been using the kerberos 1.10.3 library and we find that
> occasionally a lot of the following files are kept open by the library
> and they fill up the fd limit of the process,

Hopefully someone else has a more detailed answer for you, but there
have been 82 commits since then which are leak fixes, some of which may
relate to the problem.  So: "probably".

Unfortunately, krb5-1.10 is from early 2012.  MIT upstream focuses most
support efforts on the 1.15 series (current release) and the 1.14 series
(maintenance release).

If you can reproduce it on another system, perhaps try with a newer krb5
and see?  (Based on the version, you're using CentOS 6; CentOS 7 has
krb5-1.14.1 at the time of writing.)

--Robbie

Re: fd (file descriptor) leak in replay cache

Parity error
Rob, I have tried the latest 1.14.5 and still face the same issue.
Basically, the number of open fds to files like /var/tmp/krb5_RCxxxxxx
keeps increasing, almost monotonically. There are also open fds to
/var/tmp/host_1000, but their number rises and falls and stays
within 20 or 30 descriptors (all to the same file). The fds to
/var/tmp/krb5_RCxxxxxx, however, have kept increasing into the thousands.
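
One way to track this over time without lsof is to group a process's descriptors by the path the kernel reports for them. The sketch below is only an illustration and assumes a Linux system with /proc mounted; the function name open_fd_targets is hypothetical and is not part of krb5.

```python
import os
from collections import Counter


def open_fd_targets(pid="self"):
    """Count a process's open descriptors, grouped by reported target path.

    Assumes Linux: /proc/<pid>/fd holds one symlink per open descriptor.
    """
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were iterating
        counts[target] += 1
    return counts


if __name__ == "__main__":
    # Show the most frequently open targets for the current process.
    for target, n in open_fd_targets().most_common(10):
        print(n, target)
```

Sampling this periodically for the server's pid should show which target accounts for the growth. Like lsof, /proc marks unlinked files with a " (deleted)" suffix on the path.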

It would help a lot for my debugging if you could tell me how these
krb5_RCxxxxxx files are used. There is a rename and dup also going on.
I have made sure that the security context is deleted with a call to
gss_delete_sec_context(). However the acceptor_cred_handle is obtained
once when the process starts and is given to each invocation of
gss_accept_sec_context() and only freed when the process terminates.

On 4/20/17, Robbie Harwood <[hidden email]> wrote:

> If you can reproduce it on another system, perhaps try with a newer krb5
> and see?
Re: fd (file descriptor) leak in replay cache

Greg Hudson
On 04/21/2017 10:27 AM, Parity error wrote:
> It would help a lot for my debugging if you could tell me how these
> krb5_RCxxxxxx files are used. There is a rename and dup also going on.

In its current design, the replay cache needs to be periodically
expunged so that it does not grow without bound.  To do this, the code
opens a temporary file named krb5_RCxxxxxx, writes the non-expired
entries to the file, then renames it over the existing rcache.

It's possible that lsof is reporting the krb5_RCxxxxxx names when the fd
is actually (after the rename) pointing to the host_1000 file.
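
The expunge pattern described above can be sketched roughly as follows. This is not the krb5 implementation: the one-line-per-entry text format ("expiry entry_id") and the function name expunge are assumptions made for illustration, and the descriptor is deliberately left open so that its identity after the rename can be checked.

```python
import os
import tempfile
import time


def expunge(rcache_path, now=None):
    """Rewrite rcache_path, keeping only entries that have not expired.

    Hypothetical format: one "expiry_timestamp entry_id" pair per line.
    Returns (fd, tmp_path); fd is left open on purpose for inspection.
    """
    now = time.time() if now is None else now
    directory = os.path.dirname(rcache_path) or "."
    # Create the temporary file in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="krb5_RC", dir=directory)
    with open(rcache_path) as old:
        for line in old:
            if not line.strip():
                continue
            if float(line.split()[0]) > now:
                os.write(fd, line.encode())
    # On POSIX this atomically replaces the old cache file.
    os.rename(tmp_path, rcache_path)
    return fd, tmp_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        rcache = os.path.join(d, "host_1000")
        with open(rcache, "w") as f:
            f.write("100 expired-entry\n")
            f.write("9999999999 live-entry\n")
        fd, tmp_path = expunge(rcache, now=1000)
        st_fd, st_path = os.fstat(fd), os.stat(rcache)
        # The descriptor opened under the temporary name now refers to
        # the same inode as the cache's final name.
        print((st_fd.st_dev, st_fd.st_ino) == (st_path.st_dev, st_path.st_ino))
        print(os.path.exists(tmp_path))
        os.close(fd)
```

On a POSIX system the descriptor should compare equal to the final path, and the temporary name should no longer exist. That fits the suggestion that a tool may keep showing the name a file was opened under even though the descriptor now refers to the renamed cache.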