open - resource temporarily unavailable


Andrew P. Cherepenko

Hello list,
   open() sometimes fails when creating a file:
     couldn't open "myfile.txt": resource temporarily unavailable
   This happens both in background processes and interactively (e.g. in
   Emacs when trying to save a file).

system: Debian 8.5
with kernel: linux-image-3.16.0-4-amd64
and systemd 215-17+deb8u4

Is this related to some limit imposed on kernel resources?

Following the suggestions about checking and changing such limits given here:
http://unix.stackexchange.com/questions/253903/creating-threads-fails-with-resource-temporarily-unavailable-with-4-3-kernel

I checked my own system.

me:~$ cat /proc/sys/kernel/threads-max
96126

me:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 48063
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 32768
cpu time               (seconds, -t) unlimited
max user processes              (-u) 48063
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

me:~$ ls -1d /proc/*/task/* | wc -l
618

I thought the most important one was the limit on open files:
me:~$ ulimit -n
65536

But the number actually in use is:
jeder@mhfpklytstime:~$ lsof | wc -l
26117

So those seemingly reasonable suggestions got me nowhere.
It is also quite frustrating that neither the journal nor /var/log/syslog
even mentions the errors.

Am I missing something basic?


Thanks,
   Andrey

Re: open - resource temporarily unavailable

Sven Joachim
On 2016-06-16 21:46 +0600, Andrew P. Cherepenko wrote:

> Hello list,
>   'open()' for creating file sometimes returns an error:
>     couldn't open "myfile.txt": resource temporarily unavailable
>     either in background process or interactively (ex: in Emacs when
> trying to save a file).

Are you sure that it's opening the file which fails in that way, and not
writing to it?  Because, according to the manpages, EAGAIN is a possible
error in write(2), but not in open(2).  Quoting the write() manpage:

,----
| EAGAIN The file descriptor fd refers to a file other than a socket
|        and has been marked nonblocking (O_NONBLOCK), and the write
|        would block.  See open(2) for further details on the O_NON‐
|        BLOCK flag.
`----

> system: Debian 8.5
> with kernel: linux-image-3.16.0-4-amd64
> and systemd 215-17+deb8u4
>
> Is that concerned with some imposed limits on the kernel resources ?

Probably not.  The only limit that should play a role here is the number
of file descriptors, and the error would be EMFILE ("Too many open
files") rather than EAGAIN.

Cheers,
       Sven
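
The write(2) case quoted above is easy to reproduce. The following is a minimal sketch in C, assuming Linux; the function name is invented for illustration. It fills a pipe whose write end is marked O_NONBLOCK and returns the errno of the first write() that fails.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Fill a non-blocking pipe until write() fails and return the errno seen.
 * On Linux the expected value is EAGAIN. */
int errno_from_full_pipe(void)
{
    int fds[2];
    char buf[4096];
    int fl;
    int err = 0;

    if (pipe(fds) < 0)
        return errno;

    /* Mark the write end non-blocking. */
    fl = fcntl(fds[1], F_GETFL);
    if (fl < 0 || fcntl(fds[1], F_SETFL, fl | O_NONBLOCK) < 0) {
        err = errno;
    } else {
        memset(buf, 'x', sizeof buf);
        /* Nobody reads from the pipe, so it eventually fills up. */
        while (write(fds[1], buf, sizeof buf) >= 0)
            ;
        err = errno;
    }

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

A plain open() with O_WRONLY|O_CREAT|O_TRUNC on a regular file does not go through this path, which is why the distinction between open() and write() matters here.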

Re: open - resource temporarily unavailable

dummy user

Sven Joachim <svenjoac <at> gmx.de> writes:

>
> On 2016-06-16 21:46 +0600, Andrew P. Cherepenko wrote:
>
> > Hello list,
> >   'open()' for creating file sometimes returns an error:
> >     couldn't open "myfile.txt": resource temporarily unavailable
> >     either in background process or interactively (ex: in Emacs when
> > trying to save a file).
>
> Are you sure that it's opening the file which fails in that way, and not
> writing to it?

Yes. I am pretty sure.

>
> > system: Debian 8.5
> > with kernel: linux-image-3.16.0-4-amd64
> > and systemd 215-17+deb8u4
> >
> > Is that concerned with some imposed limits on the kernel resources ?
>
> Probably not.  The only limit that should play a role here is the number
> of file descriptors, and the error would be EMFILE ("Too many open
> files") rather than EAGAIN.

The computer is a kind of server.
Every second it writes to about 100 files, and every 8 hours it creates new
files and closes the old ones.
It has been running 24/7 for years without stopping.
All files are on its local disk (in three partitions).
It all worked fairly well on Debian 7 (Linux kernel 3.2.0-4-amd64) for quite
a long time, until I moved it to Debian 8 a few weeks ago.


Thank you Sven, but I'd rather get some advice on how I could trace and
pin down the problem.

For the present I haven't even got a clue where it could be.
Is it in the kernel, in systemd, or elsewhere?

Following
http://0pointer.de/blog/projects/resources.html
(is it perhaps outdated?)

I put DefaultControllers= in /etc/systemd/system.conf
on my desktop and rebooted it.
I couldn't see any difference in the journal
(and DefaultControllers is not mentioned
in 'man systemd-system.conf').
Admittedly it was a shot in the dark, and I don't like working that way.
But what else can I do?
There is nothing in the logs concerning the error, and it's not easy to reproduce.


Thanks,
  Andrey


Re: open - resource temporarily unavailable

tomas@tuxteam.de
On Fri, Jun 17, 2016 at 09:24:37AM +0000, Andrey wrote:

>
> Sven Joachim <svenjoac <at> gmx.de> writes:
>
> >
> > On 2016-06-16 21:46 +0600, Andrew P. Cherepenko wrote:
> >
> > > Hello list,
> > >   'open()' for creating file sometimes returns an error:
> > >     couldn't open "myfile.txt": resource temporarily unavailable
> > >     either in background process or interactively (ex: in Emacs when
> > > trying to save a file).
> >
> > Are you sure that it's opening the file which fails in that way, and not
> > writing to it?
>
> Yes. I am pretty sure.

If the problem is somehow repeatable, you might try to run your
program under strace to actually pinpoint the system call setting
EAGAIN (perhaps you've already done that, but I couldn't infer
that from your post).

Other things to check: does that happen on any files? On a
specific file system? If yes: how is that one mounted?

regards
-- t

Re: open - resource temporarily unavailable

dummy user
 <tomas <at> tuxteam.de> writes:

>
>
> On Fri, Jun 17, 2016 at 09:24:37AM +0000, Andrey wrote:
> >
> > Sven Joachim <svenjoac <at> gmx.de> writes:
> >
> > >
> > > On 2016-06-16 21:46 +0600, Andrew P. Cherepenko wrote:
> > >
> > > > Hello list,
> > > >   'open()' for creating file sometimes returns an error:
> > > >     couldn't open "myfile.txt": resource temporarily unavailable
> > > >     either in background process or interactively (ex: in Emacs when
> > > > trying to save a file).
> > >
> > > Are you sure that it's opening the file which fails in that way, and not
> > > writing to it?
> >
> > Yes. I am pretty sure.
>
> If the problem is somehow repeatable, you might try to run your
> program under strace to actually pinpoint the system call setting
> EAGAIN (perhaps you've already done that, but I couldn't imply
> that from your post).
>

It happens very rarely and rather unpredictably:
sometimes when a lot of files are being created,
and sometimes without any such obvious sign.

> Other things to check: does that happen on any files? On a
> specific file system? If yes: how is that one mounted?
>

It happens to any file on any of the locally mounted ext4 partitions.

Is there a way to find out at least which part of the system is responsible
for 'resource temporarily unavailable'?


All the best,
  Andrey



Re: open - resource temporarily unavailable

tomas@tuxteam.de
On Fri, Jun 17, 2016 at 11:06:15AM +0000, Andrey wrote:
>  <tomas <at> tuxteam.de> writes:

[...]

> > If the problem is somehow repeatable [...]

> It happens very rare and rather unpredictable
> sometimes when a lot of files are created
> and sometimes without such obvious signs.

Darn. This makes the problem "interesting".

> > Other things to check: does that happen on any files? On a
> > specific file system? If yes: how is that one mounted?
> >
>
> It happens to any file on any ext4 partition which are locally mounted.
>
> Is there a way to find out at least which part of the system is responsible
> for 'resource temporarily unavailable'.

OK. I've got one more hint. Reading through the open(2) man page
(assuming it is really open() that's failing on you -- what evidence
do you have?), EAGAIN isn't listed among the possible errno values,
but EWOULDBLOCK is:

   EWOULDBLOCK
          The O_NONBLOCK flag was specified, and an incompatible
          lease was held on the file (see fcntl(2)).

Now, on POSIX systems EWOULDBLOCK and EAGAIN could be one and the
same. Lo and behold, a small test program on my box reveals that
both at least translate to 'Resource temporarily unavailable':

    #include <stdio.h>
    #include <errno.h>
    #include <string.h>

    /* Print the message text libc associates with each error code. */
    int main(void)
    {
        printf("EAGAIN is '%s'\n"
               "EWOULDBLOCK is '%s'\n",
               strerror(EAGAIN),
               strerror(EWOULDBLOCK));
        return 0;
    }
   
    ==>
   
    tomas@rasputin:~/prog/C$ ./errno
    EAGAIN is 'Resource temporarily unavailable'
    EWOULDBLOCK is 'Resource temporarily unavailable'


This will all depend on things like kernel version, libc and whatnot,
but the most likely candidate at the moment seems to be the app playing
games with fcntl leases.

Regards
-- tomás

Re: open - resource temporarily unavailable

dummy user
 <tomas <at> tuxteam.de> writes:

>
>
> On Fri, Jun 17, 2016 at 11:06:15AM +0000, Andrey wrote:
> >  <tomas <at> tuxteam.de> writes:
>
> [...]
>
> > > If the problem is somehow repeatable [...]
>
> > It happens very rare and rather unpredictable
> > sometimes when a lot of files are created
> > and sometimes without such obvious signs.
>
> Darn. This makes the problem "interesting".
>
> > > Other things to check: does that happen on any files? On a
> > > specific file system? If yes: how is that one mounted?
> > >
> >
> > It happens to any file on any ext4 partition which are locally mounted.
> >
> > Is there a way to find out at least which part of the system is responsible
> > for 'resource temporarily unavailable'.
>
> OK. I've got one more hint. Reading through the open(2) man page
> (assuming it is really open what's failing on you -- what evidence
> do you have?),

Well, although it may not be convincing to you:
in Tcl it is the error returned from
'open $fname w'
From man open(3tcl):
'w  Open  the  file  for  writing only.  Truncate it if it exists.  If it
     does not exist, create a new file.'

which translates to the libc call
open(fname, O_CREAT|O_WRONLY|O_TRUNC) (or creat())


> EAGAIN isn't listed among the possible errno values,
> but EWOULDBLOCK
>
>    EWOULDBLOCK
>           The O_NONBLOCK flag was specified, and an incompatible
>           lease was held on the file (see fcntl(2)).
>

But in my case it is an ordinary open, without O_NONBLOCK.

> Now, on POSIX systems EWOULDBLOCK and EAGAIN could be one and the
> same. Lo and behold, a small test program on my box reveals that
> both at least translate to 'Resource temporarily unavailable':
>
>     #include <stdio.h>
>     #include <errno.h>
>     #include <string.h>
>
>     int main(int argc, char *argv[])
>     {
>      printf("EAGAIN is '%s'\n"
>             "EWOULDBLOCK is '%s'\n",
>             strerror(EAGAIN),
>             strerror(EWOULDBLOCK));
>     }
>
>     ==>
>
>     tomas <at> rasputin:~/prog/C$ ./errno
>     EAGAIN is 'Resource temporarily unavailable'
>     EWOULDBLOCK is 'Resource temporarily unavailable'
>
> This will all depend on things like kernel version, libc and whatnot,
> but the most likely candidate at the moment seems to be the app playing
> games with fcntl leases.
>

How could that be, when I simply tried to save a file from Emacs and got
'Resource temporarily unavailable'?

well, I included all open(2) errors in your test:

#include <stdio.h>
#include <errno.h>
#include <string.h>

/* errno values listed in open(2), with their names in the same order */
int nerr[]={EACCES,EEXIST,EFAULT,EFBIG,EINTR,EINVAL,EISDIR,ELOOP,
            EMFILE,ENAMETOOLONG,ENODEV,ENOENT,ENOMEM,ENOSPC,ENOTDIR,ENXIO,
            EOPNOTSUPP,EOVERFLOW,EPERM,EROFS,ETXTBSY,EWOULDBLOCK};
char *terr[]={"EACCES","EEXIST","EFAULT","EFBIG","EINTR","EINVAL","EISDIR","ELOOP",
              "EMFILE","ENAMETOOLONG","ENODEV","ENOENT","ENOMEM","ENOSPC","ENOTDIR","ENXIO",
              "EOPNOTSUPP","EOVERFLOW","EPERM","EROFS","ETXTBSY","EWOULDBLOCK"};

int main(void)
{ size_t i;
  /* print each name next to its message text */
  for (i=0; i<sizeof(nerr)/sizeof(nerr[0]); i++) {
    printf("%s - %s\n", terr[i], strerror(nerr[i]));
  }
  return 0;
}

serv:~$ ./errno |grep -i resource
EWOULDBLOCK - Resource temporarily unavailable

But that isn't the case here: nothing is meddling with fcntl leases.
It's a plain, straightforward file creation, which could block for a while
but should never be refused.

It may not appear for a week, and
then one day I get the error three times, always unexpectedly :(

I can't trace every file access, but
I think it would help if there were a way to get the error into some log.


regards,
  Andrey



Re: open - resource temporarily unavailable

tomas@tuxteam.de
On Fri, Jun 17, 2016 at 01:12:00PM +0000, Andrey wrote:

[...]

> well, although it may be not convincing to you:
> in Tcl it's return from -
> 'open $fname w'
> from man open(3tcl):
> 'w  Open  the  file  for  writing only.  Truncate it if it exists.  If it
>      does not exist, create a new file.'

You don't need to convince me :) -- I just noted that I didn't remember seeing
any evidence (which could well have been blindness on my side).

[...]

But writing a minimal Tcl program and running it through strace might shake
out whether they do any fcntl behind the scenes...

> how it could be when I tried to save a file from Emacs and got 'Resource
> temporarily unavailable'
>
> well, I included all open(2) errors in your test:
>
> int nerr[]={EACCES,EEXIST,EFAULT,EFBIG,EINTR,EINVAL,EISDIR,ELOOP,
>    EMFILE,ENAMETOOLONG,ENODEV,ENOENT,ENOMEM,ENOSPC,ENOTDIR,ENXIO,
>    EOPNOTSUPP,EOVERFLOW,EPERM,EROFS,ETXTBSY,EWOULDBLOCK};

I see. So still EWOULDBLOCK is the likely "culprit".

I see three things you might try:

  - sift through the kernel sources watching out for a possible
    EWOULDBLOCK return on open()

  - have a look at the Emacs sources

  - use the LD_PRELOAD trick [1] to install a little spy on open()
    and let the system run for a while like this (the last one
    depends on the ratio of how critical your system is and how
    courageous you are ;-)

regards
[1] https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/

-- tomás
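
For illustration, the "little spy on open()" could look roughly like the sketch below. It is a minimal example in C under several assumptions: Linux with glibc, programs that reach libc through the plain open() symbol (not open64(), openat() or a fortified variant), and a log line on stderr. The "open-spy" prefix and the file names in the note after the code are made up.

```c
#define _GNU_SOURCE          /* for RTLD_NEXT */
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Replacement for open(): calls the real one and reports EAGAIN failures.
 * O_TMPFILE, which also takes a mode argument, is not handled here. */
int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    int mode = 0;
    int fd;

    if (real_open == NULL) {
        /* Look up the next definition of open() after this one (libc's). */
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
        if (real_open == NULL) {
            errno = ENOSYS;
            return -1;
        }
    }

    if (flags & O_CREAT) {   /* the mode argument is only passed with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, int);
        va_end(ap);
    }

    fd = real_open(path, flags, mode);
    if (fd < 0 && errno == EAGAIN) {
        int saved = errno;   /* fprintf() may change errno */
        fprintf(stderr, "open-spy: open(\"%s\", %#x): %s\n",
                path, flags, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

Built as a shared object (for example `gcc -shared -fPIC -o openspy.so openspy.c -ldl`) and loaded with `LD_PRELOAD=./openspy.so tclsh t.tcl`, it only affects the programs started that way, so it can be tried on one application at a time.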

Re: open - resource temporarily unavailable

dummy user
 <tomas <at> tuxteam.de> writes:

>
>
> On Fri, Jun 17, 2016 at 01:12:00PM +0000, Andrey wrote:
>
> [...]
>
> But writing a minimal Tcl program and running it through strace might shake
> out whether they do any fcntl behind the scenes...
>

o.k.
serv:~$ cat >t.tcl
set f [open tst.tst w]
puts $f test
close $f
serv:~$ strace tclsh t.tcl
...
open("tst.tst", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
fcntl(5, F_SETFD, FD_CLOEXEC)           = 0
ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
0x7ffc5a3bb3e0) = -1 ENOTTY (Inappropriate ioctl for device)
write(5, "test\n", 5)                   = 5
close(5)                                = 0
...

As expected, nothing miraculous.


> > how it could be when I tried to save a file from Emacs and got 'Resource
> > temporarily unavailable'
> >
> > well, I included all open(2) errors in your test:
> >
> > int nerr[]={EACCES,EEXIST,EFAULT,EFBIG,EINTR,EINVAL,EISDIR,ELOOP,
> >    EMFILE,ENAMETOOLONG,ENODEV,ENOENT,ENOMEM,ENOSPC,ENOTDIR,ENXIO,
> >    EOPNOTSUPP,EOVERFLOW,EPERM,EROFS,ETXTBSY,EWOULDBLOCK};
>
> I see. So still EWOULDBLOCK is the likely "culprit".
>

I am afraid not: it has nothing to do with non-blocking I/O;
it's an ordinary blocking open.

> I see three things you might try:
>
>   - sift through the kernel sources watching out for a possible
>     EWOULDBLOCK return on open()
>
>   - have a look at the Emacs sources
>

There is nothing particular about Emacs;
it might just as well have been 'bash' or anything else that uses libc.
 
>   - use the LD_PRELOAD trick [1] to install a little spy on open()
>     and let the system running for a while like this (the last one
>     depends on the ratio of how critical your system is and how
>     corageous you are
>
> regards
> [1] https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/
>

Thank you, Tomás! It is something I'll look at later, although the system is
quite busy and critical.

I am still wondering whether systemd somehow imposes limits on resources,
but I have no clue what I should watch for.

All the best,
  Andrey



Re: open - resource temporarily unavailable

Mart van de Wege
In reply to this post by dummy user
Andrey <[hidden email]> writes:

>  <tomas <at> tuxteam.de> writes:
>
>> Other things to check: does that happen on any files? On a
>> specific file system? If yes: how is that one mounted?
>>
>
> It happens to any file on any ext4 partition which are locally mounted.
>
> Is there a way to find out at least which part of the system is responsible
> for 'resource temporarily unavailable'.
>
>
What's the output of 'df -i'? If you create lots of files, maybe you ran
out of inodes?

Mart

--
"We will need a longer wall when the revolution comes."
    --- AJS, quoting an uncertain source.

Re: open - resource temporarily unavailable

Sven Joachim
On 2016-06-17 18:12 +0200, Mart van de Wege wrote:

> Andrey <[hidden email]> writes:
>
>>  <tomas <at> tuxteam.de> writes:
>>
>>> Other things to check: does that happen on any files? On a
>>> specific file system? If yes: how is that one mounted?
>>>
>>
>> It happens to any file on any ext4 partition which are locally mounted.
>>
>> Is there a way to find out at least which part of the system is responsible
>> for 'resource temporarily unavailable'.
>>
>>
> What's the output of 'df -i'? If you create lots of files, maybe you ran
> out of inodes?

In this case the error would be "No space left on device".

,----
|  ENOSPC pathname  was to be created but the device containing path‐
|         name has no room for the new file.
`----

Cheers,
       Sven

Re: open - resource temporarily unavailable

Sven Joachim
In reply to this post by tomas@tuxteam.de
On 2016-06-17 13:31 +0200, [hidden email] wrote:

> OK. I've got one more hint. Reading through the open(2) man page
> (assuming it is really open what's failing on you -- what evidence
> do you have?), EAGAIN isn't listed among the possible errno values,
> but EWOULDBLOCK
>
>    EWOULDBLOCK
>           The O_NONBLOCK flag was specified, and an incompatible
>           lease was held on the file (see fcntl(2)).

Thanks, I had only looked at EAGAIN.

> Now, on POSIX systems EWOULDBLOCK and EAGAIN could be one and the
> same.

On Linux, EWOULDBLOCK is #defined as EAGAIN in asm-generic/errno.h.
They are also always the same in glibc, regardless of the operating
system kernel.

> Lo and behold, a small test program on my box reveals that
> both at least translate to 'Resource temporarily unavailable':
>
>     #include <stdio.h>
>     #include <errno.h>
>     #include <string.h>
>    
>     int main(int argc, char *argv[])
>     {
>      printf("EAGAIN is '%s'\n"
>             "EWOULDBLOCK is '%s'\n",
>             strerror(EAGAIN),
>             strerror(EWOULDBLOCK));
>     }
>    
>     ==>
>    
>     tomas@rasputin:~/prog/C$ ./errno
>     EAGAIN is 'Resource temporarily unavailable'
>     EWOULDBLOCK is 'Resource temporarily unavailable'

You don't have to write your own program, there is already an 'errno'
utility in the moreutils package. :-)

Cheers,
       Sven
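
The equivalence of the two names can also be checked directly rather than by comparing message texts. Here is a minimal sketch in C, assuming Linux with glibc; both function names are invented for illustration.

```c
#include <errno.h>
#include <string.h>

/* Returns 1 when both names stand for the same number on this platform. */
int eagain_is_ewouldblock(void)
{
    return EAGAIN == EWOULDBLOCK;
}

/* Returns 1 when both values map to the same strerror() text. */
int same_message(void)
{
    char first[128];

    /* strerror() may reuse its buffer, so keep a copy of the first result. */
    strncpy(first, strerror(EAGAIN), sizeof first - 1);
    first[sizeof first - 1] = '\0';
    return strcmp(first, strerror(EWOULDBLOCK)) == 0;
}
```

When the numbers are equal, a program that sees "Resource temporarily unavailable" cannot tell from errno alone which of the two names the failing code path meant.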

Re: open - resource temporarily unavailable

tomas@tuxteam.de
In reply to this post by dummy user
On Fri, Jun 17, 2016 at 02:37:11PM +0000, Andrey wrote:

>  <tomas <at> tuxteam.de> writes:
>
> >
> >
> > On Fri, Jun 17, 2016 at 01:12:00PM +0000, Andrey wrote:
> >
> > [...]
> >
> > But writing a minimal Tcl program and running it through strace might shake
> > out whether they do any fcntl behind the scenes...
> >
>
> o.k.
> serv:~$ cat >t.tcl
> set f [open tst.tst w]
> puts $f test
> close $f
> serv:~$ strace tclsh t.tcl
> ...
> open("tst.tst", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
> fcntl(5, F_SETFD, FD_CLOEXEC)           = 0
> ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
> 0x7ffc5a3bb3e0) = -1 ENOTTY (Inappropriate ioctl for device)
> write(5, "test\n", 5)                   = 5
> close(5)                                = 0
> ...
>
> as it was expected - nothing miraculous

OK. So it seems there's another path open() --> EWOULDBLOCK in the kernel.
That would be a chance to read some kernel sources... I fear I must give
up here. $DAYJOB and that :-)

[...]

> > I see. So still EWOULDBLOCK is the likely "culprit".
> >
>
> I am afraid not, it has nothing to do with NONBLOCKED i/o,
> it's normal BLOCKED.

Sorry I wasn't clear: I was just talking about the errno -- we still
haven't an idea how it comes about. The manpage is but an approximation
to reality :-)

[...]

> >   - have a look at the Emacs sources
> >
>
> there is nothing particular about emacs
> it my have been 'bash' or anything that uses libc

I wasn't implying that: any of the apps you've caught complaining about
"Resource temporarily unavailable" might help in thinking about
what could be occurring. Can we really say open() returned an error and
errno was EWOULDBLOCK after this? (The evidence you've collected seems
to point strongly at this; is there any other possibility?)

> >   - use the LD_PRELOAD trick [1] to install a little spy on open()
> >     and let the system running for a while like this (the last one
> >     depends on the ratio of how critical your system is and how
> >     corageous you are

Note that this can be a pretty intrusive technique, depending on which
programs you "bug" this way. Tread carefully :-)

regards
-- t

Re: open - resource temporarily unavailable

tomas@tuxteam.de
In reply to this post by Sven Joachim
On Fri, Jun 17, 2016 at 09:03:30PM +0200, Sven Joachim wrote:

> On 2016-06-17 13:31 +0200, [hidden email] wrote:
>
> > OK. I've got one more hint. Reading through the open(2) man page
> > (assuming it is really open what's failing on you -- what evidence
> > do you have?), EAGAIN isn't listed among the possible errno values,
> > but EWOULDBLOCK
> >
> >    EWOULDBLOCK
> >           The O_NONBLOCK flag was specified, and an incompatible
> >           lease was held on the file (see fcntl(2)).
>
> Thanks, I had only looked at EAGAIN.

Yes, I missed that at first too.

> > Now, on POSIX systems EWOULDBLOCK and EAGAIN could be one and the
> > same.
>
> On Linux, EWOULDBLOCK is #defined as EAGAIN in asm-generic/errno.h.
> Also they are always the same in the glibc, regardless of the operating
> system kernel.

Thanks for checking!

[...]

> You don't have to write your own program, there is already an 'errno'
> utility in the moreutils package. :-)

Thanks for the hint :-)

regards
-- t

Re: open - resource temporarily unavailable

dummy user
In reply to this post by tomas@tuxteam.de
 <tomas <at> tuxteam.de> writes:

>
>
> On Fri, Jun 17, 2016 at 02:37:11PM +0000, Andrey wrote:
> >  <tomas <at> tuxteam.de> writes:
> >
> > >
> > >
> > > On Fri, Jun 17, 2016 at 01:12:00PM +0000, Andrey wrote:
> > >
> > > [...]
> > open("tst.tst", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
> > ...
> >
> > as it was expected - nothing miraculous
>
> OK. So it seems there's another path open() --> EWOULDBLOCK in the kernel.
> That would be a chance to read some kernel sources... I fear I must give
> up here. $DAYJOB and that
>

To be precise it was "resource temporarily unavailable".
I'll try to get a real errnum next time.
 
> [...]
> Sorry I wasn't clear: I was just talking about the errno -- we still
> haven't an idea how it comes about. The manpage is but an approximation
> to reality
>

Checking the correctness of libc isn't very promising :-)

> [...]
> > >   - use the LD_PRELOAD trick [1] to install a little spy on open()
> > >     and let the system running for a while like this (the last one
> > >     depends on the ratio of how critical your system is and how
> > >     corageous you are
>
> Note that this can be a pretty intrusive technique, depending on which
> programs you "bug" this way. Tread carefully
>

LD_PRELOAD for the whole system isn't really feasible, only for particular
programs.
I'll begin with my own programs.
There I can check the result of the call directly (I need neither LD_PRELOAD
nor strace for them :)
That will only cover the libc level, though, and the problem may well lie
beneath libc.


all the best,
  Andrey
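
Checking the result directly in one's own programs, and getting the error into a log, could be sketched as follows. This is a minimal example in C, assuming syslog is an acceptable destination; `logged_open` is a hypothetical helper name, not an existing libc function.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <syslog.h>

/* Hypothetical helper: open() that records every failure in syslog,
 * including the numeric errno, and leaves errno intact for the caller. */
int logged_open(const char *path, int flags, mode_t mode)
{
    int fd = open(path, flags, mode);

    if (fd < 0) {
        int saved = errno;   /* syslog() may change errno */
        syslog(LOG_WARNING, "open(\"%s\", flags=%#x) failed: errno=%d (%s)",
               path, flags, saved, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

Logging the numeric errno next to the message text removes any doubt about which error code was actually returned, and the syslog timestamp makes it possible to correlate a failure with other entries in the journal.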



Re: open - resource temporarily unavailable

dummy user
In reply to this post by Mart van de Wege
Mart van de Wege <mvdwege <at> gmail.com> writes:

>
> Andrey <thatisme <at> inp.nsk.su> writes:
>
> >  <tomas <at> tuxteam.de> writes:
> >
> >> Other things to check: does that happen on any files? On a
> >> specific file system? If yes: how is that one mounted?
> >>
> >
> > It happens to any file on any ext4 partition which are locally mounted.
> >
> > Is there a way to find out at least which part of the system is responsible
> > for 'resource temporarily unavailable'.
> >
> >
> What's the output of 'df -i'? If you create lots of files, maybe you ran
> out of inodes?
>

It would have stopped working if there weren't enough inodes

me:~$ df -i
Filesystem      Inodes  IUsed   IFree IUse% Mounted on
/dev/sda2      1001712 146662  855050   15% /
udev           1538029    531 1537498    1% /dev
tmpfs          1540151   1095 1539056    1% /run
tmpfs          1540151      2 1540149    1% /dev/shm
tmpfs          1540151      5 1540146    1% /run/lock
tmpfs          1540151     13 1540138    1% /sys/fs/cgroup
/dev/sda7      9158656 142644 9016012    2% /home/me/test/toshiba
/dev/sda8      9158656  49954 9108702    1% /home/me/test/other
/dev/sda6      9158656 192671 8965985    3% /home/me/test/thales
tmpfs          1540151      6 1540145    1% /run/user/116
tmpfs          1540151     10 1540141    1% /run/user/1001
tmpfs          1540151      4 1540147    1% /run/user/1000


regards,
  Andrey


Re: open - resource temporarily unavailable

dummy user
In reply to this post by dummy user
Andrey <thatisme <at> inp.nsk.su> writes:

> > On Fri, Jun 17, 2016 at 02:37:11PM +0000, Andrey wrote:
> > >  <tomas <at> tuxteam.de> writes:
> > [...]
> > Sorry I wasn't clear: I was just talking about the errno -- we still
> > haven't an idea how it comes about. The manpage is but an approximation
> > to reality

you are right, it's  EAGAIN
"open $currname w" -> couldn't open ../../current/south/S_DAQconnMod.txt:
errorCode=POSIX EAGAIN {resource temporarily unavailable}

although that only confirms what Sven Joachim <svenjoac <at> gmx.de> already
posted:
> On Linux, EWOULDBLOCK is #defined as EAGAIN in asm-generic/errno.h.
> Also they are always the same in the glibc, regardless of the operating
> system kernel.

But how can I find out why it happens?
I still have no hint how to proceed :(

regards,
  Andrey
 




Re: open - resource temporarily unavailable

tomas@tuxteam.de
On Sun, Jun 19, 2016 at 05:15:47PM +0000, Andrey wrote:

> Andrey <thatisme <at> inp.nsk.su> writes:
>
> > > On Fri, Jun 17, 2016 at 02:37:11PM +0000, Andrey wrote:
> > > >  <tomas <at> tuxteam.de> writes:
> > > [...]
> > > Sorry I wasn't clear: I was just talking about the errno -- we still
> > > haven't an idea how it comes about. The manpage is but an approximation
> > > to reality
>
> you are right, it's  EAGAIN
> "open $currname w" -> couldn't open ../../current/south/S_DAQconnMod.txt:
> errorCode=POSIX EAGAIN {resource temporarily unavailable}

OK. So now we can be (more or less) sure it happens on open(2). (Of course
I'd sleep better if we had a way to positively know there's nothing
happening behind our backs: perhaps Tcl is being extra clever[1] and
Tcl's open does a bit more than just libc's/syscall's open(), and we
are seeing some spurious error from another system call. Remember, error
paths in the code tend to be less tested...)

> although it's only confirmation to what has been already posted by
> Sven Joachim <svenjoac <at> gmx.de> writes:
> > On Linux, EWOULDBLOCK is #defined as EAGAIN in asm-generic/errno.h.
> > Also they are always the same in the glibc, regardless of the operating
> > system kernel.
>
> But how can I find out why it happens ?
> Still have no hints how to proceed :(

I fear I can't suggest much more than I have already. Keeping to user
space (e.g. with the help of LD_PRELOAD) might help you find patterns
(is this always in some region of the file system, is there any
correlation to other logs in the system, etc.)

But since user space only sees the kernel returning from a syscall
with EAGAIN set, not "why", you'll end up trying to find out what
the kernel is "thinking" to pin-point the cause.

The more hints you collect (e.g. "Is it really open(2)?" "Is it always
open(2)?" "Which files?") the better your chances to navigate through
kernel code.

I'd start by setting up a "trap", either with a suitable strace/ltrace
incantation (easier) or with LD_PRELOAD (more difficult, and libc
specific), to just log those events in a couple of chosen apps where
you know it happens from time to time. Try to extract as much info as
possible from those events (strace would tell you which system calls,
which parameters, and so on).

Sorry to be so unspecific

regards
-- t