open on nfs server -> resource temporarily unavailable

dummy user

Hello *,
   creating a file in a directory exported by the NFS server
     sometimes fails with "Resource temporarily unavailable"
     while a client on the importing computer keeps reading that file.

serv:~# cat /etc/exports
/home/me/data-t cli(ro,sync,no_subtree_check)

serv:~# systemctl restart nfs-kernel-server

cli:~# cat /etc/fstab
...
serv:/home/me/data-t /home/me/dt nfs noauto,ro,noac,user 0 0

me@cli:~$ mount dt
me@cli:~$ mount
...
serv:/home/me/data-t on /home/me/dt type nfs4 (ro,nosuid,nodev,noexec,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.209.174,local_lock=none,addr=192.168.238.113,user=me)

It can be provoked within a few minutes by running this on the server:
serv:~$ while : ; do date > data-t/now.txt; sleep 1; done

while running this on the client computer:
me@cli:~$ while : ; do cat dt/now.txt; sleep 1; done

After a while (it's unpredictable: it might take up to 10 minutes or only a few seconds,
but on average about 2 minutes)
on the server I get:
data-t/now.txt: Resource temporarily unavailable

A shorter 'sleep', e.g. 0.2, doesn't necessarily make the error appear sooner.
It also doesn't matter whether the directory is exported read-only or read-write.
But without a reading client the error never appears.

To narrow the case down a bit I wrote a test in C for the server
which does the same thing as the script above (it just writes the date to the same file)
but checks for errors when creating the file and when writing to it.

Sometimes the error appears within just a few seconds:
serv:~$ date;./tnfs-open data-t/now.txt ;date
Mon Aug  1 18:58:23 CEST 2016
open fd=-1 errno=11 -> Resource temporarily unavailable
Mon Aug  1 18:58:25 CEST 2016

serv:~$ errno 11
EAGAIN 11 Resource temporarily unavailable
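
Since EAGAIN marks a condition that may clear by itself, one possible stopgap
is to retry the write a few times before giving up. The sketch below is only an
illustration: 'retry' and 'write_date' are hypothetical helper names, not part
of the original test, and retrying hides the symptom without explaining it.

```shell
# Run a command up to <attempts> times, sleeping <delay> seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Same write as in the loop above; fails if the file cannot be opened.
write_date() {
    date > "$1"
}
```

For example, 'retry 5 0.2 write_date data-t/now.txt' would try up to five times,
0.2 seconds apart (the fractional delay assumes a 'sleep' that accepts it, as
the one used above does).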

I am at a loss :(

My system on both the server and the client computers is the same, Debian 8.5:
:~$ dpkg -l 'nfs*'|grep '^ii'
ii  nfs-common        1:1.2.8-9    amd64        NFS support files common to client and server
ii  nfs-kernel-server 1:1.2.8-9    amd64        support for NFS kernel server

:~$ dpkg -l 'systemd*'|grep '^ii'
ii  systemd        215-17+deb8u4 amd64        system and service manager
ii  systemd-sysv   215-17+deb8u4 amd64        system and service manager - SysV links

:~$ dpkg -l 'linux-*'|grep '^ii'
ii  linux-base                          3.5                   all          Linux image base package
ii  linux-compiler-gcc-4.8-x86          3.16.7-ckt25-2+deb8u3 amd64        Compiler for Linux on x86 (meta-package)
ii  linux-headers-3.16.0-4-amd64        3.16.7-ckt25-2+deb8u3 amd64        Header files for Linux 3.16.0-4-amd64
ii  linux-headers-3.16.0-4-common       3.16.7-ckt25-2+deb8u3 amd64        Common header files for Linux 3.16.0-4
ii  linux-headers-amd64                 3.16+63               amd64        Header files for Linux amd64 configuration (meta-package)
ii  linux-image-3.16.0-4-amd64          3.16.7-ckt25-2+deb8u3 amd64        Linux 3.16 for 64-bit PCs
ii  linux-image-amd64                   3.16+63               amd64        Linux for 64-bit PCs (meta-package)
ii  linux-kbuild-3.16                   3.16.7-ckt20-1        amd64        Kbuild infrastructure for Linux 3.16
ii  linux-libc-dev:amd64                3.16.7-ckt25-2+deb8u3 amd64        Linux support headers for userspace development

Am I doing something wrong with the NFS options?
It was working on Debian 7 (and perhaps 6/5); the problem began in May, when I upgraded to Debian 8.


Thanks,
    Andrey

PS:
I already posted the question to the list in June
https://lists.debian.org/debian-user/2016/06/msg00738.html

But at that time I was looking for some kind of imposed limits.


PPS:
For those who would rather try the C test above:
cat tnfs-open.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <time.h>

int main (int argc, char *argv[])
{
   int fd,err,e,sec,ms;
   time_t tm;
   struct timespec tdly={.tv_sec=1, .tv_nsec=0};
   char *ept=NULL,*bf=NULL;

   if (argc < 2) {
     fprintf(stderr, "Usage: %s file-to-overwrite [milliseconds]\n", argv[0]);
     exit(1);
   }

   if (argc > 2) {
     ms=strtol(argv[2],&ept,10);
     if (ept==argv[2]) {
       fprintf(stderr, "interval should be an integer number of milliseconds\n");
       exit(1);
     } else {
       fprintf(stderr, "interval=%d ms\n", ms);
       sec=ms/1000;
       tdly.tv_sec=sec;
       tdly.tv_nsec=(ms-sec*1000)*1000000;
     }
   }
   for (;;) {
     tm=time(NULL);
     bf=ctime(&tm);
     fd=open(argv[1], O_WRONLY|O_CREAT|O_TRUNC, 0666);
     if (fd<0) {
       err=errno;
       fprintf(stderr, "open fd=%d errno=%d -> %s\n", fd, err,strerror(err));
       exit(2);
     }
     e=write(fd, bf, strlen(bf));
     if (e<0) {
       err=errno;
       fprintf(stderr, "write error=%d errno=%d -> %s\n", e, err, strerror(err));
       exit(3);
     }
     close(fd);
     nanosleep(&tdly,NULL);
   }
}

Re: open on nfs server -> resource temporarily unavailable

Mike Kupfer
Andrey wrote:

> me@cli:~$ mount
> ...
> serv:/home/me/data-t on /home/me/dt type nfs4 (ro,nosuid,nodev,noexec,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.209.174,local_lock=none,addr=192.168.238.113,user=me)
>
> It could be provoked in a few minutes by running on the server
> serv:~$ while : ; do date > data-t/now.txt; sleep 1; done
>
> while on the client computer
> me@cli:~$ while : ; do cat dt/now.txt; sleep 1; done
>
> After a while (it's unpredictable might take up 10 minutes or a few seconds,
> but in average about 2 minutes)
> on the server I get:
> data-t/now.txt: Resource temporarily unavailable

It's likely that the server is granting the client a read delegation
on the file, which would have to be recalled before the server can write
to the file.
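
One way to check this on the server might be to look for lease entries in
/proc/locks while the client is reading. This is a sketch under an assumption:
that the running kernel lists NFSv4 delegations there with type DELEG (or
LEASE). 'list_delegations' is a hypothetical helper name.

```shell
# Print lease/delegation entries from a /proc/locks-style file.
# Assumption: the lock type is the second whitespace-separated field and
# delegations show up there as DELEG (or LEASE).
list_delegations() {
    locks_file=${1:-/proc/locks}
    awk '$2 == "DELEG" || $2 == "LEASE"' "$locks_file"
}
```

Running 'list_delegations' with no argument on the server reads /proc/locks;
no output would mean no lease of either type is currently listed.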

Try mounting with version 3 instead of version 4.
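
A sketch of such a remount on the client, to be run as root. The export and
mount point used in the example call are the ones from the original post; the
function name 'remount_nfs3' is made up for illustration, and the option list
simply adds vers=3 to ro,noac.

```shell
# Remount an NFS export with protocol version 3 instead of 4.
# Arguments: <server:/export> <mount point>.
remount_nfs3() {
    export_spec=$1
    mount_point=$2
    # Drop the existing mount first; the error is ignored if nothing is mounted.
    umount "$mount_point" 2>/dev/null
    mount -t nfs -o ro,noac,vers=3 "$export_spec" "$mount_point"
}
```

Called as 'remount_nfs3 serv:/home/me/data-t /home/me/dt'; afterwards the
'mount' output for that mount point should list vers=3 among the options.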

Is there much of a network delay between the client and server?  If not,
I'd open a bug.

regards,
mike

Re: open on nfs server -> resource temporarily unavailable

Salvatore Bonaccorso-4
Hi Andrey

> Hello *,
>   creating a file in the directory exported by the NFS server
>     sometimes returns an error - "resource temporarily unavailable",
>     when a client on importing computer is keeping reading that file.
>
> serv:~# cat /etc/exports
> /home/me/data-t cli(ro,sync,no_subtree_check)
>
> serv:~# systemctl restart nfs-kernel-server
>
> cli:~# cat /etc/fstab
> ...
> serv:/home/me/data-t /home/me/dt nfs noauto,ro,noac,user 0 0
>
> me@cli:~$ mount dt
> me@cli:~$ mount
> ...
> serv:/home/me/data-t on /home/me/dt type nfs4 (ro,nosuid,nodev,noexec,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.209.174,local_lock=none,addr=192.168.238.113,user=me)
>
> It could be provoked in a few minutes by running on the server
> serv:~$ while : ; do date > data-t/now.txt; sleep 1; done
>
>
> while on the client computer
> me@cli:~$ while : ; do cat dt/now.txt; sleep 1; done
>
> After a while (it's unpredictable might take up 10 minutes or a few seconds,
> but in average about 2 minutes)
> on the server I get:
> data-t/now.txt: Resource temporarily unavailable
>
> Lesser 'sleep', of 0.2 for ex., doesn't necessarily make the error appear quicker.
> Also it doesn't depend whether a directory is exported read-only or read-write.
> But without reading client the error never appears.
>
> To narrow down the case a bit I made a test on 'C' for the server
> which does the same thing as the script above - just writing down a date in the same file
> - but checking an error on creating a file and writing to it.
>
> Sometimes error appears just in a few seconds:
> serv:~$ date;./tnfs-open data-t/now.txt ;date
> Mon Aug  1 18:58:23 CEST 2016
> open fd=-1 errno=11 -> Resource temporarily unavailable
> Mon Aug  1 18:58:25 CEST 2016
>
> serv:~$ errno 11
> EAGAIN 11 Resource temporarily unavailable
>
> I am at a loss :(
>
> my system on the server and the client computes is the same Debian 8.5

Were you ever able to narrow down the issue? I can reproduce it easily as
well, just on localhost, by doing the following on Debian 8 running
3.16.43-2+deb8u5 or 3.16.48-1. The issue seems to have disappeared (or is at
least harder to reproduce) in 4.9, installed from jessie-backports.

Sort of "minimal" reproducing steps:

# apt-get install nfs-kernel-server
# mkdir -p /srv/test
# echo '/srv/test *' >> /etc/exports
# systemctl restart nfs-kernel-server.service
# mount localhost:/srv/test /mnt

1. terminal
# while : ; do date >/srv/test/foo ; sleep 1 ; done

2. terminal
# while : ; do cat /mnt/foo ; sleep 1 ; done

I'm currently trying to bisect the issue. But since in the good cases it's not
clear whether the issue is really fixed or just harder to trigger, for the
moment I can only guess that the 4.9 claim is true.

Were you successful in isolating the issue?

Regards,
Salvatore

Re: open on nfs server -> resource temporarily unavailable

Salvatore Bonaccorso-4
Hi,

On Wed, Nov 29, 2017 at 07:19:42AM +0100, Salvatore Bonaccorso wrote:
[...]

> > my system on the server and the client computes is the same Debian 8.5
>
> Were you ever able to narrow down the issue? I'm able to reproduce the issue
> easily as well just on localhost doing the following on a Debian 8, running
> 3.16.43-2+deb8u5 or 3.16.48-1, but the issue seems disappeared (or at least
> harder to reproduce in 4.9, when installed from jessie-backports):
>
> Sort of "minimal" reproducing steps:
>
> # apt-get install nfs-kernel-server
> # mkdir -p /srv/test
> # echo '/srv/test *' >> /etc/exports
> # systemctl restart nfs-kernel-server.service
> # mount localhost:/srv/test /mnt
>
> 1. terminal
> # while : ; do date >/srv/test/foo ; sleep 1 ; done
>
> 2. terminal
> # while : ; do cat /mnt/foo ; sleep 1 ; done
>
> I'm currently trying to bisect the issue. But since in the good cases it's not
> clear if it's always fixed I can only guess at the moment that the 4.9 claim is
> true.
>
> Were you successful in isolating the issue?

Unless I made a mistake, bisecting shows that v3.17 is broken and v3.18
has the fix. Further bisecting between v3.17 and v3.18 seems to
indicate that the "fixing" commit is

https://git.kernel.org/linus/03d12ddf845a4eb874ffa558d65a548aee9b715b
(and possibly the other prerequisites up to that).
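
For anyone repeating this: hunting for a fixing commit rather than a breaking
one may be easier with custom bisect terms. The sketch below assumes git 2.7 or
later and a kernel git tree as the current directory. 'start_fix_bisect' is a
hypothetical helper name, and each candidate kernel still has to be built,
booted and tested by hand with the reproducer.

```shell
# Start a bisection that looks for the commit which FIXED a problem.
# Arguments: <revision known to be broken> <revision known to be fixed>.
# The custom terms "broken" (old state) and "fixed" (new state) replace
# git's default good/bad, which would read backwards in this situation.
start_fix_bisect() {
    broken_rev=$1
    fixed_rev=$2
    git bisect start --term-old=broken --term-new=fixed &&
    git bisect broken "$broken_rev" &&
    git bisect fixed "$fixed_rev"
}
```

Called as 'start_fix_bisect v3.17 v3.18'; after testing each checked-out
revision, mark it with 'git bisect broken' or 'git bisect fixed' until git
names the first fixed commit.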

Regards,
Salvatore