Flushing all Buffers Before Exiting

Flushing all Buffers Before Exiting

martin McCormick-3
        I have been using unix of various flavors for 30 years so
this is a bit of a bone-head question except that different
styles of unix handle this situation somewhat differently.

        Imagine that you run a process whose output you want to
catch so you run it as someproc >catchfile.  The process has an
end point so anything it produced gets saved in catchfile and all
is well.

        Now imagine you run someproc and it either has no end
condition or you haven't reached it yet so you kill it with
Control-C.  Some unixen like FreeBSD seem to flush all the
buffers  and you still get your output but Debian appears to not
flush the buffers and you get nothing or maybe a partial capture
with the most recent data lost.

        Is there a way to make sure we got everything that was
produced?

        I have noticed that the tee program in Debian also
appears to buffer data, which gets lost if you end early.

        Many thanks.

        Martin McCormick


Re: Flushing all Buffers Before Exiting

Kenneth Parker-2
Have you tried the Command Line:   "sync"?

Kenneth Parker 


Re: Flushing all Buffers Before Exiting

tomas@tuxteam.de
On Thu, Mar 21, 2019 at 10:32:06AM -0400, Kenneth Parker wrote:
> Have you tried the Command Line:   "sync"?

That won't help in the OP's case, I think: sync is about writing out
the operating system's buffers to the file system. In the OP's case
it's about the process's I/O buffers which haven't yet gone to the
operating system.

Cheers
-- t
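
The distinction is easy to see first-hand.  A hypothetical demo
(catchfile is just the OP's example name): when stdout is redirected
to a file, perl's stdio fully buffers it, so killing the process
loses the pending line, and sync cannot help because the data never
reached the kernel:

    perl -e 'print "buffered line\n"; sleep 60' > catchfile &
    kill -9 %1       # no chance to flush; running sync here
                     # changes nothing
    cat catchfile    # empty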


Re: Flushing all Buffers Before Exiting

martin McCormick-3
In reply to this post by Kenneth Parker-2
Kenneth Parker <[hidden email]> writes:
> Have you tried the Command Line:   "sync"?

        Excellent question and I did, in fact, try that command
just before killing the running process.

        It had no effect.

<[hidden email]> also writes:
> That won't help in the OP's case, I think: sync is about writing out
> the operating system's buffers to the file system. In the OP's case
> it's about the process's I/O buffers which haven't yet gone to the
> operating system.

        Thanks to both of you.  I hadn't thought of that, but it
probably explains why nothing happened other than that the
command "sync" ran successfully.

        I wrote the application that is creating this output in
perl and there may be a unique solution there that solves this specific
problem.  That is not as good as a general course of action which
works in all cases of output redirection, but it beats nothing.

        A suggestion on a posting in stackoverflow was that one
could open the file for appending, append your new output and
then close it.

        I'll give that a try which should solve this one case.
Apparently others have dealt with how to shake the most recent
data out of buffers and commit it to disk, and it is highly
dependent on the operating system when the write actually
goes to disk.

        Again many thanks.

Martin McCormick


Re: Flushing all Buffers Before Exiting

Greg Wooledge
On Thu, Mar 21, 2019 at 11:35:51AM -0500, Martin McCormick wrote:
> I wrote the application that is creating this output in
> perl

https://perl.plover.com/FAQs/Buffering.html
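
The short version of that FAQ, as a sketch (FH is an illustrative
handle name, not from the thread):

    $| = 1;                  # autoflush the currently selected
                             # handle, normally STDOUT

    select(FH); $| = 1; select(STDOUT);   # old idiom: autoflush FH

    use IO::Handle;          # the cleaner modern spelling
    FH->autoflush(1);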


Re: Flushing all Buffers Before Exiting

tomas@tuxteam.de
On Thu, Mar 21, 2019 at 12:46:17PM -0400, Greg Wooledge wrote:
> On Thu, Mar 21, 2019 at 11:35:51AM -0500, Martin McCormick wrote:
> > I wrote the application that is creating this output in
> > perl
>
> https://perl.plover.com/FAQs/Buffering.html

This is it, thanks, Greg.

Most runtimes (C's FILE interface, i.e. fopen() and friends,
included) have this interface. An abnormal end (i.e. a signal)
doesn't give the application time to flush the buffers. You might
want to catch the signals and flush, but depending on the signal
that might be iffy (are you sure you want to crawl on after having
received a SIGSEGV?) and sometimes impossible (SIGKILL, e.g.).

Cheers
-- t
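
For illustration, a minimal sketch of the catch-and-flush idea in
perl (the file and handle names are assumptions, not from the
thread):

    use IO::Handle;
    open(my $log, '>', 'catchfile') or die "open catchfile: $!";
    # Control-C now flushes pending output before exiting:
    $SIG{INT} = sub { $log->flush; close $log; exit 0 };
    while (1) {
        print $log scalar(localtime), "\n";   # stand-in for real output
        sleep 1;
    }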


Re: Flushing all Buffers Before Exiting

Curt
In reply to this post by tomas@tuxteam.de
On 2019-03-21, <[hidden email]> <[hidden email]> wrote:
>
>
> On Thu, Mar 21, 2019 at 10:32:06AM -0400, Kenneth Parker wrote:
>> Have you tried the Command Line:   "sync"?
>
> That won't help in the OP's case, I think: sync is about writing out
> the operating system's buffers to the file system. In the OP's case
> it's about the process's I/O buffers which haven't yet gone to the
> operating system.

I'm reading that a pty app won't buffer (script, screen, etc.).

And then there's a program called 'unbuffer'?

> Cheers


--
“Let us again pretend that life is a solid substance, shaped like a globe,
which we turn about in our fingers. Let us pretend that we can make out a plain
and logical story, so that when one matter is despatched--love for instance--
we go on, in an orderly manner, to the next.” - Virginia Woolf, The Waves


Re: Flushing all Buffers Before Exiting

Greg Wooledge
On Thu, Mar 21, 2019 at 06:01:26PM -0000, Curt wrote:
> I'm reading a pty app won't buffer (script, screen, etc.).

Well, it's a convention, adopted by the C library functions in stdio.

stdio(3) says:

       At program startup, three text streams are predefined and need not
       be opened explicitly: standard input (for reading conventional
       input), standard output (for writing conventional output), and
       standard error (for writing diagnostic output). These streams are
       abbreviated stdin, stdout and stderr. When opened, the standard
       error stream is not fully buffered; the standard input and output
       streams are fully buffered if and only if the streams do not refer
       to an interactive device.

       Output streams that refer to terminal devices are always line
       buffered by default; pending output to such streams is written
       automatically whenever an input stream that refers to a terminal
       device is read. In cases where a large amount of computation is
       done after printing part of a line on an output terminal, it is
       necessary to fflush(3) the standard output before going off and
       computing so that the output will appear.


> And then there's a program called 'unbuffer'?

It's a hack, built around Expect.  It basically tries to fool the
application into thinking that standard output is a terminal, so the
application will operate in line-buffering mode.

I've been told that it works quite often, but you can't ever guarantee
success with it.
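
Typical invocations, for reference (unbuffer ships with the expect
package, script with bsdutils; the package names are from memory,
so check):

    unbuffer someproc > catchfile    # fake a tty so someproc line-buffers
    script -c someproc catchfile     # run someproc under a pty,
                                     # logging to catchfile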


Re: Flushing all Buffers Before Exiting

Luís Gomes
In reply to this post by Curt
Try stdbuf.
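
For example (stdbuf is in GNU coreutils; it only affects programs
that use C stdio and don't set their own buffering):

    stdbuf -oL someproc > catchfile    # line-buffered stdout
    stdbuf -o0 someproc > catchfile    # or fully unbuffered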

Re: Flushing all Buffers Before Exiting

David Wright-3
In reply to this post by martin McCormick-3
On Thu 21 Mar 2019 at 11:35:51 (-0500), Martin McCormick wrote:

> I wrote the application that is creating this output in
> perl and there may be a unique solution there that solves this specific
> problem.  That is not as good as a general course of action which
> works in all cases of output redirection but it beats nothing.
> A suggestion on a posting in stackoverflow was that one
> could open the file for appending, append your new output and
> then close it.

An efficient way of doing this is to trap a signal, like USR1,
in your program, and react by either your close/open-append or
just flushing the buffers. That way, the program will run
normally most of the time, without wasting all that time
opening/closing files.

If there's not too much output compared with the computation necessary
to generate it, just setting line-buffering on the output stream
can be sufficient.

I've read that when the program is already running, some languages
(like Python, so probably Perl too) offer a debugger that can
allow you to flush the buffers from "within", but I've not tried
it.

Cheers,
David.
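
A sketch of the USR1 idea ($log is an assumed handle, opened as in
the earlier sketch):

    use IO::Handle;
    $SIG{USR1} = sub { $log->flush };   # flush on demand, keep running
    # then, from another terminal:  kill -USR1 <pid of the script>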


Re: Flushing all Buffers Before Exiting

Lee-7
In reply to this post by martin McCormick-3
On 3/21/19, Martin McCormick <[hidden email]> wrote:

> I have been using unix of various flavors for 30 years so
> this is a bit of a bone-head question except that different
> styles of unix handle this situation somewhat differently.
>
> Imagine that you run a process whose output you want to
> catch so you run it as someproc >catchfile.  The process has an
> end point so anything it produced gets saved in catchfile and all
> is well.
>
> Now imagine you run someproc and it either has no end
> condition or you haven't reached it yet so you kill it with
> Control-C.  Some unixen like FreeBSD seem to flush all the
> buffers  and you still get your output but Debian appears to not
> flush the buffers and you get nothing or maybe a partial capture
> with the most recent data lost.
>
> Is there a way to make sure we got everything that was
> produced?

https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe/

Regards,
Lee


Re: Flushing all Buffers Before Exiting

martin McCormick-3
In reply to this post by David Wright-3
David Wright <[hidden email]> writes:

> An efficient way of doing this is to trap a signal, like USR1,
> in your program, and react by either your close/open-append or
> just flushing the buffers. That way, the program will run
> normally most of the time, without wasting all that time
> opening/closing files.
>
> If there's not too much output compared with the computation necessary
> to generate it, just setting line-buffering on the output stream
> can be sufficient.
>
> I've read that when the program is already running, some languages
> (like Python, so probably Perl too) offer a debugger that can
> allow you to flush the buffers from "within", but I've not tried
> it.
>
> Cheers,
> David.

        Before reading this posting, I added code in my perl
script to open, append and close the file but the suggestion to
add a signal handler is a much better idea so thanks for the
suggestion.

        Opening, appending and closing for each new line of
output made me a bit squeamish.  The program is monitoring a
stream of data from a radio scanner.  The data spew in at about
20 or 30 lines per second.  When nothing is happening, there are
3 possible strings that indicate nothing is happening right now.
When something changes, the incoming lines stop matching the 3
comparison strings I put in (one for each of the 3 "nothing is
happening right now" strings), and the differing lines get printed
to the screen and to the disk.  In reality, these strings don't
exactly mean that nothing is happening, only that the same
non-events are happening.

        When things change and there is output of interest, that
output also spews in at 20 or 30 lines per second so I need to do
as little as possible to handle that so the system doesn't get
swamped.  The signal handler is most likely a far more efficient
method of capturing the data of interest, as it will essentially
not have to make any decisions until it is time to shut down the
program and look at the data.

        Even with the open, append and close routine, the strings
it is capturing appear to be good but it could be capturing for
minutes on end at times and it needs to just be able to run like
the wind and store lines as quickly as it can.

        Thank you.

Martin McCormick
amateur radio WB5AGZ


Re: Flushing all Buffers Before Exiting

deloptes-2
Martin McCormick wrote:

> Before reading this posting, I added code in my perl
> script to open, append and close the file but the suggestion to
> add a signal handler is a much better idea so thanks for the
> suggestion.

I always use

# Execute anytime before the <STDIN>.
# Causes the currently selected handle to be flushed after every print.
$| = 1;

when needed - but not sure if applies to your case


Re: Flushing all Buffers Before Exiting

Jeremy Nicoll
In reply to this post by martin McCormick-3
On Fri, 22 Mar 2019, at 00:53, Martin McCormick wrote:

> Opening, appending and closing for each new line of
> output made me a bit squeamish. ...

You could always count lines written and do a close & reopen
every (say) 1000 lines.  That way there's less overhead than
reopening per line, while the amount of data you might lose
stays bounded.

--
Jeremy Nicoll - my opinions are my own.
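
A sketch of the counting idea; a periodic flush gets the same
bounded-loss effect more cheaply than a full close and reopen
(names are illustrative, and IO::Handle is assumed loaded):

    my $count = 0;
    while (my $line = <$scanner>) {
        print $log $line;
        $log->flush if ++$count % 1000 == 0;   # commit every 1000 lines
    }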


Re: Flushing all Buffers Before Exiting

tomas@tuxteam.de
In reply to this post by martin McCormick-3
On Thu, Mar 21, 2019 at 07:52:33PM -0500, Martin McCormick wrote:

[...]

> Opening, appending and closing for each new line of
> output made me a bit squeamish.  The program is monitoring a
> stream of data from a radio scanner.  The data spew in at about
> 20 or 30 lines per second.

Don't fear. Measure :-)

Doesn't sound outrageous to line-buffer your output to file.

The output to screen is already line-buffered (by default,
at least) and isn't killing you, so if I were you, I'd set
up a benchmark run and torture things a bit. Then, *if* you
notice any whiff of a problem, you could try a more clever
scheme like timeout based flush to better get hold of bursts
(if I understood your description, things go out in bursts).

Cheers
-- tomás
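
One possible shape for the timeout-based flush, using alarm
(assumed names again; at most one flush per second):

    $SIG{ALRM} = sub { $log->flush; alarm 1 };   # re-arm each time
    alarm 1;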


Re: Flushing all Buffers Before Exiting

David Wright-3
On Fri 22 Mar 2019 at 17:45:50 (+0100), [hidden email] wrote:

> On Thu, Mar 21, 2019 at 07:52:33PM -0500, Martin McCormick wrote:
>
> [...]
>
> > Opening, appending and closing for each new line of
> > output made me a bit squeamish.  The program is monitoring a
> > stream of data from a radio scanner.  The data spew in at about
> > 20 or 30 lines per second.
>
> Don't fear. Measure :-)
>
> Doesn't sound outrageous to line-buffer your output to file.
>
> The output to screen is already line-buffered (by default,
> at least) and isn't killing you, so if I were you, I'd set
> up a benchmark run and torture things a bit. Then, *if* you
> notice any whiff of a problem, you could try a more clever
> scheme like timeout based flush to better get hold of bursts
> (if I understood your description, things go out in bursts).

Reading the OP's problem, I wonder how you're meant to detect
"any whiff of a problem". All we know is that the maximum rate
*might* be 30 lines per second, but is that a guess? Are we
already losing the odd line? How would we replicate test runs?

Probably not, at 30 lps, but in principle I would say that
this is a sticking plaster while you write your better method.

The main concern raised in the OP was flushing before termination,
for which a signal is ideal. And for best performance, I'd forget
tee and just look at the output file occasionally, with tail.

Cheers,
David.


Re: Flushing all Buffers Before Exiting

tomas@tuxteam.de
On Sat, Mar 23, 2019 at 10:27:01AM -0500, David Wright wrote:
> On Fri 22 Mar 2019 at 17:45:50 (+0100), [hidden email] wrote:

> Reading the OP's problem, I wonder how you're meant to detect
> "any whiff of a problem" [...]

Torture tests.

> The main concern raised in the OP was flushing before termination,
> for which a signal is ideal. And for best performance, I'd forget
> tee and just look at the output file occasionally, with tail.

A signal handler is definitely an option, but it can be pretty tricky.
Especially if you are catching things like SEGV (do you dare a
write() after that?).

Cheers
-- t


Re: Flushing all Buffers Before Exiting

martin McCormick-3
In reply to this post by Lee-7
Lee <[hidden email]> writes:
> https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe/
>
> Regards,
> Lee

        Thank you and all others.  It turns out that getting the
autoflush to work in perl is on a par with falling off of a log
for ease of execution.

        There is a perl variable called $| which, when set to a
non-0 value, causes the currently selected handle to be flushed
after every print, but perl experts recommend you not do it that
way.  The recommended way is even clearer.

        At the beginning of your perl program you add a line

use IO::Handle;

That is like an include in C, so the perl interpreter will
know what you mean when you add the next line right after you
open your file:

FH->autoflush(1);

FH is just my example name for the open file handle.  Perl custom
is to use all caps for the name of the file handle, so FH or
LOGDATA refers to that file, and the autoflush call flushes that
handle's buffer.

        In this case, it is an endless loop so the buffer gets
flushed after every new write, but that's how you make it happen.

Martin
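
Putting that recipe together in one place (a sketch; a lexical
handle is shown, but the all-caps FH style works the same way):

    use IO::Handle;
    open(my $fh, '>', 'catchfile') or die "open: $!";
    $fh->autoflush(1);               # every print is now flushed
    print $fh "line of scanner data\n";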


Re: Flushing all Buffers Before Exiting

David Wright-3
In reply to this post by tomas@tuxteam.de
On Sat 23 Mar 2019 at 18:23:47 (+0100), [hidden email] wrote:
> On Sat, Mar 23, 2019 at 10:27:01AM -0500, David Wright wrote:
> > On Fri 22 Mar 2019 at 17:45:50 (+0100), [hidden email] wrote:
>
> > Reading the OP's problem, I wonder how you're meant to detect
> > "any whiff of a problem" [...]
>
> Torture tests.

Like, multiply the number of sources by stealing a few more radio
scanners to connect up, which then all burst into life as the
police scour the neighbourhood for thieves?

When dealing with realtime real information coming in, over which you
have no control, it can be non-trivial to set up such scenarios.
That's why I thought it best to devise a method that's more
efficient than line buffering. After all, that's why buffering
was invented, wasn't it?

> > The main concern raised in the OP was flushing before termination,
> > for which a signal is ideal. And for best performance, I'd forget
> > tee and just look at the output file occasionally, with tail.
>
> A signal handler is definitely an option, but it can be pretty tricky.
> Especially if you are catching things like SEGV (you dare a write()
> after that?)

SIGUSR1 to flush the buffers; that's all (see Subject line). If you
find progamming it tricky, that's a good reason for line buffering as
a stopgap. What would I do with SEGV, having trapped it? (Or most of
the other signals…)

Cheers,
David.


Re: Flushing all Buffers Before Exiting

martin McCormick-3
David Wright <[hidden email]> writes:

> On Sat 23 Mar 2019 at 18:23:47 (+0100), [hidden email] wrote:
> > On Sat, Mar 23, 2019 at 10:27:01AM -0500, David Wright wrote:
> > > On Fri 22 Mar 2019 at 17:45:50 (+0100), [hidden email] wrote:
> >
> > > Reading the OP's problem, I wonder how you're meant to detect
> > > "any whiff of a problem" [...]
> >
> > Torture tests.
>
> Like, multiply the number of sources by stealing a few more radio
> scanners to connect up, which then all burst into life as the
> police scour the neighbourhood for thieves?
>
> When dealing with realtime real information coming in, over which you
> have no control, it can be non-trivial to set up such scenarios.
> That's why I thought it best to devise a method that's more
> efficient than line buffering. After all, that's why buffering
> was invented, wasn't it.

        Apparently, the flush after each new cycle of data isn't
taxing the system too much, as the output looks correct.  This is
a 600 MHz Pentium which would have gone into the recycle bin
years ago if not for Linux.  Older systems like this tend to
show the effects of not being able to keep up much more
obviously than a modern quad-core 64-bit design would.

        The best test I can do is to look at the output which is
quite repetitive as it is designed to allow radios to almost
immediately figure out what frequency and "talk group" they
should be on even if their owner turns on the radio in the middle
of a conversation.  Subsequent lines all look the same so if one
is missing part of the data, it looks wrong, especially if you
have watched enough of this gibberish to damage your brain to
the point where it starts making sense.

        Martin
