Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
Background: I am considering buying a new disk (and will write an email later
with some other questions or observations about the process), but I know that,
at least often for SSD drives, they now specify what I will call the longevity
in terms of TB TBW (iiuc, that is terabytes total bytes written).

Anyway, I edit large files many times a day and try to save it at each edit or
partial edit (at a guess, one particular file is around 100 MB, and I may save
it 200 or more times a day).

There are two things I'd like to measure, and I'm wondering what tools (or
approaches) are available:

1. I'd like to count how many times a day I actually save the file.  (One
approach (at least I think I could do this) could be to write a sort of shell
script wrapper and always initiate saves using the shell script, but I was
hoping there was more of pre-built solution.)

2. A lot of my editing involves editing near (but not at) the end of a file.  I
assume (I know) that the software that saves the file is smart enough not to
rewrite the entire file but instead to preserve the beginning of the file and
just rewrite the changed part of the file (or from there to the end of the
file).

Can anyone confirm that, and, if so, suggest any way of measuring how much is
written to a given file in a given time period (e.g., per day)?

I guess at a very deep level (I mean like at the level of the disk firmware or
driver level), this may differ between an SSD and an HDD -- if you have any
insight into that, I'd appreciate that.

Thanks!


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Alexander V. Makartsev
On 06.04.2019 22:39, [hidden email] wrote:
> Background: I am considering buying a new disk (and will write an email later
> with some other questions or observations about the process), but I know that,
> at least often for SSD drives, they now specify what I will call the longevity
> in terms of TB TBW (iiuc, that is terabytes total bytes written).
>
> Anyway, I edit large files many times a day and try to save it at each edit or
> partial edit (at a guess, one particular file is around 100 MB, and I may save
> it 200 or more times a day).
>
> There are two things I'd like to measure, and I'm wondering what tools (or
> approaches) are available:
>
> 1. I'd like to count how many times a day I actually save the file.  (One
> approach (at least I think I could do this) could be to write a sort of shell
> script wrapper and always initiate saves using the shell script, but I was
> hoping there was more of pre-built solution.)
>
> 2. A lot of my editing involves editing near (but not at) the end of a file.  I
> assume (I know) that the software that saves the file is smart enough not to
> rewrite the entire file but instead to preserve the beginning of the file and
> just rewrite the changed part of the file (or from there to the end of the
> file).
>
> Can anyone confirm that, and, if so, suggest any way of measuring how much is
> written to a given file in a given time period (e.g., per day)?
>
> I guess at a very deep level (I mean like at the level of the disk firmware or
> driver level), this may differ between an SSD and an HDD -- if you have any
> insight into that, I'd appreciate that.
>
> Thanks!

As far as I understand, your final goal is to know how much data is written per day.
One way to find out is to use the 'smartctl' utility and collect the value of attribute #241, called "Lifetime_Writes_GiB", once a day.
Some SSD vendors may report it under a different ID number or a different name.

$ sudo smartctl -A /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-0.bpo.2-amd64] (local build)
...
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       13244
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       1350
...
241 Lifetime_Writes_GiB     0x0012   100   100   000    Old_age   Always       -       3526
242 Lifetime_Reads_GiB      0x0012   100   100   000    Old_age   Always       -       4581
...

Of course, if you are not familiar with programming, or not interested in writing a text parser, setting up a cronjob and
possibly installing monitoring software with fancy graphs, then my advice is not so good, but that is how I'd do it.
Now I'm kinda interested to know too whether there are less hacky methods to determine the volume of data written per device per day.
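
A minimal sketch of the "text parser" idea, in case it helps. It assumes the ten-column attribute layout shown in the output above, with the raw value in the last column; the function names are made up for illustration, and other drives may lay out or name the attribute differently.

```python
import subprocess
from typing import Optional

# Sample rows copied from the smartctl output quoted above.
SAMPLE = """\
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       13244
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       1350
241 Lifetime_Writes_GiB     0x0012   100   100   000    Old_age   Always       -       3526
242 Lifetime_Reads_GiB      0x0012   100   100   000    Old_age   Always       -       4581
"""


def parse_smart_raw(smartctl_output: str, attr_id: int) -> Optional[int]:
    """Return the raw value of one attribute from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with the numeric ID.
        if len(fields) >= 10 and fields[0] == str(attr_id):
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value is not a plain integer on this drive
    return None


def read_smart_raw(device: str, attr_id: int) -> Optional[int]:
    """Run `smartctl -A DEVICE` (usually needs root) and extract one raw value."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_raw(result.stdout, attr_id)


print(parse_smart_raw(SAMPLE, 241))  # prints 3526
```

Run once a day from cron and logged with a date, the difference between two consecutive readings gives the amount written that day, in whatever unit the drive uses for that attribute.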

-- 
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀⠀⠀⠀ 

Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Stefan Monnier
In reply to this post by rhkramer
> Anyway, I edit large files many times a day and try to save it at each
> edit or partial edit (at a guess, one particular file is around 100
> MB, and I may save  it 200 or more times a day).

So, we're looking at roughly 20GB / day (100 MB x 200 saves), if every
save rewrites the whole file.

> 1. I'd like to count how many times a day I actually save the file.  (One
> approach (at least I think I could do this) could be to write a sort of shell
> script wrapper and always initiate saves using the shell script, but I was
> hoping there was more of pre-built solution.)

It likely depends on the tool you're using to edit that file (in many
tools you can tweak the tool to keep track of that info).
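
If the tool itself can't keep count, a tool-independent fallback is to watch the file's modification time. This is only a sketch under assumptions: the function names are hypothetical, it polls rather than using a kernel notification mechanism, and saves that happen closer together than the polling interval are counted once.

```python
import os
import time


def count_mtime_changes(mtimes):
    """Count how often consecutive samples differ (each change is about one save)."""
    changes = 0
    previous = None
    for current in mtimes:
        if previous is not None and current != previous:
            changes += 1
        previous = current
    return changes


def watch(path, interval=1.0, duration=60.0):
    """Poll the mtime of `path` every `interval` seconds for `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            samples.append(os.stat(path).st_mtime_ns)
        except FileNotFoundError:
            pass  # the file may briefly vanish during a rename-based save
        time.sleep(interval)
    return count_mtime_changes(samples)
```

For example, `watch("/path/to/file", interval=1.0, duration=8 * 3600)` would count the saves seen during an eight-hour session.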

> 2. A lot of my editing involves editing near (but not at) the end of
> a file.  I assume (I know) that the software that saves the file is
> smart enough not to rewrite the entire file but instead to preserve
> the beginning of the file and just rewrite the changed part of the
> file (or from there to the end of the file).

Not completely sure if "you assume" or "you know" it to be the case.
Especially given that you then add:

> Can anyone confirm that,

which suggests you're not really sure (unless it referred to something else).


        Stefan


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Nicholas Geovanis-2

On Sat, Apr 6, 2019, 4:44 PM Stefan Monnier <[hidden email]> wrote:

> 2. A lot of my editing involves editing near (but not at) the end of
> a file.  I assume (I know) that the software that saves the file is
> smart enough not to rewrite the entire file but instead to preserve
> the beginning of the file and just rewrite the changed part of the
> file (or from there to the end of the file).

Not completely sure if "you assume" or "you know" it to be the case.
Especially given that you then add:

> Can anyone confirm that,

which suggest you're not really sure (unless it referred to something else).

Let's say you are using vi. Last I heard, it will buffer the entire file contents on the initial open (unless you use the right option etc). How about on file write?
I don't know. 
I submit that rather than try to figure out each app, just install sar and related utilities and view the aggregate throughput. Maybe use cacti for quicker-to-screen graphics if you want.
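
For a quick look at aggregate writes without installing anything, the kernel's own per-device counters can be read directly. Note that /proc/diskstats is not mentioned elsewhere in this thread; it is brought in here as an assumption about where such numbers can be found on Linux. In each line the tenth whitespace-separated field is the number of sectors written, always counted in 512-byte units. The sample text and function names below are made up for illustration.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors on every device

# Made-up snapshots of one /proc/diskstats line, taken before and after a save.
BEFORE = "   8       0 sda 12345 678 987654 4321 2345 890 456789 6543 0 7654 10864\n"
AFTER = "   8       0 sda 12350 678 987700 4330 2400 900 661589 6900 0 7800 11230\n"


def sectors_written(diskstats_text, device):
    """Return the 'sectors written' counter for `device`, or None if absent."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # fields: major, minor, name, then the statistics columns
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    return None


def bytes_written_between(before_text, after_text, device):
    """Bytes written to `device` between two snapshots of /proc/diskstats."""
    before = sectors_written(before_text, device)
    after = sectors_written(after_text, device)
    if before is None or after is None:
        return None
    return (after - before) * SECTOR_BYTES


print(bytes_written_between(BEFORE, AFTER, "sda"))  # prints 104857600
```

On a real system the two snapshots would come from reading `/proc/diskstats` before and after saving the file; anything else writing to the same device in between is counted too.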


        Stefan


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

andy smith-10
In reply to this post by rhkramer
Hi,

On Sat, Apr 06, 2019 at 01:39:27PM -0400, [hidden email] wrote:
> Background: I am considering buying a new disk (and will write an email later
> with some other questions or observations about the process), but I know that,
> at least often for SSD drives, they now specify what I will call the longevity
> in terms of TB TBW (iiuc, that is terabytes total bytes written).

"TBW" in the endurance specs for SSDs is normally "Terabytes
Written". Also that may be 10¹² (10^12) bytes or 2⁴⁰ (2^40) bytes.
Another common metric is Drive Writes Per Day (DWPD).
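
To put those units into numbers, here is a small worked calculation. The 100 MB file and 200 saves a day come from the original post; the 150 TBW rating and 250 GB capacity are invented purely for illustration, and the result is an upper bound on host writes that ignores write amplification inside the drive.

```python
def daily_bytes(file_mb, saves_per_day):
    """Bytes written per day if every save rewrites the whole file."""
    return file_mb * 10**6 * saves_per_day


def years_to_reach(tbw, bytes_per_day, binary=False):
    """Years until a TBW rating is used up; TB may mean 10**12 or 2**40 bytes."""
    total = tbw * (2**40 if binary else 10**12)
    return total / bytes_per_day / 365


def dwpd(bytes_per_day, capacity_bytes):
    """Drive Writes Per Day: daily writes as a fraction of drive capacity."""
    return bytes_per_day / capacity_bytes


per_day = daily_bytes(100, 200)       # 20 GB per day
print(per_day)                        # prints 20000000000
print(years_to_reach(150, per_day))   # hypothetical 150 TBW rating, decimal TB
print(dwpd(per_day, 250 * 10**9))     # hypothetical 250 GB drive
```

With these made-up ratings the workload would take a bit over 20 years to reach the TBW figure, and amounts to 0.08 drive writes per day.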

Like Alexander I use SMART attributes to monitor this. As Alexander
says, the usual attribute is 241. You will have to check what 241
corresponds to though. For example, on some of my machines 241 is
described as "Total_LBAs_Written" and measures 512 byte sectors. On
others I've found it uses units of 1MiB (2²⁰ bytes), 25MiB or 1GiB!

You can test by writing a known quantity of data to the device (say,
with dd) and then checking out with smartctl how much the counters
altered. Here's a blog post where I did this with some flash devices
to determine the 241 unit:

    http://strugglers.net/~andy/blog/2016/11/26/supermicro-sata-dom-flash-devices-dont-report-lifetime-writes-correctly/
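
The unit test described above boils down to a little arithmetic, sketched here. The candidate units are the ones mentioned in this thread; the function name is hypothetical, and the raw values passed in would come from two smartctl readings taken before and after writing a known amount of data.

```python
import math

# Units that attribute 241 (or its equivalent) has been seen to use in this thread.
CANDIDATE_UNITS = {
    "512 B sector": 512,
    "1 MiB": 2**20,
    "25 MiB": 25 * 2**20,
    "32 MiB": 32 * 2**20,
    "1 GiB": 2**30,
}


def guess_unit(raw_before, raw_after, bytes_written):
    """Guess which unit the counter uses, given a known amount written."""
    delta = raw_after - raw_before
    if delta <= 0:
        return None  # counter did not move: write more data and try again
    bytes_per_count = bytes_written / delta
    # Pick the candidate closest on a logarithmic scale, since other writes
    # happening at the same time make the measurement inexact.
    return min(
        CANDIDATE_UNITS,
        key=lambda name: abs(math.log(bytes_per_count / CANDIDATE_UNITS[name])),
    )


# Writing 1 GiB moved the counter by 2097152, so it counts 512-byte sectors.
print(guess_unit(1000, 1000 + 2097152, 2**30))  # prints 512 B sector
```

For coarse units such as 1 GiB, the amount written has to be several times the unit before the counter moves at all.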

Given that you can easily measure how much is written to the device,
do you still need to measure how much is written when editing
specific files?

Cheers,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Curt
In reply to this post by Stefan Monnier
On 2019-04-06, Stefan Monnier <[hidden email]> wrote:
>
>> 2. A lot of my editing involves editing near (but not at) the end of
>> a file.  I assume (I know) that the software that saves the file is
>> smart enough not to rewrite the entire file but instead to preserve
>> the beginning of the file and just rewrite the changed part of the
>> file (or from there to the end of the file).
>
> Not completely sure if "you assume" or "you know" it to be the case.
> Especially given that you then add:

He assumes he knows, I guess.

It might be pertinent for us to know what "the software" is exactly. At
any rate, why anyone would write "the software" rather than the name of the
application involved defies my imaginative faculty.

Maybe it's privileged information the OP must keep under wraps.

>> Can anyone confirm that,
>
> which suggest you're not really sure (unless it referred to something else).
>
>
>         Stefan
>
>


--
“Let us again pretend that life is a solid substance, shaped like a globe,
which we turn about in our fingers. Let us pretend that we can make out a plain
and logical story, so that when one matter is despatched--love for instance--
we go on, in an orderly manner, to the next.” - Virginia Woolf, The Waves


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Curt
On 2019-04-07, Curt <[hidden email]> wrote:

> On 2019-04-06, Stefan Monnier <[hidden email]> wrote:
>>
>>> 2. A lot of my editing involves editing near (but not at) the end of
>>> a file.  I assume (I know) that the software that saves the file is
>>> smart enough not to rewrite the entire file but instead to preserve
>>> the beginning of the file and just rewrite the changed part of the
>>> file (or from there to the end of the file).
>>
>> Not completely sure if "you assume" or "you know" it to be the case.
>> Especially given that you then add:
>
> He assumes he knows, I guess.
>
> It might be pertinent for us to know what "the software" is exactly. At
> any rate, why anyone would write "the software" rather the name of the
> application involved defies my imaginative faculty.

I probably should have said the name or nature of the software involved
(to cover the case where it's some sort of home-brew thing with no
denomination, or an obscure proprietary app whose name wouldn't mean
anything to anybody, or God knows what).


> Maybe it's privileged information the OP must keep under wraps.
>
>>> Can anyone confirm that,
>>
>> which suggest you're not really sure (unless it referred to something else).
>>
>>
>>         Stefan
>>
>>
>
>


--
“Let us again pretend that life is a solid substance, shaped like a globe,
which we turn about in our fingers. Let us pretend that we can make out a plain
and logical story, so that when one matter is despatched--love for instance--
we go on, in an orderly manner, to the next.” - Virginia Woolf, The Waves


Re: [OFF-LIST] Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by rhkramer
Thanks to all who responded (even off list)!

I will respond to some of the other posts if they did something like ask a
question.

I am exploring smartctl and sar. (I found atsar for wheezy and loaded it, and
smartctl is in smartmontools -- I had heard of (and even used) smartctl sometime
in the distant past, but don't recall ever having heard of sar -- it looks
useful but will require some man page reading.)

On Saturday, April 06, 2019 04:01:14 PM Richard Owlett wrote:
> *BEWARE*
> I've replied offline because I've done some *HEAVY* handed editing.

Everybody should do heavy handed editing -- edit it down to what you are
responding to.

If you edit appropriately, it is much more appropriate to reply to the list
where your response (and the responders to your post) can help anybody /
everybody -- I will send my response to the list.

(Aside: You surely don't need to use bold caps, either would be enough by
itself.)

> I wanted to respond to specific details without causing chaos.
> i am *NO WAY* an expert authority
>
> On 04/06/2019 12:39 PM, [hidden email] wrote:
> > Background: I am considering buying a new disk but I know that,
> > at least often for SSD drives, they now specify what I will call the
> > longevity in terms of TB TBW (iiuc, that is terabytes total bytes
> > written).
>
> Will the new disk be mechanical or SSD?

That is part of the decision I'm trying to make.

> The references you refer to --
> Do they consider newer technologies referred to as "wear leveling"?

I didn't look for that.  From other reading, my understanding is that "good"
SSDs do wear leveling; that is how they obtain large TBW ratings when each
cell (iiautrw (if I am using the right word)) has a rating of something like
1000 write cycles.  (I mean, if you wrote to the same place 1000 times, you
might wear it out, but by using wear leveling, you write the same (or some of
the same) information to different parts of the disk.  You might write some
files 10,000 times, but not to the same place, thus the 1000 writes don't wear
out one spot.)

> > Anyway, I edit large files many times a day and try to save it at
> > each edit or  partial edit (at a guess, one particular file is
> > around 100 MB, and I may save it 200 or more times a day).
>
> 100 MB is *NOT* large.

It is if it is all my own text / prose.  (And, in any case, it is to me ;-)

> > [SNIP]
> >
> > 2. A lot of my editing involves editing near the end of a file.  I
> > assume that the software that saves the file is smart enough not to
> > rewrite the entire file but instead to preserve the beginning of the
> > file and just rewrite the changed part of the file (or from there to
> > the end of the file).
>
> As phrased, I doubt that is a safe assumption.
>
> Before commenting
>     I assert I am *NOT* an expert ;/
>
> 1. I suggest you describe what you are editing.
> 2. As 100MB is rather small, why not do total save with serial number as
> part of filename and once a day/week/month save data under different
> filename and then reuse previous data space.

I am not sure what that would accomplish (except requiring more work on my
part).

> This is worth exactly what you paid for it ;/
> I'm satisfied if I provided food for thought.
>
> *YMMV*


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by Stefan Monnier
Again, thanks to all who replied, some comments below.

On Saturday, April 06, 2019 05:44:19 PM Stefan Monnier wrote:
> > 2. A lot of my editing involves editing near (but not at) the end of
> > a file.  I assume (I know) that the software that saves the file is
> > smart enough not to rewrite the entire file but instead to preserve
> > the beginning of the file and just rewrite the changed part of the
> > file (or from there to the end of the file).
>
> Not completely sure if "you assume" or "you know" it to be the case.

Sorry, I should have tried to be more clear -- sort of a digression, but I
came from an environment where anytime someone used the word assume, someone
else would point out what (they thought) that meant (it makes an ass out of
[yo]u and me).

I still use the word, but use the "(I know)" as a defensive mechanism to stave
off the expected response.
>
> Especially given that you then add:
> > Can anyone confirm that,
>
> which suggest you're not really sure (unless it referred to something
> else).

To (try to) be clear, I am not sure whether only the changed part (or from the
changed part of the file to the end) is written.  I can imagine that is
reasonably possible -- I mean, the file is stored in blocks on the disk, and
some of those blocks are not changed, so why rewrite them?  OTOH, something
has to be smart enough (be programmed well enough) to recognize that and avoid
the writes.


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Erik Christiansen
On 07.04.19 08:12, [hidden email] wrote:
> Sorry, I should have tried to be more clear -- sort of a digression, but I
> came from an environment where anytime someone used the word assume, someone
> else would point out what (they thought) that meant (it makes an ass out of
> [yo]u and me).

Not the boys in blue, by any chance? Station sergeants tend to plant
that lesson early in a rookie's consciousness, I hear.

> I still use the word, but use the "(I know)" as a defensive mechanism to stave
> off the expected response.

Given that the implication of "assume" is to take something on faith,
without supporting evidence, a safer word might be "surmise". A guess
with an implication of thoughtful deliberation behind it leaves little
for the overly opinionated to gnaw on.

Erik


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by andy smith-10

Again, thanks to all who replied -- one comment below -- oops, that changed ;-)

 

On Saturday, April 06, 2019 09:22:17 PM Andy Smith wrote:
> You can test by writing a known quantity of data to the device (say,
> with dd) and then checking out with smartctl how much the counters
> altered. Here's a blog post where I did this with some flash devices
> to determine the 241 unit:
>
> http://strugglers.net/~andy/blog/2016/11/26/supermicro-sata-dom-flash-devices-dont-report-lifetime-writes-correctly/
>
> Given that you can easily measure how much is written to the device,
> do you still need to measure how much is written when editing
> specific files?

No. And the approach of using dd (or one of my editors) to write a known quantity of data (or a file of known size, in the case of trying to determine whether the entire file is written or just the changed part) will be useful.

(Aside: I do have to confirm the meaning of some of the "raw" numbers from smartctl -- some of them don't seem reasonable, e.g., the ones quoted below.)

Selected smartctl attributes and other info from my 5 year old (well, 5 years in service) WD 250 GB drive:

(Aside: I'm not really looking for anyone to respond to my comments / questions embedded below, I guess I'll be doing some reading / googling, but any responses are welcome.)

<quotes, out of sequence, these for the HDD> <I put some comments / questions in angle brackets on these lines>
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 217192201232731 <really, or is that a negative number or something>
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 372986671 <I guess I need to read the provided link to determine how many bytes in an LBA on my system>
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 2162381618
...
1 Raw_Read_Error_Rate 0x000f 111 099 006 Pre-fail Always - 41619524 <that seems high, maybe the disk is approaching failure>
3 Spin_Up_Time 0x0003 100 100 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 25 <I guess this gives me a good count of the number of reboots in those 5 years (although the other drive shows only 22 -- maybe I installed that drive later)>
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0 <that sounds like a good thing>
7 Seek_Error_Rate 0x000f 072 060 030 Pre-fail Always - 16695206 <that seems high, maybe the disk is approaching failure>
9 Power_On_Hours 0x0032 053 053 000 Old_age Always - 41241 <correlates with a drive in service for 5 years>
...
Device is: Not in smartctl database [for details use: -P showall] <I'm not sure whether that is important or not -- the other drive in this system (the SSD drive) is in the smartctl database -- I wonder if being in the database implies more reads or writes in order to collect smartctl data?>
...
Sector Size: 512 bytes logical/physical <I'm pretty sure that does not tell me the LBA size (but, with a headache this morning, I'm not thinking very well)>
</quotes, out of sequence, these for the HDD>

<quotes, out of sequence, these for the SSD (80 GB)>
3 Spin_Up_Time 0x0020 100 100 000 Old_age Offline - 0 <those SSDs spin up pretty fast ;-) >
4 Start_Stop_Count 0x0030 100 100 000 Old_age Offline - 0
5 Reallocated_Sector_Ct 0x0032 100 100 000 Old_age Always - 11 <Hmm, a little bit of a surprise, because the other drive has 0>
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 41270 <interesting in that the other drive has 41241: close, but not exact>
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 22
192 Unsafe_Shutdown_Count 0x0032 100 100 000 Old_age Always - 4 <hmm, this is not collected for the other drive, but it lists 25 power cycles -- maybe the other drive doesn't distinguish, but, even if I add this to the 22 power cycles (=26) there is a fencepost error (off by one)>
225 Host_Writes_32MiB 0x0030 100 100 000 Old_age Offline - 468009 <so I guess I can multiply 32 MiB times 468009 and then divide by 5 years worth of days to get an approximate per-day figure: about 8,206 MiB, i.e. roughly 8.0 GiB / day -- seems surprisingly high because I tried to put the system stuff (the executables and such) on this drive with the expectation that they might get read often, but not written often -- I'll probably look into this a little more at some point>
226 Workld_Media_Wear_Indic 0x0032 100 100 000 Old_age Always - 5133309 <it would be interesting to know what this is telling me>
227 Workld_Host_Reads_Perc 0x0032 100 100 000 Old_age Always - 0
228 Workload_Minutes 0x0032 100 100 000 Old_age Always - 1012829346 <my calculation (1825 days x 60 x 24) shows only 2,628,000 minutes in 5 years -- maybe the drive is in some kind of time warp ;-) >
232 Available_Reservd_Space 0x0033 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0032 096 096 000 Old_age Always - 0 <it would be interesting to know what this is telling me>
184 End-to-End_Error 0x0033 100 100 090 Pre-fail Always - 0 <this sounds good, whatever it means>
</quotes, out of sequence, these for the SSD (80 GB)>
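
The arithmetic on those raw values can be written out as below. Two things are assumptions rather than facts from the drives: that Total_LBAs_Written on the HDD counts 512-byte logical sectors (matching the reported sector size, though as noted earlier in the thread the unit varies between drives), and that "5 years" means 1825 days. The function names are made up.

```python
MIB = 2**20
GIB = 2**30


def host_writes_gib(raw_value, unit_mib=32):
    """Convert a Host_Writes_32MiB raw value to GiB."""
    return raw_value * unit_mib * MIB / GIB


def lbas_to_gib(lba_count, sector_bytes=512):
    """Convert an LBA count to GiB, assuming `sector_bytes` per LBA."""
    return lba_count * sector_bytes / GIB


def per_day(total, days):
    """Average amount per day over `days` days."""
    return total / days


ssd_total = host_writes_gib(468009)
print(ssd_total)                        # prints 14625.28125 (GiB)
print(per_day(ssd_total, 1825))         # GiB/day over 5 calendar years
print(per_day(ssd_total, 41270 / 24))   # GiB/day over the power-on hours
print(lbas_to_gib(372986671))           # HDD total, if an LBA is 512 bytes
```

That gives about 8.0 GiB a day for the SSD over 1825 days (about 8.5 GiB a day if only the 41270 power-on hours are counted), and just under 178 GiB written to the HDD in total under the 512-byte assumption.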


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by Curt
On Sunday, April 07, 2019 05:41:42 AM Curt wrote:

> On 2019-04-06, Stefan Monnier <[hidden email]> wrote:
> >> 2. A lot of my editing involves editing near (but not at) the end of
> >> a file.  I assume (I know) that the software that saves the file is
> >> smart enough not to rewrite the entire file but instead to preserve
> >> the beginning of the file and just rewrite the changed part of the
> >> file (or from there to the end of the file).
> >
> > Not completely sure if "you assume" or "you know" it to be the case.
>
> > Especially given that you then add:
> He assumes he knows, I guess.

I tried to address that in a different response today.  I do find it fascinating
that, so often (it seems to me), people on this list speculate on what someone
else means, or answer for them, instead of either asking them or waiting a
reasonable time for a response.  (I guess a reasonable time is in the mind of
the beholder.)

And forgive me if I'm a little abrupt this morning, I'll use my headache as an
excuse.

> It might be pertinent for us to know what "the software" is exactly. At
> any rate, why anyone would write "the software" rather the name of the
> application involved defies my imaginative faculty.
>
> Maybe it's privileged information the OP must keep under wraps.

I thought "edit" would be pretty clear (implying a text editor / word
processor), but, to get more specific, depending on the file I use kate, nedit,
or kwrite, and I am starting to migrate toward using any scintilla based
editor -- I need some software written (in scintilla terms, some lexers /
folders) to make that migration -- I've now found a student in Serbia who is
helping me with that (because I could never grok C/C++ or the scintilla code
base).  (But, by paying attention to what he is writing, I think I may
overcome that problem.)

And thinking about it today, editing could refer to things like audio or video
files (and maybe other things), but I'm not sure that would make a difference to
my questions or the answers.
 
> >> Can anyone confirm that,

Well, with atsar or smartctl, I anticipate some experiments that might confirm
that for me.


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by Erik Christiansen

Thanks for the response -- two comments below:

 

On Sunday, April 07, 2019 08:53:16 AM Erik Christiansen wrote:
> Not the boys in blue, by any chance? Station sergeants tend to plant
> that lesson early in a rookie's consciousness, I hear.

No -- a large steel company that went out of business around 2000, built the Empire State Building and the Golden Gate Bridge, and a ship a day for a period of time (something like 2 years, iiuc) during WWII. (Bethlehem Steel)

> > I still use the word, but use the "(I know)" as a defensive mechanism to
> > stave off the expected response.
>
> Given that the implication of "assume" is to take something on faith,
> without supporting evidence, a safer word might be "surmise". A guess
> with an implication of thoughtful deliberation behind it leaves little
> for the overly opinionated to gnaw on.

Good suggestion, thanks!

Just looking for the definition of assume -- I don't think I ever looked it up (never had to, I was always given the definition, as explained elsewhere ;-) ). This is one (there are others; some relate to things like "assuming responsibility", for example):

<quote>
Assume | Definition of Assume by Merriam-Webster
https://www.merriam-webster.com/dictionary/assume
Assume and presume both mean "to take something for granted" or "to take something as true," but the words differ in the degree of confidence the person assuming or presuming has. Presume is used when someone is making an informed guess based on reasonable evidence. Assume is used when the guess is based on little or no evidence.
</quote>

Anyway, based on this, I might also consider "presume" as an alternative to assume.


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Stefan Monnier
In reply to this post by rhkramer
>> Not completely sure if "you assume" or "you know" it to be the case.
> Sorry, I should have tried to be more clear -- sort of a digression, but I
> came from an environment where anytime someone used the word assume, someone
> else would point out what (they thought) that meant (it makes an ass out of
> [yo]u and me).

[ I think I understand what you mean.  In French, "assume" means something
  quite different from the use we're discussing so I use "présume"
  instead, which also works in English (modulo the accent, obviously).  ]

> I thought "edit" would be pretty clear (implying a text editor / word
> processor), but, to get more specific, depending on the file I use
> kate, nedit, or kwrite, and I am starting to migrate toward using any
> scintilla based editor.

I don't specifically know what those do, but at least I know Emacs tries
to avoid "unnecessary" modifications when *re-reading* a file, but makes
no such effort when writing it: it would just blindly write the 100MB on
top of the old content.

So I would not be surprised if those other text editors do likewise.

> To (try to) be clear, I am not sure whether only the changed part (or
> from the  changed part of the file to the end is written).  I can
> imagine that is  reasonably possible -- I mean, the file is stored in
> blocks on the disk, and  some of those blocks are not changed, so why
> rewrite them.

Indeed, it's definitely possible.  But there can be various reasons not
to do that:
- it's simpler to send the 100MB and forget about it than having to
  first read (the beginning of) those 100MB to see which part was
  left unchanged.
- in order to make the save atomic, the editor may prefer to write the
  100MB to another file and only when that's done rename that file to
  overwrite the old file.
- ... probably other reasons ...
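
The second reason can be sketched as follows. This is an illustration of the general write-to-temporary-file-then-rename pattern, not a description of what kate, nedit, kwrite or Emacs actually do; the function name and the temporary file prefix are made up.

```python
import os
import tempfile


def atomic_save(path, data):
    """Write `data` (bytes) to a temporary file, then rename it over `path`.

    Returns the number of bytes written, which is always the full length of
    the data, no matter how little changed since the previous save.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-save-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.replace(tmp_path, path)  # atomically replaces the old file
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return len(data)
```

With a save like this, changing one byte near the end of a 100MB buffer still sends the whole 100MB to the filesystem, since the new copy is a different file until the rename.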

> Well, with atsar or smartctl, I anticipate some experiments that might
> confirm that for me.

Sounds like a better approach, indeed (after all, you don't really care
about what your tool does so much as you care about the resulting amount
of writes that gets sent to the disk).


        Stefan


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Carles Pina i Estany-2
In reply to this post by rhkramer

Hi,

On Apr/06/2019, [hidden email] wrote:

> Can anyone confirm that, and, if so, suggest any way of measuring how much is
> written to a given file in a given time period (e.g., per day)?
>
> I guess at a very deep level (I mean like at the level of the disk firmware or
> driver level), this may differ between an SSD and an HDD -- if you have any
> insight into that, I'd appreciate that.

In my SSDs I have:
/sys/fs/ext4/dm-0/lifetime_write_kbytes

I'm not sure if this is specific to SSDs? But if it is available on
your existing system, it would help you answer some of your questions.
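
A small sketch for reading that counter, assuming the layout shown above: one directory per mounted ext4 filesystem under /sys/fs/ext4, each with its own lifetime_write_kbytes file. The function name is made up, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory.

```python
import glob
import os


def ext4_lifetime_writes(sysfs_root="/sys/fs/ext4"):
    """Return {device name: lifetime_write_kbytes} for each ext4 filesystem found."""
    result = {}
    pattern = os.path.join(sysfs_root, "*", "lifetime_write_kbytes")
    for counter_path in glob.glob(pattern):
        device = os.path.basename(os.path.dirname(counter_path))
        with open(counter_path) as counter:
            result[device] = int(counter.read().strip())
    return result


if __name__ == "__main__":
    for device, kbytes in sorted(ext4_lifetime_writes().items()):
        print(device, kbytes)
```

Reading the value before and after a save, or once a day, gives the amount written to that one filesystem in between.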

Cheers,

--
Carles Pina i Estany
        GPG Key 0x8CD5C157


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Reco
        Hi.

On Sun, Apr 07, 2019 at 10:10:58PM +0200, Carles Pina i Estany wrote:

>
> Hi,
>
> On Apr/06/2019, [hidden email] wrote:
>
> > Can anyone confirm that, and, if so, suggest any way of measuring how much is
> > written to a given file in a given time period (e.g., per day)?
> >
> > I guess at a very deep level (I mean like at the level of the disk firmware or
> > driver level), this may differ between an SSD and an HDD -- if you have any
> > insight into that, I'd appreciate that.
>
> In my SSDs I have:
> /sys/fs/ext4/dm-0/lifetime_write_kbytes
>
> I'm not sure if this is specific for SSD?

No, it's not. It's filesystem-specific though.
Meaning - you have to use ext4 to see this attribute, but the device
where the ext4 filesystem resides does not matter.

Reco


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Curt
On 2019-04-07, Reco <[hidden email]> wrote:
>>
>> I'm not sure if this is specific for SSD?
>
> No, it's not. It's filesystem-specific though.
> Meaning - you have to use ext4 to see this attribute, but the device
> where the ext4 filesystem resides does not matter.
>
> Reco
>

Maybe an SSD (arriving as I am now at the chocolate-covered banana
response to Euro Zone monetary integration) is not the most appropriate
storage device for frequent editing of large files.


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by Reco
On Sunday, April 07, 2019 04:22:41 PM Reco wrote:
> On Sun, Apr 07, 2019 at 10:10:58PM +0200, Carles Pina i Estany wrote:

> > In my SSDs I have:
> > /sys/fs/ext4/dm-0/lifetime_write_kbytes
> >
> > I'm not sure if this is specific for SSD?
>
> No, it's not. It's filesystem-specific though.
> Meaning - you have to use ext4 to see this attribute, but the device
> where the ext4 filesystem resides does not matter.

Well, to clarify, if you have multiple ext4 filesystems, does that represent
the sum of lifetime_write_kbytes of all of those filesystems?

(And, iiuc, any ext2 (or other) filesystems are not included in that sum.)


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

rhkramer
In reply to this post by Curt
On Monday, April 08, 2019 03:40:54 AM Curt wrote:
> Maybe an SSD is not the most appropriate
> storage device for frequent editing of large files.

That is what I'm trying to decide / determine.

> (arriving as I am now at the chocolate-covered banana
> response to Euro Zone monetary integration)

I have no idea what that means (although I'm pretty sure it is not germane to
this discussion).


Re: Measuring (or calculating) how many bytes are actually written to disk when I repeatedly save a file

Curt
On 2019-04-08, [hidden email] <[hidden email]> wrote:
> On Monday, April 08, 2019 03:40:54 AM Curt wrote:
>> Maybe an SSD is not the most appropriate
>> storage device for frequent editing of large files.
>
> That is what I'm trying to decide / determine.

It is? Sorry. I guess I was tragically thrown off the scent by your
subject line (as well as other bytes in your body).

How about:

 Subject: SSD for frequent edits of large text files?

 Hi kids!

 I frequently edit large text files (100-200MB) in Kwrite and Kate and
 am in the market for a new hard drive. Would an SSD be a judicious
 choice for this scenario? Thanks!
