Because I can and it’s needed
As much as there is disk space available to the user.
On average, about twice
(read rate / write rate)
6 months of access to nextGEMS Cycle 3: full-resolution daily output
blue = all regions | red = all time steps | white = both equally | gray = none
Disk space / simulation duration
(for models doing asynchronous output)
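To make the rule of thumb concrete, here is a hedged back-of-envelope sketch; the 100 TB output volume and 30-day runtime are invented illustration values, not figures from the talk.

    # Required sustained write rate = disk space written / simulation wall-clock duration.
    # All numbers below are hypothetical, for illustration only.
    output_volume_tb = 100                    # total output of the experiment, in TB (assumed)
    runtime_days = 30                         # wall-clock duration of the run (assumed)
    rate_mb_per_s = output_volume_tb * 1e6 / (runtime_days * 86400)   # TB -> MB, days -> s
    print(f"required sustained write rate: {rate_mb_per_s:.0f} MB/s") # roughly 39 MB/s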
As much as we can get
For restart files
Yes, sure; postprocessing, for example
Based on one- or two-week write speed data kindly provided by Carsten Beyer. Includes /scratch/ and /work/.
I hide post-processing in the data flow for simplicity
40 drives, half of them for writing, 300 MB/s per drive
Off by a factor of two, but still not too bad – probably need to account more strongly for data that is re-processed or garbage
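Spelled out, that back-of-envelope estimate looks like this (a sketch of the assumptions stated above, nothing more):

    # 40 drives, half of them assumed to be writing at any given time, 300 MB/s each.
    drives_total = 40
    drives_writing = drives_total // 2                    # 20 drives
    per_drive_mb_per_s = 300
    aggregate_gb_per_s = drives_writing * per_drive_mb_per_s / 1000
    print(f"estimated aggregate write bandwidth: {aggregate_gb_per_s:.0f} GB/s")  # 6 GB/s
    # As noted above, this is off from the observed traffic by about a factor of two,
    # plausibly because re-processed or later-discarded data is not accounted for.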
Binning of one-minute traffic data for one- or two-week intervals kindly provided by Carsten Beyer. Includes /scratch/ and /work/.
Green = Lustre read
Yellow = Lustre write
(work and scratch)
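A minimal sketch of the kind of binning described in the note above, assuming the one-minute samples sit in a CSV with columns lustre_read and lustre_write; the file name, column names, and bin count are my assumptions, not details of the data Carsten Beyer provided.

    # Histogram the one-minute Lustre traffic samples of a one- or two-week interval.
    import numpy as np
    import pandas as pd

    # Hypothetical input layout: one row per minute with read/write traffic columns.
    df = pd.read_csv("lustre_traffic_one_minute.csv")

    # Shared bin edges so read (green in the plot) and write (yellow) are comparable.
    read_counts, edges = np.histogram(df["lustre_read"], bins=50)
    write_counts, _ = np.histogram(df["lustre_write"], bins=edges)
    print(edges, read_counts, write_counts, sep="\n")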
Thanks to Carsten Beyer for pointing this out.
(and a bunch of similar jobs on the other GPU nodes)
See Pay’s Tech Talk on Jan 16 for details on ClusterCockpit.
Based on one- or two-week traffic data kindly provided by Carsten Beyer. Includes /scratch/ and /work/.
Users requesting tons of files from one node
Users compensating for bad access patterns and file formats (GRIB) by throwing many nodes at the problem.