Infrastructure meetings take place every other Thursday at 9:00 Pacific on the BlueJeans infrastructure-meeting channel: https://bluejeans.com/383721668

Date

Goals


Item | Who | Notes
Review of last meeting notes
  • Any updates

Dedicated batch compute nodes (slurm)

Unknown User (mbutler)
  • Need a long-term allocation process.
  • Need a policy on who should run on the batch compute nodes rather than on lsst-dev01.
  • lsst-dev01 gets bogged down sometimes; the resource needs to be managed better from a project perspective.
  • Do we need more lsst-dev01-style systems (lsst-dev02, etc.)?
  • Or add more login nodes as submit machines, and make lsst-dev01 a more managed resource?
  • Something to think about.
GPFS usage

Unknown User (mbutler)
  • 3 PB installed. NCSA is working out where all this storage will be allocated. Much of it is needed for incoming datasets, among other uses. Need to put forth a proposal, after JTM.
  • Home directory quota:
    • Identify which users are affected by the quota, get them under it, and then post a notice in the MOTD or similar.
    • Want an inode quota too, with the same process as above: determine who would be impacted by it (probably no one, since inode quotas are usually huge).
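The "who would be affected" check for the home directory quotas could be sketched as below. This is only an illustration: the `Usage` record, the `users_over_quota` helper, and the idea of feeding it per-user numbers are assumptions, and the usage figures themselves would have to come from whatever quota reporting the file system provides.

```python
from typing import Dict, List, NamedTuple


class Usage(NamedTuple):
    """Per-user home directory usage (hypothetical record, for illustration)."""
    bytes_used: int
    inodes_used: int


def users_over_quota(
    usage: Dict[str, Usage],
    byte_quota: int,
    inode_quota: int,
) -> Dict[str, List[str]]:
    """Map each user who exceeds a proposed limit to the limits they exceed.

    A user is listed only if usage is strictly greater than a limit.
    """
    affected: Dict[str, List[str]] = {}
    for user in sorted(usage):
        u = usage[user]
        exceeded = []
        if u.bytes_used > byte_quota:
            exceeded.append("space")
        if u.inodes_used > inode_quota:
            exceeded.append("inodes")
        if exceeded:
            affected[user] = exceeded
    return affected
```

Running this against candidate quota values before enforcing them would show how many users need to clean up first, and whether the inode quota really affects no one.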
DTN nodes

Unknown User (mbutler)
  • Have been ordered but have not yet arrived.
Kubernetes nodes

Unknown User (mbutler)
  • Are installed and functioning. Very understanding early users are being put on the systems this week, today or tomorrow (Adam).
  • Should create a doc for the configuration and setup of the nodes.
  • After early tests are complete, the process for changes is to open a ticket.
NCSA maintenance

Unknown User (mbutler)
  • The March maintenance includes the Meltdown and Spectre security patches. We are waiting for a few more updates from vendors, who say they will post them this week. I will nail down the timing of the update next week.
lsst-dev01 /tmp filling up

/tmp filled up this week due to some 32 GB files in /tmp. Impact SHOULD have been minimal, but /tmp is not its own file system and shares the root file system (/).

/tmp and the root file system need to be split so that each has its own file system, and a cleanup/purge process for /tmp should probably run daily to get rid of cruft. First, though, we need to look at the files in /tmp to see what kind of policy might work.
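As a starting point for looking at what is in /tmp, a read-only scan along these lines could list the large, old files a purge policy would target. It is a minimal sketch: the function name `find_purge_candidates` and its thresholds are hypothetical, and it only reports files and never deletes anything.

```python
import os
import stat
import time
from typing import List, Optional, Tuple


def find_purge_candidates(
    root: str,
    min_size_bytes: int = 0,
    min_age_days: float = 0.0,
    now: Optional[float] = None,
) -> List[Tuple[str, int, float]]:
    """List (path, size_bytes, age_days) for regular files under root that are
    at least min_size_bytes large and at least min_age_days old (by mtime),
    largest first. Read-only: nothing is removed."""
    now = time.time() if now is None else now
    candidates = []
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat so symlinks are not followed
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, FIFOs, etc.
            age_days = (now - st.st_mtime) / 86400.0
            if st.st_size >= min_size_bytes and age_days >= min_age_days:
                candidates.append((path, st.st_size, age_days))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates
```

For example, `find_purge_candidates("/tmp", min_size_bytes=1 << 30, min_age_days=7)` would list files of at least 1 GiB that have not been modified in a week. Both thresholds are placeholders until an actual policy is agreed.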

Revamp of accounts process

Unknown User (mbutler)
  • I have no idea what this item is and will need to investigate it a bit.
PDAC Status

Unknown User (xiuqin)
  • Fritz said on the call today that he is going to start working on the Kubernetes config for the PDAC nodes.
Topics for next meeting



Action items

Please enter action items in the form

Responsible Person, Due Date, Description