Date:

Location:

Browser: https://bluejeans.com/690803707 or https://ls.st/rsv

Room System:

  1. Dial: 199.48.152.152 or bjn.vc
  2. Enter Meeting ID: 690803707 -or- use the pairing code

Phone Dial-in:

  • +1 408 740 7256
  • +1 888 240 2560 (US Toll Free)
  • +1 408 317 9253 (Alternate Number)

Meeting ID: 690803707

Attendees

Regrets

Metric Tracking Dashboard


URL: 

Discussion items

Item | Who | Pre-meeting notes | Notes and Action Items
News, announcements and task review

Welcome to Erik Dennihy



Bugs, issues, topics, etc of the week

All
  • The JSON/lsst.verify NaN issue in faro was resolved by turning faro off. We need to
    • a) get a new validation dataset in place and turn faro back on, and
    • b) work out how to deal with NaN when reporting to SQuaSH (we will have to deal with NaN)
  • validate_drp is also failing due to NaN. Recommend leaving it to the pipelines team to decide what to do, as they are still using it for Gen2/Gen3 parity testing.
  • RFC-793: use pytest in science pipelines. Support! We don't have enough unit tests in faro.

The failure was due to a change in the JSON reader: the Python requests module will not ship invalid JSON (containing NaN) over HTTP. The datasets being used for validate_drp do not produce valid metrics due to changes in the implementation, so it was disabled in nightly builds. The longer-term solution is to develop a better CI dataset.
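The JSON problem above can be reproduced with the standard library alone; the snippet below is a minimal sketch (the metric name "PA1" and the sanitize helper are illustrative assumptions, not faro's actual fix):

```python
import json
import math

# By default, json.dumps emits the bare token NaN, which is not valid JSON
# (the JSON grammar has no NaN literal), so strict readers reject the payload.
payload = json.dumps({"PA1": float("nan")})  # → '{"PA1": NaN}'

# One possible workaround: replace non-finite floats with None so they
# serialize as the valid JSON token null before sending over HTTP.
def sanitize(value):
    """Replace non-finite floats with None; pass everything else through."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    return value

safe_payload = json.dumps({"PA1": sanitize(float("nan"))})  # → '{"PA1": null}'
```

Serializing as null sidesteps the transport issue but loses the distinction between "not computed" and "computed as NaN", which is part of the sentinel-value discussion below.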

Should we stop using NaN as a sentinel value for the case where we attempted to compute the metric but had insufficient data?

Multiple issues

  • Transport layer
  • Do we want to have more classes of failure status? For example:
    • Not enough data
    • Algorithm failed
  • What happens when NaN goes to InfluxDB? Currently it is treated as though those metrics do not exist.

There is a distinction between visualization and what is stored in the database. NaN values do not show up as entries in tables in Chronograf. We want to be aware of missing results.

  • Jeffrey Carlin: make a proposal to define failure statuses for metrics.
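A minimal sketch of what such explicit failure statuses could look like (the class, names, and threshold below are assumptions for discussion, not part of lsst.verify or faro):

```python
import enum
import math

class MeasurementStatus(enum.Enum):
    """Hypothetical failure classes for a metric measurement, covering the
    two cases raised above: not enough data vs. the algorithm failing."""
    OK = "ok"
    INSUFFICIENT_DATA = "insufficient_data"
    ALGORITHM_FAILED = "algorithm_failed"

def classify_measurement(value, n_data_points, min_points=10):
    """Attach an explicit status instead of relying on NaN as a sentinel."""
    if n_data_points < min_points:
        return MeasurementStatus.INSUFFICIENT_DATA
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MeasurementStatus.ALGORITHM_FAILED
    return MeasurementStatus.OK
```

Storing a status alongside the value would let the transport layer and the dashboard distinguish "skipped" from "failed" rather than inferring both from NaN.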

Are we concerned that we do not currently have valid metric values from nightly CI to view on the dashboard? The breakages are in CI, not the RC2 dataset. Currently we think of nightly CI mainly as a check for breakages, not as a place to look at performance. We do care about performance for larger datasets.

Simon has been compiling a larger dataset that could be a middle ground.

Simon met with DRP folks. Once Dan Taranu is back from vacation, they will create a dashboard for the DRP metrics monitoring meeting. This should change soon. The faro development group may need to help with dashboard creation.

Reprocessing status and metrics review  

Dashboards can be found here: https://chronograf-demo.lsst.codes/sources/2/dashboards

Monthly reprocessing: 

w30 processing is in progress, with few errors; mostly done. Metrics should have been calculated, but are not yet being displayed on a dashboard.

  • Jeffrey Carlin: check with Yusra on the status of the faro dashboard for RC2
Development status 
  • Fall 2021 epic 
  • Backlog epic: 
  •   : there has been a lot of discussion on this PR. Summarize the main takeaways:
    • Add DM license preamble to all python files
      • Started this for all the files touched so far
    • Create / improved doc strings
    • Homogenize run method signatures across all the measurement tasks. The suggestion is, in general, to inherit the run method from the CatalogMeasurementBaseTask base class, which can accept and pass keyword arguments. We should avoid passing dataIds to measurement tasks where possible; for example, we can use a FilterLabel or a list of FilterLabels instead of passing dataIds if we just need to know which bands are being used.
    • Added most of the utility for loading external reference catalogs into the CatalogMeasurementBaseTask base class
    • Add unit tests for task code in faro
    • Use consistent DM class naming convention for base classes
      • Started this
    • Development of reference catalog loading capability in faro with proper configuration options led to discovery and resolution of a bug in LoadReferenceCatalogTask. Thanks to Eli.
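The "inherit run from the base class" pattern discussed above can be sketched as follows. The class names follow the notes, but the method bodies and arguments are simplified stand-ins, not the actual faro implementation:

```python
class CatalogMeasurementBaseTask:
    """Base task whose run method accepts and forwards keyword arguments."""

    def run(self, catalog, **kwargs):
        # Forward everything to measure(); subclasses need not override run().
        return self.measure(catalog, **kwargs)

    def measure(self, catalog, **kwargs):
        raise NotImplementedError

class ExampleMeasurementTask(CatalogMeasurementBaseTask):
    """Overrides only measure(). Band information arrives as a band label
    (cf. FilterLabel) rather than a full dataId."""

    def measure(self, catalog, band=None, **kwargs):
        return {"n_rows": len(catalog), "band": band}

task = ExampleMeasurementTask()
result = task.run([1, 2, 3], band="r")  # → {'n_rows': 3, 'band': 'r'}
```

Because run only forwards keyword arguments, new per-task inputs can be added to measure signatures without touching run in every subclass.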
AOB

Next meeting  



List of tasks (Confluence)