
This page is currently under development: text may change at any time! The content has not yet been reviewed for accuracy and completeness.

This tutorial shows how to process a single image in two ways. The first example uses a simple image in a disk file, without the built-in machinery of a data repository, in order to generate a catalog of detected sources. The second part of this tutorial exposes users to the data access mechanisms of the Stack: the butler and a data repository. The latter example, while more complicated, is good preparation for the more advanced tutorials.

 

Preliminaries

Load the LSST Environment

You must have the LSST Stack installed on your system (see LSST Stack Installation) to proceed. The commands in the code blocks below assume the bash shell; analogous commands for (t)csh should work as well. If you have not already done so, load the LSST environment:

source $INSTALL_DIR/loadLSST.bash          # bash users
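source $INSTALL_DIR/loadLSST.csh           # (t)csh users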

where $INSTALL_DIR is the directory where the LSST Stack was installed. 

Process a Single Image File

obs_file Task Currently Broken

The obs_file task does not work with v8.0 of the LSST Stack: the recent upgrade of the CameraGeom library rendered obs_file incompatible with the underlying utilities. This will be fixed in a future release.

In the meantime, skip to Processing with a Repository.

Initial Steps

Clone the obs_file package from its git repository:

git clone https://github.com/LSST-nonproject/obs_file.git

This package is written purely in Python, so it needs no compilation. Now set up the package in your current working directory:

cd obs_file
setup -r .

It is handy to define an environment variable for the /bin directory of this package, and helpful to browse the task command-line options:

export OBSFILE_BIN=/path/to/obs_file/bin
$OBSFILE_BIN/processFile.py -h  # Generates brief summary of command-line options

Process the Image

While the task processFile.py has a number of options, most defaults are acceptable for this example.

processFile.py command-line options
% processFile.py -h
usage: processFile.py input [options]
positional arguments:
  input                 path to input data repository, relative to
                        $PIPE_INPUT_ROOT
optional arguments:
  -h, --help            show this help message and exit
  --calib CALIB         path to input calibration repository, relative to
                        $PIPE_CALIB_ROOT
  --output OUTPUT       path to output data repository (need not exist),
                        relative to $PIPE_OUTPUT_ROOT
  -c [NAME=VALUE [NAME=VALUE ...]], --config [NAME=VALUE [NAME=VALUE ...]]
                        config override(s), e.g. -c foo=newfoo bar.baz=3
  -C [CONFIGFILE [CONFIGFILE ...]], --configfile [CONFIGFILE [CONFIGFILE ...]]
                        config override file(s)
  -L LOGLEVEL, --loglevel LOGLEVEL
                        logging level
  -T [COMPONENT=LEVEL [COMPONENT=LEVEL ...]], --trace [COMPONENT=LEVEL [COMPONENT=LEVEL ...]]
                        trace level for component
  --debug               enable debugging output?
  --doraise             raise an exception on error (else log a message and
                        continue)?
  --logdest LOGDEST     logging destination
  --show [{config,data,tasks,run} [{config,data,tasks,run} ...]]
                        display the specified information to stdout and quit
                        (unless run is specified).
  -j PROCESSES, --processes PROCESSES
                        Number of processes to use
  --clobber-output      remove and re-create the output directory if it
                        already exists (safe with -j, but not all other forms
                        of parallel execution)
  --clobber-config      backup and then overwrite existing config files
                        instead of checking them (safe with -j, but not all
                        other forms of parallel execution)
  --id [KEY=VALUE1[^VALUE2[^VALUE3...] [KEY=VALUE1[^VALUE2[^VALUE3...] ...]]
                        data ID, e.g. --id calexp=XXX
Notes:
* --config, --configfile, --id, --trace and @file may appear multiple times;
    all values are used, in order left to right
* @file reads command-line options from the specified file:
    * data may be distributed among multiple lines (e.g. one option per line)
    * data after # is treated as a comment and ignored
    * blank lines and lines starting with # are ignored
* To specify multiple values for an option, do not use = after the option name:
    * wrong: --configfile=foo bar
    * right: --configfile foo bar
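
The @file mechanism described in the notes above is handy for reusing a set of options. As a quick sketch (opts.txt and myImage.fits are hypothetical names), the following is equivalent to passing the options directly on the command line; the repository example later in this tutorial uses the same mechanism to read a data ID from a file (@rawInput.txt):

echo "--output output --clobber-config" > opts.txt
$OBSFILE_BIN/processFile.py myImage.fits @opts.txt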

Here we will process an SDSS fpC file from Stripe 82 (an input file included in the Installing the Stack demo). Specify an output subdirectory (cleverly named output), and allow any existing configuration files to be overwritten. In this case, we need to set the gain and build a variance plane so that the downstream measurements can be performed.

cd /path/to/directory/containing/fpC_file
$OBSFILE_BIN/processFile.py fpC-004192-r4-0300.fits \
  --output output --clobber-config \
  -c gain=2 doVariance=True 
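
Before committing to a run, you can also preview the configuration that would be used by means of the --show option documented above; the task prints the requested information and exits without processing:

$OBSFILE_BIN/processFile.py fpC-004192-r4-0300.fits --show config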

Examine the Output Catalog

A new /output directory will be created if it does not already exist, and will include the following contents:

/output
    /config
        processFile.py        # parameter=value settings for processing
    /fpC-004192-r4-0300
        background.fits       # image background characterization
        icSrc.fits            # detected sources, initial pass
        src.fits              # final detected sources
    fpC-004192-r4-0300.fits   # calexp MEF image: science, mask, variance
    /schema                   # database schema for catalogs
        icSrc.fits            # schema for initial-pass sources
        src.fits              # schema for final detected sources

The /output/fpC-004192-r4-0300/src.fits FITS table contains the final catalog of 640 sources detected at 5-σ significance. Source brightnesses are measured in a variety of ways; the units are counts (corrected for gain) above background. Quality flags are given for each source; their meanings appear in the table header. The /output/fpC-004192-r4-0300.fits FITS MEF image is the calexp (calibrated exposure) image used throughout LSST processing; it includes image extensions for the science array, the quality mask, and the variance array.
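
For a quick look at the catalog outside the Stack, the table can be read with any FITS-aware tool. Below is a minimal sketch using astropy (an assumption on our part: astropy is not part of this tutorial, and the exact column names vary with the Stack version):

from astropy.io import fits

# Open the final source catalog written by processFile.py
with fits.open("output/fpC-004192-r4-0300/src.fits") as hdul:
    srcs = hdul[1].data                    # the binary table is in the first extension
    print(len(srcs), "sources detected")   # expect 640 for this image
    print(srcs.columns.names[:10])         # peek at the first few column names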

Processing with a Repository

It is possible to process a single file with the LSST Stack using built-in data access mechanisms. You will need two things:

  1. a data repository that contains the input image(s) and a registry of the contents 
  2. a directory containing Astrometry.net indexes for the sky coordinate range that covers your image

This example will describe the processing of an SDSS image in Stripe 82, for which the Astrometry.net indexes have already been built. If you want to process an image for another region of sky, see Building Astrometry.net Index Files for an example of how to build the indexes. The recipe for processing an image obtained with another camera is very similar (see, e.g., the tutorial Process PhoSim Images), provided the camera is one of those supported in the LSST Stack.

This example makes use of a script in the tutorials package, which you can clone from the code repository: 

cd /path/to/tutorials/install/directory
git clone https://github.com/lsst-dm/tutorials.git
export DRPDEMO_BIN=$PWD/tutorials/sdssDrpTutorial/python

Begin by creating a working directory and a handy environment variable:

cd /path/to/working/directory
mkdir single && cd single
export DEMO_DIR=$PWD

Create a Data Repository

A repository is essentially a directory structure that the software understands, whose contents include the input images, a mapper (i.e., an indication to the software of which camera model to use), and a registry of the relevant metadata. Create a directory for the input data repository, an environment variable for that location, and the mapper:

mkdir input && cd input
mkdir runs
export DATA_DIR=$DEMO_DIR/input/runs
echo "lsst.obs.sdss.sdssMapper.SdssMapper" > runs/_mapper

We will use an image from SDSS Stripe 82 (in fact, one of the images used in the LSST Stack test demo) as input. Capture the field identifier in a file, and generate the URLs (using genRetrieveList.py) with which to retrieve the data from the SDSS archive:

echo "--id run=4192 rerun=40 camcol=4 field=300 filter=r" > rawInput.txt
python $DRPDEMO_BIN/genRetrieveList.py rawInput.txt retrieve.txt
wget -r -b -R "index.html*" -np -nH --cut-dirs=1 -P ./runs -i retrieve.txt
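
Because of the -b flag, wget backgrounds the download and logs its progress to wget-log in the current directory; wait for the retrieval to finish before proceeding. You can watch the progress with:

tail -f wget-log   # Ctrl-C to stop watching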

Note that the above wget command preserves the organization of the input data from the SDSS archive, which is critical: the data consist of a number of files, including a mask and data-quality information, all of which are needed for processing. Now set up some packages for processing, and create the registry for the data repository:

setup obs_sdss
setup pipe_tasks
genInputRegistry.py ./runs
mv registry.sqlite3 ./runs
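
To sanity-check the result, you can inspect the registry with the sqlite3 command-line client. (The raw table name below is an assumption about what genInputRegistry.py creates; use .tables to confirm it.)

sqlite3 runs/registry.sqlite3 ".tables"                     # list the tables in the registry
sqlite3 runs/registry.sqlite3 "SELECT * FROM raw LIMIT 5;"  # peek at the first few entries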

Install Astrometry.net Index Files

Install the pre-built Astrometry.net index files for SDSS Stripe 82, and set up the astrometry_net_data package to use these indexes:

cd $DEMO_DIR
curl -O http://lsst-web.ncsa.illinois.edu/sdss-s12/sdss-2012-05-01-0.tgz
tar xzf sdss-2012-05-01-0.tgz
eups declare -r sdss-2012-05-01-0 astrometry_net_data sdss-2012-05-01-0
setup astrometry_net_data sdss-2012-05-01-0 --keep
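
You can confirm that the package is active with eups; the sdss-2012-05-01-0 version should be flagged as setup in the listing:

eups list astrometry_net_data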

Process the Image

Now fetch the processing configuration file (download: processConfig.py) and place it in $DEMO_DIR. This sets some default processing parameters that are appropriate for SDSS data. Finally, process the image, directing the output to the calexp_dir subdirectory (which will be created if necessary).

processCcdSdss.py $DATA_DIR/ --output ./calexp_dir --configfile ./processConfig.py @rawInput.txt

Examine the Output Catalog

The output is organized in a directory hierarchy similar to that of the input, namely by SDSS run/camcol/filter. The catalog of 648 sources detected at 5-σ significance is contained in:

$DEMO_DIR/calexp_dir/sci-results/4192/4/r/src/src-004192-r4-0300.fits

which is a FITS binary table. 
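Because the outputs live in a data repository, you can also retrieve the catalog through the butler rather than by file path. The following is a minimal sketch, assuming the lsst.daf.persistence.Butler API of this era of the Stack, the src dataset type, and the same data-ID keys used above:

from lsst.daf.persistence import Butler

# Point a butler at the output repository written by processCcdSdss.py
butler = Butler("calexp_dir")

# Fetch the source catalog for the data ID we processed
src = butler.get("src", run=4192, camcol=4, field=300, filter="r")
print(len(src), "sources detected")   # expect 648 for this field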
