From 3ab93dc944af426f6501f57cc2b2fe7fde7c2a1c Mon Sep 17 00:00:00 2001 From: fred3m Date: Wed, 15 Aug 2018 09:22:34 -0700 Subject: [PATCH 1/8] add blending tutorials from LSST 2018 blending workshop --- Deblending/lsst_stack_deblender.ipynb | 628 ++++++++++++++++++++++++++ Deblending/scarlet_tutorial.ipynb | 470 +++++++++++++++++++ 2 files changed, 1098 insertions(+) create mode 100755 Deblending/lsst_stack_deblender.ipynb create mode 100755 Deblending/scarlet_tutorial.ipynb diff --git a/Deblending/lsst_stack_deblender.ipynb b/Deblending/lsst_stack_deblender.ipynb new file mode 100755 index 00000000..4427f74a --- /dev/null +++ b/Deblending/lsst_stack_deblender.ipynb @@ -0,0 +1,628 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# LSST Stack Multiband Deblender Tutorial\n", + "\n", + "This tutorial is designed to illustrate how to execture the multiband deblender (*scarlet*) in the LSST stack. This includes a brief introduction to LSST stack objects including:\n", + "\n", + " - Geometry classes from `lsst.geom`, such as points and boxes.\n", + " - Higher-level astronomical primitives from `lsst.afw`, such as the `Image`, `Exposure`, and `Psf` classes.\n", + " - Our core algorithmic `Task` classes, including those for source detection, deblending, and measurement.\n", + " \n", + "We'll be working with coadded images made from Subaru Hyper Suprime-Cam (HSC) data in the COSMOS field. We've taken a recent LSST reprocessing of the HSC-SSP UltraDeep COSMOS field (see [this page](https://confluence.lsstcorp.org/display/DM/S18+HSC+PDR1+reprocessing) for information on that reprocessing, and [this page](https://hsc-release.mtk.nao.ac.jp/doc/) for the data), and added simulated stars from a scaled [SDSS catalog](http://www.sdss.org/dr14/data_access/value-added-catalogs/?vac_id=photometry-of-crowded-fields-in-sdss-for-galactic-globular-and-open-clusters). The result is a very deep image (deeper than the 10-year LSST Deep-Wide-Fast survey, though not as deep as LSST Deep Drilling fields will be) with both a large number of galaxies and region full of stars.\n", + "\n", + "This tutorial is based on Jim Bosch's globular cluster tutorial, however in it's present state *scarlet* is unable to process the crowded field (most likely) due to poor initial conditions for the sources in the field. So instead we use a region of the image outside of the cluster." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Imports\n", + "\n", + "We'll start with some standard imports of both LSST and third-party packages." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "from lsst.daf.persistence import Butler\n", + "from lsst.geom import Box2I, Box2D, Point2I, Point2D, Extent2I, Extent2D\n", + "from lsst.afw.image import Exposure, Image, PARENT, MultibandExposure, MultibandImage\n", + "from lsst.afw.detection import MultibandFootprint" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Reading Data\n", + "\n", + "We'll be retrieving data using the `Butler` tool, which manages where various datasets are stored on the filesystem (and can in principle manage datasets that aren't even stored as files, though all of these are).\n", + "\n", + "We start by creating a `Butler` instance, pointing it at a *Data Repository* (which here is just a root directory)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "butler = Butler(\"/project/jbosch/tutorials/lsst2018/data\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Datasets managed by a butler are identified by a dictionary *Data ID* (specifying things like the visit number or sky patch) and a string *DatasetType* (such as a particular image or catalog). Different DatasetTypes have different keys, while different instances of the same Dataset Type have different values. All of the datasets we use in this tutorial will correspond to the same patch of sky, so they'll have at least the keys in the dictionary in the next cell (they will also have `filter`, but with different values):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataId = {\"tract\": 9813, \"patch\": \"4,4\"}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now use those to load a set of *grizy* coadds, which we'll put directly in a dictionary. The result of each `Butler.get` call is in this case an `lsst.afw.image.Exposure` object, an image that actually contains three \"planes\" (the main image, a bit mask, and a variance image) as well as many other objects that describe the image, such as its PSF and WCS. Note that we (confusingly) use `Exposures` to hold coadd images as well as true single-exposure images, but combine them into a `MultibandExposure`, which contains an exposure in each band.\n", + "\n", + "The DatasetType here is `deepCoadd_calexp` (a coadd on which we've already done some additional processing, such as subtracting the background and setting some mask values), and the extra `filter` argument gets appended to the Data ID." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "filters = \"grizy\"\n", + "coadds = [butler.get(\"deepCoadd_calexp\", dataId, filter=\"HSC-{}\".format(f.upper())) for f in filters]\n", + "coadds = MultibandExposure.fromExposures(filters, coadds)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Making and displaying color composite images\n", + "\n", + "We'll start by just looking at the images, as 3-color composites. We'll use astropy to build those as a nice way to demonstrate how to get NumPy arrays from the `MultibandImage` objects (the images in `coadds`). (LSST also has code to make 3-color composites using the same algorithm, and in fact the Astropy implementation is based on ours, but now that it's in Astropy we'll probably retire ours.)\n", + "\n", + "We'll just use matplotlib to display the images themselves. We'll use Firefly for other image display tasks later, but while Firefly itself supports color-composites, work on our preferred composition algorithm is still in progress, and we haven't quite finished connecting that functionality to the Python client we'll demonstrate here." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from astropy.visualization import make_lupton_rgb\n", + "import matplotlib.pyplot as plt\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll use the following function a few times to display color images. It's worth reading through the implementation carefully to see what's going on." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def showRGB(image, bgr=\"gri\", ax=None, fp=None, figsize=(8,8), stretch=1, Q=10):\n", + " \"\"\"Display an RGB color composite image with matplotlib.\n", + " \n", + " Parameters\n", + " ----------\n", + " image : `MultibandImage`\n", + " `MultibandImage` to display.\n", + " bgr : sequence\n", + " A 3-element sequence of filter names (i.e. keys of the exps dict) indicating what band\n", + " to use for each channel. If `image` only has three filters then this parameter is ignored\n", + " and the filters in the image are used.\n", + " ax : `matplotlib.axes.Axes`\n", + " Axis in a `matplotlib.Figure` to display the image.\n", + " If `axis` is `None` then a new figure is created.\n", + " fp: `lsst.afw.detection.Footprint`\n", + " Footprint that contains the peak catalog for peaks in the image.\n", + " If `fp` is `None` then no peak positions are plotted.\n", + " figsize: tuple\n", + " Size of the `matplotlib.Figure` created.\n", + " If `ax` is not `None` then this parameter is ignored.\n", + " stretch: int\n", + " The linear stretch of the image.\n", + " Q: int\n", + " The Asinh softening parameter.\n", + " \"\"\"\n", + " # If the image only has 3 bands, reverse the order of the bands to produce the RGB image\n", + " if len(image) == 3:\n", + " bgr = image.filters\n", + " # Extract the primary image component of each Exposure with the .image property, and use .array to get a NumPy array view.\n", + " rgb = make_lupton_rgb(image_r=image[bgr[2]].array, # numpy array for the r channel\n", + " image_g=image[bgr[1]].array, # numpy array for the g channel\n", + " image_b=image[bgr[0]].array, # numpy array for the b channel\n", + " stretch=stretch, Q=Q) # parameters used to stretch and scale the pixel values\n", + " if ax is None:\n", + " fig = plt.figure(figsize=figsize)\n", + " ax = fig.add_subplot(1,1,1)\n", + " \n", + " # Exposure.getBBox() returns a Box2I, a box with integer pixel coordinates that correspond to the centers of pixels.\n", + " # Matplotlib's `extent` argument expects to receive the coordinates of the edges of pixels, which is what\n", + " # this Box2D (a box with floating-point coordinates) represents.\n", + " integerPixelBBox = image[bgr[0]].getBBox()\n", + " bbox = Box2D(integerPixelBBox)\n", + " ax.imshow(rgb, interpolation='nearest', origin='lower', extent=(bbox.getMinX(), bbox.getMaxX(), bbox.getMinY(), bbox.getMaxY()))\n", + " if fp is not None:\n", + " for peak in fp.getPeaks():\n", + " ax.plot(peak.getIx(), peak.getIy(), \"bx\", mew=2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice that we can slice `MultibandImage` objects (as well a `MultibandExposure` objects) along the filter dimension using the filter names as indices. Like `Exposure` objects, `MultibandExposure` objects have `image`, `mask`, and `variance` properties that contain the image, mask plane, and variance of the `Exposure` respectively. For now we will only worry about the `image` property, although internal deblending and measurement algorithms make use of all three objects (when available)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "showRGB(coadds[:\"z\"].image, figsize=(10, 10))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "showRGB(coadds[\"i\":].image, figsize=(10, 10))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Those images are a full \"patch\", which is our usual unit of processing for coadds - it's about the same size as a single LSST sensor (exactly the same in pixels, smaller in terms of area because these use HSC's smaller pixel scale). That's a bit unweildy (just because waiting for processing to happen isn't fun in a tutorial setting), so we'll reload our dict with sub-images centered on the region of interest.\n", + "\n", + "Note that we can load the sub-images directly with the `butler`, by appending `_sub` to the DatasetType and passing a `bbox` argument. If you want to see the region of the image with the cluster, use `clusterBBox` below, however as mentioned above, that region is too memory intensive for the current version of *scarlet*. Instead use `sampleBBox` to select a sub-region of the image (note that we add a small frame around each blend to include more background regions, which are important for the detection algorithm)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "frame = 50\n", + "clusterBBox = Box2I(Point2I(18325, 17725), Extent2I(400, 350))\n", + "\n", + "#sampleBBox = Box2I(Point2I(18699-frame, 17138-frame), Extent2I(93+2*frame, 104+2*frame))\n", + "#sampleBBox = Box2I(Point2I(16424-frame, 17806-frame), Extent2I(55+2*frame, 62+2*frame))\n", + "#sampleBBox = Box2I(Point2I(17838-frame, 18945-frame), Extent2I(111+2*frame, 103+2*frame))\n", + "sampleBBox = Box2I(Point2I(19141-frame, 18228-frame), Extent2I(63+2*frame, 87+2*frame))\n", + "\n", + "subset = coadds[:, sampleBBox]\n", + "# Due to a bug in the code the PSF isn't copied properly.\n", + "# The code below copies the PSF into the `MultibandExposure`,\n", + "# but will be unecessary in the future\n", + "for f in subset.filters:\n", + " subset[f].setPsf(coadds[f].getPsf())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "showRGB(subset.image)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Basic Processing\n", + "\n", + "Now we'll try the regular LSST processing tasks, with a simpler configuration than we usually use to process coadds, just to avoid being distracted by complexity. This includes\n", + "\n", + " - Detection (`SourceDetectionTask`): given an `Exposure`, find above-threshold regions and peaks within them (`Footprints`), and create a *parent* source for each `Footprint`.\n", + " - Deblending (`MultibandDeblendTask`): given a `MultibandExposure` and a catalog of parent sources, create a *child* source for each peak in every `Footprint` that contains more than one peak. Each child source is given a `HeavyFootprint`, which contains both the pixel region that source covers and the fractional pixel values associated with that source. A `SourceDeblendTask` is also available using the single band SDSS-HSC deblender that takes a single band `Exposure`).\n", + " - Measurment (`SingleFrameMeasurementTask`): given an `Exposure` and a catalog of sources, run a set of \"measurement plugins\" on each source, using deblended pixel values if it is a child. Notice that measurement is still performed on single band catalogs, since none of the measurement algorithms work for multiband data.\n", + "\n", + "We'll start by importing these, along with the `SourceCatalog` class we'll use to hold the outputs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from lsst.meas.algorithms import SourceDetectionTask\n", + "from lsst.meas.deblender import MultibandDeblendTask\n", + "from lsst.meas.base import SingleFrameMeasurementTask\n", + "from lsst.afw.table import SourceCatalog" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll now construct all of these `Tasks` before actually running any of them. That's because `SourceDeblendTask` and `SingleFrameMeasurementTask` are constructed with a `Schema` object that records what fields they'll produce, and they modify that schema when they're constructed by adding columns to it. When we run the tasks later, they'll need to be given a catalog that includes all of those columns, **but we can't add columns to a catalog that already exists**.\n", + "\n", + "To recap, the sequence looks like this:\n", + "\n", + " 1. Make a (mostly) empty schema.\n", + " 2. Construct all of the `Task`s (in the order you plan to run them), which adds columns to the schema.\n", + " 3. Make a `SourceCatalog` object from the *complete* schema.\n", + " 4. Pass the same `SourceCatalog` object to each `Task` when you run it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "schema = SourceCatalog.Table.makeMinimalSchema()\n", + "\n", + "detectionTask = SourceDetectionTask(schema=schema)\n", + "\n", + "config = MultibandDeblendTask.ConfigClass()\n", + "config.usePsfConvolution = True\n", + "config.conserveFlux = True\n", + "config.maxIter = 100\n", + "deblendTask = MultibandDeblendTask(schema=schema, config=config)\n", + "\n", + "# We'll customize the configuration of measurement to just run a few plugins.\n", + "# The default list of plugins is much longer (and hence slower).\n", + "measureConfig = SingleFrameMeasurementTask.ConfigClass()\n", + "measureConfig.plugins.names = [\"base_SdssCentroid\", \"base_PsfFlux\", \"base_SkyCoord\"]\n", + "# \"Slots\" are aliases that provide easy access to certain plugins.\n", + "# Because we're not running the plugin these slots refer to by default,\n", + "# we need to disable them in the configuration.\n", + "measureConfig.slots.apFlux = None\n", + "measureConfig.slots.instFlux = None\n", + "measureConfig.slots.shape = None\n", + "measureConfig.slots.modelFlux = None\n", + "measureConfig.slots.calibFlux = None\n", + "measureTask = SingleFrameMeasurementTask(config=measureConfig, schema=schema)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The first step we'll run is detection, which actually returns a new `SourceCatalog` object rather than working on an existing one.\n", + "\n", + "Instead, it takes a `Table` object, which is sort of like a factory for records. We won't use it directly after this, and it isn't actually necessary to make a new `Table` every time you run `MultibandDetectionTask` (but you can only create one after you're done adding columns to the schema).\n", + "\n", + "`Task`s that return anything do so via a `lsst.pipe.base.Struct` object, which is just a simple collection of named attributes. The only return values we're interested is `sources`. That's our new `SourceCatalog`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "table = SourceCatalog.Table.make(schema)\n", + "detectionResult = detectionTask.run(table, subset[\"r\"])\n", + "catalog = detectionResult.sources" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's take a quick look at what's in that catalog. First off, we can look at its schema:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "catalog.schema" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that this includes a lot of columns that were actually added by the deblend or measurement steps; those will all still be blank (`0` for integers or flags, `NaN` for floating-point columns).\n", + "\n", + "In fact, the only columns filled by `SourceDetectionTask` are the IDs. But it also attaches `Footprint` objects, which don't appear in the schema. You can retrieve the `Footprint` by calling `getFootprint()` on a row:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "footprint = catalog[0].getFootprint()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`Footprints` have two components:\n", + " - a `SpanSet`, which represents an irregular region on an image via a list of (y, x0, x1) `Spans`;\n", + " - a `PeakCatalog`, a slightly different kind of catalog whose rows represent peaks within that `Footprint`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(footprint.getSpans())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(footprint.getPeaks())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If we actually look at the footprints in the catalog we see that some have only a single peak, while others have multiple peaks that need to be deblended.\n", + "\n", + "To display only the pixels contained in the footprint (and not other pixels in the bounding box) we create a `MultibandFootprint`, which is a `HeavyFootprint` that contains a `SpanSet`, `PeakCatalog`, and `flux` values for all of the pixels in the `SpanSet`. In this case the `flux` is the total measured flux in the image, since no deblending has taken place yet." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "for src in catalog:\n", + " fp = src.getFootprint()\n", + " img = coadds[:,fp.getBBox()].image\n", + " mfp = MultibandFootprint.fromImages(coadds.filters, image=img, footprint=fp)\n", + " showRGB(mfp.getImage().image, fp=fp, figsize=(3,3))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It's worth noting that while the peaks *can* have both an integer-valued position and a floating-point position, they're the same right now; `SourceDetectionTask` currently just finds the pixels that are local minima and doesn't try to find their sub-pixel locations. That's left to the centroider, which is part of the measurement stage.\n", + "\n", + "Before we can get to that point, we need to run the deblender:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fluxCatalog, templateCatalog = deblendTask.run(coadds, catalog)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`MultibandDeblendTask` always returns two catalogs, a `templateCatalog` that contains the model outputs from *scarlet* and a `fluxCatalog`, which uses the *scarlet* models as weights to redistribute the flux from the input image (in other words they contain flux-conserved models). If `MultibandDeblendTask.config.saveTemplates` is `False`, then `templateCatalog` will be `None`. Similarly, if `MultibandDeblendTask.config.conserveFlux` is `False` then the `fluxCatalog` will be `None` (and the code will run slightly faster, since it doesn't have to reweight the flux, however this is a small fraction of the processing time).\n", + "\n", + "The deblender itself sets the `parent` column for each source, which is `0` for objects with no parent, and all of the columns that begin with `deblend_` and also adds new rows to the catalog for each child. It does *not* remove the parent rows it created those child rows from, and this is intentional, because we want to measure both \"interpretations\" of the blend family: one in which there is only one object (the parent version) and one in which there are several (the children). Before doing any science with the outputs of an LSST catalog, it's important to remove one of those interpretations (typically the parent one). That can be done by looking at the `deblend_nChild` and `parent` fields:\n", + "\n", + " - `parent` is the ID of the source from which this was deblended, or `0` if the source is itself a parent.\n", + " - `deblend_nChild` is the number of child sources this source has (so it's `0` for sources that are themselves children or were never blended).\n", + " \n", + "Together, these define two particularly useful filters:\n", + "\n", + " - `deblend_nChild == 0`: never-blended object or de-blended child\n", + " - `deblend_nChild == 0 and parent == 0`: never-blended object\n", + " \n", + "The first is what you'll usually want to use; the second is what to use if you're willing to throw away some objects (possibly many) because you don't trust the deblender.\n", + "\n", + "The last processing step for our purposes is running measurement, which must be done on each catalog, in each band (if we want measurements for all of them):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "measureTask.run(templateCatalog[\"r\"], coadds['r'])\n", + "measureTask.run(templateCatalog[\"i\"], coadds['i'])\n", + "measureTask.run(fluxCatalog[\"r\"], coadds['r'])\n", + "measureTask.run(fluxCatalog[\"i\"], coadds['i'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Due to an unfortunate bug in the deblender task, the resulting catalogs are not contiguous and we need to copy them into new objects to use them appropriately. This step can be avoided in the near future." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import lsst.afw.table as afwTable\n", + "\n", + "for f in filters:\n", + " _catalog = afwTable.SourceCatalog(templateCatalog[f].table.clone())\n", + " _catalog.extend(templateCatalog[f], deep=True)\n", + " templateCatalog[f] = _catalog\n", + " _catalog = afwTable.SourceCatalog(fluxCatalog[f].table.clone())\n", + " _catalog.extend(fluxCatalog[f], deep=True)\n", + " fluxCatalog[f] = _catalog" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since we care about deblending (for the sake of this tutorial) lets look at the results from the 13th blend displayed above. We use the `HeavyFootprint`s from the catalog sources that have the same parent (parent 13 from above) to build a model for the entre scene, and to compare the results of the flux conserved and *scarlet* models. In the process we look at both the *scarlet* and flux conserved models." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "from lsst.afw.detection import MultibandFootprint\n", + "from lsst.afw.image import MultibandImage\n", + "\n", + "# Use the 13th parent in the blend\n", + "# Note: this is not the parent ID, but the 13th source in the catalog\n", + "parentIdx = 13\n", + "\n", + "# Create empty multiband images to model the entire scene\n", + "templateModel = MultibandImage.fromImages(coadds.filters,\n", + " [Image(fluxCatalog[\"r\"][parentIdx].getFootprint().getBBox(), dtype=np.float32)\n", + " for b in range(len(filters))])\n", + "fluxModel = MultibandImage.fromImages(coadds.filters,\n", + " [Image(fluxCatalog[\"r\"][parentIdx].getFootprint().getBBox(), dtype=np.float32)\n", + " for b in range(len(filters))])\n", + "\n", + "# Only use the subset catalogs with the same parent\n", + "parentId = fluxCatalog[\"r\"][parentIdx].get(\"id\")\n", + "fluxChildren = {b: fluxCatalog[b][fluxCatalog[b].get(\"parent\")==parentId] for b in filters}\n", + "templateChildren = {b: templateCatalog[b][templateCatalog[b].get(\"parent\")==parentId] for b in filters}\n", + "assert(len(fluxChildren)==len(templateChildren))\n", + "\n", + "for n in range(len(templateChildren[\"r\"])):\n", + " # Add the source model to the model of the entire scene\n", + " fp = MultibandFootprint(coadds.filters, [templateChildren[b][n].getFootprint() for b in filters])\n", + " _fp = MultibandFootprint(coadds.filters, [fluxChildren[b][n].getFootprint() for b in filters])\n", + " templateModel[:, fp.getBBox()].array += fp.getImage(fill=0).image.array\n", + " fluxModel[:, _fp.getBBox()].array += _fp.getImage(fill=0).image.array\n", + "\n", + " # Show the model\n", + " fig = plt.figure(figsize=(6, 3))\n", + " ax = [fig.add_subplot(1, 2, n+1) for n in range(2)]\n", + " ax[0].set_title(\"scarlet\")\n", + " ax[1].set_title(\"flux conserved\")\n", + " showRGB(fp.getImage().image, ax=ax[0])\n", + " showRGB(_fp.getImage().image, ax=ax[1])\n", + " plt.show()\n", + "\n", + "templateResidual = MultibandImage(coadds.filters,\n", + " coadds[:, templateModel.getBBox()].image.array - templateModel.array)\n", + "fluxResidual = MultibandImage(coadds.filters,\n", + " coadds[:, fluxModel.getBBox()].image.array - fluxModel.array)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally we look at the full models and the residuals. As expected, there are no residuals for the flux conserved model since all of the flux in the image (that is within the footprint) is added to one of the sources. In this particular case that works fine, but in instances where one or more sources were not detected this can cause one source to have its flux contaminated with its neighbor." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for title, model, residual in [[\"scarlet\", templateModel, templateResidual], [\"flux conserved\", fluxModel, fluxResidual]]:\n", + " fig = plt.figure(figsize=(15,8))\n", + " ax = [fig.add_subplot(1, 2, n+1) for n in range(2)]\n", + " ax[0].set_title(\"{0} model\".format(title))\n", + " ax[1].set_title(\"{0} residual\".format(title))\n", + " showRGB(model, ax=ax[0])\n", + " showRGB(residual ,ax=ax[1], Q=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Exercises\n", + "\n", + "- Use some of the other`sampleBBox` regions and run through the code again, from source detection through measurment and blending displays. Don't foget to change the parent index to view only the children of the correct blend.\n", + "- Play around with other *scarlet* constraints, such as adding an L0 penalty. This should help the code execute faster, as one of the main reasons for the slow down is unecessarily large boxes surrounding the smaller sources. See https://github.com/lsst/meas_deblender/blob/master/python/lsst/meas/deblender/deblend.py#L462-L545 for a description of the other configuration options that can be passed to the deblender\n", + "\n", + "Note: in order to try out different constraints the `meas_deblender` package requires an upgrade that has not been pushed to master yet. To use the latest changes, from your terminal session you must execute the following steps:\n", + "\n", + "```bash\n", + "~$ cp /project/fred3m/tutorials/lsst2018/.user_setups ~/notebooks\n", + "~$ source ~/notebooks/.user_setups\n", + "```\n", + "\n", + "You will have to restart the kernel for this notebook session in order for the changes to take place." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LSST", + "language": "python", + "name": "lsst" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/Deblending/scarlet_tutorial.ipynb b/Deblending/scarlet_tutorial.ipynb new file mode 100755 index 00000000..4ea99f05 --- /dev/null +++ b/Deblending/scarlet_tutorial.ipynb @@ -0,0 +1,470 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Scarlet deblending Tutorial\n", + "\n", + "The purpose of this tutorial is to familiarize the user with the basics of using *scarlet* to model blended scenes, and how tweaking various objects and parameters affects the resulting model. A tutorial that is more specific to using scarlet in the context of the LSST DM Science Pipelines is also available.\n", + "\n", + "Before attempting this tutorial it will be useful to read the [introduction](http://scarlet.readthedocs.io/en/latest/user_docs.html) to the *scarlet* User Guide, and many of the exercises below may require referencing the *scarlet* [docs](http://scarlet.readthedocs.io/en/latest/index.html)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "# Import the necessary libraries\n", + "import os\n", + "\n", + "%matplotlib inline\n", + "import matplotlib\n", + "import matplotlib.pyplot as plt\n", + "# don't interpolate the pixels\n", + "matplotlib.rc('image', interpolation='none')\n", + "\n", + "import numpy as np\n", + "import scarlet\n", + "import scarlet.display" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Display functions\n", + "\n", + "Below are several usful functions used throughout this tutorial to visualize the data and models." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "# Display the sources\n", + "def display_sources(sources, image, norm=None, subset=None, combine=False, show_sed=True, filter_indices=None):\n", + " \"\"\"Display the data and model for all sources in a blend\n", + "\n", + " This convenience function is used to display all (or a subset) of\n", + " the sources and (optionally) their SED's.\n", + " \"\"\"\n", + " if subset is None:\n", + " # Show all sources in the blend\n", + " subset = range(len(sources))\n", + " if filter_indices is None:\n", + " filter_indices = [3,2,1]\n", + " for m in subset:\n", + " # Load the model for the source\n", + " src = sources[m]\n", + " model = [comp.get_model() for comp in src]\n", + "\n", + " # Select the image patch the overlaps with the source and convert it to an RGB image\n", + " img_rgb = scarlet.display.img_to_rgb(image[src[0].bb], filter_indices=filter_indices, norm=norm)\n", + "\n", + " # Build a model for each component in the model\n", + " rgb = []\n", + " for _model in model:\n", + " # Convert the model to an RGB image\n", + " _rgb = scarlet.display.img_to_rgb(_model, filter_indices=filter_indices, norm=norm)\n", + " rgb.append(_rgb)\n", + "\n", + " # Display the image and model\n", + " figsize = [6,3]\n", + " columns = 2\n", + " # Calculate the number of columns needed and shape of the figure\n", + " if show_sed:\n", + " figsize[0] += 3\n", + " columns += 1\n", + " if not combine:\n", + " figsize[0] += 3*(len(model)-1)\n", + " columns += len(model)-1\n", + " # Build the figure\n", + " fig = plt.figure(figsize=figsize)\n", + " ax = [fig.add_subplot(1,columns,n+1) for n in range(columns)]\n", + " ax[0].imshow(img_rgb)\n", + " ax[0].set_title(\"Data: Source {0}\".format(m))\n", + " for n, _rgb in enumerate(rgb):\n", + " ax[n+1].imshow(_rgb)\n", + " if combine:\n", + " ax[n+1].set_title(\"Initial Model\")\n", + " else:\n", + " ax[n+1].set_title(\"Component {0}\".format(n))\n", + " if show_sed:\n", + " for comp in src:\n", + " ax[-1].plot(comp.sed)\n", + " ax[-1].set_title(\"SED\")\n", + " ax[-1].set_xlabel(\"Band\")\n", + " ax[-1].set_ylabel(\"Intensity\")\n", + " # Mark the current source in the image\n", + " y,x = src[0].center\n", + " ax[0].plot(x-src[0].bb[2].start, y-src[0].bb[1].start, 'x', color=\"#5af916\", mew=2)\n", + " plt.tight_layout()\n", + " plt.show()\n", + "\n", + "def display_model_residual(images, blend, peaks, norm, filter_indices=None):\n", + " \"\"\"Display the data, model, and residual for a given result\n", + " \"\"\"\n", + " if filter_indices is None:\n", + " filter_indices = [3,2,1]\n", + " model = blend.get_model()\n", + " residual = images-model\n", + " print(\"Data range: {0:.3f} to {1:.3f}\\nresidual range: {2:.3f} to {3:.3f}\\nrms: {4:.3f}\".format(\n", + " np.min(images),\n", + " np.max(images),\n", + " np.min(residual),\n", + " np.max(residual),\n", + " np.sqrt(np.std(residual)**2+np.mean(residual)**2)\n", + " ))\n", + " # Create RGB images\n", + " img_rgb = scarlet.display.img_to_rgb(images, filter_indices=filter_indices, norm=norm)\n", + " model_rgb = scarlet.display.img_to_rgb(model, filter_indices=filter_indices, norm=norm)\n", + " residual_norm = scarlet.display.Linear(img=residual)\n", + " residual_rgb = scarlet.display.img_to_rgb(residual, filter_indices=filter_indices, norm=residual_norm)\n", + "\n", + " # Show the data, model, and residual\n", + " fig = plt.figure(figsize=(15,5))\n", + " ax = [fig.add_subplot(1,3,n+1) for n in range(3)]\n", + " ax[0].imshow(img_rgb)\n", + " ax[0].set_title(\"Data\")\n", + " ax[1].imshow(model_rgb)\n", + " ax[1].set_title(\"Model\")\n", + " ax[2].imshow(residual_rgb)\n", + " ax[2].set_title(\"Residual\")\n", + " for k,component in enumerate(blend.components):\n", + " y,x = component.center\n", + " #px, py = peaks[k]\n", + " ax[0].plot(x, y, \"gx\")\n", + " #ax[0].plot(px, py, \"rx\")\n", + " ax[1].text(x, y, k, color=\"r\")\n", + " plt.show()\n", + "\n", + "def show_psfs(psfs, filters, norm=None):\n", + " rows = int(np.ceil(len(psfs)/3))\n", + " columns = min(len(psfs), 3)\n", + " figsize = (45/columns, rows*5)\n", + " fig = plt.figure(figsize=figsize)\n", + " ax = [fig.add_subplot(rows, columns, n+1) for n in range(len(psfs))]\n", + " for n, psf in enumerate(psfs):\n", + " im = ax[n].imshow(psf, norm=norm)\n", + " ax[n].set_title(\"{0}-band PSF\".format(filters[n]))\n", + " plt.colorbar(im, ax=ax[n])\n", + " plt.show()\n", + "\n", + "def display_diff_kernels(psf_blend, diff_kernels):\n", + " model = psf_blend.get_model()\n", + " for b, component in enumerate(psf_blend.components):\n", + " fig = plt.figure(figsize=(15,2.5))\n", + " ax = [fig.add_subplot(1,4,n+1) for n in range(4)]\n", + " # Display the psf\n", + " ax[0].set_title(\"psf\")\n", + " _img = ax[0].imshow(psfs[b])\n", + " fig.colorbar(_img, ax=ax[0])\n", + " # Display the model\n", + " ax[1].set_title(\"modeled psf\")\n", + " _model = np.ma.array(model[b], mask=model[b]==0)\n", + " _img = ax[1].imshow(_model)\n", + " fig.colorbar(_img, ax=ax[1])\n", + " # Display the difference kernel\n", + " ax[2].set_title(\"difference kernel\")\n", + " _img = ax[2].imshow(np.ma.array(diff_kernels[b], mask=diff_kernels[b]==0))\n", + " fig.colorbar(_img, ax=ax[2])\n", + " # Display the residual\n", + " ax[3].set_title(\"residual\")\n", + " residual = psfs[b]-model[b]\n", + " vabs = np.max(np.abs(residual))\n", + " _img = ax[3].imshow(residual, vmin=-vabs, vmax=vabs, cmap='seismic')\n", + " fig.colorbar(_img, ax=ax[3])\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Load and Display the data\n", + "\n", + "The `file_path` points to a directory with 147 HSC blends from the COSMOS field detected by the LSST pipeline. Changing `idx` below will select a different blend." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Load the sample images\n", + "idx = 53\n", + "file_path = \"/project/fred3m/code/testdata_deblender/real_data/hsc_cosmos/not_matched\"\n", + "files = os.listdir(file_path)\n", + "data = np.load(os.path.join(file_path, files[idx]))\n", + "image = data[\"images\"]\n", + "wmap = data[\"weights\"]\n", + "peaks = data[\"peaks\"]\n", + "psfs = data[\"psfs\"]\n", + "filters = [\"G\", \"R\", \"I\", \"Z\", \"Y\"]\n", + "# Only a rough estimate of the background is needed\n", + "# to initialize and resize the sources\n", + "bg_rms = np.std(image, axis=(1,2))\n", + "print(\"Background RMS: {0}\".format(bg_rms))\n", + "\n", + "# Use Asinh scaling for the images\n", + "norm = scarlet.display.Asinh(img=image, Q=10)\n", + "# Map i,r,g -> RGB\n", + "filter_indices = [3,2,1]\n", + "# Convert the image to an RGB image\n", + "img_rgb = scarlet.display.img_to_rgb(image, filter_indices=filter_indices, norm=norm)\n", + "plt.imshow(img_rgb)\n", + "plt.title(\"Image: {0}\".format(idx))\n", + "for src in peaks:\n", + " plt.plot(src[0], src[1], \"rx\", mew=2)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Initializing Sources\n", + "\n", + "Astrophysical objects are modeled in scarlet as a collection of components, where each component has a single SED that is constant over it's morphology (band independent intensity). So a single source might have multiple components, like a bulge and disk, or a single component.\n", + "\n", + "The different classes that inherit from `Source` mainly differ in how they are initialized, and otherwise behave similarly during the optimization routine. This section illustrates the differences between different source initialization classes.\n", + "\n", + "The simplest source is a single component intialized with only a single pixel (at the center of the object) turned on.\n", + "\n", + "### *WARNING* \n", + "Scarlet accepts source positions using the numpy/C++ convention of (y,x), which is different than the astropy and LSST stack convention of (x,y)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sources = [scarlet.PointSource((peak[1], peak[0]), image) for peak in peaks]\n", + "\n", + "# Display the initial guess for each source\n", + "display_sources(sources, image, norm=norm)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise:\n", + "\n", + "* Experiment with the above code by using `ExtendedSource`, which initializes each object as a single component with maximum flux at the peak that falls off monotonically and has 180 degree symmetry; and using `MultiComponentSource`, which models a source as two components (a bulge and a disk) that are each symmetric and montonically decreasing from the peak.\n", + "\n", + "# Deblending a scene\n", + "\n", + "The `Blend` class contains the list of sources, the image, and any other configuration parameters necessary to fit the data, including routines to fit the center positions and resize the bounding box containing the sources (if necessary). Once a blend has been initialized with a list of sources, the image and background RMS values must be set (the background RMS is used to determine when to truncate the bounding box around a source)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "blend = scarlet.Blend(sources)\n", + "blend.set_data(image, bg_rms=bg_rms)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we can fit a model, given a maximum number of iterations and the relative error required for convergence." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "blend.fit(100, 1e-2)\n", + "print(\"Deblending completed in {0} iterations\".format(blend.it))\n", + "display_model_residual(image, blend, peaks, norm)\n", + "display_sources(sources, image, norm)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises\n", + "\n", + "* Experiment by running the above code using different source models (for example `ExtendedSource`) to see how initializtion affects the belnding results.\n", + "\n", + "* The code above initialized the sources at their exact centers. Try offsetting the initial positions by `0.5` pixels in `x` and/or `y` and passing a `shift_center=0` argument when initializing the source. This prevents the source from updating its position, so notice how that affects the resulting model." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Constraints\n", + "\n", + "The above models used the default constraints: perfect symmetry and a weighted monotonicity that decreases from the peak. So each source is defined (internally during initialization) with the constraints" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "import scarlet.constraint as sc\n", + "constraints = (sc.SimpleConstraint(),\n", + " sc.DirectMonotonicityConstraint(use_nearest=False),\n", + " sc.DirectSymmetryConstraint())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "where `SimpleConstraint` forces the SED and morphology to be non-negative, the SED to be normalized to unity, and the peak to have some (minimal) flux at the center." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises\n", + "\n", + "* Go back to the source initialization cell and pass a custom set of constraints. For example, pass `DirectSymmetryConstraint` a number between 0 and 1 to set the level of symmetry required, or eliminate the symmetry constraint altogether and see how that affects deblending.\n", + "\n", + "* Set `use_nearest=True` in the `DirectMonotonicityConstraint`.\n", + "\n", + "* Add `L0Constraint` or `L1Constraint` to the list of constraints and observe the results." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Configuration\n", + "\n", + "There are additional configuration paramters that can be used to initialize a source, as described in http://scarlet.readthedocs.io/en/latest/config.html#Configuration-(scarlet.config).\n", + "\n", + "## Exercises\n", + "\n", + "* Initialize the sources with a custom configuration where `refine_skip=2`, which updates the positions and box sizes on every other step, and see how the results compare" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# PSF Deconvolution\n", + "\n", + "When analyzing real images the PSF will be different in each band unless they have been PSF matched. In general deblending should not be performed on PSF matched coadds, as matching will increase the blending in bands with better seeing. Instead scarlet can be used to build a deconvolved model which is a more sparse (and less blended) representation of the data, and convolve the model in each band to compare to the input data.\n", + "\n", + "To initialize a source with a PSF, pass the PSF as an input to the new source:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "scarlet.ExtendedSource(peaks[0], image, bg_rms, psf=psfs)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Partial PSF Deconvolution\n", + "\n", + "As discussed in the tutorial http://scarlet.readthedocs.io/en/latest/psf_matching.html, the data is noisy and the fully deconvolved scene is undersampled, making the application of the constraints and and full convolution kernel unstable and prone to biases. Instead we can create a target PSF and model the sources in the partially deconvolved target PSF scene.\n", + "\n", + "First we need to specify the target PSF. *scarlet* includes a `fit_target_psf` function to fit the PSF in each band to either a `moffat`, `gaussian`, or `double_gaussian` function. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import scarlet.psf_match\n", + "\n", + "show_psfs(psfs, filters)\n", + "\n", + "# Find the target PSF\n", + "target_psf = scarlet.psf_match.fit_target_psf(psfs, scarlet.psf_match.moffat)\n", + "plt.imshow(target_psf)\n", + "plt.title(\"target PSF\")\n", + "plt.colorbar()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once we have the target PSF we can find the difference kernel in each band using *scarlet*. The `build_diff_kernels` function basically treats the PSF image as a blend, where the PSF in each band is a monochromatic source, and fits the difference kernels using the minimum number of pixels necessary." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "diff_kernels, psf_blend = scarlet.psf_match.build_diff_kernels(psfs, target_psf)\n", + "display_diff_kernels(psf_blend, diff_kernels)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises\n", + "\n", + "* Try building the difference kernels while varying the parameters in `build_diff_kernels`, for example using larger and smaller values for `l0_thresh`.\n", + "\n", + "* Go back up to source initialization and use `psf=psfs` to fully deconvolve the scene and fit the blend\n", + "\n", + "* Try the same thing but set `psf=diff_kernels` for each source to partially deconvolve the scene." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LSST", + "language": "python", + "name": "lsst" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From 4c5158c5423812b8fc793b30b7ada24f82235f11 Mon Sep 17 00:00:00 2001 From: Phil Marshall Date: Wed, 15 Aug 2018 09:55:14 -0700 Subject: [PATCH 2/8] Deblending in the READMEs --- Deblending/README.rst | 36 ++++++++++++++++++++++++++++++++++++ README.md | 5 +++-- 2 files changed, 39 insertions(+), 2 deletions(-) create mode 100644 Deblending/README.rst diff --git a/Deblending/README.rst b/Deblending/README.rst new file mode 100644 index 00000000..5b422d3a --- /dev/null +++ b/Deblending/README.rst @@ -0,0 +1,36 @@ +Deblending +========== + +This folder contains a set of tutorial notebooks exploring the deblending of LSST objects. See the index table below for links to the notebook code, and an auto-rendered view of the notebook with outputs. + + +.. list-table:: + :widths: 10 20 10 10 + :header-rows: 1 + + * - Notebook + - Short description + - Links + - Owner + + + * - **SCARLET Tutorial** + - Introduction to the SCARLET deblender, how to configure and run it. + - `ipynb `_, + `rendered `_ + + .. image:: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/scarlet_tutorial.svg + :target: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/scarlet_tutorial.log + + - `Fred Moolekamp `_ + + + * - **Deblending in DRP ** + - Where and how the deblending happens, in the DRP pipeline. + - `ipynb `_, + `rendered `_ + + .. image:: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/lsst_stack_deblender.svg + :target: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/lsst_stack_deblender.log + + - `Fred Moolekamp `_ diff --git a/README.md b/README.md index 9f4ec115..ec3f1f6d 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,7 @@ the LSST Science Collaborations. | Visualization | Displaying images and catalogs. | [StackClub/Visualization](Visualization) | | Image Processing | From raw images to `calexp`s and `coadd`s. | [StackClub/ImageProcessing](ImageProcessing) | | SourceDetection | Detection of sources in images - including low surface brightness galaxies. | [StackClub/SourceDetection](SourceDetection) | +| Deblending | Deblending the objects | [StackClub/Deblending](Deblending) | | Validation | Tools for validating Stack outputs, example validation analyses | [StackClub/Validation](Validation) | * [Stack Club projects](https://github.com/LSSTScienceCollaborations/StackClub/labels/project), as defined by Stack Club members - follow [this link](https://github.com/LSSTScienceCollaborations/StackClub/labels/project) to see what people are working on. [Unassigned projects](https://github.com/LSSTScienceCollaborations/StackClub/issues?utf8=%E2%9C%93&q=is%3Aopen+label%3Aproject+no%3Aassignee) are available for new members to take on! @@ -26,7 +27,7 @@ the LSST Science Collaborations. ## Contributing New Stack Club members: please see the [notes on getting started](GettingStarted/GettingStarted.md) - they'll walk you onto you new LSST Science Platform account, and then show you how to work on your tutorial notebooks. Also, please note the [Stack Club Rules](Rules.md) that we all agree to abide by. -Everyone else: we welcome pull requests! Feel free to fork this repo and send us a pull request. And if you are interested in joining the Stack Club, please drop one of us a line, or come and find us in the [#stack-club](https://lsstc.slack.com/messages/C9YRAS4HM) LSSTC Slack channel. +Everyone else: we welcome pull requests! Feel free to fork this repo and send us a pull request. And if you are interested in joining the Stack Club, please drop one of us a line, or come and find us in the [#stack-club](https://lsstc.slack.com/messages/C9YRAS4HM) LSSTC Slack channel. > When preparing a pull request, please note the [standards](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/GettingStarted/GettingStarted.md#standards) that we are trying to uphold. @@ -52,4 +53,4 @@ but you can't blame us if it doesn't do what you want. ## More About This Project -Following a successful LSSTC "Enabling Science" proposal, we put together a 3-phase plan, which you can read about in more detail [here](https://docs.google.com/document/d/103kzjOklSUWo5MJP9B-EsnAdO7V6bstTC_mzBvd0NIk/edit#). Phase 0 involved collecting existing tutorials and identifying potential club members from around the LSST Science Collaborations. Then, in Phase 1 (late May 2018 to mid August 2018) we worked together in a small group to turn a subset of those existing "seed" tutorials into community-maintained Jupyter notebooks, for display at the August LSST 2018 Project and Community Workshop (PCW) in Tucson. At that meeting, we opened up to a larger group of LSST science collaboration members, extending and spinning off the initial set of notebooks. +Following a successful LSSTC "Enabling Science" proposal, we put together a 3-phase plan, which you can read about in more detail [here](https://docs.google.com/document/d/103kzjOklSUWo5MJP9B-EsnAdO7V6bstTC_mzBvd0NIk/edit#). Phase 0 involved collecting existing tutorials and identifying potential club members from around the LSST Science Collaborations. Then, in Phase 1 (late May 2018 to mid August 2018) we worked together in a small group to turn a subset of those existing "seed" tutorials into community-maintained Jupyter notebooks, for display at the August LSST 2018 Project and Community Workshop (PCW) in Tucson. At that meeting, we opened up to a larger group of LSST science collaboration members, extending and spinning off the initial set of notebooks. From 26bf19ef0dd0732b7016ff153ce4707432a793d3 Mon Sep 17 00:00:00 2001 From: fred3m Date: Wed, 15 Aug 2018 09:22:34 -0700 Subject: [PATCH 3/8] add blending tutorials from LSST 2018 blending workshop --- Deblending/lsst_stack_deblender.ipynb | 628 ++++++++++++++++++++++++++ Deblending/scarlet_tutorial.ipynb | 470 +++++++++++++++++++ 2 files changed, 1098 insertions(+) create mode 100755 Deblending/lsst_stack_deblender.ipynb create mode 100755 Deblending/scarlet_tutorial.ipynb diff --git a/Deblending/lsst_stack_deblender.ipynb b/Deblending/lsst_stack_deblender.ipynb new file mode 100755 index 00000000..4427f74a --- /dev/null +++ b/Deblending/lsst_stack_deblender.ipynb @@ -0,0 +1,628 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# LSST Stack Multiband Deblender Tutorial\n", + "\n", + "This tutorial is designed to illustrate how to execture the multiband deblender (*scarlet*) in the LSST stack. This includes a brief introduction to LSST stack objects including:\n", + "\n", + " - Geometry classes from `lsst.geom`, such as points and boxes.\n", + " - Higher-level astronomical primitives from `lsst.afw`, such as the `Image`, `Exposure`, and `Psf` classes.\n", + " - Our core algorithmic `Task` classes, including those for source detection, deblending, and measurement.\n", + " \n", + "We'll be working with coadded images made from Subaru Hyper Suprime-Cam (HSC) data in the COSMOS field. We've taken a recent LSST reprocessing of the HSC-SSP UltraDeep COSMOS field (see [this page](https://confluence.lsstcorp.org/display/DM/S18+HSC+PDR1+reprocessing) for information on that reprocessing, and [this page](https://hsc-release.mtk.nao.ac.jp/doc/) for the data), and added simulated stars from a scaled [SDSS catalog](http://www.sdss.org/dr14/data_access/value-added-catalogs/?vac_id=photometry-of-crowded-fields-in-sdss-for-galactic-globular-and-open-clusters). The result is a very deep image (deeper than the 10-year LSST Deep-Wide-Fast survey, though not as deep as LSST Deep Drilling fields will be) with both a large number of galaxies and region full of stars.\n", + "\n", + "This tutorial is based on Jim Bosch's globular cluster tutorial, however in it's present state *scarlet* is unable to process the crowded field (most likely) due to poor initial conditions for the sources in the field. So instead we use a region of the image outside of the cluster." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Imports\n", + "\n", + "We'll start with some standard imports of both LSST and third-party packages." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "from lsst.daf.persistence import Butler\n", + "from lsst.geom import Box2I, Box2D, Point2I, Point2D, Extent2I, Extent2D\n", + "from lsst.afw.image import Exposure, Image, PARENT, MultibandExposure, MultibandImage\n", + "from lsst.afw.detection import MultibandFootprint" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Reading Data\n", + "\n", + "We'll be retrieving data using the `Butler` tool, which manages where various datasets are stored on the filesystem (and can in principle manage datasets that aren't even stored as files, though all of these are).\n", + "\n", + "We start by creating a `Butler` instance, pointing it at a *Data Repository* (which here is just a root directory)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "butler = Butler(\"/project/jbosch/tutorials/lsst2018/data\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Datasets managed by a butler are identified by a dictionary *Data ID* (specifying things like the visit number or sky patch) and a string *DatasetType* (such as a particular image or catalog). Different DatasetTypes have different keys, while different instances of the same Dataset Type have different values. All of the datasets we use in this tutorial will correspond to the same patch of sky, so they'll have at least the keys in the dictionary in the next cell (they will also have `filter`, but with different values):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataId = {\"tract\": 9813, \"patch\": \"4,4\"}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now use those to load a set of *grizy* coadds, which we'll put directly in a dictionary. The result of each `Butler.get` call is in this case an `lsst.afw.image.Exposure` object, an image that actually contains three \"planes\" (the main image, a bit mask, and a variance image) as well as many other objects that describe the image, such as its PSF and WCS. Note that we (confusingly) use `Exposures` to hold coadd images as well as true single-exposure images, but combine them into a `MultibandExposure`, which contains an exposure in each band.\n", + "\n", + "The DatasetType here is `deepCoadd_calexp` (a coadd on which we've already done some additional processing, such as subtracting the background and setting some mask values), and the extra `filter` argument gets appended to the Data ID." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "filters = \"grizy\"\n", + "coadds = [butler.get(\"deepCoadd_calexp\", dataId, filter=\"HSC-{}\".format(f.upper())) for f in filters]\n", + "coadds = MultibandExposure.fromExposures(filters, coadds)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Making and displaying color composite images\n", + "\n", + "We'll start by just looking at the images, as 3-color composites. We'll use astropy to build those as a nice way to demonstrate how to get NumPy arrays from the `MultibandImage` objects (the images in `coadds`). (LSST also has code to make 3-color composites using the same algorithm, and in fact the Astropy implementation is based on ours, but now that it's in Astropy we'll probably retire ours.)\n", + "\n", + "We'll just use matplotlib to display the images themselves. We'll use Firefly for other image display tasks later, but while Firefly itself supports color-composites, work on our preferred composition algorithm is still in progress, and we haven't quite finished connecting that functionality to the Python client we'll demonstrate here." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from astropy.visualization import make_lupton_rgb\n", + "import matplotlib.pyplot as plt\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll use the following function a few times to display color images. It's worth reading through the implementation carefully to see what's going on." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def showRGB(image, bgr=\"gri\", ax=None, fp=None, figsize=(8,8), stretch=1, Q=10):\n", + " \"\"\"Display an RGB color composite image with matplotlib.\n", + " \n", + " Parameters\n", + " ----------\n", + " image : `MultibandImage`\n", + " `MultibandImage` to display.\n", + " bgr : sequence\n", + " A 3-element sequence of filter names (i.e. keys of the exps dict) indicating what band\n", + " to use for each channel. If `image` only has three filters then this parameter is ignored\n", + " and the filters in the image are used.\n", + " ax : `matplotlib.axes.Axes`\n", + " Axis in a `matplotlib.Figure` to display the image.\n", + " If `axis` is `None` then a new figure is created.\n", + " fp: `lsst.afw.detection.Footprint`\n", + " Footprint that contains the peak catalog for peaks in the image.\n", + " If `fp` is `None` then no peak positions are plotted.\n", + " figsize: tuple\n", + " Size of the `matplotlib.Figure` created.\n", + " If `ax` is not `None` then this parameter is ignored.\n", + " stretch: int\n", + " The linear stretch of the image.\n", + " Q: int\n", + " The Asinh softening parameter.\n", + " \"\"\"\n", + " # If the image only has 3 bands, reverse the order of the bands to produce the RGB image\n", + " if len(image) == 3:\n", + " bgr = image.filters\n", + " # Extract the primary image component of each Exposure with the .image property, and use .array to get a NumPy array view.\n", + " rgb = make_lupton_rgb(image_r=image[bgr[2]].array, # numpy array for the r channel\n", + " image_g=image[bgr[1]].array, # numpy array for the g channel\n", + " image_b=image[bgr[0]].array, # numpy array for the b channel\n", + " stretch=stretch, Q=Q) # parameters used to stretch and scale the pixel values\n", + " if ax is None:\n", + " fig = plt.figure(figsize=figsize)\n", + " ax = fig.add_subplot(1,1,1)\n", + " \n", + " # Exposure.getBBox() returns a Box2I, a box with integer pixel coordinates that correspond to the centers of pixels.\n", + " # Matplotlib's `extent` argument expects to receive the coordinates of the edges of pixels, which is what\n", + " # this Box2D (a box with floating-point coordinates) represents.\n", + " integerPixelBBox = image[bgr[0]].getBBox()\n", + " bbox = Box2D(integerPixelBBox)\n", + " ax.imshow(rgb, interpolation='nearest', origin='lower', extent=(bbox.getMinX(), bbox.getMaxX(), bbox.getMinY(), bbox.getMaxY()))\n", + " if fp is not None:\n", + " for peak in fp.getPeaks():\n", + " ax.plot(peak.getIx(), peak.getIy(), \"bx\", mew=2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice that we can slice `MultibandImage` objects (as well a `MultibandExposure` objects) along the filter dimension using the filter names as indices. Like `Exposure` objects, `MultibandExposure` objects have `image`, `mask`, and `variance` properties that contain the image, mask plane, and variance of the `Exposure` respectively. For now we will only worry about the `image` property, although internal deblending and measurement algorithms make use of all three objects (when available)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "showRGB(coadds[:\"z\"].image, figsize=(10, 10))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "showRGB(coadds[\"i\":].image, figsize=(10, 10))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Those images are a full \"patch\", which is our usual unit of processing for coadds - it's about the same size as a single LSST sensor (exactly the same in pixels, smaller in terms of area because these use HSC's smaller pixel scale). That's a bit unweildy (just because waiting for processing to happen isn't fun in a tutorial setting), so we'll reload our dict with sub-images centered on the region of interest.\n", + "\n", + "Note that we can load the sub-images directly with the `butler`, by appending `_sub` to the DatasetType and passing a `bbox` argument. If you want to see the region of the image with the cluster, use `clusterBBox` below, however as mentioned above, that region is too memory intensive for the current version of *scarlet*. Instead use `sampleBBox` to select a sub-region of the image (note that we add a small frame around each blend to include more background regions, which are important for the detection algorithm)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "frame = 50\n", + "clusterBBox = Box2I(Point2I(18325, 17725), Extent2I(400, 350))\n", + "\n", + "#sampleBBox = Box2I(Point2I(18699-frame, 17138-frame), Extent2I(93+2*frame, 104+2*frame))\n", + "#sampleBBox = Box2I(Point2I(16424-frame, 17806-frame), Extent2I(55+2*frame, 62+2*frame))\n", + "#sampleBBox = Box2I(Point2I(17838-frame, 18945-frame), Extent2I(111+2*frame, 103+2*frame))\n", + "sampleBBox = Box2I(Point2I(19141-frame, 18228-frame), Extent2I(63+2*frame, 87+2*frame))\n", + "\n", + "subset = coadds[:, sampleBBox]\n", + "# Due to a bug in the code the PSF isn't copied properly.\n", + "# The code below copies the PSF into the `MultibandExposure`,\n", + "# but will be unecessary in the future\n", + "for f in subset.filters:\n", + " subset[f].setPsf(coadds[f].getPsf())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "showRGB(subset.image)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Basic Processing\n", + "\n", + "Now we'll try the regular LSST processing tasks, with a simpler configuration than we usually use to process coadds, just to avoid being distracted by complexity. This includes\n", + "\n", + " - Detection (`SourceDetectionTask`): given an `Exposure`, find above-threshold regions and peaks within them (`Footprints`), and create a *parent* source for each `Footprint`.\n", + " - Deblending (`MultibandDeblendTask`): given a `MultibandExposure` and a catalog of parent sources, create a *child* source for each peak in every `Footprint` that contains more than one peak. Each child source is given a `HeavyFootprint`, which contains both the pixel region that source covers and the fractional pixel values associated with that source. A `SourceDeblendTask` is also available using the single band SDSS-HSC deblender that takes a single band `Exposure`).\n", + " - Measurment (`SingleFrameMeasurementTask`): given an `Exposure` and a catalog of sources, run a set of \"measurement plugins\" on each source, using deblended pixel values if it is a child. Notice that measurement is still performed on single band catalogs, since none of the measurement algorithms work for multiband data.\n", + "\n", + "We'll start by importing these, along with the `SourceCatalog` class we'll use to hold the outputs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from lsst.meas.algorithms import SourceDetectionTask\n", + "from lsst.meas.deblender import MultibandDeblendTask\n", + "from lsst.meas.base import SingleFrameMeasurementTask\n", + "from lsst.afw.table import SourceCatalog" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll now construct all of these `Tasks` before actually running any of them. That's because `SourceDeblendTask` and `SingleFrameMeasurementTask` are constructed with a `Schema` object that records what fields they'll produce, and they modify that schema when they're constructed by adding columns to it. When we run the tasks later, they'll need to be given a catalog that includes all of those columns, **but we can't add columns to a catalog that already exists**.\n", + "\n", + "To recap, the sequence looks like this:\n", + "\n", + " 1. Make a (mostly) empty schema.\n", + " 2. Construct all of the `Task`s (in the order you plan to run them), which adds columns to the schema.\n", + " 3. Make a `SourceCatalog` object from the *complete* schema.\n", + " 4. Pass the same `SourceCatalog` object to each `Task` when you run it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "schema = SourceCatalog.Table.makeMinimalSchema()\n", + "\n", + "detectionTask = SourceDetectionTask(schema=schema)\n", + "\n", + "config = MultibandDeblendTask.ConfigClass()\n", + "config.usePsfConvolution = True\n", + "config.conserveFlux = True\n", + "config.maxIter = 100\n", + "deblendTask = MultibandDeblendTask(schema=schema, config=config)\n", + "\n", + "# We'll customize the configuration of measurement to just run a few plugins.\n", + "# The default list of plugins is much longer (and hence slower).\n", + "measureConfig = SingleFrameMeasurementTask.ConfigClass()\n", + "measureConfig.plugins.names = [\"base_SdssCentroid\", \"base_PsfFlux\", \"base_SkyCoord\"]\n", + "# \"Slots\" are aliases that provide easy access to certain plugins.\n", + "# Because we're not running the plugin these slots refer to by default,\n", + "# we need to disable them in the configuration.\n", + "measureConfig.slots.apFlux = None\n", + "measureConfig.slots.instFlux = None\n", + "measureConfig.slots.shape = None\n", + "measureConfig.slots.modelFlux = None\n", + "measureConfig.slots.calibFlux = None\n", + "measureTask = SingleFrameMeasurementTask(config=measureConfig, schema=schema)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The first step we'll run is detection, which actually returns a new `SourceCatalog` object rather than working on an existing one.\n", + "\n", + "Instead, it takes a `Table` object, which is sort of like a factory for records. We won't use it directly after this, and it isn't actually necessary to make a new `Table` every time you run `MultibandDetectionTask` (but you can only create one after you're done adding columns to the schema).\n", + "\n", + "`Task`s that return anything do so via a `lsst.pipe.base.Struct` object, which is just a simple collection of named attributes. The only return values we're interested is `sources`. That's our new `SourceCatalog`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "table = SourceCatalog.Table.make(schema)\n", + "detectionResult = detectionTask.run(table, subset[\"r\"])\n", + "catalog = detectionResult.sources" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's take a quick look at what's in that catalog. First off, we can look at its schema:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "catalog.schema" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that this includes a lot of columns that were actually added by the deblend or measurement steps; those will all still be blank (`0` for integers or flags, `NaN` for floating-point columns).\n", + "\n", + "In fact, the only columns filled by `SourceDetectionTask` are the IDs. But it also attaches `Footprint` objects, which don't appear in the schema. You can retrieve the `Footprint` by calling `getFootprint()` on a row:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "footprint = catalog[0].getFootprint()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`Footprints` have two components:\n", + " - a `SpanSet`, which represents an irregular region on an image via a list of (y, x0, x1) `Spans`;\n", + " - a `PeakCatalog`, a slightly different kind of catalog whose rows represent peaks within that `Footprint`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(footprint.getSpans())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(footprint.getPeaks())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If we actually look at the footprints in the catalog we see that some have only a single peak, while others have multiple peaks that need to be deblended.\n", + "\n", + "To display only the pixels contained in the footprint (and not other pixels in the bounding box) we create a `MultibandFootprint`, which is a `HeavyFootprint` that contains a `SpanSet`, `PeakCatalog`, and `flux` values for all of the pixels in the `SpanSet`. In this case the `flux` is the total measured flux in the image, since no deblending has taken place yet." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "for src in catalog:\n", + " fp = src.getFootprint()\n", + " img = coadds[:,fp.getBBox()].image\n", + " mfp = MultibandFootprint.fromImages(coadds.filters, image=img, footprint=fp)\n", + " showRGB(mfp.getImage().image, fp=fp, figsize=(3,3))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It's worth noting that while the peaks *can* have both an integer-valued position and a floating-point position, they're the same right now; `SourceDetectionTask` currently just finds the pixels that are local minima and doesn't try to find their sub-pixel locations. That's left to the centroider, which is part of the measurement stage.\n", + "\n", + "Before we can get to that point, we need to run the deblender:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "fluxCatalog, templateCatalog = deblendTask.run(coadds, catalog)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`MultibandDeblendTask` always returns two catalogs, a `templateCatalog` that contains the model outputs from *scarlet* and a `fluxCatalog`, which uses the *scarlet* models as weights to redistribute the flux from the input image (in other words they contain flux-conserved models). If `MultibandDeblendTask.config.saveTemplates` is `False`, then `templateCatalog` will be `None`. Similarly, if `MultibandDeblendTask.config.conserveFlux` is `False` then the `fluxCatalog` will be `None` (and the code will run slightly faster, since it doesn't have to reweight the flux, however this is a small fraction of the processing time).\n", + "\n", + "The deblender itself sets the `parent` column for each source, which is `0` for objects with no parent, and all of the columns that begin with `deblend_` and also adds new rows to the catalog for each child. It does *not* remove the parent rows it created those child rows from, and this is intentional, because we want to measure both \"interpretations\" of the blend family: one in which there is only one object (the parent version) and one in which there are several (the children). Before doing any science with the outputs of an LSST catalog, it's important to remove one of those interpretations (typically the parent one). That can be done by looking at the `deblend_nChild` and `parent` fields:\n", + "\n", + " - `parent` is the ID of the source from which this was deblended, or `0` if the source is itself a parent.\n", + " - `deblend_nChild` is the number of child sources this source has (so it's `0` for sources that are themselves children or were never blended).\n", + " \n", + "Together, these define two particularly useful filters:\n", + "\n", + " - `deblend_nChild == 0`: never-blended object or de-blended child\n", + " - `deblend_nChild == 0 and parent == 0`: never-blended object\n", + " \n", + "The first is what you'll usually want to use; the second is what to use if you're willing to throw away some objects (possibly many) because you don't trust the deblender.\n", + "\n", + "The last processing step for our purposes is running measurement, which must be done on each catalog, in each band (if we want measurements for all of them):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "measureTask.run(templateCatalog[\"r\"], coadds['r'])\n", + "measureTask.run(templateCatalog[\"i\"], coadds['i'])\n", + "measureTask.run(fluxCatalog[\"r\"], coadds['r'])\n", + "measureTask.run(fluxCatalog[\"i\"], coadds['i'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Due to an unfortunate bug in the deblender task, the resulting catalogs are not contiguous and we need to copy them into new objects to use them appropriately. This step can be avoided in the near future." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import lsst.afw.table as afwTable\n", + "\n", + "for f in filters:\n", + " _catalog = afwTable.SourceCatalog(templateCatalog[f].table.clone())\n", + " _catalog.extend(templateCatalog[f], deep=True)\n", + " templateCatalog[f] = _catalog\n", + " _catalog = afwTable.SourceCatalog(fluxCatalog[f].table.clone())\n", + " _catalog.extend(fluxCatalog[f], deep=True)\n", + " fluxCatalog[f] = _catalog" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since we care about deblending (for the sake of this tutorial) lets look at the results from the 13th blend displayed above. We use the `HeavyFootprint`s from the catalog sources that have the same parent (parent 13 from above) to build a model for the entre scene, and to compare the results of the flux conserved and *scarlet* models. In the process we look at both the *scarlet* and flux conserved models." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": false + }, + "outputs": [], + "source": [ + "from lsst.afw.detection import MultibandFootprint\n", + "from lsst.afw.image import MultibandImage\n", + "\n", + "# Use the 13th parent in the blend\n", + "# Note: this is not the parent ID, but the 13th source in the catalog\n", + "parentIdx = 13\n", + "\n", + "# Create empty multiband images to model the entire scene\n", + "templateModel = MultibandImage.fromImages(coadds.filters,\n", + " [Image(fluxCatalog[\"r\"][parentIdx].getFootprint().getBBox(), dtype=np.float32)\n", + " for b in range(len(filters))])\n", + "fluxModel = MultibandImage.fromImages(coadds.filters,\n", + " [Image(fluxCatalog[\"r\"][parentIdx].getFootprint().getBBox(), dtype=np.float32)\n", + " for b in range(len(filters))])\n", + "\n", + "# Only use the subset catalogs with the same parent\n", + "parentId = fluxCatalog[\"r\"][parentIdx].get(\"id\")\n", + "fluxChildren = {b: fluxCatalog[b][fluxCatalog[b].get(\"parent\")==parentId] for b in filters}\n", + "templateChildren = {b: templateCatalog[b][templateCatalog[b].get(\"parent\")==parentId] for b in filters}\n", + "assert(len(fluxChildren)==len(templateChildren))\n", + "\n", + "for n in range(len(templateChildren[\"r\"])):\n", + " # Add the source model to the model of the entire scene\n", + " fp = MultibandFootprint(coadds.filters, [templateChildren[b][n].getFootprint() for b in filters])\n", + " _fp = MultibandFootprint(coadds.filters, [fluxChildren[b][n].getFootprint() for b in filters])\n", + " templateModel[:, fp.getBBox()].array += fp.getImage(fill=0).image.array\n", + " fluxModel[:, _fp.getBBox()].array += _fp.getImage(fill=0).image.array\n", + "\n", + " # Show the model\n", + " fig = plt.figure(figsize=(6, 3))\n", + " ax = [fig.add_subplot(1, 2, n+1) for n in range(2)]\n", + " ax[0].set_title(\"scarlet\")\n", + " ax[1].set_title(\"flux conserved\")\n", + " showRGB(fp.getImage().image, ax=ax[0])\n", + " showRGB(_fp.getImage().image, ax=ax[1])\n", + " plt.show()\n", + "\n", + "templateResidual = MultibandImage(coadds.filters,\n", + " coadds[:, templateModel.getBBox()].image.array - templateModel.array)\n", + "fluxResidual = MultibandImage(coadds.filters,\n", + " coadds[:, fluxModel.getBBox()].image.array - fluxModel.array)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally we look at the full models and the residuals. As expected, there are no residuals for the flux conserved model since all of the flux in the image (that is within the footprint) is added to one of the sources. In this particular case that works fine, but in instances where one or more sources were not detected this can cause one source to have its flux contaminated with its neighbor." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for title, model, residual in [[\"scarlet\", templateModel, templateResidual], [\"flux conserved\", fluxModel, fluxResidual]]:\n", + " fig = plt.figure(figsize=(15,8))\n", + " ax = [fig.add_subplot(1, 2, n+1) for n in range(2)]\n", + " ax[0].set_title(\"{0} model\".format(title))\n", + " ax[1].set_title(\"{0} residual\".format(title))\n", + " showRGB(model, ax=ax[0])\n", + " showRGB(residual ,ax=ax[1], Q=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Exercises\n", + "\n", + "- Use some of the other`sampleBBox` regions and run through the code again, from source detection through measurment and blending displays. Don't foget to change the parent index to view only the children of the correct blend.\n", + "- Play around with other *scarlet* constraints, such as adding an L0 penalty. This should help the code execute faster, as one of the main reasons for the slow down is unecessarily large boxes surrounding the smaller sources. See https://github.com/lsst/meas_deblender/blob/master/python/lsst/meas/deblender/deblend.py#L462-L545 for a description of the other configuration options that can be passed to the deblender\n", + "\n", + "Note: in order to try out different constraints the `meas_deblender` package requires an upgrade that has not been pushed to master yet. To use the latest changes, from your terminal session you must execute the following steps:\n", + "\n", + "```bash\n", + "~$ cp /project/fred3m/tutorials/lsst2018/.user_setups ~/notebooks\n", + "~$ source ~/notebooks/.user_setups\n", + "```\n", + "\n", + "You will have to restart the kernel for this notebook session in order for the changes to take place." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LSST", + "language": "python", + "name": "lsst" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/Deblending/scarlet_tutorial.ipynb b/Deblending/scarlet_tutorial.ipynb new file mode 100755 index 00000000..4ea99f05 --- /dev/null +++ b/Deblending/scarlet_tutorial.ipynb @@ -0,0 +1,470 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Scarlet deblending Tutorial\n", + "\n", + "The purpose of this tutorial is to familiarize the user with the basics of using *scarlet* to model blended scenes, and how tweaking various objects and parameters affects the resulting model. A tutorial that is more specific to using scarlet in the context of the LSST DM Science Pipelines is also available.\n", + "\n", + "Before attempting this tutorial it will be useful to read the [introduction](http://scarlet.readthedocs.io/en/latest/user_docs.html) to the *scarlet* User Guide, and many of the exercises below may require referencing the *scarlet* [docs](http://scarlet.readthedocs.io/en/latest/index.html)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "# Import the necessary libraries\n", + "import os\n", + "\n", + "%matplotlib inline\n", + "import matplotlib\n", + "import matplotlib.pyplot as plt\n", + "# don't interpolate the pixels\n", + "matplotlib.rc('image', interpolation='none')\n", + "\n", + "import numpy as np\n", + "import scarlet\n", + "import scarlet.display" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Display functions\n", + "\n", + "Below are several usful functions used throughout this tutorial to visualize the data and models." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "# Display the sources\n", + "def display_sources(sources, image, norm=None, subset=None, combine=False, show_sed=True, filter_indices=None):\n", + " \"\"\"Display the data and model for all sources in a blend\n", + "\n", + " This convenience function is used to display all (or a subset) of\n", + " the sources and (optionally) their SED's.\n", + " \"\"\"\n", + " if subset is None:\n", + " # Show all sources in the blend\n", + " subset = range(len(sources))\n", + " if filter_indices is None:\n", + " filter_indices = [3,2,1]\n", + " for m in subset:\n", + " # Load the model for the source\n", + " src = sources[m]\n", + " model = [comp.get_model() for comp in src]\n", + "\n", + " # Select the image patch the overlaps with the source and convert it to an RGB image\n", + " img_rgb = scarlet.display.img_to_rgb(image[src[0].bb], filter_indices=filter_indices, norm=norm)\n", + "\n", + " # Build a model for each component in the model\n", + " rgb = []\n", + " for _model in model:\n", + " # Convert the model to an RGB image\n", + " _rgb = scarlet.display.img_to_rgb(_model, filter_indices=filter_indices, norm=norm)\n", + " rgb.append(_rgb)\n", + "\n", + " # Display the image and model\n", + " figsize = [6,3]\n", + " columns = 2\n", + " # Calculate the number of columns needed and shape of the figure\n", + " if show_sed:\n", + " figsize[0] += 3\n", + " columns += 1\n", + " if not combine:\n", + " figsize[0] += 3*(len(model)-1)\n", + " columns += len(model)-1\n", + " # Build the figure\n", + " fig = plt.figure(figsize=figsize)\n", + " ax = [fig.add_subplot(1,columns,n+1) for n in range(columns)]\n", + " ax[0].imshow(img_rgb)\n", + " ax[0].set_title(\"Data: Source {0}\".format(m))\n", + " for n, _rgb in enumerate(rgb):\n", + " ax[n+1].imshow(_rgb)\n", + " if combine:\n", + " ax[n+1].set_title(\"Initial Model\")\n", + " else:\n", + " ax[n+1].set_title(\"Component {0}\".format(n))\n", + " if show_sed:\n", + " for comp in src:\n", + " ax[-1].plot(comp.sed)\n", + " ax[-1].set_title(\"SED\")\n", + " ax[-1].set_xlabel(\"Band\")\n", + " ax[-1].set_ylabel(\"Intensity\")\n", + " # Mark the current source in the image\n", + " y,x = src[0].center\n", + " ax[0].plot(x-src[0].bb[2].start, y-src[0].bb[1].start, 'x', color=\"#5af916\", mew=2)\n", + " plt.tight_layout()\n", + " plt.show()\n", + "\n", + "def display_model_residual(images, blend, peaks, norm, filter_indices=None):\n", + " \"\"\"Display the data, model, and residual for a given result\n", + " \"\"\"\n", + " if filter_indices is None:\n", + " filter_indices = [3,2,1]\n", + " model = blend.get_model()\n", + " residual = images-model\n", + " print(\"Data range: {0:.3f} to {1:.3f}\\nresidual range: {2:.3f} to {3:.3f}\\nrms: {4:.3f}\".format(\n", + " np.min(images),\n", + " np.max(images),\n", + " np.min(residual),\n", + " np.max(residual),\n", + " np.sqrt(np.std(residual)**2+np.mean(residual)**2)\n", + " ))\n", + " # Create RGB images\n", + " img_rgb = scarlet.display.img_to_rgb(images, filter_indices=filter_indices, norm=norm)\n", + " model_rgb = scarlet.display.img_to_rgb(model, filter_indices=filter_indices, norm=norm)\n", + " residual_norm = scarlet.display.Linear(img=residual)\n", + " residual_rgb = scarlet.display.img_to_rgb(residual, filter_indices=filter_indices, norm=residual_norm)\n", + "\n", + " # Show the data, model, and residual\n", + " fig = plt.figure(figsize=(15,5))\n", + " ax = [fig.add_subplot(1,3,n+1) for n in range(3)]\n", + " ax[0].imshow(img_rgb)\n", + " ax[0].set_title(\"Data\")\n", + " ax[1].imshow(model_rgb)\n", + " ax[1].set_title(\"Model\")\n", + " ax[2].imshow(residual_rgb)\n", + " ax[2].set_title(\"Residual\")\n", + " for k,component in enumerate(blend.components):\n", + " y,x = component.center\n", + " #px, py = peaks[k]\n", + " ax[0].plot(x, y, \"gx\")\n", + " #ax[0].plot(px, py, \"rx\")\n", + " ax[1].text(x, y, k, color=\"r\")\n", + " plt.show()\n", + "\n", + "def show_psfs(psfs, filters, norm=None):\n", + " rows = int(np.ceil(len(psfs)/3))\n", + " columns = min(len(psfs), 3)\n", + " figsize = (45/columns, rows*5)\n", + " fig = plt.figure(figsize=figsize)\n", + " ax = [fig.add_subplot(rows, columns, n+1) for n in range(len(psfs))]\n", + " for n, psf in enumerate(psfs):\n", + " im = ax[n].imshow(psf, norm=norm)\n", + " ax[n].set_title(\"{0}-band PSF\".format(filters[n]))\n", + " plt.colorbar(im, ax=ax[n])\n", + " plt.show()\n", + "\n", + "def display_diff_kernels(psf_blend, diff_kernels):\n", + " model = psf_blend.get_model()\n", + " for b, component in enumerate(psf_blend.components):\n", + " fig = plt.figure(figsize=(15,2.5))\n", + " ax = [fig.add_subplot(1,4,n+1) for n in range(4)]\n", + " # Display the psf\n", + " ax[0].set_title(\"psf\")\n", + " _img = ax[0].imshow(psfs[b])\n", + " fig.colorbar(_img, ax=ax[0])\n", + " # Display the model\n", + " ax[1].set_title(\"modeled psf\")\n", + " _model = np.ma.array(model[b], mask=model[b]==0)\n", + " _img = ax[1].imshow(_model)\n", + " fig.colorbar(_img, ax=ax[1])\n", + " # Display the difference kernel\n", + " ax[2].set_title(\"difference kernel\")\n", + " _img = ax[2].imshow(np.ma.array(diff_kernels[b], mask=diff_kernels[b]==0))\n", + " fig.colorbar(_img, ax=ax[2])\n", + " # Display the residual\n", + " ax[3].set_title(\"residual\")\n", + " residual = psfs[b]-model[b]\n", + " vabs = np.max(np.abs(residual))\n", + " _img = ax[3].imshow(residual, vmin=-vabs, vmax=vabs, cmap='seismic')\n", + " fig.colorbar(_img, ax=ax[3])\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Load and Display the data\n", + "\n", + "The `file_path` points to a directory with 147 HSC blends from the COSMOS field detected by the LSST pipeline. Changing `idx` below will select a different blend." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Load the sample images\n", + "idx = 53\n", + "file_path = \"/project/fred3m/code/testdata_deblender/real_data/hsc_cosmos/not_matched\"\n", + "files = os.listdir(file_path)\n", + "data = np.load(os.path.join(file_path, files[idx]))\n", + "image = data[\"images\"]\n", + "wmap = data[\"weights\"]\n", + "peaks = data[\"peaks\"]\n", + "psfs = data[\"psfs\"]\n", + "filters = [\"G\", \"R\", \"I\", \"Z\", \"Y\"]\n", + "# Only a rough estimate of the background is needed\n", + "# to initialize and resize the sources\n", + "bg_rms = np.std(image, axis=(1,2))\n", + "print(\"Background RMS: {0}\".format(bg_rms))\n", + "\n", + "# Use Asinh scaling for the images\n", + "norm = scarlet.display.Asinh(img=image, Q=10)\n", + "# Map i,r,g -> RGB\n", + "filter_indices = [3,2,1]\n", + "# Convert the image to an RGB image\n", + "img_rgb = scarlet.display.img_to_rgb(image, filter_indices=filter_indices, norm=norm)\n", + "plt.imshow(img_rgb)\n", + "plt.title(\"Image: {0}\".format(idx))\n", + "for src in peaks:\n", + " plt.plot(src[0], src[1], \"rx\", mew=2)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Initializing Sources\n", + "\n", + "Astrophysical objects are modeled in scarlet as a collection of components, where each component has a single SED that is constant over it's morphology (band independent intensity). So a single source might have multiple components, like a bulge and disk, or a single component.\n", + "\n", + "The different classes that inherit from `Source` mainly differ in how they are initialized, and otherwise behave similarly during the optimization routine. This section illustrates the differences between different source initialization classes.\n", + "\n", + "The simplest source is a single component intialized with only a single pixel (at the center of the object) turned on.\n", + "\n", + "### *WARNING* \n", + "Scarlet accepts source positions using the numpy/C++ convention of (y,x), which is different than the astropy and LSST stack convention of (x,y)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sources = [scarlet.PointSource((peak[1], peak[0]), image) for peak in peaks]\n", + "\n", + "# Display the initial guess for each source\n", + "display_sources(sources, image, norm=norm)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise:\n", + "\n", + "* Experiment with the above code by using `ExtendedSource`, which initializes each object as a single component with maximum flux at the peak that falls off monotonically and has 180 degree symmetry; and using `MultiComponentSource`, which models a source as two components (a bulge and a disk) that are each symmetric and montonically decreasing from the peak.\n", + "\n", + "# Deblending a scene\n", + "\n", + "The `Blend` class contains the list of sources, the image, and any other configuration parameters necessary to fit the data, including routines to fit the center positions and resize the bounding box containing the sources (if necessary). Once a blend has been initialized with a list of sources, the image and background RMS values must be set (the background RMS is used to determine when to truncate the bounding box around a source)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "blend = scarlet.Blend(sources)\n", + "blend.set_data(image, bg_rms=bg_rms)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next we can fit a model, given a maximum number of iterations and the relative error required for convergence." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "blend.fit(100, 1e-2)\n", + "print(\"Deblending completed in {0} iterations\".format(blend.it))\n", + "display_model_residual(image, blend, peaks, norm)\n", + "display_sources(sources, image, norm)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises\n", + "\n", + "* Experiment by running the above code using different source models (for example `ExtendedSource`) to see how initializtion affects the belnding results.\n", + "\n", + "* The code above initialized the sources at their exact centers. Try offsetting the initial positions by `0.5` pixels in `x` and/or `y` and passing a `shift_center=0` argument when initializing the source. This prevents the source from updating its position, so notice how that affects the resulting model." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Constraints\n", + "\n", + "The above models used the default constraints: perfect symmetry and a weighted monotonicity that decreases from the peak. So each source is defined (internally during initialization) with the constraints" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "import scarlet.constraint as sc\n", + "constraints = (sc.SimpleConstraint(),\n", + " sc.DirectMonotonicityConstraint(use_nearest=False),\n", + " sc.DirectSymmetryConstraint())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "where `SimpleConstraint` forces the SED and morphology to be non-negative, the SED to be normalized to unity, and the peak to have some (minimal) flux at the center." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises\n", + "\n", + "* Go back to the source initialization cell and pass a custom set of constraints. For example, pass `DirectSymmetryConstraint` a number between 0 and 1 to set the level of symmetry required, or eliminate the symmetry constraint altogether and see how that affects deblending.\n", + "\n", + "* Set `use_nearest=True` in the `DirectMonotonicityConstraint`.\n", + "\n", + "* Add `L0Constraint` or `L1Constraint` to the list of constraints and observe the results." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Configuration\n", + "\n", + "There are additional configuration paramters that can be used to initialize a source, as described in http://scarlet.readthedocs.io/en/latest/config.html#Configuration-(scarlet.config).\n", + "\n", + "## Exercises\n", + "\n", + "* Initialize the sources with a custom configuration where `refine_skip=2`, which updates the positions and box sizes on every other step, and see how the results compare" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# PSF Deconvolution\n", + "\n", + "When analyzing real images the PSF will be different in each band unless they have been PSF matched. In general deblending should not be performed on PSF matched coadds, as matching will increase the blending in bands with better seeing. Instead scarlet can be used to build a deconvolved model which is a more sparse (and less blended) representation of the data, and convolve the model in each band to compare to the input data.\n", + "\n", + "To initialize a source with a PSF, pass the PSF as an input to the new source:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "scarlet.ExtendedSource(peaks[0], image, bg_rms, psf=psfs)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Partial PSF Deconvolution\n", + "\n", + "As discussed in the tutorial http://scarlet.readthedocs.io/en/latest/psf_matching.html, the data is noisy and the fully deconvolved scene is undersampled, making the application of the constraints and and full convolution kernel unstable and prone to biases. Instead we can create a target PSF and model the sources in the partially deconvolved target PSF scene.\n", + "\n", + "First we need to specify the target PSF. *scarlet* includes a `fit_target_psf` function to fit the PSF in each band to either a `moffat`, `gaussian`, or `double_gaussian` function. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import scarlet.psf_match\n", + "\n", + "show_psfs(psfs, filters)\n", + "\n", + "# Find the target PSF\n", + "target_psf = scarlet.psf_match.fit_target_psf(psfs, scarlet.psf_match.moffat)\n", + "plt.imshow(target_psf)\n", + "plt.title(\"target PSF\")\n", + "plt.colorbar()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once we have the target PSF we can find the difference kernel in each band using *scarlet*. The `build_diff_kernels` function basically treats the PSF image as a blend, where the PSF in each band is a monochromatic source, and fits the difference kernels using the minimum number of pixels necessary." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "diff_kernels, psf_blend = scarlet.psf_match.build_diff_kernels(psfs, target_psf)\n", + "display_diff_kernels(psf_blend, diff_kernels)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercises\n", + "\n", + "* Try building the difference kernels while varying the parameters in `build_diff_kernels`, for example using larger and smaller values for `l0_thresh`.\n", + "\n", + "* Go back up to source initialization and use `psf=psfs` to fully deconvolve the scene and fit the blend\n", + "\n", + "* Try the same thing but set `psf=diff_kernels` for each source to partially deconvolve the scene." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LSST", + "language": "python", + "name": "lsst" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From 0d79fdbf71a52d679342ea1e98e805d9cd33c1fa Mon Sep 17 00:00:00 2001 From: Phil Marshall Date: Wed, 15 Aug 2018 09:55:14 -0700 Subject: [PATCH 4/8] Deblending in the READMEs --- Deblending/README.rst | 36 ++++++++++++++++++++++++++++++++++++ README.md | 5 +++-- 2 files changed, 39 insertions(+), 2 deletions(-) create mode 100644 Deblending/README.rst diff --git a/Deblending/README.rst b/Deblending/README.rst new file mode 100644 index 00000000..5b422d3a --- /dev/null +++ b/Deblending/README.rst @@ -0,0 +1,36 @@ +Deblending +========== + +This folder contains a set of tutorial notebooks exploring the deblending of LSST objects. See the index table below for links to the notebook code, and an auto-rendered view of the notebook with outputs. + + +.. list-table:: + :widths: 10 20 10 10 + :header-rows: 1 + + * - Notebook + - Short description + - Links + - Owner + + + * - **SCARLET Tutorial** + - Introduction to the SCARLET deblender, how to configure and run it. + - `ipynb `_, + `rendered `_ + + .. image:: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/scarlet_tutorial.svg + :target: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/scarlet_tutorial.log + + - `Fred Moolekamp `_ + + + * - **Deblending in DRP ** + - Where and how the deblending happens, in the DRP pipeline. + - `ipynb `_, + `rendered `_ + + .. image:: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/lsst_stack_deblender.svg + :target: https://github.com/LSSTScienceCollaborations/StackClub/blob/rendered/Deblending/log/lsst_stack_deblender.log + + - `Fred Moolekamp `_ diff --git a/README.md b/README.md index ed98e4ee..f02b30c6 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,7 @@ the LSST Science Collaborations. | Visualization | Displaying images and catalogs. | [StackClub/Visualization](Visualization) | | Image Processing | From raw images to `calexp`s and `coadd`s. | [StackClub/ImageProcessing](ImageProcessing) | | SourceDetection | Detection of sources in images - including low surface brightness galaxies. | [StackClub/SourceDetection](SourceDetection) | +| Deblending | Deblending the objects | [StackClub/Deblending](Deblending) | | Validation | Tools for validating Stack outputs, example validation analyses | [StackClub/Validation](Validation) | * [Stack Club projects](https://github.com/LSSTScienceCollaborations/StackClub/labels/project), as defined by Stack Club members - follow [this link](https://github.com/LSSTScienceCollaborations/StackClub/labels/project) to see what people are working on. [Unassigned projects](https://github.com/LSSTScienceCollaborations/StackClub/issues?utf8=%E2%9C%93&q=is%3Aopen+label%3Aproject+no%3Aassignee) are available for new members to take on! @@ -29,7 +30,7 @@ If you would like to join the Stack Club, please fill out this short **[applicat ## Contributing New Stack Club members: please see the [notes on getting started](GettingStarted/GettingStarted.md) - they'll walk you onto you new LSST Science Platform account, and then show you how to work on your tutorial notebooks. -Everyone else: we welcome pull requests! Feel free to fork this repo, write us an issue, and send us a PR. +Everyone else: we welcome pull requests! Feel free to fork this repo and send us a pull request. And if you are interested in joining the Stack Club, please drop one of us a line, or come and find us in the [#stack-club](https://lsstc.slack.com/messages/C9YRAS4HM) LSSTC Slack channel. > When preparing a pull request, please note the [standards](https://github.com/LSSTScienceCollaborations/StackClub/blob/master/GettingStarted/GettingStarted.md#standards) that we are trying to uphold. @@ -55,4 +56,4 @@ but you can't blame us if it doesn't do what you want. ## More About This Project -Following a successful LSSTC "Enabling Science" proposal, we put together a 3-phase plan, which you can read about in more detail [here](https://docs.google.com/document/d/103kzjOklSUWo5MJP9B-EsnAdO7V6bstTC_mzBvd0NIk/edit#). Phase 0 involved collecting existing tutorials and identifying potential club members from around the LSST Science Collaborations. Then, in Phase 1 (late May 2018 to mid August 2018) we worked together in a small group to turn a subset of those existing "seed" tutorials into community-maintained Jupyter notebooks, for display at the August LSST 2018 Project and Community Workshop (PCW) in Tucson. At that meeting, we opened up to a larger group of LSST science collaboration members, extending and spinning off the initial set of notebooks. +Following a successful LSSTC "Enabling Science" proposal, we put together a 3-phase plan, which you can read about in more detail [here](https://docs.google.com/document/d/103kzjOklSUWo5MJP9B-EsnAdO7V6bstTC_mzBvd0NIk/edit#). Phase 0 involved collecting existing tutorials and identifying potential club members from around the LSST Science Collaborations. Then, in Phase 1 (late May 2018 to mid August 2018) we worked together in a small group to turn a subset of those existing "seed" tutorials into community-maintained Jupyter notebooks, for display at the August LSST 2018 Project and Community Workshop (PCW) in Tucson. At that meeting, we opened up to a larger group of LSST science collaboration members, extending and spinning off the initial set of notebooks. From 42917ee5e2cbd4d78089667cc12c490f0b4673a4 Mon Sep 17 00:00:00 2001 From: fred3m Date: Fri, 17 Aug 2018 07:21:27 -0700 Subject: [PATCH 5/8] Fix path to use public version of testdata_deblender --- Deblending/scarlet_tutorial.ipynb | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Deblending/scarlet_tutorial.ipynb b/Deblending/scarlet_tutorial.ipynb index 4ea99f05..56b500b9 100755 --- a/Deblending/scarlet_tutorial.ipynb +++ b/Deblending/scarlet_tutorial.ipynb @@ -203,7 +203,7 @@ "source": [ "# Load the sample images\n", "idx = 53\n", - "file_path = \"/project/fred3m/code/testdata_deblender/real_data/hsc_cosmos/not_matched\"\n", + "file_path = \"/project/shared/data/testdata_deblender/real_data/hsc_cosmos/not_matched\"\n", "files = os.listdir(file_path)\n", "data = np.load(os.path.join(file_path, files[idx]))\n", "image = data[\"images\"]\n", @@ -448,9 +448,9 @@ ], "metadata": { "kernelspec": { - "display_name": "LSST", + "display_name": "Python 3", "language": "python", - "name": "lsst" + "name": "python3" }, "language_info": { "codemirror_mode": { From 47fbe32953f23ff498edfa78a92dee113d77a06b Mon Sep 17 00:00:00 2001 From: Phil Marshall Date: Fri, 17 Aug 2018 17:10:40 +0000 Subject: [PATCH 6/8] Added standard nb header --- Deblending/lsst_stack_deblender.ipynb | 20 +++++++++++++++++--- Deblending/scarlet_tutorial.ipynb | 20 +++++++++++++++++--- 2 files changed, 34 insertions(+), 6 deletions(-) diff --git a/Deblending/lsst_stack_deblender.ipynb b/Deblending/lsst_stack_deblender.ipynb index 4427f74a..26a2f97b 100755 --- a/Deblending/lsst_stack_deblender.ipynb +++ b/Deblending/lsst_stack_deblender.ipynb @@ -4,9 +4,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# LSST Stack Multiband Deblender Tutorial\n", + "# Using the LSST Stack Multiband Deblender \n", + "
Owner(s): **Fred Moolekamp** ([@fred3m](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@fred3m))\n", + "
Last Verified to Run: **2018-08-17**\n", + "
Verified Stack Release: **16.0**\n", "\n", - "This tutorial is designed to illustrate how to execture the multiband deblender (*scarlet*) in the LSST stack. This includes a brief introduction to LSST stack objects including:\n", + "This tutorial is designed to illustrate how to execute the multiband deblender (*scarlet*) in the LSST stack. This includes a brief introduction to LSST stack objects including:\n", "\n", " - Geometry classes from `lsst.geom`, such as points and boxes.\n", " - Higher-level astronomical primitives from `lsst.afw`, such as the `Image`, `Exposure`, and `Psf` classes.\n", @@ -14,7 +17,18 @@ " \n", "We'll be working with coadded images made from Subaru Hyper Suprime-Cam (HSC) data in the COSMOS field. We've taken a recent LSST reprocessing of the HSC-SSP UltraDeep COSMOS field (see [this page](https://confluence.lsstcorp.org/display/DM/S18+HSC+PDR1+reprocessing) for information on that reprocessing, and [this page](https://hsc-release.mtk.nao.ac.jp/doc/) for the data), and added simulated stars from a scaled [SDSS catalog](http://www.sdss.org/dr14/data_access/value-added-catalogs/?vac_id=photometry-of-crowded-fields-in-sdss-for-galactic-globular-and-open-clusters). The result is a very deep image (deeper than the 10-year LSST Deep-Wide-Fast survey, though not as deep as LSST Deep Drilling fields will be) with both a large number of galaxies and region full of stars.\n", "\n", - "This tutorial is based on Jim Bosch's globular cluster tutorial, however in it's present state *scarlet* is unable to process the crowded field (most likely) due to poor initial conditions for the sources in the field. So instead we use a region of the image outside of the cluster." + "This tutorial is based on Jim Bosch's globular cluster tutorial, however in it's present state *scarlet* is unable to process the crowded field (most likely) due to poor initial conditions for the sources in the field. So instead we use a region of the image outside of the cluster.\n", + "\n", + "### Learning Objectives:\n", + "\n", + "After working through this tutorial you should be able to: \n", + "1. Configure and run the LSST multiband deblender on a test list of objects;\n", + "2. Understand its task context in the DRP pipeline.\n", + "\n", + "### Logistics\n", + "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub.\n", + "\n", + "## Set-up" ] }, { diff --git a/Deblending/scarlet_tutorial.ipynb b/Deblending/scarlet_tutorial.ipynb index 4ea99f05..ddfaf0e8 100755 --- a/Deblending/scarlet_tutorial.ipynb +++ b/Deblending/scarlet_tutorial.ipynb @@ -4,11 +4,25 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Scarlet deblending Tutorial\n", + "# Deblending with *Scarlet*\n", + "
Owner(s): **Fred Moolekamp** ([@fred3m](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@fred3m))\n", + "
Last Verified to Run: **2018-08-17**\n", + "
Verified Stack Release: **16.0**\n", "\n", - "The purpose of this tutorial is to familiarize the user with the basics of using *scarlet* to model blended scenes, and how tweaking various objects and parameters affects the resulting model. A tutorial that is more specific to using scarlet in the context of the LSST DM Science Pipelines is also available.\n", + "The purpose of this tutorial is to familiarize you with the basics of using *scarlet* to model blended scenes, and how tweaking various objects and parameters affects the resulting model. A tutorial that is more specific to using scarlet in the context of the LSST DM Science Pipelines is also available.\n", "\n", - "Before attempting this tutorial it will be useful to read the [introduction](http://scarlet.readthedocs.io/en/latest/user_docs.html) to the *scarlet* User Guide, and many of the exercises below may require referencing the *scarlet* [docs](http://scarlet.readthedocs.io/en/latest/index.html)." + "### Learning Objectives:\n", + "\n", + "After working through this tutorial you should be able to: \n", + "1. Configure and run _scarlet_ on a test list of objects;\n", + "2. Understand its various model assumptions and applied constraints.\n", + "\n", + "Before attempting this tutorial it will be useful to read the [introduction](http://scarlet.readthedocs.io/en/latest/user_docs.html) to the *scarlet* User Guide, and many of the exercises below may require referencing the *scarlet* [docs](http://scarlet.readthedocs.io/en/latest/index.html).\n", + "\n", + "### Logistics\n", + "This notebook is intended to be runnable on `lsst-lspdev.ncsa.illinois.edu` from a local git clone of https://github.com/LSSTScienceCollaborations/StackClub.\n", + "\n", + "## Set-up" ] }, { From 58520d894c3c578567edbdba7875978cb32f7bf7 Mon Sep 17 00:00:00 2001 From: Phil Marshall Date: Fri, 17 Aug 2018 17:30:51 +0000 Subject: [PATCH 7/8] Verified to run --- Deblending/lsst_stack_deblender.ipynb | 7 ------- 1 file changed, 7 deletions(-) diff --git a/Deblending/lsst_stack_deblender.ipynb b/Deblending/lsst_stack_deblender.ipynb index 26a2f97b..1142ed15 100755 --- a/Deblending/lsst_stack_deblender.ipynb +++ b/Deblending/lsst_stack_deblender.ipynb @@ -609,13 +609,6 @@ "\n", "You will have to restart the kernel for this notebook session in order for the changes to take place." ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { From 0cdefb2ef9f3c5cfce2d82eac060c6537c1becdf Mon Sep 17 00:00:00 2001 From: Phil Marshall Date: Fri, 17 Aug 2018 17:33:13 +0000 Subject: [PATCH 8/8] Verified on w_2018_32 --- Deblending/lsst_stack_deblender.ipynb | 2 +- Deblending/scarlet_tutorial.ipynb | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Deblending/lsst_stack_deblender.ipynb b/Deblending/lsst_stack_deblender.ipynb index 1142ed15..bcfa264b 100755 --- a/Deblending/lsst_stack_deblender.ipynb +++ b/Deblending/lsst_stack_deblender.ipynb @@ -7,7 +7,7 @@ "# Using the LSST Stack Multiband Deblender \n", "
Owner(s): **Fred Moolekamp** ([@fred3m](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@fred3m))\n", "
Last Verified to Run: **2018-08-17**\n", - "
Verified Stack Release: **16.0**\n", + "
Verified Stack Release: **w_2018_32**\n", "\n", "This tutorial is designed to illustrate how to execute the multiband deblender (*scarlet*) in the LSST stack. This includes a brief introduction to LSST stack objects including:\n", "\n", diff --git a/Deblending/scarlet_tutorial.ipynb b/Deblending/scarlet_tutorial.ipynb index f260f2a0..b98af66c 100755 --- a/Deblending/scarlet_tutorial.ipynb +++ b/Deblending/scarlet_tutorial.ipynb @@ -7,7 +7,7 @@ "# Deblending with *Scarlet*\n", "
Owner(s): **Fred Moolekamp** ([@fred3m](https://github.com/LSSTScienceCollaborations/StackClub/issues/new?body=@fred3m))\n", "
Last Verified to Run: **2018-08-17**\n", - "
Verified Stack Release: **16.0**\n", + "
Verified Stack Release: **w_2018_32**\n", "\n", "The purpose of this tutorial is to familiarize you with the basics of using *scarlet* to model blended scenes, and how tweaking various objects and parameters affects the resulting model. A tutorial that is more specific to using scarlet in the context of the LSST DM Science Pipelines is also available.\n", "\n",