Compatible cell and nucleus segmentation file formats

The import-segmentation pipeline accepts the following file formats:

Labeled mask in TIFF (32-bit) or NumPy NPY format
Polygons in GeoJSON format (FeatureCollection type)
Baysor output CSV and GeoJSON file formats
Xenium Onboard Analysis cells.zarr.zip file: the nucleus masks are used as input if --nuclei=cells.zarr.zip and can be expanded with --expansion-distance. The cell masks are used as input if --cells=cells.zarr.zip.

This pipeline has been tested using segmentation files generated by Cellpose v2.2.2, QuPath v0.4, and Baysor v0.6.

The polygon GeoJSON should be exported as a FeatureCollection. Cells and nuclei can be saved in the same GeoJSON file (as in QuPath's output format, shown below) or in separate GeoJSON files. The format should look similar to this example:


{
   "type": "FeatureCollection",
   "features": [
     {
       "type": "Feature",
       "id": "12059192-3b27-4438-96dc-97b41ca84717",
       "geometry": {
         "type": "Polygon",
         "coordinates": [
           [
             [3418.94, 2],
             [3414.62, 3.85],
             [3414.12, 22.65],
             [3415.06, 27.26],
             [3420.1, 35.15],
             [3428.03, 40.22],
             [3437.01, 42.94],
             [3440.69, 45.87],
             [3445.32, 46.71],
             [3468.85, 46.71],
             [3477.88, 44.35],
             [3534.35, 44.31],
             [3546.79, 37.83],
             [3552.28, 30.11],
             [3552.94, 25.45],
             [3552.93, 6.63],
             [3550.68, 2.49],
             [3546, 2],
             [3418.94, 2]
           ]
         ]
       }
     }
   ]
}
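
Before running the pipeline, it can help to sanity-check a polygon file against the structure shown above. The following is a minimal sketch using only the Python standard library. The function name check_feature_collection is hypothetical, and the closed-ring check follows the general GeoJSON specification (RFC 7946); this page does not say which of these checks Xenium Ranger itself performs.

```python
import json


def check_feature_collection(geojson_text):
    """Return a list of problems found in a polygon FeatureCollection.

    Hypothetical helper for illustration; not part of Xenium Ranger.
    """
    data = json.loads(geojson_text)
    problems = []
    if data.get("type") != "FeatureCollection":
        problems.append("top-level type is not FeatureCollection")
    for i, feature in enumerate(data.get("features", [])):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Polygon":
            problems.append(f"feature {i}: geometry is not a Polygon")
            continue
        for ring in geometry.get("coordinates", []):
            # RFC 7946: a linear ring has at least four positions and
            # its first and last positions are identical.
            if len(ring) < 4 or ring[0] != ring[-1]:
                problems.append(f"feature {i}: polygon ring is not closed")
    return problems


# Made-up triangle, used only to exercise the helper.
example = """
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature", "id": "a",
    "geometry": {"type": "Polygon",
                 "coordinates": [[[0, 0], [4, 0], [4, 3], [0, 0]]]}}
 ]}
"""
print(check_feature_collection(example))  # prints []
```

An empty list means no problems were found by these particular checks; it does not guarantee that the file will import successfully.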

Segmentation results from QuPath are compatible with Xenium Ranger and have these specifications:

The feature objectType should be cell; features with any other objectType (e.g., annotations) are ignored.
The --nuclei argument uses the nucleusGeometry polygon if it exists in the GeoJSON; otherwise, it uses the geometry polygon. The --cells argument uses the geometry polygon.
QuPath exports nucleus and cell segmentation results in one file, so the same GeoJSON file should be specified for both --cells and --nuclei. The format should look similar to this example:


{
  "type":"FeatureCollection",
  "features": [
    {
      "type":"Feature",
      "id":"fd0c3d4e-6146-427d-9696-97fbe7adb63d",
      "geometry":{
        "type":"Polygon",
        "coordinates":[
          [
           [4348.52, 0],
           [4344.02, 1.37],
           [4341.18, 10.23],
           [4341.8, 24.31],
           [4346.73, 32.28],
           [4353.27, 38.99],
           [4361.18, 44],
           [4379.94, 44.48],
           [4388.64, 40.97],
           [4396.11, 35.26],
           [4404.71, 21.36],
           [4404.27, 2.55],
           [4400.29, 0.06],
           [4348.52, 0]
           ]
          ]
      },
      "nucleusGeometry":{
        "type":"Polygon",
        "coordinates":[
          [
           [4373.91, 4.81],
           [4366.61, 9.07],
           [4364.28, 12.61],
           [4364.3, 16.85],
           [4370.37, 22.75],
           [4378.35, 20.24],
           [4382.75, 13.06],
           [4380.87, 9.26],
           [4373.91, 4.81]
           ]
          ]
      },
      "properties":{
        "objectType":"cell"
      }
    }
  ]
}
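
The selection rules above can be sketched in a few lines of Python. This is an illustration of the rules as described on this page, not Xenium Ranger's actual implementation; the function name select_polygons and the use_nuclei parameter are hypothetical, and the sample features are made up.

```python
def select_polygons(feature_collection, use_nuclei):
    """Map feature id -> polygon, following the rules described above.

    Hypothetical helper for illustration; not part of Xenium Ranger.
    use_nuclei=True mimics --nuclei, use_nuclei=False mimics --cells.
    """
    selected = {}
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}
        # Features whose objectType is not "cell" are ignored.
        if properties.get("objectType") != "cell":
            continue
        geometry = feature.get("geometry")
        if use_nuclei:
            # Prefer nucleusGeometry; fall back to geometry if it is absent.
            geometry = feature.get("nucleusGeometry") or geometry
        selected[feature.get("id")] = geometry
    return selected


# Made-up features: a cell with a nucleus, an annotation, a cell without one.
cell_poly = {"type": "Polygon", "coordinates": [[[0, 0], [9, 0], [9, 9], [0, 0]]]}
nucleus_poly = {"type": "Polygon", "coordinates": [[[2, 2], [5, 2], [5, 5], [2, 2]]]}
qupath_export = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "cell-1", "geometry": cell_poly,
         "nucleusGeometry": nucleus_poly, "properties": {"objectType": "cell"}},
        {"type": "Feature", "id": "ann-1", "geometry": cell_poly,
         "properties": {"objectType": "annotation"}},
        {"type": "Feature", "id": "cell-2", "geometry": cell_poly,
         "properties": {"objectType": "cell"}},
    ],
}

nuclei = select_polygons(qupath_export, use_nuclei=True)
cells = select_polygons(qupath_export, use_nuclei=False)
print(sorted(nuclei))  # prints ['cell-1', 'cell-2']
```

In this sketch, cell-2 has no nucleusGeometry, so its cell polygon is used for the nucleus as well, and the annotation is skipped in both cases.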

Segmentation results from Baysor v0.6 and later are compatible with Xenium Ranger and have the specifications listed below.

Baysor segmentation CSV:

The CSV must have the following columns: transcript_id, cell, is_noise (order does not matter). The other default columns from Baysor are optional.

Here is an example of the required columns in CSV format:


transcript_id,cell,is_noise
281474976710656,CRc17aaabcd-3,false
281474976710657,CRc17aaabcd-3,false
281474976710658,CRc17aaabcd-3,false
[ ... ]
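
A quick way to confirm that a Baysor CSV has the required columns is to inspect its header row. The sketch below uses Python's standard csv module; the function name missing_columns is hypothetical, and the sample rows are taken from the example above.

```python
import csv
import io

# Columns this page lists as required; order does not matter.
REQUIRED_COLUMNS = {"transcript_id", "cell", "is_noise"}


def missing_columns(csv_text):
    """Return the required column names absent from the CSV header, sorted.

    Hypothetical helper for illustration; not part of Xenium Ranger.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - header)


good_csv = (
    "transcript_id,cell,is_noise\n"
    "281474976710656,CRc17aaabcd-3,false\n"
    "281474976710657,CRc17aaabcd-3,false\n"
)
print(missing_columns(good_csv))  # prints []
print(missing_columns("transcript_id,cell\n1,CRc17aaabcd-3\n"))  # prints ['is_noise']
```

To check a file on disk, read its contents first and pass the text to the function.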

Baysor segmentation GeoJSON:

The GeoJSON should be of type GeometryCollection, where each geometry is of type Polygon.
Each Polygon should contain an extra field called cell that identifies the cell the polygon corresponds to.
Each cell in the GeoJSON must have at least one transcript assigned to it in the transcript assignment CSV. Filtering out negative controls and transcripts with a Q-Score < 20 before running any transcript-based segmentation method is recommended. However, unlike Xenium Ranger v2.0.1, Xenium Ranger v3.0 can handle cells that have only low-quality transcripts without mixing up a cell's transcripts. See this Knowledge Base article for more information.

Here is an example of the GeoJSON format:


{
  "geometries": [
    {
      "coordinates": [
        [
          [15.891722, 21.441122],
          [13.95383, 21.565407],
          [14.233208, 25.427168],
          [15.927658, 26.113512],
          [17.464699, 25.211208],
          [16.642977, 22.83285],
          [15.891722, 21.441122]
        ]
      ],
      "type": "Polygon",
      "cell": 5
    }
  ],
  "type": "GeometryCollection"
}
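
The requirement that every cell in the GeoJSON has at least one assigned transcript in the CSV can be checked ahead of time. The sketch below is one way to do it with the Python standard library. The function name cells_without_transcripts is hypothetical. How the cell value in the GeoJSON relates to the cell value in the CSV is not specified on this page, so the caller supplies a to_key function; the suffix-based mapping used in the example is purely an assumption for illustration. The sketch also treats any row with a non-empty cell value as an assignment, which is an assumption as well.

```python
import csv
import io
import json


def cells_without_transcripts(geojson_text, csv_text, to_key=str):
    """Return GeoJSON cell values that have no transcript row in the CSV.

    Hypothetical helper for illustration; not part of Xenium Ranger.
    to_key converts a cell value from either file into a comparable key.
    """
    assigned = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["cell"]:  # assumption: empty cell means unassigned
            assigned.add(to_key(row["cell"]))
    missing = []
    for geometry in json.loads(geojson_text).get("geometries", []):
        if to_key(geometry["cell"]) not in assigned:
            missing.append(geometry["cell"])
    return missing


# Made-up polygons for cells 3 and 5.
polygons = json.dumps({
    "type": "GeometryCollection",
    "geometries": [
        {"type": "Polygon", "cell": 3,
         "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
        {"type": "Polygon", "cell": 5,
         "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 2]]]},
    ],
})
assignments = (
    "transcript_id,cell,is_noise\n"
    "281474976710656,CRc17aaabcd-3,false\n"
)


# ASSUMPTION: the key is the text after the last hyphen ("CRc17aaabcd-3" -> "3").
def suffix_key(value):
    return str(value).rsplit("-", 1)[-1]


print(cells_without_transcripts(polygons, assignments, to_key=suffix_key))  # prints [5]
```

Here cell 5 has a polygon but no transcript row, so it is reported. Replace suffix_key with whatever mapping matches the identifiers in your own Baysor output.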
