Output File Formats
PyReduce produces FITS files containing extracted spectra. This page documents the file structure.
Spectra Format (v2)
The current format stores spectra in a FITS binary table with one row per trace.
Files are identified by header keyword E_FMTVER = 2.
Header Keywords
| Keyword | Description |
|---|---|
| E_FMTVER | Format version (2 for current format) |
| | Comma-separated list of pipeline steps run |
| | Extraction oversampling factor |
| | Slit function smoothing parameter |
| | Spectrum smoothing parameter |
| | Swath width (if set) |
| | Barycentric velocity correction (km/s) |
Table Columns
The binary table extension (named SPECTRA) contains:
| Column | Format | Description |
|---|---|---|
| SPEC | | Extracted spectrum (float32). NaN for masked pixels. |
| SIG | | Uncertainty (float32). NaN for masked pixels. |
| M | | Spectral order number (see below). -1 if unknown. |
| | | Group identifier (‘A’, ‘B’, ‘cal’, or bundle index). |
| | | Fiber index within group (1-indexed). -1 if unknown. |
| | | Extraction height used for this trace |
| | | Wavelength in Angstroms (float64, optional) |
| | | Continuum level (float32, optional) |
| | | Slit function (float32, optional, NaN-padded) |
Spectral Order Number (M)
The M column contains the physical spectral (diffraction) order number, not a
sequential index. In echelle spectrographs, higher order numbers correspond to
shorter wavelengths.
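The inverse relation between order number and wavelength follows from the grating equation: for a fixed grating geometry, the product of the order number and the blaze-peak wavelength is approximately constant. The sketch below illustrates this with a purely hypothetical constant (`M_LAMBDA` is not taken from any PyReduce instrument, and `central_wavelength` is an illustrative helper, not part of the package).

```python
# Hypothetical value of m * lambda_c in Angstroms. Real values depend on
# the grating and are not part of the file format.
M_LAMBDA = 570_000.0


def central_wavelength(m: int) -> float:
    """Approximate central wavelength (Angstroms) of diffraction order m."""
    if m <= 0:
        raise ValueError("order number must be positive")
    return M_LAMBDA / m


# Higher order number -> shorter central wavelength
for m in (80, 100, 120):
    print(f"order {m}: ~{central_wavelength(m):.0f} A")
```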
The order number is assigned during reduction via:
- order_centers.yaml: If the instrument provides this file, traces are matched to known order centers during detection.
- Wavelength calibration: The linelist file contains obase (base order number). Each trace gets m = obase + trace_index.
- Fallback: For legacy files or MOSAIC mode, M may be -1 (unknown) or sequential from 0.
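The wavelength-calibration rule and the fallback can be sketched as follows. This is only an illustration of the arithmetic described above: `assign_order_numbers` is a hypothetical helper, not a PyReduce function, and it does not cover matching against order_centers.yaml.

```python
def assign_order_numbers(ntrace, obase=None):
    """Return one order number per trace.

    obase is the base order number from the linelist; None means it is
    not available (assumption: the caller signals "unknown" this way).
    """
    if obase is None:
        # Fallback: mark every trace as unknown
        return [-1] * ntrace
    # m = obase + trace_index
    return [obase + trace_index for trace_index in range(ntrace)]


print(assign_order_numbers(3, obase=60))
print(assign_order_numbers(2))
```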
The order number is used in 2D wavelength calibration polynomials. See Wavelength Calibration for details.
Each row corresponds to one extracted trace/order.
Masking
Invalid pixels are marked with NaN in the SPEC and SIG columns. This
replaces the separate COLUMNS array used in the legacy format.
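As a rough illustration of the two masking schemes, the NumPy sketch below derives a boolean mask from NaN values and converts a legacy-style column range into NaN masking. Both helper names are hypothetical, and treating the range as half-open (`end` exclusive) is an assumption; the format description only says `[start, end]`.

```python
import numpy as np


def mask_from_nan(spec, sig):
    """Boolean mask, True where a pixel is invalid (NaN in spec or sig)."""
    return np.isnan(spec) | np.isnan(sig)


def apply_column_range(values, start, end):
    """Set pixels outside [start, end) to NaN (end assumed exclusive)."""
    out = np.array(values, dtype=np.float32)  # copy, so the input is untouched
    out[:start] = np.nan
    out[end:] = np.nan
    return out


spec = apply_column_range([1.0, 2.0, 3.0, 4.0, 5.0], start=1, end=4)
sig = np.full(5, 0.1, dtype=np.float32)
mask = mask_from_nan(spec, sig)
print(mask)
```

With NaN masking, statistics over valid pixels can use `spec[~mask]` or NumPy's NaN-aware functions such as `np.nanmean`.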
Reading Spectra
```python
from pyreduce.spectra import Spectra

# Load spectra (handles both v2 and legacy formats)
spectra = Spectra.read("observation.science.fits")

# Access individual spectra
for s in spectra.data:
    print(f"Order {s.m}, fiber {s.fiber}")
    print(f"  Wavelength range: {s.wave[~s.mask].min():.1f} - {s.wave[~s.mask].max():.1f} A")

# Get stacked arrays
arrays = spectra.get_arrays()
spec_2d = arrays["spec"]  # shape (ntrace, ncol)
```
Legacy Echelle Format (v1)
Files without E_FMTVER or with E_FMTVER < 2 use the legacy format.
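The detection rule above amounts to a single header lookup. The following is a minimal sketch rather than PyReduce's actual implementation: `is_legacy_format` is a hypothetical helper that accepts any dict-like header.

```python
from collections.abc import Mapping


def is_legacy_format(header: Mapping) -> bool:
    """True if the header indicates the legacy (v1) format.

    A file counts as legacy when E_FMTVER is missing or less than 2.
    """
    version = header.get("E_FMTVER")
    return version is None or int(version) < 2


print(is_legacy_format({}))                # no keyword
print(is_legacy_format({"E_FMTVER": 1}))   # old version
print(is_legacy_format({"E_FMTVER": 2}))   # current format
```

An `astropy.io.fits.Header` also provides a dict-like `.get()`, so it can be passed in directly. Which HDU carries the keyword is not stated here, so check the header of the file you are reading.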
Structure
The binary table has a single row containing flattened 2D arrays:
| Column | Format | Description |
|---|---|---|
| | | Flattened spectrum array |
| | | Flattened uncertainty array |
| | | Flattened wavelength array |
| | | Flattened continuum array |
| COLUMNS | | Column range [start, end] per trace |
The TDIM keyword stores the original shape as (ncol, ntrace).
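To recover the 2D array from a flattened cell by hand, note that FITS lists the fastest-varying axis first in TDIM, so the NumPy (C-order) shape is the reverse, `(ntrace, ncol)`. The sketch below uses a small synthetic array; `parse_tdim` is a hypothetical helper and the `"(4,3)"` value is made up for illustration.

```python
import numpy as np


def parse_tdim(tdim: str) -> tuple:
    """Parse a TDIM string such as '(4,3)' into a tuple of ints."""
    return tuple(int(x) for x in tdim.strip().strip("()").split(","))


ncol, ntrace = parse_tdim("(4,3)")  # TDIM order: (ncol, ntrace)

# Stand-in for the single flattened table cell
flat = np.arange(ncol * ntrace, dtype=np.float32)

# Reverse the TDIM order for NumPy: one row per trace
spec_2d = flat.reshape(ntrace, ncol)

print(spec_2d.shape)
print(spec_2d[1])  # pixels of the second trace
```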
Key Differences from v2
| Aspect | Legacy (v1) | Current (v2) |
|---|---|---|
| Table rows | 1 (flattened) | ntrace (one per spectrum) |
| Masking | Separate COLUMNS array | NaN in data |
| Order info | Not stored | M column |
| Group info | Not stored | |
| Fiber index | Not stored | |
| Extraction height | Not stored | |
| Slit function | Separate files | |
Reading Legacy Files
Spectra.read() automatically detects and handles legacy files:
```python
from pyreduce.spectra import Spectra

# Works for both formats - auto-detects via E_FMTVER header
spectra = Spectra.read("old_file.fits")

# Access data the same way regardless of original format
for s in spectra.data:
    print(f"Order {s.m}: {len(s.spec)} pixels")
```