DESI’s data systems are responsible for target selection, survey planning, data processing pipelines, and the transfer, archiving, and distribution of its data products. Target lists for DESI are built from the Legacy Imaging Surveys and from data taken by the WISE and Gaia satellites.
Mock survey simulations help determine the most efficient target allocations and field selection for survey operations. During operations, the ‘Next Field Selector’ will use the resulting algorithms to choose the optimal field to observe in real time, maximizing overall survey efficiency.
At Kitt Peak, a “quicklook” pipeline will process the data within 3 minutes for quality assurance monitoring by the observers. The raw data will be transferred in real time to the National Energy Research Scientific Computing Center (NERSC) at Berkeley Lab for more detailed processing, science analyses, and archiving. The spectroscopic data processing pipeline extracts the spectra from the raw data, subtracts the sky model, flux calibrates the spectra, and determines their classifications and redshifts. About 10 terabytes (TB) of raw data per year will be transferred from Kitt Peak National Observatory to NERSC, with a separate archival copy kept at the NSF’s National Optical-Infrared Astronomy Research Laboratory (NSF’s OIR Lab) in Tucson, Arizona.
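The order of the spectroscopic pipeline steps described above (sky subtraction, flux calibration, then redshift measurement) can be illustrated with a toy sketch. This is not DESI's pipeline code: the function names, the four-pixel spectrum, the flat response curve, and the single-line redshift estimate are all assumptions chosen for illustration, and a real pipeline fits full spectral templates rather than one emission line.

```python
# Toy illustration only: all names and numbers here are hypothetical.

def sky_subtract(flux, sky):
    """Remove the sky model from the extracted spectrum, pixel by pixel."""
    return [f - s for f, s in zip(flux, sky)]

def flux_calibrate(counts, response):
    """Divide by the instrument response to convert counts to flux units."""
    return [c / r for c, r in zip(counts, response)]

def redshift_from_line(observed_wavelength, rest_wavelength):
    """Redshift implied by one identified line: z = lambda_obs / lambda_rest - 1."""
    return observed_wavelength / rest_wavelength - 1.0

# Hypothetical extracted spectrum on a four-pixel wavelength grid (angstroms).
wavelengths = [7400.0, 7450.0, 7454.0, 7500.0]
raw = [12.0, 15.0, 40.0, 14.0]
sky = [10.0, 10.0, 10.0, 10.0]      # assumed sky model
response = [2.0, 2.0, 2.0, 2.0]     # assumed flat response curve

calibrated = flux_calibrate(sky_subtract(raw, sky), response)

# Treat the brightest pixel as an emission line and assume it is [O II],
# whose rest wavelength is approximately 3727 angstroms.
peak = wavelengths[calibrated.index(max(calibrated))]
z = redshift_from_line(peak, 3727.0)
print(calibrated, z)
```

With these made-up inputs, the brightest pixel falls at twice the assumed rest wavelength, so the sketch reports a redshift of 1.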
After running the data through the pipelines at NERSC (using millions of CPU hours), there will be about 100 TB per year of data products, which will be made available as data releases approximately once per year throughout DESI’s 5 years of operations.
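The quoted data rates imply some simple totals, sketched below. The calculation assumes the rates stay constant over all 5 years and that 1 TB is 10^12 bytes; neither assumption is stated in the text, so the results are rough estimates only.

```python
# Back-of-the-envelope totals from the figures quoted above.
# Assumptions (not from the source): constant yearly rates, 1 TB = 1e12 bytes.
RAW_TB_PER_YEAR = 10
PRODUCTS_TB_PER_YEAR = 100
SURVEY_YEARS = 5

def cumulative_tb(rate_tb_per_year, years):
    """Cumulative volume in TB at the end of each survey year."""
    return [rate_tb_per_year * year for year in range(1, years + 1)]

raw_by_year = cumulative_tb(RAW_TB_PER_YEAR, SURVEY_YEARS)
products_by_year = cumulative_tb(PRODUCTS_TB_PER_YEAR, SURVEY_YEARS)

# Average sustained transfer rate needed to move the raw data, in MB/s.
SECONDS_PER_YEAR = 365 * 24 * 3600
raw_mb_per_second = RAW_TB_PER_YEAR * 1e12 / SECONDS_PER_YEAR / 1e6

print("Raw data by year (TB):", raw_by_year)
print("Data products by year (TB):", products_by_year)
print(f"Average raw transfer rate: {raw_mb_per_second:.2f} MB/s")
```

Under these assumptions the survey accumulates about 50 TB of raw data and about 500 TB of processed data products, and the average raw transfer rate is well under 1 MB/s. Actual transfers would be bursty, since data are taken only while observing.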