reflections on converting the clementine and chandrayaan-1 m3 archives

introduction

In late 2020, the United States Geological Survey (USGS) contracted us to convert several of their archives from the Planetary Data System Version 3 (PDS3) standard to the Planetary Data System Version 4 (PDS4) standard. Specifically, these archives included all of the imaging data (both raw and map-projected) from the Clementine lunar orbiter's 1994 mission, and all raw and reduced data collected by the Moon Mineralogy Mapper instrument hosted on the Chandrayaan-1 lunar orbiter during its 2008-2009 mission. These data sets are crucial parts of the legacy of orbital lunar observation, but were in difficult-to-use formats and lacked modern metadata. Our new versions are considerably easier to use. They also come with comprehensive documentation and supplemental repositories containing all of the software we used in format conversion, ensuring that the archive will remain clear, traceable, and maintainable for many more decades.

The data are currently in peer review at the PDS, so we cannot yet share them with you, but you can explore our preliminary documentation and conversion software in our project repositories on GitHub.

The page you are reading now is for some broader reflections on the project.

notes on raw data 1: separation and immediacy

One of the most difficult and important parts of this project was making the least-processed available data from these instruments accessible: the level 0 (L0) data from M3 and the Experiment Data Record (EDR) from Clementine. They were the most difficult portions of the archives to work with, because they were in specialized formats unique to their missions and unreadable by any modern general-purpose software. (M3 L0 data may have been readable by some legacy software at one point; we are not sure. Clementine compressed EDR data, as far as we know, were never readable by any software that was not specially produced for the mission.)

Reduced data, like maps and images with camera lens distortion removed, are intrinsically more user-friendly. They more often contain immediately-recognizable objects; they are more often in physically meaningful units. But to really squeeze everything out of data, you need access to them in their rawest form, as close to the format an instrument produces as possible. Reduced data actively construct forms of contact with entities they supposedly observe (in this case, the Moon); the Clementine mosaics bring you far closer to their version of the Moon than the EDR can hope to (although other Moons, perhaps even better or truer Moons, exist). Raw data are interesting because they give us the closest contact possible with the act of observation and the observing entities (in this case, a bunch of cameras ranging from bizarre to quotidian, or at least as quotidian as something can be after you tape it to a spaceship).

notes on raw data 2: clementine edr videos

There are roughly 1.9 million images in Clementine's EDR. After placing them all into standard formats (FITS arrays and JPEG browse images with XML metadata labels), we realized that we might be holding the only full-resolution copy of the Clementine EDR that could be easily read by modern software. Out of general affection for the data, we decided to do some analysis in an unconventional mode.

Here are videos placing raw observations from each of Clementine's instruments end-to-end, from the first images returned after launch to the rotary glare of its thrusters as its attitude control systems failed, sending it into an endless, mission-ending spin. Although in 'chronological order', they are otherwise out of joint with time in April and May 1994 as it is conventionally understood. Their temporality matches the observational cadence of the mission, not clock time. Please take care: we have altered these images very little, and the observational cadence of the mission involved long-short exposure variations, spinning filter wheels, and other phenomena which lead these videos to often strobe and flicker in the 5-30 Hz range.
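To make the difference between observational cadence and clock time concrete, here is a minimal sketch of the sequencing idea. It is not the code we used to build the videos. The product IDs, timestamps, and 30 frames-per-second playback rate are all made up for illustration. Each observation gets exactly one video frame, so a gap of days between two exposures collapses to a thirtieth of a second on screen.

```python
from datetime import datetime

# Hypothetical observation records: (exposure start time, product id).
# These values are invented for illustration, not taken from the archive.
observations = [
    (datetime(1994, 4, 1, 12, 0, 10), "frame_c"),
    (datetime(1994, 4, 1, 12, 0, 0), "frame_a"),
    (datetime(1994, 4, 3, 8, 30, 0), "frame_d"),
    (datetime(1994, 4, 1, 12, 0, 1), "frame_b"),
]

FPS = 30  # assumed playback rate


def sequence_frames(records, fps=FPS):
    """Order records by start time and give each one a single video frame."""
    ordered = sorted(records, key=lambda rec: rec[0])
    first_start = ordered[0][0]
    schedule = []
    for index, (start, product_id) in enumerate(ordered):
        schedule.append({
            "product_id": product_id,
            # position in the video: one frame per observation
            "video_seconds": index / fps,
            # position in mission clock time, relative to the first exposure
            "clock_seconds": (start - first_start).total_seconds(),
        })
    return schedule


schedule = sequence_frames(observations)
for entry in schedule:
    print(entry)
```

In this toy sequence the last frame appears a tenth of a second into the video, although it was taken almost two days after the first.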

notes on raw data 3: multiplicity

This is an unusual way to look at, and think about, data produced by scientific observations. Data reduction is a narrative process. It makes sense out of the senseless and insensible, and it also makes the targets of observations unary and singular. Looking at this tangle of perspectives reminds us that the Moon may not be 'real', in the sense that we can have knowledge of it as a single fixed entity. The Moon is multiple, changing, and seen only through many glasses darkly.

The artifacts you see in these videos are all at least as real as the Moon. Only the text is new. This includes the obvious compression artifacts. Clementine had an onboard compression chip that allowed it to send many more images home than previous satellites, at some cost in fidelity (but fidelity to what?).

Finally, although the videos linked above are basically an art project, we think this approach might actually be extended into a quick and useful way to explore and QA EDR archives. Many problems in the data and metadata jump out immediately when they are viewed in rapid sequence.

pdr and pdr.converter

The project provided us with an opportunity to prototype a data conversion stack based on our pdr software. We will have more to say about that in this space later; for now, take a look at our LPSC abstract in which we briefly describe its architecture.