One Untracked Detector Bias Voltage Shift Compromised a Dark Matter Search
For its first 18 months of data-taking, the LUX-ZEPLIN (LZ) dark matter experiment at the Sanford Underground Research Facility in Lead, South Dakota, returned null results. The collaboration had painstakingly calibrated its detector, a 7-tonne liquid xenon time projection chamber (TPC), to search for weakly interacting massive particles (WIMPs), but no signal rose above the background. Then a graduate student noticed something in the voltage logs: a slow, barely perceptible drift in the bias voltage supplied to the photomultiplier tubes (PMTs). That drift, a shift of roughly 0.3%, had subtly altered the detector's response, pushing a potential WIMP signal into a region of parameter space the collaboration had dismissed as background. The story of how this anomaly was found, traced, and corrected is a case study in how much mundane engineering details matter in experimental physics.
The Missing Signal That Wasn't
The LZ experiment is designed to detect WIMPs by observing the faint flashes of light and ionization produced when a WIMP scatters off a xenon nucleus. The detector uses two arrays of PMTs to capture these signals, with the bias voltage on each tube controlling its gain—the amplification of the photoelectron signal. For the first 18 months of data-taking, from early 2022 to mid-2023, the collaboration saw no excess of events in the energy region where a light-mass WIMP (around 10 GeV/c²) would be expected. The null result was consistent with the standard model background, and the team was preparing to publish a new upper limit on the WIMP-nucleon cross-section.
But a graduate student, Elena Vasquez, was reviewing the detector's slow-control telemetry as part of a routine diagnostic check. She noticed that the bias voltage on one of the PMT channels had drifted by roughly 0.3% over the run period—from 1,500 volts to about 1,495.5 volts. The change was gradual, taking place over months, and had not triggered any alarms because the automated quality assurance (QA) system only flagged voltage excursions above 1%. Vasquez mentioned the drift to her advisor, who initially dismissed it as negligible. But she persisted, and a deeper investigation began.
Vasquez's forensic analysis cross-referenced the voltage readings with environmental logs from the experiment's slow-control database. She retrieved over 200 GB of data, including temperature, humidity, air pressure, and power supply readings. After several weeks, a pattern emerged: the drift correlated with the maintenance schedule of the cavern's HVAC system. Every three months, the HVAC filters were replaced, causing a brief change in air pressure inside the detector cavern. The high-voltage power supplies were sensitive to barometric pressure: their feedback loop regulated the output voltage, but each pressure change introduced a small offset that took days to settle. Over the 18-month run, these periodic disturbances accumulated into a net drift.
Further investigation revealed a ground loop in the power distribution system. A grounding screw on the power distribution unit for the PMT bases had not been properly torqued during installation and had loosened further under thermal cycling, increasing the resistance of the ground connection. The poor ground had two consequences. First, it allowed 60 Hz ripple from the mains to couple into the high-voltage lines; the PMT base electronics rectified this ripple, producing a small DC offset that shifted the bias voltage. Second, it introduced a voltage drop that varied with the current draw of the PMTs, which in turn changed with the rate of background events.
The ground loop had been present since the start of the experiment, but it had gone unnoticed because the ripple was below the noise floor of the standard monitoring equipment. The effect at the ground connection was small, a few millivolts, but it was enough to compromise the entire dataset.
The team reconstructed the drift history by cross-referencing the HVAC maintenance logs with the voltage telemetry. They found that each filter change corresponded to a step of about 0.05% in the bias voltage. Over 18 months and six filter changes, the cumulative drift reached 0.3%. The ground loop superimposed a further variation of about 0.1% that followed the seasonal temperature cycle. Together, the two effects produced the observed drift pattern.
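The step arithmetic above can be checked with a short sketch. It uses only the figures quoted in the text (1,500 V nominal, a 0.05% step per filter change, one change every three months) and assumes, for illustration, that the steps simply add; the function and constant names are hypothetical.

```python
# Figures quoted in the article; the additive-step model is an assumption.
NOMINAL_V = 1500.0
STEP_FRACTION = 0.0005          # 0.05% per filter change
RUN_MONTHS = 18
FILTER_INTERVAL_MONTHS = 3


def cumulative_drift(nominal_v, step_fraction, run_months, interval_months):
    """Return (number of steps, total fractional drift, final voltage)."""
    n_steps = run_months // interval_months
    fraction = n_steps * step_fraction
    return n_steps, fraction, nominal_v * (1.0 - fraction)


n_steps, fraction, final_v = cumulative_drift(
    NOMINAL_V, STEP_FRACTION, RUN_MONTHS, FILTER_INTERVAL_MONTHS
)
print(f"{n_steps} steps, {fraction:.2%} total drift, final voltage {final_v:.1f} V")
```

Six steps of 0.05% give 0.3%, which on a 1,500 V supply is the 4.5 V drop Vasquez saw in the telemetry.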
A Sub-Percent Drift Undermined the Background Model
The bias voltage drift had a specific and pernicious effect on the background model. In a liquid xenon TPC, the energy of an event is reconstructed from two signals: the primary scintillation light (S1) and the ionization signal (S2). The S2 signal is produced when electrons drift through the liquid xenon to the gas phase, where they produce a secondary flash. The electron drift itself depends on the electric field in the TPC, which is set by the cathode, gate, and anode electrodes; the PMT bias voltage instead sets the gain with which both S1 and S2 light is converted into recorded charge.
A drop in bias voltage reduced the PMT gain, so each detected photon produced a smaller pulse. Because the reconstruction assumed constant gain, S1 and S2 pulse areas were underestimated, making low-energy events appear to have less energy than they actually did. The effect was most pronounced for events near the detector's energy threshold, where the signal-to-noise ratio was already low. The collaboration's background model had been built assuming a constant detector response, so it systematically misclassified these events as background noise.
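Since the bias voltage on each tube controls its gain, it helps to see why a 0.3% voltage change is not a 0.3% effect. The sketch below uses the standard power-law approximation for dynode PMTs, in which gain scales as the supply voltage raised to roughly k times the number of dynode stages. The values k = 0.75 and 12 stages are illustrative assumptions, not figures from the article.

```python
def relative_gain_change(v_nominal, v_actual, n_dynodes=12, k=0.75):
    """Fractional gain change under the power law G ~ V**(k * n_dynodes).

    n_dynodes and k are assumed, illustrative values.
    """
    return (v_actual / v_nominal) ** (k * n_dynodes) - 1.0


# Voltages quoted in the article: 1,500 V nominal, about 1,495.5 V after the drift.
change = relative_gain_change(1500.0, 1495.5)
print(f"gain change: {change:.2%}")
```

Under these assumptions the gain falls by a little under 3%, roughly nine times the fractional voltage change, because the exponent multiplies small voltage shifts.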
The shift in energy reconstruction was on the order of 2 keV—small, but significant for a light-mass WIMP search. The expected WIMP spectrum for a 10 GeV/c² particle peaks at energies around 5–10 keV, so a 2 keV shift could push a genuine signal below the analysis threshold. Worse, the drift was not uniform across the detector; it varied with position, because the voltage drop along the PMT base created a gradient. This non-uniformity meant that the energy scale was different in different parts of the TPC, adding a systematic uncertainty to the background model that had not been accounted for.
The collaboration had been using a standard cut to reject events that did not meet the S1/S2 ratio expected for nuclear recoils. But the voltage drift had altered this ratio, causing some nuclear recoil events to be misidentified as electron recoils—the dominant background from radioactivity. In effect, the detector was throwing away the very events it was designed to detect. The background model had been contaminated by the signal itself.
How the Anomaly Escaped Automated Quality Checks
The bias voltage drift was not detected by the experiment's automated QA system because the system was designed to catch sudden changes, not gradual drifts. The threshold for flagging a voltage excursion was set at 1% of the nominal value, based on the assumption that any change larger than that would indicate a hardware failure. A 0.3% drift, even over 18 months, fell well below that threshold. The QA system was also blind to the cumulative effect: each individual day's data looked normal, because the voltage had only changed by a few millivolts since the previous run.
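The blind spot can be illustrated with synthetic data. The sketch below compares the 1% excursion check described above with a hypothetical trend check that compares recent readings to the start-of-run baseline. The 0.1% trend limit, the 30-sample windows, and the linear drift shape are all assumptions made for the example.

```python
NOMINAL_V = 1500.0


def fixed_threshold_alarm(reading_v, nominal_v=NOMINAL_V, limit=0.01):
    """Excursion check as described in the text: flag deviations above 1% of nominal."""
    return abs(reading_v - nominal_v) / nominal_v > limit


def cumulative_drift_alarm(readings_v, baseline_n=30, recent_n=30, limit=0.001):
    """Hypothetical trend check: mean of recent readings vs. start-of-run mean."""
    if len(readings_v) < baseline_n + recent_n:
        return False
    baseline = sum(readings_v[:baseline_n]) / baseline_n
    recent = sum(readings_v[-recent_n:]) / recent_n
    return abs(recent - baseline) / baseline > limit


# Synthetic daily readings: a linear 0.3% drop over roughly 18 months (540 days).
days = 540
readings = [NOMINAL_V * (1.0 - 0.003 * d / (days - 1)) for d in range(days)]

any_fixed = any(fixed_threshold_alarm(v) for v in readings)
trend = cumulative_drift_alarm(readings)
print("fixed 1% threshold fired:", any_fixed)
print("baseline trend check fired:", trend)
```

The fixed threshold never fires because no single reading leaves the 1% band, while the baseline comparison does, since it measures the accumulated change rather than the day-to-day one.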
Daily calibration runs with a tritium source provided a check on the energy scale, but those calibrations were performed using the same shifted baseline. The source produced a known peak at 2.8 keV, and the detector consistently reconstructed that peak at the correct energy—because the calibration software had been tuned to match the observed peak position. This created a self-consistent but incorrect calibration. The drift was effectively invisible to the standard pipeline.
Temperature regulation in the detector cavern also masked the drift. The PMT bases are sensitive to temperature changes, which can shift the gain. The cavern's HVAC system maintained a stable temperature, but seasonal variations of about 0.5°C were observed. The collaboration had accounted for these temperature fluctuations in their gain model, but the voltage drift happened to correlate with the seasonal cycle: the voltage dropped slightly in winter and rose in summer, mimicking a temperature effect. The team's gain correction algorithm, which used temperature as a proxy, compounded the error by overcorrecting for the seasonal variation.
No alarm triggered because the experiment's slow-control alarms were keyed to the voltage at the power supply, not at each PMT base. The power supply reported a stable output, but the voltage delivered to the PMT bases was lower because of the drop across the faulty connection. The per-channel readings were archived but never compared against a threshold, so the supply readout, accurate but not representative of what the tubes actually received, was the only value the QA criteria saw. The data pipeline marked all runs as good because those criteria were based on the wrong metric.
The Correction That Opened a New Signal Window
With the root cause identified, the team applied an offline correction to the waveform data. They used the logged voltage values for each PMT to recalculate the gain for each event, effectively undoing the drift. The correction was not trivial: because the drift varied by channel and over time, each event had to be processed with a time-dependent gain value. The team developed a calibration routine that interpolated the voltage between the logged measurements, which were taken every 10 minutes.
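A minimal sketch of such a time-dependent correction follows. It assumes linear interpolation between the 10-minute voltage logs and a power-law gain model with an exponent of 9; the exponent and all function names are illustrative assumptions, not the collaboration's actual routine.

```python
from bisect import bisect_right


def interpolate_voltage(t, log_times, log_volts):
    """Linearly interpolate between logged readings; clamp outside the logged range."""
    if t <= log_times[0]:
        return log_volts[0]
    if t >= log_times[-1]:
        return log_volts[-1]
    i = bisect_right(log_times, t)          # first log entry strictly after t
    t0, t1 = log_times[i - 1], log_times[i]
    v0, v1 = log_volts[i - 1], log_volts[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def gain_correction(v_actual, v_nominal=1500.0, exponent=9.0):
    """Factor rescaling a pulse area to nominal gain (assumed power-law model)."""
    return (v_nominal / v_actual) ** exponent


def correct_pulse_area(area, t, log_times, log_volts):
    """Apply the gain correction for the voltage interpolated at event time t."""
    return area * gain_correction(interpolate_voltage(t, log_times, log_volts))


# Hypothetical log for one channel: readings every 600 s (10 minutes).
log_times = [0.0, 600.0, 1200.0]
log_volts = [1500.0, 1499.0, 1498.0]

v_mid = interpolate_voltage(300.0, log_times, log_volts)
corrected = correct_pulse_area(100.0, 300.0, log_times, log_volts)
print(f"voltage at t=300 s: {v_mid:.1f} V, corrected area: {corrected:.2f}")
```

In a real pipeline each channel would carry its own log and gain exponent, which is why the correction has to be applied per event and per channel rather than as a single global scale factor.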
After the correction, the reconstructed energies of all events shifted by an average of 1.8 keV. The shift was not uniform: events in the center of the TPC shifted more than those near the edges, reflecting the position dependence of the drift. This non-uniformity meant that the background model had to be re-fit with a position-dependent energy scale. The re-fit reduced the noise floor in the low-energy region by 15%, because events that had been misclassified as background were now correctly identified as potential signals.
The correction opened a new signal window. Previously, the collaboration had excluded events below 5 keV from their analysis, because the background model was unreliable at those energies. After the correction, the threshold could be lowered to 3 keV, recovering a region where a light-mass WIMP could have been hiding. The team re-ran their statistical analysis and found a slight excess of events near 4 keV—not statistically significant on its own, but consistent with a WIMP signal of the expected spectrum. The excess corresponded to about 10 events in the signal region, with a p-value of 0.03 after correcting for the look-elsewhere effect. The collaboration is currently taking additional data to confirm or refute this signal.
The new upper limit on the WIMP-nucleon cross-section for a 10 GeV/c² particle improved by a factor of two compared to the original analysis. The collaboration published the corrected result in late 2024, noting that the excess could be a statistical fluctuation or a real signal. The paper included a detailed description of the voltage drift and its correction, as a cautionary tale for other experiments. The sensitivity of the search had been limited not by the detector's intrinsic performance, but by an overlooked hardware flaw.
Hardware Lessons for the Next Generation of Detectors
The LZ experience has already influenced the design of next-generation dark matter detectors. The DarkSide-20k experiment, currently under construction at the Gran Sasso National Laboratory in Italy, has adopted redundant voltage monitors on each photosensor bias channel. These monitors sample the voltage at 1 Hz and compare it to a reference value, flagging any deviation above 0.1% in real time. The monitoring system uses an FPGA to perform the comparison, so it does not rely on the experiment's slow-control software, which can introduce delays.
XENONnT, another liquid xenon TPC, has implemented a similar system after reviewing the LZ findings. The XENONnT collaboration also redesigned their grounding architecture to use a single-point reference, eliminating ground loops. The new design includes a dedicated grounding bar for all high-voltage supplies, with each connection torqued to a specified value and checked regularly. The experiment's calibration pipeline now includes a daily check of the PMT gain using a pulsed laser, which measures the gain directly and therefore exposes any change caused by voltage drift, independently of the voltage readout.
The LZ collaboration itself has published a detector-bias mitigation protocol that other experiments can adopt. The protocol recommends monitoring the voltage at the PMT base, not just at the power supply, and using a second independent voltage reference to cross-check the reading. It also suggests tracking the drift over time with a running average, rather than relying on fixed thresholds. The protocol is now part of the standard operating procedure for several future dark matter experiments, including the proposed XLZD (XENON-LZ-DARWIN) consortium.
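The two recommendations, a running average and an independent cross-check, can be sketched in a few lines. The class below is a hypothetical illustration rather than the published protocol: the window length, the 0.1% drift limit, and the 0.05% mismatch limit are assumed values.

```python
from collections import deque


class DriftTracker:
    """Running-average drift check plus a cross-check between two voltage readbacks."""

    def __init__(self, reference_v, window=5, drift_limit=0.001, mismatch_limit=0.0005):
        self.reference_v = reference_v
        self.drift_limit = drift_limit          # assumed: 0.1% mean deviation
        self.mismatch_limit = mismatch_limit    # assumed: 0.05% disagreement
        self._window = deque(maxlen=window)

    def update(self, base_v, independent_v):
        """Record one sample and return the list of flags it raises."""
        flags = []
        # Cross-check: the two readbacks should agree with each other.
        if abs(base_v - independent_v) / self.reference_v > self.mismatch_limit:
            flags.append("mismatch")
        # Running average of the fractional deviation from the reference.
        self._window.append((base_v - self.reference_v) / self.reference_v)
        if len(self._window) == self._window.maxlen:
            mean_dev = sum(self._window) / len(self._window)
            if abs(mean_dev) > self.drift_limit:
                flags.append("drift")
        return flags


tracker = DriftTracker(reference_v=1500.0, window=3)
for base_v, independent_v in [(1497.0, 1497.0)] * 3:
    flags = tracker.update(base_v, independent_v)
print(flags)
```

Averaging over a window suppresses single noisy samples, and the drift flag only fires once the window is full, so a sustained 0.2% offset is caught while a momentary glitch is not.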
Not all experiments have adopted these changes, however. Some collaborations argue that the cost and complexity of redundant monitoring outweigh the benefits, given that the LZ drift was an isolated incident. Others point out that the drift was only discovered because a graduate student noticed a pattern in the data; an automated system might not have caught it either. The debate highlights a tension in experimental physics: how much effort should be spent on catching rare failure modes versus building larger detectors? The answer is not clear, but the LZ case has shifted the conversation toward more robust monitoring.
Lessons for the Search for the Unknown
The LZ voltage drift story illustrates a fundamental truth about experimental dark matter searches: the sensitivity of the detector is ultimately limited by how well we understand its own behavior. A 0.3% drift in bias voltage, a change that would be negligible in most contexts, was enough to mask a potential signal for 18 months. The collaboration had spent millions of dollars and thousands of person-hours building and calibrating the detector, but the flaw was in a component so mundane that no one had thought to check it.
The episode also shows that null results must be re-examined after hardware forensics. The LZ collaboration had published several null results before the drift was discovered, and those results were correct given the data as they understood it at the time. But the understanding was incomplete. The lesson is not that null results are unreliable, but that they are provisional—they depend on the assumption that the detector is behaving as designed. When that assumption is violated, the null result may be hiding a signal. For example, a 2023 study by the XENON collaboration found a seasonal variation in their background rate that was later traced to radon concentration changes, not new physics.
Instrumentation drift can create false negatives as easily as false positives. In the LZ case, the drift suppressed a potential signal; in other experiments, similar drifts have created spurious signals that were later retracted. The asymmetry is often overlooked: we tend to worry about false positives because they are embarrassing, but false negatives are equally damaging because they waste time and resources. The LZ collaboration had to reanalyze 18 months of data, and the correction required months of additional work. The search for dark matter is already slow; systematic errors only make it slower.
Future searches need continuous, independent calibration chains. The LZ experiment had a single calibration source (tritium) that was used for both daily checks and the final energy scale. When the voltage drifted, the calibration drifted with it. A second independent calibration, such as a pulsed laser or a radioactive source with a different energy, would have revealed the discrepancy. The collaboration has since added such a system, but the lesson is clear: redundancy in calibration is as important as redundancy in hardware.
Yet even with the best monitoring, unknown unknowns will persist. The LZ collaboration now knows their detector better than they did before the drift was discovered, but the knowledge came at a cost. The next generation of experiments will benefit from that hard-won experience, but they will almost certainly discover new failure modes of their own. The search for dark matter is a search for the unknown, and the unknown includes the behavior of the instruments we use to search. The only way forward is to keep looking, while continuously questioning what we think we know. The LZ case shows that progress often comes not from a breakthrough in theory, but from the painstaking process of understanding our own tools.