Maximum likelihood search results for Data Challenge 04, fixed bandpasses

  B. Racine

This posting is a rerun of DC4 with fixed bandpasses; it also cuts a few really bad fits that occurred for some of the 00 sky model simulations.


This posting summarizes results from analysis of CMB-S4 Data Challenge 04 using a BICEP/Keck-style parametrized foreground model.
There are a few additional changes compared to the DC02 analysis:
we fixed the lensing spectrum to the input one, whereas the previous posting used a lensing spectrum that differs from the input;
we zero out the theory spectrum below \(\ell\)=30, since that is what was done in the sims.

The current posting is an update over this previous DC04 posting, which we keep separate for comparison.
As in the previous posting, we now use power spectra that have been re-generated to fix a bug, as shown in this posting.

In section 1, we show the main results in the form of figures and histograms, as well as triangle plots including foreground parameters. These rely on new cuts and changes explained in the following sections.
In section 2, we introduce a major improvement for model 00 from the rejection of a few flawed simulations.
In section 3, we show how using the proper conventions for the bandpasses seems to give more correct constraints, especially on the foreground parameters.
These 2 changes concern models 0, 5 and 6.
In section 4, we report tables of r constraints for the different sky models, for different lensing residuals, with and without decorrelation in the ML search.

To-do: In the next run, we will probably release the priors on the foreground parameters.

Note about the model:
In this analysis, for each realization, we find the set of model parameters that maximizes the likelihood multiplied by priors on the dust and sync spectral index parameters (\(\beta_d\) and \(\beta_s\)). These priors are based on Planck data, so they are quite weak in comparison with CMB-S4 sensitivity. However, in principle, foreground models may violate them, potentially leading to biases (e.g. model 03, where the preferred value of \(\beta_d\) is outside the prior range; see Figure 2 below).
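To make the role of the priors concrete, here is a minimal sketch of the objective being optimized, assuming Gaussian priors with the centers and widths quoted later in this posting (1.6 ± 0.11 for \(\beta_d\), -3.1 ± 0.3 for \(\beta_s\)). The names `log_prior`, `neg_log_posterior` and `neg_log_like` are hypothetical and are not taken from the actual ML search code; the H-L likelihood itself is not reproduced here.

```python
# Gaussian prior centers and widths, as quoted in this posting.
PRIORS = {"beta_d": (1.6, 0.11), "beta_s": (-3.1, 0.3)}


def log_prior(params):
    """Sum of Gaussian log-priors on beta_d and beta_s (up to a constant)."""
    total = 0.0
    for name, (center, width) in PRIORS.items():
        total += -0.5 * ((params[name] - center) / width) ** 2
    return total


def neg_log_posterior(params, neg_log_like):
    """Quantity to minimize: -log(L) - log(prior).

    `neg_log_like` is a hypothetical callable that returns -log(L) for a
    dictionary of model parameters.
    """
    return neg_log_like(params) - log_prior(params)
```

With this form, a sky model whose preferred \(\beta_d\) sits one prior width away from 1.6 pays a penalty of 0.5 in log-likelihood units, which is small compared to the constraining power of the data but not strictly zero.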

The model includes the following parameters:

For the decorrelation model, we assume that the cross-spectrum of dust between frequencies \(\nu_1\) and \(\nu_2\) is reduced by a factor \(\exp\{\log(\Delta_d) \times [\log^2(\nu_1 / \nu_2) / \log^2(217 / 353)] \times f(\ell)\}\). For the \(\ell\) dependence we fix the scaling to take a linear form (pivot scale is \(\ell\)=80).
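The decorrelation factor above can be sketched as follows. This is an illustration only: it assumes that the linear scaling is \(f(\ell) = \ell / 80\), which the text does not state explicitly, and the function name `dust_decorrelation` is hypothetical.

```python
import numpy as np


def dust_decorrelation(nu1, nu2, ell, delta_d, ell_pivot=80.0):
    """Multiplicative reduction of the nu1 x nu2 dust cross-spectrum.

    Assumption: f(ell) = ell / ell_pivot for the linear-ell scaling.
    Frequencies are in GHz; delta_d is the correlation between 217 and
    353 GHz at the pivot multipole.
    """
    freq_scaling = np.log(nu1 / nu2) ** 2 / np.log(217.0 / 353.0) ** 2
    f_ell = np.asarray(ell, dtype=float) / ell_pivot
    return np.exp(np.log(delta_d) * freq_scaling * f_ell)
```

By construction, the factor equals \(\Delta_d\) for the 217x353 GHz cross-spectrum at \(\ell\)=80, equals 1 for any auto-spectrum, and equals 1 everywhere when \(\Delta_d = 1\).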

Note about the simulations:

1: Summary Plots

In Figure 1, we summarize the r results using the new L-cut and bandpass correction fix. The changes are quite minimal, except for a decrease of \(\sigma(r)\) for model 00, which ranges from 3% to 20% depending on the lensing residual level and decorrelation models, as reported in this table.
As we will see in section 3 below, the improvement is more important for the foreground parameters.

Figure 1: Summary of the results for different values of \(A_L\). In red, the analysis of the simulations with r=0.003; in green, the r=0 case. On the left, the case without decorrelation in the parametrization; on the right, with linear-\(\ell\) decorrelation. The outer error bars show the standard deviation \(\sigma\) of the \(N_{sims}\) simulations' ML results (\(N_{sims}\simeq 500\)), and the inner error bars show \(\sigma/\sqrt{N_{sims}}\).
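For reference, the two sets of error bars in Figure 1 correspond to the following computation. This is a minimal sketch; the function name is hypothetical, and the use of the unbiased estimator (`ddof=1`) is an assumption, since the posting does not say which estimator was used.

```python
import numpy as np


def summarize_ml_results(r_ml):
    """Return (mean, sigma, sigma / sqrt(N_sims)) for a set of ML r values."""
    r_ml = np.asarray(r_ml, dtype=float)
    n_sims = r_ml.size
    mean = r_ml.mean()
    sigma = r_ml.std(ddof=1)             # outer error bar (assumed ddof=1)
    sigma_mean = sigma / np.sqrt(n_sims)  # inner error bar
    return mean, sigma, sigma_mean
```

For \(N_{sims}\simeq 500\), the inner error bar is about 22 times smaller than the outer one, so a shift of the mean that looks negligible relative to \(\sigma\) can still be a statistically significant bias.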

Note that for model 4, \(\beta_d\) is in slight tension with the imposed prior: Gaussian centered at 1.6 with width 0.11. Similarly for \(\beta_s\) in model 6, (compared to the Gaussian centered at -3.1 with width 0.3).

Figure 2: Figure representing the ML parameters histograms for models 04.00 to 04.06, using the proper bandpass conventions, as well as taking into account the L-cut described in Section 2.

2: Flawed simulations removal

While analysing the distribution of the ML parameters for the 1000 "04.00" simulations, we found a few outliers. It seems that some realizations have a flawed foreground component. At least some of these bad realizations have missing synchrotron in 85/95GHz, as can be seen in the following pager link.
One way to filter out those realizations is to apply a cut based on the goodness of fit. Here we use the log-likelihood (H-L likelihood) value and detect outliers using the modified Z-score method (with a threshold of 5, see here).
In Figure 3, we show the 1000 -log(L) values for sky models 00 to 06. Only model 00 has obvious outliers.
We can also see that somehow the first 100 simulations of model 5 differ from the following 900. It seems to be due to a different decorrelation parameter value (see this figure).
The effect of these cuts can be seen in Figure 4. The main effect is a slight reduction of the error bars.
Caveat: note that this method relies on the fact that a flawed simulation results in a bad fit, which is not necessarily true.
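The cut described above can be sketched as follows, using the usual definition of the modified Z-score, \(M_i = 0.6745\,(x_i - \mathrm{median}(x)) / \mathrm{MAD}\). The function names are hypothetical, and flagging on \(|M_i|\) (a two-sided cut) is an assumption; the actual analysis may only cut on large -log(L).

```python
import numpy as np


def modified_z_scores(values):
    """Modified Z-scores based on the median and the median absolute deviation."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        raise ValueError("MAD is zero; the modified Z-score is undefined")
    return 0.6745 * (values - median) / mad


def flag_outliers(neg_log_like, threshold=5.0):
    """Boolean mask of realizations whose -log(L) is an outlier (two-sided)."""
    return np.abs(modified_z_scores(neg_log_like)) > threshold
```

Because both the median and the MAD are insensitive to a handful of extreme values, the threshold is not pulled upward by the very outliers it is meant to detect, unlike a cut based on the sample mean and standard deviation.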

Figure 3: L=-log(Likelihood) for each sky model, for different lensing residuals, and with or without modeling decorrelation. The red discs show the detected outliers.

3: Bandpass conventions

We also found that in the previous XX.00 runs, the level of foregrounds was significantly off in the ML search, as we see in Figure 2, for example in this case. The mean of the recovered ML amplitude parameter is shifted compared to the fiducial model used in the simulations by 20 to 30 \(\sigma\).
We realized that there was a mismatch in the bandpass conventions used in the Maximum Likelihood search. As is stated on this definition page:
"What a tophat bandpass actually means is a little tricky and discussed in Bandpass Convention - What does flat mean. Models 00/05/06/09 use nu^-2 in spectral radiance (SR) units, while models 01/02/03 use nu^0 in SR units. (What 04 uses is not known.) "
Until now, the \(\nu^0\) convention was used for all models.
We ran some new ML searches where we updated to the proper convention for models 0, 5 and 6.
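To illustrate why the convention matters, the sketch below band-averages a foreground SED over a top-hat band with a weight proportional to \(\nu^k\), where \(k = 0\) or \(k = -2\). This is a toy illustration under stated assumptions, not the bandpass integration used in the ML search: the function name, the band edges and the SED in the usage note are hypothetical, and unit conversions are ignored.

```python
import numpy as np


def band_average(sed, nu_lo, nu_hi, weight_index, n_samples=2001):
    """Average sed(nu) over a top-hat band with weight nu**weight_index.

    On a uniform frequency grid, the ratio of sums approximates the ratio
    of the corresponding integrals.
    """
    nu = np.linspace(nu_lo, nu_hi, n_samples)
    weights = nu ** weight_index
    return np.sum(weights * sed(nu)) / np.sum(weights)
```

For example, with a hypothetical rising SED `sed = lambda nu: (nu / 353.0) ** 3.6` and a hypothetical band from 190 to 250 GHz, `band_average(sed, 190.0, 250.0, -2)` is lower than `band_average(sed, 190.0, 250.0, 0)`, because the \(\nu^{-2}\) weight down-weights the upper edge of the band. Using the wrong convention in the model therefore shifts the predicted foreground level in each band, which is then absorbed by the fitted amplitudes.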

Comments on the figure:

Figure 4: Figure representing the ML parameters histograms for model 04.00, for different bandpass conventions, as well as taking into account the L-cut described in Section 2.

4: DC4 results table

00: Gaussian foregrounds

The mean values and standard deviations of \(r\) for simulations with simple Gaussian foregrounds are summarized in Table 00. With a 10% lensing residual, we don't quite achieve \(\sigma(r) = 5 \times 10^{-4}\) for sims with \(r = 0\).

Turning on dust decorrelation in the model doesn't cause any bias in \(r\) and the recovered \(\Delta_d\) values are centered around 1 (i.e. analysis recovers zero decorrelation). Adding this parameter does increase \(\sigma(r)\) somewhat.

Table 00:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with simple Gaussian foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
linear | -0.787±2.554 | -0.211±1.003 | -0.047±0.565 | -0.009±0.404
Input \(r\) = 0.003
none | 2.102±2.618 | 2.675±1.083 | 2.868±0.618 | 2.960±0.445

When comparing these results to the old posting, we get an improvement on \(\sigma(r)\) due to the outlier rejection. We report this decrease, in percent, in Table 00bis.

Table 00bis:
Decrease in \(\sigma(r)\) due to the new loglike cut (the new bandpass convention has negligible effect). We report \(100 \times (\sigma(r)_{new} - \sigma(r)_{old}) / \sigma(r)_{old}\).
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
linear | -8.8% | -11.6% | -7.3% | -3%
Input \(r\) = 0.003
linear | -5.6% | -12% | -13% | -10.6%

Table 0ter:
Fiducial model parameter values.
r | \(A_d\) | \(\beta_d\) | \(A_s\) | \(\beta_s\) | \(\alpha_d\) | \(\alpha_s\) | \(\epsilon\) | \(\Delta_d\)
0/0.003 | 4.25 \(\mu K^2\) | 1.6 | 3.8 \(\mu K^2\) | -3.1 | -0.4 | -0.6 | 0 | 0

01: PySM a1d1s1f1

As has been previously noted, dust power is much higher in this model (\(A_d \sim 12.5 \mu K^2\)) than for the Gaussian foreground sims (\(A_d = 4.25 \mu K^2\)). The PySM d1 dust model does feature a spatially varying spectral index, but we don't find any detectable decorrelation in this analysis. The PySM s1 synchrotron model yields \(A_s \sim 0.5 \mu K^2\) and there is \(\sim 6\)% correlation between dust and sync.

Note here that the value of \(\sigma(r)\) doesn't change much compared to the 04.00 case despite having a higher dust level. This is probably due to the fact that we only have one realization of the foreground sky (see note in the introduction), thus no impact from cosmic variance.

Table 01:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with PySM a1d1f1s1 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 0.225±2.522 | 0.593±0.931 | 0.590±0.471 | 0.526±0.306
linear | 0.089±2.638 | 0.494±1.100 | 0.479±0.638 | 0.387±0.445
Input \(r\) = 0.003
none | 3.139±2.609 | 3.524±1.094 | 3.569±0.648 | 3.563±0.483

02: PySM a2d4f1s3

The d4 version of PySM dust adds a second dust component (with different blackbody temperature and emissivity power law) based on Meisner & Finkbeiner (2014). Not sure what type of \(\beta_d\) spatial variations are included in this model, but Colin thinks it is more or less the same as for d1. The s3 synchrotron model adds curvature to the synchrotron spectral index: \(\beta_s \rightarrow \beta_s + C \ln (\nu / \nu_C)\). The a2 AME model uses a 2% polarization fraction for AME, which seems very high, but there is no attempt to model AME in this analysis.

Results for this model show that \(A_d\) is even larger (\(\sim 32.5 \mu K^2\)) than for the d1 dust model. The mean value of \(\beta_d\) decreases from 1.59 (for the PySM d1 model) to 1.55, which is probably a sign of the two-component dust. The mean value of \(\beta_s\) decreases from -3.05 (for the PySM s1 model) to -3.13, which is probably due to synchrotron spectral curvature (and perhaps polarized AME?). Dust-sync correlation is higher, at \(\sim 10\)%, which could be from polarized AME.

Note here that the value of \(\sigma(r)\) doesn't change much compared to the 04.00 case despite having a much higher dust level. This is probably due to the fact that we only have one realization of the foreground sky (see note in the introduction), thus no impact from cosmic variance.

Table 02:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with PySM a2d4f1s3 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | -0.230±2.548 | 0.210±0.954 | 0.301±0.478 | 0.316±0.300
linear | -0.542±2.716 | -0.001±1.156 | 0.154±0.659 | 0.177±0.445
Input \(r\) = 0.003
none | 2.603±2.538 | 3.142±1.078 | 3.281±0.645 | 3.335±0.480
linear | 2.348±2.660 | 2.981±1.235 | 3.162±0.795 | 3.196±0.613

03: PySM a2d7f1s3

The next PySM version uses the Hensley/Draine dust model, which has additional complexity in the dust SED (perhaps described in arXiv:1709.07897?). The level of dust power is similar to sky model 01 (PySM d1 model), but we find that the emissivity power law is even flatter than the last case, with \(\beta_d \sim 1.44\).

The recovered means seem quite wacky, and \(A_L\) dependent.

Note here that the value of \(\sigma(r)\) doesn't change much compared to the 04.00 case despite having a higher dust level. This is probably due to the fact that we only have one realization of the foreground sky (see note in the introduction), thus no impact from cosmic variance.

Table 03:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with PySM a2d7f1s3 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | -0.073±2.510 | 0.431±0.921 | 0.512±0.472 | 0.493±0.313
Input \(r\) = 0.003
none | 3.102±2.630 | 3.484±1.111 | 3.544±0.663 | 3.542±0.498
linear | 2.232±2.748 | 2.826±1.316 | 3.061±0.872 | 3.135±0.680

04: Ghosh dust model

The Ghosh dust model (described here) is based on GASS HI data with a model for the Galactic magnetic field. For these sims, it is combined with the PySM a2, f1, and s3 components (same as the two previous models).

The analysis of this model yields still smaller values of \(\beta_d\), around 1.3 to 1.4. Dust-sync correlation is still present, but smaller (2–3%), which is probably due to the fact that the Ghosh dust sims don't know anything about the PySM synchrotron or AME components. The fact that they are correlated at all probably happens because both models are based on data at larger scales.

Dust decorrelation is small in absolute terms, but detected at high significance. Using a model without dust decorrelation leads to a large positive bias on \(r\), in the range \((4\) to \(5) \times 10^{-3}\). Dust decorrelation with linear \(\ell\) scaling produces the smallest biases, but they are still quite large compared to other sky models.

Table 04:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with Ghosh dust model plus PySM a2f1s3 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 2.177±2.644 | 3.196±1.189 | 4.105±0.778 | 5.251±0.660
Input \(r\) = 0.003
none | 5.596±3.038 | 6.657±1.432 | 7.837±0.994 | 9.371±0.903

05: Gaussian decorrelated dust model

This model has extremely large dust decorrelation (15% between 217 and 353 GHz at \(\ell\) = 80) and it exactly follows the assumed functional form of decorrelation with linear \(\ell\) scaling, so we can still draw some useful conclusions.

When we choose decorrelation with linear \(\ell\) scaling to match the sims, then we find no bias on \(r\) and recover \(\Delta_d\) = 0.85.

An important point to note from this model is that, even for the unbiased case where the decorrelation is correctly modeled in both \(\nu\) and \(\ell\), we find \(\sigma(r) \sim 1.4 \times 10^{-3}\), much larger than the target sensitivity of CMB-S4. This shows that, for extreme levels of foreground decorrelation, we lose the ability to clean foregrounds from the maps because the foreground modes are significantly independent between the various CMB-S4 frequencies. Regardless of whether you are doing map-based cleaning or fitting the power spectra as we do here, the only way to improve sensitivity would be to use more observing bands that are more closely spaced. It also makes the point that our Fisher forecasts should assume some non-zero level of decorrelation. Adding decorrelation as a free parameter to a forecast that assumes \(\Delta_d = 1\) only captures part of the statistical penalty.

Table 05:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 50 realizations with Gaussian decorrelated dust foreground.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
Input \(r\) = 0.003
linear | 1.926±3.346 | 2.632±1.826 | 2.908±1.412 | 3.027±1.284

06: Flauger MHD foregrounds

Our understanding is that this model uses MHD simulations to consistently model polarized dust and synchrotron in the Galactic magnetic field. This makes it quite interesting that this analysis finds negative dust-sync correlation with \(\epsilon \sim -0.36\). The dust power is similar to the Gaussian sims, and \(\beta_d\) matches the Planck value of 1.59. This analysis finds a synchrotron SED power law that is much flatter than usual, \(\beta_s \sim -2.6\), which is inconsistent with the prior at about \(1.5 \sigma\).

This model does not show any significant dust decorrelation. In general, the results for this model look nearly as good as the simple Gaussian foregrounds (sky model 00).

Table 06:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with Flauger MHD foreground model.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | -0.748±2.398 | -0.191±0.916 | -0.035±0.484 | 0.025±0.321
linear | -0.546±2.492 | 0.018±1.071 | 0.137±0.654 | 0.134±0.476
Input \(r\) = 0.003
none | 2.406±2.938 | 2.883±1.239 | 3.023±0.708 | 3.081±0.500

Appendix A: Table Results for CDT report

As can be seen in Figure 4, we seem to have biases even in the case of the Gaussian foreground simulations, mostly for the foreground parameters. In this posting, these biases have been reduced.

Just as for the CDT report, we remove this "algorithmic bias" to focus on the bias produced by the different dust simulations. We also chose to report results using the linear \(\ell\) dependence for the decorrelation model (see the caption of Table 07). As we have seen, the "algorithmic bias" on r is now basically irrelevant in this case. Note that we also improved the biases on the foreground parameters.

Table 07:
Bias on \(r\), obtained by subtracting the "algorithmic bias" from the mean of \(r\) for each of the 6 complex foreground models, for the case \(A_L\) = 0.1, assuming linear decorrelation for the Gaussian model. The "algorithmic bias" is estimated from the Gaussian simulations: (-0.047±0.565)\(\times 10^{-3}\) in the r=0 case, and (2.954±0.811)\(\times 10^{-3}\) in the r=0.003 case (see Table 00). For this Gaussian foreground case, we report a bias based on the absolute value of the sample variance on the mean for \(\simeq 500\) sims, which acknowledges that statistical limitations exist even for closed-loop tests calibrated by MC sims.
Sky model | \(r\) bias \(\times 10^4\) | \(\sigma(r) \times 10^4\)
Input \(r\) = 0
04.00 | 0.3 | 5.6
04.01 | 5.3 | 6.4
04.02 | 2.0 | 6.6
04.03 | 0.7 | 6.7
04.04 | -5.8 | 8.1
04.05 | -0.3 | 12.8
04.06 | 1.8 | 6.5
Input \(r\) = 0.003
04.00 | 0.4 | 8.1
04.01 | 5.5 | 8.5
04.02 | 2.1 | 8.0
04.03 | 1.1 | 8.7
04.04 | -4.0 | 10.3
04.05 | -0.5 | 14.1
04.06 | 2.6 | 8.6
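
As a cross-check of how the entries of Table 07 follow from the earlier tables, here is a minimal sketch of the arithmetic described in the caption. The function names are hypothetical; the input numbers are the \(A_L\) = 0.1, linear-decorrelation values quoted in Table 00, Table 01 and the caption of Table 07, and \(N_{sims}\) is taken to be exactly 500, which is an approximation.

```python
import math

N_SIMS = 500  # approximate number of realizations per set


def bias_e4(mean_r_e3, algorithmic_bias_e3):
    """Bias on r in units of 1e-4, from means quoted in units of 1e-3."""
    return 10.0 * (mean_r_e3 - algorithmic_bias_e3)


def error_on_mean_e4(sigma_r_e3, n_sims=N_SIMS):
    """Error on the mean of r in units of 1e-4, from sigma(r) in units of 1e-3."""
    return 10.0 * sigma_r_e3 / math.sqrt(n_sims)


# Sky model 04.01, input r = 0: mean from Table 01 minus the Gaussian
# "algorithmic bias" from Table 00.
bias_0401 = bias_e4(0.479, -0.047)

# Sky model 04.00, input r = 0: the reported "bias" is the error on the mean.
bias_0400 = error_on_mean_e4(0.565)
```

Rounded to one decimal place, `bias_0401` and `bias_0400` give 5.3 and 0.3, matching the first two rows of Table 07 for input \(r\) = 0.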