Maximum likelihood search results for Data Challenge 04, models 0 to 9

  B. Racine


New round of selection and changes in priors. We now look at the effect of progressively releasing the priors on parameters, and we also impose a new cut on the bad simulations (outliers in map domain).
Tables and default plots now show a case with only generous flat priors on all parameters. Other tables with different priors are linked.
Results including the Chile (04b) and Pole (04c) masks are shown in this separate posting.


Introduction

This posting summarizes results from analysis of CMB-S4 Data Challenge 04 using a BICEP/Keck-style parametrized foreground model.
There are a few additional changes compared to the DC02 analysis:
we fixed the lensing spectrum to the input one (the previous posting used a lensing spectrum that differed from the input),
we zero out the theory spectrum below \(\ell\)=30, since that is what was done in the sims.

The current posting is a slight update over this previous DC04 posting, where we introduced models 7, 8 and 9.

In section 1, we show the main results in the form of figures and histograms including foreground parameters.
In section 2, we isolate the effect of cutting extra simulations based on an outlier detection at the map level.
In section 3, we report tables of r constraints for the different sky models, for different lensing residuals, with and without decorrelation in the ML search.

Note about the model:
In the former analyses, for each realization, we found the set of model parameters that maximizes the likelihood multiplied by priors on the dust and synchrotron spectral index parameters (\(\beta_d\) and \(\beta_s\)). These priors are based on Planck data, so they are quite weak in comparison with CMB-S4 sensitivity. However, in principle foreground models may violate them, potentially leading to biases (e.g. model 03, where the preferred value of \(\beta_d\) is outside the prior range; see Figure 2 below).
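As a minimal sketch of what "likelihood multiplied by priors" means in practice, the snippet below adds a Gaussian prior penalty to a given value of -log(Likelihood). The function name, the dictionary layout and the prior centres and widths are hypothetical, chosen only for illustration; the posting does not quote the actual prior values. Releasing a prior simply corresponds to dropping the parameter from the prior dictionary.

```python
def neg_log_posterior(neg_log_like, params, gaussian_priors):
    """Return -log(likelihood x priors), up to an additive constant.

    neg_log_like    : value of -log(Likelihood) evaluated at `params`
    params          : dict mapping parameter name -> value
    gaussian_priors : dict mapping parameter name -> (mean, sigma);
                      parameters absent from this dict keep a flat prior
    """
    penalty = 0.0
    for name, (mean, sigma) in gaussian_priors.items():
        penalty += 0.5 * ((params[name] - mean) / sigma) ** 2
    return neg_log_like + penalty


# Hypothetical prior centres and widths, for illustration only.
priors = {"beta_d": (1.6, 0.1), "beta_s": (-3.1, 0.3)}

# A point with beta_d far from the assumed prior centre is penalized ...
point = {"beta_d": 1.44, "beta_s": -3.1}
with_priors = neg_log_posterior(100.0, point, priors)

# ... unless the priors are released (empty dict = flat priors everywhere).
without_priors = neg_log_posterior(100.0, point, {})
```

With the priors in place, a sky model whose preferred \(\beta_d\) sits away from the prior centre gets pulled back towards it, which is the mechanism that can bias the other parameters.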
In the current analysis, we remove all the remaining parameter priors step-by-step:

The model includes the following parameters:

For the decorrelation model, we assume that the cross-spectrum of dust between frequencies \(\nu_1\) and \(\nu_2\) is reduced by a factor \(\exp\{\log(\Delta_d) \times [\log^2(\nu_1 / \nu_2) / \log^2(217 / 353)] \times f(\ell)\}\). For the \(\ell\) dependence, we fix the scaling \(f(\ell)\) to take a linear form (the pivot scale is \(\ell\)=80).
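The decorrelation factor above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the linear scaling is assumed to be \(f(\ell) = \ell / 80\), which is one natural reading of "linear form with pivot scale \(\ell\)=80" but is not spelled out in the posting.

```python
import math


def dust_decorrelation_factor(nu1, nu2, ell, delta_d, ell_pivot=80.0):
    """Factor multiplying the dust cross-spectrum between nu1 and nu2 (GHz).

    Implements exp{ log(delta_d) * [log^2(nu1/nu2) / log^2(217/353)] * f(ell) }
    with the ASSUMED linear scaling f(ell) = ell / ell_pivot.
    """
    freq_scaling = math.log(nu1 / nu2) ** 2 / math.log(217.0 / 353.0) ** 2
    ell_scaling = ell / ell_pivot  # assumption: linear in ell, equal to 1 at the pivot
    return math.exp(math.log(delta_d) * freq_scaling * ell_scaling)


# By construction, the factor equals delta_d for 217x353 GHz at ell = 80,
# and equals 1 (no decorrelation) when delta_d = 1 or when nu1 == nu2.
at_pivot = dust_decorrelation_factor(217.0, 353.0, 80.0, 0.85)
no_decorr = dust_decorrelation_factor(95.0, 150.0, 80.0, 1.0)
same_band = dust_decorrelation_factor(150.0, 150.0, 200.0, 0.85)
```

In this parametrization \(\Delta_d = 1\) means no decorrelation, and smaller values mean stronger decorrelation that grows with both frequency separation and \(\ell\).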

Note about the simulations: (see section 3 for more detail.)

1: Summary Plots

In Figure 1, we summarize the r results, as well as the L=-log(Likelihood) values for different priors imposed on the \(\beta\) and \(A_L\) parameters.
Some of these models produce strong biases, especially model 8, which still has a very significant bias even when we take decorrelation into account in the model (see the discussion in section 3, above Table 08). Using L=-log(Likelihood) alone, we could barely detect that this model is a bad fit. Here are temporary plots showing the likelihood distribution. Model 9 has a weaker bias, but its maximum likelihood distribution is offset from that of model 00.

Comments on the figure:

Figure 1: Summary of the results for different values of \(A_L\), in red, the analysis of the simulations with r=0.003, in green, the r=0 case. On the left, the case without decorrelation in the parametrization, on the right, with linear-\(\ell\) decorrelation. The outer error bars show the standard deviation \(\sigma\) of the \(N_{sims}\) simulations' ML results ( \(N_{sims}\simeq 500\)), and the inner error bars show \(\sigma/\sqrt{N_{sims}}\). We report both the r values, as well as L=-log(Likelihood).

Comments on the figure:

Figure 2: Figure representing the ML parameters histograms for models 04.00 to 04.09.
In the histograms, we report the mean of the distribution as a red line, and the fiducial input value as a black line (if defined, see this table for 04.00). We also report the mean±standard deviation of the distribution.
For models 00 and 07, we also report the biases in terms of the \(\sigma\) of the mean, i.e. (mean(\(\theta_{ML}\)) - \(\theta_{input}\)) / (std(\(\theta_{ML}\)) / \(\sqrt{N_{sims}}\)).
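The bias in units of the \(\sigma\) of the mean, i.e. the offset of the mean ML value from the input divided by std/\(\sqrt{N_{sims}}\), can be computed as in the sketch below. The function name is hypothetical, and the use of the sample standard deviation (N-1 normalization) is an assumption; with a few hundred sims the choice makes no practical difference.

```python
import math
from statistics import mean, stdev


def bias_in_sigma_of_mean(theta_ml, theta_input):
    """(mean(theta_ML) - theta_input) / (std(theta_ML) / sqrt(N_sims))."""
    n_sims = len(theta_ml)
    error_on_mean = stdev(theta_ml) / math.sqrt(n_sims)  # inner error bar
    return (mean(theta_ml) - theta_input) / error_on_mean


# Toy list of ML values (not real results), with an input value of 2.
toy_bias = bias_in_sigma_of_mean([1.0, 2.0, 3.0, 4.0, 5.0], 2.0)
```

The same quantity can be read off a table entry of the form mean±std by dividing the mean offset by std/\(\sqrt{N_{sims}}\).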

2: Bad simulations cut

In this previous posting, we introduced a likelihood cut to reject badly flawed simulations in models 00 and 07. Here we go a step further: instead of rejecting the realizations with a bad fit, we reject the ones that are outliers at the map level. Clem just made a posting about this, here.
Similarly, here, I computed the standard deviation (and the mean) of the input maps of model 00 and applied the same outlier detection algorithm I used for the likelihood outliers, based on the modified Z-score method (with a threshold of 5, see here). This method finds 5 more outliers; it is not yet clear why.
The list is here (in bold, the ones that were already detected by the likelihood cut; in italic, the ones that Clem's method didn't pick up):


Same comments as in Clem's posting: The complete list of bad realizations with zero-based indexing (same as the filenames) is above. These need to be screened out of all .00 re-analysis results so far, and also .07 since this is built from the same maps. Model .05 is built from the alms and is not affected, as we can see here. Before any future round of sims these maps will be regenerated to properly fix the problem.
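For reference, the modified Z-score outlier detection mentioned above can be sketched as follows. This is a generic implementation of the standard method (median and median absolute deviation, with the usual 0.6745 normalization), not the exact code used for this posting; the function names and the toy input values are hypothetical. The threshold of 5 follows the text.

```python
from statistics import median


def modified_z_scores(values):
    """Modified Z-score: 0.6745 * (x - median) / MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


def find_outliers(values, threshold=5.0):
    """Zero-based indices of entries with |modified Z-score| > threshold."""
    scores = modified_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Toy per-realization map standard deviations (not real data):
# the last realization is clearly discrepant.
toy_map_std = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 3.0]
bad_realizations = find_outliers(toy_map_std)  # -> [6]
```

Applying the same function to the per-realization mean and standard deviation of the input maps, and taking the union of the flagged indices, gives a map-level cut analogous to the one described here.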

Comments on the figures:

Figure 3: Summary of the results for different values of \(A_L\), in red, the analysis of the simulations with r=0.003, in green, the r=0 case. On the left, the case without decorrelation in the parametrization, on the right, with linear-\(\ell\) decorrelation. The outer error bars show the standard deviation \(\sigma\) of the \(N_{sims}\) simulations' ML results ( \(N_{sims}\simeq 500\)), and the inner error bars show \(\sigma/\sqrt{N_{sims}}\). We report both the r values, as well as L=-log(Likelihood).
Figure 4: Figure representing the ML parameters histograms for models 04.00 to 04.09.
In the histograms, we report the mean of the distribution as a red line, and the fiducial input value as a black line (if defined, see this table for 04.00). We also report the mean±standard deviation of the distribution.

3: DC4 results table

In the following tables, we report the \(r\) results for the case where all the parameters have generous flat priors.
For the other cases, see the tables in the following links:
Gaussian priors on \(\beta\)'s, fixed \(A_L\)
free \(\beta\)'s, fixed \(A_L\)
free \(\beta\)'s, Gaussian 5% prior on \(A_L\)
free \(\beta\)'s, free \(A_L\) (as below)

00: Gaussian foregrounds

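The mean values and standard deviations of \(r\) for simulations with simple Gaussian foregrounds are summarized in Table 00. With a 10% lensing residual (\(A_L\) = 0.1), sims with \(r = 0\) reach \(\sigma(r) = 4.8 \times 10^{-4}\) without decorrelation in the model, but only \(5.7 \times 10^{-4}\) with linear decorrelation, so we don't quite achieve \(\sigma(r) = 5 \times 10^{-4}\) in the latter case.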

Turning on dust decorrelation in the model doesn't cause any bias in \(r\) and the recovered \(\Delta_d\) values are centered around 1 (i.e. analysis recovers zero decorrelation). Adding this parameter does increase \(\sigma(r)\) somewhat.

Table 00:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with simple Gaussian foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 0.104±2.687 | 0.033±0.977 | 0.013±0.480 | 0.003±0.300
linear | 0.062±2.717 | 0.001±1.045 | -0.013±0.573 | -0.017±0.406
Input \(r\) = 0.003
none | 3.097±2.767 | 3.019±1.140 | 3.009±0.654 | 3.013±0.475
linear | 3.111±2.922 | 3.022±1.311 | 3.008±0.808 | 3.013±0.609

The fiducial parameter values used for this model are given in the following table.

Table 0ter:
Fiducial model parameter values.
\(r\) | \(A_d\) | \(\beta_d\) | \(A_s\) | \(\beta_s\) | \(\alpha_d\) | \(\alpha_s\) | \(\epsilon\) | \(\Delta_d\)
0/0.003 | 4.25 \(\mu K^2\) | 1.6 | 3.8 \(\mu K^2\) | -3.1 | -0.4 | -0.6 | 0 | 0

01: PySM a1d1f1s1

As has been previously noted, dust power is much higher in this model (\(A_d \sim 12.5 \mu K^2\)) than for the Gaussian foreground sims (\(A_d = 4.25 \mu K^2\)). The PySM d1 dust model does feature a spatially varying spectral index, but we don't find any detectable decorrelation in this analysis. The PySM s1 synchrotron model yields \(A_s \sim 0.5 \mu K^2\) and there is \(\sim 6\)% correlation between dust and sync.

Note here that the value of \(\sigma(r)\) doesn't change much compared to the 04.00 case despite having a higher dust level. This is probably due to the fact that we only have one realization of the foreground sky (see note in the introduction), thus no impact from cosmic variance.

Table 01:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with PySM a1d1f1s1 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 1.176±2.745 | 0.926±1.009 | 0.729±0.505 | 0.579±0.324
linear | 0.990±2.826 | 0.725±1.130 | 0.501±0.637 | 0.343±0.451
Input \(r\) = 0.003
none | 4.150±2.827 | 3.879±1.189 | 3.715±0.706 | 3.613±0.526
linear | 3.993±2.949 | 3.717±1.338 | 3.541±0.847 | 3.438±0.657

02: PySM a2d4f1s3

The d4 version of PySM dust adds a second dust component (with different blackbody temperature and emissivity power law) based on Meisner & Finkbeiner (2014). Not sure what type of \(\beta_d\) spatial variations are included in this model, but Colin thinks it is more or less the same as for d1. The s3 synchrotron model adds curvature to the synchrotron spectral index: \(\beta_s \rightarrow \beta_s + C \ln (\nu / \nu_C)\). The a2 AME model uses a 2% polarization fraction for AME, which seems very high, but there is no attempt to model AME in this analysis.
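To make the curvature term concrete, the sketch below evaluates the effective synchrotron index \(\beta_s + C \ln(\nu / \nu_C)\) at a few frequencies. The function name and the values of \(\beta_s\), \(C\), \(\nu_C\) and the frequency list are placeholders chosen for illustration; they are not taken from this posting or asserted to be the PySM s3 settings.

```python
import math


def curved_sync_index(beta_s, curvature, nu_ghz, nu_c_ghz):
    """Effective synchrotron index: beta_s + C * ln(nu / nu_C)."""
    return beta_s + curvature * math.log(nu_ghz / nu_c_ghz)


# Placeholder values, for illustration only.
BETA_S = -3.0
CURVATURE = -0.05   # a negative C steepens the spectrum above nu_C
NU_C_GHZ = 23.0

for nu in (23.0, 30.0, 40.0, 95.0):
    index = curved_sync_index(BETA_S, CURVATURE, nu, NU_C_GHZ)
    print(f"nu = {nu:5.1f} GHz  effective beta_s = {index:.3f}")
```

A single power-law fit to such a spectrum recovers an index that depends on which bands carry the weight, which is one way a curved input can shift the fitted mean \(\beta_s\).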

Results for this model show that \(A_d\) is even larger (\(\sim 32.5 \mu K^2\)) than for the d1 dust model. The mean value of \(\beta_d\) decreases from 1.59 (for PySM d1 model) to 1.55, which is probably a sign of the two component dust. The mean value of \(\beta_s\) decreases from -3.05 (for PySM s1 model) to -3.13, which is probably due to synchrotron spectral curvature (and perhaps polarized AME?). Dust–sync correlation is higher, at \(\sim 10\)%, which could be from polarized AME.

Note here that the value of \(\sigma(r)\) doesn't change much compared to the 04.00 case despite having a much higher dust level. This is probably due to the fact that we only have one realization of the foreground sky (see note in the introduction), thus no impact from cosmic variance.

Table 02:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with PySM a2d4f1s3 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 0.608±2.762 | 0.468±1.032 | 0.390±0.516 | 0.336±0.323
linear | 0.273±2.877 | 0.181±1.174 | 0.157±0.653 | 0.141±0.448
Input \(r\) = 0.003
none | 3.608±2.725 | 3.470±1.148 | 3.401±0.690 | 3.363±0.518
linear | 3.307±2.783 | 3.206±1.243 | 3.176±0.788 | 3.165±0.611

03: PySM a2d7f1s3

The next PySM version uses the Hensley/Draine dust model, which has additional complexity in the dust SED (perhaps described in arXiv:1709.07897?). The level of dust power is similar to sky model 01 (PySM d1 model), but we find that the emissivity power law is even flatter than the last case, with \(\beta_d \sim 1.44\).

The recovered means seem quite wacky, and \(A_L\) dependent.

Note here that the value of \(\sigma(r)\) doesn't change much compared to the 04.00 case despite having a higher dust level. This is probably due to the fact that we only have one realization of the foreground sky (see note in the introduction), thus no impact from cosmic variance.

Table 03:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with PySM a2d7f1s3 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 0.872±2.726 | 0.754±1.004 | 0.639±0.515 | 0.534±0.340
linear | -0.019±2.847 | -0.015±1.162 | -0.001±0.671 | -0.011±0.482
Input \(r\) = 0.003
none | 4.180±2.817 | 3.871±1.181 | 3.702±0.703 | 3.593±0.527
linear | 3.309±2.890 | 3.104±1.327 | 3.049±0.873 | 3.025±0.696

04: Ghosh dust model

The Ghosh dust model (described here) is based on GASS HI data with a model for the Galactic magnetic field. For these sims, it is combined with the PySM a2, f1, and s3 components (same as the two previous models).

The analysis of this model yields still smaller values of \(\beta_d \sim 1.3-1.4\). Dust-sync correlation is still present, but smaller (2–3%), which is probably due to the fact that the Ghosh dust sims don't know anything about the PySM synchrotron or AME components. The fact that they are correlated at all is probably because both models are based on data at larger scales.

Dust decorrelation is small in absolute terms, but detected at high significance. Using a model without dust decorrelation leads to a large positive bias on \(r\) in the range \(1.4-1.8 \times 10^{-3}\) (see Table 04). Dust decorrelation with linear \(\ell\) scaling produces the smallest biases, but still quite large compared to other sky models.

Table 04:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with Ghosh dust model plus PySM a2f1s3 foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 1.410±2.870 | 1.491±1.220 | 1.527±0.721 | 1.536±0.521
linear | -1.460±2.951 | -1.110±1.324 | -0.961±0.835 | -0.861±0.651
Input \(r\) = 0.003
none | 4.814±3.160 | 4.757±1.455 | 4.783±0.942 | 4.798±0.755
linear | 1.744±3.181 | 1.976±1.532 | 2.118±1.036 | 2.221±0.851

05: Gaussian decorrelated dust model

This model has extremely large dust decorrelation (15% between 217 and 353 GHz at \(\ell\) = 80) and it exactly follows the assumed functional form of decorrelation with linear \(\ell\) scaling, so we can still draw some useful conclusions.

When we choose decorrelation with linear \(\ell\) scaling to match the sims, then we find no bias on \(r\) and recover \(\Delta_d\) = 0.85.

An important point to note from this model is that, even for the unbiased case where the decorrelation is correctly modeled in both \(\nu\) and \(\ell\), we find \(\sigma(r) \sim 1.4 \times 10^{-3}\), much larger than the target sensitivity of CMB-S4. This shows that, for extreme levels of foreground decorrelation, we lose the ability to clean foregrounds from the maps because the foreground modes are significantly independent between the various CMB-S4 frequencies. Regardless of whether you are doing map-based cleaning or fitting the power spectra as we do here, the only way to improve sensitivity would be to use more observing bands that are more closely spaced. It also makes the point that our Fisher forecasts should assume some non-zero level of decorrelation. Adding decorrelation as a free parameter to a forecast that assumes \(\Delta_d = 1\) only captures part of the statistical penalty.

Table 05:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 50 realizations with Gaussian decorrelated dust foreground.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 34.336±6.863 | 26.576±4.369 | 23.450±2.698 | 23.144±2.513
linear | 0.182±3.665 | 0.036±2.013 | 0.006±1.511 | -0.013±1.318
Input \(r\) = 0.003
none | 37.838±7.308 | 30.288±4.521 | 26.711±2.790 | 26.610±2.574
linear | 3.020±3.835 | 3.029±2.186 | 3.054±1.689 | 3.047±1.524

06: Flauger MHD foregrounds

Our understanding is that this model uses MHD simulations to consistently model polarized dust and synchrotron in the Galactic magnetic field. This makes it quite interesting that this analysis finds negative dust-sync correlation with \(\epsilon \sim -0.36\). The dust power is similar to the Gaussian sims, and \(\beta_d\) matches the Planck value of 1.59. This analysis finds a synchrotron SED power law that is much flatter than usual, \(\beta_s \sim -2.6\), which is inconsistent with the prior at about \(1.5 \sigma\).

This model does not show any significant dust decorrelation. In general, the results for this model look nearly as good as the simple Gaussian foregrounds (sky model 00).

Table 06:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 500 realizations with Flauger MHD foreground model.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 0.166±2.655 | 0.091±1.016 | 0.059±0.537 | 0.045±0.351
linear | 0.300±2.704 | 0.211±1.120 | 0.159±0.661 | 0.129±0.477
Input \(r\) = 0.003
none | 3.311±3.064 | 3.178±1.290 | 3.126±0.741 | 3.098±0.527
linear | 3.456±3.110 | 3.313±1.386 | 3.244±0.859 | 3.203±0.655

07: Amplitude modulated Gaussian foregrounds.

This model is described here. It is a modified version of model 00, where the brightness of the dust varies across the sky. It does not include any decorrelation. It was mostly developed to study the effect of mask variations, shown in this separate posting.

Table 07:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 150 realizations with amplitude modulated Gaussian foregrounds.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | -0.005±2.653 | 0.005±0.961 | 0.003±0.466 | -0.003±0.287
linear | -0.093±2.725 | -0.046±1.081 | -0.028±0.598 | -0.024±0.417
Input \(r\) = 0.003
none | 3.318±2.718 | 3.103±1.169 | 3.040±0.707 | 3.016±0.525
linear | 3.407±2.863 | 3.156±1.353 | 3.073±0.888 | 3.040±0.698

08: MKD model (3D multi-layer).

Model 8 has been developed by Martinez-Solaeche, Karakci and Delabrouille (see this paper). As described in their abstract, this is a "three-dimensional model of polarised galactic dust emission that takes into account the variation of the dust density, spectral index and temperature along the line of sight, and contains randomly generated small scale polarisation fluctuations. The model is constrained to match observed dust emission on large scales, and match on smaller scales extrapolations of observed intensity and polarisation power spectra." It is based on a multi-layer model where \(T_d\), \(\beta_d\) and the optical depth \(\tau\) are defined in each layer, constrained by Planck, IRAS, and some 3D dust extinction maps. A simple model of the galactic magnetic field is used to generate the large scale polarization. For the small scales, Gaussian random I, E and B fields based on Planck observed power spectra are generated in each layer. The emission is then extrapolated to different frequencies, based on random realizations of \(\beta_d\) and \(T_d\) in the different layers, which define the dust SED.

This model naturally produces dust decorrelation, due to a varying SED on the sky. It is also expected to produce a flattening at low frequency, as is briefly reported in figure 19 of the paper. This might explain the large bias we observe on r, which is reduced but still present at high significance when including a decorrelation parameter.

Table 08:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 150 realizations with the multi-layer MKD foreground model.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 6.391±2.942 | 5.310±1.180 | 4.399±0.661 | 3.687±0.457
linear | 4.446±3.033 | 3.428±1.333 | 2.612±0.844 | 2.015±0.648
Input \(r\) = 0.003
none | 9.271±2.880 | 8.130±1.366 | 7.288±0.893 | 6.675±0.688
linear | 7.486±2.981 | 6.441±1.547 | 5.753±1.097 | 5.286±0.894

09: Vansyngel model.

Model 9 has been developed by Vansyngel et al. (see this paper, and this posting). In this model, each layer has the same intensity (constrained by the Planck intensity map), but different magnetic field realizations. It produces Q and U maps by integrating along the LOS over these multiple layers of magnetic fields. This magnetic field, contrary to the previous model, is simulated down to small turbulent scales, which produces more physically motivated non-Gaussian fluctuations in the maps (down to small scales). These maps are then linearly rescaled to match the TE correlation from Planck and the E-B asymmetry. (Note that in the map we study here, there is no TE correlation, see here.)

This model naturally produces non-Gaussian dust patterns, but the decorrelation is ad hoc, via an extrapolation to different frequencies using a pixel-dependent modified blackbody emission law. It is much stronger than in, say, model 01, which also has such an extrapolation. I think this might be due to the fact that PySM uses \(\beta_d\) and \(T_d\) maps from \(\texttt{Commander}\), whereas Flavien uses the same recipe as the FFP sims described here, which use the GNILC maps. Flavien made this plot, using only 1% of the pixels of the maps (note that the reduction of the spread might be due to the resolution of the Commander map, which might have been smoothed?).

Note that in the current results, the bandwidth at 20GHz used in the ML search was the usual one used for other models (5GHz), whereas the simulations have been generated with a width of 6GHz.

Table 09:
Mean \(r \times 10^3\) and \(\sigma(r) \times 10^3\) from sets of 150 realizations with Vansyngel foreground model.
Decorrelation model | \(A_L\) = 1 | \(A_L\) = 0.3 | \(A_L\) = 0.1 | \(A_L\) = 0.03
Input \(r\) = 0
none | 14.917±2.238 | 13.291±1.191 | 12.503±0.857 | 12.165±0.728
linear | 8.009±2.432 | 6.542±1.394 | 5.999±1.078 | 5.816±0.969
Input \(r\) = 0.003
none | 18.046±2.138 | 16.414±1.273 | 15.601±0.969 | 15.241±0.847
linear | 11.119±2.493 | 9.648±1.588 | 9.080±1.244 | 8.874±1.098

Appendix A: Table Results for CDT report/forecasting paper.

As can be seen in figure 4 of the last posting, we still have biases even in the case of the Gaussian foreground simulations, mostly for the foreground parameters.

Just as for the CDT report, we remove this "algorithmic bias" to focus on the bias produced by the different dust simulations. We also chose to report results using the linear \(\ell\) dependence for the decorrelation model. See the caption of Table 10. As we have seen, the "algorithmic bias" on r is now basically irrelevant in this case. Note that the biases on the foreground parameters have also been reduced.

The other cases, with different priors on the parameters, are available in the links at the beginning of section 3.

Table 10:
Bias on \(r\), obtained by subtracting the mean of model 00 (the now negligible algorithmic bias, see Table 00) from the mean of each of the other foreground models, for the case \(A_L\) = 0.1, assuming either no decorrelation or linear decorrelation in the model. For the Gaussian foreground case (04.00), we instead report a bias based on the absolute value of the sample variance on the mean for \(\simeq 500\) sims, which acknowledges that statistical limitations exist even for closed-loop tests calibrated by MC sims.
Input \(r\) = 0
Model | No decorr: \(r\) bias \(\times 10^4\) | No decorr: \(\sigma(r) \times 10^4\) | Linear decorr: \(r\) bias \(\times 10^4\) | Linear decorr: \(\sigma(r) \times 10^4\)
04.00 | 0.2 | 6.5 | 0.7 | 10.5
04.01 | 7.2 | 5.0 | 5.1 | 6.4
04.02 | 3.8 | 5.2 | 1.7 | 6.5
04.03 | 6.3 | 5.2 | 0.1 | 6.7
04.04 | 15.1 | 7.2 | -9.5 | 8.4
04.05 | 234.4 | 27.0 | 0.2 | 15.1
04.06 | 0.5 | 5.4 | 1.7 | 6.6
04.07 | -0.1 | 4.7 | -0.2 | 6.0
04.08 | 43.9 | 6.6 | 26.2 | 8.4
04.09 | 25.6 | 5.7 | -4.3 | 8.2

Input \(r\) = 0.003
Model | No decorr: \(r\) bias \(\times 10^4\) | No decorr: \(\sigma(r) \times 10^4\) | Linear decorr: \(r\) bias \(\times 10^4\) | Linear decorr: \(\sigma(r) \times 10^4\)
04.00 | 0.7 | 8.2 | 0.9 | 10.5
04.01 | 7.1 | 7.1 | 5.3 | 8.5
04.02 | 3.9 | 6.9 | 1.7 | 7.9
04.03 | 6.9 | 7.0 | 0.4 | 8.7
04.04 | 17.7 | 9.4 | -8.9 | 10.4
04.05 | 237.0 | 27.9 | 0.5 | 16.9
04.06 | 1.2 | 7.4 | 2.4 | 8.6
04.07 | 0.3 | 7.1 | 0.6 | 8.9
04.08 | 42.8 | 8.9 | 27.4 | 11.0
04.09 | 24.6 | 8.2 | -2.3 | 10.5
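As a worked example of how a Table 10 entry follows from the section 3 tables, the sketch below subtracts the model 00 mean from a model mean and converts from units of \(10^{-3}\) to \(10^{-4}\). The function names are hypothetical, and \(N_{sims} = 500\) is taken as exact for simplicity, whereas the posting quotes \(N_{sims} \simeq 500\). The input numbers are the \(r = 0\), no-decorrelation, \(A_L\) = 0.1 entries of Table 00 and Table 01.

```python
import math


def bias_x1e4(mean_r_x1e3, mean_r00_x1e3):
    """Bias on r in units of 1e-4, after subtracting the model 00 mean.

    Both inputs are mean r values in units of 1e-3, as in the section 3 tables.
    """
    return (mean_r_x1e3 - mean_r00_x1e3) * 10.0


def error_on_mean_x1e4(sigma_r_x1e3, n_sims):
    """Statistical error on the mean of r in units of 1e-4."""
    return sigma_r_x1e3 / math.sqrt(n_sims) * 10.0


# Model 04.01 versus model 04.00 (r = 0, no decorrelation, A_L = 0.1).
bias_01 = bias_x1e4(0.729, 0.013)

# Entry reported for 04.00 itself: error on the mean over ~500 sims.
floor_00 = error_on_mean_x1e4(0.480, 500)

print(round(bias_01, 1), round(floor_00, 1))  # prints: 7.2 0.2
```

These two values correspond to the no-decorrelation \(r\) bias entries for 04.01 and 04.00 in the \(r = 0\) block of the table.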