Fixing a mistake in bandpower binning in the CMB-S4 analysis code

— J. Willmert

Clem identified an error in how I was binning \(D_ℓ\) spectra into bandpowers in the CMB-S4 analysis code. Here I explain what the mistake was and how to fix it.


Description of problem

\[ \newcommand{\llo}{ℓ_{b,\mathrm{lo}}} \newcommand{\lhi}{ℓ_{b,\mathrm{hi}}} \newcommand{\dlsf}{\frac{ℓ(ℓ+1)}{2π}} \]

Calculating bandpower values in the standard pipeline from Fourier modes \(a_{uv}\) occurs in calcspec(). For standard usage which takes uniform weighting of all modes in the Fourier plane, this translates mathematically to

\begin{align} D_b &= \frac{1}{N_b} \sum_{\llo ≤ ℓ < \lhi} \dlsf \left| a_{uv} \right|^2 \qquad \text{where} \qquad \left\{ \begin{aligned} ℓ &= 2π \sqrt{u^2 + v^2} \\ N_b &= \text{number of modes in the sums} \end{aligned} \right. \end{align}

where a bandpower bin \(b\) is defined by the bin edges \(\llo\) and \(\lhi\). In the context of using s2hat for CMB-S4 analysis, we replace the Fourier transform with spherical harmonic transforms that provide harmonic modes \(a_{ℓm}\). We can then acknowledge that \(ℓ\) and \(m\) are integers and rewrite the expression above as

\begin{align} D_b &= \frac{1}{N_b} \sum_{ℓ=\llo}^{\lhi-1} \dlsf \sum_{m=-ℓ}^ℓ \left| a_{ℓm} \right|^2 \label{eqn:bandpower} \end{align}

In the BK pipeline, maps are transformed and binned directly to bandpowers. For a reason I don't recall (maybe just an arbitrary choice that persisted), in the CMB-S4 analysis, I instead set up all the processing to calculate unbinned power spectra. To feed into the rest of our standard analysis, a post-processing step is required to mainly package up the spectra into standard pipeline data structures and files, and it is at this step that the spectra are binned.

Clem found, though, that a mistake was made in what I had coded. While I did correctly scale \(D_ℓ = ℓ(ℓ+1) C_ℓ / 2π\) before binning, I incorrectly took an average of the \(D_ℓ\) values within each bandpower bin:

\begin{align} D_b ≠ \frac{1}{\lhi-\llo} \sum_{ℓ=\llo}^{\lhi-1} D_ℓ \end{align}

This incorrectly weights low \(ℓ\) spectra more than higher \(ℓ\). If instead you start from Eq. \(\ref{eqn:bandpower}\) and manipulate to an expression in terms of \(D_ℓ\), you find that there is a missing factor of \((2ℓ+1)\) to account for the number of modes in each \(D_ℓ\).

\begin{align} D_b = \frac{1}{N_b} \sum_{ℓ=\llo}^{\lhi-1} (2ℓ + 1) D_ℓ \qquad\text{where}\qquad N_b = \sum_{ℓ=\llo}^{\lhi-1} 2ℓ + 1 = \lhi^2 - \llo^2 \end{align}

In the course of fixing this, I also found a couple more code errors:

  1. There's an error in building an array of \(ℓ\) values (due to forgetting that a list from 0 to \(N\) has length \(N + 1\)). Why this didn't cause a runtime error, though, is still a mystery to me... The logical mask built from inequalities on this vector will have length one greater than the dimension of the matrix being operated upon, but Matlab accepted it without any errors.
    
    --- a/cl2bandpowers.m   2018-12-08 14:26:54.525093975 -0500
    +++ b/cl2bandpowers.m   2018-12-08 14:53:13.859042052 -0500
    @@ -22,7 +22,7 @@
     %
    
     nbins = length(binedges) - 1;
    -l = 0:size(Cl,1);
    +l = 0:(size(Cl,1)-1);
    
     nd = ndims(Cl);
     sz = size(Cl);
    
    Figure 2.1 will show that this had absolutely no impact on the computed results.
  2. There was also an off-by-one error wherein the highest \(ℓ\) to be included in the bin was excluded from the sum; the index of the upper bin edge was already decremented, but then a less than sign was also used.
    
    --- a/cl2bandpowers.m   2018-12-08 14:53:13.859042052 -0500
    +++ b/cl2bandpowers.m   2018-12-08 14:32:03.560430038 -0500
    @@ -36,7 +36,7 @@
       be = binedges(ii+1) - 1;
       bc(ii) = 0.5 * (bs + be);
       % incorrect to use straight mean here - see comments in calcspec.m
    -  bp(ii,colons{:}) = mean(Cl(bs<=l&l<be,colons{:}), 1);
    +  bp(ii,colons{:}) = mean(Cl(bs<=l&l<=be,colons{:}), 1);
     end
    
     end
    
    Interestingly, though, I found a similar style of mistake exists in calcspec():
    
    % loop over bins
    for i=1:nbin
      ind=find(x>=binedge(i)&x<=binedge(i+1));
    
    One of the two comparisons should be a strict inequality to avoid a mode on the bin edge from being double counted. In practice, this occurs rarely, though, since the \(ℓ\) values from a Fourier transform are generally not integers. An empirical test for our standard map padding shows only 3 modes (out of 65536) are double counted:
    
    >> m = get_map_defn('bicep');
    >> map.Tvar = sparse(m.ny, m.nx);
    >> p = 2^nextpow2(m.nx);
    >> [m,map] = pad_map(m, map, p);
    >> ad = calc_ad2([m.xdos,m.ydos], [m.nx,m.ny]);
    >> bins = get_bins('bicep_norm');
    >> ad.u_l = 2*pi * ad.u_r;
    >>
    >> idx = find(cvec(bsxfun(@eq, cvec(ad.u_l), rvec(bins))));
    >> [l,b] = ind2sub([numel(ad.u_l), numel(bins)], idx);
    >> [vi,ui] = ind2sub(size(ad.u_l), l);
    >>
    >> tmp.l  = ad.u_l(l);
    >> tmp.lx = 2*pi * ad.u_val{1}(ui)';
    >> tmp.ly = 2*pi * ad.u_val{2}(vi)';
    >> tmp.edge = b;
    >> struct2table(tmp)
    
    ans =
    
      3x4 table
    
        l     lx    ly     edge
        __    __    ___    ____
    
         0    0       0     1
        90    0     -90     4
        90    0      90     4
    
    By chance these all end up being in the poly/gs trench, so this fact ends up having no impact.
  3. Finally, we fix the weighting factors used to bin the spectra.
    
    --- a/cl2bandpowers.m   2018-12-08 14:32:03.560430038 -0500
    +++ b/cl2bandpowers.m   2018-12-08 17:50:11.497393603 -0500
    @@ -31,12 +31,17 @@
     bc = zeros([nbins 1]);
     bp = zeros([nbins sz(2:end)]);
    
    +% Weight each Cl by number of alm modes
    +w = cvec(2*l + 1);
    +Cl = bsxfun(@times, Cl, w);
    +
     for ii=1:nbins
       bs = binedges(ii);
       be = binedges(ii+1) - 1;
    +  mask = bs <= l & l <= be;
       bc(ii) = 0.5 * (bs + be);
    -  % incorrect to use straight mean here - see comments in calcspec.m
    -  bp(ii,colons{:}) = mean(Cl(bs<=l&l<=be,colons{:}), 1);
    +  bp(ii,colons{:}) = sum(Cl(mask,colons{:}), 1) ./ sum(w(mask));
    +  assert(sum(w(mask)) == (be+1)^2 - bs^2);
     end
    
     end
    

Comparison of BPWFs

To give some sense of how important each of the bugs mentioned above are, I took the 01.00 BPWFs (sernum 1722) and re-binned them with each fix added in incrementally. Figure 2.1 plots the TT, EE, and BB bandpower window functions for the 85GHz band.

TT to TT, EE to EE, and BB to BB bandpower window functions for the 3% sky apodization mask first generated for the CMB-S4 01.00 analysis.

Switching from the first to second option shows no change — I verified that the numbers are identical and not just too small to see on a plot. Moving to the third option, each bandpower gets ever so slightly wider, as we'd expect from adding in one additional \(D_ℓ\) to the sums. Finally, the last option shows that the lowest \(ℓ\) bandpowers are impacted the most. This is in line with the expectation since the fractional change in the \((2ℓ+1)\) weight factor across the bin changes more at low \(ℓ\) than at high \(ℓ\).


Linked previous postings

[20170224_cmbs4_dc1_final]
HEALPix-space pure-B estimators, part 3 (JBW)
[20170129_deltamap_bk_vs_s2hat]
HEALPix-space pure-B estimators, part 2 (JBW)
[20170123_ps2hat_bindings]
HEALPix-space pure-B estimators, part 1 (JBW)