[Olympus] RFC: Blinding and proposal for rules for presentations and analysis

Fri Aug 29 14:37:17 EDT 2014

Hi all,

We are getting close to real results, so I think it is time to raise the topic
of blinding again. I also have some proposals regarding forthcoming analyses
and presentations.

To not spring this on you during the meeting and to give you time to think
about these things, I wrote up what I have in mind (heavily numbered so
it's easy to refer to single statements):

Blinding:

1) Why do we need blinding:

1.1) To protect ourselves from accidental early release of results.
1.2) NOT to protect from deliberate early release of results.
1.3) To stop us from applying a personal bias to the results.

1.3 is the important point, but the continuing discussion about slides
in talks shows that 1.1 is valid too.

2) How do we blind:

2.1) We will not blind 12 degree data. That means we will have to
continue to be vigilant about not releasing results, and there is a
serious danger of biasing ourselves. Why do we believe that the 2-photon
effect vanishes at 12 deg? If the answer is "Because the theoreticians
say so," first, they don't, and even if they did, why do we measure
2-photon in the first place? If we believe that, why do we not believe
the rest of their predictions?

2.2) We will only blind the wire chamber. To do so, I will select tracks
from data and suppress them in the final tracked file. The selection
criteria are known only to me.
The question is now /when/ these tracks will be suppressed. The original
idea was:

2.3) I will track all files first (= unblinded), and generate from them
a new version with tracks removed (= blinded). I will only release the
second set of files.
This method is rather safe from tampering since I do not have to release
the blinding code. If all track reconstruction is done only at MIT, this is
the preferred method. However, if other institutes/people will also
reconstruct from the raw data, then the blinding must be done locally as
well, either with the same code or a local version of the blinding code.
The tracking is in a state where everybody can do it. Another solution is:

2.4) I will add the blinding to the tracking plugin itself.
Of course, this means that the blinding code can be seen and deactivated
by everyone. However, I do not want to fight 1.2, so this should be OK.
I will make the check explicitly visible in the code, with the actual
calculation in a separate file. For general debugging, nobody has to
open that file, so just don't.
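
To illustrate the structure described in 2.4, here is a minimal sketch in
Python. Everything in it is hypothetical: the names (write_tracks,
_is_suppressed), the key, and the suppression fraction are placeholders, and
the placeholder rule drops a fixed fraction of tracks regardless of their
properties, so it is not meant as a usable blinding criterion and is not the
actual selection, which is known only to me. It only shows the split between
a visible check and a calculation kept in a separate file.

```python
import hashlib
import hmac

# --- Part that would live in the separate file nobody needs to open ---
_SECRET_KEY = b"hypothetical-key"  # placeholder, not a real key
_SUPPRESS_FRACTION = 0.1           # placeholder value, chosen for illustration


def _is_suppressed(run, event, track_index):
    """Deterministic keyed decision (hypothetical stand-in criterion)."""
    msg = f"{run}:{event}:{track_index}".encode()
    digest = hmac.new(_SECRET_KEY, msg, hashlib.sha256).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < _SUPPRESS_FRACTION


# --- Part that would be explicitly visible in the tracking plugin ---
def write_tracks(run, event, tracks, blind=True):
    """Return the tracks to store; the blinding check is in plain sight."""
    if blind:
        return [t for i, t in enumerate(tracks)
                if not _is_suppressed(run, event, i)]
    return list(tracks)
```

Because the decision is a deterministic function of run, event and track
index, re-tracking the same file would suppress the same tracks each time.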

3) Upcoming new tracked files
3.1) I will release a new spin of the 100 files soon. This should happen
before the collaboration meeting unless I run into unexpected problems.
3.2) These files will be blinded following 2.1, since this is what we
agreed on earlier.

Forthcoming analyses:
4) We are still in a state where we have to work on different tasks
separately.
5) All calibration steps should be performed by at least two groups,
preferably with different code.
6) We will produce one canonical set of tracked files.
7) The final analysis is probably simple enough that it can be performed
by each student/interested person individually. By this, we will have a
good sample of systematics of personal bias.
8) These individual analyses can address many systematic errors
individually, but not all. The ones they cannot address include, for
example, many tracking-related systematics.
The different analyses will differ in their results, and we have to
understand the differences. To make this easier, I would propose the
following:
9) Each analysis should be supervised critically by somebody in all
steps. I can do this for MIT/ASU, but we need somebody at DESY to
fulfill the same role for the people in Europe. The idea behind this is
to ensure that there are no mistakes in the ground work which are then
hard to find if one sees only the result.
10) The /complete/ code to reproduce the analysis must be checked in to
git regularly. This means at least weekly!

To get used to this, and to reduce the noise, I propose the following
rules for Monday presentations:
11) Talks on Monday should be made available a couple of days in
advance. We had this rule before, but it has been eroded somewhat.
12) The talks should either have been presented before in one of the
local group meetings (Lumi, MIT or DESY analysis meeting), or been
approved by the local analysis coordinator (see 9).
13) The code/scripts to produce the results in the talk must be pushed
to git /before/ the approval or the local presentation. If the code is
good enough to produce results you want to show, then it is good enough
to be shown itself.
14) Each talk has to have a slide describing which code is used, and how
to reproduce the results. This doesn't have to be presented in the talk
itself, but should be included for reference.
15) If we find something questionable in the beginning of the talk which
might invalidate the rest of the findings, we will stop the presentation.
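
For 14), the reference slide could be filled from the repository state
instead of typed by hand. The following is a minimal sketch in Python; the
function names and the output layout are my assumptions, not an existing
tool. It uses only the standard git commands "git rev-parse HEAD" and
"git status --porcelain".

```python
import subprocess


def git_state(repo_dir="."):
    """Return (commit_hash, has_uncommitted_changes) for a git checkout."""
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"], cwd=repo_dir,
        capture_output=True, text=True, check=True).stdout.strip()
    status = subprocess.run(
        ["git", "status", "--porcelain"], cwd=repo_dir,
        capture_output=True, text=True, check=True).stdout
    # Any output from --porcelain means modified or untracked files exist.
    return commit, bool(status.strip())


def provenance_lines(commit, dirty, command):
    """Format the lines for the 'how to reproduce' slide (layout is made up)."""
    first = f"commit:  {commit}"
    if dirty:
        first += " (uncommitted changes!)"
    return [first, f"command: {command}"]


# Usage, inside a git checkout (the command string is just an example):
#   commit, dirty = git_state()
#   print("\n".join(provenance_lines(commit, dirty, "python plot.py")))
```

The warning about uncommitted changes matters for 13): if it shows up, the
results on the slides were not produced by what is in git.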

I think 13), in combination with 11) and 14), helps on multiple counts.
First, it gets us all into the habit of pushing the code into the repo.
Further, it may streamline presentations: During the presentations, many
questions are of the type: What generator did you use? What options? and
so on. While this should be in the talk, having the code allows
everybody to check. Additionally, it allows us to debug.
The recent bug in Denis's code is a good example: Based on a simple
misconception about the C language rules, it is simple to introduce a bug
that is difficult for the original programmer to identify, but fairly simple
for a second person to spot. This source of bugs is so common and recognized
by experts that different programming styles take it explicitly into account.
Extreme programming, for example, advocates that code is always written by
one person with a second person looking over his shoulder.
With the code at hand, instead of having to discuss this on two Mondays,
it would have been a matter of five minutes.

Please think about these points. We then can have a discussion during
the collaboration meeting.

Best,
Jan

-- 
Dr. Jan C. Bernauer
Massachusetts Institute of Technology
77 Massachusetts Ave, Room 26-441
Cambridge, MA, 02139, USA
Phone:  (617) 253-6580