Error Taxonomy in Connectome Proofreading

Instructor Notes

This document is a standalone instructor script. It provides the full narrative, real references, worked examples, and a misconceptions table. Adapt the depth and pacing to your audience; the material here is intentionally detailed so that nothing needs to be improvised.


1. Why Errors Matter

1.1 The Scale Problem

Modern automated segmentation pipelines have reached impressive accuracy. Lee et al. (2019) demonstrated flood-filling networks that achieve “superhuman” voxel-level accuracy on benchmark datasets. Yet even a per-edge error rate of 1-5 % becomes catastrophic at connectome scale.

Key calculation to present on the board:

Consider a neuron whose arbor passes through 1,000 supervoxel-to-supervoxel edges in the segmentation graph. If each edge has a 1 % probability of being wrong, the probability that the neuron is completely error-free is:

P(no error) = (1 - 0.01)^1000 = 0.99^1000 approx 0.000043

That is a 99.996 % chance of at least one error somewhere along the neuron. Even at 0.1 % per-edge error rate the probability of a fully correct neuron with 1,000 edges is only about 37 %.

Instructor tip: Walk students through this calculation live. It is the single most effective way to motivate why proofreading exists as a field.

1.2 Errors Are Not Equally Harmful

A boundary that is shifted by two voxels matters much less than a merge that fuses two excitatory neurons into one. The taxonomy below is ordered roughly by frequency (most common first), but the hierarchy of downstream impact is: merge errors > split errors > boundary errors > identity errors in terms of how often each type corrupts a connectome analysis.


2. Merge Errors

2.1 Definition

A merge error occurs when two or more distinct neurons are incorrectly joined into a single segment by the automated pipeline. In the supervoxel graph representation, this means there is at least one false edge linking supervoxels that belong to different ground-truth neurons.

2.2 Causes

Cause Mechanism Typical context
Close membrane apposition Two membranes approach within < 20 nm, falling below the model’s resolution limit Parallel axons in dense neuropil
Low staining contrast Membrane signal drops, model cannot distinguish boundary Poor fixation regions, section edges
Blood vessel boundaries Vessel endothelium creates a false membrane-like boundary that “bridges” two neurites Capillaries running through neuropil
Glial wrapping Thin astrocytic processes wrap around neurites; model confuses glial membrane for neuronal continuity Perisynaptic astrocyte processes
Alignment artifacts Section-to-section registration errors shift a membrane out of alignment, creating an apparent connection Regions with tissue folds or tears

2.3 Downstream Impact

2.4 Visual Signatures

2.5 Worked Example: Identifying a Merge Error

  1. You are examining a 3D mesh of a putative pyramidal cell.
  2. The cell has a normal-looking apical dendrite, but one basal branch suddenly changes caliber from ~400 nm to ~120 nm and takes a sharp 90-degree turn.
  3. You navigate to the turn point in 2D (XY) slices.
  4. In slices z = 312-314, you see two distinct membrane-bounded profiles that are separated by a visible (but faint) membrane in z = 311 and z = 315, but in z = 313 the membrane is absent and the two profiles are merged into one label.
  5. Diagnosis: merge error caused by a one-section gap in membrane signal.
  6. Action: split at the merge point.
  7. Verification: after splitting, each segment has consistent caliber and biologically plausible trajectory.

3. Split Errors

3.1 Definition

A split error occurs when a single neuron is broken into two or more disconnected segments. In the supervoxel graph, a true edge is missing.

3.2 Causes

Cause Mechanism Typical context
Thin processes Axons < 100 nm diameter drop below reliable detection Small-caliber axons in cortical neuropil
Low contrast regions Membrane signal too faint for model confidence Same as merge causes, but model errs conservatively
Missing or damaged sections Physical section lost during collection Any region; creates a gap in z
Sharp turns in z A process that curves sharply between sections appears to “jump” in xy Ascending/descending axons
Alignment errors Misregistration makes a continuous process appear discontinuous Tissue fold boundaries

3.3 Downstream Impact

3.4 Visual Signatures

3.5 Worked Example: Identifying a Split Error

  1. You are tracing a descending axon from a layer 2/3 pyramidal cell.
  2. At z = 487, the axon segment ends abruptly. There is no terminal bouton.
  3. You scroll to z = 488. Section 488 is visibly damaged (knife mark across the tissue). The axon profile is not segmented in this section.
  4. At z = 489, a small orphan fragment appears at approximately the same (x, y) position, with matching caliber (~130 nm) and trajectory.
  5. Diagnosis: split error caused by a damaged section.
  6. Action: merge the upstream segment with the downstream fragment.
  7. Verification: the merged segment has smooth, continuous trajectory across the damaged section.

4. Boundary Errors

4.1 Definition

A boundary error occurs when the membrane position is shifted relative to ground truth, causing partial volume sharing between adjacent segments. Neither segment is topologically wrong (no false merge, no false split), but the exact border is inaccurate.

4.2 Impact

4.3 Worked Example

  1. You are measuring spine head volumes for a population of excitatory synapses. Several spine heads have volumes 30-40 % larger than expected.
  2. On inspection, the segmentation boundary around those spines extends 2-3 voxels into the adjacent dendritic shaft or into the presynaptic bouton.
  3. Diagnosis: boundary error. The spine volume is inflated by the extra voxels.
  4. Action: manual boundary correction or exclude affected spines from quantitative analysis.

5. Identity Errors

5.1 Definition

An identity error occurs when the segmentation boundary is correct, but the cell label assigned to a region is wrong. This typically happens as a side effect of proofreading itself: when a merge is split, the two resulting fragments must each receive a label, and the wrong label may be propagated to the wrong fragment.

5.2 Impact

5.3 Worked Example

  1. After splitting a merge between neuron A (inhibitory basket cell) and neuron B (excitatory pyramidal cell), the proofreader assigns the perisomatic basket terminals to neuron B by mistake.
  2. Downstream analysis flags neuron B as making an unusual number of perisomatic synapses, inconsistent with pyramidal cell biology.
  3. Re-inspection reveals the identity swap.
  4. Action: reassign the segment to neuron A.

6. Relative Frequency and Impact Hierarchy

Based on reports from large-scale proofreading campaigns:

Error type Approximate frequency Downstream impact Detectability
Merge errors 40-60 % of all errors Very high Moderate (3D inspection)
Split errors 30-50 % of all errors High Moderate (dead-end search)
Boundary errors 5-15 % of all errors Moderate Low (quantitative only)
Identity errors < 2 % of all errors Very high per instance Very low

Berger et al. (2018) found that in the Drosophila medulla, merge errors dominated in dense neuropil while split errors dominated in regions with thin axonal processes. Plaza et al. (2014) reported similar patterns in the Drosophila optic lobe, noting that the ratio of merge to split errors depends heavily on the agglomeration threshold used in the pipeline: a conservative (high) threshold produces more splits and fewer merges, while an aggressive (low) threshold produces more merges and fewer splits.


7. Common Misconceptions

Misconception Reality
“Superhuman accuracy means proofreading is unnecessary.” Superhuman refers to voxel-level accuracy on small benchmarks. At connectome scale, even small error rates compound to near-certainty of errors per neuron.
“Merge errors are always obvious in 3D.” Small merges (e.g., two thin axons fused for a few sections) can be nearly invisible in 3D mesh views. They are often found only through connectivity analysis.
“Split errors are less harmful than merge errors.” For connectivity analysis, split errors cause missing edges, which can be just as damaging as the false edges from merges, depending on the scientific question.
“Boundary errors don’t matter.” For any analysis involving synapse assignment or fine morphological measurement (spine volume, bouton size), boundary errors are critically important.
“More proofreading always helps.” Poorly executed proofreading can introduce new errors (especially identity errors). Quality-controlled proofreading with verification is essential.
“Automated error detection finds all errors.” Current detectors have recall of 60-80 % for merge errors and lower for splits. Many errors remain undetected by automated methods.

8. References


End of instructor script: Error Taxonomy in Connectome Proofreading