Module 13: Machine Learning in Neuroscience

Teaching Deck

Learning Objectives

Build feature pipelines for neuron and synapse-level analyses
Compare supervised and unsupervised methods for connectomics tasks
Evaluate model quality with biologically meaningful metrics
Detect data leakage and distribution-shift risks in connectomics ML

Session Outcomes

Learners can complete the module capability target.
Learners can produce one evidence-backed artifact.
Learners can state one limitation or uncertainty.

Agenda (60 min)

0-10 min: Frame and model
10-35 min: Guided practice
35-50 min: Debrief and misconception correction
50-60 min: Competency check + exit ticket

Capability Target

Design and critique an ML analysis pipeline for connectomics that includes feature rationale, evaluation plan, leakage controls, and interpretation limits.

Concept Focus

1) Feature engineering defines the hypothesis space

Technical: feature choices encode assumptions about what variation is biologically meaningful.
Plain language: your model can only learn what your features allow.
Misconception guardrail: adding more features always improves science.

Core Workflow

See module page for details.

60-Minute Run-of-Show

**00:00-08:00 Task framing and leakage examples**
**08:00-20:00 Feature rationale workshop**
**20:00-34:00 Split strategy and baseline modeling**
**34:00-46:00 Error analysis and biologically relevant metrics**
**46:00-56:00 Model-card limitation writing**
**56:00-60:00 Competency checkpoint**

Misconceptions to Watch

Misconception guardrail: adding more features always improves science.
Misconception guardrail: one summary metric is enough.
Misconception guardrail: random split always gives valid generalization estimates.

Studio Activity

Activity Output Checklist

Evidence-linked artifact submitted.
At least one limitation or uncertainty stated.
Revision point captured from feedback.

Assessment Rubric

Minimum pass
Feature and split decisions are justified.
Metrics include at least one biologically targeted criterion.
Limitation statement is specific and actionable.
Strong performance
Identifies and mitigates likely leakage channels.
Uses error analysis to propose next data improvements.
Distinguishes exploratory model from deployment-ready model.
Common failure modes
Leakage-prone random splits for spatially correlated data.
Overfocus on aggregate accuracy.
Claims of biological insight unsupported by model diagnostics.

Exit Ticket

For one candidate model, write:

one plausible leakage pathway,
one metric blind spot,
one limitation you would report publicly.

References (Instructor)

Januszewski et al. (2018) for segmentation ML context.
UMAP paper (McInnes et al., 2018) for embedding interpretation caveats.
MICrONS/FlyWire analyses for realistic distribution-shift context.

Teaching Materials

Module page: /modules/module13/
Slide page: /modules/slides/module13/
Worksheet: /assets/worksheets/module13/module13-activity.md