Antelop

A user-friendly software package for systems neuroscience data management

Rory Bedford

MRC LMB

Big data in neuroscience

  • New technologies such as Neuropixels allow us to record from increasingly large numbers of units
  • Continual shift towards more automated behavioural assays means longer and more numerous recordings
  • Scientists have to spend more time on data processing workflows and data management
  • Increasingly advanced skills (HPC, data engineering, etc.) are needed to undertake this work

Data engineering challenges

  • Custom file formats/project structures are hard to parse
  • Custom preprocessing/analysis scripts are very difficult to reproduce
  • Lab data storage not centralised
  • High entry barrier to existing tools like DataJoint and NWB, which makes their adoption difficult for many labs

Our solution: Antelop

  • Software package designed to make it easy to adopt best practices for data processing and storage
  • Simple pip install and straightforward graphical configuration
  • Extensive graphical user interface for all aspects of your data management and processing
  • MySQL database backend for centralised storage (illustrated in the sketch after this list)
  • Supports electrophysiology, calcium imaging, and behavioural data processing with HPC integration
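
As a rough illustration of the centralised-storage idea, below is a minimal sketch of what connecting to a shared MySQL server looks like when done by hand with DataJoint. Antelop handles this through its graphical configuration, and the hostname and credentials here are placeholders.

  # Minimal sketch: connecting to a centralised MySQL backend via DataJoint.
  # Hostname and credentials are placeholders, not Antelop defaults.
  import datajoint as dj

  dj.config['database.host'] = 'mysql.mylab.example.org'
  dj.config['database.user'] = 'experimenter'
  dj.config['database.password'] = '********'

  conn = dj.conn()          # open the connection
  print(conn.is_connected)  # True if the centralised database is reachable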

Our solution: Antelop

  • Integrates with existing tools, such as popular spikesorters, CaImAn, and DeepLabCut
    • Leverages the fact that preprocessing requirements for common acquisition methods are fairly uniform
  • Implements a range of data visualisation tools and metrics out of the box, including an analysis standard library
  • Supports the writing of custom analysis scripts, with direct integration with your lab’s GitHub and data-immutability checks for reproducibility
  • Has import/export functions for NWB and a range of acquisition systems (see the sketch after this list)
  • Has a strictly structured but accommodating database schema for analysis routines to utilise
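
For a sense of the NWB target format, here is a minimal file written with pynwb; the contents are illustrative and this is not Antelop's actual export code.

  # Minimal NWB file containing one sorted unit (illustrative values only).
  from datetime import datetime, timezone
  from pynwb import NWBFile, NWBHDF5IO

  nwbfile = NWBFile(
      session_description='example ephys session',
      identifier='session-001',
      session_start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
  )
  nwbfile.add_unit(spike_times=[0.01, 0.35, 0.82])

  with NWBHDF5IO('session-001.nwb', 'w') as io:
      io.write(nwbfile)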

Infrastructure

Data analysis potential

  • Antelop can be used to curate a large database of paired neural activity and environmental events for a domain of interest (e.g. the hippocampus and spatial navigation tasks)
  • This facilitates large-scale, multi-recording analyses
  • For example: neuroscience foundation models
    • Neural nets trained via self-supervised learning to model the relationship between activity and the environmental factor of interest
    • Can be fine-tuned for downstream tasks such as decoding a quantity of interest (toy sketch below)
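
A toy sketch of this idea in PyTorch, assuming binned spike counts, a masked-reconstruction pretraining objective, and a position-decoding head; the architecture, shapes, and data here are all illustrative rather than any specific published model.

  import torch
  import torch.nn as nn

  n_units, n_bins = 128, 100
  encoder = nn.Sequential(nn.Linear(n_units, 256), nn.ReLU(), nn.Linear(256, 64))
  decoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, n_units))

  # Self-supervised pretraining: reconstruct masked-out spike counts
  opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
  for _ in range(100):
      counts = torch.poisson(torch.rand(32, n_bins, n_units) * 5)  # placeholder data
      mask = (torch.rand_like(counts) > 0.2).float()               # hide 20% of entries
      recon = decoder(encoder(counts * mask))
      loss = ((recon - counts) ** 2 * (1 - mask)).mean()           # score hidden entries
      opt.zero_grad(); loss.backward(); opt.step()

  # Fine-tuning: decode, e.g., animal position from the frozen embedding
  head = nn.Linear(64, 2)
  opt = torch.optim.Adam(head.parameters(), lr=1e-3)
  for _ in range(100):
      counts = torch.poisson(torch.rand(32, n_bins, n_units) * 5)  # placeholder data
      position = torch.rand(32, n_bins, 2)                         # placeholder behaviour
      loss = ((head(encoder(counts).detach()) - position) ** 2).mean()
      opt.zero_grad(); loss.backward(); opt.step()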

How to represent a complex environment?

  • Environmental arrays are split into four different types
  • Each contains NumPy arrays with a strict structure, metadata, and a clock shared with the neural activity (see the sketch after this list)
  • All data belongs to the environment (which belongs to a recording session, experimenter, etc.)
  • Environments can have one or more subjects
  • Optionally, data can also belong to a subject
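
As an illustrative sketch, one such environmental array could be represented in Python as below; the field names are hypothetical and Antelop's actual schema may differ.

  import numpy as np

  environment_array = {
      'type': 'analog',                          # one of the four array types (label assumed)
      'timestamps': np.arange(0.0, 60.0, 0.01),  # seconds, on the clock shared with neural data
      'data': np.random.rand(6000, 2),           # e.g. (x, y) position of the subject
      'metadata': {'units': 'metres', 'sampling_rate_hz': 100},
      'subject': 'mouse-01',                     # optional link to a subject
  }

  # Because spike times live on the same clock, aligning neural activity to
  # environmental events is a simple array operation:
  spike_times = np.sort(np.random.uniform(0.0, 60.0, size=500))
  idx = np.clip(np.searchsorted(environment_array['timestamps'], spike_times), 0, 5999)
  position_at_spikes = environment_array['data'][idx]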

Publishing

  • Working on a preprint at present
  • We aim to publish by May this year
  • The Python package has been released but is still undergoing extensive testing
  • Documentation is available

Thank you