DataJoint pipeline: Data ingestion and processing

Important

This guide assumes you have installed and configured a DataJoint pipeline.

This guide demonstrates the process of ingesting data from the source and preparing it for querying and further analysis in the Aeon DataJoint pipeline. The three main steps are:

Create a new experiment: Set up a new experiment in the pipeline.
Insert subjects and blocks: Manually input details about the subjects involved in the experiment and specify the blocks of interest.
Run automated ingestion and processing: Run routines to ingest data and process it for querying and analysis.

Note

This guide uses the "Single mouse in a foraging assay" sample dataset for the experiment named social0.2-aeon3.

If you are using a different dataset, make sure the DataJoint pipeline is correctly configured, for example that the data directory is correctly specified in the DataJoint configuration file (dj_local_conf.json). You should also replace the experiment name and other parameters in the code below accordingly.
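Before running the pipeline against a different dataset, it can help to inspect the configuration file directly. The sketch below uses only the standard library and is illustrative: the `custom` / `repository_config` layout, the `ceph_aeon` entry and the `/ceph/aeon` root are assumptions about how a repository name maps to a data directory, and `load_repository_config` is a hypothetical helper, not part of the pipeline.

```python
import json
import tempfile
from pathlib import Path


def load_repository_config(conf_path):
    """Return the repository-name -> root-directory mapping from a config file.

    ASSUMPTION: the mapping is stored under custom -> repository_config.
    """
    with open(conf_path) as f:
        conf = json.load(f)
    return conf.get("custom", {}).get("repository_config", {})


# Hypothetical configuration, written to a temporary file for demonstration only.
example_conf = {
    "database.host": "localhost",
    "custom": {"repository_config": {"ceph_aeon": "/ceph/aeon"}},
}

with tempfile.TemporaryDirectory() as tmp:
    conf_path = Path(tmp) / "dj_local_conf.json"
    conf_path.write_text(json.dumps(example_conf, indent=2))
    repo_config = load_repository_config(conf_path)

print(repo_config)
```

If the returned mapping is empty or points to the wrong location, the configuration file is the first place to look.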

from aeon.dj_pipeline import acquisition, subject
from aeon.dj_pipeline.analysis import block_analysis
from aeon.dj_pipeline.create_experiments.create_socialexperiment import (
    create_new_social_experiment,
)
from aeon.dj_pipeline.populate.worker import (
    AutomatedExperimentIngestion,
    acquisition_worker,
    analysis_worker,
    streams_worker,
)

Step 1 - Create a new experiment

Insert a new entry for the social0.2-aeon3 experiment into the acquisition.Experiment table, along with its associated metadata:

experiment_name = "social0.2-aeon3"
create_new_social_experiment(experiment_name)

We can now check that the experiment has been successfully inserted into the acquisition.Experiment table:

acquisition.Experiment()

experiment_name (e.g. exp0-aeon3)	experiment_start_time (datetime of the start of this experiment)	experiment_description	arena_name (unique name of the arena, e.g. circular_2m)	lab (abbreviated lab name)	location	experiment_type
social0.2-aeon3	2024-03-01 16:46:12	Social0.2 experiment on AEON3 machine	circle-2m	SWC	AEON3	social

Total: 1

We can also check the acquisition.Experiment.Directory table to see the raw and processed directories associated with the experiment:

acquisition.Experiment.Directory()

experiment_name (e.g. exp0-aeon3)	directory_type	repository_name	directory_path	load_order (order of priority to load the directory)
social0.2-aeon3	processed	ceph_aeon	aeon/data/processed/AEON3/social0.2	0
social0.2-aeon3	raw	ceph_aeon	aeon/data/raw/AEON3/social0.2	1

Total: 2
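Each row in this table stores a path relative to a named repository. As a rough illustration of how such rows could be turned into full paths, the sketch below joins them with a repository root and sorts them by `load_order`. The `/ceph/aeon` root, the `resolve_directories` helper and the reading of a lower `load_order` as higher priority are assumptions made for this example.

```python
from pathlib import PurePosixPath

# ASSUMPTION: hypothetical mapping of repository names to mount points.
repository_roots = {"ceph_aeon": PurePosixPath("/ceph/aeon")}

# Rows copied from the acquisition.Experiment.Directory table above.
directories = [
    {
        "directory_type": "raw",
        "repository_name": "ceph_aeon",
        "directory_path": "aeon/data/raw/AEON3/social0.2",
        "load_order": 1,
    },
    {
        "directory_type": "processed",
        "repository_name": "ceph_aeon",
        "directory_path": "aeon/data/processed/AEON3/social0.2",
        "load_order": 0,
    },
]


def resolve_directories(rows, roots):
    """Return (directory_type, full_path) pairs sorted by ascending load_order."""
    ordered = sorted(rows, key=lambda row: row["load_order"])
    return [
        (row["directory_type"], roots[row["repository_name"]] / row["directory_path"])
        for row in ordered
    ]


resolved = resolve_directories(directories, repository_roots)
for directory_type, full_path in resolved:
    print(directory_type, full_path)
```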

Step 2 - Insert subjects and blocks

The social0.2-aeon3 experiment involves two subjects:

BAA-1104045
BAA-1104047

Let’s create entries for these subjects and insert them into the subject.Subject table:

subject_list = [
    {
        "subject": "BAA-1104045",
        "sex": "U",
        "subject_birth_date": "2024-01-01",
        "subject_description": "Subject for Social 0.2 experiment",
    },
    {
        "subject": "BAA-1104047",
        "sex": "U",
        "subject_birth_date": "2024-01-01",
        "subject_description": "Subject for Social 0.2 experiment",
    },
]
subject.Subject.insert(subject_list, skip_duplicates=True)

Next, associate these subjects with the experiment social0.2-aeon3 by inserting them into the acquisition.Experiment.Subject table:

subject_experiment_list = [
    {"experiment_name": "social0.2-aeon3", "subject": "BAA-1104045"},
    {"experiment_name": "social0.2-aeon3", "subject": "BAA-1104047"},
]
acquisition.Experiment.Subject.insert(subject_experiment_list, skip_duplicates=True)

We can now check that the subjects have been successfully associated with the experiment social0.2-aeon3 by querying the acquisition.Experiment.Subject table:

acquisition.Experiment.Subject()

the subjects participating in this experiment

experiment_name (e.g. exp0-aeon3)	subject
social0.2-aeon3	BAA-1104045
social0.2-aeon3	BAA-1104047

Total: 2
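When an experiment has more than a handful of subjects, writing the association rows by hand becomes error-prone. One possible approach, sketched below with a hypothetical `build_experiment_subject_rows` helper that is not part of the pipeline, is to derive the rows from a single list of subject IDs so that the experiment name is typed only once.

```python
def build_experiment_subject_rows(experiment_name, subject_ids):
    """Build one association row per subject, dropping repeated IDs but keeping order."""
    unique_ids = list(dict.fromkeys(subject_ids))
    return [
        {"experiment_name": experiment_name, "subject": subject_id}
        for subject_id in unique_ids
    ]


rows = build_experiment_subject_rows(
    "social0.2-aeon3",
    ["BAA-1104045", "BAA-1104047", "BAA-1104045"],  # repeated ID is ignored
)
print(rows)
```

The resulting list has the same shape as subject_experiment_list above and could be passed to the same insert call.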

Next, we need to create and insert an entry for a block of interest into the block_analysis.Block table.

block_data = {
    "experiment_name": "social0.2-aeon3",
    "block_start": "2024-03-02 12:00:00",
    "block_end": "2024-03-02 14:00:00",
    "block_duration_hr": 2,
}
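# skip_duplicates=True keeps this cell safe to re-run, matching the other inserts in this guide
block_analysis.Block.insert1(block_data, skip_duplicates=True)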

Likewise, we can query the block_analysis.Block table to check that the block has been successfully inserted:

block_analysis.Block()

experiment_name (e.g. exp0-aeon3)	block_start	block_end	block_duration_hr (hour)
social0.2-aeon3	2024-03-02 12:00:00	2024-03-02 14:00:00	2.000

Total: 1
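In the block entry, block_duration_hr has to agree with block_start and block_end. Rather than typing the duration by hand, it can be computed from the two timestamps. The helper below, `make_block_row`, is a hypothetical convenience function written for this guide, not part of the pipeline.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_block_row(experiment_name, block_start, block_end):
    """Build a block entry whose duration (in hours) is derived from its timestamps."""
    start = datetime.strptime(block_start, TIMESTAMP_FORMAT)
    end = datetime.strptime(block_end, TIMESTAMP_FORMAT)
    if end <= start:
        raise ValueError("block_end must be later than block_start")
    duration_hr = (end - start).total_seconds() / 3600
    return {
        "experiment_name": experiment_name,
        "block_start": block_start,
        "block_end": block_end,
        "block_duration_hr": duration_hr,
    }


block_row = make_block_row(
    "social0.2-aeon3", "2024-03-02 12:00:00", "2024-03-02 14:00:00"
)
print(block_row["block_duration_hr"])  # 2.0
```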

Step 3 - Data ingestion and processing

Data ingestion and processing are fully automated through the prepared routines provided below. As DataJoint pipelines are idempotent, these routines can be safely run multiple times without the risk of duplicating or altering existing data.
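To make the idea of idempotency concrete, here is a toy model in plain Python. It is not DataJoint code: `populate`, `key_source` and `target` are stand-ins chosen for this sketch, and the keys are made up. The point is only that a routine which processes just the keys missing from its target does no additional work on a second run.

```python
def populate(key_source, target, make):
    """Compute entries only for keys in key_source that are missing from target."""
    pending = [key for key in key_source if key not in target]
    for key in pending:
        target[key] = make(key)
    return len(pending)


key_source = ["chunk-01", "chunk-02", "chunk-03"]  # hypothetical keys
target = {}  # stands in for an already-populated table

first_run = populate(key_source, target, make=str.upper)
second_run = populate(key_source, target, make=str.upper)
print(first_run, second_run)  # 3 0
```

The second call finds nothing left to do and leaves the existing entries untouched.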

To initiate the automated data ingestion process for the experiment social0.2-aeon3, we first need to insert an entry for the experiment into the AutomatedExperimentIngestion table:

AutomatedExperimentIngestion.insert1(
    {"experiment_name": "social0.2-aeon3"}, skip_duplicates=True
)

Ingestion and processing of acquisition-related data for the experiment social0.2-aeon3 can now be initiated by running:

acquisition_worker.run()

Likewise, ingestion and processing of all data streams for the experiment social0.2-aeon3 can be initiated by running:

streams_worker.run()

Finally, for data analysis, run:

analysis_worker.run()
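The three calls above can also be wrapped in a small helper that runs the workers one after another and reports progress. This is a minimal sketch that assumes only that each worker exposes a `run()` method, as used in this guide. `run_workers_in_order` and `StubWorker` are hypothetical names, and the stub exists solely so the example runs without a database connection.

```python
def run_workers_in_order(workers):
    """Run each (name, worker) pair in sequence and return the names that completed."""
    completed = []
    for name, worker in workers:
        print(f"Running {name} worker ...")
        worker.run()
        completed.append(name)
    return completed


class StubWorker:
    """Stand-in for a pipeline worker; it only counts how often run() is called."""

    def __init__(self):
        self.calls = 0

    def run(self):
        self.calls += 1


stub_workers = [
    ("acquisition", StubWorker()),
    ("streams", StubWorker()),
    ("analysis", StubWorker()),
]
completed = run_workers_in_order(stub_workers)
print(completed)
```

With the real pipeline, the stubs would be replaced by acquisition_worker, streams_worker and analysis_worker, in the same order as in this guide.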

Once the data ingestion and processing routines are complete, we can begin querying the data from the pipeline.
