DataJoint pipeline: Querying data#

Important

This guide assumes you have a DataJoint pipeline deployed with data already ingested.

This guide provides examples of how to query data from the Aeon DataJoint pipeline using DataJoint’s various query operators, such as restriction (&, -), projection (proj), joining (*), and aggregation (aggr). These together enable powerful data manipulations and flexible data analysis workflows.

Note

The examples here use the Single mouse in a foraging assay dataset for the experiment named social0.2-aeon3. If you are using a different dataset, be sure to replace the experiment name and parameters in the code below accordingly.

from aeon.dj_pipeline import acquisition, tracking
from aeon.dj_pipeline.analysis import block_analysis

Acquisition data#

The DataJoint acquisition module manages raw data collected during experiments. We will start by exploring the acquisition.Chunk table, which stores metadata about discrete time chunks and the associated raw data files for each experiment.

acquisition.Chunk()
A recording period corresponds to a 1-hour data acquisition

experiment_name

e.g exp0-aeon3

chunk_start

datetime of the start of a given acquisition chunk

chunk_end

datetime of the end of a given acquisition chunk

directory_type

epoch_start

social0.2-aeon3 2024-03-02 12:00:00 2024-03-02 13:00:00 raw 2024-03-01 16:46:12
social0.2-aeon3 2024-03-02 13:00:00 2024-03-02 14:00:00 raw 2024-03-01 16:46:12

Total: 2

The command above retrieves all records for all experiments from the Chunk table. Since we only ingested the sample data containing a 2-hour snippet of experiment social0.2-aeon3, the query returns the two records corresponding to each hour of the experiment. In most cases, you would have multiple experiments in the database and the query would return more records. Thus, it is often more efficient to apply specific criteria to filter the data.

Restricting data#

For example, we can use the restriction operator & with a restriction key to specify the experiment_name (e.g. social0.2-aeon3) for which we want to retrieve the chunks.

experiment_key = {"experiment_name": "social0.2-aeon3"}
acquisition.Chunk & experiment_key
A recording period corresponds to a 1-hour data acquisition

experiment_name

e.g exp0-aeon3

chunk_start

datetime of the start of a given acquisition chunk

chunk_end

datetime of the end of a given acquisition chunk

directory_type

epoch_start

social0.2-aeon3 2024-03-02 12:00:00 2024-03-02 13:00:00 raw 2024-03-01 16:46:12
social0.2-aeon3 2024-03-02 13:00:00 2024-03-02 14:00:00 raw 2024-03-01 16:46:12

Total: 2

Tracking data#

The DataJoint tracking module manages the position tracking data produced by different tracking software. Here we continue using the same restriction key to explore tracking data associated with the experiment social0.2-aeon3.

The SLEAPTracking table stores the SLEAP tracking data of the subject(s) in the experiment for each chunk of video recorded from a particular camera device.

tracking.SLEAPTracking & experiment_key
Tracked objects position data from a particular VideoSource for multi-animal experiment using the SLEAP tracking method per chunk

experiment_name

e.g exp0-aeon3

chunk_start

datetime of the start of a given acquisition chunk

device_serial_number

spinnaker_video_source_install_time

time of the spinnaker_video_source placed and started operation at this position

tracking_paramset_id

social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1

Total: 2

The PoseIdentity table contains information that identifies each subject and records the body part used as the anchor point in the SLEAP top-down-id-model network.

tracking.SLEAPTracking.PoseIdentity & experiment_key

experiment_name

e.g exp0-aeon3

chunk_start

datetime of the start of a given acquisition chunk

device_serial_number

spinnaker_video_source_install_time

time of the spinnaker_video_source placed and started operation at this position

tracking_paramset_id

identity_idx

identity_name

identity_likelihood

anchor_part

the name of the point used as anchor node for this class
social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1 0 BAA-1104045 =BLOB= centroid
social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1 1 BAA-1104047 =BLOB= centroid
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1 0 BAA-1104045 =BLOB= centroid
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1 1 BAA-1104047 =BLOB= centroid

Total: 4

The Part table contains the x, y coordinates of tracked body parts for each video frame. In this dataset, we only tracked each subject’s centroid.

tracking.SLEAPTracking.Part & experiment_key

experiment_name

e.g exp0-aeon3

chunk_start

datetime of the start of a given acquisition chunk

device_serial_number

spinnaker_video_source_install_time

spinnaker_video_source time of placement and start operation

tracking_paramset_id

identity_idx

part_name

sample_count

number of data points acquired from this stream for a given chunk

x

y

likelihood

timestamps

social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1 0 anchor_centroid 19570 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1 0 centroid 19570 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1 1 anchor_centroid 40429 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 23032909 2024-03-01 16:46:12 1 1 centroid 40429 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1 0 anchor_centroid 19654 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1 0 centroid 19654 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1 1 anchor_centroid 40342 =BLOB= =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 13:00:00 23032909 2024-03-01 16:46:12 1 1 centroid 40342 =BLOB= =BLOB= =BLOB= =BLOB=

Total: 8

Block-level data#

The DataJoint block analysis module contains tables that aggregate experimental events into defined time blocks. This allows for higher-level analyses, such as behavioural trends over extended periods.

Restricting data#

Using another restriction operator &, we can further filter the Block and BlockAnalysis tables to only include blocks longer than 1 hour for the experiment social0.2-aeon3.

block_analysis.Block & experiment_key & "block_duration_hr > 1"

experiment_name

e.g exp0-aeon3

block_start

block_end

block_duration_hr

(hour)
social0.2-aeon3 2024-03-02 12:00:00 2024-03-02 14:00:00 2.000

Total: 1

block_analysis.BlockAnalysis & experiment_key & "block_duration > 1"

experiment_name

e.g exp0-aeon3

block_start

block_duration

(hour)

patch_count

number of patches in the block

subject_count

number of subjects in the block
social0.2-aeon3 2024-03-02 12:00:00 1.99891 3 2

Total: 1

We can also use the SQL LIKE operator to filter records based on a pattern match. Here, we filter the Block table by block_start to only include blocks that started on 2024-03-02.

block_analysis.Block & experiment_key & 'block_start LIKE "%2024-03-02%"'

experiment_name

e.g exp0-aeon3

block_start

block_end

block_duration_hr

(hour)
social0.2-aeon3 2024-03-02 12:00:00 2024-03-02 14:00:00 2.000

Total: 1

Using primary keys#

We can fetch() the primary keys of the blocks matching the above query expression as a list of dictionaries.

block_key = (
    block_analysis.Block & experiment_key & 'block_start LIKE "%2024-03-02%"'
).fetch("KEY")
block_key
[{'experiment_name': 'social0.2-aeon3',
  'block_start': datetime.datetime(2024, 3, 2, 12, 0)}]

The primary keys can then be used to retrieve specific block-level records in other analysis tables that reference the Block table.

For instance, we can use the primary keys to retrieve the BlockAnalysis records associated with the blocks that started on 2024-03-02.

block_analysis.BlockAnalysis & block_key

experiment_name

e.g exp0-aeon3

block_start

block_duration

(hour)

patch_count

number of patches in the block

subject_count

number of subjects in the block
social0.2-aeon3 2024-03-02 12:00:00 1.99891 3 2

Total: 1

We can also retrieve subject-specific analyses (BlockSubjectAnalysis), such as subjects’ food patch preference (BlockSubjectAnalysis.Preference) and their interaction with the food patches (BlockSubjectAnalysis.Patch) for the same blocks.

block_analysis.BlockSubjectAnalysis.Preference & block_key
Measure of preference for a particular patch from a particular subject

experiment_name

e.g exp0-aeon3

block_start

patch_name

e.g. Patch1, Patch2

subject_name

cumulative_preference_by_wheel

cumulative_preference_by_time

running_preference_by_time

running_preference_by_wheel

final_preference_by_wheel

cumulative_preference_by_wheel at the end of the block

final_preference_by_time

cumulative_preference_by_time at the end of the block
social0.2-aeon3 2024-03-02 12:00:00 Patch1 BAA-1104045 =BLOB= =BLOB= =BLOB= =BLOB= 0.0 0.0363886
social0.2-aeon3 2024-03-02 12:00:00 Patch1 BAA-1104047 =BLOB= =BLOB= =BLOB= =BLOB= 0.105198 0.160914
social0.2-aeon3 2024-03-02 12:00:00 Patch2 BAA-1104045 =BLOB= =BLOB= =BLOB= =BLOB= 0.0231214 0.302103
social0.2-aeon3 2024-03-02 12:00:00 Patch2 BAA-1104047 =BLOB= =BLOB= =BLOB= =BLOB= 0.352164 0.368887
social0.2-aeon3 2024-03-02 12:00:00 Patch3 BAA-1104045 =BLOB= =BLOB= =BLOB= =BLOB= 1.00854 0.661508
social0.2-aeon3 2024-03-02 12:00:00 Patch3 BAA-1104047 =BLOB= =BLOB= =BLOB= =BLOB= 0.542637 0.470199

Total: 6

# based on subject position
block_analysis.BlockSubjectAnalysis.Patch & block_key

experiment_name

e.g exp0-aeon3

block_start

patch_name

e.g. Patch1, Patch2

subject_name

in_patch_timestamps

timestamps when a subject is at a specific patch

in_patch_time

total seconds spent in this patch for this block

in_patch_rfid_timestamps

in_patch_timestamps based on RFID

pellet_count

pellet_timestamps

patch_threshold

patch threshold value at each pellet delivery

wheel_cumsum_distance_travelled

wheel's cumulative distance travelled
social0.2-aeon3 2024-03-02 12:00:00 Patch1 BAA-1104045 =BLOB= 12.18 =BLOB= 0 =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 Patch1 BAA-1104047 =BLOB= 300.22 =BLOB= 0 =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 Patch2 BAA-1104045 =BLOB= 101.12 =BLOB= 0 =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 Patch2 BAA-1104047 =BLOB= 688.24 =BLOB= 9 =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 Patch3 BAA-1104045 =BLOB= 221.42 =BLOB= 0 =BLOB= =BLOB= =BLOB=
social0.2-aeon3 2024-03-02 12:00:00 Patch3 BAA-1104047 =BLOB= 877.26 =BLOB= 35 =BLOB= =BLOB= =BLOB=

Total: 6

Joining data#

To obtain a comprehensive view of the patch-preference interactions, we can join the Patch and Preference part tables using the join operator *.

block_analysis.BlockSubjectAnalysis.Patch * block_analysis.BlockSubjectAnalysis.Preference & block_key

experiment_name

e.g exp0-aeon3

block_start

patch_name

e.g. Patch1, Patch2

subject_name

in_patch_timestamps

timestamps when a subject is at a specific patch

in_patch_time

total seconds spent in this patch for this block

in_patch_rfid_timestamps

in_patch_timestamps based on RFID

pellet_count

pellet_timestamps

patch_threshold

patch threshold value at each pellet delivery

wheel_cumsum_distance_travelled

wheel's cumulative distance travelled

cumulative_preference_by_wheel

cumulative_preference_by_time

running_preference_by_time

running_preference_by_wheel

final_preference_by_wheel

cumulative_preference_by_wheel at the end of the block

final_preference_by_time

cumulative_preference_by_time at the end of the block
social0.2-aeon3 2024-03-02 12:00:00 Patch1 BAA-1104045 =BLOB= 12.18 =BLOB= 0 =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= 0.0 0.0363886
social0.2-aeon3 2024-03-02 12:00:00 Patch1 BAA-1104047 =BLOB= 300.22 =BLOB= 0 =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= 0.105198 0.160914
social0.2-aeon3 2024-03-02 12:00:00 Patch2 BAA-1104045 =BLOB= 101.12 =BLOB= 0 =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= 0.0231214 0.302103
social0.2-aeon3 2024-03-02 12:00:00 Patch2 BAA-1104047 =BLOB= 688.24 =BLOB= 9 =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= 0.352164 0.368887
social0.2-aeon3 2024-03-02 12:00:00 Patch3 BAA-1104045 =BLOB= 221.42 =BLOB= 0 =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= 1.00854 0.661508
social0.2-aeon3 2024-03-02 12:00:00 Patch3 BAA-1104047 =BLOB= 877.26 =BLOB= 35 =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= =BLOB= 0.542637 0.470199

Total: 6

Fetching and inspecting data#

Finally, data can be fetched as a pandas DataFrame using fetch() with the format="frame" argument.

block_patch_data = (
    (block_analysis.BlockAnalysis.Patch & block_key).fetch(format="frame").reset_index()
)
block_patch_data
experiment_name block_start patch_name pellet_count pellet_timestamps wheel_cumsum_distance_travelled wheel_timestamps patch_threshold patch_threshold_timestamps patch_rate patch_offset
0 social0.2-aeon3 2024-03-02 12:00:00 Patch1 0 [] [-0.0, 0.0015340744203575785, 0.00153407442035... [2024-03-02T12:00:00.000000000, 2024-03-02T12:... [] [] 0.0100 75.0
1 social0.2-aeon3 2024-03-02 12:00:00 Patch2 9 [2024-03-02T12:24:33.777504000, 2024-03-02T12:... [-0.0, 0.004602223261073846, -0.0, 0.001534074... [2024-03-02T12:00:00.000000000, 2024-03-02T12:... [364.05544835561926, 164.93215066599433, 485.7... [2024-03-02T12:24:09.392000000, 2024-03-02T12:... 0.0020 75.0
2 social0.2-aeon3 2024-03-02 12:00:00 Patch3 35 [2024-03-02T12:20:17.155488000, 2024-03-02T12:... [-0.0, 0.003068148840714713, 0.004602223261073... [2024-03-02T12:00:00.000000000, 2024-03-02T12:... [153.60720025718558, 166.6737735450297, 327.20... [2024-03-02T12:20:01.725984000, 2024-03-02T12:... 0.0033 75.0