Step 1: Time Gridding and Optional Filtering

This document describes Step 1 in the mooring-level processing workflow: combining multiple instruments onto a common time grid, with optional time-domain filtering. This step consolidates individual instrument time series recorded at different sampling rates into a unified mooring dataset suitable for further analysis.

This represents the first step in mooring-level processing:

  • Step 1: Time gridding (this document)

  • Step 2: Vertical gridding (future)

  • Step 3: Multi-deployment stitching (future)

This step is performed after individual instrument processing (Stages 1-3: standardisation, trimming, QC/calibration) and prior to vertical interpolation and transport calculations.

CRITICAL: Any filtering is applied to individual instrument records BEFORE interpolation onto the common time grid. The typical use case is low-pass filtering followed by downsampling, and downsampling first could alias high-frequency variability into the filtered dataset.
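
To make this ordering concrete, the sketch below low-pass filters a synthetic 10-minute record on its native grid before interpolating onto an hourly grid. All values are illustrative; this is not the module's internal code:

import numpy as np
from scipy.signal import butter, sosfiltfilt

# Hypothetical example: 10-minute native sampling, 1-hour target grid
t_native = np.arange(0, 30 * 86400, 600)              # seconds over 30 days
data = np.sin(2 * np.pi * t_native / (12.42 * 3600))  # synthetic semidiurnal tide

# 1. Low-pass filter on the NATIVE grid (2-day cutoff, 6th order)
fs = 1.0 / 600.0                           # sampling frequency, Hz
cutoff_hz = 1.0 / (2.0 * 86400.0)          # 2-day cutoff, Hz
sos = butter(6, cutoff_hz / (fs / 2.0), btype="low", output="sos")
filtered = sosfiltfilt(sos, data)

# 2. Only afterwards interpolate onto the coarser common grid
t_common = np.arange(0, 30 * 86400, 3600)
gridded = np.interp(t_common, t_native, filtered)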

1. Overview

Multiple instruments on a mooring often sample at different rates, creating challenges for comparative analysis and further processing. Time gridding addresses this by:

  • Optionally applying time-domain filtering to individual instrument records

  • Interpolating all instruments onto a common temporal grid

  • Preserving high-frequency temporal resolution when desired

  • Creating unified mooring datasets with standardized structure

The process combines individual instrument files (*_use.nc) into single mooring files (*_mooring_use.nc) with an N_LEVELS dimension representing the different instruments/depths.

2. Purpose

  • Optional filtering: Apply low-pass filters to remove high-frequency variability

  • Common time grid: Align instruments with different sampling rates onto a single time vector

  • Data consolidation: Create single files representing entire moorings

  • Preserve resolution (optional): Maintain temporal detail for high-frequency analysis

  • Standardized structure: Enable consistent downstream processing

3. Processing Workflow

The correct processing order is essential for data integrity:

  1. Load instrument datasets: Read all available *_use.nc files for a mooring

  2. Apply filtering to individual datasets: Filter each instrument on its native time grid

  3. Timing analysis: Analyze sampling rates and detect temporal coverage

  4. Common grid creation: Calculate the median sampling interval across all instruments (other grid choices may be worth considering in future)

  5. Interpolation: Interpolate filtered datasets onto the common temporal grid

  6. Dataset combination: Merge into single dataset with N_LEVELS dimension

  7. Metadata encoding: Convert string variables to CF-compliant integer flags

  8. NetCDF output: Write combined mooring dataset

Implemented Filtering Types

Low-pass Butterworth Filter (RAPID-style)

  • Purpose: Remove tidal and inertial variability for long-term analysis

  • Default parameters: 2-day cutoff, 6th-order Butterworth

  • Applications: Climate studies, transport calculations, data volume reduction

  • Gap handling: Filters continuous segments separately

Possible future additions (especially relevant for bottom-pressure records):

Harmonic De-tiding (future)

  • Purpose: Remove specific tidal constituents using harmonic analysis

  • Status: Placeholder; currently falls back to low-pass filtering

  • Applications: Precise tidal removal while retaining sub-tidal variability

4. Current Implementation

The time gridding process is implemented in the oceanarray.time_gridding module, which provides automated processing for mooring datasets.

4.1. Input Requirements

The oceanarray.time_gridding.TimeGriddingProcessor class processes Stage 1, 2 or 3 output files:

  • Individual instrument files (*_use.nc): Trimmed and clock-corrected time series; the processor can also be applied to *_qc.nc files

  • YAML configuration: Mooring metadata with instrument specifications

  • Multiple sampling rates: Automatic detection and handling of different temporal resolutions

4.2. Processing Workflow Implementation

The TimeGriddingProcessor.process_mooring() method follows these steps:

  1. Load instrument datasets: Read all available *_use.nc files for a mooring

  2. Apply individual filtering: Use _apply_time_filtering_single() on each dataset

  3. Timing analysis: Analyze sampling rates using _analyze_timing_info()

  4. Common grid creation: Calculate median sampling interval across filtered instruments

  5. Interpolation: Use _interpolate_datasets() on filtered data

  6. Dataset combination: Merge using _create_combined_dataset()

  7. Metadata encoding: Apply _encode_instrument_as_flags()

  8. NetCDF output: Write combined mooring dataset

4.3. Filtering Implementation

Individual Dataset Filtering

The _apply_time_filtering_single() method processes each instrument separately:

def _apply_time_filtering_single(self, dataset, filter_type, filter_params):
    """Apply filtering to individual instrument on native time grid."""
    if filter_type == 'lowpass':
        return self._apply_lowpass_filter(dataset, filter_params)
    elif filter_type == 'detide':
        return self._apply_detiding_filter(dataset, filter_params)
    else:
        return dataset  # No filtering

Low-pass Butterworth Filter

The _apply_lowpass_filter() method implements RAPID-style filtering:

  • Frequency analysis: Validates cutoff frequency against Nyquist limit

  • Filter design: 6th order Butterworth low-pass filter using scipy.signal

  • Gap handling: Processes continuous segments separately via _filter_with_gaps()

  • Quality control: Checks data length and validity before filtering

  • Robust processing: Graceful fallbacks when filtering fails

Filter Parameters

filter_params = {
    'cutoff_days': 2.0,     # Cutoff frequency in days
    'order': 6,             # Filter order
    'method': 'butterworth' # Filter type
}
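
As a rough sketch of how these parameters translate into a scipy.signal filter design (design_lowpass is a hypothetical helper, not part of the module's API):

from scipy.signal import butter

def design_lowpass(cutoff_days, order, dt_seconds):
    """Convert filter parameters into a Butterworth low-pass design."""
    fs = 1.0 / dt_seconds                      # sampling frequency, Hz
    cutoff_hz = 1.0 / (cutoff_days * 86400.0)  # cutoff frequency, Hz
    wn = cutoff_hz / (fs / 2.0)                # normalized to Nyquist
    if not 0.0 < wn < 1.0:
        raise ValueError("cutoff must lie below the Nyquist frequency")
    return butter(order, wn, btype="low", output="sos")

# Hourly sampling with the default parameters above
sos = design_lowpass(cutoff_days=2.0, order=6, dt_seconds=3600.0)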

4.4. Timing Analysis and Warnings

The processor provides comprehensive timing analysis (see the sketch after this list):

  • Sampling rate detection: Identifies median intervals for each instrument

  • Interpolation warnings: Alerts when large sampling rate differences exist (>2x)

  • Missing instrument alerts: Compares loaded files against YAML configuration

  • Irregular sampling detection: Flags instruments with >10% timing variability

  • Filter impact assessment: Reports changes to original sampling rates
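
A minimal sketch of the interval and irregularity checks, assuming a datetime64 time array (illustrative only; _analyze_timing_info() may differ in detail):

import numpy as np

def timing_summary(time):
    """Median sampling interval (seconds) and variability for one instrument."""
    dt = np.diff(time).astype("timedelta64[s]").astype(float)
    median_dt = np.median(dt)
    variability = np.std(dt) / median_dt
    if variability > 0.10:
        print(f"WARNING: irregular sampling ({variability:.0%} variability)")
    return median_dt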

4.5. Configuration Example

Time gridding uses existing YAML mooring configurations:

name: mooring_name
instruments:
  - instrument: microcat
    serial: 7518
    depth: 100
  - instrument: adcp
    serial: 1234
    depth: 300
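
A minimal sketch of reading such a configuration with PyYAML (the file name is hypothetical):

import yaml

# Hypothetical configuration file name
with open("mooring_name.yaml") as f:
    config = yaml.safe_load(f)

for inst in config["instruments"]:
    print(inst["instrument"], inst["serial"], inst["depth"])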

4.6. Usage Examples

Basic Processing (No Filtering)

from oceanarray.time_gridding import process_multiple_moorings_time_gridding

# Process moorings without filtering
moorings = ['mooring1', 'mooring2']
results = process_multiple_moorings_time_gridding(moorings, basedir)

RAPID-style De-tiding

# Apply 2-day low-pass filter (RAPID-style)
results = process_multiple_moorings_time_gridding(
    moorings, basedir,
    filter_type='lowpass',
    filter_params={'cutoff_days': 2.0, 'order': 6}
)

Custom Filter Parameters

# Custom filter settings
results = process_multiple_moorings_time_gridding(
    moorings, basedir,
    filter_type='lowpass',
    filter_params={'cutoff_days': 1.0, 'order': 4}
)

5. Output Format

The time-gridded output includes:

  • Combined mooring dataset (*_mooring_use.nc) with:

      - Time coordinates common to all instruments

      - N_LEVELS dimension representing instrument/depth levels

      - Variables stacked across instruments, with NaN for missing data

      - Comprehensive metadata preservation

      - Filter provenance when filtering is applied

  • Processing logs with detailed information about:

      - Filtering decisions and parameters

      - Timing analysis and interpolation decisions

      - Missing instruments and sampling-rate warnings

      - Processing success/failure status

Example output structure:

<xarray.Dataset>
Dimensions:        (time: 8640, N_LEVELS: 3)
Coordinates:
  * time           (time) datetime64[ns] 2018-08-12T08:00:00 ... 2018-08-26T20:00:00
  * N_LEVELS       (N_LEVELS) int64 0 1 2
    nominal_depth  (N_LEVELS) float32 100.0 200.0 300.0
    serial_number  (N_LEVELS) int64 7518 7519 1234
    clock_offset   (N_LEVELS) int64 0 300 -120
Data variables:
    temperature    (time, N_LEVELS) float32 ...
    salinity       (time, N_LEVELS) float32 ...
    pressure       (time, N_LEVELS) float32 ...
    instrument_id  (N_LEVELS) int16 1 1 2
Attributes:
    mooring_name:              test_mooring
    instrument_names:          microcat, adcp
    time_filtering_applied:    {'cutoff_days': 2.0, 'order': 6}  # If filtered
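
As a usage illustration, the combined file can be opened with xarray and sliced by instrument level (the file name is hypothetical):

import xarray as xr

# File name follows the *_mooring_use.nc convention
ds = xr.open_dataset("test_mooring_mooring_use.nc")

# Select the instrument level whose nominal depth is 200 m
level = int((ds["nominal_depth"] == 200.0).argmax("N_LEVELS"))
temp_200m = ds["temperature"].isel(N_LEVELS=level)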

6. Quality Control and Processing Intelligence

The time gridding processor includes several quality control features:

  • Temporal coverage analysis: Identifies gaps and overlaps in instrument records

  • Sampling rate optimization: Uses median interval to minimize interpolation artifacts

  • Missing data handling: Preserves NaN values and missing instrument periods

  • Filter validation: Checks filter parameters against data characteristics

  • Interpolation impact assessment: Quantifies changes to original sampling rates

  • Comprehensive logging: Detailed processing logs for debugging and validation

7. Time-Domain Filtering Details

Time-domain filtering is particularly useful for:

  • Long-term climate studies: Removing tidal signals for multi-year analysis

  • Transport calculations: Focusing on sub-inertial variability

  • Data volume reduction: Subsampling to lower frequencies for storage efficiency

  • Spectral analysis preparation: Removing specific frequency bands

But it is not necessarily appropriate for:

  • High-frequency process studies: Where tidal and inertial signals are of interest

  • Short-term deployments: Where filtering may remove significant portions of the record

7.1. RAPID Array Context: De-tiding for Long-term Records

The filtering implementation is based on the RAPID array processing workflow, where 2-day low-pass Butterworth filtering (6th order) was applied to remove tidal and inertial variability from year-long mooring records.

RAPID filtering characteristics:

  • Purpose: Remove tides from hourly-sampled, year-long records

  • Filter type: Butterworth, 6th order

  • Cutoff frequency: 2 days (~0.0058 mHz)

  • Application: Temperature, salinity, and pressure time series

  • Output frequency: Often subsampled to 12-hourly intervals

  • Gap handling: Interpolation across gaps <10 days

Historical RAPID workflow:

% MATLAB implementation (hydro_grid.m)
filtered_temp = auto_filt(temperature, sample_rate, cutoff_days);

This approach was essential for RAPID’s 20-year dataset management, converting high-frequency hourly data to manageable half-daily records suitable for transport calculations and long-term climate analysis.

Modern improvements:

The Python implementation in oceanarray.time_gridding provides equivalent functionality with:

  • Multi-instrument handling: Process entire moorings simultaneously

  • Flexible filtering: Multiple filter types and parameters

  • Quality control: Comprehensive timing analysis and warnings

  • Modern formats: NetCDF output with CF conventions

  • Gap-aware processing: Intelligent handling of data gaps

7.2. Filter Implementation Details (not yet implemented)

Gap Handling

The _filter_with_gaps() method processes data with missing values (see the sketch after this list):

  • Segment identification: Finds continuous data segments

  • Minimum length: Only filters segments >50 points for stability

  • Separate processing: Filters each segment independently

  • Graceful fallbacks: Preserves original data if filtering fails
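
A sketch of the segmentation logic under these rules (continuous_segments is hypothetical; the module's _filter_with_gaps() may differ):

import numpy as np

def continuous_segments(data, min_length=50):
    """Yield (start, stop) index pairs for continuous non-NaN runs."""
    valid = np.isfinite(data)
    edges = np.flatnonzero(np.diff(valid.astype(int))) + 1
    bounds = np.concatenate(([0], edges, [data.size]))
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if valid[start] and (stop - start) > min_length:
            yield start, stop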

Quality Validation

Before applying filters, the system validates (see the sketch after this list):

  • Data length: Minimum 100 points required for stable filtering

  • Sampling rate: Must be regular and well-defined

  • Nyquist criterion: Cutoff frequency must be below Nyquist limit

  • Data quality: Sufficient valid (non-NaN) data points
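
A sketch of three of these checks (the regular-sampling test is omitted; validate_for_filtering is hypothetical, not the module's API):

import numpy as np

def validate_for_filtering(data, dt_seconds, cutoff_days, min_points=100):
    """Pre-filter validation checks (illustrative thresholds)."""
    if data.size < min_points:
        return False, "record too short for stable filtering"
    if np.sum(np.isfinite(data)) < min_points:
        return False, "too few valid (non-NaN) points"
    nyquist_hz = 0.5 / dt_seconds
    if 1.0 / (cutoff_days * 86400.0) >= nyquist_hz:
        return False, "cutoff at or above the Nyquist frequency"
    return True, "ok"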

Filter Variables

The following variables are filtered when present:

  • Temperature, salinity, conductivity

  • Pressure and derived quantities

  • Velocity components (u, v, eastward, northward)

Coordinate variables and metadata are preserved unchanged.
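
A minimal sketch of this variable selection, assuming ds is an xarray.Dataset for one instrument (the candidate set is illustrative):

# Candidate variable names for filtering (illustrative set)
FILTER_CANDIDATES = {"temperature", "salinity", "conductivity",
                     "pressure", "u", "v", "eastward", "northward"}

vars_to_filter = [name for name in ds.data_vars if name in FILTER_CANDIDATES]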

8. Integration with Processing Chain

Time-gridded files serve as input to subsequent mooring-level processing:

  • Step 2: Vertical gridding onto common pressure levels

  • Step 3: Multi-deployment temporal stitching

  • Analysis workflows: Transport calculations, climatological analysis

The consistent structure and temporal alignment created during time gridding enables efficient downstream processing across different instrument configurations.

Processing provenance is maintained through:

  • Global attributes recording filter parameters

  • Processing logs with detailed decision information

  • Preserved original metadata where possible

  • Clear documentation of interpolation and filtering steps

9. Implementation Notes

  • Interpolation method: Linear interpolation via xarray.Dataset.interp() (see the sketch after this list)

  • Time handling: All times processed as UTC datetime64 objects

  • Memory efficiency: Chunked NetCDF output for large datasets

  • Attribute preservation: Global and variable attributes maintained through processing

  • Missing data: NaN values preserved and propagated appropriately

  • Filter dependencies: Requires scipy for Butterworth filter implementation
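
A minimal sketch of the interpolation step referenced above, assuming ds_filtered is a filtered instrument dataset (grid values hypothetical):

import numpy as np

# Hypothetical hourly common grid spanning the deployment
common_time = np.arange(np.datetime64("2018-08-12T08:00"),
                        np.datetime64("2018-08-26T20:00"),
                        np.timedelta64(1, "h")).astype("datetime64[ns]")

# Linear interpolation onto the common grid; NaN gaps propagate through
ds_gridded = ds_filtered.interp(time=common_time, method="linear")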

10. FAIR Considerations

  • Findable: Standardized file naming and comprehensive metadata

  • Accessible: NetCDF format with CF conventions for broad compatibility

  • Interoperable: Consistent structure across moorings and deployments

  • Reusable: Detailed processing logs and parameter documentation

Time gridding decisions, interpolation details, and filtering parameters are documented transparently in processing logs and dataset attributes to maintain full provenance.

Filter provenance includes:

  • Filter type and parameters in global attributes

  • Original sampling rates and interpolation changes

  • Gap locations and filter segment boundaries

  • Quality control decisions and warnings

See also: oceanarray API, 2. Trimming to Deployed period, Step 2: Vertical Gridding, stitching