seagliderOG1 demo

This notebook demonstrates how seagliderOG1 converts Seaglider basestation files to OG1 format.

The test case is to convert sg015 data from the Labrador Sea in September 2004.

The demo is organised to show

  • Conversion of a single dive cycle (single p*.nc file)

  • Conversion for a folder of local dive-cycle files (full mission of p*.nc files)

  • Download from remote server + conversion (directory with full mission of p*.nc files)

Options are provided to load only a subset of files (e.g. 10), but note that the OG1 format expects a full mission.
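As a rough illustration of what restricting a mission to a range of dives involves, the sketch below filters a list of filenames by dive number. It is not the package's implementation: the helper name select_dive_files is hypothetical, the filename pattern (p, a 3-digit glider id, a 4-digit dive number, an optional _YYYYMMDD suffix) is an assumption based on the sample file name used later in this notebook, and treating both ends of the range as inclusive is also an assumption.

```python
import re

# Assumed pattern: p + 3-digit glider id + 4-digit dive number,
# optionally followed by _YYYYMMDD, then .nc
DIVE_FILE = re.compile(r"^p(\d{3})(\d{4})(?:_\d{8})?\.nc$")


def select_dive_files(filenames, start_profile=None, end_profile=None):
    """Return per-dive filenames whose dive number lies in the requested range.

    Hypothetical helper; both ends of the range are treated as inclusive.
    """
    selected = []
    for name in sorted(filenames):
        match = DIVE_FILE.match(name)
        if match is None:
            continue  # not a per-dive basestation file
        dive = int(match.group(2))
        if start_profile is not None and dive < start_profile:
            continue
        if end_profile is not None and dive > end_profile:
            continue
        selected.append(name)
    return selected


# Made-up filenames for illustration only
files = ["p0050001.nc", "p0050002.nc", "p0050010.nc", "sg005_log.txt"]
print(select_dive_files(files, start_profile=1, end_profile=5))
```

With the made-up list above, only the files for dives 1 and 2 are kept; the log file and dive 10 are skipped.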

[1]:
import pathlib
import sys

script_dir = pathlib.Path().parent.absolute()
parent_dir = script_dir.parents[0]
sys.path.append(str(parent_dir))
sys.path.append(str(parent_dir) + '/seagliderOG1')
print(parent_dir)
print(sys.path)
### silence future warnings
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import xarray as xr
import os
from seagliderOG1 import readers, writers, plotters
from seagliderOG1 import convertOG1, vocabularies

/home/runner/work/seagliderOG1/seagliderOG1
['/home/runner/micromamba/envs/TEST/lib/python314.zip', '/home/runner/micromamba/envs/TEST/lib/python3.14', '/home/runner/micromamba/envs/TEST/lib/python3.14/lib-dynload', '', '/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages', '/home/runner/work/seagliderOG1/seagliderOG1', '/home/runner/work/seagliderOG1/seagliderOG1/seagliderOG1']
[2]:
# Specify the path for writing datafiles
data_path = os.path.join(parent_dir, 'data')

Reading basestation files

There are three ways to load a glider dataset: a single sample dataset, a local directory of basestation files, or a remote directory (URL).

Load an example dataset using seagliderOG1.readers.load_sample_dataset

Alternatively, use your own with e.g. ds = xr.open_dataset('/path/to/yourfile.nc')

Load single sample dataset

[3]:
ds = readers.load_sample_dataset()
ds
[3]:
<xarray.Dataset> Size: 27kB
Dimensions:                       (sg_data_point: 53, gps_info: 3, gc_event: 7,
                                   trajectory: 1)
Coordinates:
    longitude                     (sg_data_point) float64 424B ...
    latitude                      (sg_data_point) float64 424B ...
    ctd_time                      (sg_data_point) datetime64[ns] 424B ...
    ctd_depth                     (sg_data_point) float64 424B ...
  * trajectory                    (trajectory) int32 4B 1
Dimensions without coordinates: sg_data_point, gps_info, gc_event
Data variables: (12/337)
    surface_curr_north            float64 8B ...
    surface_curr_east             float64 8B ...
    start_of_climb_time           float64 8B ...
    sg_cal_volmax                 float64 8B ...
    sg_cal_vbd_min_cnts           int32 4B ...
    sg_cal_vbd_max_cnts           int32 4B ...
    ...                            ...
    buoyancy                      (sg_data_point) float64 424B ...
    SBE43_qc                      |S1 1B ...
    GPSE_qc                       |S1 1B ...
    GPS2_qc                       |S1 1B ...
    GPS1_qc                       |S1 1B ...
    CTD_qc                        |S1 1B ...
Attributes: (12/59)
    date_modified:                   2014-03-11T20:03:32Z
    quality_control_version:         1.1
    base_station_micro_version:      3897
    time_coverage_resolution:        PT1S
    geospatial_vertical_max:         51.62461963987948
    sea_name:                        North Atlantic Ocean
    ...                              ...
    disclaimer:                      Data provided AS-IS.
    geospatial_vertical_positive:    no
    date_created:                    2014-03-11T20:03:32Z
    geospatial_vertical_units:       meter
    dive_number:                     1
    history:                         Processing start:\n20:13:32 11 Mar 2014 ...

Load datasets from a local directory

[4]:
# Specify the input directory on your local machine
input_dir = data_path + '/demo_sg005'  # choose the input directory with your data

# Load and concatenate all datasets in the input directory
# Optionally, specify the range of profiles to load (start_profile, end_profile)
list_datasets = readers.load_basestation_files(input_dir, start_profile=0, end_profile=5)

# Where list_datasets is a list of xarray datasets.  A single dataset can be accessed as
ds = list_datasets[0]
Scanning files: 100%|██████████| 5/5 [00:00<00:00, 33.43file/s]
Loading datasets: 100%|██████████| 5/5 [00:00<00:00, 31.95file/s]
[5]:
ds = readers.load_sample_dataset()

Load datasets from a remote directory (URL)

[6]:
# Specify the server where data are located
#server = "https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/033/20100903/"

# Load and concatenate all datasets from the server, optionally specifying the range of profiles to load
#list_datasets = readers.load_basestation_files(server, start_profile=1, end_profile=10)

Convert to OG1 format

Process:

  1. For one basestation dataset, split the dataset by dimension (split_ds)

  2. Transform into OG1 format: dataset with dims sg_data_point

    • Change the dimension to N_MEASUREMENTS

    • Rename variables according to vocabularies.standard_names

    • Assign variable attributes according to vocabularies.vocab_attrs. (Note: This could go wrong since it makes assumptions about the input variables. May need additional handling.)

  3. Add missing mandatory variables:

    • From split_ds[(gps_info,)], add the LATITUDE_GPS, LONGITUDE_GPS and TIME_GPS (Note: presently TIME_GPS is stripped before saving, but TIME values contain TIME_GPS)

    • Create PROFILE_NUMBER and PHASE

    • Calculate DEPTH_Z which is positive up

  4. Update attributes for the file.

    • Combines creator and contributor from original attributes into contributor

    • Adds contributing_institutions based on institution

    • Reformats time in time_coverage_* and start_time -> start_date

    • Adds date_modified

    • Renames comments -> history, site -> summary

    • Adds title, platform, platform_vocabulary, featureType, Conventions, rtqc_method* according to OceanGliders format

    • Retains naming_authority, institution, project, geospatial_* as OG attributes

    • Retains extra attributes: license, keywords, keywords_vocabulary, file_version, acknowledgement, date_created, disclaimer
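The attribute handling in step 4 can be sketched on a plain dictionary. This is a minimal illustration, not the code in convertOG1: the function names are hypothetical, the input is assumed to hold ISO-8601 strings ending in Z (as in the date_created attribute of the sample dataset), and the people named in the example are placeholders.

```python
from datetime import datetime, timezone


def compact_time(iso_string):
    """Convert '2014-03-11T20:03:32Z' to the compact form '20140311T200332'."""
    parsed = datetime.strptime(iso_string, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y%m%dT%H%M%S")


def merge_people(attrs):
    """Join creator_* and contributor_* entries into contributor_* strings."""
    merged = {}
    for field in ("name", "email"):
        values = [attrs.get(f"creator_{field}"), attrs.get(f"contributor_{field}")]
        merged[f"contributor_{field}"] = ", ".join(v for v in values if v)
    return merged


def update_attrs(attrs):
    """Hypothetical helper: build a new attribute dict from the original one."""
    out = merge_people(attrs)
    for key in ("time_coverage_start", "time_coverage_end", "date_created"):
        if key in attrs:
            out[key] = compact_time(attrs[key])
    # Stamp the time of conversion in the same compact form
    out["date_modified"] = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    return out


# Placeholder people; the date is the date_created value of the sample dataset
example = {
    "creator_name": "A. Creator",
    "contributor_name": "B. Contributor",
    "date_created": "2014-03-11T20:03:32Z",
}
updated = update_attrs(example)
print(updated["contributor_name"])
print(updated["date_created"])
```

The compact form matches the date_created and time_coverage_* values shown in the attribute tables in this notebook.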

Future behaviour to be added:

  1. Retain the variables starting with sg_cal and check whether they vary over the mission (shouldn’t)

  2. Add sensors, using information in the split_ds with no dimensions

    • Need (from sg_cal_constants: sg_cal plus volmax, vbd_cnts_per_cc, therm_expan, t_*, mass, hd_*, ctcor, cpcor, c_*, abs_compress, a, Tcor, Soc, Pcor, Foffset)

    • Maybe also reviewed, magnetic_variation (which will change with position), log_D_FLARE, flight_avg_speed_north and flight_avg_speed_east (also with _gsm), depth_avg_curr_north and depth_avg_curr_east (also with _gsm), wlbb2f (means sensor), sg_cal_mission_title, sg_cal_id_str, calibcomm_oxygen, calibcomm, sbe41 (means ??), hdm_qc, glider
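The first item above, checking whether the sg_cal variables change over a mission, could look something like the following. This is only a sketch of one possible approach: the function name is hypothetical, each dive is represented as a plain dictionary of scalar values rather than an xarray dataset, and the numbers are made up.

```python
def varying_calibration(dives, prefix="sg_cal_"):
    """Return names of prefix-matching scalars that take more than one value.

    `dives` is a list of dicts, one per dive, mapping variable name to a
    scalar value. Note that NaN values compare unequal to each other and
    would therefore be reported as varying.
    """
    seen = {}
    for dive in dives:
        for name, value in dive.items():
            if name.startswith(prefix):
                seen.setdefault(name, set()).add(value)
    return sorted(name for name, values in seen.items() if len(values) > 1)


# Made-up values for illustration only
dives = [
    {"sg_cal_mass": 52.0, "sg_cal_hd_a": 0.003, "temperature": 7.1},
    {"sg_cal_mass": 52.0, "sg_cal_hd_a": 0.004, "temperature": 6.8},
]
print(varying_calibration(dives))  # ['sg_cal_hd_a']
```

An empty result would indicate that the calibration constants can safely be stored once per mission.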

Convert a single (sample) dataset

[7]:
# Loads one dataset (p0150500_20050213.nc)
ds = readers.load_sample_dataset()

ds_OG1, var_list = convertOG1.convert_to_OG1(ds)

# Check the results by plotting depth for the converted dataset
plotters.plot_profile_depth(ds_OG1)
Processing datasets:   0%|          | 0/1 [00:00<?, ?dataset/s]
No conversion information found for micromoles/kg to micromoles/kg

No conversion information found for cm s-1 to cm s-1

No conversion information found for micromoles/kg to micromoles/kg
/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages/xarray/core/duck_array_ops.py:264: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)
Variables removed from dataset: ['eng_depth', 'eng_elaps_t', 'eng_elaps_t_0000', 'latitude_gsm', 'longitude_gsm', 'glide_angle_gsm', 'horz_speed_gsm', 'north_displacement_gsm', 'east_displacement_gsm', 'speed_gsm', 'vert_speed_gsm', 'dive_num_cast', 'density']
Processing datasets: 100%|██████████| 1/1 [00:00<00:00,  2.01dataset/s]
The following HDM parameters were found: ['VBD_MIN_CNTS', 'VBD_CNTS_PER_CC', 'VBD_CC_PER_CNTS', 'MASS', 'VOLMAX', 'C_VBD', 'HD_A', 'HD_B', 'HD_C']
Warning: The following potential HDM parameters were not found in the datasets: ['VBD_BIAS']
_images/demo-output_12_2.png
[8]:
# Print the list of initial variables in the dataset
var_list
[8]:
['log_AH0_24V',
 'log_D_FLARE',
 'depth',
 'log_SM_CCo',
 'glide_angle',
 'log_KALMAN_USE',
 'log_D_FINISH',
 'log_DEVICE4',
 'avg_latitude',
 'log_SENSOR_MAMPS',
 'sg_cal_roll_min_cnts',
 'log_SIM_PITCH',
 'log_gps_time',
 'log_RAFOS_CORR_THRESH',
 'log_CAP_FILE_SIZE',
 'log_VBD_DBAND',
 'log_D_SURF',
 'log_HEADING',
 'sg_cal_C',
 'salinity_qc',
 'log_RAFOS_HIT_WINDOW',
 'sg_cal_mass',
 'eng_wlbb2f_fluorCount',
 'log_USE_BATHY',
 'log_INT_PRESSURE_YINT',
 'sg_cal_B',
 'depth_avg_curr_qc',
 'log_MASS',
 'depth_avg_curr_north_gsm',
 'log_T_GPS_CHARGE',
 'log_ROLL_MIN',
 'log_P_OVSHOOT',
 'GPS1_qc',
 'sbe43_results_time',
 'eng_rollAng',
 'longitude',
 'log__XMS_NAKs',
 'time',
 'log_ERRORS',
 'log_ROLL_MAX',
 'log_VBD_TIMEOUT',
 'sg_cal_vbd_max_cnts',
 'log_TCM_ROLL_OFFSET',
 'log_ROLL_ADJ_DBAND',
 'sound_velocity',
 'sg_cal_Soc',
 'log_SENSOR_SECS',
 'sbe43',
 'log_C_PITCH',
 'sigma_t',
 'log_T_TURN_SAMPINT',
 'sg_cal_c_i',
 'longitude_gsm',
 'log_COURSE_BIAS',
 'glider',
 'depth_avg_curr_east',
 'gc_gcphase',
 'log_DEVICE1',
 'depth_avg_curr_east_gsm',
 'log_IRIDIUM_FIX',
 'log_D_SAFE',
 'log_PITCH_ADJ_DBAND',
 'log_AD7714Ch0Gain',
 'log_FIX_MISSING_TIMEOUT',
 'log_R_PORT_OVSHOOT',
 'speed_gsm',
 'log_ID',
 'log_SURFACE_URGENCY_FORCE',
 'eng_wlbb2f_blueCount',
 'log_UPLOAD_DIVES_MAX',
 'log_CALL_NDIVES',
 'surface_curr_north',
 'sg_cal_hd_c',
 'flight_avg_speed_east',
 'eng_pitchAng',
 'log_SURFACE_URGENCY_TRY',
 'sg_cal_calibcomm_oxygen',
 'log_SM_GC',
 'log_VBD_PUMP_AD_RATE_APOGEE',
 'log_ALTIM_SENSITIVITY',
 'glide_angle_gsm',
 'log_SEABIRD_C_I',
 'sg_cal_volmax',
 'log_D_OFFGRID',
 'log_DIVE',
 'latitude_gsm',
 'flight_avg_speed_north',
 'log_ALTIM_BOTTOM_PING_RANGE',
 'log_HUMID',
 'log_RHO',
 'sg_cal_hd_a',
 'gc_vbd_secs',
 'log_N_NOCOMM',
 'sg_cal_c_j',
 'log_ALTIM_PING_DELTA',
 'theta',
 'log_T_TURN',
 'gc_pitch_ctl',
 'log_SMARTS',
 'trajectory',
 'eng_condFreq',
 'log_XPDR_DEVICE',
 'log_SEABIRD_T_G',
 'gc_pitch_ad',
 'log_ESCAPE_HEADING_DELTA',
 'gc_st_secs',
 'log_SEABIRD_T_J',
 'log_RAFOS_PEAK_OFFSET',
 'east_displacement',
 'log_GPS',
 'horz_speed',
 'surface_curr_east',
 'log_D_GRID',
 'log_24V_AH',
 'flight_avg_speed_north_gsm',
 'gc_pitch_i',
 'log_HD_A',
 'log_CALL_TRIES',
 'sg_cal_Pcor',
 'sbe41',
 'log_DEVICE_MAMPS',
 'eng_vbdCC',
 'sg_cal_pitch_max_cnts',
 'north_displacement_gsm',
 'sg_cal_roll_max_cnts',
 'log_DATA_FILE_SIZE',
 'log_R_STBD_OVSHOOT',
 'log_N_FILEKB',
 'log_COMPASS_USE',
 'log_DEVICE3',
 'log_DEVICES',
 'ctd_time',
 'gc_roll_ad',
 'sbe43_dissolved_oxygen',
 'log_PITCH_GAIN',
 'sg_cal_pitch_min_cnts',
 'log_ALTIM_BOTTOM_TURN_MARGIN',
 'sg_cal_c_h',
 'sg_cal_ctcor',
 'log_ALTIM_PULSE',
 'log_PITCH_CNV',
 'log_GPS2',
 'east_displacement_gsm',
 'log_PHONE_DEVICE',
 'log_ALTIM_PING_DEPTH',
 'surface_curr_qc',
 'GPS2_qc',
 'log_FILEMGR',
 'vert_speed_gsm',
 'log_ICE_FREEZE_MARGIN',
 'CTD_qc',
 'log_SURFACE_URGENCY',
 'log_TCM_PITCH_OFFSET',
 'log_ROLL_AD_RATE',
 'log_GPS1',
 'log_USE_ICE',
 'log_DEEPGLIDERMB',
 'log_ROLL_ADJ_GAIN',
 'log_C_ROLL_DIVE',
 'log_C_VBD',
 'log_HEAPDBG',
 'salinity_raw',
 'gc_vbd_ad',
 'log_INT_PRESSURE_SLOPE',
 'conductivity_qc',
 'start_of_climb_time',
 'log_SEABIRD_C_G',
 'log_CAPUPLOAD',
 'log_T_WATCHDOG',
 'log_T_GPS',
 'log_PRESSURE_SLOPE',
 'log_D_ABORT',
 'log_DEVICE_SECS',
 'log_TCM_TEMP',
 'eng_wlbb2f_blueRef',
 'log_SPEED_FACTOR',
 'log_VBD_PUMP_AD_RATE_SURFACE',
 'log_SIM_W',
 'eng_tempFreq',
 'log_TT8_MAMPS',
 'log_PITCH_MIN',
 'log_ALTIM_FREQUENCY',
 'temperature_raw_qc',
 'log_SEABIRD_T_I',
 'log_DEVICE2',
 'eng_head',
 'log_HD_C',
 'log_10V_AH',
 'log_SENSORS',
 'log_C_ROLL_CLIMB',
 'log_T_DIVE',
 'log_ALTIM_TOP_MIN_OBSTACLE',
 'temperature_raw',
 'log_ROLL_TIMEOUT',
 'log_gps_lon',
 'log_SEABIRD_C_H',
 'log_ALTIM_TOP_TURN_MARGIN',
 'log_PITCH_AD_RATE',
 'sg_cal_QC_cond_spike_depth',
 'log_CALL_WAIT',
 'magnetic_variation',
 'log_SMARTDEVICE1',
 'sg_cal_pump_rate_intercept',
 'speed',
 'speed_qc',
 'sg_cal_pump_power_intercept',
 'log_COMPASS_DEVICE',
 'buoyancy',
 'eng_depth',
 'eng_pitchCtl',
 'log_T_GPS_ALMANAC',
 'gc_roll_i',
 'sg_cal_t_i',
 'log_APOGEE_PITCH',
 'log_SEABIRD_T_H',
 'log_FERRY_MAX',
 'log_TGT_NAME',
 'log_HEAD_ERRBAND',
 'wlbb2f',
 'eng_sbe43_O2Freq',
 'log_MAX_BUOY',
 'log_ROLL_MAXERRORS',
 'log_CFSIZE',
 'sg_cal_pump_rate_slope',
 'log_D_CALL',
 'log_AH0_10V',
 'gc_pitch_secs',
 'log_D_NO_BLEED',
 'sg_cal_cpcor',
 'gc_data_pts',
 'log_PITCH_VBD_SHIFT',
 'sg_cal_t_h',
 'ctd_depth',
 'log_TGT_AUTO_DEFAULT',
 'gc_depth',
 'log_D_TGT',
 'log_DEVICE6',
 'eng_wlbb2f_VFtemp',
 'directives',
 'log_TGT_DEFAULT_LAT',
 'gc_vbd_ctl',
 'log_KERMIT',
 'salinity',
 'flight_avg_speed_east_gsm',
 'horz_speed_gsm',
 'log_CF8_MAXERRORS',
 'log_KALMAN_CONTROL',
 'log_XPDR_PINGS',
 'sigma_theta',
 'log_RELAUNCH',
 'pressure',
 'log_SEABIRD_C_J',
 'log_TGT_LATLONG',
 'log_SPEED_LIMITS',
 'latitude',
 'log_PRESSURE_YINT',
 'gc_end_secs',
 'GPSE_qc',
 'sg_cal_QC_temp_spike_depth',
 'conductivity_raw',
 'eng_wlbb2f_redRef',
 'log_PITCH_MAXERRORS',
 'log_gps_lat',
 'temperature',
 'log_DEVICE5',
 'log_KALMAN_X',
 'eng_elaps_t',
 'sg_cal_A',
 'log_COMM_SEQ',
 'log_ESCAPE_HEADING',
 'vert_speed',
 'log_HD_B',
 'sg_cal_pump_power_slope',
 'log_T_NO_W',
 'sg_cal_Foffset',
 'log_UNCOM_BLEED',
 'log_T_MISSION',
 'log_SM_CC',
 'log_TGT_DEFAULT_LON',
 'log_T_RSLEEP',
 'log_CAPMAXSIZE',
 'log__SM_DEPTHo',
 'sg_cal_t_g',
 'log_VBD_MAXERRORS',
 'gc_vbd_i',
 'log_D_PITCH',
 'sg_cal_vbd_cnts_per_cc',
 'conductivity_raw_qc',
 'sg_cal_vbd_min_cnts',
 'eng_rollCtl',
 'log_ROLL_CNV',
 'density',
 'sg_cal_id_str',
 'sg_cal_calibcomm',
 'log_ALTIM_TOP_PING',
 'log_ALTIM_TOP_PING_RANGE',
 'log_DEEPGLIDER',
 'eng_elaps_t_0000',
 'SBE43_qc',
 'log_MHEAD_RNG_PITCHd_Wd',
 'log_VBD_CNV',
 'log_VBD_MIN',
 'temperature_qc',
 'log_PITCH_ADJ_GAIN',
 'log_SMARTDEVICE2',
 'log_RAFOS_DEVICE',
 'hdm_qc',
 'log_KALMAN_Y',
 'log_N_GPS',
 'sg_cal_t_j',
 'dissolved_oxygen_sat',
 'log_VBD_MAX',
 'sbe43_dissolved_oxygen_qc',
 'log_GPS_DEVICE',
 'conductivity',
 'north_displacement',
 'log_PITCH_TIMEOUT',
 'log_N_NOSURFACE',
 'log_GLIDE_SLOPE',
 'reviewed',
 'log__XMS_TOUTs',
 'log_ROLL_DEG',
 'gc_ob_vertv',
 'log_T_ABORT',
 'sg_cal_hd_b',
 'log_VBD_BLEED_AD_RATE',
 'log_PITCH_MAX',
 'sg_cal_cond_bias',
 'log_PITCH_DBAND',
 'log__SM_ANGLEo',
 'eng_wlbb2f_redCount',
 'sg_cal_mission_title',
 'log_NAV_MODE',
 'sg_cal_c_g',
 'log_TGT_RADIUS',
 'depth_avg_curr_north',
 'salinity_raw_qc',
 'log_MOTHERBOARD',
 'gc_roll_secs',
 'log__CALLS',
 'log_MISSION',
 'sg_cal_E']
[9]:
# Print to screen a table of attributes
plotters.show_contents(ds_OG1,'attrs')
information is based on xarray Dataset
[9]:
Attribute Value DType
0 title OceanGliders trajectory file str
1 id sg005_20080606T180738_delayed str
2 platform sub-surface gliders str
3 platform_vocabulary https://vocab.nerc.ac.uk/collection/L06/curren... str
4 naming_authority edu.washington.apl str
5 institution School of Oceanography\nUniversity of Washingt... str
6 geospatial_lat_min 61.41415 ndarray
7 geospatial_lat_max 61.41549829808199 ndarray
8 geospatial_lon_min -8.279016666666667 ndarray
9 geospatial_lon_max -8.273983333333332 ndarray
10 geospatial_vertical_min 0.0 ndarray
11 geospatial_vertical_max 51.40003042836588 ndarray
12 time_coverage_start 20080606T180256 str
13 time_coverage_end 20080606T183207 str
14 site Multiple transects of Faroe-Iceland Ridge uppe... str
15 project Iceland Scotland Ridge June 2008 str
16 contributor_name Charlie Eriksen, Peter Rhines str
17 contributor_role PI, Principal investigator str
18 contributor_role_vocabulary http://vocab.nerc.ac.uk/search_nvs/W08, str
19 contributor_email eriksen@uw.edu, str
20 contributing_institutions University of Washington - School of Oceanogra... str
21 contributing_institutions_vocabulary https://edmo.seadatanet.org/report/1434, str
22 contributing_institutions_role PI, str
23 contributing_institutions_role_vocabulary http://vocab.nerc.ac.uk/collection/W08/current/, str
24 uri 9e33a22e-a959-11e3-b35f-0026bb609360 str
25 rtqc_method No QC applied str
26 rtqc_method_doi n/a str
27 comment Processing start:\n20:13:32 11 Mar 2014 UTC: I... str
28 start_date 20080606T180738 str
29 date_created 20140311T200332 str
30 featureType trajectoryProfile str
31 Conventions CF-1.10,OG-1.0 str
32 date_modified 20260414T130559 str
33 keywords_vocabulary NASA/GCMD Earth Science Keywords Version 6.0.0.0 str
34 license These data may be redistributed and used witho... str
35 disclaimer Data provided AS-IS. str
36 keywords Water Temperature, Conductivity, Salinity, Den... str
37 file_version 2.71 float32
38 acknowledgment National Science Foundation, OCE Division, Gra... str
39 contributer_email null@null.com str
[10]:
# Print to screen a table of the variables and variable attributes
plotters.show_contents(ds_OG1,'variables')
information is based on xarray Dataset
[10]:
  dims units comment standard_name dtype
name          
BBP470 N_MEASUREMENTS As reported by instrument float32
BBP470_REF N_MEASUREMENTS As reported by instrument float32
BBP700 N_MEASUREMENTS As reported by instrument float32
BBP700_REF N_MEASUREMENTS As reported by instrument float32
BBP_VFTEMP N_MEASUREMENTS degrees_Celsius As reported by the instrument float32
BUOYANCY N_MEASUREMENTS g Buoyancy of vehicle, corrected for compression effects float32
CNDC N_MEASUREMENTS S/m Conductivity corrected for anomalies sea_water_electrical_conductivity float32
CNDC_QC N_MEASUREMENTS Whether to trust each corrected conductivity value status_flag float32
CNDC_RAW N_MEASUREMENTS S/m Uncorrected conductivity float32
CNDC_RAW_QC N_MEASUREMENTS Whether to trust each raw conductivity value status_flag float32
COND_FREQ N_MEASUREMENTS As reported by the instrument float32
DAVG_CURR_EAST N_MEASUREMENTS m/s Eastward component of depth-average current based on hdm eastward_sea_water_velocity float32
DAVG_CURR_NORTH N_MEASUREMENTS m/s Northward component of depth-average current based on hdm northward_sea_water_velocity float32
DEPTH N_MEASUREMENTS m from science pressure and interpolated depth float64
DEPTH_Z N_MEASUREMENTS meters Depth calculated from pressure using gsw library, positive up. depth float64
DIVE_NUMBER N_MEASUREMENTS 1 int32
DOXY N_MEASUREMENTS micromoles/kg Oxygen concentration corrected for salinity mole_concentration_of_dissolved_molecular_oxygen_in_sea_water float32
DOXY_QC N_MEASUREMENTS Whether to trust each SBE43 dissolved oxygen value status_flag float32
EAST_DISPLACEMENT N_MEASUREMENTS meters Eastward displacement from hdm float32
FLUOCHLA N_MEASUREMENTS As reported by instrument float32
GLIDER_HORZ_VELO_MODEL N_MEASUREMENTS cm/s Vehicle horizontal speed based on hdm float32
GLIDER_VERT_VELO_MODEL N_MEASUREMENTS cm/s Vehicle vertical speed based on hdm float32
GLIDE_ANGLE N_MEASUREMENTS cm/s Glide angle based on hdm float32
GLIDE_SPEED N_MEASUREMENTS cm/s Vehicle speed based on hdm float32
GLIDE_SPEED_QC N_MEASUREMENTS Whether to trust each hdm speed value status_flag float32
HEADING N_MEASUREMENTS degrees Vehicle heading (magnetic) float32
LATITUDE N_MEASUREMENTS degrees_north Latitude of the sample based on hdm DAC latitude float64
LATITUDE_GPS N_MEASUREMENTS degrees_north latitude float64
LONGITUDE N_MEASUREMENTS degrees_east Longitude of the sample based on hdm DAC longitude float64
LONGITUDE_GPS N_MEASUREMENTS degrees_east longitude float64
NORTH_DISPLACEMENT N_MEASUREMENTS meters Northward displacement from hdm float32
O2_FREQ N_MEASUREMENTS Hz As reported by instrument float32
OXYSAT N_MEASUREMENTS micromoles/kg Calculated saturation value for oxygen given measured presure and corrected temperature, and salinity float32
PHASE N_MEASUREMENTS float64
PHASE_QC N_MEASUREMENTS int64
PITCH N_MEASUREMENTS degrees Vehicle pitch float32
PITCH_CTL N_MEASUREMENTS float32
PRES N_MEASUREMENTS dbar Uncorrected sea-water pressure sea_water_pressure float32
PROFILE_NUMBER N_MEASUREMENTS float64
PSAL N_MEASUREMENTS 1e-3 Salinity corrected for thermal-inertia effects (PSU) sea_water_salinity float32
PSAL_QC N_MEASUREMENTS Whether to trust each corrected salinity value status_flag float32
PSAL_RAW N_MEASUREMENTS 1e-3 Uncorrected salinity derived from temperature_raw and conductivity_raw (PSU) float32
PSAL_RAW_QC N_MEASUREMENTS Whether to trust each raw salinity value status_flag float32
ROLL N_MEASUREMENTS degrees Vehicle roll float32
ROLL_CTL N_MEASUREMENTS float32
SIGMA_T N_MEASUREMENTS g/m^3 Sigma based on density sea_water_sigma_t float32
SIGTHETA N_MEASUREMENTS g/m^3 sea_water_sigma_theta float32
SOUND_VELOCITY N_MEASUREMENTS m/s Sound velocity speed_of_sound_in_sea_water float32
TEMP N_MEASUREMENTS degrees_Celsius Termperature (in situ) corrected for thermistor first-order lag sea_water_temperature float32
TEMP_FREQ N_MEASUREMENTS As reported by the instrument float32
TEMP_QC N_MEASUREMENTS Whether to trust each corrected temperature value status_flag float32
TEMP_RAW N_MEASUREMENTS degrees_Celsius Uncorrected temperature (in situ) float32
TEMP_RAW_QC N_MEASUREMENTS Whether to trust each raw temperature value status_flag float32
THETA N_MEASUREMENTS degrees_Celsius Potential temperature based on corrected salinity sea_water_potential_temperature float32
TIME N_MEASUREMENTS seconds since 1970-01-01T00:00:00Z Time of CTD sample in GMT epoch format time datetime64[ns]
TIME_DOXY N_MEASUREMENTS SBE43 time in GMT epoch format time datetime64[ns]
TIME_GPS N_MEASUREMENTS datetime64[ns]
VBD_CC N_MEASUREMENTS float32
C_VBD string int64
DEPLOYMENT_LATITUDE string float64
DEPLOYMENT_LONGITUDE string float64
DEPLOYMENT_TIME string datetime64[ns]
HD_A string Hydrodynamic lift factor for given hull shape (1/degrees of attack angle) float64
HD_B string Hydrodynamic drag factor for given hull shape (Pa^(-1/4)) float64
HD_C string Hydrodynamic induced drag factor for given hull shape (1/radians^2 of attack angle) float64
MASS string kg Mass of the glider float64
PLATFORM_MODEL string
PLATFORM_SERIAL_NUMBER string
TRAJECTORY string
VBD_CC_PER_CNTS string float64
VBD_CNTS_PER_CC string float64
VBD_MIN_CNTS string int64
VOLMAX string m^3 Maximum displaced volume of the glider float64
WMO_IDENTIFIER string

Convert mission from a local directory of basestation files

  • For local data in the directory input_dir

  • Creates a plot of ctd_depth against ctd_time.

[11]:
# Specify the input directory on your local machine
input_dir = data_path + '/demo_sg005'  # choose the input directory with your data

# Load and concatenate all datasets in the input directory
# Optionally, specify the range of profiles to load (start_profile, end_profile)
list_datasets = readers.load_basestation_files(input_dir, start_profile=1, end_profile=5)

# Convert the list of datasets to OG1
ds_OG1, var_list = convertOG1.convert_to_OG1(list_datasets)

# Generate a simple plot
plotters.plot_profile_depth(ds_OG1)
plotters.show_contents(ds_OG1,'attrs')
Scanning files: 100%|██████████| 5/5 [00:00<00:00, 40.16file/s]
Loading datasets: 100%|██████████| 5/5 [00:00<00:00, 38.82file/s]
Processing datasets:   0%|          | 0/5 [00:00<?, ?dataset/s]
No conversion information found for micromoles/kg to micromoles/kg

No conversion information found for cm s-1 to cm s-1

No conversion information found for micromoles/kg to micromoles/kg
/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages/xarray/core/duck_array_ops.py:264: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)
Variables removed from dataset: ['eng_depth', 'eng_elaps_t', 'eng_elaps_t_0000', 'latitude_gsm', 'longitude_gsm', 'glide_angle_gsm', 'horz_speed_gsm', 'north_displacement_gsm', 'east_displacement_gsm', 'speed_gsm', 'vert_speed_gsm', 'dive_num_cast', 'density']
Processing datasets:  20%|██        | 1/5 [00:00<00:02,  1.97dataset/s]/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages/xarray/core/duck_array_ops.py:264: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)
Processing datasets:  40%|████      | 2/5 [00:01<00:01,  1.99dataset/s]/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages/xarray/core/duck_array_ops.py:264: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)
Processing datasets:  60%|██████    | 3/5 [00:01<00:01,  1.99dataset/s]/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages/xarray/core/duck_array_ops.py:264: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)
Processing datasets:  80%|████████  | 4/5 [00:02<00:00,  1.99dataset/s]/home/runner/micromamba/envs/TEST/lib/python3.14/site-packages/xarray/core/duck_array_ops.py:264: RuntimeWarning: invalid value encountered in cast
  return data.astype(dtype, **kwargs)
Processing datasets: 100%|██████████| 5/5 [00:02<00:00,  1.99dataset/s]
The following HDM parameters were found: ['VBD_MIN_CNTS', 'VBD_CNTS_PER_CC', 'VBD_CC_PER_CNTS', 'MASS', 'VOLMAX', 'C_VBD', 'HD_A', 'HD_B', 'HD_C']
Warning: The following potential HDM parameters were not found in the datasets: ['VBD_BIAS']

_images/demo-output_17_3.png
information is based on xarray Dataset
[11]:
Attribute Value DType
0 title OceanGliders trajectory file str
1 id sg005_20080606T180738_delayed str
2 platform sub-surface gliders str
3 platform_vocabulary https://vocab.nerc.ac.uk/collection/L06/curren... str
4 naming_authority edu.washington.apl str
5 institution School of Oceanography\nUniversity of Washingt... str
6 geospatial_lat_min 61.41231666666666 ndarray
7 geospatial_lat_max 61.57591666666667 ndarray
8 geospatial_lon_min -8.747133333333332 ndarray
9 geospatial_lon_max -8.273983333333332 ndarray
10 geospatial_vertical_min -0.3214989667970032 ndarray
11 geospatial_vertical_max 845.8311973927603 ndarray
12 time_coverage_start 20080606T180256 str
13 time_coverage_end 20080607T080838 str
14 site Multiple transects of Faroe-Iceland Ridge uppe... str
15 project Iceland Scotland Ridge June 2008 str
16 contributor_name Charlie Eriksen, Peter Rhines str
17 contributor_role PI, Principal investigator str
18 contributor_role_vocabulary http://vocab.nerc.ac.uk/search_nvs/W08, str
19 contributor_email eriksen@uw.edu, str
20 contributing_institutions University of Washington - School of Oceanogra... str
21 contributing_institutions_vocabulary https://edmo.seadatanet.org/report/1434, str
22 contributing_institutions_role PI, str
23 contributing_institutions_role_vocabulary http://vocab.nerc.ac.uk/collection/W08/current/, str
24 uri 9e33a22e-a959-11e3-b35f-0026bb609360 str
25 rtqc_method No QC applied str
26 rtqc_method_doi n/a str
27 comment Processing start:\n20:13:32 11 Mar 2014 UTC: I... str
28 start_date 20080606T180738 str
29 date_created 20140311T200332 str
30 featureType trajectoryProfile str
31 Conventions CF-1.10,OG-1.0 str
32 date_modified 20260414T130602 str
33 keywords_vocabulary NASA/GCMD Earth Science Keywords Version 6.0.0.0 str
34 license These data may be redistributed and used witho... str
35 disclaimer Data provided AS-IS. str
36 keywords Water Temperature, Conductivity, Salinity, Den... str
37 file_version 2.71 float32
38 acknowledgment National Science Foundation, OCE Division, Gra... str
39 contributer_email null@null.com str

Convert mission from the NCEI server (with p*.nc files)

[12]:
# Specify the server where data are located
#server = "https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/033/20100903/"

# Load and concatenate all datasets from the server, optionally specifying the range of profiles to load
#list_datasets = readers.load_basestation_files(server, start_profile=1, end_profile=19)

# Convert the list of datasets to OG1
#ds_OG1, var_list = convertOG1.convert_to_OG1(list_datasets)

Saving data

Writing an xarray dataset to netCDF can fail when an attribute is not one of the supported types (str, Number, np.ndarray, np.number, list, tuple). The function save_dataset was written to handle this.
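To make the idea concrete, the sketch below converts unsupported attribute values to strings before writing. It is an assumption-laden illustration rather than the body of save_dataset: the function name is hypothetical, it works on a plain dictionary, and it converts values up front instead of catching errors during the write.

```python
import numbers

try:
    import numpy as np
    ALLOWED_TYPES = (str, numbers.Number, np.ndarray, np.number, list, tuple)
except ImportError:
    # Fall back to the built-in types if numpy is not installed
    ALLOWED_TYPES = (str, numbers.Number, list, tuple)


def stringify_unsupported_attrs(attrs):
    """Return a cleaned copy of attrs and the keys whose values were converted.

    Hypothetical helper: any value that is not an instance of ALLOWED_TYPES
    is replaced by its string representation.
    """
    cleaned = {}
    converted = []
    for key, value in attrs.items():
        if isinstance(value, ALLOWED_TYPES):
            cleaned[key] = value
        else:
            cleaned[key] = str(value)
            converted.append(key)
    return cleaned, converted


# Made-up attributes for illustration only
attrs = {
    "title": "OceanGliders trajectory file",
    "dive_number": 1,
    "reviewed": None,
    "extra": {"a": 1},
}
cleaned, converted = stringify_unsupported_attrs(attrs)
print(converted)
```

Returning the list of converted keys makes it easy to report which attributes lost their original type, since string conversion may be undesired for some of them.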

[13]:
# Write the file
# This writer catches errors in data types (DType errors) when using xr.to_netcdf()
# The solution is to convert them to strings, which may be undesired behaviour
output_file = os.path.join(data_path, 'demo_test.nc')
if os.path.exists(output_file):
    os.remove(output_file)

writers.save_dataset(ds_OG1, output_file);
[14]:
# Load the data saved
ds1 = xr.open_dataset(output_file)

# Generate a simple plot
#plotters.show_contents(ds1,'attrs')
plotters.plot_depth_colored(ds1, color_by='PROFILE_NUMBER')

_images/demo-output_22_0.png

Run multiple missions

[15]:
# Add these to existing attributes - update to your details
contrib_to_append = vocabularies.contrib_to_append
print(contrib_to_append)
{'contributor_name': 'Eleanor Frajka-Williams', 'contributor_email': 'eleanorfrajka@gmail.com', 'contributor_role': 'Data scientist', 'contributor_role_vocabulary': 'http://vocab.nerc.ac.uk/search_nvs/W08', 'contributing_institutions': 'University of Hamburg - Institute of Oceanography', 'contributing_institutions_vocabulary': 'https://edmo.seadatanet.org/report/1156', 'contributing_institutions_role': 'Data scientist', 'contributing_institutions_role_vocabulary': 'http://vocab.nerc.ac.uk/search_nvs/W08'}
[16]:
# Specify a list of servers or local directories
input_locations = [
    # Either Iceland, Faroes or RAPID/MOCHA
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/005/20090829/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/005/20080606/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/005/20081106/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/012/20070831/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/014/20080214/",  # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/014/20080222/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/016/20061112/",  # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/016/20090605/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/016/20071113/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/016/20080607/",  # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/033/20100518/", # done
    "https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/033/20100903/", # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/101/20081108/",     # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/101/20061112/",    # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/101/20070609/",   # done
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/102/20061112/",  # done
    # Labrador Sea
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/015/20040924/",
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/014/20040924/",
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/008/20031002/",
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/004/20031002/",
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/016/20050406/",
    # RAPID/MOCHA
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/033/20100729/",
    #"https://www.ncei.noaa.gov/data/oceans/glider/seaglider/uw/034/20110128/",
]

# Example usage: uncomment to process and save each mission in input_locations
#for input_loc in input_locations:
#    ds_all = convertOG1.process_and_save_data(input_loc, output_dir=data_path, save=True, run_quietly=True)