Brief

We need a process that takes (i) raw Proofsafe survey records, (ii) GPS transect lines in GeoJSON format, and (iii) project information, and uploads them to the database in a consistent format so that transect survey data from different projects can be stored and analysed together.

Transect surveys at ARI follow a double-observer distance-sampling design, where two observers walk the same transect and independently record detections. Surveys are conducted either during the day (e.g. koala surveys) or at night via spotlight (e.g. greater glider surveys). Raw data is captured in the field using Proofsafe forms and exported as CSV files.

Input

There are three inputs to this process:

  1. A proofsafe records table — the raw CSV export from Proofsafe containing animal detections along transects
  2. A GeoJSON file of GPS transect lines — spatial line features with SiteID and Transect columns linking back to the records
  3. A project information table — a single-row CSV with metadata about the project (e.g. project name, target species, survey design)

See the appendix for the full list of columns expected in each uploaded table.

Transect database model

Once this process is completed, data will be uploaded into the raw layer of the transect schema on the database. Automatic database views will then process this raw data into the curated and processed layers.

Raw layer — all records as uploaded, including any re-uploads (de-duplication handled by curated views):

  • raw_transect_records — individual animal detections
  • raw_transects — transect deployment information (with spatial geometry)
  • raw_project_information — project metadata

Curated layer — duplicates removed, most data fields retained:

  • curated_transect_records — most recent records for each unique detection
  • curated_transects — most recent entry for each transect deployment
  • curated_project_information — most recent entry for each project
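
As a rough illustration of what "most recent entry" means in the curated layer, the sketch below de-duplicates a toy raw table in base R. The column names record_hash and uploaded_at, and all values, are assumptions made for this example; the real curated views run on the database and may use different keys.

```r
# Toy raw layer: the same detection (record_hash "a1") was uploaded twice.
# `record_hash` and `uploaded_at` are hypothetical column names.
raw_records <- data.frame(
  record_hash = c("a1", "b2", "a1"),
  common_name = c("Koala", "Koala", "Koala"),
  Individuals = c(1L, 2L, 3L),
  uploaded_at = as.POSIXct(
    c("2024-01-01 09:00:00", "2024-01-01 09:00:00", "2024-02-01 09:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Sort newest first, then keep the first row seen for each hash
newest_first <- raw_records[order(raw_records$uploaded_at, decreasing = TRUE), ]
curated_records <- newest_first[!duplicated(newest_first$record_hash), ]
curated_records
```

The re-uploaded detection survives only in its latest form, which is why re-uploading corrected data does not require deleting anything from the raw layer.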

Processed layer — derived outputs ready for analysis:

  • processed_transect_presence_absence — presence/absence of each species at each transect × site × iteration combination, with spatial geometry attached
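
To make the presence/absence idea concrete, here is a minimal base R sketch that crosses every surveyed transect with every recorded species and marks the combinations that have at least one record. The toy tables are invented for illustration and the Presence column name is an assumption; the database view may be built differently.

```r
# Toy curated tables; values are illustrative only
transects <- data.frame(
  SiteID    = c("S1", "S1", "S2"),
  Transect  = c(1L, 2L, 1L),
  Iteration = c(1L, 1L, 1L)
)
records <- data.frame(
  SiteID      = c("S1", "S1"),
  Transect    = c(1L, 1L),
  Iteration   = c(1L, 1L),
  common_name = c("Koala", "Southern Boobook")
)

# Every surveyed transect crossed with every species seen in the project
species <- data.frame(common_name = unique(records$common_name))
grid <- merge(transects, species, by = NULL)  # by = NULL gives a cross join

# A combination is a presence when at least one record matches it
key <- function(d) paste(d$SiteID, d$Transect, d$Iteration, d$common_name, sep = "|")
grid$Presence <- as.integer(key(grid) %in% key(records))
grid
```

Transects with no detections still appear in the result with Presence = 0, which is the information that a records-only table cannot provide.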

Process

Uploading transect data involves eight steps. The easiest way to complete these steps is through the interactive Shiny app described below. Alternatively, the steps can be run directly in R using the functions described in Upload using R code.

  1. Import the proofsafe records CSV
  2. Import the GeoJSON transect lines
  3. Import the project information CSV
  4. Format proofsafe data to database format (applies the relevant *_proofsafe_format() function)
  5. Standardise species names to VBA taxonomy
  6. Inspect a map of records and transects
  7. Run data quality checks (all checks must pass before upload)
  8. Upload to database

Upload data using the Shiny app

The easiest way to upload transect data is through the hosted Shiny app at:

https://arisci.shinyapps.io/transect-app/

Note: the app requires VPN access and login credentials.

The app walks you through all eight steps in the sidebar. Download the example data from the “Download Example Data” button in the app to see the expected file formats before preparing your own data.

The app currently uses the region_gg_proofsafe_format() function in Step 4. If you need to upload koala survey data or standard (non-regional) greater glider data, use the R code workflow below with koala_proofsafe_format() or gg_proofsafe_format() respectively.

You can also run the app locally from R if needed. Set up a database connection and launch the app:

Note: to learn more about database connections, see vignette('database-connect')

# Make sure VPN is running
con <- weda::weda_connect(password = keyring::key_get(service = "ari-dev-weda-psql-01",
                                                      username = "psql_user"))
# Launch the transect app
shiny::runApp(system.file("app/transect-app.R", package = "weda"))

Upload using R code

The following steps replicate what the Shiny app does, but in R code. This is useful when the app’s default format function does not match your survey type, or when you want to script a repeatable upload workflow.

Example data files used below are bundled with the package. The regional greater glider example uses a raw proofsafe export (region_gg_records.csv) alongside its GPS transect lines and project metadata:

library(weda)
library(dplyr)       # %>%, filter(), select()
library(kableExtra)  # kbl(), kable_styling(), scroll_box(), collapse_rows()
proofsafe_path <- system.file("dummydata/transectdata/region_gg_records.csv", package = "weda")
transects_path <- system.file("dummydata/transectdata/region_gg_transects.geojson", package = "weda")
project_path   <- system.file("dummydata/transectdata/region_gg_project.csv", package = "weda")

Step 1 — Load proofsafe records

Load the raw proofsafe CSV export. This is the direct output from exporting a Proofsafe form — no pre-processing is needed at this stage.

raw_proofsafe <- readr::read_csv(proofsafe_path, show_col_types = FALSE)

raw_proofsafe %>%
  head(5) %>%
  kbl() %>%
  kable_styling() %>%
  scroll_box(width = "100%")
Business_Id_B Business_Name_B Author_Id_F Author_Name_F Form_Id_F Form_Name_F File_Id_F File_Name_F Section_Id_F Data_Section_Id_F Parent_Data_Section_Id_F Section_Id_921_H1 Data_Section_Id_921_H1 SiteID_H1 Transect_H1 Date_H1 Start_time_H1 Observer_H1...18 ObserverOther_H1 Observer_H1...20 GPS_H1 StatusBurn_H1 Access_H1 TransectNotes_H1 Visibility_H1 Temp_C_H1 MoonPhase_H1 Nightlight_H1 CloudCover_H1 Wind_H1 Precipitation_H1 FlowerIndex_H1 Section_Id_922_H2 Data_Section_Id_922_H2 P1 Vis_rank_H2 P2 Vis_rank_H2 P3 Vis_rank_H2 P4 Vis_rank_H2 P5 Vis_rank_H2 P6 Vis_rank_H2 End_time_H2 Notes_H2 Section_Id_923_I3 Data_Section_Id_923_I3 Animal_I3 AnimObsTime_I3 SeenHeard_I3 Species_I3 Animal_sp_other_I3 L or R of trans_I3 Waypoint no._I3 AnimalHeight_I3 Distance to animal_I3 Bearing to A._I3 Dist_F_Transect_I3 Tree species_I3 Tree_sp_other_I3 SeenX2_I3 Comments_I3
1 Department of Environment 1 ARI Contractor 1 NA 1001 Example Road S1 N1 923 1001 0 921 1000 S1 1 2023-04-15 20:00:00 Other Jane Smith- Alex Jones 1 Other Control Access via main road NA Poor: thick understorey 15 50 Medium 40 Light breeze (6-11 km/h) No rain No trees in flower 922 1000 51-75% 51-75% 0-25% 0-25% 0-25% 51-75% 21:30:00 NA 923 1001 01 20:45:00 Heard Southern Boobook NA Left 01 NA 35.00 236 190 NA NA No Transect bearing = 335.
1 Department of Environment 1 ARI Contractor 1 NA 1001 Example Road S1 N1 923 1002 0 921 1000 S1 1 2023-04-15 20:00:00 Other Jane Smith- Alex Jones 1 Other Control Access via main road NA Poor: thick understorey 15 50 Medium 40 Light breeze (6-11 km/h) No rain No trees in flower 922 1000 51-75% 51-75% 0-25% 0-25% 0-25% 51-75% 21:30:00 NA 923 1002 03 22:20:00 Seen Southern Greater Glider NA Right 03 28 47.11 52 430 Peppermint sp. NA NA Transect bearing = 356.

Step 2 — Load GPS transect lines

The transect geometry must be provided as a GeoJSON file containing line features. The GeoJSON must include at least a SiteID and Transect column so that spatial lines can be joined to the survey records.

gps_transects <- sf::st_read(transects_path, quiet = TRUE)

gps_transects %>%
  kbl() %>%
  kable_styling() %>%
  scroll_box(width = "100%")
SiteID Transect geometry
S1 1 LINESTRING (146.8004 -37.90…
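
Because records and lines are joined on SiteID and Transect, it can help to confirm before formatting that every site and transect in the records has a matching line. The sketch below shows one way to do this in base R on toy data frames; the data and the helper name make_key() are illustrative assumptions, not part of the package.

```r
# Toy stand-ins for the two inputs; values are illustrative only
record_keys <- data.frame(
  SiteID   = c("S1", "S1", "S2"),
  Transect = c(1L, 1L, 3L)
)
transect_keys <- data.frame(
  SiteID   = c("S1", "S2"),
  Transect = c(1L, 1L)
)

# Build a composite key and list record keys with no matching transect line
make_key <- function(d) paste(d$SiteID, d$Transect, sep = "_")
unmatched <- setdiff(make_key(record_keys), make_key(transect_keys))
unmatched
```

With real inputs, the record keys would presumably come from the SiteID_H1 and Transect_H1 columns of raw_proofsafe, and the transect keys from gps_transects after dropping its geometry (for example with sf::st_drop_geometry()).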

Step 3 — Create project information

Alongside the survey data, a single-row table is required to record project-level metadata. You can load this from a CSV (as done in the app) or construct it directly in R. Check that your ProjectShortName is not already in use by another project with check_unique_project().

project_information <- readr::read_csv(project_path, show_col_types = FALSE)

project_information %>%
  kbl() %>%
  kable_styling() %>%
  scroll_box(width = "100%")
ProjectName | ProjectShortName | DistanceSampling | TerrestrialArboreal | AllSpeciesTagged | DistanceForAllSpecies | DiurnalNocturnal | ProjectDescription | ProjectLeader
Greater Glider Example Survey | gg_example | TRUE | Arboreal | TRUE | FALSE | Nocturnal | Example project for vignette demonstration | First Last

The required fields are:

Column Description
ProjectName Full descriptive project name
ProjectShortName Short unique identifier used as a key in the database
DistanceSampling Logical — whether the survey used distance sampling
TerrestrialArboreal "Terrestrial", "Arboreal" or "Both"
AllSpeciesTagged Logical — whether all encountered species were recorded
DistanceForAllSpecies Logical — whether distances were recorded for all species
DiurnalNocturnal "Diurnal", "Nocturnal" or "Both"
ProjectDescription Brief description of the project objectives
ProjectLeader Name of the project lead
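
If you prefer to construct the table directly in R rather than load a CSV, something like the following should work. The values are copied from the example project shown above; the sanity checks at the end are a hand-written approximation of the project-level rules reported by the quality checks in Step 7 (which also accept "Both" for the two categorical fields), not a replacement for transect_dq().

```r
# Values copied from the example project; replace them with your own
project_information <- data.frame(
  ProjectName           = "Greater Glider Example Survey",
  ProjectShortName      = "gg_example",
  DistanceSampling      = TRUE,
  TerrestrialArboreal   = "Arboreal",
  AllSpeciesTagged      = TRUE,
  DistanceForAllSpecies = FALSE,
  DiurnalNocturnal      = "Nocturnal",
  ProjectDescription    = "Example project for vignette demonstration",
  ProjectLeader         = "First Last",
  stringsAsFactors      = FALSE
)

# Light sanity checks: one row, no missing values, expected types and categories
stopifnot(
  nrow(project_information) == 1,
  !anyNA(project_information),
  is.logical(project_information$DistanceSampling),
  is.logical(project_information$AllSpeciesTagged),
  is.logical(project_information$DistanceForAllSpecies),
  project_information$TerrestrialArboreal %in% c("Terrestrial", "Arboreal", "Both"),
  project_information$DiurnalNocturnal %in% c("Diurnal", "Nocturnal", "Both")
)
```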

Step 4 — Format proofsafe data to database format

This is the key transformation step. The appropriate *_proofsafe_format() function converts the raw proofsafe CSV and GPS transect lines into the standardised database format, calculating derived fields such as perpendicular distances and projected animal locations.

Choose the function that matches your survey type:

Survey type Function
Koala (diurnal double-observer) koala_proofsafe_format()
Greater glider spotlight (standard) gg_proofsafe_format()
Greater glider spotlight (regional) region_gg_proofsafe_format()

For this example we use region_gg_proofsafe_format():

formatted_data <- region_gg_proofsafe_format(
  proofsafe          = raw_proofsafe,
  gps_transects      = gps_transects,
  Iteration          = 1L,
  MaxTruncationDistance = 100
)
#> Joining with `by = join_by(Author_Id_F, SiteID_H1, Transect_H1, Date_H1,
#> Observer_H1...20)`
#> Joining with `by = join_by(SiteID, Transect, Species, AnimalID, Date)`

# The output is a list with two elements:
names(formatted_data)
#> [1] "records"   "transects"

For koala surveys, the call is equivalent but uses koala_proofsafe_format():

formatted_data <- koala_proofsafe_format(
  proofsafe             = raw_proofsafe,
  gps_transects         = gps_transects,
  sp_filter             = "Koala",
  Iteration             = 1L,
  SurveyMethod          = "Diurnal double-observer distance-sampling",
  MaxTruncationDistance = 100
)

The returned list contains:

  • $records — data.frame of formatted animal detections
  • $transects — sf data.frame of transect deployments with spatial geometry
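
The formatting functions calculate the derived fields internally, and their exact method is not shown here. As a sketch of the standard distance-sampling geometry behind a perpendicular distance, the code below takes the horizontal distance to the animal and multiplies it by the sine of the angle between the transect bearing and the bearing to the animal. The three helper functions are hypothetical, and treating the recorded height as the vertical leg of a right triangle is an assumption.

```r
# Smallest angle (degrees, 0 to 180) between the transect bearing and the
# bearing from observer to animal
angle_from_transect <- function(animal_bearing, transect_bearing) {
  abs((animal_bearing - transect_bearing + 180) %% 360 - 180)
}

# Horizontal distance from a straight-line distance and a height difference
# (assumes height is measured relative to the observer's eye level)
horizontal_distance <- function(distance, height) {
  sqrt(distance^2 - height^2)
}

# Perpendicular distance from the transect line
perp_distance <- function(horizontal, angle_deg) {
  horizontal * sin(angle_deg * pi / 180)
}

# Second example record from Step 1: distance 47.11, height 28,
# bearing to animal 52 degrees, transect bearing 356 degrees
theta <- angle_from_transect(52, 356)
h     <- horizontal_distance(47.11, 28)
perp  <- perp_distance(h, theta)
c(angle = theta, horizontal = h, perpendicular = perp)
```

The modulo step handles bearings that wrap past north: 52 and 356 degrees are 56 degrees apart, not 304.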

Step 5 — Standardise species names

Species names in the database follow the Victorian Biodiversity Atlas (VBA) taxonomy, which requires both a scientific_name and a common_name column. Use standardise_species_names() to check and append the missing name type from the VBA lookup table (weda::vba_name_conversions).

First, check whether your species names match the VBA without modifying the data:

standardise_species_names(recordTable = formatted_data$records,
                          format      = "scientific",
                          speciesCol  = "Species",
                          return_data = FALSE)
#> Warning in standardise_species_names(recordTable = formatted_data$records, : No match found for Southern Boobook, Southern Greater Glider. Please provide names within the VBA taxa list
#> Warning in max(conversions_grouped$n): no non-missing arguments to max;
#> returning -Inf
#> ->
#>  Southern Boobook
#>  Southern Greater Glider

If all names convert successfully, standardise the data:

formatted_records_std <- standardise_species_names(
  recordTable = formatted_data$records,
  format      = "scientific",
  speciesCol  = "Species",
  return_data = TRUE
)
#> Warning in standardise_species_names(recordTable = formatted_data$records, : No match found for Southern Boobook, Southern Greater Glider. Please provide names within the VBA taxa list
#> Warning in max(conversions_grouped$n): no non-missing arguments to max;
#> returning -Inf
#> ->
#>  Southern Boobook
#>  Southern Greater Glider

If a name cannot be matched, as with Southern Boobook and Southern Greater Glider in the example output above, correct it in the source data to align with VBA conventions before continuing. The full VBA lookup can be browsed with View(weda::vba_name_conversions).
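
To show the kind of matching involved, the sketch below compares a vector of species names against a small lookup table in base R and appends the missing name type. The lookup here is a two-row toy table, not the real weda::vba_name_conversions, and this is not how standardise_species_names() is necessarily implemented.

```r
# Toy lookup with the same two name columns the database expects.
# This is NOT the real weda::vba_name_conversions table.
toy_lookup <- data.frame(
  scientific_name  = c("Phascolarctos cinereus", "Petauroides volans"),
  common_name      = c("Koala", "Southern Greater Glider"),
  stringsAsFactors = FALSE
)

# The third name contains a deliberate typo
species_in_records <- c("Koala", "Southern Greater Glider", "Greater Gilder")

# Names that have no exact match in the chosen lookup column
unmatched <- setdiff(species_in_records, toy_lookup$common_name)
unmatched

# Append the scientific name for the names that do match
matched <- merge(
  data.frame(common_name = species_in_records, stringsAsFactors = FALSE),
  toy_lookup,
  by    = "common_name",
  all.x = TRUE
)
matched
```

Exact matching is strict about spelling and capitalisation, so a quick setdiff() like this is a cheap way to find names that need correcting in the source data.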

Step 6 — Inspect a map of records and transects

Before running data quality checks, it is worth visually inspecting the data to confirm that animal records fall in plausible locations relative to the transect lines.

visualise_records(records   = formatted_records_std,
                  transects = formatted_data$transects)
#> Warning in min(cc[, 1], na.rm = TRUE): no non-missing arguments to min;
#> returning Inf
#> Warning in min(cc[, 2], na.rm = TRUE): no non-missing arguments to min;
#> returning Inf
#> Warning in max(cc[, 1], na.rm = TRUE): no non-missing arguments to max;
#> returning -Inf
#> Warning in max(cc[, 2], na.rm = TRUE): no non-missing arguments to max;
#> returning -Inf
#> Warning in min(cc[, 1], na.rm = TRUE): no non-missing arguments to min;
#> returning Inf
#> Warning in min(cc[, 2], na.rm = TRUE): no non-missing arguments to min;
#> returning Inf
#> Warning in max(cc[, 1], na.rm = TRUE): no non-missing arguments to max;
#> returning -Inf
#> Warning in max(cc[, 2], na.rm = TRUE): no non-missing arguments to max;
#> returning -Inf
#> Warning: sf layer has inconsistent datum (+proj=longlat +ellps=GRS80 +no_defs).
#> Need '+proj=longlat +datum=WGS84'
#> Warning: sf layer has inconsistent datum (+proj=longlat +ellps=GRS80 +no_defs).
#> Need '+proj=longlat +datum=WGS84'
#> Warning: sf layer has inconsistent datum (+proj=longlat +ellps=GRS80 +no_defs).
#> Need '+proj=longlat +datum=WGS84'
#> Warning: sf layer has inconsistent datum (+proj=longlat +ellps=GRS80 +no_defs).
#> Need '+proj=longlat +datum=WGS84'
#> Warning: sf layer has inconsistent datum (+proj=longlat +ellps=GRS80 +no_defs).
#> Need '+proj=longlat +datum=WGS84'

The map shows:

  • Blue lines/polygons — transect lines and their truncation-distance buffers
  • Red points — projected animal locations
  • Black lines — observer-to-animal sight-lines
  • Orange lines — second-observer projections (where dual-observer data exists)

Records that fall outside the transect buffer can be inspected or removed using filter_records_outside_transect_area().
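
A simple non-spatial approximation of this check is to compare each record's perpendicular distance with the truncation distance. The sketch below does that on toy data; the OutsideBuffer column is a hypothetical name, and filter_records_outside_transect_area() may use the spatial buffer itself rather than this shortcut.

```r
# Toy formatted records; column names follow the data dictionary,
# values are illustrative only
records <- data.frame(
  AnimalID           = c("01", "02", "03"),
  AnimalPerpDistance = c(12.5, 130, NA)
)
max_truncation_distance <- 100  # same value as MaxTruncationDistance in Step 4

# TRUE when the record lies beyond the truncation distance;
# records without a distance stay NA so they can be reviewed separately
records$OutsideBuffer <- records$AnimalPerpDistance > max_truncation_distance
outside <- records[which(records$OutsideBuffer), ]
outside
```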

Step 7 — Run data quality checks

All data must pass the quality checks before upload. The transect_dq() function runs over 100 checks across the three tables using the pointblank package, covering column presence, data types, value ranges, coordinate bounds, and distance-sampling constraints.

dq <- transect_dq(
  records             = formatted_records_std,
  transects           = formatted_data$transects,
  project_information = project_information
)

dq[[1]]  # records
Pointblank Validation [2026-05-07|05:28:52]
Table: records (tibble). STOP threshold: 1; no WARN or NOTIFY threshold shown.
[The W, S, N and EXT columns appear as coloured dots and download buttons in the HTML report.]

STEP | CHECK | COLUMNS | VALUES | UNITS | PASS | FAIL
1-33 | col_exists() | SiteID, Transect, Iteration, scientific_name, common_name, SurveyMethod, DateTime, AnimalID, SeenHeard, Adults, Joeys, Individuals, LoR, WaypointNo, ObserverLatitude, ObserverLongitude, AnimalDistance, AnimalHeight, AnimalHorizontalDistance, AnimalAngle, AnimalBearing, DistanceFromTransectStart, AnimalPerpDistance, TreeSpecies, BothSeen, ObservationNotes, ObserverPosition, AnimalLongitude, AnimalLatitude, ColourForm, PhotoID, AnimalLongitude2, AnimalLatitude2 (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
34 | rows_distinct() | all 33 columns listed for steps 1-33 | - | 2 | 2 (1.00) | 0 (0.00)
35-46 | col_is_character() | SiteID, scientific_name, common_name, SeenHeard, LoR, WaypointNo, TreeSpecies, ObservationNotes, SurveyMethod, ColourForm, PhotoID, AnimalID (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
47-52 | col_is_integer() | Iteration, Transect, Adults, Joeys, Individuals, ObserverPosition (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
53-65 | col_is_numeric() | ObserverLatitude, ObserverLongitude, AnimalDistance, AnimalHeight, AnimalHorizontalDistance, AnimalAngle, AnimalBearing, DistanceFromTransectStart, AnimalLongitude, AnimalLatitude, AnimalLongitude2, AnimalLatitude2, AnimalPerpDistance (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
66 | col_vals_in_set() | SiteID | S1 | 2 | 2 (1.00) | 0 (0.00)
67 | col_vals_in_set() | Transect | 1 | 2 | 2 (1.00) | 0 (0.00)
68 | col_vals_in_set() | Iteration | 1 | 2 | 2 (1.00) | 0 (0.00)
69 | col_vals_between() | ObserverLatitude | [-60.55, -8.47] | 2 | 2 (1.00) | 0 (0.00)
70 | col_vals_between() | AnimalLatitude | [-60.55, -8.47] | 2 | 2 (1.00) | 0 (0.00)
71 | col_vals_between() | ObserverLongitude | [93.41, 173.34] | 2 | 2 (1.00) | 0 (0.00)
72 | col_vals_between() | AnimalLongitude | [93.41, 173.34] | 2 | 2 (1.00) | 0 (0.00)
73 | col_vals_in_set() (combination of Iteration, SiteID, and Transect) | Iteration_SiteID_Transect | 1_S1_1 | 2 | 2 (1.00) | 0 (0.00)
74 | col_vals_in_set() | scientific_name | (not shown) | 2 | 0 (0.00) | 2 (1.00)
75 | col_vals_in_set() | common_name | (not shown) | 2 | 0 (0.00) | 2 (1.00)
76 | col_vals_in_set() | SeenHeard | Seen, Heard, Other - define in comments | 2 | 2 (1.00) | 0 (0.00)
77 | col_vals_in_set() | SurveyMethod | Diurnal double-observer distance-sampling, Spotlight double-observer distance-sampling, Thermal double-observer distance-sampling, Diurnal single-observer distance-sampling, Spotlight single-observer distance-sampling, Thermal single-observer distance-sampling, Thermal detection, Spotlight detection, Spotlight/call-playback detection, Owl call-playback, Recce, Diurnal bird survey, Diurnal bird survey (with call playback), Diurnal drone survey, Nocturnal drone survey | 2 | 2 (1.00) | 0 (0.00)
78 | col_vals_not_null() | SiteID | - | 2 | 2 (1.00) | 0 (0.00)
79 | col_vals_not_null() | scientific_name | - | 2 | 2 (1.00) | 0 (0.00)
80 | col_vals_not_null() | common_name | - | 2 | 0 (0.00) | 2 (1.00)
81-86 | col_vals_not_null() | DateTime, Iteration, Individuals, ObserverLatitude, ObserverLongitude, ObserverPosition (one step per column) | - | 2 | 2 (1.00) | 0 (0.00)
87 | col_is_posix() | DateTime | - | 1 | 1 (1.00) | 0 (0.00)
88 | col_vals_between() | DateTime | [StartTime, EndTime] | 2 | 1 (0.50) | 1 (0.50)

Started 2026-05-07 05:28:53 UTC, duration 1.2 s, finished 2026-05-07 05:28:54 UTC
dq[[2]]  # transects
Pointblank Validation [2026-05-07|05:28:54]
Table: transects %>% tidyr::as_tibble() (tibble). STOP threshold: 1; no WARN or NOTIFY threshold shown.
[The W, S, N and EXT columns appear as coloured dots and download buttons in the HTML report.]

STEP | CHECK | COLUMNS | VALUES | UNITS | PASS | FAIL
1-24 | col_exists() | SiteID, Transect, Iteration, ObserverPosition, ObserverName, Date, StartTime, EndTime, Duration, Weather, Temperature, TransectNotes, MoonPhase, Cloud, RelativeHumidity, Wind, Precipitation, FlowerIndex, Access, Visibility, TransectLength, TransectType, MaxTruncationDistance, geometry (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
25 | rows_distinct() | SiteID, Transect, Iteration, ObserverPosition, ObserverName, ObserverID, Date, StartTime, EndTime, Duration, Weather, Temperature, TransectNotes, MoonPhase, Cloud, RelativeHumidity, Wind, Precipitation, FlowerIndex, Access, Visibility, TransectLength, MaxTruncationDistance, TransectType, geometry | - | 1 | 1 (1.00) | 0 (0.00)
26-29 | col_is_character() | SiteID, ObserverID, ObserverName, Weather (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
30 | col_is_character() | TransectNotes | - | 1 | 0 (0.00) | 1 (1.00)
31-36 | col_is_character() | Wind, Precipitation, FlowerIndex, Access, Visibility, TransectType (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
37-38 | col_is_numeric() | TransectLength, MaxTruncationDistance (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
39 | col_is_date() | Date | - | 1 | 1 (1.00) | 0 (0.00)
40-42 | col_is_integer() | Iteration, ObserverPosition, Transect (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
43-44 | col_is_posix() | StartTime, EndTime (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
45 | col_vals_in_set() | SiteID | S1, S1 | 1 | 1 (1.00) | 0 (0.00)
46 | col_vals_in_set() | Transect | 1, 1 | 1 | 1 (1.00) | 0 (0.00)
47-59 | col_vals_not_null() | SiteID, Transect, Date, StartTime, EndTime, ObserverID, ObserverName, Iteration, Duration, TransectLength, MaxTruncationDistance, TransectType, geometry (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
60 | col_vals_in_set() | Visibility | Poor, Moderate, Excellent | 1 | 1 (1.00) | 0 (0.00)
61 | col_vals_in_set() | TransectType | Line, Point | 1 | 1 (1.00) | 0 (0.00)
62 | col_vals_in_set() | FlowerIndex | No trees in flower, Light flowering, Medium flowering, Heavy flowering | 0 | 0 (NA) | 0 (NA)

Started 2026-05-07 05:28:54 UTC, duration 1.3 s, finished 2026-05-07 05:28:55 UTC
dq[[3]]  # project information
Pointblank Validation [2026-05-07|05:28:55]
Table: project_information (tibble). STOP threshold: 1; no WARN or NOTIFY threshold shown.
[The W, S, N and EXT columns appear as coloured dots and download buttons in the HTML report.]

STEP | CHECK | COLUMNS | VALUES | UNITS | PASS | FAIL
1 | row_count_match() | - | 1 | 1 | 1 (1.00) | 0 (0.00)
2-10 | col_vals_not_null() | ProjectName, ProjectShortName, DistanceSampling, TerrestrialArboreal, AllSpeciesTagged, DistanceForAllSpecies, DiurnalNocturnal, ProjectDescription, ProjectLeader (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
11-13 | col_is_logical() | DistanceSampling, AllSpeciesTagged, DistanceForAllSpecies (one step per column) | - | 1 | 1 (1.00) | 0 (0.00)
14 | col_vals_in_set() | TerrestrialArboreal | Terrestrial, Arboreal, Both | 1 | 1 (1.00) | 0 (0.00)
15 | col_vals_in_set() | DiurnalNocturnal | Diurnal, Nocturnal, Both | 1 | 1 (1.00) | 0 (0.00)

Started 2026-05-07 05:28:55 UTC, duration < 1 s, finished 2026-05-07 05:28:55 UTC

Each output table shows:

  • W (Warning) — yellow dot; data is usable but should be reviewed
  • S (Stop) — red dot; data cannot be uploaded until the issue is resolved
  • N (Notify) — informational flag for unusual but not invalid values

Upload will only proceed when there are no red dots. Use the EXT download button on any failing row to export the problematic records and trace them back to your source data.
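
As an example of tracing a failure back to the source data, the final check in the records report requires each detection DateTime to fall between the transect StartTime and EndTime, and half of the example records fail it. The base R sketch below reproduces that comparison using the times from the two example records in Step 1. The UTC time zone and the InWindow column name are assumptions made for illustration.

```r
# Survey window and detection times taken from the two example records
transect <- data.frame(
  StartTime = as.POSIXct("2023-04-15 20:00:00", tz = "UTC"),
  EndTime   = as.POSIXct("2023-04-15 21:30:00", tz = "UTC")
)
records <- data.frame(
  AnimalID = c("01", "03"),
  DateTime = as.POSIXct(c("2023-04-15 20:45:00", "2023-04-15 22:20:00"), tz = "UTC")
)

# A detection passes when it falls inside the survey window (inclusive)
records$InWindow <- records$DateTime >= transect$StartTime &
  records$DateTime <= transect$EndTime
records

# Share of records passing the check
pass_rate <- mean(records$InWindow)
pass_rate
```

The second detection (22:20) was recorded after the transect end time (21:30), so either the end time or the observation time needs correcting in the source data.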

To confirm all checks have passed programmatically:

all(sapply(dq, function(x) all(x[["validation_set"]][["all_passed"]])))
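
When that expression returns FALSE, it is useful to know which steps failed. The sketch below assumes, as the line above does, that each element of dq carries a validation_set table with an all_passed column, and additionally assumes columns i (step number) and assertion_type (check name). It runs on mock objects so that it is self-contained; failing_steps() and mock_agent() are hypothetical helpers, not package functions.

```r
# Mock objects shaped like the parts of an interrogated agent used here.
# Real agents hold many more columns in `validation_set`.
mock_agent <- function(i, assertion_type, all_passed) {
  list(validation_set = data.frame(
    i = i, assertion_type = assertion_type, all_passed = all_passed,
    stringsAsFactors = FALSE
  ))
}
dq_mock <- list(
  mock_agent(1:3, c("col_exists", "col_vals_in_set", "col_vals_between"),
             c(TRUE, FALSE, FALSE)),
  mock_agent(1:2, c("col_exists", "col_is_character"), c(TRUE, TRUE))
)

# Return the failing steps of every table as one data frame;
# `table` is the position in the list (1 = records, 2 = transects, ...)
failing_steps <- function(agent_list) {
  out <- lapply(seq_along(agent_list), function(k) {
    vs <- agent_list[[k]][["validation_set"]]
    failed <- vs[vs$all_passed %in% FALSE, c("i", "assertion_type")]
    if (nrow(failed) == 0) return(NULL)
    cbind(table = k, failed)
  })
  do.call(rbind, out)
}

failed <- failing_steps(dq_mock)
failed
```

Using %in% FALSE rather than negating all_passed keeps steps with a missing result out of the failure list instead of producing NA rows.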

Step 8 — Prepare and upload

Once all quality checks pass, prepare the data for upload. This step generates MD5 hash IDs for each record to prevent duplicates on the database.

Note: the bundled example data is intentionally minimal (2 records, single observer) and will not pass all quality checks. The chunks below use eval = FALSE and are intended to be run with your own complete survey data.

data_for_upload <- prepare_transect_upload(agent_list = dq)

Then upload to the database. You will need an active database connection.

Note: before uploading, confirm that your ProjectShortName is unique using check_unique_project(ProjectShortName, con) if you have not already done so.

con <- weda::weda_connect(password = keyring::key_get(service = "ari-dev-weda-psql-01",
                                                      username = "psql_user"))

upload_transect_data(
  con              = con,
  data_list        = data_for_upload,
  uploadername     = "Firstname Surname",
  schema           = "transect"
)

The function appends data to the three raw tables (raw_transect_records, raw_transects, raw_project_information). The upload may take a few minutes, so leave your R session running until it completes.

To upload to the development schema first (recommended when testing a new project for the first time):

upload_transect_data(
  con          = con,
  data_list    = data_for_upload,
  uploadername = "Firstname Surname",
  schema       = "transect_dev"
)

Accessing uploaded data

Once uploaded, use the curated view functions to query the most recent records without duplicates:

# Curated records (SQL query, collected on demand)
transect_records_curated_view(con, return_data = TRUE)
transect_curated_view(con, return_data = TRUE)
transect_project_curated_view(con, return_data = TRUE)

# Processed presence/absence (all species)
processed_transect_presence_absence(con, return_data = TRUE)

# Presence/absence for a specific species
processed_transect_presence_absence(con,
                                    return_data = TRUE,
                                    species     = "Greater Glider")

See vignette('data-download') for more details on working with database data in R.

Appendix

A data dictionary is provided in this package (data(data_dictionary)) and is also available in the data_dictionary schema on the database. Below is the data dictionary for the transect schema:

data_dictionary %>%
  filter(schema == "transect") %>%
  select(table_name, table_description, column_name, column_class, column_description) %>%
  kbl() %>%
  kable_styling(c("condensed"), full_width = FALSE) %>%
  collapse_rows(1:3, valign = "top") %>%
  scroll_box(width = "100%", height = "1000px")
table_name table_description column_name column_class column_description
raw_transect_records Records of animals detected along transects SiteID character NA
Transect integer NA
Iteration integer NA
scientific_name character NA
common_name character NA
SurveyMethod character NA
DateTime POSIXct, POSIXt NA
AnimalID character NA
SeenHeard character NA
Adults integer NA
Joeys integer NA
Individuals integer NA
LoR character NA
WaypointNo character NA
ObserverLatitude numeric NA
ObserverLongitude numeric NA
AnimalDistance numeric NA
AnimalHeight numeric NA
AnimalHorizontalDistance numeric NA
AnimalAngle numeric NA
AnimalBearing numeric NA
DistanceFromTransectStart numeric NA
AnimalPerpDistance numeric NA
TreeSpecies character NA
BothSeen logical NA
ObservationNotes character NA
ObserverPosition integer NA
AnimalLongitude numeric NA
AnimalLatitude numeric NA
ColourForm character NA
PhotoID character NA
AnimalLongitude2 numeric NA
AnimalLatitude2 numeric NA
LoR2 character NA
SeenOnSameSide logical NA
DistanceBetweenAnimalProj numeric NA
raw_transect_details Records of when and where transects were undertaken SiteID character NA
Transect integer NA
Iteration integer NA
ObserverPosition integer NA
ObserverName character NA
ObserverID character NA
Date Date NA
StartTime POSIXct, POSIXt NA
EndTime POSIXct, POSIXt NA
Duration difftime NA
Weather character NA
Temperature numeric NA
TransectNotes character NA
MoonPhase character NA
Cloud logical NA
RelativeHumidity logical NA
Wind character NA
Precipitation character NA
FlowerIndex character NA
Access character NA
Visibility character NA
TransectLength numeric NA
MaxTruncationDistance numeric NA
TransectType character NA
geometry sfc_MULTILINESTRING, sfc NA
raw_project_information Details of the project under which transects were searched ProjectName character NA
ProjectShortName character NA
DistanceSampling logical NA
TerrestrialArboreal character NA
AllSpeciesTagged logical NA
DistanceForAllSpecies logical NA
DiurnalNocturnal character NA
ProjectDescription character NA
ProjectLeader character NA