Data guidelines

Data sharing in PAGES-acknowledged publications

The PAGES Data Stewardship Integrative Activity seeks to advance best practices for sharing data generated as part of all PAGES-related activities, including data used for synthesis and analysis.

PAGES is a signatory to the International Accord on Open Data. The role of data sharing in the integrity and advancement of science is strongly promoted through the data policies of journals and funders at all levels internationally.

Every publication stemming from PAGES activities should include a statement specifying how the underlying data used in or produced by the study can be accessed. For most journal outlets, this information is provided as a separate section titled "Data Availability".

Prior to publication, all essential input and output data must be archived in a community-recognized, publicly accessible, long-term data repository to allow for a proper "data citation" (see below).

Each dataset used or generated by the study should be accompanied by a data citation or a persistent identifier (DOI or URL). A data citation tracks the provenance of the original data, giving credit to the original data generator; this is in addition to any bibliographical references to publications where the data are described. Essential underlying data and metadata that have not previously been lodged at a public repository must be submitted and receive a persistent identifier prior to publication of the article.

"Data available upon request" is no longer acceptable as part of the "Data Availability" statement. Authors are also strongly encouraged to deposit significant code and other underlying digital assets into suitable repositories and to cite these using a data citation or link to doi/URL.

The summary below is to assist authors in completing the "Data Availability" or equivalent section of PAGES-acknowledged publications:

All input or underlying data

- Data currently archived in a public repository: Include a data citation or URL/DOI in addition to the bibliographic reference where the data are interpreted. Include the version identifier for datasets with versioning schemes (e.g. PAGES 2k temperature v2.0.0).

- Data not currently archived in a public repository: Submit the data and metadata to a long-term archive and include a data citation or URL/DOI. This typically requires working with the original data generator to transfer a quality-controlled version of the data and metadata in their name.

- Data obtained from a public repository and modified slightly as input data: Include a data citation and note the simple process used to modify the data in the current study (e.g. truncation, conversion to anomalies).

- Data obtained from a public repository and modified significantly as input data: Submit the processed data to a repository and include data citations for both the original and the modified versions (cross referenced).
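The four input-data cases above can be read as a small decision table. The Python sketch below is only an illustration of that reading: the `InputData` enum and the `required_steps` function are hypothetical names introduced here, and the step wording paraphrases the list rather than quoting an official checklist.

```python
from enum import Enum
from typing import List


class InputData(Enum):
    """The four input-data situations described in the list above."""
    ARCHIVED = "already archived in a public repository"
    NOT_ARCHIVED = "not yet archived in a public repository"
    MODIFIED_SLIGHTLY = "from a public repository, modified slightly"
    MODIFIED_SIGNIFICANTLY = "from a public repository, modified significantly"


def required_steps(case: InputData, versioned: bool = False) -> List[str]:
    """Return the steps suggested by the guidelines for one input dataset.

    `versioned` only matters for archived data that follow a versioning scheme.
    """
    if case is InputData.ARCHIVED:
        steps = [
            "include a data citation or URL/DOI",
            "include the bibliographic reference where the data are interpreted",
        ]
        if versioned:
            steps.append("include the dataset version identifier")
        return steps
    if case is InputData.NOT_ARCHIVED:
        return [
            "submit the data and metadata to a long-term archive",
            "include a data citation or URL/DOI",
        ]
    if case is InputData.MODIFIED_SLIGHTLY:
        return [
            "include a data citation for the original data",
            "note the simple process used to modify the data",
        ]
    # Remaining case: data modified significantly before use as input.
    return [
        "submit the processed data to a repository",
        "include cross-referenced data citations for the original and modified versions",
    ]


# Example: a versioned, already archived dataset.
for step in required_steps(InputData.ARCHIVED, versioned=True):
    print("-", step)
```

Keeping the cases in an enum means that each dataset in a study is assigned to exactly one case before any steps are listed.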

All output or resulting data

- Essential results of numerical/statistical analyses or model output: Submit to a repository and provide the data citation or URL/DOI. The submitted data should allow others to replicate the published data analyses and to readily compare the outcome with future studies. Ideally, the data used to plot every substantive figure should be archived, especially if they will be useful in digital form for future studies.
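One possible way to prepare figure data for archiving is to write the plotted series to a plain CSV file with a short metadata header. The sketch below assumes this layout; it is not a format required by PAGES or by any particular repository. The function name `write_figure_data`, the column names, the metadata keys, and the numeric values are all made-up placeholders for illustration.

```python
import csv
import io


def write_figure_data(handle, columns, rows, metadata):
    """Write '# key: value' metadata lines, then a CSV table, to a text handle."""
    for key, value in metadata.items():
        handle.write(f"# {key}: {value}\n")
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(columns)   # header row with column names and units
    writer.writerows(rows)     # one row per plotted point


# Placeholder values only; a real file would hold the series actually plotted.
buffer = io.StringIO()
write_figure_data(
    buffer,
    columns=["year_CE", "temperature_anomaly_degC"],
    rows=[(1000, -0.12), (1001, -0.08)],
    metadata={"figure": "hypothetical Fig. 2a", "reference_period": "placeholder"},
)
print(buffer.getvalue())
```

Writing to a file instead only requires passing a handle opened with `open(path, "w", newline="")` in place of the in-memory buffer.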

Community engagement is essential to developing efficient, practical approaches to data management. PAGES recognizes the additional effort required to meet this level of stewardship and appreciates the foresight and dedication of the community. If you have any questions about the PAGES Data Stewardship policy, please contact PAGES' Science Officer Lucien von Gunten.

What is a "data citation"?

Data citations track the provenance of a dataset, giving credit to the data generator; this is in addition to any references to publications where the data are described. Data citations are used in the text (or tables) alongside and in the same way as publication citations. In the reference list, they include: Creators, Title, Repository, Identifier, Submission Year. More information about data citations is available at https://www.datacite.org/mission.html

Here is an example of text and corresponding citations (using Climate of the Past punctuation style):

Text
The PAGES 2k Consortium (2017a) assembled a large global dataset of temperature-sensitive proxy records (PAGES 2k Consortium, 2017b). Among the records is the paleo-temperature reconstruction from Laguna Chepical (de Jong et al., 2016), which was described by de Jong et al. (2013).

References
de Jong, R., von Gunten, L., Maldonado, A., and Grosjean, M.: Late Holocene summer temperatures in the central Andes reconstructed from the sediments of high-elevation Laguna Chepical, Chile (32° S), Climate of the Past, 9, 1921-1932, 2013.

de Jong, R., von Gunten, L., Maldonado, A., and Grosjean, M.: Laguna Chepical summer temperature reconstruction, World Data Center for Paleoclimatology, https://www.ncdc.noaa.gov/paleo/study/20366, 2016.

PAGES 2k Consortium: A global multiproxy database for temperature reconstructions of the Common Era, Scientific Data, 4, 170088, 2017a.

PAGES 2k Consortium: A global multiproxy database for temperature reconstructions of the Common Era, version 2.0.0, figshare, https://figshare.com/s/d327a0367bb908a4c4f2, 2017b.
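
As a rough illustration of how the five reference-list fields (Creators, Title, Repository, Identifier, Submission Year) combine into an entry, the Python sketch below formats a data citation in the punctuation style of the example above. The `DataCitation` class and the `join_creators` and `format_citation` functions are hypothetical names introduced here; they are not part of any PAGES, DataCite, or journal tool, and other journals will need different punctuation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """The five fields a data citation carries in the reference list."""
    creators: List[str]   # each entry as "Family name, Initials."
    title: str
    repository: str
    identifier: str       # DOI or URL
    year: int             # submission year


def join_creators(creators: List[str]) -> str:
    """Join creator names as 'A', 'A and B', or 'A, B, and C'."""
    if len(creators) <= 2:
        return " and ".join(creators)
    return ", ".join(creators[:-1]) + ", and " + creators[-1]


def format_citation(c: DataCitation) -> str:
    """Format one reference-list entry: Creators: Title, Repository, Identifier, Year."""
    return f"{join_creators(c.creators)}: {c.title}, {c.repository}, {c.identifier}, {c.year}."


# The Laguna Chepical data citation from the example reference list.
chepical = DataCitation(
    creators=["de Jong, R.", "von Gunten, L.", "Maldonado, A.", "Grosjean, M."],
    title="Laguna Chepical summer temperature reconstruction",
    repository="World Data Center for Paleoclimatology",
    identifier="https://www.ncdc.noaa.gov/paleo/study/20366",
    year=2016,
)
print(format_citation(chepical))
```

Storing the fields separately, rather than as one pre-formatted string, makes it easier to reformat the same data citation for a journal with a different reference style.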