Users of the cluster pod must follow the NSF's guidelines for paper publication and data management, curation, and archiving ( https://www.nsf.gov/bfa/dias/policy/dmp.jsp ). To that end, we request that you submit post-acceptance preprints of your papers to the CSC in PDF/A format and that you use UCSB's Dash instance for data management, curation, and archiving ( https://www.library.ucsb.edu/data-curation/repository ).
Given the wide range of data generated by CSC's users, we leave it to the researchers to determine which data need to be archived.
A few notes on preparing your data for curation and archiving from https://www.library.ucsb.edu/repository-preparing-data :
Dash is free for affiliates of UC Santa Barbara with a UCSB NetID. A personal ORCID identifier is required to deposit data in Dash. Visit the ORCID site to obtain one.
When you submit your data through Dash, several fields are required:
- Institutional Affiliation
With an ORCID login, your submission can be linked to your ORCID account.
There are several optional fields for describing your submission in more detail:
- Granting Organization
- Award Number
- Usage Notes
- Related Work(s)
You can document how the dataset was collected and how it was processed in the Methods section. Any additional information that someone would need to know to use the dataset can be added in the Usage Notes section. You can reference any related journal or data publications in the Related Work(s) field.
Additionally, you can upload readme files or other ancillary documents as needed. Consider what others will need to know to understand and reuse your data.
All files are uploaded at the same hierarchical level. When downloaded they will be bundled as a ZIP file. Consider using file names (which sort alphanumerically in many programs) as a way to organize the files.
Good practices in choosing file names:
- Uniquely name each file.
- Be consistent and include similar information in all file names of the same file type.
- Consider sorting order (usually lexicographic) and logical hierarchies in file directories.
- Avoid ambiguous and confusing names, such as 'MyData' or 'sample'.
- Derivatives and versions should have similar (but differentiated) names to keep them co-located but still uniquely identified.
- Names should reflect the contents of the file and/or the stage of development.
- When using dates, if you want the files to sort chronologically, put the year first and use numerical two-digit months and days (YYYY-MM-DD). (Example: March 7, 2004 would be written '2004-03-07'.)
- Use only alphanumeric characters, with dashes (-) or underscores (_) in place of spaces; avoid special characters such as colons (:) and slashes (/).
- Avoid using case differences to distinguish between files: ‘Record’, ‘record’, and ‘RECORD’ may be three different file names or the same file name, depending on the operating system.
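A few of the practices above can be checked automatically before upload. The following is a minimal Python sketch, not part of Dash or of the library's guidance: the function names check_file_names and dated_name are hypothetical, and allowing dots for file extensions is an assumption, since the list above does not mention them. It flags names containing spaces or special characters, flags names that differ only by case, and builds names with a leading YYYY-MM-DD date.

```python
import re
from collections import defaultdict
from datetime import date

# Letters, digits, dashes, and underscores, optionally followed by
# dot-separated extensions (allowing dots is an assumption of this sketch).
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)*$")


def check_file_names(names):
    """Return a list of human-readable problems found in `names`."""
    problems = []

    # Rule: alphanumerics, dashes, and underscores only; no spaces or
    # special characters such as colons and slashes.
    for name in names:
        if not SAFE_NAME.match(name):
            problems.append(f"{name!r}: contains spaces or special characters")

    # Rule: do not rely on case to tell files apart. Exact duplicates
    # land in the same group and are reported as well.
    by_folded = defaultdict(list)
    for name in names:
        by_folded[name.casefold()].append(name)
    for group in by_folded.values():
        if len(group) > 1:
            problems.append(f"{sorted(group)!r}: differ only by case")

    return problems


def dated_name(day, label, extension):
    """Build a name that starts with the date as YYYY-MM-DD."""
    return f"{day.isoformat()}_{label}.{extension}"


if __name__ == "__main__":
    print(dated_name(date(2004, 3, 7), "run-01", "csv"))
    for problem in check_file_names(["Record.txt", "record.txt", "my data.csv"]):
        print(problem)
```

Because the date comes first and is zero-padded, sorting such names as plain text also sorts them chronologically, which helps given that all files in a submission sit at the same level.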
Although Dash accepts any file format, consider using open-source formats, which are more broadly accessible to others.
The library’s contact for beginning the data curation and archiving process is firstname.lastname@example.org .