
Managing your research data

Preparation

It is important to complete the following steps before collecting data:

1. Search for any existing data (secondary data) related to your subject or project:

  • Some secondary data may be reusable for your project
  • Find out how others collect and organise their data 

   Search the web or data repositories.

2. Decide what file naming conventions, file structure and documentation methods you will use for your data

  • Keeping file names and file structure consistent helps you access and reuse the data in the future

Documentation and Metadata

Documentation

It is important to document the context and methodology of your data collection. Documentation should cover how the data were collected, when and where they were created, how the files are organised, access arrangements, quality control, etc. Text files are often used to record contextual information. Documenting data helps ensure that research data are discoverable and usable. Metadata is often included in documentation to help manage data.

 

Metadata

Metadata is a set of data that describes other data. It provides information about an item or the content of a collection. Metadata is used for resource discovery, providing searchable information, a bibliographic record for citation, or for online data browsing.

Metadata for a dataset or data collection usually includes the purpose, authors, timeframe, location, and subjects/keywords.
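As a rough illustration of the fields listed above, the sketch below stores a metadata record as plain-text JSON. The function name, the field names and the sample values are all hypothetical and do not follow any particular metadata standard.

```python
import json


def build_metadata(purpose, authors, timeframe, location, keywords):
    """Return a dictionary holding the descriptive fields for one dataset."""
    return {
        "purpose": purpose,
        "authors": list(authors),
        "timeframe": timeframe,
        "location": location,
        # Remove duplicate keywords and sort them for consistent output.
        "keywords": sorted(set(keywords)),
    }


# Hypothetical example values, for illustration only.
record = build_metadata(
    purpose="Hypothetical survey of study habits",
    authors=["A. Researcher"],
    timeframe="2022-02-01/2022-06-30",
    location="Auckland, New Zealand",
    keywords=["survey", "education", "survey"],
)

# Plain-text JSON keeps the record readable without special software.
metadata_text = json.dumps(record, indent=2, ensure_ascii=False)
print(metadata_text)
```

Saving this text next to the data files is one simple way to keep the description and the data together.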

Use a vocabulary to describe your data:


File Formats

Accessibility and long-term preservation are the two key considerations when choosing data file formats. Open, unencrypted and uncompressed file formats are easier to preserve.

Examples of preferred formats:

  • Text: plain text (.txt), HTML, XML, PDF/A, not Word
  • Image: JPEG, JPEG 2000, PNG, TIFF, not GIF or JPG
  • Audio: AIFF, WAVE
  • Tabular data: CSV, not Excel
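For tabular data, a minimal sketch of producing CSV text with Python's standard csv module might look like this. The function name and the sample rows are hypothetical, and the line terminator is set to a plain newline here purely as an illustrative choice.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Return the header and rows as CSV-formatted text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data, for illustration only.
csv_text = rows_to_csv(["id", "score"], [[1, 72], [2, 85]])
print(csv_text)
```

The resulting text can be written to a .csv file and opened by spreadsheet software, statistical packages or a plain text editor.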


File naming, version control and file structure

File name and version control

A file name should be short and descriptive. Version control is a system for keeping, tracking and recording changes to files. It ensures that the most recent file can be easily identified and that there is an audit trail of changes.

  • Use clear file names that include the creation date, e.g. 20220203_VersionControl_v1
  • Keep original or raw data files as a master copy
  • Keep data for analysis or processing in a separate working file
  • Record a history of major changes to the data
  • Keep documentation of analysis, e.g. software scripts, code
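The naming pattern in the example above (date created, short description, version number) can be generated consistently with a small helper. This is only a sketch: the function name and the rule of stripping spaces and punctuation from the description are assumptions, not part of any stated convention.

```python
import re
from datetime import date


def versioned_name(description, version, created, extension=""):
    """Build a name in the form YYYYMMDD_Description_vN."""
    # Assumption: keep only letters and digits in the description.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "", description)
    if not cleaned:
        raise ValueError("description must contain letters or digits")
    return f"{created:%Y%m%d}_{cleaned}_v{version}{extension}"


print(versioned_name("VersionControl", 1, date(2022, 2, 3)))
# prints 20220203_VersionControl_v1
```

Putting the date first in year-month-day order means that sorting files by name also sorts them chronologically.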

File organisation

Organising your files in a hierarchical file structure and chronological subfolders may help you better manage your files. 
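One possible way to set up such a structure is sketched below. The folder names (raw, working, documentation) and the monthly subfolders are hypothetical choices; the example runs in a temporary directory so it does not touch real project files.

```python
import tempfile
from pathlib import Path


def create_project_folders(root, months):
    """Create top-level folders plus chronological subfolders for raw data."""
    root = Path(root)
    created = []
    # Assumed top-level layout: raw master copies, working files, documentation.
    for top in ("raw", "working", "documentation"):
        (root / top).mkdir(parents=True, exist_ok=True)
        created.append(top)
    # One subfolder per collection month, named so they sort chronologically.
    for month in months:
        (root / "raw" / month).mkdir(parents=True, exist_ok=True)
        created.append(f"raw/{month}")
    return created


with tempfile.TemporaryDirectory() as tmp:
    layout = create_project_folders(tmp, ["2022-02", "2022-03"])
    print(layout)
```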

Software Carpentry's video provides some good advice on file structure, version control, metadata and documentation.

Quality control

Quality control activities ensure that the data is reliable, valid and reproducible.

Quality checking during data collection:

  • Check a portion of recorded data or transcription against the source
  • Check a portion of data entry for errors
  • Use data validation rules for electronic data entry forms
  • Regularly test or calibrate instruments
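Validation rules for electronic data entry could be expressed along the following lines. The field names, the identifier pattern and the age limits are hypothetical; real rules would depend on the study design.

```python
def validate_entry(entry, rules):
    """Return a list of problems found in one data-entry record."""
    problems = []
    for field, check in rules.items():
        if field not in entry:
            problems.append(f"{field}: missing")
        elif not check(entry[field]):
            problems.append(f"{field}: invalid value {entry[field]!r}")
    return problems


# Hypothetical rules: an ID such as "P001" and an age between 18 and 99.
rules = {
    "participant_id": lambda v: isinstance(v, str) and v.startswith("P") and v[1:].isdigit(),
    "age": lambda v: isinstance(v, int) and 18 <= v <= 99,
}

print(validate_entry({"participant_id": "P001", "age": 17}, rules))
# prints ['age: invalid value 17']
```

Rejecting or flagging an entry at the point of input is usually cheaper than finding the error after collection has finished.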

Quality checking after data collection:

  • Check expected ranges of quantitative data, e.g. scores, measurements
  • Check data for logic, i.e. do the values make sense
  • Check for missing data and collect it if possible
  • Code any missing values (do not use zero)
  • Anonymise raw data
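Two of the checks above, coding missing values and checking expected ranges, are sketched below. The missing-value code "NA" and the 0 to 100 score range are assumptions chosen for illustration. Note that a recorded zero is treated as a genuine value, which is why zero should not double as the missing code.

```python
MISSING_CODE = "NA"  # hypothetical code for missing values


def code_missing(values):
    """Replace empty entries with an explicit missing-value code."""
    return [MISSING_CODE if v in ("", None) else v for v in values]


def out_of_range(values, low, high):
    """Return the positions of non-missing values outside [low, high]."""
    return [
        i for i, v in enumerate(values)
        if v != MISSING_CODE and not (low <= v <= high)
    ]


# Hypothetical scores expected to lie between 0 and 100.
raw_scores = [72, None, 105, 0, ""]
coded = code_missing(raw_scores)
flagged = out_of_range(coded, 0, 100)

print(coded)    # prints [72, 'NA', 105, 0, 'NA']
print(flagged)  # prints [2]
```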

Te Mana Raraunga/Māori Data Sovereignty Network

Māori data are data that are produced by Māori, and data that are about Māori and the environments we have relationships with. Data are a living tāonga and are of strategic value to Māori. Māori data include but are not limited to:

  • Data from government agencies, organisations and/or businesses

  • Data about Māori that are used to describe or compare Māori collectives

  • Data about Te Ao Māori that emerges from research