Metadata Harvesting Policy

1.0 Repository Metadata Harvesting

The purpose of the Service is to enhance the discoverability of Canadian research data. To that end, the Service will harvest publicly available metadata distributed across multiple platforms and repositories containing data. Harvesting is generally limited to Canadian data repositories, that is, repositories owned and operated by a Canadian institution or organization. As an exception, other data repositories may be harvested if they facilitate querying for data created by researchers at Canadian institutions.

The Service will not transfer or ingest any research data from harvested repositories, with the exception of openly accessible geospatial data files. Geospatial data files are used to generate a visual preview of the data and are made available for download from Lunaris. For all other data that are openly available, Users will be able to access them from the source repository. If no license is explicitly given, Users are encouraged to contact the repository to confirm the license or terms of use. The Service will endeavour to work with harvested repositories to ensure that dataset licensing terms are clearly indicated.

Representatives of repositories interested in having their metadata harvested by the Service should contact the Service.

2.0 Criteria for Harvesting

The Service will endeavour to index new repositories for search and discovery as they are identified. Repositories will be prioritized for indexing based on the following criteria:

  • Support for one of the metadata API formats implemented in the Lunaris harvester. Currently, these are ArcGIS, CKAN, DataCite, DataStream, Dataverse, Dryad, GeoNetwork, MarkLogic, Nexus, OAI-PMH, OpenDataSoft, Socrata, and certain repositories with Google Sitemaps. Support for additional formats may be added in the future. Representatives of repositories interested in having their metadata harvested are encouraged to contact the Service regardless of the formats listed here.
  • If a repository holds more than just research data -- for example, some university institutional repositories also hold theses and pre-print articles -- it must have a means of querying for only research data.
  • A plausible workflow for plaintext/keyword search and retrieval of datasets from the repository.
  • The existence of a reliable point of contact at the host institution for resolving technical and/or metadata issues.
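
As an illustration of the first two criteria, the sketch below builds an OAI-PMH `ListRecords` request that is restricted to a single set, which is one way a repository holding mixed content could expose only its research data. The function name, the base URL, and the set name `research_data` are hypothetical and chosen for this example only; this is not the Lunaris harvester's actual code.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Return an OAI-PMH ListRecords request URL.

    base_url and set_spec are illustrative assumptions; a real repository
    defines its own endpoint and set names.
    """
    # 'verb' and 'metadataPrefix' are required ListRecords arguments in OAI-PMH.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # The optional 'set' argument asks the repository for selective harvesting,
    # for example only the records it classifies as research data.
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and set name.
    print(build_list_records_url("https://repository.example.org/oai",
                                 set_spec="research_data"))
```

Whether a given repository offers a suitable set (or an equivalent filter in one of the other supported formats) would need to be confirmed with its point of contact.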

3.0 Use of Harvested Metadata

Metadata from the Service will be available through the Site and exposed via an API (such as an OAI-PMH endpoint). When assessing a repository’s eligibility for harvesting, the Service confirms licensing terms with repository representatives to ensure metadata is publicly available for reuse, but the ultimate responsibility for confirming the licensing terms of the metadata in the source repository rests with the User. Effort will be made to ensure harvested metadata meets internationally accepted best practices and standards.
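
For Users who reuse metadata through an OAI-PMH endpoint, the following minimal sketch shows one way to list each record's identifier, title, and any stated rights, so that records without a rights statement can be followed up with the source repository. It assumes the response uses the standard `oai_dc` (Dublin Core) format; the sample XML, identifiers, and function name are invented for illustration and do not describe the Service's actual output.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 and Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response; a real one would come from a ListRecords request.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Lake temperature survey</dc:title>
          <dc:rights>CC BY 4.0</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Soil samples</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def summarize_records(xml_text):
    """Return identifier, title, and rights statements for each record."""
    root = ET.fromstring(xml_text)
    summaries = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier",
                                     default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        # A record may carry zero, one, or several dc:rights elements.
        rights = [el.text for el in record.iterfind(".//dc:rights", NS) if el.text]
        summaries.append({"identifier": identifier, "title": title, "rights": rights})
    return summaries


if __name__ == "__main__":
    for rec in summarize_records(SAMPLE):
        status = "; ".join(rec["rights"]) or "no rights statement, confirm with repository"
        print(f'{rec["identifier"]}: {rec["title"]} ({status})')
```

An empty rights list here only means the harvested record carries no rights statement; it says nothing about the actual terms, which still need to be confirmed with the source repository.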