[Volute] r3984 - trunk/projects/dm/provenance/description

Volute commit messages volutecommits at g-vo.org
Fri Apr 28 14:34:41 CEST 2017

Author: kriebe
Date: Fri Apr 28 14:34:41 2017
New Revision: 3984

Log:
Minor changes in introduction, put section on links to other IVOA data models in a separate tex-file; removed implementation-details section

Modified:
trunk/projects/dm/provenance/description/ProvenanceDM.pdf
trunk/projects/dm/provenance/description/ProvenanceDM.tex
trunk/projects/dm/provenance/description/intro-general.tex
trunk/projects/dm/provenance/description/prov-refs.bib

Modified: trunk/projects/dm/provenance/description/ProvenanceDM.pdf
==============================================================================
Binary file (source and/or target). No diff available.

Modified: trunk/projects/dm/provenance/description/ProvenanceDM.tex
==============================================================================
--- trunk/projects/dm/provenance/description/ProvenanceDM.tex	Fri Apr 28 13:46:41 2017	(r3983)
+++ trunk/projects/dm/provenance/description/ProvenanceDM.tex	Fri Apr 28 14:34:41 2017	(r3984)
@@ -53,7 +53,7 @@

\newcommand{\note}[1]{%
\noindent%
-    \textcolor{darkgrey}{{\sffamily Note:} \emph{#1}}%
+    \textcolor{darkgrey}{{\sffamily Note:} \emph{#1}}%
}

@@ -82,10 +82,10 @@

% define new command for classes, in case we decide later on for a different style
-\newcommand{\class}[1]{\emph{#1}}
+\newcommand{\class}[1]{\emph{#1}}

\begin{document}
-\newcolumntype{Y}{>{\raggedright\arraybackslash}X}
+\newcolumntype{Y}{>{\raggedright\arraybackslash}X}

\begin{abstract}
This document describes how provenance information for astronomical datasets
@@ -134,129 +134,27 @@
\input{intro-VOarchitecture}
\input{intro-previousefforts}

+
\section{The provenance data model}
\input{datamodel-description}

%\section{Applying provenance -- Interactions with other Data models}\label{sec:dmlinks}
-%In this section we discuss how the Provenance Data Model interacts with
-%classes and attributes from other VO data models (especially DatasetDM).
-%(e.g. DatasetDM, SpectralDM (share some same classes), SimDM)
-%and how provenance information can be stored.
-
-The Provenance Data Model can be applied without making links to any other
-IVOA data model classes. For example when the data is not yet published, provenance information
-can be stored already, but a DatasetDM-description for the data may not yet exist.
-But if there are data models implemented for the datasets, then it is
-very useful to connect the classes and attributes of the different models,
-which we are going to discuss in this Section. These links help to avoid
-unnecessary repetitions in the metadata of datasets, and also offer the possibility
-to derive some basic provenance information from existing data model classes automatically.
-
-
-Entities and their descriptions in the Provenance Data Model
-are tightly linked to the \class{DataSet}-class in the DatasetDM/ObsCore Data Model, as well as to
-InputDataset and OutputDataSet in the Simulation Data Model \citep[SimDM,][]{std:SimDM}.
-Table \ref{tab:datasetmapping} maps classes and attributes from the Dataset Data Model
-to concepts in the Provenance Data Model.
-
-
-%\begin{figure}[h]
-%\centering
-%\includegraphics[width=\textwidth]{../datamodel-diagrams/classes-relations-dms}
-%\caption{Links between Agent and Party, Entity and Dataset.}
-%\label{fig:class-relations-dm}
-%\end{figure}
-% --> a similar figure is already given in the sections on entity and agent.
-
-\begin{table}[h]
-\small
-\tymax  0.5\textwidth
-\begin{tabulary}{1.0\textwidth}{@{}lLp{4cm}@{}}
-\toprule
-\midrule
-DataID.title      & Entity.label               & title of the dataset\\
-DataID.collection    & HadMember.collectionId  & link to the collection to which the dataset belongs\\
-DataID.creator       & Agent.name          & name of agent\\
-DataID.creatorDID    & AlternateOf.entityId     & id for the dataset given by the creator\\
-DataID.ObservationID & WasGeneratedBy.activityId  & identifier to everything describing the observation; maybe it belongs to entity?\\
-DataID.date          & WasGeneratedBy.time & date and time when the dataset was completely created\\
-Curation.PublisherDID  & Entity.id      & unique identifier for the dataset assigned by the publisher\\
-Curation.PublisherID & Agent.id  & link to the publisher (role: publisher, type: organization/astronomer private collection)\\
-Curation.Publisher     & Agent.name & name of the publisher\\
-Curation.Date          & Entity.releaseDate & release date of the dataset\\
-Curation.Version       & Entity.version     & version of the dataset\\
-Curation.Rights        & Entity.access      & access rights to the dataset; one of [...]\\
-Curation.Contact       & Agent.Id or name? & link to Agent with role contact\\
-DataProductType  & EntityDescription.dataproduct\_type & type of a dataproduct/entity\\
-DataProductSubType & EntityDescription.dataproduct\_subtype & subtype of a dataproduct/entity\\
-ObsDataset.calibLevel  & EntityDescription.level & (output) calibration level, integer between 0 and 3\\\hline
-\bottomrule
-\end{tabulary}
-\caption{Mapping between attributes from \class{Dataset}-classes from Dataset Metadata Model to classes in ProvenanceDM.}
-\label{tab:datasetmapping}
-\end{table}
-
-
-The \class{Agent} class, which is used for defining responsible persons and
-organizations, is similar to the \class{Party} class in the Dataset Metadata Model and SimDM.
-
-In SimDM one also encounters a normalization similar to our split-up of descriptions from
-actual data instances and executions of processes: the SimDM class ``experiment''
-is a type of \class{Activity} and its general, reusable description is called a ``protocol'',
-which can be considered as a type of this model's \class{ActivityDescription}.
-More direct mappings between classes and attributes of both models are given in Table~\ref{tab:simdmmapping}.
-
-\begin{table}[h]
-\small
-\tymax  0.5\textwidth
-\begin{tabulary}{1.0\textwidth}{@{}lLp{4cm}@{}}
-\toprule
-\midrule
-Experiment      & Activity               & \\
-Experiment.name & Activity.label         & human readable name; name attribute in SimDM is inherited from Resource-class\\
-Experiment.executionTime  & Activity.endTime & end time of the execution of an experiment/activity \\
-Experiment.protocol & Activity.description\_ref & reference to the protocol or description class \\
-Protocol        & ActivityDescription    & \\
-Protocol.name   & ActivityDescription.label  & human readable name\\
-Protocol.referenceURL & ActivityDescription.doculink & reference to a webpage describing it\\
-ParameterSetting     & Parameter              & value of an (input) parameter\\
-InputParameter       & ParameterDescription              & description of an (input) parameter\\
-Party           & Agent                 & responsible person or organization\\
-Party.name      & Agent.label & name of the agent \\
-Contact         & WasAssociatedWith & \\
-Contact.role    & WasAssociatedWith.role & role which the agent/party had for a certain experiment (activity); SimDM roles contain: \texttt{owner}, \texttt{creator}, \texttt{publisher}, \texttt{contributor}\\
-Contact.party    & WasAssociatedWith.agent & reference to the agent/party \\
-DataObject     & Entity        & a dataset, which can be/refer to a collection\\
-
-\bottomrule
-\end{tabulary}
-\caption{Mapping between classes and attributes from SimDM to classes/attributes in ProvenanceDM.}
-\label{tab:simdmmapping}
-\end{table}
-
-
-
-
-More similarities and links to other data models will be detailed in future
-versions of this working draft.

\section{Accessing provenance information}
\input{provaccess}

+
\section{Discussion}
\input{discussion}

-\section{Implementations of the data model for specific use cases}\label{sec:usecases-implementations}
+
+\section{Implementations of the data model for specific use cases}
+\label{sec:usecases-implementations}
\input{usecases-implementations}

@@ -267,6 +165,7 @@
% Use itemize environments.
\subsection{Changes from WD-ProvenanceDM-1.0-20161121}
\begin{itemize}
+\item Moved detailed implementation section from appendix to a separate document (implementation note)
\item Removed description\_ref as attribute, since it's expressed by the corresponding link in the model anyway.
@@ -286,12 +185,12 @@
\end{itemize}

-\section{Implementation details}\label{sec:implementation-details}
-In this section we will give more details on the classes and attributes which were used
-in implementations for each use case. This maybe needs to go into a different document, so it can
-be updated without affecting this standard.
+% \section{Implementation details}\label{sec:implementation-details}
+% In this section we will give more details on the classes and attributes which were used
+% in implementations for each use case. This maybe needs to go into a different document, so it can
+% be updated without affecting this standard.

-TBD.
+% TBD.

\bibliography{ivoatex/ivoabib,prov-refs}

==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ trunk/projects/dm/provenance/description/datamodel-links.tex	Fri Apr 28 14:34:41 2017	(r3984)
@@ -0,0 +1,108 @@
+%In this section we discuss how the Provenance Data Model interacts with
+%classes and attributes from other VO data models (especially DatasetDM).
+%(e.g. DatasetDM, SpectralDM (share some same classes), SimDM)
+%and how provenance information can be stored.
+
+The Provenance Data Model can be applied without making links to any other
+IVOA data model classes. For example when the data is not yet published, provenance information
+can be stored already, but a DatasetDM-description for the data may not yet exist.
+But if there are data models implemented for the datasets, then it is
+very useful to connect the classes and attributes of the different models,
+which we are going to discuss in this Section. These links help to avoid
+unnecessary repetitions in the metadata of datasets, and also offer the possibility
+to derive some basic provenance information from existing data model classes automatically.
+
+
+Entities and their descriptions in the Provenance Data Model
+are tightly linked to the \class{DataSet}-class in the DatasetDM/ObsCore Data Model, as well as to
+InputDataset and OutputDataSet in the Simulation Data Model \citep[SimDM,][]{std:SimDM}.
+Table \ref{tab:datasetmapping} maps classes and attributes from the Dataset Data Model
+to concepts in the Provenance Data Model.
+
+
+%\begin{figure}[h]
+%\centering
+%\includegraphics[width=\textwidth]{../datamodel-diagrams/classes-relations-dms}
+%\caption{Links between Agent and Party, Entity and Dataset.}
+%\label{fig:class-relations-dm}
+%\end{figure}
+% --> a similar figure is already given in the sections on entity and agent.
+
+\begin{table}[h]
+\small
+\tymax  0.5\textwidth
+\begin{tabulary}{1.0\textwidth}{@{}lLp{4cm}@{}}
+\toprule
+\midrule
+DataID.title      & Entity.label               & title of the dataset\\
+DataID.collection    & HadMember.collectionId  & link to the collection to which the dataset belongs\\
+DataID.creator       & Agent.name          & name of agent\\
+DataID.creatorDID    & AlternateOf.entityId     & id for the dataset given by the creator\\
+DataID.ObservationID & WasGeneratedBy.activityId  & identifier to everything describing the observation; maybe it belongs to entity?\\
+DataID.date          & WasGeneratedBy.time & date and time when the dataset was completely created\\
+Curation.PublisherDID  & Entity.id      & unique identifier for the dataset assigned by the publisher\\
+Curation.PublisherID & Agent.id  & link to the publisher (role: publisher, type: organization/astronomer private collection)\\
+Curation.Publisher     & Agent.name & name of the publisher\\
+Curation.Date          & Entity.releaseDate & release date of the dataset\\
+Curation.Version       & Entity.version     & version of the dataset\\
+Curation.Rights        & Entity.access      & access rights to the dataset; one of [...]\\
+Curation.Contact       & Agent.Id or name? & link to Agent with role contact\\
+DataProductType  & EntityDescription.dataproduct\_type & type of a dataproduct/entity\\
+DataProductSubType & EntityDescription.dataproduct\_subtype & subtype of a dataproduct/entity\\
+ObsDataset.calibLevel  & EntityDescription.level & (output) calibration level, integer between 0 and 3\\\hline
+\bottomrule
+\end{tabulary}
+\caption{Mapping between attributes from \class{Dataset}-classes from Dataset Metadata Model to classes in ProvenanceDM.}
+\label{tab:datasetmapping}
+\end{table}
+
+
+The \class{Agent} class, which is used for defining responsible persons and
+organizations, is similar to the \class{Party} class in the Dataset Metadata Model and SimDM.
+
+In SimDM one also encounters a normalization similar to our split-up of descriptions from
+actual data instances and executions of processes: the SimDM class ``experiment''
+is a type of \class{Activity} and its general, reusable description is called a ``protocol'',
+which can be considered as a type of this model's \class{ActivityDescription}.
+More direct mappings between classes and attributes of both models are given in Table~\ref{tab:simdmmapping}.
+
+\begin{table}[h]
+\small
+\tymax  0.5\textwidth
+\begin{tabulary}{1.0\textwidth}{@{}lLp{4cm}@{}}
+\toprule
+\midrule
+Experiment      & Activity               & \\
+Experiment.name & Activity.label         & human readable name; name attribute in SimDM is inherited from Resource-class\\
+Experiment.executionTime  & Activity.endTime & end time of the execution of an experiment/activity \\
+Experiment.protocol & Activity.description\_ref & reference to the protocol or description class \\
+Protocol        & ActivityDescription    & \\
+Protocol.name   & ActivityDescription.label  & human readable name\\
+Protocol.referenceURL & ActivityDescription.doculink & reference to a webpage describing it\\
+ParameterSetting     & Parameter              & value of an (input) parameter\\
+InputParameter       & ParameterDescription              & description of an (input) parameter\\
+Party           & Agent                 & responsible person or organization\\
+Party.name      & Agent.label & name of the agent \\
+Contact         & WasAssociatedWith & \\
+Contact.role    & WasAssociatedWith.role & role which the agent/party had for a certain experiment (activity); SimDM roles contain: \texttt{owner}, \texttt{creator}, \texttt{publisher}, \texttt{contributor}\\
+Contact.party    & WasAssociatedWith.agent & reference to the agent/party \\
+DataObject     & Entity        & a dataset, which can be/refer to a collection\\
+
+\bottomrule
+\end{tabulary}
+\caption{Mapping between classes and attributes from SimDM to classes/attributes in ProvenanceDM.}
+\label{tab:simdmmapping}
+\end{table}
+
+
+
+
+More similarities and links to other data models will be detailed in future
+versions of this working draft.
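
The DatasetDM-to-ProvenanceDM table in datamodel-links.tex above can be read as a
lookup table for deriving basic provenance records from existing dataset metadata.
A minimal Python sketch, assuming the metadata arrives as a flat dictionary keyed by
the attribute names in the first table column. The names DATASET_TO_PROV and
derive_provenance are hypothetical and not part of any IVOA standard or library,
and only a subset of the table rows is shown:

```python
# Hypothetical illustration: attribute names are copied from the mapping table,
# but the dictionary layout and the function are assumptions made for this sketch.
DATASET_TO_PROV = {
    "DataID.title": "Entity.label",
    "DataID.collection": "HadMember.collectionId",
    "DataID.creator": "Agent.name",
    "DataID.creatorDID": "AlternateOf.entityId",
    "DataID.date": "WasGeneratedBy.time",
    "Curation.PublisherDID": "Entity.id",
    "Curation.Date": "Entity.releaseDate",
    "Curation.Version": "Entity.version",
    "Curation.Rights": "Entity.access",
    # Curation.Publisher also maps to Agent.name in the table; it is left out
    # here because this flat sketch cannot tell two agents apart.
}


def derive_provenance(dataset_metadata):
    """Group mapped attribute values by ProvenanceDM class name."""
    prov = {}
    for source_key, value in dataset_metadata.items():
        target = DATASET_TO_PROV.get(source_key)
        if target is None:
            continue  # attribute has no mapping in this sketch
        cls, attribute = target.split(".", 1)
        prov.setdefault(cls, {})[attribute] = value
    return prov


if __name__ == "__main__":
    example = {"DataID.title": "M31 mosaic", "Curation.Version": "1.2"}
    print(derive_provenance(example))
```

A real implementation would need one Agent record per role (creator, publisher,
contact) rather than a single merged dictionary per class.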

Modified: trunk/projects/dm/provenance/description/intro-general.tex
==============================================================================
--- trunk/projects/dm/provenance/description/intro-general.tex	Fri Apr 28 13:46:41 2017	(r3983)
+++ trunk/projects/dm/provenance/description/intro-general.tex	Fri Apr 28 14:34:41 2017	(r3984)
@@ -3,7 +3,7 @@
describing the provenance of astronomical data.
We follow the definition of provenance as proposed by the W3C \citep{std:W3CProvDM}, i.e. that provenance is ``information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness''.

-In astronomy, entities are generally datasets composed of VOTables, FITS files or database tables, or files containing logs, values (spectra, lightcurves), parameters, etc. The activities correspond to an observation, a simulation, or processing steps (image stacking, object extraction, etc.). The people involved can be individual persons (observer, publisher...), groups or organisations.
+In astronomy, such entities are generally datasets composed of VOTables, FITS files, database tables or files containing values (spectra, lightcurves), logs, parameters, etc. The activities correspond to processes like an observation, a simulation, or processing steps (image stacking, object extraction, etc.). The people involved can be individual persons (observer, publisher...), groups or organisations.
An example for activities, entities and agents as they can be discovered backwards in time is given in Figure~\ref{fig:example-workflow}.

\begin{figure}[h]
@@ -16,12 +16,11 @@
\end{figure}

+The currently discussed Provenance Data Model is sufficiently abstract that its core pattern could be applied to any kind of process using either observation or simulation data.
+It could also be used to describe the workflow for observation proposals or the publication of scientific articles based on (astronomical) data. However, here we focus on astronomical data. The links between the Provenance Data Model and other IVOA data models
+will be discussed in Section~\ref{sec:dmlinks}. We note here that the provenance of simulated data is already covered by the Simulation Data Model
+\citep[SimDM,][]{std:SimDM}. Therefore we also give a mapping between SimDM and the Provenance Data Model in Section~\ref{sec:dmlinks}.

-We note that the provenance of simulated data is already described inside the Simulation Data Model
-\citep[SimDM,][]{std:SimDM}. However, the Provenance Data Model currently discussed is
-sufficiently abstract that its core pattern could be applied to any kind of process using either observation or simulation data. It could also be used to describe the workflow for observation proposals or the publication of scientific articles based on (astronomical) data.
-The links between the Provenance Data Model and other IVOA data models

%including extraction of data from
%databases or even the flow of scientific proposals from application to
@@ -32,10 +31,6 @@
%does not extend a certain period, etc...

-% TODO: Include here somewhere in this section a simple workflow/provenance graph
-% to illustrate the basic concepts
-
-
\subsection{Goal of the provenance model}\label{sec:goals}

The goal of this Provenance Data Model is to describe how provenance information

Modified: trunk/projects/dm/provenance/description/prov-refs.bib
==============================================================================
--- trunk/projects/dm/provenance/description/prov-refs.bib	Fri Apr 28 13:46:41 2017	(r3983)
+++ trunk/projects/dm/provenance/description/prov-refs.bib	Fri Apr 28 14:34:41 2017	(r3984)
@@ -1,6 +1,27 @@
+@misc{std:ProvenanceDM,
+    author = {Kristin Riebe and Mathieu Servillat and François Bonnarel and Mireille
+Louys and Florian Rothmaier and Michèle Sanguillon and the IVOA Data Model Working Group},
+    title = {{IVOA} Provenance Data Model},
+    howpublished = {{IVOA} Working Draft},
+    month =        apr,
+    year =         2017,
+    url =          {http://www.ivoa.net/documents/ProvenanceDM/}
+}
+
+@misc{std:ProvenanceImplementationNote,
+    author = {Kristin Riebe and Mathieu Servillat and François Bonnarel and Mireille
+Louys and Florian Rothmaier and Michèle Sanguillon and the IVOA Data Model Working Group},
+    title = {Provenance Implementation Note},
+    howpublished = {{IVOA} Note},
+    month =        apr,
+    year =         2017,
+    url =          {http://volute.g-vo.org/svn/trunk/projects/dm/provenance/implementation-note/}
+%    url =          {http://www.ivoa.net/documents/ProvenanceImplementationNote/}
+}
+
@MISC{std:SimDM,
-   author = {{Lemson}, G. and {Wozniak}, H. and {Bourges}, L. and {Cervino}, M. and
-    {Gheller}, C. and {Gray}, N. and {LePetit}, F. and {Louys}, M. and
+   author = {{Lemson}, G. and {Wozniak}, H. and {Bourges}, L. and {Cervino}, M. and
+    {Gheller}, C. and {Gray}, N. and {LePetit}, F. and {Louys}, M. and
{Ooghe}, B. and {Wagner}, R.},
title = "{Simulation Data Model Version 1.0}",
howpublished = {IVOA Recommendation 03 May 2012},
@@ -44,8 +65,7 @@
}

@misc{std:DatasetDM,
-    author = {Francois Bonnarel and Omar Laurino and Gerard Lemson and Mireille Louys and Arnold Rots and Doug
-Tody and the IVOA Data Model Working Group},
+    author = {Francois Bonnarel and Omar Laurino and Gerard Lemson and Mireille Louys and Arnold Rots and Doug Tody and the IVOA Data Model Working Group},
title = {{IVOA} Dataset Metadata Model},
howpublished = {{IVOA} Working draft},
month =        oct,