[Volute] r3630 - trunk/projects/dm/provenance/description

Volute commit messages volutecommits at g-vo.org
Tue Oct 18 00:48:21 CEST 2016


Author: kriebe
Date: Tue Oct 18 00:48:20 2016
New Revision: 3630

Log:
References added, inserted new votable-serialisation from Francois, minor editing in provaccess and previousefforts

Modified:
   trunk/projects/dm/provenance/description/ProvenanceDM.pdf
   trunk/projects/dm/provenance/description/intro-previousefforts.tex
   trunk/projects/dm/provenance/description/prov-refs.bib
   trunk/projects/dm/provenance/description/provaccess.tex

Modified: trunk/projects/dm/provenance/description/ProvenanceDM.pdf
==============================================================================
Binary file (source and/or target). No diff available.

Modified: trunk/projects/dm/provenance/description/intro-previousefforts.tex
==============================================================================
--- trunk/projects/dm/provenance/description/intro-previousefforts.tex	Mon Oct 17 14:55:07 2016	(r3629)
+++ trunk/projects/dm/provenance/description/intro-previousefforts.tex	Tue Oct 18 00:48:20 2016	(r3630)
@@ -1,9 +1,9 @@
 \subsection{Previous efforts}
-The provenance concept was early introduced by the IVOA within the scope of the Observation Data Model (ref1 : IVOA note 2005) as a class describing where the data are coming from. A full observation data model dedicated to the specific spectral data was then designed (Ref2 : spectral data model) as well as a fully generic characterisation data model of the measureemnt axes of the data (ref3: characterisation data model) while the progress on the provenance data model were slowing down.
+The provenance concept was introduced early by the IVOA within the scope of the Observation Data Model \citep[see IVOA note][]{note:observationdm} as a class describing where the data come from. A full observation data model dedicated to spectral data was then designed \citep[Spectral Data Model,][]{std:SpectralDM}, as well as a fully generic characterisation data model of the measurement axes of the data \citep[Characterisation Data Model,][]{std:CharacterisationDM}, while progress on the provenance data model slowed down.
 
-IVOA DM WG first gathered various use cases coming from different communities of observational  astronomy (optical,  radio, Xray, interferometry). Common motivations for a provenance tracing of the history included : quality assesment, discovery of dataset progenitors and access to metadata necessary for reprocessing. Provenance datamodel was then designed as the combination of Data processing, Observing Configuration and Observation ambiant conditions datamodel classes. 
-The Processing class was embedding a sequence of processing stages which were hooking specific ad hoc details and links to input and output datasets, as well as processing step description. 
-Despite the attempts of UML description of the model and writing of xml serialization examples the IVOA effort failed to provide a workable solution:  the scope was probably too ambitious and the technical background too instable. A compilation of these early developments can be found on the IVOA site (ref4). From 2013 onwards IVOA concentrated on use cases related to processing description and decided to design the model  by extending the basic W3C provenance basic structure,as described in the current specification. 
+The IVOA Data Model Working Group first gathered various use cases coming from different communities of observational astronomy (optical, radio, X-ray, interferometry). Common motivations for tracing the provenance history included: quality assessment, discovery of dataset progenitors and access to metadata necessary for reprocessing. The provenance data model was then designed as the combination of \emph{Data processing}, \emph{Observing configuration} and \emph{Observation ambient conditions} data model classes. 
+The \emph{Processing class} embedded a sequence of processing stages that hooked in specific ad hoc details and links to input and output datasets, as well as a processing step description. 
+Despite attempts at a UML description of the model and the writing of XML serialization examples, the IVOA effort failed to provide a workable solution: the scope was probably too ambitious and the technical background too unstable. A compilation of these early developments can be found on the IVOA site \citep{std:previousefforts}. From 2013 onwards the IVOA concentrated on use cases related to processing description and decided to design the model by extending the basic W3C provenance structure, as described in the current specification. 
 
 Outside of the astronomical community, the Provenance Challenge series (2006 -- 2010), a community effort to achieve inter-operability between different representations of provenance in scientific workflows, resulted in the Open Provenance Model (\cite{moreau2010}). 
 Later, the W3C Provenance Working Group was founded and released the W3C Provenance Data Model as Recommendation in 2013 (\cite{std:W3CProvDM}). 

Modified: trunk/projects/dm/provenance/description/prov-refs.bib
==============================================================================
--- trunk/projects/dm/provenance/description/prov-refs.bib	Mon Oct 17 14:55:07 2016	(r3629)
+++ trunk/projects/dm/provenance/description/prov-refs.bib	Tue Oct 18 00:48:20 2016	(r3630)
@@ -62,3 +62,46 @@
     url = {http://www.ivoa.net/documents/latest/RM.html}
 }
 
+@misc{std:VODML,
+    author = {Gerard Lemson and Omar Laurino and Laurent Bourges and Markus Demleitner and Patrick Dowler and Matthew Graham and Norman Gray and Jesus Salgado},
+    title = {VO-DML: a consistent modeling language for IVOA data models, Version 1.0},
+    howpublished = {{IVOA} Draft},
+    month = sep,
+    year = 2016,
+    url = {http://www.ivoa.net/documents/VODML/}
+}
+@misc{std:SpectralDM,
+    author = {Jonathan McDowell and Jesus Salgado and Carlos Rodrigo Blanco and Pedro Osuna and Doug Tody and Enrique Solano and Joe Mazzarella and Raffaele D’Abrusco and Mireille Louys and Tamas Budavari and Markus Dolensky and Inga Kamp and Kelly McCusker and Pavlos Protopapas and Arnold Rots and Randy Thompson and Frank Valdes and Petr Skoda and Bruno Rino and Jim Cant and Omar Laurino and the IVOA Data Access Layer and Data Model Working Groups},
+    title = {IVOA Spectral Data Model, Version 2.0},
+    howpublished = {{IVOA} Draft},
+    month = sep,
+    year = 2016,
+    url = {http://www.ivoa.net/documents/SpectralDM/}
+}
+
+@misc{note:observationdm,
+    author = {IVOA Data Model Working Group},
+    title = {Data Model for Observation, Version 1.00},
+    howpublished = {{IVOA} Note},
+    month = apr,
+    year = 2005,
+    url = {http://www.ivoa.net/documents/latest/DMObs.html}
+}
+
+@misc{std:CharacterisationDM,
+    author = {IVOA Data Model Working Group},
+    title = {Data Model for Astronomical DataSet Characterisation, Version 1.13},
+    howpublished = {{IVOA} Note},
+    month = mar,
+    year = 2008,
+    url = {http://www.ivoa.net/documents/latest/CharacterisationDM.html}
+}
+
+@misc{std:previousefforts,
+    author = {Francois Bonnarel and {IVOA Data Model Working Group}},
+    title = {Provenance Data Model Legacy},
+    howpublished = {Webpage},
+    month = oct,
+    year = 2016,
+    url = {http://wiki.ivoa.net/twiki/bin/view/IVOA/ProvenanceDataModelLegacy}
+}

Modified: trunk/projects/dm/provenance/description/provaccess.tex
==============================================================================
--- trunk/projects/dm/provenance/description/provaccess.tex	Mon Oct 17 14:55:07 2016	(r3629)
+++ trunk/projects/dm/provenance/description/provaccess.tex	Tue Oct 18 00:48:20 2016	(r3630)
@@ -1,58 +1,111 @@
 \subsection{Provenance Data Model serialization}
-There are three possible families of ProvDM metadata serializations.
+There are three possible families of ProvenanceDM metadata serializations.
 \begin{itemize}
- \item W3C serializations: Prov-N, PROV\-Json, PROV\-XML. These are serialization of the W3C provenance data model. They allow the possibility to add additional IVOA or ad hoc attributes to the basic ones in each class. This way the IVOA models can produce W3C compliant serializations.
- \item Mapping of ProvDM classes onto tables with appropriate relationships. This can allow managment by a TAP service (the model mapping is then described with the TAP schema). The serialization will be a single table according to the query.
+ \item W3C serializations: PROV-N, PROV-JSON, PROV-XML. These are serializations of the W3C provenance data model. They allow adding IVOA or ad hoc attributes to the basic ones in each class. This way the IVOA models can produce W3C-compliant serializations.
+ \item Mapping of ProvenanceDM classes onto tables with appropriate relationships. This can allow management by a TAP service (the model mapping is then described with the TAP schema). The serialization will be a single table according to the query.
 
- \TODO{TAP SCHEMA of the ProvDM datamodel: Maybe Mathieu can provide us with a copy of the TAP schema he designed ?}
- \item Direct VOTABLE mapping by using some ad hoc mapping based on transcription of PROV-N format: this is called PROV-VOTABLE. Moreover in the future we could also define a VO-DML (\textbf{ref}) version of the mapping.
+ \TODO{TAP SCHEMA of the ProvenanceDM datamodel: Maybe Mathieu can provide us with a copy of the TAP schema he designed ?}
+
+ \item Direct VOTABLE mapping by using some ad hoc mapping based on transcription of PROV-N format: this is called PROV-VOTABLE. Moreover in the future we could also define a VO-DML \citep{std:VODML} version of the mapping.
 The following is an example of provenance metadata in this PROV-VOTABLE format. Objects become tables, the class of which is rendered by a utype. Attributes and relationships become FIELDS or PARAMS. The model attribute names also become VOTABLE utypes.  
 \begin{verbatim}
-<TABLE name="cta:telescope_stage_520" utype="prov:activity">
-    <PARAM name="start" utype="prov:startTime" datatype="char" 
-           arraysize="*" xtype="ISO8601" value="2015-07-30T09:45:00">
-    <PARAM name ="stop" utype="prov:endTime"  datatype="char" 
-           arraysize="*" xtype="ISO8601" value = "2015-07-30T10:00:00">
-    <PARAM name="methodname" utype="voprov:method_name" dataype="char" 
-           arraysize="*" value="Telescope_stage">
-    <PARAM name="version" utype="voprov:method_version" datatype="char" 
-           arraysize="*" value="1.0">    
-    <PARAM utype="voprov:used" datatype="char" arraysize="*" 
-           value="cta:run13000_EVT0">
-    <PARAM utype="voprov:used" datatype="char" arraysize="*" 
-           value="cta:Stage1Config_5250">
+
+<?xml version="1.0" encoding="UTF-8"?>
+<VOTABLE version="1.2" 
+  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xmlns="http://www.ivoa.net/xml/VOTable/v1.2"
+  xsi:schemaLocation=
+  "http://www.ivoa.net/xml/VOTable/v1.2 http://www.ivoa.net/xml/VOTable/v1.2">
+
+<RESOURCE name="Stage1">
+
+<TABLE name="activities" utype="prov:activity">
+  <FIELD name="name" utype="prov:activity.name" datatype="char" 
+          arraysize="*"/>
+  <FIELD name="start" utype="prov:startTime" datatype="char" 
+          arraysize="*" xtype="ISO8601"/>
+  <FIELD name ="stop" utype="prov:endTime"  datatype="char" 
+          arraysize="*" xtype="ISO8601"/>
+  <FIELD name="methodname" utype="voprov:method_name" 
+          datatype="char" arraysize="*"/>
+  <FIELD name="version" utype="voprov:method_version" 
+          datatype="char" arraysize="*"/>  
+  <DATA>
+    <TABLEDATA>
+      <TR><TD>cta:telescope_stage_520</TD>
+          <TD>2015-07-30T09:45:00</TD><TD>2015-07-30T10:00:00</TD>
+          <TD>Telescope_stage</TD><TD>1.0</TD></TR>
+    </TABLEDATA>
+  </DATA>      
 </TABLE>
-<TABLE name="cta:Stage1Config_5250", utype="prov:entity">
-    <PARAM name="type" utype="prov:type" datatype="char" 
-           arraysize="*" value="file">   
+
+<TABLE name="entities" utype="prov:entity">
+  <FIELD name="name" utype="prov:entity.name" datatype="char" 
+          arraysize="*"/>
+  <FIELD name="label" utype="prov:label" datatype="char" 
+          arraysize="*"/>
+  <FIELD name="type" utype="prov:type" datatype="char" 
+          arraysize="*"/>
+  <FIELD name="run" utype="cta:runNumber" 
+          datatype="int"/>
+  <FIELD name="tel" utype="cta:telescope" datatype="char" 
+          arraysize="*"/>
+  <DATA>
+    <TABLEDATA>
+      <TR><TD>cta:Stage1Config_5250</TD><TD></TD><TD>file</TD>
+          <TD></TD><TD></TD></TR>
+      <TR><TD>cta:run1000_EVT1</TD><TD>EVT1 file</TD><TD>file</TD>
+          <TD>1000</TD><TD>MST21</TD></TR>
+      <TR><TD>cta:run13000_EVT0</TD><TD>EVT0 file</TD><TD>file</TD>
+          <TD>13000</TD><TD>MST21</TD></TR>
+    </TABLEDATA>
+  </DATA>
 </TABLE>
-<TABLE name="cta:run1000_EVT1", utype="prov:entity">
-    <PARAM name="label" utype="prov:label" datatype="char" arraysize="*" 
-           value="EVT1 file">
-    <PARAM name="type" utype="prov:type" datatype="char" arraysize="*" 
-           value="file">
-    <PARAM name="run" utype="cta:runNumber" datatype="int"  value="13000">
-    <PARAM name="tel" utype="cta:telescope" datatype="char" arraysize="*" 
-           value="MST21">
-    <PARAM utype="wasGeneratedBy" datatype="char" arraysize="*" 
-           value="cta:Stage1Config_5250">
+
+<TABLE name="usedRelationship" utype="voprov:used" >
+  <FIELD name="head" datatype="char" arraysize="*" />
+  <FIELD name="tail" datatype="char" arraysize="*" />
+  <DATA>
+    <TABLEDATA>
+      <TR><TD>cta:telescope_stage_520</TD>
+          <TD>cta:run13000_EVT0</TD></TR>
+      <TR><TD>cta:telescope_stage_520</TD>
+          <TD>cta:Stage1Config_5250</TD></TR>
+    </TABLEDATA>
+  </DATA>
 </TABLE>
+
+<TABLE name="wasGeneratedByRelationship" utype="voprov:wasGeneratedBy" >
+  <FIELD name="head" datatype="char" arraysize="*" />
+  <FIELD name="tail" datatype="char" arraysize="*" />
+  <DATA>
+    <TABLEDATA>
+      <TR><TD>cta:run1000_EVT1</TD><TD>cta:telescope_stage_520</TD></TR>
+    </TABLEDATA>
+  </DATA>
+</TABLE>
+
+</RESOURCE>
+</VOTABLE>
 \end{verbatim}
   
   
 \end{itemize}
+
+
 \subsection{Access protocols}
+We envision two possible access protocols:
 \begin{itemize}
-\item ProvDAL: retrieve provenance information based on given id of a dataEntity or activity
+\item ProvDAL: retrieve provenance information based on given id of a data entity or activity
 
-ProvDAL is a service the interface of which is organized around one main PARAMETER, the "ID" of the entity (obs\_publisher\_did of an ObSDataSet for example) The response is given in one of the following formats: PROV-N, PROV-JSON, PROV-XML, PROV-VOTABLE. Additional parameters can complete ID to refine the query. FORMAT allows to choose the output format. STEP allows to discriminate between STEP=LAST which gives the last step in the provenace chain and STEP = ALL which gives the whole chain.
+ProvDAL is a service whose interface is organized around one main PARAMETER, the ``ID'' of the entity (obs\_publisher\_did of an ObsDataSet for example). The response is given in one of the following formats: PROV-N, PROV-JSON, PROV-XML, PROV-VOTABLE. Additional parameters can complement ID to refine the query. FORMAT selects the output format. STEP distinguishes between STEP=LAST, which gives the last step in the provenance chain, and STEP=ALL, which gives the whole chain.
 Multiple ID PARAMETER is allowed in order to retrieve several data set provenance details at the same time.
 \item ProvTAP: allows detailed queries for provenance information, discovery of datasets based on 
 e.g. code version.
 
-ProvTAP is a TAP service implementing the ProvDM datamodel. The PROVDM  mapping is included in the TAP schema (see above). The result of any query is a single table joining information coming from one or several "provenance" tables available in the database. 
+ProvTAP is a TAP service implementing the ProvenanceDM data model. The data model mapping is included in the TAP schema (see above). The result of any query is a single table joining information coming from one or several ``provenance'' tables available in the database. 
 
-A special case is considered where ProvDM and OBscore are both implemented in the same TAP service and queried together. The TAP response is then providing an Obscore Table with a ProvDM extension. We can imagine that in the future this could be hard-coded and registered as an ObsTAPRov service. 
+A special case is considered where ProvenanceDM and ObsCore are both implemented in the same TAP service and queried together. The TAP response then provides an ObsCore table with a ProvenanceDM extension. We can imagine that in the future this could be hard-coded and registered as an ObsTapProv service. 
 
 
 \item Do we need combined query possibilities, i.e. ask for ObsCore-fields and Provenance fields
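
Note on the ProvDAL interface described in provaccess.tex above: the following is only a minimal sketch of how a client might assemble such a request, not part of the commit. The parameter names ID, FORMAT and STEP and their values are taken from the text; the base URL, the function name build_provdal_query and the chosen defaults are hypothetical, since the draft does not define an endpoint or default values.

```python
from urllib.parse import urlencode

# Values listed in the draft text for the FORMAT and STEP parameters.
ALLOWED_FORMATS = {"PROV-N", "PROV-JSON", "PROV-XML", "PROV-VOTABLE"}
ALLOWED_STEPS = {"LAST", "ALL"}


def build_provdal_query(base_url, ids, fmt="PROV-VOTABLE", step="LAST"):
    """Assemble a ProvDAL-style query URL (hypothetical helper).

    base_url is an assumed service endpoint; the defaults for fmt and
    step are illustrative choices, not defaults stated in the draft.
    """
    if not ids:
        raise ValueError("at least one ID is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError("unsupported FORMAT: " + fmt)
    if step not in ALLOWED_STEPS:
        raise ValueError("unsupported STEP: " + step)
    # The ID parameter may be repeated to request several provenance
    # records in one call, as the draft allows.
    params = [("ID", identifier) for identifier in ids]
    params.append(("FORMAT", fmt))
    params.append(("STEP", step))
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # http://example.org/provdal is a placeholder, not a real service.
    print(build_provdal_query(
        "http://example.org/provdal",
        ["cta:run1000_EVT1", "cta:run13000_EVT0"],
        fmt="PROV-N",
        step="ALL",
    ))
```

Repeating ID as separate key/value pairs, rather than joining identifiers with a separator, keeps identifiers that contain colons or commas unambiguous after URL encoding.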

