[Volute] r4907 - trunk/projects/time-domain/time-series/note

Volute commit messages volutecommits at g-vo.org
Tue Apr 10 15:20:00 CEST 2018


Author: nebot
Date: Tue Apr 10 15:20:00 2018
New Revision: 4907

Log:
Modif-10April2018-metadata

Added:
   trunk/projects/time-domain/time-series/note/ada_metadata.tex

Added: trunk/projects/time-domain/time-series/note/ada_metadata.tex
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ trunk/projects/time-domain/time-series/note/ada_metadata.tex	Tue Apr 10 15:20:00 2018	(r4907)
@@ -0,0 +1,151 @@
+\section{Time Series}
+\label{sect:metadata}
+In this section we describe Time Series data in a broad context and define its most relevant parameters. We describe the common requirements of the different science use cases collected by the SPC \cite{enrique}. A common time frame is defined with a minimum set of parameters taken from, and compatible with, STC. We then compare the fields defined here with the content of ObsCore and EPNcore.
+
+\subsection{Definition}
+Time Series can be defined in a very broad sense as a collection of any kind of data over time for a particular source (e.~g. star, binary, QSO) or part of a source (e.~g. sunspots), independent of the type of data (images, light curves, radial velocities, polarisation states or degrees, positions, number of sunspots, densities, \ldots), the duration of the observation or the cadence.
+
+Independent of the type of data, we can sketch Time Series data as shown in Fig.~\ref{fig:time-series}. Time Series data is composed of a set of observations (n\_observations = 3 in this example), each with its own exposure or integration time (t\_exp). Although in some cases the cadence, or time span between consecutive observations (delta\_t), is fixed, in the general case it can vary, and we can therefore define a minimum and a maximum value (delta\_t\_min, delta\_t\_max). Each observation has its own time stamp (t\_i) with a given precision or resolution (t\_resolution). As can be seen from the figure, the duration of the observation can be defined in different ways: a) as the total integration or exposure time, i.~e. the sum of all the exposure times: t\_exp\_total = $\sum$t\_exp; or b) as the time span between the beginning and the end of the observations: t\_exp\_total = t\_max $-$ t\_min. Note that if the exposure time is constant for all the observations, definition a) reduces to t\_exp\_total = n\_observations $\times$ t\_exp. The situation can be more complicated: for instance, clouds during the observation may force us to pause the exposure and resume once they have passed, or we might want to remove parts of the observation due to artefacts in the data. In any case these two values can be taken as approximations of the minimum and the maximum value this field can have.
+The most relevant fields of Time Series data are summarized in Table~\ref{tab:fields}.
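+
+As a worked illustration of the two definitions above, consider hypothetical values that are not taken from any real dataset: three exposures of 10, 20 and 30\,s, with time stamps $t_1 = 0$\,s, $t_2 = 100$\,s and $t_3 = 250$\,s. We assume, only for this example, that each time stamp marks the start of its exposure. Definition a) gives
+\begin{equation}
+  \mathrm{t\_exp\_total} = \sum \mathrm{t\_exp} = 10 + 20 + 30 = 60\,\mathrm{s},
+\end{equation}
+while definition b), with t\_min $= 0$\,s and t\_max $= 250 + 30 = 280$\,s, gives
+\begin{equation}
+  \mathrm{t\_exp\_total} = \mathrm{t\_max} - \mathrm{t\_min} = 280\,\mathrm{s}.
+\end{equation}
+In the same example delta\_t\_min $= 100$\,s and delta\_t\_max $= 150$\,s. The two values of t\_exp\_total, 60\,s and 280\,s, bracket the range this field can take.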
+
+\begin{figure}
+\begin{center}
+  \includegraphics[width=\textwidth]{figs/fig1.png}
+\caption{Simple representation of Time Series data.}\label{fig:time-series}
+\end{center}
+\end{figure}
+
+\begin{table}[th]
+  \begin{center}
+  \caption{Time Series data fields.}\label{tab:fields}
+    \begin{tabular}{p{0.35\textwidth}p{0.64\textwidth}}
+      \sptablerule
+      \textbf{Field}  & \textbf{Explanation}                        \\\sptablerule
+      (RA,Dec)        & Coordinates$^1$                             \\
+      target\_name    & Target name$^1$                             \\ 
+      t\_min          & Date of the beginning of the observation    \\
+      t\_max          & Date of the end of the observation          \\
+      t\_exp\_min     & Minimum exposure time                       \\
+      t\_exp\_max     & Maximum exposure time                       \\
+      t\_exp\_total   & Total exposure time                         \\
+      delta\_t\_min   & Minimum time sampling / cadence             \\
+      delta\_t\_max   & Maximum time sampling / cadence             \\
+      t\_resolution   & Time resolution/precision                   \\
+      n\_observations & Number of observations                      \\
+      type\_of\_data  & Type of data (fluxes, radial velocities, images,...)\\
+      \sptablerule
+    \end{tabular}
+  \end{center}
+  \textbf{Note:} $^1$ For SSO or other moving objects, coordinates might not be sufficient or relevant.
+\end{table}
+
+In many cases time series data is composed of only three columns:
+\begin{center}
+time, magnitude, magnitude error 
+\end{center}
+For this data to be fully exploitable and reusable (interoperable) it has to be properly documented. In this specific case the minimum information that needs to be provided is: the object coordinates (or name), the filter in which the observations have been carried out, and the time frame and offset (if applicable).
+
+\subsection{Science use cases}
+\label{sect:usecases}
+Different science use cases for Time Series have been collected and described by \cite{solano2012} and can be found at \url{http://wiki.ivoa.net/twiki/bin/view/IVOA/CSPTimeSeries}. The science cases are grouped according to their common requirements:
+\begin{itemize}
+\item \textbf{Group A} Common requirement: Combine photometry and light curves of a given object/list of objects in the same photometric band
+\item \textbf{Group B} Common requirement: Combine photometry and light curves of a given object/list of objects in different photometric bands
+\item \textbf{Group C} Common requirement: Time series other than light curves
+\end{itemize}
+
+\begin{table}
+  \begin{center}
+  \begin{tabular}{|c|c|c|c|c|c|}
+    \sptablerule
+    \textbf{Science} & \textbf{Target(s)} & \textbf{Datatype} & \textbf{Time} & \textbf{Brightness} & \textbf{Photometric} \\
+    \textbf{Case}    &                    &                   &               &                     & \textbf{Band}        \\\sptablerule
+    Group A      &  yes      &  lightcurves &  yes &    yes     &      one         \\
+    Group B      &  yes      &  lightcurves &  yes &    yes     &      several     \\
+    Group C      &  yes      &  other       &  yes &    no      &      no          \\
+    \sptablerule
+  \end{tabular}
+  \end{center}
+\end{table}
+
+As highlighted by the different science use cases described in \url{http://wiki.ivoa.net/twiki/bin/view/IVOA/CSPTimeSeries}, there are astrophysical phenomena that vary on different timescales, and to study the different underlying physical mechanisms a user might need to collect and analyse data from different missions and of different nature.
+Answering all the possible science cases is a difficult task. We would therefore like to keep a practical approach to the problem, solving the simplest cases in a first step and allowing incremental solutions for more complex systems at later stages.
+
+Looking at the different science cases we simplify the questions to two:
+\begin{enumerate}
+\item \emph{Have these two missions observed this object within these two dates?}
+\item \emph{Is it possible to discover long/short term variability within the data?}
+\end{enumerate}
+To answer the first question a user needs to be sure that dates are comparable, that is, times have to be brought into a common time frame. To answer the second question we need to keep track of the minimum and maximum time span. We aim in a first step to answer these fundamental questions; later on we will move to the specific science cases, which focus on the nature of the Time Series data, giving priority to light curves, which represent the majority of the cases, while keeping a wider approach in mind.
+
+\subsection{Using a common time frame}
+To compare datasets from different missions or archives a common representation of time is needed. To this end we propose to map time into a pivot format. Following \cite{rots2015} and \cite{std:STC} we propose a minimum set of metadata to be added to serializations of Time Series (see Table~\ref{tab:metadata}).
+
+\begin{table}
+  \begin{center}
+    \caption{Metadata for time in Time Series data serialisation.}\label{tab:metadata}
+      \begin{tabular}{p{0.35\textwidth}p{0.64\textwidth}}
+      \sptablerule
+      \textbf{Parameter} & \textbf{Explanation} \\\sptablerule
+      time\_frame\_scale & The time frame scale is the scale used to measure time. IAU definition: ``A time scale is simply a well defined way of measuring time based on a specific periodic natural phenomenon.'' \url{http://aa.usno.navy.mil/publications/docs/Circular_179.pdf}. Recognized time scale values and their meaning are listed in Table~\ref{tab:scales}. If the scale is not known, use ``UNKNOWN''. \\
+      time\_frame\_position &  The time frame position is the place where the time is measured. Standard values are listed in Table~\ref{tab:positions}. If the position is not known, use ``UNKNOWN''.\\
+      time\_uncertainty & Resolution or uncertainty of the time stamps. \\
+      time\_sys\_error  & Systematic error in time that accounts for our knowledge of the time frame (scale and position). If time\_frame\_scale is not known, use 100\,s as the DEFAULT value; if time\_frame\_scale and time\_frame\_position are both not known, use 1000\,s as the DEFAULT value. Approximately 100\,s is adequate for the time scale, since that is related to changes in the clock in space or on Earth; 1000\,s is adequate if we do not know whether times are corrected for the position of the Earth/satellite on its orbit around the Sun, since that is approximately twice the time it takes light to travel the Sun-Earth/satellite distance. \\
+      time\_representation  & Representation of the time values: JD, MJD or ISO-8601. \\
+      time\_offset &  Offset that has been subtracted from the time. Time can be relative to a certain moment, e.~g. the time elapsed since a GRB that happened on date YYYYMMDDHHMMSS.SS, or to an arbitrary number the authors have subtracted from the data to allow higher precision in the time stamps. Its default value is 0.0. \\
+      Description & A short text describing what varies with time, e.~g. ``Photometric variability in filter V'' or ``Radial velocity curve in HJD''. This field is intended to help the reader. \\
+    \sptablerule
+    \end{tabular}
+  \end{center}
+\end{table}
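+
+The default values of time\_sys\_error given in Table~\ref{tab:metadata} can be summarised as follows. This is only a restatement of the table; the case in which only time\_frame\_position is unknown is not specified there and is therefore left out:
+\begin{equation}
+  \mathrm{time\_sys\_error} =
+  \left\{
+  \begin{array}{ll}
+    100\,\mathrm{s}  & \mbox{if time\_frame\_scale is unknown,} \\
+    1000\,\mathrm{s} & \mbox{if time\_frame\_scale and time\_frame\_position are both unknown.}
+  \end{array}
+  \right.
+\end{equation}
+The 1000\,s figure can be checked with a short calculation, assuming a Sun-Earth distance of 1\,au $\approx 1.496\times10^{8}$\,km and a speed of light $c \approx 2.998\times10^{5}$\,km\,s$^{-1}$ (standard values, not quoted elsewhere in this note):
+\begin{equation}
+  2 \times \frac{1\,\mathrm{au}}{c} \approx 2 \times \frac{1.496\times10^{8}\,\mathrm{km}}{2.998\times10^{5}\,\mathrm{km\,s^{-1}}} \approx 2 \times 499\,\mathrm{s} \approx 1000\,\mathrm{s}.
+\end{equation}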
+
+We recommend being specific about the time frame, and we suggest using:
+\begin{center}
+  JD(TT;BARYCENTER)
+\end{center}
+We also give some values that can be used as defaults when some information is not known and impossible to recover. We minimize the impact of doing this by adding a systematic error to time when those values are unknown.
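+
+As an illustration of time\_offset and time\_representation from Table~\ref{tab:metadata}, the original time is recovered by adding the offset back to the stored value. With hypothetical numbers (a stored value of 8218.5 and time\_offset $= 2450000.0$, both chosen only for this example, and assuming time\_representation is JD):
+\begin{equation}
+  \mathrm{JD} = t_{\mathrm{stored}} + \mathrm{time\_offset} = 8218.5 + 2450000.0 = 2458218.5 ,
+\end{equation}
+and the corresponding MJD follows from the standard relation
+\begin{equation}
+  \mathrm{MJD} = \mathrm{JD} - 2400000.5 = 58218.0 .
+\end{equation}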
+
+\subsection{Extension of ObsCore based on EPNCore}
+Some of the fields described for Time Series data have already been explicitly defined and used in the context of data discovery using ObsCore \cite{std:OBSCORE}, and the remaining ones have been defined in the context of EPNcore \cite{xx}. In Table~\ref{tab:obs_epn} we show the equivalence between the fields we define here and those of ObsCore and EPNcore.
+
+\begin{table}[h!]
+  \begin{center}
+  \caption{Equivalence between Time Series data fields and ObsCore and EPNCore fields}\label{tab:obs_epn}
+  \begin{tabular}{|c|c|c|}
+%    \sptablerule
+\hline
+    \textbf{Field}      & \textbf{ObsCore field name} & \textbf{EPNCore field name}  \\%\sptablerule
+\hline
+    coordinates     & s\_ra, s\_dec          & -                         \\
+    \hline
+    target\_name    & target\_name           & target\_name              \\
+    \hline
+    t\_min          & t\_min                 & time\_min                 \\
+    \hline
+    t\_max          & t\_max                 & time\_max                 \\
+    \hline
+    t\_exp\_min     &  -                     & time\_exp\_min            \\
+    \hline
+    t\_exp\_max     &  -                     & time\_exp\_max            \\
+    \hline
+    t\_exp\_total   &  t\_exp                & -                         \\
+    \hline
+    delta\_t\_min   &  -                     & time\_sampling\_step\_min \\
+    \hline
+    delta\_t\_max   &  -                     & time\_sampling\_step\_max \\
+    \hline
+    n\_observations & t\_xel                 & -                         \\
+    \hline
+%    t\_resolution   & t\_resolution$^1$      & -                         \\
+%    \hline
+    type\_of\_data  & dataproduct\_type$^2$  & dataproduct\_type         \\
+\hline
+    %    \sptablerule
+  \end{tabular}
+  \end{center}
+\textbf{Note:} $^1$ The explanation of t\_resolution in ObsCore as ``Temporal resolution FWHM'' should be modified to simply ``Temporal resolution''. $^2$ dataproduct\_type should be set to timeseries in ObsCore, but this would still not tell the user whether the time series is composed of images, light curves, or a combination of both in a more complicated case. \textbf{Q: is it possible to combine dataproduct\_type=timeseries;images;sed for instance?}
+
+\end{table}
+
+For discovering Time Series data, an extension of ObsCore with the fields listed in Table~\ref{tab:obs_epn} that are missing from ObsCore would suffice. For these extra fields we recommend using the EPNcore naming convention.
+

