[Dachs-support] s_regions, errors during import
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Wed Aug 17 10:02:07 CEST 2022
Chloé,
On Tue, Aug 16, 2022 at 05:07:10PM +0200, Chloé Azria wrote:
> I always have a lot of troubles implementing s_regions from dachs, with the
> error :
>
> InternalError("spherepoly_from_array: a line segment overlaps or
> polygon too large\nLINE 1: ... 283.384, -77.0712, -5.68097, NULL,
> NULL, spoly '{(4.49544...\n
> ^\n")
>
> I simplified a lot the polygons but I still have this error on some of them
> that stops completely the import.
I'm afraid that's beyond what DaCHS can (easily) fix. The error
messages bubble up from pgsphere, and pgsphere's polygon code
simply isn't all that great. I'm trying to get the major pgsphere users to
donate some resources (money and/or developer time) to fixing these
problems, but as usual that's not easy.
> I would like to understand what is the criterion on polygons that makes it
> not handled during the import. In order to, at least, exclude the s_regions
The pgsphere documentation states:
* A spherical polygon has the same restrictions as a spherical path
(see Section 3.7). [that is in particular: maximum distance between
two points is 180 deg] Except that a polygon needs at least 3
positions.
* The line segments can not be crossed.
* The maximum dimension of a polygon must be less than 180°.
> on granules that leads to this error and manage to import the others.
>
> Example of a polygon producing the error :
>
> POLYGON ((128.8216306026312 -58.34924436446448, 63.49 -46.66, 71.41 -53.15,
> 127.77 -55.92, 128.8216306026312 -58.34924436446448))
First (more generally), in cases when the database chokes, make sure
you get the right record. You see, to speed things up, DaCHS batches
rows for import (5000 by default), which means that when the database
chokes the current record probably is not the culprit. In such
cases, pass a -b1 (meaning: use batches of size 1) to dachs imp.
Things will be a lot slower, but you will find the exact record that
makes the database fail.
In this concrete case, the polygon probably was the culprit, because
it has crossing line segments; at least that's what it looks like in
TOPCAT's Sky Plot (using the Area control). If my analysis is
right, pgsphere is behaving as documented.
Which raises the question of how you can move on.
First, I'd repeat my recommendation to use MOCs here and to generate
them with FX Pineau's MOC library (which has robust polygons);
http://svn.ari.uni-heidelberg.de/svn/gavo/hdinputs/openngc/ has
examples for how to do that. I *think* the latest version of EPN-TAP
doesn't forbid MOCs in s_region any more.
If you insist on polygons and would be ok with just skipping bad
polygons, I see two ways forward:
(a) Try to figure out whether a polygon has crossing or overly long
segments and skip them as per
<http://docs.g-vo.org/DaCHS/tutorial.html#skipping-things> (the trick
is to check for the intersections efficiently; I don't have a
reference on this, but I *think* it's relatively doable by standards
of spherical geometry), or
(b) Just have DaCHS continue when the database has choked.
DaCHS has a feature to do (b) per source (the -c flag to imp). It
doesn't have analogous functionality to skip individual rows,
though. I could retrofit that if (a) proves intractable, but as it's
functionality that feels wrong, I'm not too wild about it.
So, if MOCs are still out, could you try (a)?
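To give an idea of what the check in (a) could look like, here is a
minimal sketch in plain Python. It is only an illustration under some
assumptions: all function names are made up for this example (nothing
here is DaCHS or pgsphere API), polygon edges are taken to be
great-circle arcs shorter than 180 degrees, only non-adjacent edges
are tested, edges lying on the same great circle are not detected,
and the brute-force pair loop is quadratic in the number of vertices.
I have not checked that it flags exactly the polygons pgsphere
rejects.

```python
import math
from itertools import combinations


def to_vec(lon_deg, lat_deg):
    """Convert spherical coordinates in degrees to a unit 3-vector."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def strictly_inside_arc(t, a, b, normal, eps=1e-12):
    """True if t (on the great circle of a and b) lies strictly between them."""
    return dot(cross(a, t), normal) > eps and dot(cross(t, b), normal) > eps


def arcs_cross(a, b, c, d, eps=1e-12):
    """True if the great-circle arcs a-b and c-d cross at an interior point."""
    n1, n2 = cross(a, b), cross(c, d)
    t = cross(n1, n2)  # the two great circles meet at +t and -t
    norm = math.sqrt(dot(t, t))
    if norm < eps:
        return False  # same great circle: overlaps are NOT handled here
    t = tuple(x / norm for x in t)
    minus_t = tuple(-x for x in t)
    return any(strictly_inside_arc(p, a, b, n1)
               and strictly_inside_arc(p, c, d, n2)
               for p in (t, minus_t))


def has_crossing_edges(vertices):
    """vertices: list of (lon, lat) in degrees; a repeated closing point is ok."""
    if len(vertices) > 1 and vertices[0] == vertices[-1]:
        vertices = vertices[:-1]
    pts = [to_vec(lon, lat) for lon, lat in vertices]
    n = len(pts)
    edges = [(pts[i], pts[(i + 1) % n]) for i in range(n)]
    for i, j in combinations(range(n), 2):
        if j - i == 1 or (i == 0 and j == n - 1):
            continue  # adjacent edges share a vertex, skip them
        if arcs_cross(*edges[i], *edges[j]):
            return True
    return False


# The polygon from the original report.
example = [(128.8216306026312, -58.34924436446448), (63.49, -46.66),
           (71.41, -53.15), (127.77, -55.92),
           (128.8216306026312, -58.34924436446448)]
print(has_crossing_edges(example))
```

A function like has_crossing_edges could then serve as the condition
for skipping a row as described in the tutorial section linked above.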
-- Markus