[Dachs-support] migrating/changing location of q.rd
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Thu Oct 15 12:25:41 CEST 2020
Silvan,
On Wed, Oct 07, 2020 at 12:41:56PM +0000, Laube Silvan wrote:
> Baptiste advised to switch to a git repo for each resource, which
> seems a good idea as that seems to have become the standard way to
> do it. So my question at this point is:
> Can I easily migrate the location of the resource descriptor (and
> python-metadata extractor) without having to re-import the whole
> thing (after all we have currently ~7.8M files). Is there any
> action needed at all or can it just be moved? If necessary I can
> manually adjust the rd-path in the DB (or other places where this
> might be stored?)
RD moves are always painful, as the RD identifier (essentially, the
relative path from the inputs directory to the RD file) is used as
a key in many tables. Untangling that is at least hard, and so a
dachs move command I've considered several times still hasn't come to
pass.
Hence, my recommendation is to let the computer do a bit of extra
work rather than hack things (which would certainly be possible);
that means:
  dachs drop <old RD>
  mv <old resdir> <new resdir>
  dachs imp <new RD>
With 8 million files, the re-import shouldn't take more than a few
hours, right? If it takes significantly longer, I'd probably still
rather find out why the import is so slow than try to get all the
different references right.
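As a rough sketch only (this is not an existing DaCHS tool), the
drop/move/import sequence could be wrapped in a small shell function.
The names move_rd, run, and DRY_RUN are made up for illustration, and
the sketch assumes the RD file is called q.rd in both the old and the
new resource directory and that the inputs directory is passed in
explicitly:

```shell
#!/bin/sh
# Sketch: relocate a resource by dropping, moving, and re-importing.
# Assumptions (not part of DaCHS itself): the helper names, the DRY_RUN
# switch, and an RD file named q.rd in the resource directory.

run() {
    # Show the command; execute it only when not in dry-run mode.
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

move_rd() {
    inputs_dir=$1   # absolute path of the inputs directory (site-specific)
    old_resdir=$2   # old resource directory, relative to inputs_dir
    new_resdir=$3   # new resource directory, relative to inputs_dir

    # 1. Drop what the old RD created while its RD id is still valid.
    run dachs drop "$old_resdir/q" || return 1
    # 2. Move the resource directory (the new parent must already exist).
    run mv "$inputs_dir/$old_resdir" "$inputs_dir/$new_resdir" || return 1
    # 3. Re-import under the new RD id.
    run dachs imp "$new_resdir/q" || return 1
}
```

With DRY_RUN=1 set, a call such as
move_rd /var/gavo/inputs oldres newres (paths are examples) only
prints the three commands, which makes it easy to check the RD ids
before anything is actually dropped.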
> P.S. (as per our short discussion on github) I have to admit I am
> not a big fan of these old-school mailing lists... its not really
> searchable and finding related content is pretty much impossible,
> best bet is google-searching the archive-site. So at least for
Well, that's a feature, not a bug: there are many mailing lists and
discussions that shouldn't be archived in the first place.
> issues I find github a lot more convenient and accessible, also it
> allows sophisticated formatting such as code highlighting 😊 but of
But of course it's non-standard, meaning I can't choose the client
I'm using to work with it, which again means that if code
highlighting makes my eyes water (say) I'll have to suck it up.
Also, I'd be giving github even more control over content I've
created with public funds, which I find at least undesirable.
So, call me a graybeard, but I think there are very good reasons to
try and work on public, standards-compliant infrastructure.
Although I have to admit that that failed badly this time -- your
mail went into moderation, and the moderation mail didn't make it to
me, all because the infrastructure this list runs on was moved, and
*something* in the mad complexity of SPF, DKIM, and whatnot isn't
quite right.
Aw, if only the spammers went to the proprietary platforms, too!
> Support is another thing, but even there are (in my opinion) better
> ways. At least something like a google group would already be much
> easier to browse and search through topics that were already
But then we'd be feeding Google with traffic, content, and
behavioural surplus, and if Google decided it's not worth their
while any more (which has happened before), it's all gone. Mind you,
I'd like to use netnews (on top of NNTP) for this -- but that was
killed off ages ago, regrettably.
Having said that, so far I've figured that the archive of this list
isn't big enough to warrant much effort in making it more accessible.
If this is really something people miss, I expect it's not hard to
put a somewhat friendlier interface on top of the archive -- it's not
like mailman archives are something exotic. Just let me know.
-- Markus