[Dachs-support] UWS and user access control

Ivan Zolotukhin ivan.zolotukhin at gmail.com
Mon May 2 19:39:28 CEST 2016


Hi Markus,

I'm finally back on track after several weeks of travel. Thanks for
these features that you developed. I'll post here issues one by one as
I discover them.

I'm on DaCHS revision 5026. Here's the first bug I came across, when
trying to fetch the XML document of a job through the URL of a service
other than the one that created it:

$ curl -I http://example.com/res/service1/r/uws.xml/VWmNgD
HTTP/1.1 303 OK
Date: Mon, 02 May 2016 17:30:23 GMT
Content-Type: text/plain
Location: http://example.com/res/service1/r/uws.xml/http://example.com/res/service2/r/uws.xml/VWmNgD
Server: TwistedWeb/15.2.1

Note the incorrectly constructed redirect URL.
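
The Location header looks as if the absolute URL of the job's home
service had been appended to the base URL of the service that was
asked. Here is a minimal Python sketch of that failure mode and of a
safer construction. This is not DaCHS code: the function names are
hypothetical, and the cause is only a guess from the header above.

```python
from urllib.parse import urljoin


def naive_redirect(base_url, target):
    """Reproduce the broken header: blindly append target to base_url."""
    return base_url.rstrip("/") + "/" + target


def safe_redirect(base_url, target):
    """Resolve target against base_url; an absolute target is kept as is."""
    return urljoin(base_url.rstrip("/") + "/", target)


# URLs taken from the curl output above.
base = "http://example.com/res/service1/r/uws.xml"
job_home = "http://example.com/res/service2/r/uws.xml/VWmNgD"

print(naive_redirect(base, job_home))  # the doubled URL seen in Location
print(safe_redirect(base, job_home))   # just the URL on service2
```

urljoin only falls back to the base URL for relative targets, so a
bare job id would still resolve below service1.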

--
With best regards,
 Ivan

On Wed, Mar 30, 2016 at 4:59 PM, Markus Demleitner
<msdemlei at ari.uni-heidelberg.de> wrote:
> Hi Ivan,
>
> [bringing back on-list since I guess the off-list reply was an
> accident]
>
> TL;DR: There's now experimental support for sort-of authenticated UWS
> in DaCHS.
>
> So, I've found a bit of time to squeeze in basic support for "owners"
> into DaCHS UWS infrastructure.
>
> I'm a bit too busy elsewhere, but since I'm not in an ideal position
> to test things anyway (don't have services that need that), perhaps I
> can ask you to see how far you get?  All this needs thorough testing
> still, I'm afraid.
>
> Here's an excerpt of the commits I managed to squeeze in:
>
> ------------------------------------------------------------------------
> r4952 | msdemlei | 2016-03-30 14:52:24 +0200 (Wed, 30 Mar 2016) | 4 lines
>
> New config item [async]maxUserUWSRunningDefault to say how many user UWS
> jobs are being run at any one time.
>
>
> ------------------------------------------------------------------------
> r4951 | msdemlei | 2016-03-30 14:47:33 +0200 (Wed, 30 Mar 2016) | 4 lines
>
> UWS now requires authentication to view job info, authenticated
> users no longer see anonymous jobs.
>
>
> ------------------------------------------------------------------------
> r4950 | msdemlei | 2016-03-30 13:33:37 +0200 (Wed, 30 Mar 2016) | 3 lines
>
> UWS now notes down the user from a web request if present.
>
>
> ------------------------------------------------------------------------
> r4946 | msdemlei | 2016-03-30 11:09:48 +0200 (Wed, 30 Mar 2016) | 4 lines
>
> In the presence of multiple UWSes, attempts to deserialise a job with
> the wrong one now lead to redirects to the original one.
>
>
> In effect, you can now simply add an HTTP basic auth header -- with
> python requests, just pass auth=('username', '') -- and that username
> will be used as the current user.  Only if you claim to be that user
> will you see the contents of the jobs.
>
> Unauthenticated users will still see all the job ids and their state
> but can no longer peek into jobs with owners.
>
> Authenticated users will only see their own jobs.
>
> DaCHS will *not* check username and password for such unsolicited
> authentication.  To make it do so, you must limitTo the UWS service
> itself as described in my previous mail.  Meaning: If you don't care
> that the UWS is usable anonymously and that users who can guess
> other users' names can see those users' jobs, you don't even need to
> bother creating users on the DaCHS side.
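
For readers who want to see what such an unsolicited header looks
like, here is a small standard-library sketch. The helper names are
hypothetical, the job URL is only an example, and the request is built
but never sent.

```python
import base64
import urllib.request


def basic_auth_header(username, password=""):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def make_job_request(job_url, username):
    """Build (but do not send) a GET request claiming to be username."""
    request = urllib.request.Request(job_url)
    request.add_header("Authorization", basic_auth_header(username))
    return request


# Example URL only; host and job id are placeholders.
req = make_job_request(
    "http://example.com/res/service1/r/uws.xml/VWmNgD", "username")
print(req.get_header("Authorization"))
```

With python requests, the equivalent call would be
requests.get(job_url, auth=('username', '')), as mentioned above.
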
>
> On Thu, Mar 24, 2016 at 05:38:58PM +0100, Ivan Zolotukhin wrote:
>> Hi Markus,
>>
>> On Wed, Mar 23, 2016 at 1:58 PM, Markus Demleitner
>> <msdemlei at ari.uni-heidelberg.de> wrote:
>> > DaCHS already contains user management (gavo admin adduser and
>> > friends); in your scenario, you would use the service's limitTo
>> > attribute to limit access to the members of a group, and everyone
>> > entitled to use the service would be added to that group.  That part
>> > you can set up already:
>> >
>> > $ gavo admin adduser mysimulation "secret password" \
>> >     "Group privileged to run mysimulation"
>> > $ gavo admin adduser user1 "also secret"
>> > $ gavo admin addtogroup user1 mysimulation
>> > $ gavo admin adduser user2 "another password"
>> > $ gavo admin addtogroup user2 mysimulation
>> >
>> > With service/@limitTo set to mysimulation, you should see
>> > authentication requests.
>>
>> Can you recommend a way to add users programmatically? In my project's
>
> Depending on your setup, calls to gavo admin adduser might just be
> what you want to do.
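
A minimal sketch of wrapping those calls from Python follows. It
assumes the gavo executable is on PATH and that the calling account is
allowed to administer DaCHS; the function names are hypothetical, and
by default nothing is executed.

```python
import subprocess


def adduser_commands(username, password, group, remarks=None):
    """Return the argv lists for creating a user and adding it to a group."""
    adduser = ["gavo", "admin", "adduser", username, password]
    if remarks:
        adduser.append(remarks)
    return [adduser, ["gavo", "admin", "addtogroup", username, group]]


def provision(username, password, group, dry_run=True):
    """Run the commands, or only return them while dry_run is set."""
    commands = adduser_commands(username, password, group)
    if not dry_run:
        for argv in commands:
            # check=True raises CalledProcessError if gavo fails.
            subprocess.run(argv, check=True)
    return commands


for argv in provision("user1", "also secret", "mysimulation"):
    print(argv)
```

Passing argv lists instead of a shell string avoids quoting problems
with passwords containing spaces. Note that command-line arguments are
visible in the process list while the command runs.
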
>
> Otherwise, the tables containing the credentials are really trivial at
> the moment -- it's dc.users and dc.groups.  There's nothing wrong
> with manipulating that stuff directly, which in principle you can
> even do through a remote postgres connection if you want.
>
> Just note that you might have to update your code at some point --
> DaCHS, to my knowledge, does not handle valuable credentials at this
> point, and it doesn't do SSL, so the passwords go over the net in
> clear text anyway.  When either fact changes, I'll have to hash
> the passwords, so *something* will change that you may have to follow
> if you do direct manipulation of these tables.
>
>> > An alternative might be to have each service only display jobs
>> > created by itself.  But that has the drawback that users don't see
>> > jobs blocking their queue, which I don't like too much either.
>>
>> By the way, are there settings for job parallelism in DaCHS? So far
>> when there's a job running, new one gets only queued -- can I have 2+
>> jobs running in parallel? If not, is it possible to auto-launch the
>> previously queued jobs when the blocking one finishes?
>
> I simply forgot to add this knob when I did this stuff last summer --
> the trouble simply is that I don't have a use case for this so far,
> and so I depend on you complaining...
>
> Anyway, you can add something like
>
>   [async]
>   maxUserUWSRunningDefault: 10
>
> in your /etc/gavo.rc now, and it'll run 10 jobs at the same time.
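
To check from a script what limit is configured, something along these
lines may do. It is a sketch that assumes the file can be read by
Python's configparser; DaCHS itself may parse it differently, and the
function name is hypothetical.

```python
import configparser

# Same content as the snippet above, inlined so the example is
# self-contained; a real script would read /etc/gavo.rc instead.
SAMPLE_RC = """\
[async]
maxUserUWSRunningDefault: 10
"""


def max_running_jobs(rc_text):
    """Return [async]maxUserUWSRunningDefault, or None if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(rc_text)
    return parser.getint(
        "async", "maxUserUWSRunningDefault", fallback=None)


print(max_running_jobs(SAMPLE_RC))
```
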
>
> The "Default" in there already suggests that I've almost given up on
> the idea that there'll only be one queue for all user UWS services.
> If I really do separate queues, this will be overridable by service.
>
>
>> > So -- why would you want to disentangle the job lists?
>>
>> Because different services represent different entities in fact --
>> they accept different arguments and produce different results. If one
>> service cooks pasta, another one sends an SMS and third one changes TV
>> channel, there's no big sense in listing their results together except
>> for the housekeeping interest (seeing which failed / completed) --
>
> Well, actually the underlying assumption is that all UWSes use the
> same resource (i.e., computer); in that case, having it all in one
> place makes sense (in your simile: you're balancing one checkbook,
> regardless of whether you've used it to buy pasta, a TV, or your
> telephone contract).
>
> But anyway, I *think* the thing with the shared resource is not true
> in your case anyway.
>
> What would be reasonably easy is adding an option that creates a
> separate UWS for certain kinds of services (those managing some
> external resource).  Of course, it then would stand to reason that
> different services using the same resource (cluster, say) would again
> share a queue, and figuring out how to let people specify this and
> how to create (and possibly tear down) such queues is not entirely
> straightforward.
>
> Ideally, I'd wait for a bit more usage experience from your side
> before I commit to a particular solution -- keep me posted.
>
>> > (1) enable storing the creating user for UWS jobs
>>
>> And probably exposing it in the job parameters as well together with
>> the jobclass attribute?
>
> It's always been (as per UWS) in the UWS job document.  It's just
> always been nil so far.
>
> It should be there now.
>
>>
>> > (2) require authentication of the user when accessing UWS job
>> >     directories
>
> If the job has an owner, this is what happens now.
>
>> > (3) make sure jobs are actually deserialised with the job classes
>> >     that created them
>
> Again, that's what should happen now (via a redirect).
>
>> possible, nor the list of past jobs of other users. Your own
>> historical jobs have tags like the context (database schema) that the
>> job was run in, and the name of the output table. These tags are
>> extremely helpful for the navigation in the historical jobs.
>
> Hmha, there's no notion of tags in current UWS.  I *could* add some
> custom feature, but since I don't have a good use case myself I'm
> somewhat reluctant to do so.
>
> If you come up with an API for this kind of thing and propose it on
> the UWS mailing list and people appear interested there, I could
> very well see myself implementing a prototype, though.
>
>> Therefore, the minimal thing from DaCHS for me would be actually
>> exposing jobclass and job owner properties in the list of jobs --
>> ideally in a way which is compatible to access through the python-uws
>> client.
>
> The list of jobs has a fairly strict format, and there's no way to
> have user, let alone jobclass, without resorting to dirty tricks.
>
> But perhaps the per-user joblists already do the trick for you?
>
> As to the job classes, there are a couple of ways out, and lacking a
> use case of my own I cannot really say where I should go.
>
> (a) I could fix the job URL generation so you can tell from the form
> of the job URL which service generated it
> (http://example.com/res1/svc1/async/jobid vs.
> http://example.com/res2/svc1/async/jobid -- essentially, the two
> segments after the hostname tell you what you're looking at).  Doing
> the right URLs would arguably be the right thing to do anyway, but
> I've shied away from it as it incurs a code uglification.
>
> (b) I could, in each UWS, just list the jobs submitted by it.  This
> loses the queue perspective, but perhaps that's a silly argument in the
> first place.  Advantage: clean on the code side.
>
> (c) If I disentangle the UWS' job tables, the problem goes away
> naturally.  Disadvantage: incurs quite a lot of administrative
> overhead.
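
On the client side, option (a) would boil down to something like the
following sketch. It assumes job URLs of exactly the form given in
(a), with the resource and the service as the first two path segments;
deeper resource paths would need a different split. The function name
is hypothetical.

```python
from urllib.parse import urlsplit


def service_of(job_url):
    """Return the (resource, service) path segments of a job URL."""
    segments = [s for s in urlsplit(job_url).path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"no resource/service segments in {job_url!r}")
    return segments[0], segments[1]


print(service_of("http://example.com/res1/svc1/async/jobid"))
print(service_of("http://example.com/res2/svc1/async/jobid"))
```

A client could then group the entries of the shared job list by this
pair without any change to the job list format.
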
>
> So, that's my take on these things -- essentially, it's now up to you
> to say what you absolutely need, what would be really nice and what
> you can live without...
>
> Any feedback, including of course bug reports, is highly welcome.
>
>         -- Markus

