Discussion:
[pcp] pcp updates: context mutex (c_lock) is no longer recursive
Ken McDonell
2017-06-30 20:39:01 UTC
Permalink
Changes committed to https://github.com/kmcdonell/pcp master

Ken McDonell (2):
libpcp: context lock (c_lock) is no longer recursive
app changes to match changed c_lock semantics

src/libpcp/src/context.c | 30 +++++++++++++++++++++++-------
src/libpcp/src/desc.c | 2 ++
src/libpcp/src/fetch.c | 7 +++++--
src/libpcp/src/instance.c | 2 ++
src/libpcp/src/lock.c | 28 +++++++++++++++++-----------
src/libpcp/src/logmeta.c | 2 ++
src/libpcp/src/logutil.c | 13 +++++++++----
src/libpcp/src/p_result.c | 6 ++++++
src/libpcp/src/pmns.c | 20 +++++++++++++-------
src/libpcp/src/store.c | 2 ++
src/libpcp/src/util.c | 6 ++++++
src/pmdumplog/pmdumplog.c | 5 +++++
src/pmlogcheck/pass3.c | 7 ++++++-
src/pmlogrewrite/pmlogrewrite.c | 8 +++++++-
14 files changed, 105 insertions(+), 33 deletions(-)

Details ...

commit 720141c2dd1a4752b9ed6348dc6f2105ee2243d6
Author: Ken McDonell <***@internode.on.net>
Date: Sat Jun 24 16:43:49 2017 +1000

app changes to match changed c_lock semantics

Apps that call the *_ctx() routines in libpcp now need to be aware
that these routines may check that the context is indeed locked, so
although these apps may be single-threaded they still need to lock
the context before diving down into libpcp.

Apps changed:
pmdumplog
pmlogcheck
pmlogrewrite

commit e5da450cf0ad149bad2d30b7c3e88da847f01f4b
Author: Ken McDonell <***@internode.on.net>
Date: Sat Jun 24 16:40:49 2017 +1000

libpcp: context lock (c_lock) is no longer recursive

Change c_lock from PTHREAD_MUTEX_RECURSIVE to
PTHREAD_MUTEX_ERRORCHECK.

Some small code tweaks to accommodate the changed semantics.

Add checks that c_lock is locked when calling the *_ctx() routines
that assume they're operating on a locked context.

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16159): https://groups.io/g/pcp/message/16159
View All Messages In Topic (1): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Dave Brolley
2017-07-05 15:05:50 UTC
Permalink
The more eyes we get on this the better, especially with Nathan away.
I'll be having a look today.
Post by Ken McDonell
Changes committed to https://github.com/kmcdonell/pcp master
libpcp: context lock (c_lock) is no longer recursive
app changes to match changed c_lock semantics
src/libpcp/src/context.c | 30 +++++++++++++++++++++++-------
src/libpcp/src/desc.c | 2 ++
src/libpcp/src/fetch.c | 7 +++++--
src/libpcp/src/instance.c | 2 ++
src/libpcp/src/lock.c | 28 +++++++++++++++++-----------
src/libpcp/src/logmeta.c | 2 ++
src/libpcp/src/logutil.c | 13 +++++++++----
src/libpcp/src/p_result.c | 6 ++++++
src/libpcp/src/pmns.c | 20 +++++++++++++-------
src/libpcp/src/store.c | 2 ++
src/libpcp/src/util.c | 6 ++++++
src/pmdumplog/pmdumplog.c | 5 +++++
src/pmlogcheck/pass3.c | 7 ++++++-
src/pmlogrewrite/pmlogrewrite.c | 8 +++++++-
14 files changed, 105 insertions(+), 33 deletions(-)
Details ...
commit 720141c2dd1a4752b9ed6348dc6f2105ee2243d6
Date: Sat Jun 24 16:43:49 2017 +1000
app changes to match changed c_lock semantics
Apps that call the *_ctx() routines in libpcp now need to be aware
that these routines may check that the context is indeed locked, so
although these apps may be single-threaded they still need to lock
the context before diving down into libpcp.
pmdumplog
pmlogcheck
pmlogrewrite
commit e5da450cf0ad149bad2d30b7c3e88da847f01f4b
Date: Sat Jun 24 16:40:49 2017 +1000
libpcp: context lock (c_lock) is no longer recursive
Change c_lock from PTHREAD_MUTEX_RECURSIVE to
PTHREAD_MUTEX_ERRORCHECK.
Some small code tweaks to accommodate the changed semantics.
Add checks that c_lock is locked when calling the *_ctx() routines
that assume they're operating on a locked context.
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16176): https://groups.io/g/pcp/message/16176
View All Messages In Topic (2): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Dave Brolley
2017-07-06 18:06:11 UTC
Permalink
I've had a look at this and, other than the change to actually make the
lock non-recursive, the rest looks mostly like assertions that the lock
is in the correct state and debugging code to help when the lock is not
in the correct state.

As noted by Ken in his comments, there are several changes that look
similar to these:

+ PM_LOCK(new->c_lock);
__dmopencontext(new);
+ PM_UNLOCK(new->c_lock);

and ....

+ PM_LOCK(ctxp->c_lock);
+ sts = __pmLogRead_ctx(l_ctxp, l_ctxp->c_mode, NULL, &result,
PMLOGREAD_NEXT);
+ PM_UNLOCK(ctxp->c_lock);

These look problematic to me, since they require the caller to know
which (speudo-)API calls require locking which do not. It also feels
funny to require single threaded apps to perform this kind of locking.
For these cases, can we not move the locking to within the called
function in a form similar to:

__pseudoAPIFunction (....)
{
if (! PM_IS_LOCKED(ctxp->c_lock))
PM_LOCK(ctxp->c_lock);

.... API function code ...

PM_UNLOCK(ctxp->c_lock);
}

This would guarantee the correct locking state for these functions
without placing that burden on the caller.

Other then that, it all LGTM.

Dave
Post by Ken McDonell
Changes committed to https://github.com/kmcdonell/pcp master
libpcp: context lock (c_lock) is no longer recursive
app changes to match changed c_lock semantics
src/libpcp/src/context.c | 30 +++++++++++++++++++++++-------
src/libpcp/src/desc.c | 2 ++
src/libpcp/src/fetch.c | 7 +++++--
src/libpcp/src/instance.c | 2 ++
src/libpcp/src/lock.c | 28 +++++++++++++++++-----------
src/libpcp/src/logmeta.c | 2 ++
src/libpcp/src/logutil.c | 13 +++++++++----
src/libpcp/src/p_result.c | 6 ++++++
src/libpcp/src/pmns.c | 20 +++++++++++++-------
src/libpcp/src/store.c | 2 ++
src/libpcp/src/util.c | 6 ++++++
src/pmdumplog/pmdumplog.c | 5 +++++
src/pmlogcheck/pass3.c | 7 ++++++-
src/pmlogrewrite/pmlogrewrite.c | 8 +++++++-
14 files changed, 105 insertions(+), 33 deletions(-)
Details ...
commit 720141c2dd1a4752b9ed6348dc6f2105ee2243d6
Date: Sat Jun 24 16:43:49 2017 +1000
app changes to match changed c_lock semantics
Apps that call the *_ctx() routines in libpcp now need to be aware
that these routines may check that the context is indeed locked, so
although these apps may be single-threaded they still need to lock
the context before diving down into libpcp.
pmdumplog
pmlogcheck
pmlogrewrite
commit e5da450cf0ad149bad2d30b7c3e88da847f01f4b
Date: Sat Jun 24 16:40:49 2017 +1000
libpcp: context lock (c_lock) is no longer recursive
Change c_lock from PTHREAD_MUTEX_RECURSIVE to
PTHREAD_MUTEX_ERRORCHECK.
Some small code tweaks to accommodate the changed semantics.
Add checks that c_lock is locked when calling the *_ctx() routines
that assume they're operating on a locked context.
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16183): https://groups.io/g/pcp/message/16183
View All Messages In Topic (3): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Frank Ch. Eigler
2017-07-06 18:42:24 UTC
Permalink
Hi, Dave -
Post by Dave Brolley
[...]
As noted by Ken in his comments, there are several changes that look
+ PM_LOCK(new->c_lock);
__dmopencontext(new);
+ PM_UNLOCK(new->c_lock);
+ PM_LOCK(ctxp->c_lock);
+ sts = __pmLogRead_ctx(l_ctxp, l_ctxp->c_mode, NULL, &result,
PMLOGREAD_NEXT);
+ PM_UNLOCK(ctxp->c_lock);
These look problematic to me, since they require the caller to know
which (speudo-)API calls require locking which do not.
The "_ctx" naming convention, if adopted fully, would clear that up.
Post by Dave Brolley
It also feels funny to require single threaded apps to perform this
kind of locking.
Right, but this applies only to those apps that elect to make direct
calls into low level, generally undocumented, functions of libpcp.
These locks do not show up at all at the PMAPI level, for single- or
multi-threaded apps.
Post by Dave Brolley
For these cases, can we not move the locking to within the
__pseudoAPIFunction (....)
{
if (! PM_IS_LOCKED(ctxp->c_lock))
PM_LOCK(ctxp->c_lock);
.... API function code ...
PM_UNLOCK(ctxp->c_lock);
}
Nope, because this would repeat unlocks, if the caller did the right
thing. But I believe the idea was to have wrappers such as these
Post by Dave Brolley
__pseudoAPIFunction (....)
{
PM_LOCK(ctxp->c_lock);
__pseudoAPIFunction_ctx (....);
PM_UNLOCK(ctxp->c_lock);
}
So the low-level internal-symbol-reliant apps can/must choose which
variant they need - much the same way as libpcp internals need to
choose.


- FChE

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16184): https://groups.io/g/pcp/message/16184
View All Messages In Topic (4): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Dave Brolley
2017-07-06 19:38:42 UTC
Permalink
Post by Frank Ch. Eigler
Post by Dave Brolley
For these cases, can we not move the locking to within the
__pseudoAPIFunction (....)
{
if (! PM_IS_LOCKED(ctxp->c_lock))
PM_LOCK(ctxp->c_lock);
.... API function code ...
PM_UNLOCK(ctxp->c_lock);
}
Nope, because this would repeat unlocks, if the caller did the right
thing. But I believe the idea was to have wrappers such as these
OK, yes, but this could be fixed with some kind of "unlock it if I
locked it" logic.
Post by Frank Ch. Eigler
Post by Dave Brolley
__pseudoAPIFunction (....)
{
PM_LOCK(ctxp->c_lock);
__pseudoAPIFunction_ctx (....);
PM_UNLOCK(ctxp->c_lock);
}
So the low-level internal-symbol-reliant apps can/must choose which
variant they need - much the same way as libpcp internals need to
choose.
This approach would seem to solve both the clarity issue and the logic
issue.

Dave


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16185): https://groups.io/g/pcp/message/16185
View All Messages In Topic (5): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Dave Brolley
2017-07-11 13:22:44 UTC
Permalink
Ken, do you have any thoughts about the proposal below?

Dave
Post by Dave Brolley
Post by Frank Ch. Eigler
Post by Dave Brolley
For these cases, can we not move the locking to within the
__pseudoAPIFunction (....)
{
if (! PM_IS_LOCKED(ctxp->c_lock))
PM_LOCK(ctxp->c_lock);
.... API function code ...
PM_UNLOCK(ctxp->c_lock);
}
Nope, because this would repeat unlocks, if the caller did the right
thing. But I believe the idea was to have wrappers such as these
OK, yes, but this could be fixed with some kind of "unlock it if I
locked it" logic.
Post by Frank Ch. Eigler
Post by Dave Brolley
__pseudoAPIFunction (....)
{
PM_LOCK(ctxp->c_lock);
__pseudoAPIFunction_ctx (....);
PM_UNLOCK(ctxp->c_lock);
}
So the low-level internal-symbol-reliant apps can/must choose which
variant they need - much the same way as libpcp internals need to
choose.
This approach would seem to solve both the clarity issue and the logic
issue.
Dave
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16198): https://groups.io/g/pcp/message/16198
View All Messages In Topic (6): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Ken McDonell
2017-07-11 23:33:27 UTC
Permalink
Most of Frank's comments are addressed in my earlier email.
Post by Frank Ch. Eigler
...
The "_ctx" naming convention, if adopted fully, would clear that up.
Actually the _ctx() variants were intended to generally mean that a
pointer to the current context (the additional __pmContext *ctxp
argument) may be pushed down the call stack ... if ctxp is not NULL the
context can validly assumed to be locked, because the only place
(outside context.c) to get a __pmContext * is from __pmHandleToPtr()
which returns on success with the context locked.
Post by Frank Ch. Eigler
...
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16206): https://groups.io/g/pcp/message/16206
View All Messages In Topic (8): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Ken McDonell
2017-07-11 23:29:11 UTC
Permalink
G'day.

Apologies Dave for not responding earlier ... I've been a bit distracted with refactoring the PMNS locking which is no less complicated than the context locking ...
Post by Dave Brolley
I've had a look at this and, other than the change to actually make the
lock non-recursive, the rest looks mostly like assertions that the lock
is in the correct state and debugging code to help when the lock is not
in the correct state.
Yep the real refactoring happened in an earlier commit.
Post by Dave Brolley
As noted by Ken in his comments, there are several changes that look
+ PM_LOCK(new->c_lock);
__dmopencontext(new);
+ PM_UNLOCK(new->c_lock);
and ....
+ PM_LOCK(ctxp->c_lock);
+ sts = __pmLogRead_ctx(l_ctxp, l_ctxp->c_mode, NULL, &result, PMLOGREAD_NEXT);
+ PM_UNLOCK(ctxp->c_lock);
These look problematic to me, since they require the caller to know
which (speudo-)API calls require locking which do not. ...
At the time these changes were done c_lock was not recursive, so the only option was for the caller to assume responsibility for acquiring and releasing the lock (you cannot use the PM_IS_LOCKED() macro on a recursive lock because the pthread_mutex_trylock() will always succeed).
Post by Dave Brolley
... It also feels
funny to require single threaded apps to perform this kind of locking.
For these cases, can we not move the locking to within the called
__pseudoAPIFunction (....)
{
if (! PM_IS_LOCKED(ctxp->c_lock))
PM_LOCK(ctxp->c_lock);
.... API function code ...
PM_UNLOCK(ctxp->c_lock);
}
This would guarantee the correct locking state for these functions
without placing that burden on the caller.
Now that c_lock is not recursive and PM_IS_LOCKED() can be used, the change you suggest is possible (modulo Frank's comments about only unlocking if you actually locked). In fact, I've discovered a need for something similar in the PMNS locking where the preamble to the PMAPI entry points have something like:

/*
* Control structure for the current context ...
*/
typedef struct {
__pmContext *ctxp; /* NULL or a locked context */
int need_ctx_unlock; /* == 1 if the lock was acquired */
/* in a call to lock_ctx() */
} ctx_ctl_t;

/*
* ensure the current context, if any, is locked
*/
static int
lock_ctx(__pmContext *ctxp, ctx_ctl_t *ccp)
{
int handle;

handle = pmWhichContext();

if (ctxp != NULL) {
/* have a context pointer, so it must already be locked */
ccp->ctxp = ctxp;
ccp->need_ctx_unlock = 0;
}
else {
/*
* if we have a valid context, get a pointer to the __pmContext
* and hence the c_lock (context lock) mutex
*/
if (handle >= 0) {
ccp->ctxp = __pmHandleToPtr(handle);
if (ccp->ctxp != NULL)
ccp->need_ctx_unlock = 1;
else
ccp->need_ctx_unlock = 0;
}
else {
ccp->ctxp = NULL;
ccp->need_ctx_unlock = 0;
}
}

if (ccp->ctxp != NULL)
PM_ASSERT_IS_LOCKED(ccp->ctxp->c_lock);


return handle;
}

pmFOO(...)
{
...

ctx_ctl_t ctx_ctl = { NULL, 0 };

lock_ctx(NULL, &ctx_ctl);

and the pre-return code fragment looks like:

if (ctx_ctl.need_ctx_unlock)
PM_UNLOCK(ctx_ctl.ctxp->c_lock);

return sts;
}

And for the pmFoo_ctx(__pmContext *ctxp, ...) variant the lock_ctx() call looks like lock_ctx(ctxp, &ctx_ctl).

I'll try pushing the conditional context locking into the *_ctx() routines (and __dmopencontext()) when I surface to breadth from the PMNS work.

Thanks for the review.

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16205): https://groups.io/g/pcp/message/16205
View All Messages In Topic (7): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Frank Ch. Eigler
2017-07-20 18:53:24 UTC
Permalink
Hi -

Guys, sorry for not digging into this earlier, so these comments are a
little late.
Post by Ken McDonell
pmFOO(...)
{
...
ctx_ctl_t ctx_ctl = { NULL, 0 };
lock_ctx(NULL, &ctx_ctl);
[...]
}
I see why you're going down this path, but I'm not fully comfortable.
When you make locking itself conditional, you hamstring tools like
helgrind, because then the tool gets to see even fewer combinations of
call paths than before. You'd have to helgrind all the tools that use
different entry points, and different locking conditions, and even
then who knows. It may make deadlocks harder to identify by
inspection or by tool.

ISTM the code mostly affected by this were those tools that use
below-pmapi level function calls that deal with raw __pmContext*
pointers. Could whatever sub-pmapi functions these use just return an
already-PM_LOCK'd __pmContext* for use by these tools? Then they
could call__pmLowLevelFoo_ctx() functions() that internally
PM_ASSERT_LOCKED().

We already do just about that, but e.g. src/pmdumplog/pmdumplog.c:885
then releases the lock! Couldn't we just hang onto that lock instead
of all this conditional lock/unlock stuff?


- FChE

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16252): https://groups.io/g/pcp/message/16252
View All Messages In Topic (9): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Ken McDonell
2017-07-20 20:40:18 UTC
Permalink
G'day Frank.
-----Original Message-----
Sent: Friday, July 21, 2017 4:53 AM
Subject: Re: [pcp] pcp updates: context mutex (c_lock) is no longer
recursive
Hi -
Guys, sorry for not digging into this earlier, so these comments are a
little
late.
Better late than never ... feedback is always welcome.
Post by Ken McDonell
pmFOO(...)
{
...
ctx_ctl_t ctx_ctl = { NULL, 0 };
lock_ctx(NULL, &ctx_ctl);
[...]
}
I think you have to treat the PMNS locking as a separate case ... it uses
something similar, but a local lock_ctx_and_pmns() method and I cannot see
an alternative there because of the potential (and highly likely) deep
re-entry into PMAPI from the pmTraversePMNS() callback, coupled with the 3
flavours of PMNS (local file, archive and over the wire). So, assuming
you're not commenting on those cases, the template you describe is only
(very recently) used for __pmLogRead_ctx() and __pmLogFetch() to allow the
removal of explicit PM_LOCK() and PM_UNLOCK() ops above the PMAPI for 3
tools that know about these low level interfaces. This was done in response
to Dave's review feedback that I agree with.
I see why you're going down this path, but I'm not fully comfortable.
When you make locking itself conditional, you hamstring tools like
helgrind,
because then the tool gets to see even fewer combinations of call paths
than
before. ...
Can you please explain this bit ... I don't understand. The combination of
call paths is not changed by this in any way I can see, other than for a
handful of cases where the locking is done locally within PMAPI sometimes
... for everyone else the call paths are the same (except for some
additional trylock calls).
... You'd have to helgrind all the tools that use different entry points,
and different locking conditions, and even then who knows. It may make
deadlocks harder to identify by inspection or by tool.
ISTM the code mostly affected by this were those tools that use
below-pmapi
level function calls that deal with raw __pmContext* pointers. Could
whatever
sub-pmapi functions these use just return an already-PM_LOCK'd
__pmContext* for
use by these tools? Then they could call__pmLowLevelFoo_ctx() functions()
that
internally PM_ASSERT_LOCKED().
The _ctx variants were introduced to push the context down the callstack to
avoid nested locking of the per-context mutex ... this was done before any
of the lock_ctx() or even lock_ctx_and_pmns() changes that are aimed at two
different classes of problems.
We already do just about that, but e.g. src/pmdumplog/pmdumplog.c:885 then
releases the lock! Couldn't we just hang onto that lock instead of all
this
conditional lock/unlock stuff?
Nope. Some of these tools call into places where we (rightly for everyone
else and all other cases) assume the context is unlocked and call
__pmHandleToPtr(). This used to work, but is no longer an option now the
context lock is not recursive.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16253): https://groups.io/g/pcp/message/16253
View All Messages In Topic (10): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Frank Ch. Eigler
2017-07-21 00:43:36 UTC
Permalink
Hi, Ken -
Post by Ken McDonell
Post by Frank Ch. Eigler
Post by Ken McDonell
pmFOO(...)
{
...
ctx_ctl_t ctx_ctl = { NULL, 0 };
lock_ctx(NULL, &ctx_ctl);
[...]
}
I think you have to treat the PMNS locking as a separate case ... it
uses something similar, but a local lock_ctx_and_pmns() method and I
cannot see an alternative there because of the potential (and highly
likely) deep re-entry into PMAPI from the pmTraversePMNS() callback,
coupled with the 3 flavours of PMNS (local file, archive and over
the wire).
I haven't looked into this deeply, but it seems wrong for any locks to
be held across a callback into the application. Can pmTraversePMNS
work by snapshotting the results somewhere (while holding the locks),
and then iterating the snapshot with callbacks (without holding locks)?
Post by Ken McDonell
[...]
Post by Frank Ch. Eigler
I see why you're going down this path, but I'm not fully
comfortable. When you make locking itself conditional, you
hamstring tools like > helgrind, because then the tool gets to see
even fewer combinations of call paths > than before. ...
Can you please explain this bit ... I don't understand. The combination of
call paths is not changed by this in any way I can see, other than for a
handful of cases where the locking is done locally within PMAPI sometimes
... for everyone else the call paths are the same (except for some
additional trylock calls).
What I mean is that if locking is conditional, then a single helgrind
run over one call path may not see a big enough subset of all possible
locking-paths, so as to identify lock ordering errors.
Post by Ken McDonell
The _ctx variants were introduced to push the context down the
callstack to avoid nested locking of the per-context mutex ... this
was done before any of the lock_ctx() or even lock_ctx_and_pmns()
changes that are aimed at two different classes of problems.
Right, I wonder if that second class of problem was sort of an own-goal
that could have been solved another way.
Post by Ken McDonell
Post by Frank Ch. Eigler
We already do just about that, but
e.g. src/pmdumplog/pmdumplog.c:885 then releases the lock!
Couldn't we just hang onto that lock instead of all > this
conditional lock/unlock stuff?
Nope. Some of these tools call into places where we (rightly for everyone
else and all other cases) assume the context is unlocked and call
__pmHandleToPtr(). This used to work, but is no longer an option now the
context lock is not recursive.
So -repeated- __pmHandleToPtr() calls against the same pmapi context#?
Could the programs arrange not to do that? So that they create a
__pmContext* one time, hold onto that, while they make calls into the
_pm*_ctx() functions?


- FChE

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16254): https://groups.io/g/pcp/message/16254
View All Messages In Topic (11): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Ken McDonell
2017-07-24 06:05:12 UTC
Permalink
-----Original Message-----
Sent: Friday, July 21, 2017 10:44 AM
Subject: Re: [pcp] pcp updates: context mutex (c_lock) is no longer
recursive
...
I haven't looked into this deeply, but it seems wrong for any locks to be
held
across a callback into the application. Can pmTraversePMNS work by
snapshotting the results somewhere (while holding the locks), and then
iterating the snapshot with callbacks (without holding locks)?
That's exactly what the code does now for all PMNS variants ... until the
introduction of pmns_lock, this was not so!
...
So -repeated- __pmHandleToPtr() calls against the same pmapi context#?
Without an unlock this would cause a deadlock, so that's not happening now
... but it was happening before I pushed the context down the call stack as
a precursor to making the context lock not recursive.
Could the programs arrange not to do that? So that they create a
__pmContext* one time, hold onto that, while they make calls into the
_pm*_ctx() functions?
It depends ... the complexity of the calling within libpcp is sufficiently
ugly and unexpected to make it impossible (at least for my poor brain) to
follow this approach without breaking the external PMAPI semantics.
Particularly in the case of the PMNS where well-defined PMAPI calls can be
made without any current context, ...



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16264): https://groups.io/g/pcp/message/16264
View All Messages In Topic (12): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-
Frank Ch. Eigler
2017-08-01 14:56:19 UTC
Permalink
Hi -
Post by Ken McDonell
Post by Frank Ch. Eigler
I haven't looked into this deeply, but it seems wrong for any
locks to be held across a callback into the application. Can
pmTraversePMNS work by snapshotting the results somewhere (while
holding the locks), and then iterating the snapshot with callbacks
(without holding locks)?
That's exactly what the code does now for all PMNS variants ... until the
introduction of pmns_lock, this was not so!
I don't quite understand then why you mentioned "because of the
potential (and highly likely) deep re-entry into PMAPI from the
pmTraversePMNS() callback". This would not be a problem if
pmTraversePMNS didn't call back while holding locks, but making then
traversing a temporary copy of data. Then PMNS locking would not be a
special case.
Post by Ken McDonell
Post by Frank Ch. Eigler
...
So -repeated- __pmHandleToPtr() calls against the same pmapi context#?
Without an unlock this would cause a deadlock, so that's not happening now
... but it was happening before I pushed the context down the call stack as
a precursor to making the context lock not recursive.
Yup, that makes sense.
Post by Ken McDonell
Post by Frank Ch. Eigler
Could the programs arrange not to do that? So that they create a
__pmContext* one time, hold onto that, while they make calls into the
_pm*_ctx() functions?
It depends ... the complexity of the calling within libpcp is sufficiently
ugly and unexpected to make it impossible (at least for my poor brain) to
follow this approach without breaking the external PMAPI semantics.
Particularly in the case of the PMNS where well-defined PMAPI calls can be
made without any current context, ...
Hm, ok, though none of the external PMAPI calls should expose or
discuss these locks, nor the _ctx functions, nor the PM_LOCK* macros.
It's only these funky low-level tools that bypass high level PMAPI
that are the problem. Could they -systematically- bypass PMAPI
instead of half-and-half?


- FChE

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#16311): https://groups.io/g/pcp/message/16311
View All Messages In Topic (13): https://groups.io/g/pcp/topic/5433585
Mute This Topic: https://groups.io/mt/5433585/174549
New Topic: https://groups.io/g/pcp/post
-=-=-
pcp mailing list
***@groups.io
https://groups.io/g/pcp/messages
-=-=-
Change Your Subscription: https://groups.io/g/pcp/editsub/174549
Group Home: https://groups.io/g/pcp
Contact Group Owner: pcp+***@groups.io
Terms of Service: https://groups.io/static/tos
Unsubscribe: https://groups.io/g/pcp/leave/354206/1187008340/xyzzy
-=-=-=-=-=-=-=-=-=-=-=-

Loading...