[c++-pthreads] concrete library-code example

Wed Jan 7 07:17:50 UTC 2004

A quick reply...

On Mon, Jan 05, 2004 at 11:57:32AM -0500, Dave Butenhof wrote:
> Nathan Myers wrote:
> >
> >Here is a more-or-less concrete example, for discussion purposes.
> >It's meant as a generic example of code written according to the 
> >existing contract offered by C libraries.
> 
> Correction: "... offered by C libraries that support POSIX 1003.1b-1993 
> or earlier."

Very few programmers can identify any POSIX definition by number.   
They write, and have long written, exception-safe library code that, 
at most, uses mutexes (wrapped carefully for portability!) to guard 
global state.  Few have even heard of cancellation.

Many millions of lines of such code have been running for years on 
millions of installations, worldwide.  It's good code.  To pretend 
that it's all suddenly worthless because it doesn't take into account 
new (or newly-deployed) standard revision 7834-stroke-"b"-slash-
667-stroke-"a" would simply make _us_ irrelevant.

> > int affect_world(struct state* s)
> > {
> >   int result;
> >   violate_invariants_or_claim_resources(s);
> >   result = c_function_or_system_call(s->member);
> >   if (result < 0) {
> >     clean_up(s, result);
> >     return result;
> >   }
> >   act_on_result(s, result);
> >   restore_invariants_and_release_resources(s);
> >   return 0;
> > }
> >
> >This pattern is extremely common in both C and C++ libraries.  If 
> >c_function_or_system_call() (read(), say) were to throw (or to "just 
> >... not return"), the program state would be corrupted.  A 
> >redefinition of c_function_or_system_call semantics that breaks this 
> >code breaks many thousands of existing thread-safe C and C++ libraries.
> 
> If this code exists in a pure ANSI C/POSIX application using threads, 
> and if the thread running this code can be cancelled, then the 
> implementation of this function is broken because IT (not the 
> implementation, nor the cancellation) corrupts program state.

No.  The code was written to a documented interface.  Whoever changes 
the interface semantics without changing the interface name is 
responsible for corrupting the program state.  
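
For concreteness, here is a minimal C++ sketch of the hazard in the 
quoted affect_world pattern.  It is an illustration only: the 
invariant_holds flag, the unwind parameter, and invariant_survives() 
are hypothetical stand-ins, and the "system call" merely simulates 
unwinding by throwing.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for illustration; none of this is a real library.
struct state {
    bool invariant_holds = true;
    int member = 0;
};

// Stand-in for a C function or system call.  When `unwind` is true it
// simulates the changed semantics by throwing instead of returning.
int c_function_or_system_call(int member, bool unwind) {
    if (unwind) throw std::runtime_error("simulated unwinding");
    return member;
}

// The pattern from the quoted example, written to the original contract:
// the call reports failure only through its return value.
int affect_world(state* s, bool unwind) {
    assert(s != nullptr);
    s->invariant_holds = false;           // violate invariants / claim resources
    int result = c_function_or_system_call(s->member, unwind);
    if (result < 0) {
        s->invariant_holds = true;        // clean_up
        return result;
    }
    s->invariant_holds = true;            // restore invariants / release resources
    return 0;
}

// Reports whether the invariant is intact after affect_world has finished,
// whether it returned normally or was unwound through.
bool invariant_survives(bool unwind) {
    state s;
    try {
        affect_world(&s, unwind);
    } catch (const std::exception&) {
        // The caller survives, but s was left mid-update.
    }
    return s.invariant_holds;
}
```

With the call returning normally, the invariant is restored on every 
path.  With the simulated unwinding, neither restoring branch runs and 
the state is left violated.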

> While I'm not at all trying to argue that the issue is at all as simple 
> as this, that's the facts all the same.

Sorry, that's simply disingenuous.  To argue that everybody should have
coded to an interface that you only just got around to documenting, 
implementing, and deploying, many years after the code was written,
borders on arrogant contempt.  

Such an attitude may be fine for the POSIX C committee, but I see no 
reason to match it here.  In any case, we have a great deal more 
already-thread-safe code to preserve, because thread-safety (by the
common definition) is the norm in C++.

> Depending on propagation of error statuses is a really bad way to
> implement cancellation.   At least, given the primitive and limited
> concept of ANSI/POSIX error codes. Too much code ignores statuses in
> the first place, which is bad enough. But, worse, there are many
> legitimate reasons for library code to CONVERT return status values;
> e.g., I called read() and it returned some error but MY function only
> implicitly involves a read() and it simply wouldn't be useful or
> meaningful to return that error to my caller. Instead, I want to
> indicate that my function (say, synchronizing a database) failed, and
> so any (or at least most) failures of my "support calls" will result
> in my returning 'unable to synchronize database' (which often isn't an
> ANSI/POSIX error number in the first place, but even if it is, it's
> unlikely to be the value returned by read). The ECANCELLED some have
> proposed would be lost, and that's unacceptable. This is why we
> settled on exceptions to represent cancellation. And because POSIX and
> ANSI C don't have exceptions, we devised the simple "cleanup handler"
> mechanism that allowed a clean  and transparent implementation on top
> of exceptions, or a "hack" implementation private to the thread
> library where exceptions weren't available.

Again, that reasoning may be fine for C (did you really ask all those
C programmers?), but we need not be bound by it here.  

Under the error-return model, a cancellation failure that library code
swallows is reported again at the next cancellation point.  Given a 
well-written library, the failure therefore eventually propagates 
upward to a point where it can be turned into an exception.  (A library 
that never propagates system-call failures to its caller isn't 
anything-safe, and needn't concern us.)
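
A rough single-threaded simulation of that propagation, as a sketch 
under stated assumptions: cancel_pending, sim_read(), load_config(), 
sync_database(), SYNC_FAILED, the cancelled type, and run_job() are all 
hypothetical names, no real thread is cancelled, and the standard 
ECANCELED errno value stands in for whatever code such a model would 
actually use.

```cpp
#include <cassert>
#include <cerrno>
#include <stdexcept>

// Hypothetical simulation state; in a real system this would be per-thread.
static bool cancel_pending = false;
static int reads_attempted = 0;

// Simulated cancellation point: keeps failing for as long as a
// cancellation request is pending, instead of unwinding.
int sim_read() {
    ++reads_attempted;
    if (cancel_pending) { errno = ECANCELED; return -1; }
    return 0;
}

// A library routine that swallows the failure and falls back to defaults.
int load_config() {
    if (sim_read() < 0) { /* use built-in defaults */ }
    return 0;
}

const int SYNC_FAILED = -100;   // the library's own status code

// A library routine that converts any underlying failure to its own status.
int sync_database() {
    if (sim_read() < 0) return SYNC_FAILED;
    return 0;
}

struct cancelled : std::runtime_error {
    cancelled() : std::runtime_error("thread cancelled") {}
};

// C++ boundary: the first failure was swallowed, the second surfaced as
// SYNC_FAILED, and here it is finally turned into an exception.
void run_job() {
    load_config();
    if (sync_database() != 0) {
        if (cancel_pending) throw cancelled();
        throw std::runtime_error("unable to synchronize database");
    }
}

// Exercises both paths of the simulation.
void self_check() {
    cancel_pending = false;
    run_job();                      // no cancellation: completes normally
    cancel_pending = true;
    bool caught = false;
    try { run_job(); } catch (const cancelled&) { caught = true; }
    assert(caught);                 // cancellation arrived as an exception
    cancel_pending = false;
}
```

The C-style routines never see an exception pass through them.  Only 
the C++ layer, which was written to expect exceptions, throws.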

> >(The cancellation model described in
> > http://www.codesourcery.com/archives/c++-pthreads/msg00021.html
> >is designed to preserve libraries that contain code that follows 
> >this pattern.)
> >
> >Jason, do you not consider those libraries worth preserving?
> 
> If you're talking about a currently non-threaded library to which
> you'd like to transparently add thread support; well, I doubt that's
> possible, and this particular proposal isn't going to help.  When
> they're redesigned and recoded to be thread-safe, they can also be
> made cancel-safe. 

No.  I'm talking about the many millions of lines of existing 
thread-safe library code.  Ordinary thread-safety is the norm in C++ 
libraries, because it's the natural way to code in C++.

> If you're talking about adding cancel support transparently 
> to an existing C++ library, I doubt this is sufficient unless there's 
> some standard requirement that all C++ libraries must pass through the 
> system failure code to the caller. (There isn't, can't be, and shouldn't 
> be.) And it also presupposes that the C++ library isn't exception-safe; 
> because if it is, then delivering cancellation as an exception would 
> seem "obviously" to be the most compatible and complete solution.

Exception-safety depends on identifying and guarding against documented
sources of possible exceptions.  System calls and C library functions 
are not among those sources.  Also, C++ libraries very frequently rely 
on underlying C libraries, and are written to depend on their documented 
behavior.  (None of my section 2 or section 3 man pages mentions 
unwinding, never mind throwing.)
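
As an illustration of what guarding only the documented sources looks 
like in practice, here is a small sketch.  The registry type and 
add_item() are hypothetical; the point is only that the guard sits 
around the operation documented to throw, while the C library call is 
handled through its return value, as its documentation describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <new>
#include <vector>

// Hypothetical type for illustration.
struct registry {
    std::vector<int> items;
    std::size_t count = 0;      // invariant: count == items.size()
};

// Returns 0 on success, -1 on failure; never lets an exception escape.
int add_item(registry& r, int value) {
    ++r.count;                  // invariant temporarily violated

    // C library call: documented to report failure by return value,
    // so it is checked, not wrapped in a try block.
    char buf[32];
    int n = std::snprintf(buf, sizeof buf, "%d", value);
    if (n < 0) {
        --r.count;              // restore the invariant
        return -1;
    }

    // C++ operation documented as able to throw: guarded.
    try {
        r.items.push_back(value);
    } catch (const std::bad_alloc&) {
        --r.count;              // restore the invariant
        return -1;
    }

    assert(r.count == r.items.size());
    return 0;
}
```

If snprintf were redefined to throw, the first rollback would be 
skipped, and count would disagree with items.size() from then on.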

Even if you claim that the threat of "unwinding" from system calls is
ancient, and that everything should have been written to assume it, 
a change to make them throw would be completely new.

> And I'm deliberately discounting the mention I've seen several times in 
> this list of "thread-safe" libraries that aren't "cancel-safe". Such 
> libraries are simply broken, from basic design.  Cancellation is a basic 
> and important part of the POSIX thread model, and if you're not safe 
> you're not safe. The only viable exclusion (there, I avoided using the 
> word "exception", though it took me a few moments of thought) is if you 
> can be guaranteed to be running only in threads that can never be 
> cancelled... and in that case the whole issue is irrelevant!

No offense intended, but disingenuousness makes a poor substitute for 
responsible design.  We can afford to be more responsible here, because 
we have stronger language semantics to work with, and well-worked-out 
exception-safety standards.

Nathan Myers
ncm at cantrip.org