
Valgrind-developer notes, re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

JRS 22 Mar 09: re these comments in m_libc* and m_debuglog:

/* IMPORTANT: on Darwin it is essential to use the _nocancel versions
   of syscalls rather than the vanilla version, if a _nocancel version
   is available.  See docs/internals/Darwin-notes.txt for the reason
   why. */

when Valgrind does (for its own purposes, not for the client)
read/write/open/close etc syscalls, it really is critical to use the
_nocancel versions of syscalls rather than the vanilla versions.  This
holds throughout the entire code base: whenever V does a syscall for
its own purposes, we must use the _nocancel version if it exists.
This is of course most prevalent in m_libc* since all of our
own-purpose (non-client) syscalls should get routed through there.

Why?  Because on Darwin, pthread cancellation is done within the
kernel (unlike on Linux, iiuc).  And read/write/open/close and a whole
bunch of other syscalls to do with stream I/O are cancellation points.
So what can happen is, client informs the kernel that a given thread
is to be cancelled.  Then at the next (eg) VG_(printf) call by that
thread, which leads to a sys_write, the write syscall gets hit by the
cancellation request, and is duly nuked by the kernel.  Of course from
the outside it looks as if the thread had mysteriously disappeared off
the radar for no reason.

In short, we need to use _nocancel versions in order to ensure that
cancellation requests only take effect at the places where the client
does a syscall, and not the places where Valgrind does syscalls.

How observed: using the standard pipe-based implementation in
coregrind/m_scheduler/sema.c, none/tests/pth_cancel1 would hang
(compared to succeeding using native Darwin semaphores).  And if the
"pause()" call in said test is turned into a spin ("while (1) ;") then
the entire Valgrind run mysteriously disappears, rather than spinning
using native Darwin semaphores.

Because the pipe-based semaphore intensively uses sys_read/sys_write,
it is not surprising that it inadvertently was eating up cancellation
requests directed to client threads.  With abovementioned change in
force the pipe-based semaphore appears to work correctly.



Valgrind-developer notes, things removed from the original MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There was a broken debugstub implementation.  It was removed over several
commits: r9477, which removed most of it, and r9711, r9759, and r10012,
which cleaned up remaining bits.

There was machinery to read function names from Dwarf3 debug info.  But we
already read function names from the symbol tables, so this was duplicated
functionality.  Furthermore, a Darwin-specific hack was required in
storage.c to choose between symbol table names vs. Dwarf3 names.  So this
machinery was removed in r10155.


Mac OS X / macOS supported platforms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since January 2025 (and thus Valgrind 3.25) we have required a compiler
that supports C11. See 42eb19c0da3bb280de88b7f8fc9b7caaa44aedfem for
details. We also need an xcrun that supports --show-sdk-path (since
December 2025). I'm not sure when both of those became supported.
Definitely neither are on Mac OS 10.7 with XCode 4.6.3. And equally
definitely both are supported on macOS 10.13 with XCode 10.1. That
means that I don't know which is the oldest version of macOS that
will build Valgrind (as of December 2025). Somewhere between 10.8 (Darwin 12)
and 10.12 (Darwin 16).


Valgrind-developer notes, todos re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* m_syswrap/syswrap-darwin.c:
  - PRE(sys_posix_spawn) completely ignores signal issues, and
    also ignores the file_actions argument

* Cleanups: sort wrappers in syswrap-darwin.c and priv_syswrap-darwin.h
  alphabetically.  Also, some aren't properly implemented -- check and
  print warnings

* Cleanups: m_scheduler/sema.c: use pipe implementation
  (but this apparently causes none/tests/pth_cancel1 to hang.
  I have no idea why, despite quite some investigation).

* Cleanups: m_debugstub: move to attic

* syswrap-darwin.c: sys_{f,}chmod_extended: handling of ARG5 is way
  wrong

* Cleanups (Linux,AIX5): bogus launcher-path mangling logic in
  PRE(sys_execve)

* Cleanups (ALL PLATFORMS): m_signals.c: are the _MY_SIGRETURN
  assembly stubs actually necessary for anything?  I don't know.

* Cleanups: check that changes to VG_(stat) and VG_(stat64) have
  not broken 64-bit statting on 32-bit Linux

* Cleanups: #if !HAVE_PROC in m_main (to do with /proc/<pid>/cmdline

--------

m_main: relatedly, Darwin version does not collect/give out
initial debuginfo handles; hence ptrcheck won't work


m_main: Darwin port relies on blocking out big sections of address
space with mmap at startup.  We know from history that this is a bad
idea.  (It's also really slow on 64-bit builds, taking 3--4 seconds.)
Also, startup is not done on the interim startup stack -- why not?


VG_(di_notify_mmap): Linux version is also used for Darwin, and
contains some ifdeffery.  Clean up.


PRE(sys_fork), #ifdeffery


syswrap-generic.c: VG_(init_preopened_fds) is #ifdefd for Darwin


scheduler.c: #ifdeffery in VG_(get_thread_out_of_syscall)


look at notes in coregrind/Makefile.am re Mach RPC interface
definitions.  See if we can get rid of any more stuff now that
m_debugstub is gone.
