From: Andrea Arcangeli <andrea@suse.de>

I debugged a problem with CLONE_THREAD under strace generating zombies that
cannot be reaped by init.

Basically, release_task is never called on the clones, so the parent
thread remains a zombie forever because thread_group_empty == 0 (it never
notifies init).  The group can become empty only after release_task has
been called on all the clones.

If the clone happens to be under strace when it exits, its state is not
set to TASK_DEAD, and nobody will ever call wait4 on the clone because the
parent is being killed at the same time.  But the parent cannot go away
until the clone goes away too.  I believe triggering this also requires a
small race in strace, in which it has SIGCHLD disabled, but what I'm
discussing here is still a kernel bug that generates zombie threads.

I think I could have fixed this with a stricter patch (adding an
exit_signal == -1 check just to cover that case), but I believe it makes
no sense to leave ptrace enabled on a clone that is being killed.  It
happens to be safe without a thread group only because init will always be
able to call wait4->release_task on it, which calls ptrace_unlink later in
release_task.  The same goes for the "leader" of the thread group, which
can likewise be detached from ptrace via release_task.
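
To illustrate the state decision described above, here is a minimal
userspace sketch in C.  It is a toy model, not kernel code: struct
toy_task, enum exit_state and both helper functions are hypothetical names
invented for this illustration, and the model assumes that
__ptrace_unlink clears the task's ptrace flag.

```c
#include <stdbool.h>

/* Toy model only: these types are hypothetical, not kernel structures. */
enum exit_state { STATE_ZOMBIE, STATE_DEAD };

struct toy_task {
	int exit_signal;      /* -1: no parent notification on exit */
	bool ptraced;         /* stands in for tsk->ptrace != 0 */
	bool is_group_leader; /* stands in for tsk == tsk->group_leader */
};

/* State selection as it was before the patch. */
static enum exit_state exit_state_before(const struct toy_task *t)
{
	if (t->exit_signal == -1 && !t->ptraced)
		return STATE_DEAD;
	return STATE_ZOMBIE;
}

/*
 * State selection with the patch applied: a traced non-leader thread
 * is detached first (assumed effect of __ptrace_unlink), so the
 * existing check can then pick STATE_DEAD.
 */
static enum exit_state exit_state_after(struct toy_task *t)
{
	if (t->ptraced && !t->is_group_leader)
		t->ptraced = false;
	return exit_state_before(t);
}
```

In this model a traced clone with exit_signal == -1 stays a zombie under
exit_state_before but becomes dead under exit_state_after, while a traced
group leader is left attached in both cases.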

Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/kernel/exit.c |    7 +++++++
 1 files changed, 7 insertions(+)

diff -puN kernel/exit.c~zombie-with-clone_thread kernel/exit.c
--- 25/kernel/exit.c~zombie-with-clone_thread	2004-06-29 23:04:07.787147704 -0700
+++ 25-akpm/kernel/exit.c	2004-06-29 23:04:07.791147096 -0700
@@ -730,6 +730,13 @@ static void exit_notify(struct task_stru
 		do_notify_parent(tsk, SIGCHLD);
 	}
 
+	/*
+	 * To allow the group leader of a thread group to be released
+	 * we must really go away synchronously if exit_signal == -1.
+	 */
+	if (unlikely(tsk->ptrace) && tsk != tsk->group_leader)
+		__ptrace_unlink(tsk);
+
 	state = TASK_ZOMBIE;
 	if (tsk->exit_signal == -1 && tsk->ptrace == 0)
 		state = TASK_DEAD;
_