goldfish_sync: fix stalls by avoiding early kfree()

After running for a long time, the kernel reports mysterious
stalls, or an impossible-looking stack trace in which
a single CPU is in sync_timeline_signal()
(drivers/staging/android/sync.c) and cannot acquire the spin lock.

The cause is that the timeline wrapper objects
(goldfish_sync_timeline_obj) were not cleaned up
properly in the rare case where a timeline increment
is still pending when the timeline wrapper object
is destroyed.

If the wrapper object is kfree()'ed too early, a pending
increment can still dereference it. The freed wrapper may
then point at garbage memory that happens to look like
a sync timeline object whose spin lock is currently held.
In that case, we get a stall because sw_sync_timeline_inc()
cannot acquire that "zombie" spin lock.

This CL postpones timeline object destruction until
all pending increments have gone through, using a
reference-counting scheme (krefs).
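
The idea can be sketched in userspace C as follows. This is only
an illustration, not the driver code: struct ref stands in for the
kernel's struct kref, and struct timeline_obj, queue_increment()
and run_increment() are hypothetical names. The counter is not
atomic, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct kref (not atomic). */
struct ref {
	int count;
};

/* Hypothetical wrapper; the real goldfish_sync_timeline_obj differs. */
struct timeline_obj {
	struct ref ref;
	unsigned int value;	/* stand-in for the timeline counter */
};

static int released;	/* number of wrappers freed so far (demo only) */

static struct timeline_obj *timeline_obj_create(void)
{
	struct timeline_obj *obj = calloc(1, sizeof(*obj));

	if (obj)
		obj->ref.count = 1;	/* like kref_init(): creator's ref */
	return obj;
}

static void timeline_obj_get(struct timeline_obj *obj)
{
	obj->ref.count++;		/* like kref_get() */
}

/* Drop one reference; returns 1 if this call freed the wrapper. */
static int timeline_obj_put(struct timeline_obj *obj)
{
	assert(obj->ref.count > 0);
	if (--obj->ref.count > 0)
		return 0;
	free(obj);			/* last reference is gone */
	released++;
	return 1;
}

/* Queueing an increment pins the wrapper until the increment runs. */
static void queue_increment(struct timeline_obj *obj)
{
	timeline_obj_get(obj);
}

/* Apply a pending increment, then drop the reference it held. */
static int run_increment(struct timeline_obj *obj, unsigned int howmuch)
{
	obj->value += howmuch;
	return timeline_obj_put(obj);
}
```

With this shape, "destroying" the wrapper is just
timeline_obj_put(). If an increment was queued beforehand, that
put does not free anything; the memory is released only when
run_increment() drops the last reference.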

Change-Id: I6f83a7bd61c174a8d99d83ea0f6e0972211337ee
Signed-off-by: Lingfeng Yang <lfy@google.com>
(cherry picked from commit 2d2c0829a38d4f0c4d2f42e88f838aaf5d33cefa)
1 file changed