Fix dangling ref in m_errormgr.c + report all uninit fields in a syscall param

Some syscall verification code is allocating memory to generate 
the string used to build an error, e.g. syswrap-generic.c verifying fields of
e.g socket addresses (pre_mem_read_sockaddr) or sendmsg/recvmsg args 
(msghdr_foreachfield)

The allocated pointer was copied in the error created by VG_(maybe_record_error).

This was wrong for 2 reasons:
1. If the error is a new error, it is stored in a list of errors,
   but the string memory was freed by pre_mem_read_sockaddr, msghdr_foreachfield, ...
   This causes a dangling reference. Was at least visible when giving -v, which
   re-prints all errors at the end of execution.
   Probably this could have some consequences during run while generating new errors,
   and comparing for equality with a recorded error having a dangling reference.
2. the same allocated string is re-used for each piece/field of the verified struct.
   The code in mc_errors.c that checks that 2 errors are identical was then wrongly
   considereing that 2 successive errors for 2 different fields for the same syscall
   arg are identical, just because the error string happened to be produced at
   the same address.
(it is believed that initially, the error string was assumed to be a static
string, which is not the case anymore, causing the above 2 problems).

Changes:
* The fix consists in duplicating in m_errormgr.c the given error string when
  the error is recorded. In other words, the error string is now duplicated similarly
  to the (optional) extra component of the error.

* memcheck/tests/linux/rfcomm.c test modified as now an error is reported
  for each uninit field.

* socketaddr unknown family is also better reported (using sa_data field name,
  rather than an empty field name.

* minor reformatting in m_errormgr.c, to be below 80 characters.

Some notes:
1. the string is only duplicated if the error is recorded
   (ie. printed or the first time an error matches a suppression).
   The string is not duplicated for duplicated errors or following errors
   matching the first (suppressed) error.
   The string is also not duplicated for 'unique errors' (that are printed
   and then not recorded).
2. duplicating the string for each recorded error is not deemed to
   use a lot of memory:
     * error strings are usually NULL or short (often 10 bytes or so).
     * we expect no program has a huge number of errors
   If ever this string duplicate would be significant, having a DedupPoolAlloc
   in m_errormgr.c for these strings would reduce this memory (as we expect to
   have very few different strings, even with millions of errors).



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@14214 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/NEWS b/NEWS
index 645bbf5..1231937 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@
   - new leak check heuristic 'length64' to detect interior pointers
     pointing at offset 64bit of a block, when the first 8 bytes contains
     the block size - 8. This is e.g. used by sqlite3MemMalloc.
+  - if a syscall param (e.g. bind struct sockaddr, sendmsg struct msghdr,
+    ...) has several fields not initialised, an error is now reported for
+    each field. Previously, an error was reported only for the first wrong
+    field.
 
 * Helgrind:
   - Race condition error message with allocated blocks also show
diff --git a/coregrind/m_errormgr.c b/coregrind/m_errormgr.c
index e7de4eb..3143d78 100644
--- a/coregrind/m_errormgr.c
+++ b/coregrind/m_errormgr.c
@@ -677,7 +677,8 @@
    All detected errors are notified here; this routine decides if/when the
    user should see the error. */
 void VG_(maybe_record_error) ( ThreadId tid, 
-                               ErrorKind ekind, Addr a, const HChar* s, void* extra )
+                               ErrorKind ekind, Addr a, 
+                               const HChar* s, void* extra )
 {
           Error  err;
           Error* p;
@@ -818,9 +819,17 @@
          break;
    }
 
+   /* copy the error string, if there is one.
+      note: if we would have many errors with big strings, using a
+      DedupPoolAlloc for these strings will avoid duplicating
+      such string in each error using it. */
+   if (NULL != p->string) {
+      p->string = VG_(arena_strdup)(VG_AR_CORE, "errormgr.mre.2", p->string);
+   }
+
    /* copy block pointed to by 'extra', if there is one */
    if (NULL != p->extra && 0 != extra_size) { 
-      void* new_extra = VG_(malloc)("errormgr.mre.2", extra_size);
+      void* new_extra = VG_(malloc)("errormgr.mre.3", extra_size);
       VG_(memcpy)(new_extra, p->extra, extra_size);
       p->extra = new_extra;
    }
@@ -870,7 +879,7 @@
       Then update the 'extra' part with VG_(tdict).tool_update_extra),
       because that can have an affect on whether it's suppressed.  Ignore
       the size return value of VG_(tdict).tool_update_extra, because we're
-      not copying 'extra'. */
+      not copying 'extra'. Similarly, 's' is also not copied. */
    (void)VG_TDICT_CALL(tool_update_extra, &err);
 
    su = is_suppressible_error(&err);
@@ -1514,10 +1523,11 @@
       // n_expanded >= n_ips_expanded.
 
       Int* n_offsets_per_ip;
-      // n_offsets_per_ip[i] gives the nr of offsets in fun_offsets and obj_offsets
-      // resulting of the expansion of ips[i].
+      // n_offsets_per_ip[i] gives the nr of offsets in fun_offsets and
+      // obj_offsets resulting of the expansion of ips[i].
       // The sum of all n_expanded_per_ip must be equal to n_expanded.
-      // This array allows to retrieve the position in ips corresponding to an ixInput.
+      // This array allows to retrieve the position in ips corresponding to 
+      // an ixInput.
 
       // size (in elements) of fun_offsets and obj_offsets.
       // (fun|obj)_offsets are reallocated if more space is needed
@@ -1570,9 +1580,10 @@
 }
 
 /* free the memory in ip2fo.
-   At debuglog 4, su (or NULL) will be used to show the matching (or non matching)
-   with ip2fo. */
-static void clearIPtoFunOrObjCompleter ( Supp  *su, IPtoFunOrObjCompleter* ip2fo)
+   At debuglog 4, su (or NULL) will be used to show the matching
+   (or non matching) with ip2fo. */
+static void clearIPtoFunOrObjCompleter ( Supp  *su, 
+                                         IPtoFunOrObjCompleter* ip2fo)
 {
    if (DEBUG_ERRORMGR || VG_(debugLog_getLevel)() >= 4) {
       if (su)
@@ -1672,7 +1683,8 @@
               i++) {
             ip2fo->obj_offsets[i] = ip2fo->names_free;
             if (DEBUG_ERRORMGR) 
-               VG_(printf) ("   set obj_offset %lu to %d\n", i, ip2fo->names_free);
+               VG_(printf) ("   set obj_offset %lu to %d\n", 
+                            i, ip2fo->names_free);
          }
       }
       ip2fo->names_free += VG_(strlen)(caller_name) + 1;
diff --git a/coregrind/m_syswrap/syswrap-generic.c b/coregrind/m_syswrap/syswrap-generic.c
index 39ee199..35a2b40 100644
--- a/coregrind/m_syswrap/syswrap-generic.c
+++ b/coregrind/m_syswrap/syswrap-generic.c
@@ -1074,8 +1074,13 @@
 #endif
 
       default:
-         VG_(sprintf) ( outmsg, description, "" );
-         PRE_MEM_READ( outmsg, (Addr) sa, salen );
+         /* No specific information about this address family.
+            Let's just check the full data following the family.
+            Note that this can give false positive if this (unknown)
+            struct sockaddr_???? has padding bytes between its elements. */
+         VG_(sprintf) ( outmsg, description, "sa_data" );
+         PRE_MEM_READ( outmsg, (Addr)&sa->sa_family + sizeof(sa->sa_family),
+                       salen );
          break;
    }
    
diff --git a/include/pub_tool_errormgr.h b/include/pub_tool_errormgr.h
index 26339ab..3ec673c 100644
--- a/include/pub_tool_errormgr.h
+++ b/include/pub_tool_errormgr.h
@@ -66,9 +66,15 @@
    seen before.  If it has, the existing error record will have its count
    incremented.
 
-   'tid' can be found as for VG_(record_ExeContext)().  The `extra' field can
-   be stack-allocated;  it will be copied by the core if needed (but it
-   won't be copied if it's NULL).
+   'tid' can be found as for VG_(record_ExeContext)().  The `s' string
+   and `extra' field can be stack-allocated;  they will be copied by the core
+   if needed (but it won't be copied if it's NULL).
+   Note that `ekind' and `s' are also used to generate a suppression.
+   `s' should therefore not contain data depending on the specific
+   execution (such as addresses, values) but should rather contain
+   e.g. a system call parameter symbolic name.
+   `extra' is also (optionally) used for generating a suppression
+   (see pub_tool_tooliface.h print_extra_suppression_info).
 
    If no 'a', 's' or 'extra' of interest needs to be recorded, just use
    NULL for them.  */
diff --git a/memcheck/tests/linux/rfcomm.c b/memcheck/tests/linux/rfcomm.c
index 02dcd7e..dbda1f5 100644
--- a/memcheck/tests/linux/rfcomm.c
+++ b/memcheck/tests/linux/rfcomm.c
@@ -3,6 +3,7 @@
 #include <stdlib.h>
 #include <sys/types.h>
 #include <sys/socket.h>
+#include "../../memcheck.h"
 
 /* user space headers might not be there, define things ourselves. */
 typedef struct {
@@ -34,20 +35,39 @@
     }
 
   struct vui_sockaddr_rc aAddr;
+
+  // Store correct values in aAddr but marking it undefined
+  // so as to generate errors. We need to have deterministic
+  // undefined values to have a reproducible test.
+  aAddr.rc_family = VUI_AF_BLUETOOTH;
+  aAddr.rc_bdaddr = *VUI_BDADDR_ANY;
+  aAddr.rc_channel = 5;
+  VALGRIND_MAKE_MEM_UNDEFINED(&aAddr, sizeof(aAddr));
+  // We re-assign below each piece one by one, so as to
+  // have the piece marked initialised.
+
+
   // Ignore return values.
 
-  // Missing family
+  // Everything uninit (family, ...)
+  bind(nSocket, (struct sockaddr *) &aAddr, sizeof(aAddr));
+
+  // Same but with an unknown family (we hope :)
+  aAddr.rc_family = 12345;
+  // (reset everything to uninit)
+  VALGRIND_MAKE_MEM_UNDEFINED(&aAddr, sizeof(aAddr));
   bind(nSocket, (struct sockaddr *) &aAddr, sizeof(aAddr));
 
   aAddr.rc_family = VUI_AF_BLUETOOTH;
-  // Missing bdaddr.
+  // uninit bdaddr and channel.
   bind(nSocket, (struct sockaddr *) &aAddr, sizeof(aAddr));
 
   aAddr.rc_bdaddr = *VUI_BDADDR_ANY;
-  // Missing channel.
+  // uninit channel.
   bind(nSocket, (struct sockaddr *) &aAddr, sizeof(aAddr));
 
   aAddr.rc_channel = 5;
+  // Everything correctly init.
   bind(nSocket, (struct sockaddr *) &aAddr, sizeof(aAddr));
 
   return 0;
diff --git a/memcheck/tests/linux/rfcomm.stderr.exp b/memcheck/tests/linux/rfcomm.stderr.exp
index 293ae2b..0c16048 100644
--- a/memcheck/tests/linux/rfcomm.stderr.exp
+++ b/memcheck/tests/linux/rfcomm.stderr.exp
@@ -1,24 +1,64 @@
 Syscall param socketcall.bind(my_addr.sa_family) points to uninitialised byte(s)
    ...
-   by 0x........: main (rfcomm.c:40)
+   by 0x........: main (rfcomm.c:53)
  Address 0x........ is on thread 1's stack
- in frame #1, created by main (rfcomm.c:25)
- Uninitialised value was created by a stack allocation
-   at 0x........: main (rfcomm.c:25)
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:45)
 
 Syscall param socketcall.bind(my_addr.rc_bdaddr) points to uninitialised byte(s)
    ...
-   by 0x........: main (rfcomm.c:44)
+   by 0x........: main (rfcomm.c:53)
  Address 0x........ is on thread 1's stack
- in frame #1, created by main (rfcomm.c:25)
- Uninitialised value was created by a stack allocation
-   at 0x........: main (rfcomm.c:25)
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:45)
 
 Syscall param socketcall.bind(my_addr.rc_channel) points to uninitialised byte(s)
    ...
-   by 0x........: main (rfcomm.c:48)
+   by 0x........: main (rfcomm.c:53)
  Address 0x........ is on thread 1's stack
- in frame #1, created by main (rfcomm.c:25)
- Uninitialised value was created by a stack allocation
-   at 0x........: main (rfcomm.c:25)
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:45)
+
+Syscall param socketcall.bind(my_addr.sa_family) points to uninitialised byte(s)
+   ...
+   by 0x........: main (rfcomm.c:59)
+ Address 0x........ is on thread 1's stack
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:58)
+
+Syscall param socketcall.bind(my_addr.sa_data) points to uninitialised byte(s)
+   ...
+   by 0x........: main (rfcomm.c:59)
+ Address 0x........ is on thread 1's stack
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:58)
+
+Syscall param socketcall.bind(my_addr.rc_bdaddr) points to uninitialised byte(s)
+   ...
+   by 0x........: main (rfcomm.c:63)
+ Address 0x........ is on thread 1's stack
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:58)
+
+Syscall param socketcall.bind(my_addr.rc_channel) points to uninitialised byte(s)
+   ...
+   by 0x........: main (rfcomm.c:63)
+ Address 0x........ is on thread 1's stack
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:58)
+
+Syscall param socketcall.bind(my_addr.rc_channel) points to uninitialised byte(s)
+   ...
+   by 0x........: main (rfcomm.c:67)
+ Address 0x........ is on thread 1's stack
+ in frame #1, created by main (rfcomm.c:26)
+ Uninitialised value was created by a client request
+   at 0x........: main (rfcomm.c:58)