Do not rewrite array subscripts if invalid sourceloc range
Fix #2739.
The issue exposed a problem: rewriting
CODE1: __u8 byte = daddr->s6_addr[4];
segfaults, while rewriting
CODE2: __u8 byte = (daddr->s6_addr)[4];
works.
For CODE1, clang did not provide enough information to locate the text
that contains the left bracket "[", given base "daddr->s6_addr"
and subscript "4". For CODE2, clang provides this information
successfully.
I think that if we descend into the base "daddr->s6_addr" and reach
its member field "s6_addr", we can find the information needed
for the text range containing "[". Let us fix the segfault first;
if really desirable, we can enhance the rewriter later for CODE1 patterns.
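
The guard can be illustrated outside of clang with a minimal sketch. It
assumes plain byte offsets into a string in place of SourceLocation values,
and the helper names bracket_text and rewrite_subscript are hypothetical,
not part of bcc. For CODE1 the reported end of the base lies at "]", past
the start of the index, so the bracket range is empty and the rewrite is
skipped.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the text between the end of the base expression
// and the start of the index expression. Offsets are byte offsets into `src`;
// the real code works with clang SourceLocation values instead.
std::string bracket_text(const std::string &src, std::size_t base_end,
                         std::size_t idx_begin) {
  // An inverted or out-of-bounds range yields no text.
  if (base_end >= idx_begin || idx_begin > src.size())
    return "";
  return src.substr(base_end, idx_begin - base_end);
}

// Returns true if the subscript was rewritten, false if it was left alone.
bool rewrite_subscript(std::string &src, std::size_t base_end,
                       std::size_t idx_begin) {
  if (bracket_text(src, base_end, idx_begin).empty())
    return false;  // bail out before touching the buffer
  src.replace(base_end, idx_begin - base_end, ") + (");
  return true;
}
```

With CODE2-style offsets the "[" is found and replaced; with CODE1-style
offsets the function returns false and leaves the text unchanged.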
Signed-off-by: Yonghong Song <yhs@fb.com>
diff --git a/src/cc/frontends/clang/b_frontend_action.cc b/src/cc/frontends/clang/b_frontend_action.cc
index 8ed8e69..bee7dd6 100644
--- a/src/cc/frontends/clang/b_frontend_action.cc
+++ b/src/cc/frontends/clang/b_frontend_action.cc
@@ -534,6 +534,20 @@
LangOptions opts;
SourceLocation lbracket_start, lbracket_end;
SourceRange lbracket_range;
+
+ /* For cases like daddr->s6_addr[4], clang encodes the end location of "base"
+ * as "]". This makes it hard to rewrite the expression like
+ * "daddr->s6_addr [ 4 ]" since we do not know the end location
+ * of "daddr->s6_addr". Let us abort the operation if this is the case.
+ */
+ lbracket_start = Lexer::getLocForEndOfToken(GET_ENDLOC(base), 1,
+ rewriter_.getSourceMgr(),
+ opts).getLocWithOffset(1);
+ lbracket_end = GET_BEGINLOC(idx).getLocWithOffset(-1);
+ lbracket_range = expansionRange(SourceRange(lbracket_start, lbracket_end));
+ if (rewriter_.getRewrittenText(lbracket_range).size() == 0)
+ return true;
+
pre = "({ typeof(" + E->getType().getAsString() + ") _val; __builtin_memset(&_val, 0, sizeof(_val));";
pre += " bpf_probe_read(&_val, sizeof(_val), (u64)((";
if (isMemberDereference(base)) {
@@ -549,11 +563,6 @@
* a method to retrieve the left bracket, replace everything from the end of
* the base to the start of the index. */
lbracket = ") + (";
- lbracket_start = Lexer::getLocForEndOfToken(GET_ENDLOC(base), 1,
- rewriter_.getSourceMgr(),
- opts).getLocWithOffset(1);
- lbracket_end = GET_BEGINLOC(idx).getLocWithOffset(-1);
- lbracket_range = expansionRange(SourceRange(lbracket_start, lbracket_end));
rewriter_.ReplaceText(lbracket_range, lbracket);
rbracket = "))); _val; })";