[Toybox] memcpy overlap in ps

Evgenii Stepanov eugenis at google.com
Thu Oct 6 14:15:30 PDT 2016


Hi Rob,

thanks for the explanation. This is definitely not a false positive -
the report even contains dst and src ranges for the memcpy() call, and
they indeed overlap. Should be possible to reproduce w/o ASan by
checking the addresses in the code.

I've added some debug printfs to this code, and got this:

j = 5
ptb 0x0 tb 0x5564dfb750 buf 0x5564dfb961
USER       PID  PPID     VSZ    RSS WCHAN              PC S NAME
root         1     0   10628   1272 SyS_epoll_     4d805c S init
j = 5
ptb 0x0 tb 0x5564dfb750 buf 0x5564dfb94b

The change (the one that requires a google login) simply replaces
memcpy with memmove, which sounds like a bad idea at this point.


On Wed, Oct 5, 2016 at 3:15 AM, Rob Landley <rob at landley.net> wrote:
> On 10/03/2016 07:22 PM, enh wrote:
>> from the AOSP gerrit (fixing internal bug 30074257). i've been meaning
>> to look at this more closely for a couple of months, but haven't found
>> the time. i too wasn't sure whether switching to memmove was actually
>> the fix or just papering over a real problem...
>
> Since it took me a little to work through it too, and you're "not sure",
> here's a refresher on the logic. (Feel free to skip down to the =======
> where I start talking about the code directly setting up the memcpy this
> patch would change.)
>
> The code this patch touches (line numbers are circa commit 703c49e0cb97)
> initializes the miscellaneous string slots, which store data we didn't
> get out of /proc/$PID/stat but came from somewhere else under /proc/$PID.
>
> This code lives in the function get_ps(). The miscellanous string
> processing depends on an array of structures called fetch[] at the start
> of get_ps() (line 601) listing various files under the current
> /proc/$PID, each with a set of field bits that tell us when we need this
> data (so we don't open and read from files our -o BLAH,BLAH,BLAH list
> isn't displaying). Down around line 728 we iterate through the fetch[]
> array, opening and reading each file and doing the additional processing
> it requires. Each of these files produces variable length string data,
> which we append to tb->str and store an array of tb->offset[] values to
> where each one starts. (Well, after the first; the first is at offset
> zero so we don't need to store it.)
>
> All this code lives in function get_ps(), which reads all the /proc data
> for a given process into "toybuf" in one gulp, and then at the end of
> the function (line 850) either displays it immediately (if we haven't
> got -k) or copies into into a malloced buffer attached tree we later
> turn into an array and sort before displaying it. Either way, we filter
> out the nodes we don't want (line 693) and return about halfway through
> get_ps() if it's not a process we eventually want to display.
> This means that all the data we read about each process, including the
> miscelaneous string slots, has to collectively fit in 4096 bytes.
>
> The variable name "tb" is short for "toybuf" bcause it's the structure I
> was was overlaying on toybuf to store the data in, and at the top of the
> function (line 610) it's pointed to toybuf (and on line 621 toybuf is
> memset back to all zeroes), so that's what all its fields are scribbling
> over as we read in data: "struct carveup *tb" is how we carve up toybuf
> into individual fields.
>
> Before we read the miscelaneous string data, lines 621 through 726 read
> a bunch of other stuff into toybuf first (starting with all the numeric
> data from the file "stat" going into the long long tb->slot[SLOT_count]
> array, which is a TAGGED_ARRAY as described in the checkin comment of
> commit f96bb3d8e7ec). Then we read other stuff out of the files
> "status", "io", "statm", "exe", and android's get_sched_policy().
>
> By the time we start iterating through that fetch[] array to initialize
> the miscelanous string slots, we've used maybe 1/4 of toybuf's 4096
> bytes, and a lot of these strings can be arbitrarily long (on modern
> kernels argv[] data can be gigabytes), so we give each one a size budget
> (see line 740). The theory is each field we HAVEN'T read yet has a
> minimum reserved size (256 bytes), but any space that wasn't used by
> fields we've already read gets added to the current field's budget, so
> if one of them needs to be 2k it can without totally squeezing out the
> others. (This gives us more space for string matching or displaying in
> -w mode.)
>
> (Note: there's a design tension here: processes can go away at any time,
> they can exit _while_ we're reading them, so we snapshot the data for
> each process all at once (and if we fail halfway through we discard the
> process; if it had gone away a fraction of a second earlier we wouldn't
> have seen it at all). The only AFTER we've read everything do we sort
> and/or display it. We never go back and read more data from a process
> after get_ps() returns, to avoid race conditions. But we don't want
> unlimited memory usage (imagine a process with 1000 threads and a
> megabyte-long command line), hence the 4k cap per process on data we
> store. Alas, this means if your /proc/self/exe symlink is 3000 bytes
> long, it's trucated to the space budget we had for reading that
> miscellaneous string slot.)
>
> The "buf" pointer advances through toybuf as we use it, and it always
> points to the next unused byte in toybuf. So on line 742 we set "len" to
> how much space this miscellaneous string field is allowed to use, and
> use it as temporary scratch space to write the path to the filename we
> want to look at into buf. ("fd" is an open filehandle to /proc so we can
> openat() or readlinkat() paths relative to that. The upcoming read or
> readlink command will overwrite the filename data with the data it fetches.)
>
> Then we special case the fetch[] entries were we DON'T just open a file
> and read all its contents, but instead do something like readlink() it,
> and that's slots 0, 3 and 5. For slot 3 we're doing a
> readlink("/proc/$PID/exe"), slot 0 does funky tty processing (most of
> which Elliott wrote, doing a stat() on fd/* and comparing the major and
> minor numbers with /proc/tty/drivers, which should possibly be cached so
> we're not reopening it so much; but the reason it's in miscellaneous
> string processing is the result is a string of unknown length that we
> need to store until we're ready to display it).
>
> Slot 5 is a funny one having to do with threads. A thread has to look at
> its parent process in order to get some of its data (its own stat entry
> has some blank fields), but the display logic is only ever looking at
> ONE process's stored block of "struct carveup" toybuf data. By the time
> the need for one process to look at ANOTHER process to fill out its tb
> fields came up, the ps design really really didn't want to do that.
>
> So what I did was add a global variable (TT.threadparent, see line 751)
> that lets get_ps() access the parent of the thread its currently looking
> at. (And know that when this variable is NULL, then it's not looking at
> a thread.) Only the data reading function has this, the display function
> can't because we only store the processes we're displaying, the filter
> function may have discarded that other process before the display
> function runs. If we're not showing the thread parent, we can't access
> its data at display time. So for this extra thread data, we have to copy
> fields from one process into another in the reading function, and that's
> what "ptb" is for: parent tb.
>
> If you're trying to make sense of this logic, sometimes to figure out
> what should be IN these fields it helps to look at the consumer, I.E.
> the display logic. At display time, the function string_field() (which
> converts a field to a string) uses the typos[] array (see line 337, and
> that TAGGED_ARRAY git commit mentioned earlier) to match each -o THINGY
> field in sequence to figure out what we're displaying (lines 1178 to
> 1193 load the TT.fields list). Then there's a big if/else staircase
> providing additional behavior to them (starting on line 426). The "slot"
> field in each typos[] entry can be negative, which means display one of
> the miscellaneous strings.
>
> This means fetch[5] becomes -7 in typos[x].slot, first because you can't
> have -0 so -1 means fetch[0], and second because the first entry is at
> offset 0 into the tb->str data so -1 doesn't need a slot to record the
> constant 0 -- that's what the comment on line 443 is talking about-- and
> is loaded outside the miscellaneous string loop (it's the "real command
> name" we read in from "stat" when we were initializing the slot[] array,
> see line 643), so in this case fetch[5] is actually -7, which is the
> value used in the -o "NAME" entry in typos (line 349). This also matches
> the fetch bits entry on line 608. According to the help text on line 96,
> NAME is "Process name (argv[0] of $PID)", meaning if we're a thread it's
> our parent's argv[0], and if not it's ours. When we copy it we're
> stripping the path components off (line 761) and only keeping the
> filename part.
>
> ========= back to the action
>
> So what we're doing around line 758 is copying everything after the last
> '/' from fetch[4] (which is "cmdline" on line 607) into fetch[5], and
> those can't overlap even if ptb = tb because tb->offset[5] is after the
> end of tb->offset[4]. (Note, the "256 bytes of reserved space" is not
> just a nice round number, it's also the maximum size a filename can
> have, with null terminator, as enforced by the VFS.)
>
> So the thread stuff we're doing for fetch[5] is fine, but fetch[3] also
> has the "ptb vs tb" check with the possible need to copy data out of the
> parent. We only readlinkat0() (which is just a readlinkat() that's
> guaranteed to null terminate) when ptb is  null, otherwise we copy our
> parent's data (because the thread hasn't got a file to readlink).
>
> But when j == 3, we only go into the else case with the memcpy() when
> ptb _isn't_ null, in which case s = ptb->str + offset and that's a
> pointer to our parent's struct carveup tb memory block. So the memcpy is
> copying from a malloc to toybuf, which can't overlap.
>
> The other potential hiccup is SLOT_argv0len, ala:
>
>           if (!ptb || tb->slot[SLOT_argv0len]) ptb = tb;
>           i = ptb->slot[SLOT_argv0len];
>           s = ptb->str+ptb->offset[4];
>
> Could we be using an improperly initialized argv0len? When ptb is NULL
> but this thread's SLOT_argv0len isn't (which means it's not a thread but
> a process), then we grab "cmdline" and use the argv0len set on line 842
> (by the previous iteration of this loop when j==4). But since
> fetch[4].bits has _PS_NAME in it (line 607), then when we're processing
> fetch[5] we _already_ processed fetch[4] for the same process (I.E. we
> didn't skip it) and thus it set SLOT_argv0len properly for us. (It's the
> last entry in the list with default processing, so as the comment on
> line 841 says it sets the argv0len value that stays. Of course we have
> to be _after_ it in the list to use the value it set when we're _not_
> operating on a thread parent but instead copying from ourselves, which
> is why we're fetch[5].
>
> What argv0len _not_ being 0 says is /proc/self/cmdline existed and
> wasn't empty, in which case we want to use its value.
>
> So I don't see the codepath the address sanitizer's complaining about
> where we copy over ourselves? (I thought it was because we were
> potentially doing an in-place basename where we might copy the filename
> after the / over the earlier path components, and that might overlap.
> But the source is ptb->str plus either ptb->offset[3] or ptb->offset[4],
> and even when ptb _is_ pointing to tb both of those should lower than
> ptb->offset[5] and the length (i) should end before the start of
> ptb->offset[5] (either because the offset[3] strlen() finds the null
> terminated string or because we used slot[SLOT_argv0len] which should be
> less than or equal to the strlen of the "cmdline" data in fetch[4]. (And
> no it's not us failing to account for the null terminator, line 846 adds
> the +1 when advancing buf for that reason.)
>
> So if anybody can tell me what the address sanitizer is complaining
> about, I'm all ears? That said, I can still replace the memcpy() with
> memmove() to avoid triggering what _seems_ like a false positive in your
> address sanitizer, if you like...
>
>> Evgenii Stepanov has uploaded a new change for review.
>>
>> ( https://googleplex-android-review.git.corp.google.com/1504922 )
>
> Which wants a google corporate login.
>
> Rob



More information about the Toybox mailing list