[Aboriginal] Merry christmas, I have found two bugs

Bjørn Forsman bjorn.forsman at gmail.com
Wed Dec 26 14:23:06 PST 2012


On 26 December 2012 22:36, Rob Landley <rob at landley.net> wrote:
> On 12/26/2012 02:55:04 PM, Bjørn Forsman wrote:
[...]
>> And now I think I've found two bugs. Bug 1 is
>> a segmentation fault during system image boot:
>>
>> $ cd build/system-image-armv5l
>> $ ./run-emulator.sh
>> [...]
>> Freeing init memory: 96K
>> Segmentation fault
>> 8139cp 0000:00:0c.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
>> Not using distcc.
>> Type exit when done.
>> (armv5l:1) /home #
>>
>> Not sure what exactly is segfaulting, but segfault == bug to me. And
>> it only happens with ./run-emulator.sh, not ./dev-environment.sh.
>
> There are somewhat different code paths in the init.sh shell script so it's
> probably something only running in the first case. I need to rebuild the
> armv5l target to test this, that'll take a few minutes...
>
>> And bug 2 is a floating point exception in toybox "ls":
>> $ ./dev-environment.sh
>> [...]
>> Freeing init memory: 96K
>> 8139cp 0000:00:0c.0 eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
>> EXT4-fs (sdb): mounting ext3 file system using the ext4 subsystem
>> EXT4-fs (sdb): recovery complete
>> EXT4-fs (sdb): mounted filesystem with ordered data mode. Opts: (null)
>> Distcc acceleration enabled.
>> Type exit when done.
>> (armv5l:1) /home # ls
>> Floating point exception
>> (armv5l:1) /home # toybox ls
>> Floating point exception
>> (armv5l:1) /home # ls -l
>> total 24
>> drwx------ 2 root root 12288 2012-12-26 20:39 lost+found
>> (armv5l:1) /home #
>
> *boggle* That's... very strange.
>
> (Also, I didn't think ls was using floating point? Quick check says it's
> not...)

Neither did I. But wait... I just read up on SIGFPE and it's not
necessarily a floating point error, it's an arithmetic error. For
example, division by zero yields SIGFPE (just checked it).
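
A minimal sketch of such a check in C (an illustration only, not
necessarily the test used above; the function name
signal_from_int_division is made up). It performs the division in a child
process so the parent can report which signal, if any, killed it. Integer
division by zero is undefined behavior in C, so the outcome is platform
dependent: x86 Linux traps and delivers SIGFPE, while some architectures
return a value without raising any signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Divide in a child process and return the number of the signal that
 * killed the child, 0 if the child exited normally, or -1 if fork() or
 * waitpid() failed. */
int signal_from_int_division(int numerator, int denominator)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* volatile keeps the compiler from folding the division away.
         * Dividing by zero here is undefined behavior; on x86 Linux it
         * traps and the kernel delivers SIGFPE. */
        volatile int n = numerator;
        volatile int d = denominator;
        volatile int q = n / d;
        (void)q;
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On x86 Linux, signal_from_int_division(1, 0) should return SIGFPE, which
the shell reports as "Floating point exception" even though no floating
point is involved.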

>> The really strange thing (at least to me) is that the floating point
>> exception only appears in ./dev-environment.sh, NOT ./run-emulator.sh.
>> And it's only "ls" that is broken, "ls -l" is ok.
>
> Very weird. Lemme see if I can reproduce that... yes I can. And wow, you're
> right, it's only happening for dev-environment.sh which should be TOTALLY
> unrelated...
>
>   wget http://landley.net/aboriginal/strace-armv5l

That returns a 404 error for me (but ignore that).

>   chmod +x strace-armv5l
>   ./strace-armv5l /bin/ls
>
> execve("/bin/ls", ["/bin/ls"], [/* 9 vars */]) = 0
> brk(0)                                  = 0x932000
> brk(0x9324b0)                           = 0x9324b0
> set_tls(0x932490, 0x5124c, 0, 0x1, 0x65fac) = 0
> ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
> {B38400 opost isig icanon echo ...}) = 0
> ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
> {B38400 opost isig icanon echo ...}) = 0
> getuid32()                              = 0
> geteuid32()                             = 0
> brk(0x9334b0)                           = 0x9334b0
> brk(0x934000)                           = 0x934000
> umask(0)                                = 022
> umask(022)                              = 0
> ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS,
> {B38400 opost isig icanon echo ...}) = 0
> ioctl(0, TIOCGWINSZ, {ws_row=0, ws_col=0, ws_xpixel=0, ws_ypixel=0}) = 0
> ioctl(1, TIOCGWINSZ, {ws_row=0, ws_col=0, ws_xpixel=0, ws_ypixel=0}) = 0
> ioctl(2, TIOCGWINSZ, {ws_row=0, ws_col=0, ws_xpixel=0, ws_ypixel=0}) = 0
> newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=1024, ...}, 0) = 0
> open(".", O_RDONLY|O_LARGEFILE)         = 3
> dup(3)                                  = 4
> fstat64(4, {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0
> fcntl64(4, F_GETFL)                     = 0x20000 (flags
> O_RDONLY|O_LARGEFILE)
> getdents64(4, /* 4 entries */, 1024)    = 120
> newfstatat(4, ".", {st_mode=S_IFDIR|0755, st_size=1024, ...},
> AT_SYMLINK_NOFOLLOW) = 0
> newfstatat(4, "strace-armv5l", {st_mode=S_IFREG|0755, st_size=329448, ...},
> AT_SYMLINK_NOFOLLOW) = 0
> newfstatat(4, "..", {st_mode=S_IFDIR|0755, st_size=163, ...},
> AT_SYMLINK_NOFOLLOW) = 0
> newfstatat(4, "lost+found", {st_mode=S_IFDIR|0700, st_size=12288, ...},
> AT_SYMLINK_NOFOLLOW) = 0
> getdents64(4, /* 0 entries */, 1024)    = 0
> close(4)                                = 0
> gettid()                                = 49
> tgkill(49, 49, SIGFPE)                  = 0
> --- SIGFPE {si_signo=SIGFPE, si_code=SI_TKILL, si_pid=49, si_uid=0} ---
> +++ killed by SIGFPE +++
> Floating point exception
>
> So it's getting a ways into ls. Looks like it manages the whole dirtree and
> then dies on the way back. Hmmm...
>
> Thanks for the heads up, I'm going to chew on these for a bit...
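
One detail in the trace worth noting: the SIGFPE arrives through tgkill()
with si_code=SI_TKILL, so the process is sending the signal to itself,
which is what raise() does. A division that trapped in hardware would be
reported by the kernel with a code such as FPE_INTDIV instead. As far as I
know ARMv5 has no hardware divide instruction, so integer division goes
through a compiler support routine, and such a routine raising SIGFPE on a
zero divisor would match this trace (an assumption, not verified against
this toolchain). A minimal sketch of telling the two cases apart; the
function name sigfpe_si_code_from_raise is hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t seen_signal;
static volatile sig_atomic_t seen_code;

/* SA_SIGINFO handler: remember which signal arrived and its si_code. */
static void record_siginfo(int signo, siginfo_t *info, void *context)
{
    (void)context;
    seen_signal = signo;
    seen_code = info->si_code;
}

/* Raise SIGFPE from userspace and store the si_code the handler saw in
 * *code_out. Returns 0 on success, -1 on failure.
 * On Linux, codes <= 0 (SI_USER, SI_TKILL, ...) mean another process or
 * the process itself sent the signal; positive codes such as FPE_INTDIV
 * mean the kernel generated it for a real arithmetic trap. */
int sigfpe_si_code_from_raise(int *code_out)
{
    struct sigaction action, previous;

    memset(&action, 0, sizeof action);
    action.sa_sigaction = record_siginfo;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGFPE, &action, &previous) != 0)
        return -1;

    seen_signal = 0;
    seen_code = 0;
    raise(SIGFPE);                 /* handler runs before raise() returns */
    sigaction(SIGFPE, &previous, NULL);

    if (seen_signal != SIGFPE)
        return -1;
    *code_out = seen_code;
    return 0;
}
```

With glibc on Linux I would expect the stored code to be SI_TKILL, the
same value strace printed above.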

Cool. And thanks for the very quick response!

I tried building toybox for my development host. No problem with "ls" there.

Once I get toybox rebuilt with debug symbols, I can figure out
(from the core file) where the bug is. Could it be some wrapping of
integer types on ARM (that does not happen on x86) that causes a
division by zero?! I'm curious...
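
Another thing in the trace that may matter: all three TIOCGWINSZ calls
report ws_col=0, and only the column-formatted "ls" dies while "ls -l"
works. If the column layout divides by something derived from the terminal
width, a zero width could be the zero divisor. This is a guess, not checked
against the toybox source. A minimal sketch of the kind of guard that would
avoid it; the function names and the 80-column fallback are assumptions for
illustration, not toybox code:

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical default, used when the terminal reports no usable width. */
#define FALLBACK_COLS 80u

/* Width of the terminal on fd, or FALLBACK_COLS if the ioctl fails or
 * reports zero columns (as in the trace above). */
unsigned terminal_columns(int fd)
{
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) == 0 && ws.ws_col > 0)
        return ws.ws_col;
    return FALLBACK_COLS;
}

/* How many entries of entry_width characters fit on one row.
 * Never returns 0, so callers can safely divide by the result. */
unsigned entries_per_row(unsigned term_cols, unsigned entry_width)
{
    unsigned per_row;

    if (entry_width == 0)
        entry_width = 1;
    per_row = term_cols / entry_width;
    return per_row ? per_row : 1;
}

/* Rows needed for count entries; per_row is clamped to at least 1 so a
 * zero never reaches the division. */
unsigned rows_needed(unsigned count, unsigned per_row)
{
    if (per_row == 0)
        per_row = 1;
    return (count + per_row - 1) / per_row;
}
```

Without the clamping, a reported width of 0 would make per_row 0 and the
division in rows_needed() would be a division by zero.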

Best regards,
Bjørn Forsman
