[Aboriginal] [Qemu-devel] [PATCH] Fixing sh4 serial abort

Rob Landley rob at landley.net
Fri Jul 27 10:16:05 PDT 2012


On 07/27/2012 09:32 AM, Peter Maydell wrote:
> On 27 July 2012 14:45, Rob Landley <rob at landley.net> wrote:
>> I.E. sci_getreg(port, SCFCR) move to before checking whether or not
>> we'll ever possibly use the result. SCFCR is 0x18 and QEMU calls abort()
>> on an attempt to read from an unimplemented register.
>>
>> I can patch the kernel to work around this (and probably will for this
>> release), but the _proper_ fix is to get qemu not to abort on a register
>> read that works fine if it just returns 0.
> 
> The thing this analysis is missing is any examination of the question
> "what is the hardware we are modelling documented to do?".

Given that 3.3, 3.4, and 3.5 kernels have already shipped with this, I'm
guessing "not immediately crash"?

Then again I can't really criticize sh4 for multiple kernel releases not
working under qemu and nobody noticing when _arm_ currently has a similar
problem. The arm versatile board's scsi emulation was broken by linux
commit 4d5fc58dbe34 in 3.4 (yanking the versatile's mach/io.h when the
default one breaks PCI), and then before _that_ got reverted (in commit
9b0f7e399238c6) this commit went in:

commit 1bc39ac5dab265b76ce6e20d6c85f900539fd190
Author: Russell King <rmk+kernel at arm.linux.org.uk>
Date:   Sat Mar 10 11:32:34 2012 +0000

    ARM: PCI: versatile: fix PCI interrupt setup

    This is at odds with the documentation in the file; it says pin 1 on
    slots 24,25,26,27 map to IRQs 27,28,29,30, but the function will
    always be entered with slot=0 due to the lack of swizzle function.
    Fix this function to behave as the comments say, and use the
    standard PCI swizzle.

    Signed-off-by: Russell King <rmk+kernel at arm.linux.org.uk>

So the arm maintainer noticed a place where the code didn't match the
documentation, changed the code to match the docs, and the result doesn't
work under qemu's versatile board emulation. But at least 3.5 doesn't
work for a _different_ reason than 3.4 didn't work, so there's that.

I dunno if this is a qemu bug or a kernel bug, but I _do_ know that I'm
far more interested in getting it to work with existing software than
getting it to match the docs.

I hadn't reported this one yet because I still haven't root caused it,
just bisected to find the break. I know reverting the IRQ assignment
line in 3.5 doesn't fix it, which implies the "swizzle" bit is to blame
(which seems to have something to do with PCI bridges), and thus calling
the default function instead of calling no function breaks qemu's
versatile board emulation.
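
For reference, a minimal sketch of what the "swizzle" does, as far as I
understand the conventional formula: a device's INTx pin is rotated by its
slot number so that devices on the same bus spread across the four
interrupt lines. The names swizzle_pin() and map_irq_sketch() are
hypothetical, not the kernel's or qemu's. The slot-to-IRQ mapping only
follows the commit message above (pin 1 on slots 24..27 gives IRQs
27..30); the rotation for pins 2..4 is an assumption.

```c
#include <stdint.h>

/* Conventional PCI interrupt swizzle: rotate the INTx pin
 * (1 = INTA .. 4 = INTD) by the device's slot number. */
static uint8_t swizzle_pin(uint8_t slot, uint8_t pin)
{
    return (uint8_t)((((pin - 1u) + slot) % 4u) + 1u);
}

/* Hypothetical board mapping based on the commit message:
 * pin 1 on slots 24, 25, 26, 27 -> IRQs 27, 28, 29, 30.
 * Rotating by (pin - 1) for the other pins is an assumption. */
static int map_irq_sketch(uint8_t slot, uint8_t pin)
{
    return 27 + ((slot - 24 + pin - 1) & 3);
}
```

With this sketch, a map function that is always entered with slot=0 and
pin 1 would hand every device IRQ 27, which is the mismatch the commit
message describes.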

But that part isn't trivially reverted in 3.5 because they "cleaned up"
the code so you now get the default behavior without asking and I
haven't dug deep enough to figure out what's actually going on yet.

> If the hardware
> is documented to have a readable register here, QEMU should be fixed.
> If it is not then the kernel is buggy and should be fixed.

The sh4 Linux architecture is currently maintained by Renesas
developers, and they're just as good a maintainer of the dreamcast-era
stuff as Oracle is of Sun workstations and HP/Compaq was of the Microvax.

They have explicitly told me they aren't interested in anyone who isn't
a Renesas customer, and that since attempting to run sh4 under qemu or
on a dreamcast isn't what they're paid to support, they therefore do not
care about it.

Pointing out to them that they broke older versions of stuff has never
worked for me. You're welcome to try to engage them, but personally I
treat the sh4 changes as read only at this point and carry local patches
to revert the really stupid bits.

> As it happens, the "SH7214 Group, SH7216 Group User’s Manual: Hardware"
> (which I think is the right doc for this) says the register is r/w,
> so I think your suggested patch is correct.

Yay!

> (Aborting is a little unfriendly but our logging infrastructure
> for "guest did something wrong" is not great, unfortunately.)

My first fix was just replacing the abort with assigning the value to 0,
but that left debug output spewing a bit.
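
A rough sketch of the two approaches, not qemu's actual sh_serial code:
the register the guest reads gets a real readable case, and anything
still unimplemented returns 0 with a single warning instead of calling
abort(). The struct, function, and field names here are made up for
illustration; only the 0x18 offset for SCFCR comes from this thread.

```c
#include <stdint.h>
#include <stdio.h>

#define REG_SCFCR 0x18u /* offset quoted earlier in this thread */

/* Hypothetical device state, for illustration only. */
struct serial_sketch {
    uint16_t fcr;                 /* last value written to SCFCR */
    unsigned unimplemented_reads; /* reads that hit no known register */
};

static uint32_t serial_sketch_read(struct serial_sketch *s, uint32_t offset)
{
    switch (offset) {
    case REG_SCFCR:
        /* Documented as r/w, so reading it back is legitimate. */
        return s->fcr;
    default:
        /* Warn once rather than abort(), so the guest keeps running
         * and the log does not fill up with repeats. */
        if (s->unimplemented_reads++ == 0) {
            fprintf(stderr, "serial sketch: unimplemented read at 0x%x\n",
                    (unsigned)offset);
        }
        return 0;
    }
}
```

Counting the misses instead of printing each one is just one way to keep
the debug output from spewing.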

> There are an awful lot of "#if 0"s in that source file...

This one was put in by Thiemo Seufer. He didn't get to finish cleaning
up before he died.

I'm aware there's not a huge amount of interest in Linux on sh4, but my
aboriginal linux project is aimed at getting the same basic native
development environment working on as many different architectures as
possible, for things like automated cross-platform regression testing.
(I was thinking "userspace packages", but if I can set up a cron job
doing nightly builds of git maybe some of this crud would be caught
earlier. Pity I have NO free time. Writing this at work over lunch.)

So if qemu emulates it, I'm interested in getting a linux running on it
with a native compiler. Which means I'm perpetually out of my depth, but
oh well...

> -- PMM

Thanks for the review,

Rob
-- 
GNU/Linux isn't: Linux=GPLv2, GNU=GPLv3+, they can't share code.
Either it's "mere aggregation", or a license violation.  Pick one.


