[Toybox] [New Toys] - fstype, blkid
Rob Landley
rob at landley.net
Tue Oct 8 10:20:52 PDT 2013
Catching up by <strike>burning the candle</strike> reading the email at
both ends...
On 10/07/2013 06:06:47 AM, Conroy, Bradley Quentin wrote:
> I finally figured out the NTFS labels after reading a rant on how
> UTF-8 rocks
> and how MS switched to UTF16 or UCS1 or whatever.
I read that article. (It's a small twitter stream... :)
> The reason I couldn't grep for the label (mine was "myntfs") was
> that it is stored as "m\0y\0n\0t\0f\0s\0\0" - found another good
> use for hexdump :)
I should add it to toybox. And make -C mode the default (ala diff -u).
And make it share code with hexedit and possibly od.
(My first todo item in that area is figuring out why od gets the
indentation wrong.)
> Notes:
> I only have x86 to test on,
Allow me to introduce you to aboriginal linux system images. Go to:
http://landley.net/aboriginal/bin
Download a system-image of your choice (mips and powerpc are big
endian), extract the tarball, run "./dev-environment.sh", and at the
shell prompt wget source and compile it.
(Note that mips networking is broken with qemu 1.6; you'd need to use
qemu 1.5 for that. Should work on powerpc though.)
More random documentation-like stuff at
http://landley.net/aboriginal/about.html
> so there are a couple of places that may need bswap_{16,32} for
> endianness.
My limiting factor here is actually lack of test filesystem images.
> I used a 65k buf instead of toybuf (4k) for simplicity, but tried to
> organize
> it for toybuf if wanted.
Half the file is #defines, and then the first line of actual C code is
a typedef. There may be some more extensive modifications coming than
that.
Ok, convert tabs to two spaces and check that in.
Oh wow. You're making me pull out my tab conversion sed. Haven't used
that in a while...
Ok, yank the typedef. Make function definitions match K&R like
everybody else for the past 30 years, ala:
type function(args)
{
}
(Yes, we don't put the opening brace on its own line anywhere else, but
that's because this is creating a new function and anywhere else isn't.)
You don't need #if CFG_BLKID because blkid.c only gets compiled if
CFG_BLKID is enabled. (If the name of a *.c file under toys/ matches
the name of a config symbol, the C file's inclusion is controlled by
that config symbol.)
You have an if() statement at the left edge, not indented at all within
its function, and then the function ends with:
}else /* fstype */
write(1,fstype,strlen(fstype)); /* avoid printf overhead in
fstype */
putchar('\n');
}
And the _reason_ that works is there's no curly bracket on the else so
the write() belongs to the else but the putchar doesn't. Otherwise the
function wouldn't end. Ouch.
The way to make an alias for a command is the OLDTOY() macro.
If you feed loopfiles() zero arguments, it reads from stdin. So calling
blkid with no arguments hangs awaiting user input instead of printing
its usage message. (Probably you don't want NULL optstring, you want
"<1", at least for the moment.)
Let's see, what have I got lying around:
$ ./toybox blkid ~/qemu/images/tccboot.iso
$ ./toybox blkid ~/qemu/images/rh9.img
$
iso9660 it doesn't know but ext2 it _should_. Oh, duh, that one's a
partitioned image, and it doesn't recognize the partition table. Let's
see...
$ ./toybox blkid ~/system-image-armv5l/hda.sqf
$
Squashfs? Hello?
Sigh. What did I break? Check the previous version... that didn't work
either, and all I did to that was delete the fstype at the end that was
breaking the build. Ah, maybe the "type punned pointer" warnings
actually matter with this compiler version? Lemme build for i686...
Nope, _still_ not identifying squashfs.
By the way, in terms of your 64k buffer (66k buffer, actually): no sane
filesystem is going to have its identifying info straddle 4k blocks, so
we should be able to read 4k chunks and iterate over the list for
offsets in range. (This even avoids lseek, although I'm not sure why
that would be an issue...)
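A minimal sketch of the chunked scan described above, assuming a hypothetical signature table (struct sig and scan_chunk() are illustrative names, not anything in the submitted blkid.c). Each signature is only tested when it lies wholly inside the chunk currently in memory, so the caller can read() 4k at a time and never seek:

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 4096

/* Hypothetical signature entry: names and fields are illustrative only. */
struct sig {
  const char *name;
  long offset;        /* absolute byte offset of the magic in the image */
  const char *magic;  /* bytes expected at that offset */
  size_t len;         /* number of magic bytes to compare */
};

/* Return the first table entry whose magic lies wholly inside the chunk
 * that starts at file position pos (got bytes valid), or NULL if none. */
const struct sig *scan_chunk(const unsigned char *chunk, long pos, size_t got,
                             const struct sig *tab, size_t count)
{
  size_t i;

  for (i = 0; i < count; i++) {
    long rel = tab[i].offset - pos;

    /* Skip signatures that start before or run past this chunk. */
    if (rel < 0 || (size_t)rel + tab[i].len > got) continue;
    if (!memcmp(chunk + rel, tab[i].magic, tab[i].len)) return tab + i;
  }

  return NULL;
}
```

The caller would loop read(fd, buf, CHUNK), advance pos by the amount read, and stop at the first non-NULL result or once pos passes the largest offset in the table.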
Right, continuing to clean this up until I can make it work. What the
HECK is this nest of MATCH macros calling each other for? (That's where
the type punned pointer warnings come from, anyway...) Ah, it's only
used for ext2/3/4. Because treating ext2, ext3, and ext4 as three
separate filesystems just wouldn't do.
You don't need to strcmp toys.which->name with "blkid", you can just
compare the first character to 'b'. (There are only two options...)
Alright, let's turn this giant stack of #defines and if/else staircase
into a table with a loop iterating over it. Let's make the magic a
uint64_t so we're not ignoring the second half of the btrfs magic
you've got listed there, and let's just use the hex numbers like the
kernel does, ala:
fs/btrfs/ctree.h:#define BTRFS_MAGIC 0x4D5F53665248425FULL /* ascii
_BHRfS_M, no null */
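Roughly what such a table could look like, as a sketch only: the struct layout and function names are made up for illustration, and the offsets (ext2 at 1080, squashfs at 0, btrfs at 65600) are from memory of the on-disk formats, so check them against the kernel headers before trusting them. Assembling the value byte by byte also sidesteps the type punned pointer warnings and works the same on big endian hosts:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: field names are illustrative only. */
struct fstype {
  const char *name;
  uint64_t magic;     /* magic number as the kernel headers write it */
  int magic_len;      /* bytes of magic stored on disk, 1..8 */
  long magic_offset;  /* byte offset from the start of the image */
};

/* Offsets are assumptions from memory, verify before relying on them. */
static const struct fstype fstypes[] = {
  {"ext2", 0xEF53, 2, 1080},
  {"squashfs", 0x73717368, 4, 0},
  {"btrfs", 0x4D5F53665248425FULL, 8, 65600},
};

/* Assemble len bytes as a little endian integer, independent of host
 * endianness and without casting the buffer to a wider pointer type. */
static uint64_t peek_le(const unsigned char *p, int len)
{
  uint64_t val = 0;
  int i;

  for (i = 0; i < len; i++) val |= (uint64_t)p[i] << (8 * i);

  return val;
}

/* Return the name of the first matching table entry, or NULL. */
const char *identify(const unsigned char *buf, size_t buflen)
{
  size_t i;

  for (i = 0; i < sizeof(fstypes) / sizeof(*fstypes); i++) {
    const struct fstype *fs = fstypes + i;

    if ((size_t)fs->magic_offset + fs->magic_len > buflen) continue;
    if (peek_le(buf + fs->magic_offset, fs->magic_len) == fs->magic)
      return fs->name;
  }

  return NULL;
}
```

Adding a filesystem is then one more line in the table instead of another #define plus another rung on the if/else staircase.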
Hmmm, you have a CRAMFS_MAGIC2 but your code doesn't seem to be using
it. (The if is using a MATCH() macro instead of MATCH2().) Ah, the
kernel header says that's the same number at the other endianness.
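For what it's worth, a tiny sketch of checking one magic at both endiannesses without a second #define. The helper names are made up; 0x28cd3d45 is the cramfs magic as I remember it from the kernel's magic.h:

```c
#include <stdint.h>

#define CRAMFS_MAGIC 0x28cd3d45u

/* Reverse the byte order of a 32 bit value. */
uint32_t swap32(uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0xff00u) | ((x << 8) & 0xff0000u) | (x << 24);
}

/* Accept the magic in either byte order (hypothetical helper). */
int is_cramfs(uint32_t raw)
{
  return raw == CRAMFS_MAGIC || swap32(raw) == CRAMFS_MAGIC;
}
```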
If JFS isn't even in /usr/include/linux/magic.h, is it really an
important filesystem to autodetect?
For NTFS, you have 8 as the label length (well, -8) but toutf8 fills
out a 16 byte buffer? (And it doesn't actually have a length, it just
keeps going until it hits a null terminator which there's no guarantee
the file will have...)
Also, the NTFS label isn't _really_ alternating ascii and NUL bytes.
It's horrible 16 bit wide character stuff that involves "codepages", and
actually displaying labels from Japan or Korea just isn't going to work
here. (Doing full windows internationalization isn't an option either.
The question is, does the special case for ascii make sense or should
we just not support labels here at all? I'm balancing "2/3 of the
planet does not speak english" with "does android care about legacy
windows crap that's this generation's version of punched cards?" Eh, I
guess "windows was english only, the future is UTF8" is a reasonable
compromise...)
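A sketch of what the "ascii only" compromise might look like with an explicit length, so it can't run off the end of the buffer looking for a terminator. The function name and the '?' substitution for non-ascii units are my assumptions here, not what the submitted toutf8 does:

```c
#include <stddef.h>

/* Convert at most in_units UTF-16LE code units to ascii, stopping early at
 * a NUL unit. Anything outside 7 bit ascii becomes '?'. The output is
 * always NUL terminated when out_len is nonzero. Returns the number of
 * characters written, not counting the terminator. */
size_t utf16le_to_ascii(char *out, size_t out_len,
                        const unsigned char *in, size_t in_units)
{
  size_t i, n = 0;

  if (!out_len) return 0;

  for (i = 0; i < in_units && n + 1 < out_len; i++) {
    unsigned unit = in[2 * i] | ((unsigned)in[2 * i + 1] << 8);

    if (!unit) break;
    out[n++] = (unit < 0x80) ? (char)unit : '?';
  }
  out[n] = 0;

  return n;
}
```

Both the input and the output are bounded, so a label field with no terminator just gets truncated at in_units.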
However, add to that the fact that ntfs is the only filesystem that has
a label in a different 4k block than the ID info, and special casing
this really sounds like more trouble than it's worth. Are there a lot
of thumb drives formatted NTFS out in the wild? (I'll add code to deal
with a real world problem, my question is whether this is a real world
problem? No idea.)
Also... ntfs has an 8 byte uuid? What? (It's the only one that does...)
Hang on, this thing doesn't identify vfat? (Which most external USB
devices are formatted with?) Hmmm, I know microsoft's documentation
says not to use the "FAT16" and "FAT32" strings for filesystem
identification, but I don't care.
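A sketch of the string check, assuming the usual boot sector layout where the filesystem type field sits at byte 54 for FAT16 and byte 82 for FAT32 (those offsets are from memory of the FAT spec, and fat_kind() is a made-up name):

```c
#include <stddef.h>
#include <string.h>

/* Return "fat16", "fat32", or NULL by looking for the type string in the
 * boot sector. Offsets 54 and 82 are assumptions, see above. */
const char *fat_kind(const unsigned char *sector, size_t len)
{
  if (len < 90) return NULL;
  if (!memcmp(sector + 54, "FAT16", 5)) return "fat16";
  if (!memcmp(sector + 82, "FAT32", 5)) return "fat32";

  return NULL;
}
```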
Ok, printing out the uuid, there are three different possible patterns
for where the "-" go: one for 16 bytes (the default), one for 4 (vfat),
and one for 8 (ntfs, no dashes). I think rather than having a separate uuid
length field that's usually 16 I'll encode the non-16 values in the top
few bits of the offset, since I've got an int. (Offset already won't
fit in a short.)
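One way the packing could go, as a sketch: the top two bits of the offset say which uuid length applies, with zero meaning the default 16, so ordinary table entries are just plain offsets. All the names are invented for illustration, and it assumes unsigned int is at least 32 bits:

```c
/* Illustrative names; assumes unsigned int is at least 32 bits. */
#define UUID_KIND_SHIFT 30
#define UUID_OFF_MASK ((1u << UUID_KIND_SHIFT) - 1)

enum uuid_kind { UUID_16 = 0, UUID_4 = 1, UUID_8 = 2 };

/* Store the uuid kind in the top two bits, the offset in the rest. */
unsigned pack_uuid_field(unsigned offset, enum uuid_kind kind)
{
  return ((unsigned)kind << UUID_KIND_SHIFT) | (offset & UUID_OFF_MASK);
}

/* Recover the byte offset of the uuid. */
unsigned uuid_field_offset(unsigned packed)
{
  return packed & UUID_OFF_MASK;
}

/* Recover the uuid length in bytes (0 for the unused fourth kind). */
int uuid_field_len(unsigned packed)
{
  static const int lens[4] = {16, 4, 8, 0};

  return lens[(packed >> UUID_KIND_SHIFT) & 3];
}
```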
Hmmm, in testing, FAT's uuid bytes come out in reverse order from what
the tool ubuntu's using prints. But ext2's don't...
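A sketch of a formatter covering the three dash patterns, with a reverse flag to experiment with the byte order question. Whether reversing is actually the right answer for FAT is exactly what the test images need to settle, so treat the flag (and the lowercase hex) as assumptions; format_uuid() is a hypothetical name:

```c
#include <stdio.h>

/* Write uuid as hex into out (needs room for 37 bytes). Dashes go after
 * bytes 4, 6, 8, and 10 for a 16 byte uuid, after byte 2 for a 4 byte one,
 * and nowhere for 8. If reverse is set, bytes are emitted last to first.
 * Returns the number of characters written, not counting the terminator. */
int format_uuid(char *out, const unsigned char *uuid, int len, int reverse)
{
  int i, n = 0;

  out[0] = 0;
  for (i = 0; i < len; i++) {
    unsigned byte = uuid[reverse ? len - 1 - i : i];

    if ((len == 16 && i >= 4 && i <= 10 && !(i & 1)) || (len == 4 && i == 2))
      out[n++] = '-';
    n += sprintf(out + n, "%02x", byte);
  }

  return n;
}
```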
Need test images. Lots and lots of test images...
> I have info on more fs types, to patch with after review.
I don't know what fs types count as "interesting". You have BFS which
isn't in /usr/include/linux/magic.h, but don't have fat16 or fat32.
> blkid does output for all devices if 0 args -> read /proc/partitions?
Possibly. (You can run the other one under strace to see what it's
doing.)
Rob