[Toybox] [New Toys] - fstype, blkid

Conroy, Bradley Quentin bconroy at uis.edu
Wed Oct 9 00:40:07 PDT 2013


Thanks for the feedback Rob.

> Allow me to introduce you to aboriginal linux system images. Go to:

I've used it before with x86 qemu; I'll have to build qemu v1.5.

> My limiting factor here is actually lack of test filesystem images.

That was mine too, but now I have ext{2,3,4}, f2fs, minix, msdos, ntfs,
reiser3, squash{3,4} and vfat.  They compress down to a few kb with
bz2.

>Ok, convert tabs to two spaces and check that in.

Done.

>Ok, yank the typedef. Make function definitions match K&R like
>everybody else for the past 30 years, ala:

Damn, that's how I had it to begin with.

>You don't need #if CFG_BLKID because blkid.c only gets compiled if
>CFG_BLKID is enabled. (If the name of a *.c file under toys/ matches
>the name of a config symbol, the C file's inclusion is controlled by
>that config symbol.)

I modified it so blkid depends on fstype.

>You have an if() statement at the left edge, not indented at all within
>its function, and then the function ends with:
>...
>And the _reason_ that works is there's no curly bracket on the else so
>the write() belongs to the else but the putchar doesn't. Otherwise the
>function wouldn't end. Ouch.

I added {} to the fstype write.  The putchar is shared (blkid needs it
too), but I can add it to the end of the TYPE="" output and enclose the
putchar in the fstype section.

>The way to make an alias for a command is the OLDTOY() macro.
>...
>If you feed loopfiles() zero arguments, it reads from stdin. So calling
>blkid with no arguments hangs awaiting user input instead of printing
>its usage message. (Probably you don't want NULL optstring, you want
>"<1", at least for the moment.)

USE_FSTYPE(NEWTOY(fstype, "<1", TOYFLAG_BIN))
USE_BLKID(OLDTOY(blkid, fstype, OPTSTR_fstype, TOYFLAG_BIN))

> Squashfs? Hello?

I'll test it against your aboriginal images.

>By the way, in terms of your 64k buffer (66k buffer, actually): no sane
>filesystem is going to have its identifying info straddle 4k blocks, so
>we should be able to read 4k chunks and iterate over the list for
>offsets in range. (This even avoids lseek, although I'm not sure why
>that would be an issue...)

The NTFS label does, but it also has other "p\0r\0o\0b\0l\0e\0m\0s\0"
 - Add it as a config option?

>Right, continuing to clean this up until I can make it work. What the
>HECK is this nest of MATCH macros calling each other for? (That's where
>the type punned pointer warnings come from, anyway...) Ah, it's only
>used for ext2/3/4. Because treating ext2, ext3, and ext4 as three
>separate filesystems just wouldn't do.

I had everything already in #defines and all of the ops were bit ops or
casts, so a function didn't make sense at the time.  After considering
the different endianness and strict aliasing, I am thinking it may be
simpler to use: if (!memcmp(&toybuf, magic, sizeof(magic))) ... possibly
with the SWAP_LE*(x) macros (will have to test on qemu).
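
A minimal sketch of the two approaches mentioned above, assuming the
buffer holds the raw start of the device.  read_le() and magic_at() are
hypothetical helper names, not existing toybox functions.  Assembling the
value byte by byte sidesteps both the type-punned pointer warning and the
host endianness question:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assemble a little-endian value of 1..8 bytes one byte at a time.
 * No cast to a wider pointer type, so the result does not depend on
 * host endianness, alignment, or strict-aliasing rules. */
uint64_t read_le(const void *buf, unsigned size)
{
  const unsigned char *p = buf;
  uint64_t val = 0;
  unsigned i;

  for (i = 0; i < size && i < 8; i++) val |= (uint64_t)p[i] << (8*i);

  return val;
}

/* Compare raw bytes at an offset; magic is given in on-disk byte order. */
int magic_at(const unsigned char *buf, size_t off,
             const void *magic, size_t len)
{
  return !memcmp(buf+off, magic, len);
}
```

With the ext2/3/4 magic bytes 0x53 0xEF stored at offset 0x438,
read_le(buf+0x438, 2) gives 0xEF53 on any host, and
magic_at(buf, 0x438, "\x53\xEF", 2) is true.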

>You don't need to strcmp toys.which->name with "blkid", you can just
>compare the first character to 'b'. (There are only two options...)

Oops, that's kind of embarrassing.

>Alright, let's turn this giant stack of #defines and if/else staircase
>into a table with a loop iterating over it.

If it were just magic @ offsets, that would work, but then you have
ext2/3/4, which use the same magic at the same offset, and then there's
vfat/fat32/fat16/fat12/msdos, which have at least 14 different possible
magic values.
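
One possible way to keep the table and still handle a shared magic is an
optional per-entry refinement hook.  This is only a sketch: the struct,
the table, and the function names are made up for illustration, and the
ext feature-flag offsets (compat at 0x45c, incompat at 0x460) are my
reading of the ext superblock layout and should be checked against real
images:

```c
#include <stddef.h>
#include <string.h>

struct fs_magic {
  const char *name;
  size_t off;         /* byte offset of the magic within the image */
  const char *magic;  /* magic bytes in on-disk order */
  size_t len;
  /* optional hook to tell apart filesystems sharing one magic */
  const char *(*refine)(const unsigned char *buf, size_t len);
};

/* ext2/3/4 share a magic; feature bits separate them (assumed offsets) */
const char *refine_ext(const unsigned char *buf, size_t len)
{
  if (len <= 0x460) return "ext2";
  if (buf[0x460] & 0x40) return "ext4";  /* incompat: extents */
  if (buf[0x45c] & 0x04) return "ext3";  /* compat: has_journal */

  return "ext2";
}

static const struct fs_magic fs_table[] = {
  {"ext2",     0x438, "\x53\xEF", 2, refine_ext},
  {"squashfs", 0,     "hsqs",     4, NULL},
};

/* Returns the detected type name, or NULL if nothing in the table matched */
const char *detect_fs(const unsigned char *buf, size_t len)
{
  size_t i;

  for (i = 0; i < sizeof(fs_table)/sizeof(*fs_table); i++) {
    const struct fs_magic *fs = fs_table+i;

    if (fs->off+fs->len > len) continue;
    if (memcmp(buf+fs->off, fs->magic, fs->len)) continue;

    return fs->refine ? fs->refine(buf, len) : fs->name;
  }

  return NULL;
}
```

The FAT family would still need several rows or its own hook, since a
single offset/magic pair does not cover it.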

>Let's make the magic a uint64_t so we're not ignoring the second half
>of the btrfs magic you've got listed there, and let's just use the hex
>numbers like the kernel does, ala:
>
>fs/btrfs/ctree.h:#define BTRFS_MAGIC 0x4D5F53665248425FULL /* ascii
>_BHRfS_M, no null */

The superblock magic for btrfs in magic.h is at 64k IIRC, which brings
up a point: should I just #include <linux/magic.h>, or are we trying
to stay platform neutral?
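
For reference, a small sketch of how the 64-bit hex constant quoted above
relates to the ascii string in its comment.  put_le64() and
is_btrfs_magic() are hypothetical names; the only thing taken from the
kernel is the BTRFS_MAGIC value itself.  Stored little-endian, the
constant spells out the eight ascii bytes:

```c
#include <stdint.h>
#include <string.h>

#define BTRFS_MAGIC 0x4D5F53665248425FULL  /* value quoted from ctree.h */

/* Store a 64-bit value into out[0..7] in little-endian byte order */
void put_le64(uint64_t val, unsigned char *out)
{
  int i;

  for (i = 0; i < 8; i++) out[i] = (val >> (8*i)) & 0xff;
}

/* True if the 8 bytes at p match the on-disk (little-endian) magic */
int is_btrfs_magic(const unsigned char *p)
{
  unsigned char want[8];

  put_le64(BTRFS_MAGIC, want);

  return !memcmp(p, want, 8);
}
```

Comparing bytes this way needs no kernel header at build time, which is
one answer to the platform-neutral question, at the cost of copying the
constants by hand.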

I was splitting the fstype data apart from the blkid-only data so uuid and
label data would compile out if only fstype was configured.  So that would
leave us with:
{
  char * type;
  unsigned char magic[4];
  unsigned mag_off:12;
  unsigned uuid_off:12;
  unsigned lab_off:12;
  unsigned lab_sz:6;
  unsigned mag_sz:4;
  unsigned uuid_t:2;
}
That's only 14 bytes each, and only 4 of those bytes are extra for blkid,
but some types would need multiple entries.
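
A quick way to check the size estimate is to ask the compiler.  The
sketch below reuses the field list above under a made-up struct name
(fs_entry).  The 14 byte figure counts packed bits with a 4 byte pointer;
the real sizeof depends on the ABI and is likely to be rounded up by
alignment padding, so the helper reports it instead of assuming it:

```c
#include <stddef.h>

struct fs_entry {
  char *type;
  unsigned char magic[4];
  unsigned mag_off:12;
  unsigned uuid_off:12;
  unsigned lab_off:12;
  unsigned lab_sz:6;
  unsigned mag_sz:4;
  unsigned uuid_t:2;
};

/* Actual size on this compiler/ABI, padding included */
size_t fs_entry_size(void)
{
  return sizeof(struct fs_entry);
}

/* Largest offset a 12 bit field can represent */
unsigned max_12bit_off(void)
{
  struct fs_entry e = {0};

  e.mag_off -= 1;  /* wraps from 0 to the field's maximum value */

  return e.mag_off;
}
```

The 12 bit offsets top out at 4095, which fits the 4k buffer but means
anything that lives at 64k would need a separate mechanism.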

If I were a bit more clever, I could probably pull off layering a union of
padded structs over toybuf like:
union fs_t {
  struct adfs_t adfs;
  ...
  struct zfs_t zfs;
} *fs = (union fs_t *)toybuf;

and pad out each fs struct like:
struct adfs_t {
  unsigned char pad[0xc00];
  unsigned short magic;
  /* more padding and uuid, label if they exist */
};

Am I incorrect in assuming that this would take 0 extra RAM?

This would simplify things to:
read(fd, toybuf, sizeof(union fs_t));
if (fs->adfs.magic == ADFS_SUPER_MAGIC) {
  fstype = "adfs";
  if (CFG_BLKID) //...
} else if ... //similar for most of the rest
lseek, read, repeat

I already have most of this work done, but discarded the idea because
figuring out the padding for structs in 65k was a bit tedious, 4k is much
more manageable.
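
A minimal sketch of the overlay idea at 4k, with made-up struct names
(ext_sb, sqsh_sb) and only two filesystems.  The overlay types describe
bytes that already sit in the buffer and allocate nothing themselves, so
the zero extra RAM assumption looks right to me.  The magic fields are
byte arrays here on purpose: that keeps every struct at alignment 1 and
avoids reading a multi-byte integer in host byte order:

```c
#include <stddef.h>
#include <string.h>

struct ext_sb {             /* hypothetical overlay for ext2/3/4 */
  unsigned char pad[0x438];
  unsigned char magic[2];   /* 0x53 0xEF on disk */
};

struct sqsh_sb {            /* hypothetical overlay for squashfs */
  unsigned char magic[4];   /* "hsqs" on disk */
};

union fs_t {
  unsigned char raw[4096];  /* the read() target */
  struct ext_sb ext;
  struct sqsh_sb sqsh;
};

/* Returns a type name for the overlaid buffer, or NULL if unknown */
const char *overlay_fstype(const union fs_t *fs)
{
  if (!memcmp(fs->ext.magic, "\x53\xEF", 2)) return "ext";
  if (!memcmp(fs->sqsh.magic, "hsqs", 4)) return "squashfs";

  return NULL;
}
```

offsetof(struct ext_sb, magic) can be checked against the intended offset
at build time, which takes some of the tedium out of getting the padding
right.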

>Hmmm, you have a CRAMFS_MAGIC2 but your code doesn't seem to be using
>it. (The if is using a MATCH() macro instead of MATCH2().) Ah, the
>kernel header says that's the same number at the other endianness.

I actually meant to use MATCH2 for that.

>If JFS isn't even in /usr/linux/include/magic.h is it really an
>important filesystem to autodetect?

My process at 5AM is pretty random.  I was actually meaning to find jffs,
but by the time I realized they weren't the same thing it was already
done.

>For NTFS, you have 8 as the label length (well, -8) but toutf8 fills
>out a 16 byte buffer? (And it doesn't actually have a length, it just
>keeps going until it hits a null terminator which there's no guarantee
>the file will have...)

$#@7 all the '\0's.  After adapting a simple strcpy, I will add i<16 &&
to the loop condition.
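
A bounded version of that copy could look like the sketch below.
utf16le_to_ascii() is a hypothetical name, and replacing non-ascii units
with '?' is an assumption about what is acceptable here, not something
decided in this thread:

```c
#include <stddef.h>

/* Copy up to max_units UTF-16LE code units from src into dst as ascii.
 * Stops at a NUL unit or after max_units, whichever comes first, so a
 * label with no terminator cannot run off the end of the buffer.
 * Units outside the ascii range are written as '?'.
 * dst must have room for max_units+1 bytes.  Returns chars written. */
size_t utf16le_to_ascii(const unsigned char *src, size_t max_units,
                        char *dst)
{
  size_t i;

  for (i = 0; i < max_units; i++) {
    unsigned unit = src[2*i] | (src[2*i+1] << 8);

    if (!unit) break;
    dst[i] = (unit < 0x80) ? (char)unit : '?';
  }
  dst[i] = 0;

  return i;
}
```

Called with max_units of 16 and a 17 byte destination, this reads at most
32 source bytes whether or not the label is terminated.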

>Also, the NTFS label isn't _really_ alternating ascii and NUL bytes.
>It's horrible 16 bit wide character stuff that involves "codepages" and
>actually displaying labels from japan or korea just isn't going to work
>here. (Doing full windows internationalization isn't an option either.
>The question is, does the special case for ascii make sense or should
>we just not support labels here at all? I'm balancing "2/3 of the
>planet does not speak english" with "does android care about legacy
>windows crap that's this generation's version of punched cards?" Eh, I
>guess "windows was english only, the future is UTF8" is a reasonable
>compromise...)
>
>However, add to that the fact that ntfs is the only filesystem that has
>a label in a different 4k block than the ID info, and special casing
>this really sounds like more trouble than it's worth. Are there a lot
>of thumb drives formatted NTFS out in the wild? (I'll add code to deal
>with a real world problem, my question is whether this is a real world
>problem? No idea.)
>
>Also... ntfs has an 8 bit uuid? What? (It's the only one that does...)

I don't know if wprintf would work for that or not, never really used it.
Since I am nixing the buf for toybuf and ntfs spans over 4k, I think I'll
leave out the ntfs label chunks in the next revision and plan to add it
as a config option.  I think there are 1 or 2 others that use 8 byte uuids.

>Hang on, this thing doesn't identify vfat? (Which most external USB
>devices are formatted with?) Hmmm, I know microsoft's documentation
>says not to use the "FAT16" and "FAT32" strings for filesystem
>identification, but I don't care.

If you are good with that, I am.  That's what I have as a stub, but
there are a ton of other checks that seem superfluous and that I haven't
had the time/resources to track down.  All of my thumb drives seem
to work with that.

>Ok, printing out the uuid there's three different possible bit-patterns
>for where the "-" go, one for 16 (the default), one for 4 (vfat), and
>one for 8 (ntfs, no dashes). I think rather than having a separate uuid
>length field that's usually 16 I'll encode the non-16 values in the top
>few bits of the offset, since I've got an int. (Offset already won't
>fit in a short.)

I get tripped up with bswap*, but all those %X values could probably be
reduced to 4 (or 5 I guess, unless there is a 6 byte type)
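
One way to collapse the separate format strings is a single loop driven
by a dash bitmask.  format_id() and its parameters are hypothetical, and
lowercase hex is an arbitrary choice.  Whether a given filesystem needs
its bytes reversed is left to the caller, since that still has to be
confirmed against test images:

```c
#include <stdio.h>

/* Write len raw bytes to out as hex, inserting '-' before each byte
 * index whose bit is set in dashes (bit i set: dash before byte i).
 * If reverse is nonzero, bytes are emitted last to first.
 * out needs room for 2*len + (number of dashes) + 1 bytes. */
void format_id(const unsigned char *id, int len, unsigned dashes,
               int reverse, char *out)
{
  int i;

  for (i = 0; i < len; i++) {
    if (dashes & (1u << i)) *out++ = '-';
    out += sprintf(out, "%02x", id[reverse ? len-1-i : i]);
  }
  *out = 0;
}
```

A mask of 0x550 gives the usual 8-4-4-4-12 grouping for 16 bytes, 0x4
gives the 4-4 grouping for a 4 byte id, and 0 gives 8 bytes with no
dashes.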

>Hmmm, in testing FAT's uuid bytes are presented in reverse order from
>the tool ubuntu's using. But ext2 isn't...

>Need test images. Lots and lots of test images...

I will attach a 30kb tarball with the ones I have in a separate post.

>> blkid does output for all devices if 0 args -> read /proc/partitions?
>
>Possibly. (You can run the other one under strace to see what it's
>doing.)

strace says it is looking in /dev and running getdents64... I am guessing
that even embedded is using devtmpfs these days, but I know puppy
linux (some versions) still has a static /dev, so blkid takes a while.  My
gut says /proc/partitions is the better way but then again procfs is a
configurable option.  blkid /dev/sd* (or whatever) should work for them. 
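
If /proc/partitions does turn out to be the way to go, each data line is
"major minor #blocks name", so a sketch of the per-line parse could be as
small as this.  partition_line() is a hypothetical name, and mapping the
name straight to /dev/<name> assumes a flat /dev:

```c
#include <stdio.h>

/* Parse one line of /proc/partitions ("major minor #blocks name").
 * Returns 1 and writes "/dev/<name>" to dev for a data line, or 0 for
 * the header and blank lines.  dev must hold at least 70 bytes. */
int partition_line(const char *line, char *dev)
{
  unsigned major, minor;
  unsigned long long blocks;
  char name[64];

  if (sscanf(line, "%u %u %llu %63s", &major, &minor, &blocks, name) != 4)
    return 0;
  sprintf(dev, "/dev/%s", name);

  return 1;
}
```

Each resulting path could then be handed to the same per-file code that
runs when devices are named on the command line.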

Rob
