[Toybox] implementation of stat for toybox

Rob Landley rob at landley.net
Fri Dec 7 16:14:48 PST 2012


On 12/06/2012 04:24:59 AM, Kamlendu Shekhar wrote:
> Hi Rob,
> 
> I tried to implement stat command as per toybox's standards.
> Please find it as attachment.
> 
> It has been tested on x86 CentOS 32 bit and ubuntu 64 bit systems.  
> Option
> "-Z" has not been implemented as of now. I am working on that.
> 
> Can you please have a look if it can be accepted by community.
> Please share your comment if you want me to change it.

Well first of all I compiled it and it didn't work, possibly because  
this C file isn't including sys/vfs.h and neither is toys.h so there's  
no struct statfs defined. I can fix that up, I'm just wondernig how it  
worked for you. (Centos headers auto-including stuff from each other, I  
guess?)

Let's see:

------quoted attachment "stat.c"------
> /* vi: set sw=4 ts=4
> *
> * stat.c : display file or file system status
> * anand.sinha85 at gmail.com
> * Copyright 2012 <warior.linux at gmail.com>
> *

What's the first email address for? And neither actually has your name  
(which is kinda important in a copyright statement, it's saying who  
it's by).

> * See  
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/stat.html

File not found.

According to the roadmap this one is from the 'development' group  
meaning it's in neither posix nor LSB but linux from scratch needs it  
to finish building. That's why
http://landley.net/toybox/status.html links to a man page for it, not a  
standards page.

> USE_STAT(NEWTOY(stat, "LZfc", TOYFLAG_BIN))
> config STAT
>     bool st
>     default y
>     help
>         Usage: stat [OPTION] FILE...
>         display file or file system status
>         -Z, --context
>              print the security context information if available

Toybox hasn't got any selinux support yet. I don't know if selinux  
support is even a good idea. (I note it's creeped into android,  
presumably because the idiots at Red Hat gave somebody money and they  
overrode all the engineers.)

SELINUX is a bit like spraying a boat with WD-40 and calling it a  
submarine.

>         -f, --file-system
>             display file system status instead of file status
>         -c  --format=FORMAT
>             use the specified FORMAT instead of the default; output a  
> newline after each use of FORMAT
>         --help display this help and exit
>         The valid format sequences for files (without --file-system):
>         %a     Access rights in octal
>         %A     Access rights in human readable form
>         %b     Number of blocks allocated (see
>         %B     The size in bytes of each block reported by
>         %d     Device number in decimal
>         %D     Device number in hex
>         %f     Raw mode in hex
>         %F     File type
>         %G     Group name of owner
>         %h     Number of hard links
>         %i     Inode number
>         %n     File name
>         %N     Quoted file name with dereference if symbolic link
>         %o     I/O block size
>         %s     Total size, in bytes
>         %t     Major device type in hex
>         %T     Minor device type in hex
>         %u     User ID of owner
>         %U     User name of owner
>         %x     Time of last access
>         %X     Time of last access as seconds since Epoch
>         %y     Time of last modification
>         %Y     Time of last modification as seconds since Epoch
>         %z     Time of last change
>         %Z     Time of last change as seconds since Epoch
> */

Remarkably complicated command.

> #define FOR_stat
> #include "toys.h"
> #define SIZE_DATE_TIME_STAT 36
> #define access_string(x, s, i)  if((x&7) & 1)           \
>                                 s[9 - i * 3] = 'x';       \
>                            else                             \
>                                s[9 - i * 3] = '-';             \
>                            if(((x&7) >> 1) & 1)                 \
>                                s[9 - (i * 3 + 1)] = 'w';            \
>                            else                     \
>                                s[9 - (i * 3 + 1)] = '-';            \
>                            if(((x&7) >> 2) & 1)     \
>                                s[9 - (i * 3 + 2)] = 'r';            \
>                            else                     \
>                                s[9 - (i * 3 + 2)] = '-';

I am not happy with the existence of this macro.

I'm also confused by the use of it: get_access_str() defines a local  
variable with the same name as the macro (this apparently works because  
the macro only triggers when it's got parentheses after it), and then  
there's exactly one use of the macro, so it might as well just be  
inline...?

> static char * check_type_file(mode_t, size_t);
> static char * get_access_str(unsigned long, mode_t);
> static char * date_stat_format(time_t );
> static int do_stat(const char *);
> static int do_statfs(const char *);
> inline void print_stat_format(char *, int);
> 
> GLOBALS(
> 	char * access_str;
> 	char * file_type;
> 	struct passwd * user_name; struct group * group_name; struct tm  
> *time_toy;
> 	struct stat * toystat;
> 	struct statfs * toystatfs;
> 	int toy_obj_file_arg;
> )
> 
> 
> static int do_stat(const char * file_name){
>     TT.toystat = (struct stat*)malloc(sizeof(struct stat));
>     if(stat(file_name, TT.toystat) < 0){
>         perror_msg("Error: unable to get information about the file,  
> stat\n", file_name);

I note that perror_msg() will do something like:

   printf("%s:%s:%s", command_name, your_message, strerror(errno))

So you get something like ":invalid access" appended to the end  
(appropriately localized into the native language). This means we  
probably don't have to say we couldn't access the file because there's  
going to be a message about that, we just have to give the filename.

I've been keeping the hardwired english messages intentionally terse  
because of this, and trying to stick to words like "bad" which are A)  
extremely short, B) easy to remember even for a non-english speaker, C)  
repeated often enough that people will only have to look up a small  
number of words.

>         toys.exitval = EXIT_FAILURE;
>     }
>     return 0;
> }
> 
> static int do_statfs(const char * file_name){
>     TT.toystatfs = (struct statfs *)malloc(sizeof(struct statfs));
>     if(statfs(file_name, TT.toystatfs) < 0){
>         perror_msg("Error: unable to get information about the file,  
> statfs\n", file_name);

Same as above. And it's more useful to say the filename here if there's  
three or four on the command line and only one fails, or if it fails in  
a script...

   perror_msg("statfs '%s'", file_name);

(In this case the command has an awful lot of hardwired english text  
built into it.

>         toys.exitval = EXIT_FAILURE;
>     }
>     return 0;
> }
> 
> static char * check_type_file(mode_t mode, size_t size){
>     if(S_ISREG(mode)){
>         if(size)
>             return "regular file";
>             return "regular empty file";
>     }
>     if(S_ISDIR(mode))
>         return "directory";
>     if(S_ISCHR(mode))
>         return "character device";
>     if(S_ISBLK(mode))
>         return "block device";
>     if(S_ISFIFO(mode))
>         return "FIFO (named pipe)";
>     if(S_ISLNK(mode))
>         return "symbolic link";
>     if(S_ISSOCK(mode))
>         return "socket";
> }

I'm tempted to try to reuse the endtype() code from ls here, possibly  
have something in lib/lib.c that returns a type number 0-6 and then  
have ls use that to index a string of "/@*|=" and have this use an  
array of char **blah ={"file", "directory"...

But that's cleanup for later...

> static char * get_access_str(unsigned long pernission, mode_t mode){
>     static char access_string[10];
>     int i;
>     if(S_ISDIR(mode))
>         access_string[0] = 'd';
>     else
>         access_string[0] = '-';
>     for(i = 0; i < 3; i++){
>         access_string(pernission >> (i * 3) & 7, access_string, i);
>     }
>     access_string[10] = '\0';
>     return access_string;
> }

Mentioned above. And now that I look more closely, this is the same  
general mode logic as ls -l so I should definitely move that to  
lib/lib.c and combine the uses...

> static char * date_stat_format(time_t time){
>     static char buf[SIZE_DATE_TIME_STAT];
>     strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S.000000000",  
> localtime(&time));
>     return buf;
> }

The host implementation doesnt' do .000000000, it actually shows  
nanoseconds. I believe that's %N these days.

> inline void print_stat_format(char *format, int flag){
>     format++;
>     switch(*format){
>         case 'a':
>             if(flag)
>                 xprintf("%lu\n", TT.toystatfs->f_bavail);
>             else
>                 xprintf("%04lo\n",TT.toystat->st_mode &  
> (S_ISUID|S_ISGID|S_ISVTX|S_IRWXU|S_IRWXG|S_IRWXO));
>             break;
>         case 'A':
>             xprintf("%s\n",TT.access_str);
>             break;
>         case 'b':
>             if(flag)
>                 xprintf("%lu\n", TT.toystatfs->f_blocks);
>             else
>                 xprintf("%llu\n", TT.toystat->st_blocks);
>             break;
>         case 'B':
>             xprintf("%lu\n", TT.toystat->st_blksize);
>             break;
>         case 'c':
>             if(flag)
>                 xprintf("%lu\n", TT.toystatfs->f_files);
>             break;
>         case 'C':
>             xprintf("Currently feature is not supported\n");
>             break;
>         case 'd':
>             if(flag)
>                 xprintf("%lu\n", TT.toystatfs->f_ffree);
>             else
>                 xprintf("%ldd\n", TT.toystat->st_dev);
>             break;
>         case 'D':
>             xprintf("%llxh\n", TT.toystat->st_dev);
>             break;
>         case 'f':
>             if(flag)
>                 xprintf("%lu\n", TT.toystatfs->f_bfree);
>             else
>                 xprintf("%lx\n", TT.toystat->st_mode);
>             break;
>         case 'F':
>             xprintf("%s\n", TT.file_type);
>             break;
>         case 'g':
>             xprintf("%lu\n", TT.toystat->st_uid);
>             break;
>         case 'G':
>             xprintf("%8s\n", TT.user_name->pw_name);
>             break;
>         case 'h':
>             xprintf("%lu\n", TT.toystat->st_nlink);
>             break;
>         case 'i':
>             if(flag)
>                 xprintf("%d%d\n", TT.toystatfs->f_fsid.__val[0],  
> TT.toystatfs->f_fsid.__val[1]);
>             else
>                 xprintf("%llu\n", TT.toystat->st_ino);
>             break;
>         case 'l':
>             if(flag)
>                 xprintf("need to implement\n");
>             break;
>         case 'n':
>             xprintf("%s\n", toys.optargs[TT.toy_obj_file_arg]);
>             break;
>         case 'N':
>             xprintf("`%s\n'", toys.optargs[TT.toy_obj_file_arg]);
>             break;
>         case 'o':
>             xprintf("%lu\n", TT.toystat->st_blksize);
>             break;
>         case 's':
>             if(flag)
>                 xprintf("%d\n", TT.toystatfs->f_frsize);
>             else
>                 xprintf("%llu\n", TT.toystat->st_size);
>             break;
>         case 'S':
>             if(flag)
>                 xprintf("%d\n", TT.toystatfs->f_bsize);
>             break;
>         case 't':
>             if(flag)
>                 xprintf("%lx\n", TT.toystatfs->f_type);
>             break;
>         case 'T':
>             if(flag)
>                 xprintf("Needs to be implemented\n");
>             break;
>         case 'u':
>             xprintf("%lu\n", TT.toystat->st_uid);
>             break;
>         case 'U':
>             xprintf("%8s\n", TT.user_name->pw_name);
>             break;
>         case 'x':
>             xprintf("%s\n", date_stat_format(TT.toystat->st_atime));
>             break;
>         case 'X':
>             xprintf("%llu\n", TT.toystat->st_atime);
>             break;
>         case 'y':
>             xprintf("%s\n", date_stat_format(TT.toystat->st_mtime));
>             break;
>         case 'Y':
>             xprintf("%llu\n", TT.toystat->st_mtime);
>             break;
>         case 'z':
>             xprintf("%s\n", date_stat_format(TT.toystat->st_ctime));
>             break;
>         case 'Z':
>             xprintf("%llu\n", TT.toystat->st_ctime);
>         default:
>             xprintf("%c\n", *format);
>             break;
>     }
>     exit(0);
> }

Ok, there has _got_ to be a way to turn that into table driven logic.

But again, that's a later cleanup.

Um... exit(0)? That ignores toys.exitval, the stat command can take  
more than one input file, and... I think that's leftover debug code?

> void stat_main(void){
>     int stat_flag_Z = 0, stat_flag_f = 0, stat_flag_c = 0,  
> stat_format = 0;
>     if(toys.optargs){
>         if(toys.optflags & 1){

There are automatically generated FLAG_ macros now. See commit 674,  
which included an update to code.html describing them:

   http://landley.net/hg/toybox/rev/674#l55.71

>             stat_flag_c = 1;
>             TT.toy_obj_file_arg = 1;
>             stat_format = 1;
>         }
>         if(toys.optflags & (1 << 1)){
>             stat_flag_f = 1;
>             if(do_statfs(toys.optargs[TT.toy_obj_file_arg]) != 0)
>                 xprintf("Error STATFS\n");
>         }else
>             if(do_stat(toys.optargs[TT.toy_obj_file_arg]) != 0)
>                 xprintf("Error STAT\n");
>         if(toys.optflags & (1 << 2)){
>             stat_flag_Z = 1;
>             xprintf("SELinux feature has not been implemented so  
> far..\n");
>         }
>     }

These xprintf() calls A) go to stdout instead of stderr (the x part  
just aborts if stdout can't accept the output due to -EPIPE or device  
full or something), B) don't set the error return value, C) don't stop  
the program.

I've pondered having perror_msg() set the error return value itself, I  
just haven't yet because there might be instances where we _don't_ want  
that. It's one of those "at some point I need to spend half an hour on  
this" things that tend to accumulate on my todo list...

> // function to check the type/mode of file
>     if(!stat_flag_f){
>         TT.file_type = check_type_file(TT.toystat->st_mode,  
> TT.toystat->st_size);
> // check user and group name
>         TT.user_name = getpwuid(TT.toystat->st_uid);
>         TT.group_name = getgrgid(TT.toystat->st_gid);
> // function to get access in human readable format
>         TT.access_str = get_access_str((TT.toystat->st_mode &  
> (S_ISUID|S_ISGID|S_ISVTX|S_IRWXU|S_IRWXG|S_IRWXO)),  
> TT.toystat->st_mode);
>         TT.time_toy = gmtime(&(TT.toystat->st_atime));
>     }

Why are these globals if you're just going to print them later in the  
same function?

>     if(!(stat_flag_f |stat_flag_Z)){
>         if(stat_format)
>             print_stat_format(toys.optargs[0], stat_flag_f);
>         xprintf(" File: `%s'\n", toys.optargs[TT.toy_obj_file_arg]);
>         xprintf(" Size: %llu\t Blocks: %llu\t IO Blocks: %lu\t",  
> TT.toystat->st_size, TT.toystat->st_blocks, TT.toystat->st_blksize);
>         xprintf("%s\n", TT.file_type);
>         xprintf("Device: %llxh\t Inode: %llu\t Links: %lu\n",  
> TT.toystat->st_dev, TT.toystat->st_ino, TT.toystat->st_nlink);
>         xprintf("Access: (%04lo/%s)\tUid: (%lu/%8s)\tGid:  
> (%lu/%8s)\n", (TT.toystat->st_mode &  
> (S_ISUID|S_ISGID|S_ISVTX|S_IRWXU|S_IRWXG|S_IRWXO)), TT.access_str,  
> TT.toystat->st_uid, TT.user_name->pw_name, TT.toystat->st_gid,  
> TT.group_name->gr_name);
>         xprintf("Access: %s\nModify: %s\nChange: %s\n",  
> date_stat_format(TT.toystat->st_atime),  
> date_stat_format(TT.toystat->st_mtime),  
> date_stat_format(TT.toystat->st_ctime));
>     }else if(stat_flag_f){
>         // implementation of statfs -f, file system
>         if(stat_format)
>             print_stat_format(toys.optargs[0], stat_flag_f);

First two lines of both if statements are identical, might as well move  
them outside the if and have one. (I'm aware that stat_flag_Z means  
produce no output, but I'm not sure that's intentional.)

>         xprintf(" File: \"%s\"\n", toys.optargs[TT.toy_obj_file_arg]);

Also the same as above, unless the single quote vs double quote is  
significant?

>         xprintf("   ID: %d%d Namelen: %ld    Type: %lx\n",  
> TT.toystatfs->f_fsid.__val[0], TT.toystatfs->f_fsid.__val[1],  
> TT.toystatfs->f_namelen, TT.toystatfs->f_type);
>         xprintf("Block Size: %d    Fundamental block size: %d\n",  
> TT.toystatfs->f_bsize, TT.toystatfs->f_frsize);
>         xprintf("Blocks: Total: %lu\t", TT.toystatfs->f_blocks);
>         xprintf("Free: %lu\t", TT.toystatfs->f_bfree);
>         xprintf("Available: %lu\n", TT.toystatfs->f_bavail);
>         xprintf("Inodes: Total: %lu\t", TT.toystatfs->f_files);
>         xprintf("\tFree: %d\n", TT.toystatfs->f_ffree);

I suspect agglomerating this into a single xprintf() might be more  
readable.

We're grabbing stuff out of files called __val? Really? (Checks man  
page... Ok, __SWORD_TYPE is _long_. There's a standard for this, LP64,  
that says long and pointer are the same on all Linux systems, which is  
what this does. [Writes email to Michael kerrisk to correct man  
page...])

Rob
 1354925688.0


More information about the Toybox mailing list