From: "Zach, Yoav"

I work in a group that is enabling the IA-32 Execution Layer (http://www.intel.com/pressroom/archive/releases/20040113comp.htm) on Linux. In a few words, this is a dynamic translator for IA-32 binaries on the IPF platform. Following David Mosberger's advice, we use the binfmt_misc mechanism to invoke the translator whenever the user tries to exec an IA-32 binary.

The EL is meant to help in the migration path from IA-32 to IPF. From our beta customers we learnt that, at the first stage, they tend to keep their environment mostly intact, using the legacy IA-32 binaries. Such an environment naturally has setuid and non-readable binaries. It would be useless to ask the administrator to change the settings of such an environment: some of them are very complex, and the administrators are reluctant to make any changes to a system that has already proved itself to be robust and secure.

So, our target with these patches is not to enhance the support for scripts but rather to allow a translator to be integrated into a working environment that is not (and should not be) aware of the fact that it is being emulated.

As I said before, it is practically hopeless to expect an administrator of such a system to change it so that it suits the current behavior of binfmt_misc. But even if we could do that, I'm not sure it would be a good idea. These changes are likely to be less secure than the suggested patches:

- In order to execute non-readable binaries, the binary would have to be made readable, which is obviously less secure than allowing only a trusted translator to read it.
- There would be no way for the translator to calculate the accurate AT_SECURE value for the translated process. This might end up with the translated process running in a non-secured mode when it actually needs to be secured.

I prepared a patch that solves a couple of problems that interpreters have when invoked via binfmt_misc.
Currently:

1) such interpreters cannot open non-readable binaries
2) the processes will have their credentials and security attributes calculated according to the interpreter's permissions and not those of the original binary

The proposed patch solves these problems by:

1) opening the binary on behalf of the interpreter and passing its fd, instead of the path, as argv[1] to the interpreter
2) calling prepare_binprm with the file struct of the binary and not that of the interpreter

The new functionality is enabled by adding a special flag to the registration string. If this flag is not added, the old behavior is unchanged.

A preliminary version of this patch was sent to the list on 9/1/2003 with the title "[PATCH]: non-readable binaries - binfmt_misc 2.6.0-test4". This new version addresses the concerns that were raised about that patch, except for calling unshare_files() before allocating a new fd. This is because that feature has not entered 2.6 yet.

Arun Sharma says:

We were going through an internal review of this patch:

http://marc.theaimsgroup.com/?l=linux-kernel&m=107424598901720&w=2

which is in your tree already. I'm not sure if this line of code got sufficient review.

+	/* call prepare_binprm before switching to interpreter's file
+	 * so that all security calculation will be done according to
+	 * binary and not interpreter */
+	retval = prepare_binprm(bprm);

The case that concerns me is: unprivileged interpreter and a privileged binary. One can use binfmt_misc to execute untrusted code (interpreter) with elevated privileges. One could argue that all binfmt_misc interpreters are trusted, because only root can register them. But that's a change from the traditional behavior of binfmt_misc (and binfmt_script).
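For illustration only, here is a user-space sketch of how an interpreter might consume the descriptor that the patch passes in argv[1] (the patch formats it as a decimal string). Nothing below is part of the patch: the helper name parse_binary_fd is hypothetical, and validating the descriptor with fcntl(F_GETFD) is an assumption about what a careful interpreter would want to do, not something the kernel side requires.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: convert argv[1], as passed
 * by binfmt_misc when the binary is opened on the interpreter's behalf,
 * into a file descriptor.  Returns the descriptor, or -1 if the string is
 * not a plain decimal number naming a currently open descriptor.
 */
int parse_binary_fd(const char *arg)
{
	char *end;
	long val;

	/* Reject NULL, empty strings, signs, whitespace and paths up front. */
	if (arg == NULL || *arg < '0' || *arg > '9')
		return -1;

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno != 0 || *end != '\0' || val > INT_MAX)
		return -1;

	/* F_GETFD fails with EBADF when the descriptor is not open. */
	if (fcntl((int)val, F_GETFD) == -1)
		return -1;

	return (int)val;
}
```

An interpreter registered without the new flag still receives a path in argv[1], so a caller would presumably treat a -1 result as "this is a path, not a descriptor" and fall back to open().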
(Update): Arun pointed out that calculating the process credentials according to the binary that needs to be translated is a bit risky, since it requires the administrator to pay extra attention not to register an interpreter that is not intended to run with root credentials. After discussing this issue with him, I would like to propose a modified patch:

The old patch did 2 things: 1) open the binary for reading and 2) calculate the credentials according to the binary. I removed the riskier part, the change to the credentials calculation, so the revised patch only opens the binary for reading. It also includes a few words of warning in the description of the 'open-binary' feature in binfmt_misc.txt, and makes the function entry_status print the flags in use.

As for the 'credentials' part of the patch, I will prepare a separate patch for it, send it to the LKML again, describe the problem and ask for people's comments.

---

 25-akpm/Documentation/binfmt_misc.txt |   23 ++++-
 25-akpm/fs/binfmt_misc.c              |  141 ++++++++++++++++++++++++++++------
 25-akpm/fs/exec.c                     |    4 
 25-akpm/include/linux/binfmts.h       |    4 
 4 files changed, 146 insertions(+), 26 deletions(-)

diff -puN Documentation/binfmt_misc.txt~non-readable-binaries Documentation/binfmt_misc.txt
--- 25/Documentation/binfmt_misc.txt~non-readable-binaries	Mon May  3 15:04:21 2004
+++ 25-akpm/Documentation/binfmt_misc.txt	Mon May  3 15:04:21 2004
@@ -15,7 +15,7 @@ First you must mount binfmt_misc:
 	mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
 
 To actually register a new binary type, you have to set up a string looking like
-:name:type:offset:magic:mask:interpreter: (where you can choose the ':' upon
+:name:type:offset:magic:mask:interpreter:flags (where you can choose the ':' upon
 your needs) and echo it to /proc/sys/fs/binfmt_misc/register.
 Here is what the fields mean:
  - 'name' is an identifier string. A new /proc file will be created with this
@@ -34,6 +34,21 @@ Here is what the fields mean:
    The mask is anded with the byte sequence of the file.
  - 'interpreter' is the program that should be invoked with the binary as first
    argument (specify the full path)
+ - 'flags' is an optional field that controls several aspects of the invocation
+   of the interpreter. It is a string of capital letters, each controls a certain
+   aspect. The following flags are supported -
+      'P' - preserve-argv[0]. Legacy behavior of binfmt_misc is to overwrite the
+            original argv[0] with the full path to the binary. When this flag is
+            included, binfmt_misc will add an argument to the argument vector for
+            this purpose, thus preserving the original argv[0].
+      'O' - open-binary. Legacy behavior of binfmt_misc is to pass the full path
+            of the binary to the interpreter as an argument. When this flag is
+            included, binfmt_misc will open the file for reading and pass its
+            descriptor as an argument, instead of the full path, thus allowing
+            the interpreter to execute non-readable binaries. This feature should
+            be used with care - the interpreter has to be trusted not to emit
+            the contents of the non-readable binary.
+
 
 There are some restrictions:
  - the whole register string may not exceed 255 characters
@@ -83,9 +98,9 @@ If you want to pass special arguments to
 write a wrapper script for it. See Documentation/java.txt for an
 example.
 
-Your interpreter should NOT look in the PATH for the filename; the
-kernel passes it the full filename to use. Using the PATH can cause
-unexpected behaviour and be a security hazard.
+Your interpreter should NOT look in the PATH for the filename; the kernel
+passes it the full filename (or the file descriptor) to use. Using $PATH can
+cause unexpected behaviour and can be a security hazard.
 
 
 There is a web page about binfmt_misc at
diff -puN fs/binfmt_misc.c~non-readable-binaries fs/binfmt_misc.c
--- 25/fs/binfmt_misc.c~non-readable-binaries	Mon May  3 15:04:21 2004
+++ 25-akpm/fs/binfmt_misc.c	Mon May  3 15:04:21 2004
@@ -39,6 +39,7 @@ static int enabled = 1;
 
 enum {Enabled, Magic};
 #define MISC_FMT_PRESERVE_ARGV0 (1<<31)
+#define MISC_FMT_OPEN_BINARY (1<<30)
 
 typedef struct {
 	struct list_head list;
@@ -102,10 +103,15 @@ static Node *check_file(struct linux_bin
 static int load_misc_binary(struct linux_binprm *bprm, struct pt_regs *regs)
 {
 	Node *fmt;
-	struct file * file;
+	struct file * interp_file = NULL;
+	struct file * binary_file = NULL;
 	char iname[BINPRM_BUF_SIZE];
 	char *iname_addr = iname;
 	int retval;
+	int fd_binary = -1;
+	char fd_str[32];
+	char * fdsp = fd_str;
+	int is_open_bin;
 
 	retval = -ENOEXEC;
 	if (!enabled)
@@ -120,33 +126,91 @@ static int load_misc_binary(struct linux
 	if (!fmt)
 		goto _ret;
 
-	allow_write_access(bprm->file);
-	fput(bprm->file);
-	bprm->file = NULL;
+	is_open_bin = (fmt->flags & MISC_FMT_OPEN_BINARY) ? 1 : 0;
+
+	if (is_open_bin) {
+		/* if the binary should be opened on behalf of the
+		 * interpreter than keep it open and assign descriptor
+		 * to it */
+		fd_binary = get_unused_fd ();
+		if (fd_binary < 0) {
+			retval = fd_binary;
+			goto _ret;
+		}
+		snprintf (fd_str, sizeof(fd_str) - 1, "%d", fd_binary);
+	} else {
+		allow_write_access (bprm->file);
+		fput (bprm->file);
+		bprm->file = NULL;
+	}
 
 	/* Build args for interpreter */
 	if (!(fmt->flags & MISC_FMT_PRESERVE_ARGV0)) {
 		remove_arg_zero(bprm);
 	}
-	retval = copy_strings_kernel(1, &bprm->interp, bprm);
-	if (retval < 0) goto _ret;
-	bprm->argc++;
-	retval = copy_strings_kernel(1, &iname_addr, bprm);
-	if (retval < 0) goto _ret;
-	bprm->argc++;
+
+	if (is_open_bin) {
+		/* make argv[1] be the file descriptor of the binary */
+		retval = copy_strings_kernel (1, &fdsp, bprm);
+	} else {
+		/* make argv[1] be the path to the binary */
+		retval = copy_strings_kernel (1, &bprm->interp, bprm);
+	}
+	if (retval < 0)
+		goto _error;
+	bprm->argc ++;
+	retval = copy_strings_kernel (1, &iname_addr, bprm);
+	if (retval < 0)
+		goto _error;
+	bprm->argc ++;
 	bprm->interp = iname;	/* for binfmt_script */
 
-	file = open_exec(iname);
-	retval = PTR_ERR(file);
-	if (IS_ERR(file))
-		goto _ret;
-	bprm->file = file;
+	interp_file = open_exec (iname);
+	retval = PTR_ERR (interp_file);
+	if (IS_ERR (interp_file))
+		goto _error;
+
+	binary_file = bprm->file;
+	bprm->file = interp_file;
 
 	retval = prepare_binprm(bprm);
-	if (retval >= 0)
-		retval = search_binary_handler(bprm, regs);
+
+	if (retval < 0)
+		goto _error;
+
+	if (is_open_bin) {
+		/* if the binary is not readable than enforce mm->dumpable=0
+		   regardless of the interpreter's permissions */
+		if (permission (binary_file->f_dentry->d_inode, MAY_READ, NULL)) {
+			bprm->interp_flags |= BINPRM_FLAGS_ENFORCE_NONDUMP;
+		}
+		/* install the binary's fd. it is done at the latest possible point
+		 * because once it is installed it will need to be sys_close()ed
+		 * in case of error.
+		 */
+		fd_install (fd_binary, binary_file);
+	}
+
+	retval = search_binary_handler (bprm, regs);
+
+	if (retval < 0)
+		goto _error_close_file;
+
 _ret:
 	return retval;
+
+_error_close_file:
+	if (fd_binary > 0) {
+		sys_close (fd_binary);
+		fd_binary = -1;
+		bprm->file = NULL;
+	}
+_error:
+	if (fd_binary > 0)
+		put_unused_fd (fd_binary);
+	bprm->interp_flags = 0;
+	goto _ret;
+
 }
 
 /* Command parsers */
@@ -191,6 +255,29 @@ static int unquote(char *from)
 	return p - from;
 }
 
+static inline char * check_special_flags (char * sfs, Node * e)
+{
+	char * p = sfs;
+	int cont = 1;
+
+	/* special flags */
+	while (cont) {
+		switch (*p) {
+			case 'P':
+				p++;
+				e->flags |= MISC_FMT_PRESERVE_ARGV0;
+				break;
+			case 'O':
+				p++;
+				e->flags |= MISC_FMT_OPEN_BINARY;
+				break;
+			default:
+				cont = 0;
+		}
+	}
+
+	return p;
+}
 /*
  * This registers a new binary format, it recognises the syntax
  * ':name:type:offset:magic:mask:interpreter:'
@@ -293,10 +380,8 @@ static Node *create_entry(const char *bu
 	if (!e->interpreter[0])
 		goto Einval;
 
-	if (*p == 'P') {
-		p++;
-		e->flags |= MISC_FMT_PRESERVE_ARGV0;
-	}
+
+	p = check_special_flags (p, e);
 
 	if (*p == '\n')
 		p++;
@@ -346,6 +431,7 @@ static void entry_status(Node *e, char *
 {
 	char *dp;
 	char *status = "disabled";
+	const char * flags = "flags: ";
 
 	if (test_bit(Enabled, &e->flags))
 		status = "enabled";
@@ -357,6 +443,19 @@ static void entry_status(Node *e, char *
 
 	sprintf(page, "%s\ninterpreter %s\n", status, e->interpreter);
 	dp = page + strlen(page);
+
+	/* print the special flags */
+	sprintf (dp, "%s", flags);
+	dp += strlen (flags);
+	if (e->flags & MISC_FMT_PRESERVE_ARGV0) {
+		*dp ++ = 'P';
+	}
+	if (e->flags & MISC_FMT_OPEN_BINARY) {
+		*dp ++ = 'O';
+	}
+	*dp ++ = '\n';
+
+
 	if (!test_bit(Magic, &e->flags)) {
 		sprintf(dp, "extension .%s\n", e->magic);
 	} else {
diff -puN fs/exec.c~non-readable-binaries fs/exec.c
--- 25/fs/exec.c~non-readable-binaries	Mon May  3 15:04:21 2004
+++ 25-akpm/fs/exec.c	Mon May  3 15:04:21 2004
@@ -838,7 +838,8 @@ int flush_old_exec(struct linux_binprm *
 	flush_thread();
 
 	if (bprm->e_uid != current->euid || bprm->e_gid != current->egid ||
-	    permission(bprm->file->f_dentry->d_inode,MAY_READ, NULL))
+	    permission(bprm->file->f_dentry->d_inode,MAY_READ, NULL) ||
+	    (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP))
 		current->mm->dumpable = 0;
 
 	/* An exec changes our domain. We are no longer part of the thread
@@ -1087,6 +1088,7 @@ int do_execve(char * filename,
 	bprm.file = file;
 	bprm.filename = filename;
 	bprm.interp = filename;
+	bprm.interp_flags = 0;
 	bprm.sh_bang = 0;
 	bprm.loader = 0;
 	bprm.exec = 0;
diff -puN include/linux/binfmts.h~non-readable-binaries include/linux/binfmts.h
--- 25/include/linux/binfmts.h~non-readable-binaries	Mon May  3 15:04:21 2004
+++ 25-akpm/include/linux/binfmts.h	Mon May  3 15:04:21 2004
@@ -35,9 +35,13 @@ struct linux_binprm{
 	char * interp;		/* Name of the binary really executed. Most
 				   of the time same as filename, but could be
 				   different for binfmt_{misc,script} */
+	unsigned long interp_flags;
 	unsigned long loader, exec;
 };
 
+#define BINPRM_FLAGS_ENFORCE_NONDUMP_BIT 0
+#define BINPRM_FLAGS_ENFORCE_NONDUMP (1 << BINPRM_FLAGS_ENFORCE_NONDUMP_BIT)
+
 /*
  * This structure defines the functions that are used to load the binary formats that
  * linux accepts.

_
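As a usage note for the extended registration syntax (:name:type:offset:magic:mask:interpreter:flags), the following user-space sketch assembles such a string for a magic-type ('M') entry with empty offset and mask fields. It is an illustration under stated assumptions, not part of the patch: the helper name build_register_string and the sample name, magic and interpreter path used with it are hypothetical. The only rules it encodes are the ones stated in this posting, namely that the flags are the capital letters 'P' and 'O' and that the whole string may not exceed 255 characters.

```c
#include <stdio.h>
#include <string.h>

#define REGISTER_MAX 255	/* limit stated in binfmt_misc.txt */

/*
 * Hypothetical helper, not part of the patch: format a registration
 * string for a magic-type entry with default offset and mask.
 * Returns the string length, or -1 if the flags contain anything other
 * than 'P' and 'O', or if the result does not fit.
 */
int build_register_string(char *buf, size_t size, const char *name,
			  const char *magic, const char *interpreter,
			  const char *flags)
{
	int n;

	/* Only the two flags described in the patch are accepted here. */
	if (strspn(flags, "PO") != strlen(flags))
		return -1;

	n = snprintf(buf, size, ":%s:M::%s::%s:%s",
		     name, magic, interpreter, flags);
	if (n < 0 || (size_t)n >= size || n > REGISTER_MAX)
		return -1;

	return n;
}
```

The resulting string would then be written to /proc/sys/fs/binfmt_misc/register by root. A real entry for IA-32 binaries would need a more specific magic and mask than a bare ELF signature, which is why the values above are left to the caller.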