Mctrain's Blog

What I learned in IT, as well as thought about life

Read/Write from/to File in Linux Kernel


Excerpt from here

  • The kernel is not a process. A file-descriptor needs a process-context for it to mean anything. Otherwise how would the kernel keep your STDIN_FILENO separate from somebody else’s STDIN_FILENO?
  • Coding a kernel module is not like coding a user-mode program. You should never write a module that requires reading or writing to any logical device.
  • If you need to get “outside world” information into your module, it’s easy. Your module can have code for open(), read(), write(), ioctl(), and close(). A user-mode program can open() the device and perform any kind of device-specific ioctl() (or read or write or whatever) that it wants.
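The user-space half of that arrangement can be sketched as follows. This is a minimal illustration, assuming the module registers a character device; the helper name read_from_device and the node /dev/mymodule mentioned below are hypothetical. The descriptor is opened and used inside an ordinary process, which is exactly the context a file descriptor needs.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to buflen - 1 bytes from a device node
 * and NUL-terminate the result.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_from_device(const char *path, char *buf, size_t buflen)
{
    ssize_t n;
    int fd;

    if (buflen == 0)
        return -1;

    fd = open(path, O_RDONLY);      /* reaches the driver's open() */
    if (fd < 0)
        return -1;

    n = read(fd, buf, buflen - 1);  /* reaches the driver's read() */
    close(fd);                      /* reaches the driver's release() */
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

A caller would use it as read_from_device("/dev/mymodule", buf, sizeof(buf)), with the path replaced by whatever node the module actually creates.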

That said, it is possible to do file I/O in the kernel, but doing so is a severe violation of standard practice. It is also complicated and can lead to races and crashes if, for instance, a file is removed while your module has it open.


Excerpt from here

The most common problem is interpreting the data. Writing a file interpreter from within the kernel is a process ripe for problems, and any errors in that interpreter can cause devastating crashes. Also, any errors in the interpreter could cause buffer overflows. These might allow unprivileged users to take over a machine or get access to protected data, such as password files.
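To make the overflow risk concrete, here is a small sketch of the bounds checking even a trivial interpreter needs. The "key=value" format and the function name parse_kv are assumptions chosen for illustration, not something from the excerpt. Every length is checked before a byte is copied; dropping any one of those checks is the kind of mistake that becomes a kernel buffer overflow.

```c
#include <stddef.h>
#include <string.h>

/*
 * Parse one "key=value" line into caller-supplied buffers.
 * Returns 0 on success, -1 if the line is malformed or a field
 * would not fit. Nothing is written past keysz or valsz bytes.
 */
int parse_kv(const char *line, char *key, size_t keysz,
             char *val, size_t valsz)
{
    const char *eq = strchr(line, '=');
    size_t klen, vlen;

    if (eq == NULL || keysz == 0 || valsz == 0)
        return -1;

    klen = (size_t)(eq - line);
    vlen = strcspn(eq + 1, "\r\n");   /* value stops at end of line */

    /* Reject rather than truncate: each field needs room for its NUL. */
    if (klen == 0 || klen >= keysz || vlen >= valsz)
        return -1;

    memcpy(key, line, klen);
    key[klen] = '\0';
    memcpy(val, eq + 1, vlen);
    val[vlen] = '\0';
    return 0;
}
```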

Another big issue with trying to read a file from within the kernel is trying to figure out exactly where the file is. Linux supports filesystem namespaces, which allow every process to contain its own view of the filesystem. This allows some programs to see only portions of the entire filesystem, while others see the filesystem in different locations. This is a powerful feature, and trying to determine that your module lives in the proper filesystem namespace is an impossible task.

But How Do I Configure Things?

A common way of sending data to a specific kernel module is to use a char device and the ioctl system call.

The ioctl command, however, has been determined to have a lot of nasty side effects, and creating new ioctls in the kernel is generally frowned upon. Also, properly handling a 32-bit user-space program making an ioctl call into a 64-bit kernel, and converting all of the data types correctly, is a horrible task to undertake.
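The 32-bit versus 64-bit problem comes down to structure layout. The sketch below uses two made-up argument structures (cfg_native and cfg_fixed are hypothetical names) to show why: the first changes size with the caller's ABI, while the second uses fixed-width fields so that both sides agree on every offset.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Layout depends on the ABI: on typical Linux targets, long and
 * pointers are 4 bytes in a 32-bit process and 8 bytes in a 64-bit one.
 */
struct cfg_native {
    int   mode;
    long  length;
    void *buffer;
};

/*
 * Same layout for 32-bit and 64-bit callers: fixed-width fields,
 * explicit padding, and the pointer carried as a 64-bit integer.
 */
struct cfg_fixed {
    uint32_t mode;
    uint32_t reserved;   /* keeps the 64-bit fields at offsets 8 and 16 */
    uint64_t length;
    uint64_t buffer;     /* user pointer stored as an integer */
};

size_t cfg_native_size(void) { return sizeof(struct cfg_native); }
size_t cfg_fixed_size(void)  { return sizeof(struct cfg_fixed); }
```

A driver that accepts something like cfg_native has to translate it for 32-bit callers; one that accepts something like cfg_fixed does not.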

Because new ioctls are discouraged, the /proc filesystem can be used to get configuration data into the kernel.

By writing data to a file in the filesystem created by the kernel module, the kernel module has direct access to it. Recently, though, the proc filesystem has been clamped down on by the kernel developers, as it was horribly abused by programmers over time to contain almost any type of data. Slowly this filesystem is being cleaned up to contain only process information, such as the names of filesystem states.

For a more structured filesystem, the sysfs filesystem provides a way for any device and any driver to create files to which configuration data may be sent.

This interface is preferred over ioctls and using /proc.
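From user space, sending a value through such a file is an ordinary text write. The sketch below assumes the driver exposes an attribute that accepts one decimal integer; the helper name write_int_attr is hypothetical, and the attribute path depends entirely on the driver.

```c
#include <stdio.h>

/*
 * Write one decimal integer, followed by a newline, to an attribute file.
 * Returns 0 on success, -1 on error.
 */
int write_int_attr(const char *path, long value)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;

    if (fprintf(f, "%ld\n", value) < 0) {
        fclose(f);
        return -1;
    }

    /* fclose() flushes the buffer, so a failed write surfaces here. */
    return fclose(f) == 0 ? 0 : -1;
}
```

No ioctl numbers or binary structures are involved, which is a large part of why this interface is easier to get right.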

I Want to Do This Anyway

The common approach to reading a file is to try code that looks like the following:

fd = sys_open(filename, O_RDONLY, 0);
if (fd >= 0) {
  /* read the file here */
  sys_close(fd);
}

However, when this is tried within a kernel module, the sys_open() call usually returns the error -EFAULT.

The main thing the author forgot to take into consideration is that the kernel expects the pointer passed to sys_open() to come from user space. It therefore checks that the pointer lies in the proper address space before converting it to a kernel pointer that the rest of the kernel can use. When we pass a kernel pointer to the function instead, the check fails and the error -EFAULT occurs.

Fixing the Address Space

To handle this address space mismatch, use the functions get_fs() and set_fs(). These functions modify the current process address limits to whatever the caller wants. In the case of sys_open(), we want to tell the kernel that pointers from within the kernel address space are safe, so we call:

set_fs(KERNEL_DS);

The only two valid options for the set_fs() function are KERNEL_DS and USER_DS, roughly standing for kernel data segment and user data segment, respectively.
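The effect of the address limit can be pictured with a toy user-space model. This is not kernel code: every name below (toy_set_fs, toy_sys_open, the TOY_ constants) and the address boundary are made up for illustration. The model only shows why a kernel pointer is rejected under the user limit and accepted once the limit is raised.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Toy model: addresses below TOY_USER_TOP count as "user space",
 * everything above as "kernel space".
 */
#define TOY_USER_TOP  UINT64_C(0x0000800000000000)
#define TOY_USER_DS   TOY_USER_TOP
#define TOY_KERNEL_DS UINT64_MAX

static uint64_t toy_addr_limit = TOY_USER_DS;

uint64_t toy_get_fs(void)           { return toy_addr_limit; }
void     toy_set_fs(uint64_t limit) { toy_addr_limit = limit; }

/* Stand-in for the check a system call makes on a pointer argument. */
int toy_sys_open(uint64_t filename_addr)
{
    if (filename_addr >= toy_addr_limit)
        return -EFAULT;   /* pointer lies outside the allowed range */
    return 3;             /* pretend file descriptor */
}
```

Saving the old limit before raising it, and restoring it afterwards, matters because the limit belongs to the current process and stays in force for every later system call it makes.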

So, with this knowledge, the proper way to write the above code snippet is:

old_fs = get_fs();
set_fs(KERNEL_DS);
fd = sys_open(filename, O_RDONLY, 0);
if (fd >= 0) {
  /* read the file here */
  sys_close(fd);
}
set_fs(old_fs);

Putting it together, a complete module that reads a file and dumps it to the kernel log looks like this:

#include <linux/kernel.h> 
#include <linux/init.h> 
#include <linux/module.h> 
#include <linux/syscalls.h> 
#include <linux/fcntl.h> 
#include <asm/uaccess.h> 

static void read_file(char *filename)
{
  int fd;
  char buf[1];

  mm_segment_t old_fs = get_fs();
  set_fs(KERNEL_DS);

  fd = sys_open(filename, O_RDONLY, 0);
  if (fd >= 0) {
    printk(KERN_DEBUG);
    while (sys_read(fd, buf, 1) == 1)
      printk("%c", buf[0]);
    printk("\n");
    sys_close(fd);
  }
  set_fs(old_fs);
}

static int __init init(void)
{
  read_file("/etc/shadow");
  return 0;
}

static void __exit exit(void)
{ }

MODULE_LICENSE("GPL");
module_init(init);
module_exit(exit);

But What about Writing?

Now, armed with this newfound knowledge of how to abuse the kernel system call API and annoy a kernel programmer at the drop of a hat, you really can push your luck and write to a file from within the kernel. Fire up your favorite editor, and pound out something like the following:

old_fs = get_fs();
set_fs(KERNEL_DS);

fd = sys_open(filename, O_WRONLY|O_CREAT, 0644);
if (fd >= 0) {
  sys_write(fd, data, strlen(data));
  sys_close(fd);
}
set_fs(old_fs);

The code seems to build properly, with no compile time warnings, but when you try to load the module, you get this odd error:

insmod: error inserting 'evil.ko': -1 Unknown symbol in module

This means that a symbol your module is trying to use has not been exported and is not available in the kernel. By looking at the kernel log, you can determine what symbol that is:

evil: Unknown symbol sys_write

So, even though the function sys_write is present in the syscalls.h header file, it is not exported for use in a kernel module.

By reading the code of how the sys_write function is implemented, the lack of the exported symbol can be thwarted. The following kernel module shows how this can be done by not using the sys_write call:

#include <linux/kernel.h> 
#include <linux/init.h> 
#include <linux/module.h> 
#include <linux/syscalls.h> 
#include <linux/file.h> 
#include <linux/fs.h>
#include <linux/fcntl.h> 
#include <asm/uaccess.h> 

static void write_file(char *filename, char* data)
{
  struct file *file;
  loff_t pos = 0;
  int fd;

  mm_segment_t old_fs = get_fs();
  set_fs(KERNEL_DS);

  fd = sys_open(filename, O_WRONLY|O_CREAT, 0644);

  if (fd >= 0) {
    file = fget(fd);
    if (file) {
      vfs_write(file, data, strlen(data), &pos);
      fput(file);
    }
    sys_close(fd);
  }
  set_fs(old_fs);
}

static int __init init(void)
{
  write_file("/tmp/test", "Evil file.\n");
  return 0;
}

static void __exit exit(void)
{ }

MODULE_LICENSE("GPL");
module_init(init);
module_exit(exit);

As you can see, by using the functions fget, fput and vfs_write, we can implement our own sys_write functionality.
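The division of labour between the three calls can be modelled in plain user-space C. The sketch below is a toy, not kernel code: all toy_ names and structures are invented for illustration. It shows the shape of the operation: look up the descriptor in the calling process's table and take a reference, write through the file object, then drop the reference.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FDS 4

/* Toy stand-in for the kernel's struct file. */
struct toy_file {
    int    refcount;    /* references currently held on this file */
    size_t pos;         /* current file position */
    size_t len;         /* bytes stored so far */
    char   data[64];    /* the "file contents" */
};

/* Each process owns its own descriptor table. */
struct toy_process {
    struct toy_file *fds[TOY_MAX_FDS];
};

/* Look up a descriptor in one process's table and take a reference. */
struct toy_file *toy_fget(struct toy_process *p, int fd)
{
    if (fd < 0 || fd >= TOY_MAX_FDS || p->fds[fd] == NULL)
        return NULL;
    p->fds[fd]->refcount++;
    return p->fds[fd];
}

/* Drop the reference taken by toy_fget(). */
void toy_fput(struct toy_file *f)
{
    f->refcount--;
}

/* Copy count bytes into the file at *pos and advance *pos. */
long toy_vfs_write(struct toy_file *f, const char *buf,
                   size_t count, size_t *pos)
{
    if (*pos > sizeof(f->data) || count > sizeof(f->data) - *pos)
        return -ENOSPC;
    memcpy(f->data + *pos, buf, count);
    *pos += count;
    if (*pos > f->len)
        f->len = *pos;
    return (long)count;
}

/* The write system call reduced to its three steps. */
long toy_sys_write(struct toy_process *p, int fd,
                   const char *buf, size_t count)
{
    long ret;
    struct toy_file *f = toy_fget(p, fd);

    if (f == NULL)
        return -EBADF;
    ret = toy_vfs_write(f, buf, count, &f->pos);
    toy_fput(f);
    return ret;
}
```

The lookup takes a process as its first argument, which mirrors the earlier point that a descriptor number means nothing without a process context.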


Example from here

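#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/slab.h>     /* kmalloc(), kfree() */
#include <linux/string.h>   /* strlen() */
#include <linux/uaccess.h>  /* get_fs(), set_fs() */
#include <linux/mm.h>

/* Targets older kernels: set_fs() and the f_dentry alias have since been removed. */

MODULE_LICENSE("GPL");
MODULE_AUTHOR("kenthy@163.com.");
MODULE_DESCRIPTION("kernel study and test.");


void fileread(const char *filename)
{
  struct file *filp;
  struct inode *inode;
  mm_segment_t fs;
  loff_t fsize;
  ssize_t n;
  char *buf;
  unsigned long magic;

  printk(KERN_ALERT "start....\n");
  filp = filp_open(filename, O_RDONLY, 0);
  if (IS_ERR(filp)) {
    /* filp_open() returns an error pointer, never NULL, on failure */
    printk(KERN_ALERT "open error...\n");
    return;
  }
  inode = filp->f_dentry->d_inode;

  magic = inode->i_sb->s_magic;
  printk(KERN_ALERT "file system magic:%lu \n", magic);
  printk(KERN_ALERT "super blocksize:%lu \n", inode->i_sb->s_blocksize);
  printk(KERN_ALERT "inode %lu \n", inode->i_ino);
  fsize = inode->i_size;
  printk(KERN_ALERT "file size:%lld \n", (long long)fsize);

  /* Module init runs in process context, so a sleeping allocation is fine. */
  buf = kmalloc(fsize + 1, GFP_KERNEL);
  if (!buf) {
    filp_close(filp, NULL);
    return;
  }

  /* Let the read path accept a kernel buffer, then restore the old limit. */
  fs = get_fs();
  set_fs(KERNEL_DS);
  n = filp->f_op->read(filp, buf, fsize, &filp->f_pos);
  set_fs(fs);

  if (n >= 0) {
    buf[n] = '\0';  /* terminate at the number of bytes actually read */
    printk(KERN_ALERT "the file content is:\n");
    printk(KERN_ALERT "%s", buf);
  }

  kfree(buf);
  filp_close(filp, NULL);
}

void filewrite(char *filename, char *data)
{
  struct file *filp;
  mm_segment_t fs;

  filp = filp_open(filename, O_RDWR | O_APPEND, 0644);
  if (IS_ERR(filp)) {
    printk(KERN_ALERT "open error...\n");
    return;
  }

  fs = get_fs();
  set_fs(KERNEL_DS);
  filp->f_op->write(filp, data, strlen(data), &filp->f_pos);
  set_fs(fs);
  filp_close(filp, NULL);
}

int init_module(void)
{
  char *filename = "/root/test1.c";

  printk(KERN_ALERT "read file from kernel.\n");
  fileread(filename);
  filewrite(filename, "kernel write test\n");
  return 0;
}

void cleanup_module(void)
{
  printk(KERN_ALERT "good,bye!\n");
}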

A second example writes a string to a file and then reads it back:

#include<linux/module.h>
#include<linux/kernel.h>
#include<linux/init.h>

#include<linux/types.h>

#include<linux/fs.h>
#include<linux/string.h>
#include<asm/uaccess.h> /* get_fs(),set_fs(),get_ds() */

#define FILE_DIR "/root/test.txt"

MODULE_LICENSE("GPL");
MODULE_AUTHOR("kenthy@163.com");

char *buff = "module read/write test";
char tmp[100];

static struct file *filp = NULL;

static int __init wr_test_init(void)
{
  mm_segment_t old_fs;
  ssize_t ret;

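  filp = filp_open(FILE_DIR, O_RDWR | O_CREAT, 0644);

  /* filp_open() returns an error pointer on failure, so bail out here
     instead of dereferencing it below. */
  if (IS_ERR(filp)) {
    printk("open error...\n");
    return PTR_ERR(filp);
  }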

  old_fs = get_fs();
  set_fs(get_ds());

  filp->f_op->write(filp, buff, strlen(buff), &filp->f_pos);

  filp->f_op->llseek(filp,0,0);
  ret = filp->f_op->read(filp, tmp, strlen(buff), &filp->f_pos);

  set_fs(old_fs);

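  if (ret > 0) {
    printk("%s\n", tmp);
  } else if (ret == 0) {
    printk("read nothing.............\n");
  } else {
    printk("read error\n");
    /* init is failing, so the exit handler will not run: close here */
    filp_close(filp, NULL);
    return (int)ret;
  }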

  return 0;
}

static void __exit wr_test_exit(void)
{
  /* Only close a file that was actually opened. */
  if (filp && !IS_ERR(filp))
    filp_close(filp, NULL);
}

module_init(wr_test_init);
module_exit(wr_test_exit);
Makefile
obj-m := os_attack.o

KDIR := /lib/modules/$(shell uname -r)/build/
PWD := $(shell pwd)

all: module

module:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	rm -rf *.ko *.mod.c *.o Module.* modules.* .*.cmd .tmp_versions
