Mctrain's Blog

What I learned in IT, as well as thought about life

Note About Madvise


Today I learned about and tested the madvise() system call.

One problem with the mmap() system call is that it only sets up the mapping between the disk file and virtual memory, and defers the actual data loading until the application touches each page. So if you read a large file sequentially, many page faults occur along the way, which can hurt I/O performance.
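
To see the lazy loading for yourself, here is a minimal sketch that counts the page faults taken while touching a file mapping for the first time. The helper name faults_when_touching() and the scratch file under /tmp are my own assumptions for illustration; getrusage() with its ru_minflt and ru_majflt counters is the standard interface.

```c
#define _GNU_SOURCE  /* exposes mkstemp() under strict -std modes */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Total page faults (minor + major) taken by this process so far. */
static long total_faults(void)
{
  struct rusage ru;
  if (getrusage(RUSAGE_SELF, &ru) == -1)
    return -1;
  return ru.ru_minflt + ru.ru_majflt;
}

/*
 * Hypothetical helper: create a scratch file of `pages` pages, map it,
 * read one byte per page, and return how many faults the reads caused.
 * Returns -1 on any error.
 */
long faults_when_touching(size_t pages)
{
  long page = sysconf(_SC_PAGESIZE);
  size_t len = pages * (size_t)page;
  char path[] = "/tmp/mmap_faults_XXXXXX";
  int fd = mkstemp(path);
  if (fd == -1)
    return -1;
  unlink(path);                      /* the file goes away once it is unused */
  if (ftruncate(fd, (off_t)len) == -1) {
    close(fd);
    return -1;
  }
  char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);
  if (p == MAP_FAILED)
    return -1;

  long before = total_faults();      /* nothing has been touched yet */
  volatile unsigned long sum = 0;
  for (size_t i = 0; i < len; i += (size_t)page)
    sum += (unsigned char)p[i];      /* touching a new page may fault */
  long after = total_faults();

  munmap(p, len);
  return (before < 0 || after < 0) ? -1 : after - before;
}
```

On a typical Linux system the returned count is greater than zero, but it can be well below the number of pages because the kernel may map several neighbouring pages per fault.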

To solve this problem, we can use the madvise() system call together with mmap(). The madvise(2) man page summarizes it as:

madvise - give advice about use of memory

The prototype is as follows:

#include <sys/mman.h>
int madvise(void *addr, size_t length, int advice);

As the man page says, madvise() advises the kernel about how to handle paging in the address range that begins at addr and is length bytes long. The man page lists many advice values, for example:

MADV_SEQUENTIAL tells the kernel that we will read the pages in sequential order, so pages in the given range can be aggressively read ahead and may be freed soon after they are accessed.

MADV_WILLNEED tells the kernel that we expect to access the memory in the near future, so it might be a good idea to read some pages ahead.

And so on…

For the advice values used here, madvise() does not change the semantics of the application; it only affects its performance (and is mostly used to improve it).
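
Each call to madvise() takes a single advice value, so combining two hints means two calls. Below is a minimal sketch of that; the helper name advise_sequential_read() is hypothetical, and it assumes addr is page-aligned (as any pointer returned by mmap() is).

```c
#define _GNU_SOURCE  /* exposes madvise() under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: hint that [addr, addr + length) will be read
 * front to back soon. Returns 0 on success, -1 with errno set otherwise.
 */
int advise_sequential_read(void *addr, size_t length)
{
  /* Advice values are an enumeration, not bit flags: one value per call. */
  if (madvise(addr, length, MADV_SEQUENTIAL) == -1)
    return -1;
  if (madvise(addr, length, MADV_WILLNEED) == -1)
    return -1;
  return 0;
}
```

On Linux MADV_SEQUENTIAL is 2 and MADV_WILLNEED is 3, so OR-ing them together yields 3, which is simply MADV_WILLNEED again.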

Based on this, I wrote a simple program with two code paths to test whether madvise() improves I/O performance.

Here is the code:

madvice.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

#include <time.h>
#include <sys/time.h>

/* Number of bytes to map and read. The value is an assumption that matches
   the ~3GB file used in this post; it must not exceed the real file size. */
#define FILE_LENGTH ((size_t)3 << 30)

/* Store the current wall-clock time, in milliseconds, in val. */
#define TIMER(val) do { \
  struct timeval tm; \
  gettimeofday(&tm, NULL); \
  val = tm.tv_sec * 1000 + tm.tv_usec/1000; \
} while(0)


/* Read the mapping byte by byte and return the elapsed time in ms. */
int doprocess(char *p)
{
  long starttime, endtime;
  TIMER(starttime);
  /* volatile keeps the compiler from optimizing the loop away */
  volatile unsigned long nSum = 0;
  size_t i;  /* an int would overflow on a file larger than 2GB */
  for (i = 0; i < FILE_LENGTH; i++) {
    nSum += (unsigned char)*p;
    p++;
  }
  TIMER(endtime);
  return (endtime - starttime);
}
void readWithoutMadvise(char *path)
{
  /* The mapping is read-only, so O_RDONLY is enough. */
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    perror("open error in readWithoutMadvise");
    return;
  }
  void *p = mmap(NULL, FILE_LENGTH, PROT_READ, MAP_SHARED, fd, 0);
  if (p == MAP_FAILED) {
    perror("map error in readWithoutMadvise");
    close(fd);
    return;
  }
  close(fd);  /* the mapping stays valid after the descriptor is closed */
  fd = -1;
  int interval = doprocess((char *)p);
  printf("read without madvise: %d\n", interval);
  if (munmap(p, FILE_LENGTH) == -1) {
    perror("unmap error in readWithoutMadvise");
    return;
  }
}
void readWithMadvise(char *path)
{
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    perror("open error in readWithMadvise");
    return;
  }
  void *p = mmap(NULL, FILE_LENGTH, PROT_READ, MAP_SHARED, fd, 0);
  if (p == MAP_FAILED) {
    perror("map error in readWithMadvise");
    close(fd);
    return;
  }
  close(fd);
  fd = -1;
  /* One advice value per call: they are not bit flags. */
  if (madvise(p, FILE_LENGTH, MADV_SEQUENTIAL) == -1 ||
      madvise(p, FILE_LENGTH, MADV_WILLNEED) == -1) {
    perror("madvise error");
    return;
  }
  int interval = doprocess((char *)p);
  printf("read with madvise: %d\n", interval);
  if (munmap(p, FILE_LENGTH) == -1) {
    perror("unmap error in readWithMadvise");
    return;
  }
}
int main(int argc, char *argv[])
{
  char path1[100] = "/path/to/large/file";
  /* Pass 1 to use madvise(); any other value (or no argument) skips it. */
  if (argc > 1 && strtol(argv[1], NULL, 10) == 1)
    readWithMadvise(path1);
  else
    readWithoutMadvise(path1);
  return 0;
}

The code is quite simple. I first mmap a large file (~3GB) from disk into memory:

  int fd = open(path, O_RDONLY);  /* read-only is enough for PROT_READ */
  if (fd == -1) {
    perror("open error in readWithoutMadvise");
    return;
  }
  void *p = mmap(NULL, FILE_LENGTH, PROT_READ, MAP_SHARED, fd, 0);
  if (p == MAP_FAILED) {
    perror("map error in readWithoutMadvise");
    return;
  }
  close(fd);
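
The listing relies on a compile-time FILE_LENGTH. As an alternative, here is a minimal sketch that asks fstat() for the real file size before mapping, so the mapping can never run past the end of the file. The helper name map_whole_file() is hypothetical and not part of the program above.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map all of `path` read-only and store its size in
 * *length. Returns the mapping, or MAP_FAILED on error.
 */
void *map_whole_file(const char *path, size_t *length)
{
  int fd = open(path, O_RDONLY);
  if (fd == -1)
    return MAP_FAILED;

  struct stat st;
  if (fstat(fd, &st) == -1 || st.st_size == 0) {
    /* mmap() rejects a zero length, so treat an empty file as an error. */
    close(fd);
    return MAP_FAILED;
  }
  *length = (size_t)st.st_size;

  void *p = mmap(NULL, *length, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);                 /* the mapping stays valid after close() */
  return p;
}
```

The caller passes the returned length to madvise() and munmap() in place of the constant.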

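Then I use madvise() to advise the kernel. The advice values are plain integers rather than bit flags, so each one gets its own call (on Linux, MADV_WILLNEED | MADV_SEQUENTIAL evaluates to just MADV_WILLNEED):

  if (madvise(p, FILE_LENGTH, MADV_SEQUENTIAL) == -1 ||
      madvise(p, FILE_LENGTH, MADV_WILLNEED) == -1) {
    perror("madvise error");
    return;
  }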

In the doprocess() function I read the file byte by byte, sum the values, and measure the total elapsed time in milliseconds:

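int doprocess(char *p)
{
  long starttime, endtime;
  TIMER(starttime);
  /* volatile keeps the compiler from optimizing the loop away */
  volatile unsigned long nSum = 0;
  size_t i;  /* an int would overflow on a file larger than 2GB */
  for (i = 0; i < FILE_LENGTH; i++) {
    nSum += (unsigned char)*p;
    p++;
  }
  TIMER(endtime);
  return (endtime - starttime);
}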
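
gettimeofday() reports wall-clock time, which can jump if the system clock is adjusted during a run. A possible variant, sketched here under the assumption that clock_gettime() with CLOCK_MONOTONIC is available, takes the length as a parameter and hands the sum back to the caller. The names monotonic_ms() and timed_sum() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes clock_gettime() under strict -std modes */
#include <stddef.h>
#include <time.h>

/* Milliseconds on a clock that is not affected by clock adjustments. */
static long long monotonic_ms(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Hypothetical variant of doprocess(): sum `length` bytes starting at p
 * and return the elapsed time in milliseconds. The sum is written to
 * *sum_out so the compiler cannot drop the loop as dead code.
 */
long long timed_sum(const char *p, size_t length, unsigned long *sum_out)
{
  long long start = monotonic_ms();
  unsigned long sum = 0;
  for (size_t i = 0; i < length; i++)
    sum += (unsigned char)p[i];
  long long end = monotonic_ms();
  *sum_out = sum;
  return end - start;
}
```

Returning the sum through sum_out serves the same purpose as a volatile accumulator, without forcing a memory write on every iteration.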

In main() a command-line argument selects whether madvise() is used:

  if (strtol(argv[1], NULL, 10) == 1)
    readWithMadvise(path1);
  else
    readWithoutMadvise(path1);

Note that the page cache must be dropped between runs, otherwise the later runs are served from memory instead of from disk. I use (as root):

$ sync && echo 3 > /proc/sys/vm/drop_caches
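
To check that the cache drop actually took effect before a run, one option is mincore(), which reports which pages of a mapping are currently resident. This is only a sketch: the helper name resident_pages() is hypothetical, and on newer Linux kernels the answer for file mappings may be reported as fully resident if you neither own nor can write the file.

```c
#define _GNU_SOURCE  /* exposes mincore() under strict -std modes */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: return how many pages of the mapping
 * [addr, addr + length) are currently resident in memory, or -1 on error.
 * addr must be page-aligned, as a pointer returned by mmap() is.
 */
long resident_pages(void *addr, size_t length)
{
  size_t page = (size_t)sysconf(_SC_PAGESIZE);
  size_t npages = (length + page - 1) / page;
  unsigned char *vec = malloc(npages);   /* one status byte per page */
  if (vec == NULL)
    return -1;
  if (mincore(addr, length, vec) == -1) {
    free(vec);
    return -1;
  }
  long resident = 0;
  for (size_t i = 0; i < npages; i++)
    resident += vec[i] & 1;              /* bit 0 set: page is resident */
  free(vec);
  return resident;
}
```

Called right after mmap() on a freshly dropped cache, a result near zero suggests the file really has to come from disk; called after madvise() with MADV_WILLNEED, a growing count shows the read-ahead at work.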

Now let's test the code. First compile it, then look at the test script:

$ gcc -o madvice madvice.c
$ cat test.sh
#!/bin/bash

echo "====== test begin ======"

sync && echo 3 > /proc/sys/vm/drop_caches
echo "test mmap without madvice"
sleep 1
./madvice 0

sync && echo 3 > /proc/sys/vm/drop_caches
echo "test mmap with madvice"
sleep 1
./madvice 1

sync && echo 3 > /proc/sys/vm/drop_caches
echo "test mmap without madvice again"
sleep 1
./madvice 0

sync && echo 3 > /proc/sys/vm/drop_caches
echo "test mmap with madvice again"
sleep 1
./madvice 1

echo "====== test end ======"

Then run the test:

$ chmod +x ./test.sh
$ ./test.sh
====== test begin ======
test mmap without madvice
1059
test mmap with madvice
668
test mmap without madvice again
1064
test mmap with madvice again
669
====== test end ======

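You can see the performance is quite different: about 670 ms with madvise() versus about 1060 ms without it, roughly 37% less time on this machine.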
