Netlink (2.6.8)

References:
man 3 netlink http://linuxreviews.org/man/netlink/
man 7 netlink
K. Kaichuan He, "Why and how to use netlink socket", Linux Journal, Feb. 2005 (This article has been a great source of inspiration for this page, and the related exercises).
W.R. Stevens, Unix Network Programming, Vol. 1, Prentice Hall, 1998 (p. 358-362)

Netlink is a socket-based interprocess communication mechanism. User programs can communicate with each other through netlink sockets. Programs can use netlink via the library libnetlink or can interface directly with the low-level kernel API. Here I describe how to use the latter.

The netlink "protocol" is message oriented, and the programs must send/receive data encapsulated in netlink messages. The system calls sendmsg.2 and recvmsg.2 are used to send and receive these messages. Their syntax is

ssize_t sendmsg(int s, const struct msghdr *msg, int flags);
ssize_t recvmsg(int s, struct msghdr *msg, int flags);

Here s is a socket descriptor, obtained with the socket.2 system call with protocol family PF_NETLINK, type SOCK_RAW or SOCK_DGRAM, and a netlink protocol, a number from 0 to 31. Protocols are listed in the include file linux/netlink.h. A user program can specify a listed protocol or can pick an unused value.
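Obtaining the descriptor can be sketched as follows. The helper name open_netlink is made up for illustration, and the range check against MAX_LINKS (defined in linux/netlink.h) is an extra convenience, not something the socket call requires of the caller.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open a netlink socket for the given protocol ("unit").
 * Returns the descriptor, or -1 with errno set.
 * open_netlink is a hypothetical helper name. */
int open_netlink(int protocol)
{
    /* Valid protocols run from 0 to MAX_LINKS - 1; reject others early. */
    if (protocol < 0 || protocol >= MAX_LINKS) {
        errno = EPROTONOSUPPORT;
        return -1;
    }
    /* Netlink accepts SOCK_RAW and SOCK_DGRAM; SOCK_RAW is used here. */
    return socket(PF_NETLINK, SOCK_RAW, protocol);
}
```

A caller would typically check the result for -1 and report errno; whether an unlisted protocol number can actually be opened depends on the kernel in use.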

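The struct msghdr contains the following fields: msg_name and msg_namelen (the peer address and its size), msg_iov and msg_iovlen (the array of data buffers and its number of entries), msg_control and msg_controllen (ancillary data and its size), and msg_flags (flags on the received message).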

Netlink in user space

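To use netlink, the first entry in the message iovec must be a netlink message header, struct nlmsghdr, which contains the total message length including the header (nlmsg_len), the message type (nlmsg_type), additional flags (nlmsg_flags), a sequence number (nlmsg_seq), and the pid of the sender (nlmsg_pid).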

To send a message, a program sets the peer netlink sockaddr in the message msg_name, fills in an iovec whose first entry is a proper netlink message header, and calls sendmsg().
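The sending steps might look like the sketch below. The function names nl_build and nl_send are invented for this example, the header and payload are placed in one contiguous buffer (a single iovec entry), and NLMSG_MIN_TYPE is used as the message type only because it is the first value not reserved for control messages. The caller is assumed to pass a buffer aligned for struct nlmsghdr.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Write a netlink header followed by the payload into buf.
 * Returns the message length, or 0 if buf is too small.
 * nl_build is a hypothetical helper name. */
size_t nl_build(void *buf, size_t bufsize, unsigned int self_pid,
                const void *payload, size_t paylen)
{
    struct nlmsghdr *nlh = buf;

    if (NLMSG_SPACE(paylen) > bufsize)
        return 0;
    memset(buf, 0, NLMSG_SPACE(paylen));
    nlh->nlmsg_len = NLMSG_LENGTH(paylen);  /* header plus payload */
    nlh->nlmsg_type = NLMSG_MIN_TYPE;       /* assumption: any non-control type */
    nlh->nlmsg_pid = self_pid;              /* netlink pid of the sender */
    memcpy(NLMSG_DATA(nlh), payload, paylen);
    return nlh->nlmsg_len;
}

/* Send an already built message to the netlink address peer_pid.
 * nl_send is a hypothetical helper name. */
ssize_t nl_send(int fd, unsigned int peer_pid, void *msg, size_t len)
{
    struct sockaddr_nl peer;
    struct iovec iov = { .iov_base = msg, .iov_len = len };
    struct msghdr mh;

    memset(&peer, 0, sizeof peer);
    peer.nl_family = AF_NETLINK;
    peer.nl_pid = peer_pid;   /* destination netlink pid */
    peer.nl_groups = 0;       /* unicast */

    memset(&mh, 0, sizeof mh);
    mh.msg_name = &peer;      /* peer address goes in msg_name */
    mh.msg_namelen = sizeof peer;
    mh.msg_iov = &iov;        /* first (and only) entry starts with nlmsghdr */
    mh.msg_iovlen = 1;
    return sendmsg(fd, &mh, 0);
}
```

A sender would call nl_build on a local buffer and pass the returned length to nl_send, checking for a return value of -1.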

To receive a message, a program either binds the socket to the receiving address or sets the peer netlink sockaddr in the message header, and then calls recvmsg(). The first mechanism is useful when a server sets up a receiving netlink socket and advertises its address so that clients can send it their messages. The second mechanism applies when sender and receiver are in a peer-to-peer relation and know each other's netlink addresses.
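A minimal sketch of the first mechanism (bind, then receive) follows. The names nl_bind and nl_recv are hypothetical, and here msg_name is used as an output buffer in which the sender's address is returned, which is one possible reading of the description above.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Bind fd to the netlink address own_pid (a server would typically
 * pass getpid() and advertise that value).
 * nl_bind is a hypothetical helper name. */
int nl_bind(int fd, unsigned int own_pid)
{
    struct sockaddr_nl self;

    memset(&self, 0, sizeof self);
    self.nl_family = AF_NETLINK;
    self.nl_pid = own_pid;
    self.nl_groups = 0;       /* no multicast groups */
    return bind(fd, (struct sockaddr *)&self, sizeof self);
}

/* Receive one message into buf. On success returns the number of bytes
 * and stores the sender's netlink pid in *from_pid (if not NULL).
 * nl_recv is a hypothetical helper name. */
ssize_t nl_recv(int fd, void *buf, size_t bufsize, unsigned int *from_pid)
{
    struct sockaddr_nl peer;
    struct iovec iov = { .iov_base = buf, .iov_len = bufsize };
    struct msghdr mh;
    ssize_t n;

    memset(&peer, 0, sizeof peer);
    memset(&mh, 0, sizeof mh);
    mh.msg_name = &peer;      /* receives the sender's address */
    mh.msg_namelen = sizeof peer;
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;

    n = recvmsg(fd, &mh, 0);
    if (n >= 0 && from_pid != NULL)
        *from_pid = peer.nl_pid;
    return n;
}
```

The received buffer starts with a struct nlmsghdr, so the receiver can check nlmsg_len against the returned byte count before reading the payload.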

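The header file netlink.h defines a few utility macros for building and parsing messages, among them NLMSG_ALIGN, NLMSG_LENGTH, NLMSG_SPACE, NLMSG_DATA, NLMSG_NEXT, NLMSG_OK, and NLMSG_PAYLOAD.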
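As an illustration of the parsing macros, the sketch below walks a buffer that may hold several netlink messages back to back. The function name nl_count is invented for this example; NLMSG_OK, NLMSG_NEXT, and NLMSG_PAYLOAD come from linux/netlink.h.

```c
#include <linux/netlink.h>

/* Count the well-formed netlink messages in buf (len bytes) and add up
 * their payload sizes in *payload_bytes.
 * nl_count is a hypothetical helper name. */
int nl_count(void *buf, int len, int *payload_bytes)
{
    struct nlmsghdr *nlh = buf;
    int count = 0;

    *payload_bytes = 0;
    /* NLMSG_OK checks that a complete header and message fit in len. */
    while (NLMSG_OK(nlh, len)) {
        /* NLMSG_PAYLOAD(nlh, 0) is the message length minus the header. */
        *payload_bytes += NLMSG_PAYLOAD(nlh, 0);
        count++;
        /* NLMSG_NEXT steps to the next aligned header and reduces len. */
        nlh = NLMSG_NEXT(nlh, len);
    }
    return count;
}
```

The same loop shape is commonly used on the byte count returned by recvmsg(), with NLMSG_DATA giving access to each payload.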

After this brief overview of netlink from the user point of view, you should be able to do the following exercises.
Exercise 1. Write a server that advertises a netlink address with its pid, and a client that sends it a message. Have the server display the content of the message to make sure that the proper data are received. Include a payload in the message.
Exercise 2. Write a sender and a receiver that communicate on a peer-to-peer basis. Check the content of the message and include a payload.

Here is a sample user program and a makefile. Run make test to compile two test programs, netlink-cs and netlink-p2p, from the same source. Run netlink-cs with no arguments to start a server; it prints its pid. Run it again with the server pid as argument to start the client. For netlink-p2p, run it with no arguments to start the server (it uses a fixed netlink pid), then run it with anything as argument to start the client (which uses another fixed netlink pid).

A library is available for programming with netlink. It is included in ftp://ftp.inr.ac.ru/ip-routing/iproute2*. I have not checked it out yet, so I cannot comment on it at the moment.

Netlink in kernel space

The kernel sources for the netlink are in the subdirectory net/netlink.

The netlink supports a fixed number (MAX_LINKS = 32) of "protocols". All the sock's for a given "protocol" are linked in a list. The heads of these lists form the netlink table nl_table array. The table is protected by a lock (nl_table_lock) and nl_table_users counts the users of the table. Functions are provided to

Netlink registers with the socket layer by invoking sock_register(), which sets the protocol family (PF_NETLINK) of netlink and installs the function used to create a netlink instance, netlink_create(). Netlink is removed by unregistering from the socket layer (function sock_unregister()).

netlink_create() takes two parameters: a socket pointer and a netlink protocol number (called "unit"). The protocol must not exceed the maximum protocol number. The socket must be of type either RAW or DGRAM; other types are not supported. Its operations are set to the netlink operations (see below). A new sock is allocated and initialized (core/sock.c::sock_init_data()). In particular: its queues (receive, write, and error) are initialized, its sk_socket is set to the parameter socket, ... Furthermore sk_protinfo (netlink options) is allocated and cleared, sk_destruct is set to netlink_sock_destruct, and sk_protocol is set to the netlink protocol number. Finally it increments the global netlink socket counter netlink_sock_nr.

The sock netlink options contain

The socket destructor netlink_sock_destruct() purges the receive queue, checks that the socket has the SOCK_DEAD flag, frees sk_protinfo, and decrements the global netlink socket counter netlink_sock_nr.

The file af_netlink.c defines the netlink operations:

Polling is delegated to the generic datagram_poll(). Other socket operations are not permitted (set to the default sock_no_xxx):

Netlink kernel programming

Netlink can also be used as a user-kernel communication channel. A user program sends a netlink message to the kernel, and receives one from it, by specifying pid 0.

A kernel component (a module) must create a netlink unit ("protocol") with an input handling function. The source in af_netlink.c exports the function netlink_kernel_create(unit, input), which takes the unit number and the "input" callback function. The callback is invoked when a netlink message arrives for this unit. It has signature

void input( struct sock * sk, int len );

It should dequeue socket buffers from the sk_receive_queue of sk, and process the netlink message contained in their data. The socket buffer should be freed afterwards.

The input function is called in the context of sendmsg (of the sending process). If the message processing is slow, it is better to defer it to a kernel thread, and use input to wake up the thread.

The kernel module sends netlink messages to user programs by calling the functions

int netlink_unicast(struct sock *sk, struct sk_buff *skb, u32 pid, int nonblock)
int netlink_broadcast(struct sock *sk, struct sk_buff *skb, u32 pid, u32 group, int allocation)

The first is used for sending a message to a single process, the second for multicasting. The sk parameter is the netlink sock returned by netlink_kernel_create. The socket buffer skb should contain the message data (netlink header plus payload), and pid is the receiving process pid. For broadcast pid is zero and group is a bitmask of the ORed receiving groups.

The flag nonblock corresponds to MSG_DONTWAIT. If it is set and no process is ready to receive the message the function returns an error. An error is returned also if there is no process attached to the netlink with that pid.

The kernel writes the source and destination data with the macro NETLINK_CB( skb ), provided in netlink.h, which returns the address of the socket buffer control block (cast to a pointer to netlink_skb_parms). The important fields to set are

When the kernel module is removed, its netlink socket should be released. Netlink does not export the function netlink_release(), but the module can call sock_release() on the socket found in the sk_socket field of the kernel netlink sock returned by netlink_kernel_create. This function (see net/socket.c), besides calling netlink_release, decrements the counter sockets_in_use and, if the socket has no file attached, iput-s its inode.
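The life cycle described in this section (create the unit, drain the receive queue in the input callback, release the socket on unload) can be sketched as below. This is only an outline against the 2.6.8 in-kernel interface discussed on this page; later kernels changed netlink_kernel_create() and the callback. The names knl_sk, knl_input, knl_init, and knl_exit and the unit number 17 are assumptions made for the example, and the handler only logs each message instead of answering it.

```c
/* Sketch for a 2.6.8 kernel; will not build against later kernels. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/skbuff.h>
#include <linux/netlink.h>
#include <net/sock.h>

#define KNL_UNIT 17             /* assumed unused netlink unit */

static struct sock *knl_sk;     /* sock returned by netlink_kernel_create */

/* Input callback: runs in the context of the sender's sendmsg(). */
static void knl_input(struct sock *sk, int len)
{
	struct sk_buff *skb;

	/* Dequeue every pending buffer from the receive queue. */
	while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL) {
		struct nlmsghdr *nlh = (struct nlmsghdr *)skb->data;

		/* Only touch the header if a complete message is present. */
		if (skb->len >= sizeof(*nlh) &&
		    nlh->nlmsg_len >= sizeof(*nlh) &&
		    nlh->nlmsg_len <= skb->len)
			printk(KERN_INFO "knl: %u bytes from pid %u\n",
			       nlh->nlmsg_len, nlh->nlmsg_pid);

		kfree_skb(skb);     /* the buffer is freed after processing */
	}
}

static int __init knl_init(void)
{
	knl_sk = netlink_kernel_create(KNL_UNIT, knl_input);
	return knl_sk ? 0 : -ENOMEM;
}

static void __exit knl_exit(void)
{
	/* netlink_release() is not exported; go through the socket layer. */
	sock_release(knl_sk->sk_socket);
}

module_init(knl_init);
module_exit(knl_exit);
MODULE_LICENSE("GPL");
```

If the per-message work were slow, the callback would instead wake a kernel thread and leave the dequeuing to it, as noted above.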

Exercise 1
Write a kernel module that creates a netlink unit, receives messages from user programs, and echoes them back.

Install the kernel module

# insmod knetlink.ko

You can see the module with

lsmod | grep knetlink

and check that the module has created a netlink "protocol" with unit 17:

$ cat /proc/net/netlink
sk       Eth Pid    Groups   Rmem     Wmem     Dump     Locks
09fdbe00 0   0      00000000 0        0        00000000 2
...
09fdb800 17  0      00000000 0        0        00000000 2

The module is removed with the command

rmmod knetlink

Here you can find the kernel module and a test program. Compile them with the same makefile used for the user-level sample programs. make, with the target, compiles the kernel module knetlink.ko. make test compiles the user test program netlink-u2k. Install the kernel module, and run netlink-u2k.

Marco Corvi - 2005