Video for Linux

References:
A. Cox, Video4Linux Programming, 2000, available on www.kernelnewbies.org

Video for Linux (V4L) is an abstraction layer that presents a common interface for media devices such as cameras, TV cards, audio devices, and tuners. It lives in the drivers/media/video subdirectory of the kernel tree.

V4L is compiled into the kernel with the CONFIG_VIDEO_DEV option. proc filesystem support is added with the CONFIG_VIDEO_PROC_FS option.

The main structure is video_device, defined in the header file include/linux/videodev.h. It contains

owner, the module that owns this structure;
name, the video device name (32 char max);
type, the video device type, which can be one of VFL_TYPE_GRABBER, VFL_TYPE_VBI, VFL_TYPE_RADIO, VFL_TYPE_VTX. This field is used by V4L to choose the driver's minor from the proper range;
hardware, the hardware type. This is not used by V4L;
priv, a pointer to private data;
busy, a guard to ensure that the driver is open only by one process at a time;
minor, the minor number assigned by V4L;
devfs_handle, the handle in devfs returned by devfs_register().

as well as the pointers to the methods used by V4L to dispatch the user application calls: open(), close(), read(), write(), poll(), mmap(), ioctl(), and initialise(). The last one is called by V4L at initialization, after it has finished its own initialization.

A driver that wants to export its functionality to userland via V4L must call video_register_device(), which takes three arguments: a pointer to the driver's video_device, the type of the driver, and the requested minor (-1 for no request). When the driver wants to detach from V4L it calls video_unregister_device() with the pointer to its video_device structure. By registering with V4L the driver exposes to user processes only the functions in video_fops, ie, open(), release(), poll(), read(), write(), ioctl(), and mmap(). llseek() is no_llseek(). The others forward to the driver's corresponding functions.

The most complex API is probably ioctl.

VIDIOCGCAP is used to get the driver's capabilities. The capability includes a name, the interface type (VID_TYPE_CAPTURE, VID_TYPE_TUNER, etc.), the number of audio/TV channels, the number of audio devices if appropriate, and the maximum and minimum width and height.
Capture cards that put the data on a frame buffer must be told the address of the buffer, its size and how it is organized. The ioctl calls are VIDIOCGFBUF and VIDIOCSFBUF, and the structure used is video_buffer which contains a base address pointer, width, height, depth and the number of bytesperline.
The capture window is ruled by VIDIOCGWIN and VIDIOCSWIN. The struct video_window contains the window position coordinates x and y, the width and height, the chromakey, additional capture flags, a list of clips (clipping rectangles) and their number (clipcount).
Overlay capture instructs the hardware to capture the frames directly onto the frame buffer. The device must be of type VID_TYPE_OVERLAY and initialized for overlay. The refresh area can be clipped with a list of rectangles (where the hardware does not refresh), or with a mask which uses a chromakey, a colour that indicates no refresh. In the first case the video_device must declare VID_TYPE_CLIPPING in the flags. In the second it must be of type VID_TYPE_CHROMAKEY. If the video capture goes directly into video memory the device is of type VID_TYPE_FRAMERAM. Before the driver can overlay the images the video window must be set with VIDIOCSWIN; the current window can be read back with VIDIOCGWIN. Overlay capturing is activated or deactivated by the ioctl VIDIOCCAPTURE with arg 1 or 0 respectively.
When there are several channels, it is possible to select one with VIDIOCSCHAN and VIDIOCGCHAN. See the struct video_channel.
Image properties are described by the video_picture structure: brightness, hue, colour, contrast, whiteness, depth, and palette. This last one can be for example VIDEO_PALETTE_GREY, VIDEO_PALETTE_RGB24, VIDEO_PALETTE_RGB32. Several other palettes are defined.
Tuners are described by the structure video_tuner and controlled with VIDIOCGTUNER, VIDIOCSTUNER.
Audio inputs are described by the structure video_audio and controlled with VIDIOCGAUDIO, VIDIOCSAUDIO.
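
The frame buffer geometry described above (base address, width, height, depth and bytesperline) is enough to compute the size of the buffer and the position of a pixel in it. The following is only a sketch: struct fb_layout and both helper functions are hypothetical names, and the struct is a local mirror of the fields listed in the text, not the kernel's own definition. It also assumes that the depth is a whole number of bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the video_buffer fields named in the text.
 * This is NOT the kernel definition, only an illustration. */
struct fb_layout {
    void *base;         /* start address of the frame buffer */
    int   width;        /* visible width in pixels */
    int   height;       /* visible height in lines */
    int   depth;        /* bits per pixel */
    int   bytesperline; /* stride: bytes from one line to the next */
};

/* Total number of bytes spanned by the frame buffer. */
static size_t fb_size_bytes(const struct fb_layout *fb)
{
    assert(fb != NULL);
    return (size_t)fb->bytesperline * (size_t)fb->height;
}

/* Byte offset of pixel (x, y) from base.
 * Assumes depth is a multiple of 8 (no packed sub-byte pixels). */
static size_t fb_pixel_offset(const struct fb_layout *fb, int x, int y)
{
    assert(fb != NULL && fb->depth % 8 == 0);
    return (size_t)y * (size_t)fb->bytesperline
         + (size_t)x * (size_t)(fb->depth / 8);
}
```

Note that the stride is bytesperline and not width times the pixel size: a frame buffer line can be longer than the visible width.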

Reading is possible with the system call read(). V4L calls the driver's read() if there is one, otherwise it returns an error (EINVAL). The video device may also support the write() system call. If not, V4L returns 0 on write().

Another way to read images from the device is through the mmap interface. The application gets the size of the memory buffer and the offset of each frame in the buffer with the ioctl VIDIOCGMBUF call, and maps the buffer with the mmap() system call. The application issues acquisition commands with the ioctl VIDIOCMCAPTURE call, specifying the requested frame index. This call starts the acquisition and returns immediately. The application waits for the completion of the acquisition with the ioctl VIDIOCSYNC. Therefore the sequence of calls is something like

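   struct video_mmap vm;   // frame, width, height, format
   int index = 0;
   int newindex;
   // vm.width, vm.height and vm.format are assumed to be set already
   vm.frame = index;
   ioctl( fd, VIDIOCMCAPTURE, &vm );     // start acquiring the first frame
   while ( 1 ) {
     newindex = (index+1) % FRAME_NUMBER;
     vm.frame = newindex;
     ioctl( fd, VIDIOCMCAPTURE, &vm );   // start acquiring the next frame
     ioctl( fd, VIDIOCSYNC, &index );    // wait for the current frame
     // process frame "index"
     index = newindex;
   }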
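
To process frame "index" the application needs its address inside the mapped buffer, which is the base of the mapping plus the offset reported for that frame. A minimal sketch follows. The struct mbuf_info is a hypothetical local mirror of the size and per-frame offsets mentioned in the text, MAX_FRAMES is an assumed bound, and frame_address() is an illustrative helper, not part of V4L.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 32   /* assumed upper bound for this sketch */

/* Local mirror of the buffer information described in the text:
 * the total size of the mapping and the offset of each frame. */
struct mbuf_info {
    int size;                 /* total bytes to mmap() */
    int frames;               /* number of frames in the buffer */
    int offsets[MAX_FRAMES];  /* byte offset of each frame */
};

/* Address of frame `index` inside the mapping that starts at `base`,
 * or NULL if the index or its offset is out of range. */
static unsigned char *frame_address(unsigned char *base,
                                    const struct mbuf_info *mb, int index)
{
    assert(base != NULL && mb != NULL);
    if (index < 0 || index >= mb->frames || index >= MAX_FRAMES)
        return NULL;
    if (mb->offsets[index] < 0 || mb->offsets[index] >= mb->size)
        return NULL;
    return base + mb->offsets[index];
}
```

Checking the index and the offset before use keeps a misbehaving driver from steering the application outside the mapping.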

Write a memory-based video camera driver that registers with the V4L subsystem. Check that the memory-based video camera operates properly by writing a graphical program that displays the frames.
Here is a solution for the driver: mmvideo.c mmvideo.h and the Makefile. The driver provides images either through the read() system call, or the user application can mmap() the frame buffers. The driver policy is to try to serve the mmapped frames first, and then the reading processes.

Finally here is a simple application that loads an image onto the camera memory, and displays the frames generated by the camera: mmvideotest.cpp. It needs a few helper classes, mmvideoutil.tgz (the image comes from the source Documentation) and libjpeg and libX11, which you probably already have. Check /proc/video/mmvideo while the application is running. Enjoy!

The interface to Video for Linux 2 (V4L2) has changed a little in kernel 2.6, mostly to take into account the new driver class model, which offers better support for managing the lifecycle of components.

A video device registers with the V4L2 core with video_register_device(video_device *, type, nr). The first parameter is a pointer to the video device structure. The second is the type of the device, eg, VFL_TYPE_GRABBER. The third is the requested minor number (-1 for no request).

The video_device struct describes a video device and is defined in videodev.h. It contains

dev, a pointer to a device;
name, the device name (32 char max);
type, the video device type for V4L1;
type2, the video device type for V4L2;
hardware, ???
minor, the device minor number;
fops, a pointer to the file operations;
release(video_device*), the special release method called by sysfs when the device is unloaded and no longer used;
owner, a pointer to the owning module, usually THIS_MODULE;
priv, a pointer to private data.

Video buffer types are listed in the enum v4l2_buf_type. For example, V4L2_BUF_TYPE_VIDEO_CAPTURE.

The types of memory are listed in the enum v4l2_memory: V4L2_MEMORY_MMAP, V4L2_MEMORY_USERPTR, and V4L2_MEMORY_OVERLAY.

Video capabilities are described with the struct v4l2_capability, which contains descriptive strings: the driver name, the card description, and the bus information bus_info. It has a version which should be set to the kernel version, and an integer for the device capabilities. The capabilities related to video are

V4L2_CAP_VIDEO_CAPTURE, the device can capture images/video
V4L2_CAP_VIDEO_OUTPUT, the device is a video output device
V4L2_CAP_VIDEO_OVERLAY, can overlay video on video memory
V4L2_CAP_VBI_CAPTURE, the device can capture VBI (vertical blanking interval) data
V4L2_CAP_VBI_OUTPUT, the device is a VBI output device

The ioctl VIDIOC_QUERYCAP is used to query device capabilities.

The struct v4l2_pix_format contains the image width and height, the pixelformat, the bytesperline, the image size sizeimage, and field and colorspace. The field (of enum v4l2_field) describes how the frame fields are captured: for example V4L2_FIELD_NONE means that the device has no fields, V4L2_FIELD_TOP means that the driver captures only the top fields, etc. (see the include file videodev2.h for all the possibilities). colorspace (of enum v4l2_colorspace) describes the color space; V4L2_COLORSPACE_SRGB is probably a good start for RGB color spaces.

Pixel format

Pixel formats are described by the struct v4l2_fmtdesc, which contains an index (the format number), the video buffer type, flags, a description string, and the four-char pixelformat. Several four-char formats are defined, eg, V4L2_PIX_FMT_RGB24, V4L2_PIX_FMT_RGB565, etc.

Pixel formats appear in the read/write ioctl VIDIOC_ENUM_FMT.

Controls

V4L2 defines structures and ioctls to support video controls (see videodev2.h). A v4l2_control has an id and a value.

The query-control struct, v4l2_queryctrl, contains the relevant information about a control, and is used in the ioctl VIDIOC_QUERYCTRL. It has

id, the numerical id of the control;
type, the type of the control, for example V4L2_CTRL_TYPE_INTEGER. It can be integer, boolean, menu, or button;
name, the descriptive name of the control;
minimum, maximum, step and default_value;
flags, ???
and a reserved array of two elements.

Video specific control id's are V4L2_CID_BRIGHTNESS, V4L2_CID_CONTRAST, V4L2_CID_SATURATION, V4L2_CID_HUE, V4L2_CID_BLACK_LEVEL, V4L2_CID_RED_BALANCE, V4L2_CID_BLUE_BALANCE (white balance is a combination of red and blue balances), V4L2_CID_AUTO_WHITE_BALANCE, V4L2_CID_DO_WHITE_BALANCE, V4L2_CID_GAMMA, V4L2_CID_WHITENESS, V4L2_CID_EXPOSURE, V4L2_CID_AUTOGAIN, V4L2_CID_GAIN, V4L2_CID_HFLIP, V4L2_CID_VFLIP, V4L2_CID_HCENTER, and V4L2_CID_VCENTER. There are also audio related control id's.

ioctl

V4L2 has a large number of ioctls:

VIDIOC_QUERYCAP, the user process reads the device capability. The driver should fill the driver name, the capabilities, the kernel version, card, and bus_info.
VIDIOC_ENUM_FMT, enumerates the supported pixel formats (struct v4l2_fmtdesc).
VIDIOC_G_FMT, the driver fills in the buffer type (eg, V4L2_BUF_TYPE_VIDEO_CAPTURE) and the image format fmt.pix.
VIDIOC_S_FMT;
VIDIOC_REQBUFS, ...
VIDIOC_QUERYBUF, has argument a v4l2_buffer ...
VIDIOC_G_FBUF and VIDIOC_S_FBUF ...
VIDIOC_OVERLAY ...
VIDIOC_QBUF queue a v4l2_buffer
VIDIOC_DQBUF dequeue a v4l2_buffer
VIDIOC_STREAMON and VIDIOC_STREAMOFF ...
VIDIOC_G_PARM and VIDIOC_S_PARM
VIDIOC_G_STD and VIDIOC_S_STD
VIDIOC_ENUMSTD
VIDIOC_ENUMINPUT
VIDIOC_G_CTRL and VIDIOC_S_CTRL ...
VIDIOC_QUERYCTRL
VIDIOC_QUERYMENU
VIDIOC_CROPCAP
VIDIOC_G_CROP and VIDIOC_S_CROP
VIDIOC_QUERYSTD
VIDIOC_G_JPEGCOMP and VIDIOC_S_JPEGCOMP

There are also ioctls for the audio/tuner, and some old ioctls temporarily kept for backward compatibility only.

The driver

This is a description of the actions of a driver, based on the meteor frame grabber. It is only an example; it does not mean that every video grabber driver should be written the same way.

Module init/exit

init() should

do all the hardware discovery (PCI) and initialization;
initialize the high memory for the video buffers;
allocate and initialize the dynamic structures;
allocate a videodev, video_device_alloc(), for each device;
initialize it (name, type or type2, release, fops) and set the private data by calling video_set_drvdata(videodev *, void *);
register with V4L2, video_register_device(videodev *, type, -1);
complete the structure initialization (create workqueues).

exit() should

unregister from V4L2, video_unregister_device(videodev *);
clear dynamic structures (destroy workqueue);
release the high memory.

File operations

open()

retrieve the videodev from the file, using video_devdata(file), and its private data, video_get_drvdata();
find (or allocate and configure) an unused struct for the new open device;
initialize it;
if this is the first open device do the dynamic initialization of the driver: request IRQ, initialize lists, reset capture fields.

close() (ie, file operation release())

retrieve the open device struct from the file;
clear it;
if this is the last close, release dynamic resources: disable interrupts (and wake up waiting processes), release IRQ.

write() returns EINVAL

lseek() returns ESPIPE

read()

retrieve the open device from the file;
do some safety checks (status of the device, and the video decoder);
tell the hardware to capture a frame;
wait for it (if not O_NONBLOCK);
copy the new frame to userspace.

mmap()

retrieve the open device from the file;
if the driver has already high memory buffers return them to high memory;
get a high memory address and remap the vma to start at this address;
set the vm_ops of the vma.

poll()

retrieve the open device from the file;
check the lists of streaming buffers to see if there is one ready;
wait.

Marco Corvi - 2003
