Every new video camera comes with a baffling list of bullet points, specifications and features to explain why it’s better than anything else that’s gone before, but what do they all mean, and how important are they actually?
Many of the things you look for in a stills camera are important for video too, notably things like sensor size and ISO range. But video capture introduces a whole range of other aspects and technicalities beyond this, so here’s a guide to some of the jargon you’re likely to encounter, and what it means in real terms.
12K, 8K, 6K, C4K, 4K UHD, FHD, HD… there are so many different resolutions now! HD is the old ‘standard HD’ with a frame width of 1280 pixels, while FHD is ‘Full HD’ with the larger and more widely used 1920 pixel width. After that, the number indicates the approximate frame width in pixels, so 4K is video roughly 4,000 pixels wide (3840 pixels for 4K UHD, 4096 for C4K), 6K is video roughly 6,000 pixels wide and so on. Despite the headline stories about 6K, 8K and beyond, 4K is the most widely used of the current high resolutions and is still more than enough for most purposes.
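To put some numbers on those names, here is a small Python sketch listing the usual pixel dimensions behind each label. The `RESOLUTIONS` table and the `megapixels` helper are illustrative names rather than anything from a camera's spec sheet, and individual cameras may offer variations on these frame sizes.

```python
# Usual frame sizes (width, height) in pixels for common resolution labels.
# Individual cameras may offer slightly different sizes under the same name.
RESOLUTIONS = {
    "HD": (1280, 720),
    "FHD": (1920, 1080),
    "4K UHD": (3840, 2160),
    "C4K": (4096, 2160),
    "8K UHD": (7680, 4320),
}


def megapixels(name):
    """Return the number of megapixels in a single video frame."""
    width, height = RESOLUTIONS[name]
    return width * height / 1_000_000


for name, (width, height) in RESOLUTIONS.items():
    print(f"{name}: {width}x{height} = {megapixels(name):.1f}MP per frame")
```

A 4K UHD frame works out at a little over 8MP, which is why even a fairly modest stills sensor has enough pixels for 4K video.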
Aspect ratio is the video frame’s width relative to its height. It’s especially important for video, where you want the proportions to match the screen or display it’s being shown on. By far the most common aspect ratio is 16:9 (16 units wide by 9 high). This is used by almost all common video modes on cameras, and by domestic TV sets and most computer monitors. Cinema 4K (C4K or DCI 4K) has a slightly wider aspect ratio than regular 4K UHD and is offered on some cameras, and there are much wider cinematic ratios that are largely confined to the movie industry.
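If you want to check the aspect ratio of a given frame size yourself, the arithmetic is simple. This is a minimal Python sketch; the function names are made up for illustration.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a frame size to its simplest whole-number ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


def aspect_decimal(width, height):
    """Express the ratio as 'so many units wide to 1 high'."""
    return width / height


print(aspect_ratio(3840, 2160))              # (16, 9)
print(aspect_ratio(4096, 2160))              # (256, 135)
print(round(aspect_decimal(3840, 2160), 2))  # 1.78
print(round(aspect_decimal(4096, 2160), 2))  # 1.9
```

The C4K frame comes out at about 1.9:1 against 1.78:1 for 4K UHD, which is the "slightly wider" difference in practice.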
An anamorphic lens is a special kind of lens that was used widely when movies were shot on fixed film sizes, to squash a wide image horizontally so that it would fit on a narrower film area. Another anamorphic lens would then be used to project or otherwise display the movie and stretch it back out to its proper proportions. Anamorphic lenses are starting to make a comeback in digital video as they offer a way to capture much wider scenes than would otherwise fit on the camera sensor. They also produce optical effects, such as elliptical bokeh shapes and streaky flare, that many cinematographers love.
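The 'squash and stretch' idea is easy to express as arithmetic. In this Python sketch, the squeeze factors (1.33x and 2x) and the function name are illustrative assumptions, not the specification of any particular lens.

```python
def desqueezed_ratio(width, height, squeeze):
    """Aspect ratio of the final picture after stretching the recorded
    frame back out horizontally by the lens's squeeze factor."""
    return width * squeeze / height


# A 16:9 frame shot through an assumed 1.33x anamorphic lens.
print(round(desqueezed_ratio(1920, 1080, 1.33), 2))  # 2.36

# A 4:3 area of the sensor shot through an assumed 2x anamorphic lens.
print(round(desqueezed_ratio(4, 3, 2.0), 2))         # 2.67
```

In both cases the recorded frame is an ordinary shape, and the very wide cinematic picture only appears once the footage is stretched back out.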
Crops and sensor sizes
Sensor size is as important for video as it is for stills photography because larger sensors produce better image quality, especially in low light, and give shallower depth of field for a more ‘cinematic’ effect. But a camera may not always be able to use the full width of the sensor when shooting video, depending on the sensor resolution, the frame rate being used and the camera’s processing capabilities. Some full frame cameras, for example, can only shoot ‘cropped’ video using a smaller, central area of the sensor. There is a general move towards full frame sensors in video, where previously the standard was the Super 35 format (roughly APS-C size). Super 35 remains in widespread use, and is often provided on full frame cameras as a ‘crop’ mode for the reasons explained above.
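The effect of a crop on your lenses can be estimated with a simple 'crop factor' calculation. This Python sketch assumes a full frame sensor about 36mm wide and a Super 35 or APS-C style crop about 24mm wide; real sensor and crop widths vary from camera to camera, so treat the figures as approximate.

```python
FULL_FRAME_WIDTH_MM = 36.0  # approximate width of a full frame sensor


def crop_factor(active_width_mm):
    """How much narrower the active sensor area is than full frame."""
    return FULL_FRAME_WIDTH_MM / active_width_mm


def equivalent_focal_length(focal_length_mm, active_width_mm):
    """Full frame focal length that would give the same angle of view."""
    return focal_length_mm * crop_factor(active_width_mm)


# Assumed figure: a cropped video area about 24mm wide.
print(crop_factor(24.0))                  # 1.5
print(equivalent_focal_length(35, 24.0))  # 52.5
```

So a 35mm lens used in this assumed crop mode frames the scene much like a 50mm lens would on the full sensor width.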
Pixel binning vs oversampling
Some video cameras are made with sensors that have exactly the same resolution in pixels as the video they capture. With cameras that shoot both stills and video, however, the sensor resolution will often be a lot higher than is needed for video – you only need a 12MP sensor to capture 4K, for example. This leaves manufacturers with three choices:
1) Capture cropped video using a smaller area of the sensor with the same pixel dimensions as the video – this needs least processing but reduces the angle of view of your lenses.
2) Use ‘pixel binning’ to combine adjacent pixels, or ‘line skipping’ to discard whole rows of pixels – this is generally seen as a rather unsatisfactory, low-tech approach.
3) Use ‘oversampling’ to capture video frames at the sensor’s full resolution and then resample them on the fly down to the required video resolution – this is regarded as giving the best quality and does not crop the video frame, but does require more processing power.
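The three approaches above can be compared on a toy scale. This Python sketch treats a 'sensor' as a small grid of brightness values and shrinks it to half its width and height in three ways. It is a heavy simplification: real sensors have a color filter pattern, and real oversampling reads the whole sensor and resamples it with a more sophisticated filter than the plain block average used here as a stand-in. All function names are hypothetical.

```python
def centre_crop(frame, out_height, out_width):
    """Option 1: use only a central area of the sensor."""
    top = (len(frame) - out_height) // 2
    left = (len(frame[0]) - out_width) // 2
    return [row[left:left + out_width] for row in frame[top:top + out_height]]


def line_skip(frame, factor):
    """Option 2a: keep every 'factor'-th row and column, discard the rest."""
    return [row[::factor] for row in frame[::factor]]


def average_blocks(frame, factor):
    """Options 2b and 3 in spirit: combine each factor x factor block of
    pixels into one output pixel, so every pixel contributes.
    Assumes the frame dimensions are exact multiples of 'factor'."""
    output = []
    for y in range(0, len(frame), factor):
        output_row = []
        for x in range(0, len(frame[0]), factor):
            block = [frame[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            output_row.append(sum(block) / len(block))
        output.append(output_row)
    return output


sensor = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

print(centre_crop(sensor, 2, 2))  # [[5, 6], [9, 10]]
print(line_skip(sensor, 2))       # [[0, 2], [8, 10]]
print(average_blocks(sensor, 2))  # [[2.5, 4.5], [10.5, 12.5]]
```

The crop sees only the middle of the scene, line skipping covers the whole scene but throws away three quarters of the data, and averaging covers the whole scene using every pixel.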
Interlaced versus progressive
In the old days of broadcast TV, when signal bandwidth was limited, interlacing was used to transmit video frames in two parts, one with odd lines and one with even lines only, then ‘interlace’ them on the TV screen. It worked well enough, but you can easily see the interlacing in freeze-frames or digitised old TV programs. In the early days of digital video, many cameras still used interlacing to make the best of the limited processing power available at the time. Now, though, almost all video is ‘progressive’, where each video frame is captured in full. The quality is much better and you no longer get the horrible striped interlacing effect. When you see a ‘p’ after a video format, e.g. 1080p, or a frame rate of 30p, it means progressive capture, whereas ‘i’ means interlaced capture.
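Interlacing can be sketched in a few lines of Python. Here a frame is just a list of scan lines, counted from zero, and the function names are illustrative. The sketch assumes an even number of lines.

```python
def split_fields(frame):
    """Split a frame into two fields, each holding alternate lines."""
    return frame[0::2], frame[1::2]


def weave(first_field, second_field):
    """Interleave two fields back into one full frame."""
    frame = []
    for first_line, second_line in zip(first_field, second_field):
        frame.extend([first_line, second_line])
    return frame


frame = ["line 0", "line 1", "line 2", "line 3"]
first, second = split_fields(frame)
print(first)                  # ['line 0', 'line 2']
print(second)                 # ['line 1', 'line 3']
print(weave(first, second))   # ['line 0', 'line 1', 'line 2', 'line 3']

# If the two fields come from different moments in time, the woven frame
# mixes lines from both, which is the striped effect seen in freeze-frames.
earlier, _ = split_fields(["A0", "A1", "A2", "A3"])
_, later = split_fields(["B0", "B1", "B2", "B3"])
print(weave(earlier, later))  # ['A0', 'B1', 'A2', 'B3']
```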
For video to look smooth it needs a recording and playback speed of 24-30fps. 24fps is popular in cinematography; 25fps is used for broadcast TV and playback devices in the UK and many European territories, as part of the old PAL standard; and 30fps is used in the US and other territories, as in the old NTSC standard. Videographers will often choose the frame rate to match the territory they are in, though with digital distribution and playback the differences are becoming less important, as are the differences between the PAL and NTSC systems. Cameras also offer multiples of these frame rates for slow motion effects. 60fps recording and 30fps playback will give a 2x slow motion effect, for example. High frame rates are a selling point for video cameras, but they are processor intensive and may only be available at lower video resolutions.
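The slow motion arithmetic is just the capture frame rate divided by the playback frame rate. A minimal Python sketch, with illustrative function names:

```python
def slow_motion_factor(capture_fps, playback_fps):
    """How many times slower than real life the footage will play."""
    return capture_fps / playback_fps


def playback_seconds(recorded_seconds, capture_fps, playback_fps):
    """How long a recorded clip lasts when played back in slow motion."""
    return recorded_seconds * slow_motion_factor(capture_fps, playback_fps)


print(slow_motion_factor(60, 30))    # 2.0
print(slow_motion_factor(120, 24))   # 5.0
print(playback_seconds(4, 120, 24))  # 20.0
```

So four seconds of action recorded at 120fps fills 20 seconds of screen time when played back at 24fps.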
Bitrate indicates the maximum data capture speed of a video camera and is related to the video resolution and quality settings. Broadly, the higher the camera’s bitrate, the better the quality it can capture. 100Mbps (megabits per second) is good, but high-end video cameras might offer 500Mbps. Bitrates don’t tell you everything you need to know about a video camera’s quality, but they do give you an idea of its professional status and its recording ‘horsepower’.
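Bitrate also tells you roughly how quickly footage will fill a memory card. This Python sketch converts megabits per second into megabytes per minute. It assumes the camera records at its stated bitrate throughout, uses decimal units (1GB = 1,000MB) and ignores audio and file overheads, so real-world figures will differ somewhat. The 128GB card is just an example size.

```python
def megabytes_per_minute(bitrate_mbps):
    """Storage used by one minute of video at a given bitrate.
    There are 8 bits in a byte, so divide the bit count by 8."""
    bits_per_minute = bitrate_mbps * 1_000_000 * 60
    return bits_per_minute / 8 / 1_000_000


def minutes_on_card(card_gb, bitrate_mbps):
    """Approximate recording time that fits on a card of a given size."""
    return card_gb * 1000 / megabytes_per_minute(bitrate_mbps)


print(megabytes_per_minute(100))        # 750.0
print(megabytes_per_minute(500))        # 3750.0
print(round(minutes_on_card(128, 100))) # 171
print(round(minutes_on_card(128, 500))) # 34
```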
Video footage is very data-heavy and is recorded using a variety of compression techniques. One of these is chroma (or color) subsampling, where the two chroma channels of the video signal are recorded at a lower resolution than the luma (brightness) channel. It is written as three numbers: the first refers to the luma channel and the other two describe how the chroma is sampled. The technical explanations are complex, but there are two figures in common use right now. 4:2:0 sampling gives good quality and good compression and is the standard setting for most cameras capturing video to an internal memory card. 4:2:2 sampling gives better quality but needs either an external recorder or one of the latest, more advanced video cameras – some of these can capture 4:2:2 video internally.
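One way to make sense of the three numbers is to count samples in a block of pixels four wide and two high, which is the usual way the notation is read: the first number is the width of the block in luma samples, the second is the number of chroma samples in the top row and the third is the number in the bottom row. This Python sketch follows that reading; the function name is illustrative.

```python
def samples_per_block(j, a, b):
    """Count the samples stored for a block j pixels wide and 2 high.
    Luma is stored for every pixel; each of the two chroma channels
    stores 'a' samples for the top row and 'b' for the bottom row."""
    luma = 2 * j
    chroma = 2 * (a + b)
    return luma + chroma


full = samples_per_block(4, 4, 4)  # no subsampling at all
for scheme in [(4, 4, 4), (4, 2, 2), (4, 2, 0)]:
    total = samples_per_block(*scheme)
    label = ":".join(str(n) for n in scheme)
    print(f"{label} stores {total} samples per block, {total / full:.0%} of the full data")
```

Before any other compression is applied, 4:2:2 carries two thirds of the data of unsubsampled video and 4:2:0 carries half, with all of the saving coming from the color channels.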
8-bit vs 10-bit
Video footage is usually captured using 8-bit color – the same bit-depth used for JPEG still images. This is usually fine, unless the video needs heavy editing (‘grading’) later. Heavy editing can cause the tones to start to break up or posterise, losing their smooth gradations. Some cameras can capture 10-bit video, either internally or when connected to an external recorder. This produces video which can withstand editing much better, which is especially important if you use Log modes (below).
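The difference between 8-bit and 10-bit comes down to how many distinct tone levels each color channel can record. This Python sketch counts them, then checks how many different levels survive in a subtle gradient, like a patch of sky, at each bit depth. The gradient values are arbitrary illustrative numbers and the function names are made up.

```python
def levels(bits):
    """Number of distinct tone levels per color channel."""
    return 2 ** bits


def quantise(value, bits):
    """Round a tone between 0.0 and 1.0 to the nearest recordable level."""
    top = levels(bits) - 1
    return round(value * top) / top


print(levels(8), levels(10))  # 256 1024

# A subtle gradient covering only a small slice of the tonal range.
gradient = [0.40 + 0.05 * i / 999 for i in range(1000)]
steps_8_bit = len({quantise(v, 8) for v in gradient})
steps_10_bit = len({quantise(v, 10) for v in gradient})
print(steps_8_bit, steps_10_bit)
```

The 10-bit version keeps roughly four times as many steps in the same gradient, which is why it holds together better when the tones are stretched during grading.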
All-I vs Long GoP
In order to keep video file sizes manageable, further compression is used within each frame or even between frames. All-I (all-intra) uses intra-frame compression alone, so although each frame is compressed, it is nevertheless stored as a complete image. Long GoP (long Group of Pictures) compression stores complete keyframes (full frames) only at intervals or where the picture changes substantially, and fills the gaps between them with ‘delta’ frames that record only what has changed rather than the full frame. Long GoP compression produces smaller video files, but All-I compression generally produces better quality, especially for editing.
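The difference between the two approaches can be shown with a toy model in Python. Here each 'frame' is a short list of pixel values, keyframes are placed at a fixed interval, and delta frames store only the pixels that differ from the previous frame. Real codecs are far more sophisticated (they predict motion and choose keyframes adaptively), so this is only a sketch of the principle, with hypothetical function names.

```python
def encode_all_intra(frames):
    """Store every frame as a complete image."""
    return [("I", list(frame)) for frame in frames]


def encode_long_gop(frames, gop_length):
    """Store a full keyframe every 'gop_length' frames and, in between,
    only the pixels that changed since the previous frame."""
    encoded = []
    for index, frame in enumerate(frames):
        if index % gop_length == 0:
            encoded.append(("I", list(frame)))
        else:
            previous = frames[index - 1]
            delta = {position: value
                     for position, (value, old) in enumerate(zip(frame, previous))
                     if value != old}
            encoded.append(("D", delta))
    return encoded


def decode(encoded):
    """Rebuild every frame; delta frames need the frame before them."""
    frames = []
    for kind, payload in encoded:
        if kind == "I":
            frame = list(payload)
        else:
            frame = list(frames[-1])
            for position, value in payload.items():
                frame[position] = value
        frames.append(frame)
    return frames


def stored_values(encoded):
    """Count how many pixel values had to be stored."""
    return sum(len(payload) for _, payload in encoded)


clip = [[1, 1, 1, 1], [1, 1, 2, 1], [1, 1, 2, 3], [1, 1, 2, 3]]
all_intra = encode_all_intra(clip)
long_gop = encode_long_gop(clip, gop_length=4)

print(stored_values(all_intra))  # 16
print(stored_values(long_gop))   # 6
print(decode(long_gop) == clip)  # True
```

The Long GoP version stores far less, but notice that `decode` has to rebuild each delta frame from the one before it. An editing program has to do the same work every time you jump to a frame, which is one reason All-I footage is easier to edit.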
Raw video is like raw images in stills photography. The camera records video as raw, unprocessed data rather than as processed, viewable movie files. However, raw video capture places MUCH bigger demands on processing power and storage capacity, especially at high resolutions like 4K and above, so only very high end video cameras can capture raw footage internally, and even quite advanced models may need an external recorder.
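A rough calculation shows why raw video is so demanding. This Python sketch estimates the data rate of completely uncompressed sensor data, assuming one 12-bit value per pixel at 4K UHD and 30fps. Those figures are illustrative assumptions; actual raw video formats vary in bit depth and usually apply some compression of their own, so real data rates will be lower than this.

```python
def uncompressed_mbps(width, height, bits_per_pixel, fps):
    """Data rate in megabits per second with no compression at all."""
    bits_per_second = width * height * bits_per_pixel * fps
    return bits_per_second / 1_000_000


# Assumed example: 4K UHD, one 12-bit sensor value per pixel, 30fps.
print(round(uncompressed_mbps(3840, 2160, 12, 30)))  # 2986
```

That is around 3,000Mbps, compared with the 100Mbps to 500Mbps bitrates typical of regular compressed video.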
Log modes are a kind of halfway house towards the flexibility of raw video. They capture video which looks tonally very flat but holds an extra-wide brightness range to work with later when editing, or ‘grading’, the video. Log modes are an important pro feature in more advanced cameras and generally increase the price – though sometimes makers add Log modes via a firmware update. When you grade Log footage you will need to apply a profile to ‘correct’ the tones and colors to a normal appearance, or a LUT (lookup table) to create a specific cinematic color palette or ‘look’.
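The idea behind Log capture and LUTs can be sketched in Python. The curve below is a generic logarithmic curve chosen purely for illustration; it is not any camera maker's actual Log profile, and the strength value of 50 and all function names are assumptions. The LUT is a simple one-dimensional table of brightness values, whereas the LUTs used for color grading are usually three-dimensional.

```python
import math

STRENGTH = 50.0  # assumed value controlling how strongly shadows are lifted


def log_encode(linear):
    """Squeeze linear brightness (0.0 to 1.0) into a flat-looking signal."""
    return math.log1p(STRENGTH * linear) / math.log1p(STRENGTH)


def log_decode(encoded):
    """Undo the curve to get back to normal-looking tones."""
    return math.expm1(encoded * math.log1p(STRENGTH)) / STRENGTH


def build_lut(curve, size=33):
    """Sample a curve at evenly spaced points to make a lookup table."""
    return [curve(i / (size - 1)) for i in range(size)]


def apply_lut(value, lut):
    """Look up a value (0.0 to 1.0), blending between the two nearest entries."""
    position = value * (len(lut) - 1)
    low = int(position)
    high = min(low + 1, len(lut) - 1)
    fraction = position - low
    return lut[low] * (1 - fraction) + lut[high] * fraction


mid_grey = 0.18
flat = log_encode(mid_grey)
print(round(flat, 2))  # 0.59, much brighter than 0.18, hence the 'flat' look

correction_lut = build_lut(log_decode)
print(round(apply_lut(flat, correction_lut), 2))
```

The encoded mid-grey sits far higher than its true brightness, which is why ungraded Log footage looks washed out. Applying the correction LUT brings it back to approximately its original value, with small differences because the table only samples the curve at 33 points.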
Codecs vs formats
This can be confusing because people often talk about codecs and formats as if they are the same thing, but they are not. Essentially, the video ‘format’ is the ‘container’ for the video, audio and assorted metadata that ties it all together, while the ‘codec’ is purely the video compression system used. For example, the Sony A7S III uses an XAVC HS format which uses the H.265 video codec. The XAVC HS format is proprietary to Sony, but other makers use the same H.265 video codec for their own video formats. The common MPEG-4 format (also known as MP4 after its .mp4 file extension) typically uses the equally common H.264 video codec. You need containers (formats) for video because they also have to store the audio, along with any metadata needed by the maker to offer proprietary video features for playback and editing.
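One way to picture the relationship is as a simple data structure, sketched here in Python. The `VideoFile` class and its fields are hypothetical, used only to show that the container and the codec are separate properties of the same file. The audio field is left unspecified because the audio format depends on the camera and its settings.

```python
from dataclasses import dataclass, field


@dataclass
class VideoFile:
    container: str                # the 'format' that wraps everything up
    video_codec: str              # the compression used for the pictures
    audio: str = "unspecified"    # audio is stored in the container too
    metadata: dict = field(default_factory=dict)


def same_codec(first, second):
    """Two files can share a codec even if their containers differ."""
    return first.video_codec == second.video_codec


sony_clip = VideoFile(container="XAVC HS", video_codec="H.265")
other_clip = VideoFile(container="another maker's format", video_codec="H.265")
mp4_clip = VideoFile(container="MP4", video_codec="H.264")

print(same_codec(sony_clip, other_clip))  # True
print(same_codec(sony_clip, mp4_clip))    # False
```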