An overview of the logical structure of VOB files, to help you see where all the pieces fit.

MPEG-2 System Stream

A VOB file is an MPEG-2 system stream. This means that it complies 100% with the MPEG-2 system level standard, ISO 13818-1. However, VOB files are a very strict subset of the standard. So while all VOB files are MPEG-2 system streams, not all MPEG-2 system streams comply with the definition for a VOB file.

Pack/sector size

DVD sectors contain 2048 bytes of data, which is also the size of one pack. In MPEG-2, packs are used mainly to group together elements (such as audio and video) that are to be presented simultaneously, and their size is variable. The pack header can also contain timing information used for synchronization.
In DVD-Video each sector is one pack. This adds some overhead, but makes random access to the stream much easier.
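
Because every pack is exactly one sector, the byte offset of any pack is simply its index times 2048. The following is a minimal sketch of that idea, not a reference implementation. The function names are made up for illustration, and the check relies on the standard MPEG-2 pack start code (00 00 01 BA), which the text above does not spell out.

```python
SECTOR_SIZE = 2048                       # bytes per DVD sector, and per pack
PACK_START_CODE = b"\x00\x00\x01\xba"    # MPEG-2 pack_start_code


def pack_offset(pack_index: int) -> int:
    """Byte offset of the Nth pack (0-based) within a VOB file."""
    return pack_index * SECTOR_SIZE


def iter_packs(data: bytes):
    """Yield (index, pack_bytes) for each whole 2048-byte pack in data."""
    for index in range(len(data) // SECTOR_SIZE):
        start = pack_offset(index)
        yield index, data[start:start + SECTOR_SIZE]


def is_pack(pack: bytes) -> bool:
    """True if the block is sector-sized and begins with a pack start code."""
    return len(pack) == SECTOR_SIZE and pack.startswith(PACK_START_CODE)
```

Random access then amounts to seeking to `pack_offset(n)` and reading 2048 bytes, with no need to scan the stream for the next start code.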

Pack contents

Each pack begins with a pack header and contains one or two packets, and no more. The information in one pack is all of one kind, which may be navigation data (a NAV Pack), video, audio, or subpicture.
The NAV pack contains the system header and two fixed length packets called Presentation Control Information (PCI) and Data Search Information (DSI).
The video, audio, and subpicture packs contain only the Packetized Elementary Stream (PES) for the content, and, if needed, a padding packet.
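
Since each pack holds only one kind of data, the first start code after the pack header is enough to tell the kinds apart. The sketch below assumes the standard MPEG-2 layout (a 14-byte pack header whose last byte carries a stuffing length in its low 3 bits) and the standard MPEG-2 stream IDs. The function name and the returned labels are hypothetical, and treating "system header present" as "NAV pack" is an assumption based on the description above.

```python
PACK_START_CODE = b"\x00\x00\x01\xba"
PACK_HEADER_SIZE = 14  # fixed part of an MPEG-2 pack header


def classify_pack(pack: bytes) -> str:
    """Return a rough label for a pack based on its first start code."""
    if pack[:4] != PACK_START_CODE:
        raise ValueError("not an MPEG-2 pack")
    stuffing = pack[13] & 0x07               # pack_stuffing_length
    pos = PACK_HEADER_SIZE + stuffing
    if pack[pos:pos + 3] != b"\x00\x00\x01":
        raise ValueError("no start code after the pack header")
    stream_id = pack[pos + 3]
    if stream_id == 0xBB:                    # system header: assumed NAV pack
        return "nav"
    if 0xE0 <= stream_id <= 0xEF:            # MPEG video streams
        return "video"
    if 0xC0 <= stream_id <= 0xDF:            # MPEG audio streams
        return "mpeg audio"
    if stream_id == 0xBD:                    # private stream 1
        return "private stream 1"
    if stream_id == 0xBF:                    # private stream 2
        return "private stream 2"
    if stream_id == 0xBE:                    # padding stream
        return "padding"
    return "other"
```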

Non-standard stuff

The MPEG-2 system standard fortunately left provisions for non-standard data in the form of private streams. There are two private stream types, and only one of them has timing information in the form of Presentation Time Stamps (PTS) and Decoding Time Stamps (DTS). The actual content of a private stream is determined by the application, in our case DVD-Video.
Private Stream 1 is the one that has the timing information, so DVD-Video uses this stream for subpictures and for all the additional audio systems that are not MPEG (AC3, DTS audio, LPCM, etc.). The actual content of each private stream packet is determined by the sub-stream number.
The other stream, Private Stream 2, is used for the navigation packets found in the NAV pack.
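
As a rough illustration of how the sub-stream number selects the content of a Private Stream 1 packet, the sketch below reads the first payload byte of the PES packet and maps it to a content type. The sub-stream ranges used here are the commonly documented DVD-Video values, not something stated in the text above, so treat them as an assumption. The function names are hypothetical.

```python
def pes_payload_offset(packet: bytes) -> int:
    """Offset of the payload in a PES packet that has the optional header."""
    if packet[:3] != b"\x00\x00\x01":
        raise ValueError("not a PES packet")
    # Bytes 0-3: start code + stream ID, 4-5: packet length,
    # 6-7: flags, 8: PES_header_data_length.
    return 9 + packet[8]


def substream_kind(packet: bytes) -> str:
    """Map the sub-stream number of a Private Stream 1 packet to a label."""
    if packet[3] != 0xBD:
        raise ValueError("not a Private Stream 1 packet")
    sub_id = packet[pes_payload_offset(packet)]
    # Assumed ranges (commonly documented for DVD-Video):
    if 0x20 <= sub_id <= 0x3F:
        return "subpicture"
    if 0x80 <= sub_id <= 0x87:
        return "AC3"
    if 0x88 <= sub_id <= 0x8F:
        return "DTS audio"
    if 0xA0 <= sub_id <= 0xA7:
        return "LPCM"
    return "unknown"
```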


The VOBU

The next higher logical structure is called the Video OBject Unit, or VOBU. Each VOBU starts with a NAV pack and contains approximately half a second of the program. The size of the VOBU is determined by the video coding unit called a Group Of Pictures (GOP). A VOBU will contain one or more complete GOPs, as needed. The last video pack in each VOBU is padded, if needed, with either a padding stream or stuffing bytes. Audio and subpictures whose decoding time stamp (DTS) values fall within the same range as the video are included in each VOBU. Audio is not padded until the end of the cell, so audio frames can span VOBUs.
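
Since every VOBU starts with a NAV pack, a sequence of packs can be grouped into VOBUs just by watching for NAV packs. A minimal sketch, assuming the packs have already been labelled by some earlier step (the `"nav"` label and the function name are hypothetical):

```python
def split_into_vobus(kinds):
    """Group a sequence of pack labels into VOBUs.

    A new VOBU begins at every "nav" label. Packs that appear before the
    first NAV pack are skipped, since they belong to a VOBU that started
    earlier (for example, in a previous file).
    """
    vobus = []
    for kind in kinds:
        if kind == "nav":
            vobus.append([kind])       # start a new VOBU
        elif vobus:
            vobus[-1].append(kind)     # continue the current VOBU
    return vobus
```

For example, `split_into_vobus(["audio", "nav", "video", "audio", "nav", "video"])` drops the leading audio pack and returns two groups, each beginning with `"nav"`.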

The Cell

Cells are the next higher logical structure, containing any number of whole VOBUs. Their length and placement are entirely arbitrary and depend on the overall organization of the program (movie). Chapters, multiple angles, titles, and even how the "prev" and "next" buttons on a remote act all dictate the placement of cells.


The VOB

The VOB is a collection of one or more cells. An entire title could use just one VOB, but titles usually use more. Sometimes the division is arbitrary, usually along the lines of a new VOB for each chapter and, within the VOB, a cell for each scene. This is not a requirement. In fact, there is only one place where separate VOBs are required, and that is multiple angles.

Several 1GB files

All the content for one title set (VTS) is contiguous on the DVD, but it is broken up into 1GB files in the computer-compatible file systems for the convenience of the various operating systems. You can see that there really is no break by examining the second or a later file and looking at the Logical Block Address (LBA) contained in the NAV packs.
The files are broken up without regard to content, which is why it is difficult to process any file but the first: a later file most likely will not start at a VOBU boundary (that is, with a NAV pack). The usual split point is at 524,287 sectors (1,048,574 KB, or 1,073,739,776 bytes). In hexadecimal this is 7FFFF sectors (2^19 - 1), or 3FFFF800 bytes.
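
The split-point figures can be checked with a few lines of arithmetic. The sketch below also shows how a position in a later file could be mapped back to a continuous sector number, assuming every earlier file was split at the usual point. The function name and the 1-based `file_number` parameter are hypothetical, and whether the result matches the LBA stored in a given NAV pack depends on that assumption holding for the disc in question.

```python
SECTOR_SIZE = 2048
SECTORS_PER_FILE = 2**19 - 1            # 524,287 sectors, 0x7FFFF

# The figures quoted above:
assert SECTORS_PER_FILE == 524_287 == 0x7FFFF
assert SECTORS_PER_FILE * SECTOR_SIZE == 1_073_739_776 == 0x3FFFF800
assert SECTORS_PER_FILE * SECTOR_SIZE // 1024 == 1_048_574   # size in KB


def continuous_sector(file_number: int, byte_offset: int) -> int:
    """Sector number counted from the start of the first file.

    file_number is 1-based. Assumes every earlier file holds exactly
    SECTORS_PER_FILE sectors (the usual split point).
    """
    if file_number < 1:
        raise ValueError("file_number is 1-based")
    if byte_offset % SECTOR_SIZE:
        raise ValueError("offset is not on a sector boundary")
    return (file_number - 1) * SECTORS_PER_FILE + byte_offset // SECTOR_SIZE
```

Under these assumptions, the first sector of the second file is `continuous_sector(2, 0)`, which is 524,287.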
