Appendix A: QuickTime Confidential
Time in QuickTime
Unlike most hardware-based media, QuickTime does not use a frame-based time system. Instead, QuickTime uses a system based on timescale and time values. Timescale is an integer value that indicates how many time values make up one second of a movie. By default, a new QuickTime movie has a timescale of 600—meaning that there are 600 time values per second.
Frame rate is determined by the concept of interesting time: times at which the movie changes. If a movie changes every 40 time values, then it has an effective frame rate of 15 frames per second (since 600 divided by 40 equals 15). When Jitter determines how many frames are in a movie with the getframecount message, it scans the movie for interesting times, and reports their number. The frame and jump messages move the play head between different points of interesting time.
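The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the timescale relationship, not Jitter or QuickTime code; the function names are hypothetical, and the list of change times is made up for the example.

```python
def effective_frame_rate(timescale, interval):
    """Frames per second for a movie that changes every `interval` time values."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return timescale / interval


def count_interesting_times(change_times):
    """Count the distinct time values at which the movie changes."""
    return len(set(change_times))


# A hypothetical 2-second movie at timescale 600 that changes every 40 time values.
timescale = 600
change_times = list(range(0, 2 * timescale, 40))

print(effective_frame_rate(timescale, 40))     # 15.0
print(count_interesting_times(change_times))   # 30
```

Counting distinct change times here plays the role that scanning for interesting times plays when Jitter answers getframecount.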
For more information on the relationship between timescale and frame rate, refer to Tutorial 4: Controlling Movie Playback.
In Jitter, recording operations permit you to specify a frame rate, and Jitter takes care of the calculations for you as it constructs a new movie. Editing operations, in contrast, use time values to determine their range of effect.
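To make the distinction concrete, here is a minimal sketch of converting between frame numbers and time values. It assumes a constant frame rate that divides the timescale evenly; the helper names are hypothetical and are not jit.movie messages.

```python
def frame_to_time(frame, fps, timescale=600):
    """Time value at which `frame` begins, assuming a constant frame rate."""
    return frame * timescale // fps


def time_to_frame(time_value, fps, timescale=600):
    """Index of the frame being shown at `time_value`."""
    return time_value * fps // timescale


# An edit covering frames 30 through 44 of a 15 fps movie, expressed in time values.
start = frame_to_time(30, fps=15)   # 1200
end = frame_to_time(45, fps=15)     # 1800, the start of the frame after the range
print(start, end)
```

At a timescale of 600 and 15 frames per second, each frame spans 40 time values, so an edit range given in time values can begin or end partway through a frame.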
Optimizing Movies for Playback in Jitter
Although Jitter will gladly play any movie you throw at it, there are some guidelines you can follow to improve performance. Sadly, there is no precise recipe for perfection—performance in a real-time application such as Jitter is the result of the interaction between a movie's track codecs (which affect data bandwidth and processor load), movie dimensions, frame rate and, to some extent, the complexity of the media being processed.
Codecs
Visual media, in particular, contain large amounts of data that must be read from disk and processed by your computer before they can be displayed. Codecs, or compressor/decompressors, are used to encode and decode data. When encoding, the goal is generally to thin out the data so that less information has to be read from disk when the movie plays back. Reading data from disk is a major source of overhead when playing movies.
When decoding, the goal is to return the data to its pre-encoded state as quickly as possible. Codecs, by and large, are lossy, which means that some data is lost in the process. As users, our goal is to figure out which codec offers the greatest quality at the greatest speed for our particular application.
Audio Codecs
Codecs are available for both video and audio tracks. For online distribution, you might want to use an MP3 (MPEG Audio Layer 3) or AAC codec to create smaller files. In Jitter, however, if you are playing movies with video and audio tracks, you'll achieve the best results with uncompressed audio (PCM audio) simply because there will be no audio codec decompression overhead.
Video Codecs
Video codecs may be handled in hardware (you may have a special video card that provides hardware compression and decompression of a particular codec, like MPEG or Motion-JPEG) or, more typically, in software. In Jitter, hardware codec support is only relevant to video output components. Movie playback will always use a software codec, with the important exception of the jit.movie object's direct to video output component feature (see Tutorial 22: Video Output Components and the Object Reference entry for the jit.movie object for more information).
Video codecs generally use one or both of the following schemes: spatial and temporal compression.
Spatial compression is probably familiar to you from the world of still images. JPEG, PNG and PICT files each use types of spatial compression. Spatial compression schemes search a single image frame for patterns and repetitions that can be described in a simpler fashion. Most also simplify images to ensure that they contain these patterns. Nevertheless, more complex images are harder to compress, and will generally result in larger files. Spatial compression does not take time into account—it simply compresses each frame according to its encoding algorithm.
Temporal compression is unique to the world of moving images, since it operates by creating a description of change between consecutive frames. In general, temporal compression does not fully describe every frame. Instead, a temporally compressed movie contains two types of frames: keyframes, which are fully described frames (usually spatially compressed, as well), and regular frames, which are described by their change from the previous keyframe.
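The keyframe scheme described above can be illustrated with a toy model. In this sketch a "frame" is just a short list of numbers, and each regular frame stores only the values that differ from the most recent keyframe. The names and the keyframe interval are assumptions made for the example; no real codec is this simple.

```python
def encode(frames, keyframe_interval=4):
    """Encode frames as keyframes plus differences from the latest keyframe."""
    encoded = []
    key = None
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            # Keyframe: store the whole frame.
            key = frame
            encoded.append(("key", list(frame)))
        else:
            # Regular frame: store only the values that differ from the keyframe.
            delta = {j: v for j, (v, k) in enumerate(zip(frame, key)) if v != k}
            encoded.append(("delta", delta))
    return encoded


def decode(encoded):
    """Rebuild every frame by walking the stream forwards."""
    frames = []
    key = None
    for kind, data in encoded:
        if kind == "key":
            key = list(data)
            frames.append(list(key))
        else:
            frame = list(key)
            for j, v in data.items():
                frame[j] = v
            frames.append(frame)
    return frames


# Four tiny "frames" of five values each.
movie = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 9, 0],
]
stream = encode(movie)
assert decode(stream) == movie
print(stream[1])  # ('delta', {1: 9})
```

The second frame is stored as a single changed value rather than five, which is where the savings come from when consecutive frames are similar.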
For applications where a movie will be played from start to finish, temporal compression is quite useful. Codecs like Sorenson use temporal compression to create extremely small files that are ideal for web playback. However, temporal compression is not a good choice if you need to play your movie backwards, since the order of the keyframes is vital to properly describing the sequence of images. If we play a temporally compressed movie backwards, the change descriptions will be processed before the keyframe that describes the initial state of the frame! Additionally, the Sorenson codec is quite processor-intensive to decompress. Playback of a Sorenson-compressed movie will be slower than playback of a movie compressed using a lighter method.
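The cost of reverse playback can be seen in a toy model where regular frames are stored as changes from the previous keyframe. The stream layout and the function name below are hypothetical, and real decoders are far more sophisticated, but the dependency on an earlier keyframe is the same.

```python
def decode_frame_at(encoded, index):
    """Decode one frame; also return how many stream entries were visited."""
    # Walk backwards until the keyframe that this frame depends on is found.
    k = index
    visited = 1
    while encoded[k][0] != "key":
        k -= 1
        visited += 1
    frame = list(encoded[k][1])
    if k != index:
        # Apply the stored changes on top of the keyframe.
        for position, value in encoded[index][1].items():
            frame[position] = value
    return frame, visited


stream = [
    ("key", [0, 0, 0, 0]),
    ("delta", {1: 9}),
    ("delta", {1: 9, 2: 9}),
    ("delta", {1: 9, 2: 9, 3: 9}),
]

# Playing backwards: each regular frame forces a search for its keyframe.
for i in reversed(range(len(stream))):
    frame, visited = decode_frame_at(stream, i)
    print(i, frame, visited)
```

In this sketch the last frame cannot be shown until the decoder has stepped back through three earlier entries to reach the keyframe, whereas forward playback meets the keyframe first.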
For Jitter playback, we recommend using a video codec without temporal compression, such as Photo-JPEG or Motion-JPEG. (Photo- and Motion-JPEG use the same compression method, but Motion-JPEG is optimized for special hardware support; see the note above.) At high quality settings, JPEG compression offers a nice compromise between file size and image quality. It's also relatively simple to decode, so your processor can be put to better use than decompressing video.
If image quality is of principal importance, the Animation codec looks better than Photo-JPEG, but creates much larger files.
Different versions of QuickTime support different audio and video codecs. For instance, QuickTime 5 doesn't support the MPEG-4 codec, although QuickTime 6 does. (QuickTime 10 has dropped many codecs. To use legacy codecs, you can still get QuickTime 7 from Apple.com.) You should experiment with different codec settings to find the best match for your particular use of Jitter.
Movie Dimensions and Frame Rate
Compared to codec choice, movie dimensions and frame rate are more straightforward factors in Jitter performance. Simply put, a bigger image frame or a higher frame rate means that Jitter has more data to process each second.
A 640x480 movie produces 1,228,800 individual values per frame for Jitter to process (640 times 480 times 4, with separate values for the alpha, red, green and blue channels). A 320x240 movie produces a mere 307,200 individual values per frame, one quarter the size of the larger movie. On most machines, 640x480 movies will give fine performance with one or two processes. If your patch is elaborate, you may have to change to the smaller size.
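The per-frame figures are simple to reproduce. The following sketch assumes four values per pixel (alpha, red, green and blue), as in the text; the function name is hypothetical.

```python
def values_per_frame(width, height, planes=4):
    """Number of individual values in one frame (planes = values per pixel)."""
    return width * height * planes


large = values_per_frame(640, 480)
small = values_per_frame(320, 240)

print(large)           # 1228800
print(small)           # 307200
print(large // small)  # 4
```

Halving both dimensions cuts the data to one quarter, which is why a modest reduction in frame size can make such a difference.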
If you are working with DV media, your movies are recorded at 29.97 frames per second (in NTSC) or 25 frames per second (in PAL). Even using a 360x240 movie, Jitter has to process 10,357,632 values per second in NTSC and 8,640,000 values per second in PAL. Thinning this data by reducing the frame rate to 15 or 20 frames per second will improve performance significantly if you are using Jitter for heavy processing.
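The per-second figures come from multiplying the values in one frame by the frame rate. A minimal sketch, again assuming four values per pixel and using a hypothetical function name:

```python
def values_per_second(width, height, fps, planes=4):
    """Number of individual values to process each second of playback."""
    return width * height * planes * fps


# round() absorbs the tiny floating-point error introduced by 29.97.
ntsc = round(values_per_second(360, 240, 29.97))
pal = round(values_per_second(360, 240, 25))
reduced = round(values_per_second(360, 240, 15))

print(ntsc)     # 10357632
print(pal)      # 8640000
print(reduced)  # 5184000
```

Dropping an NTSC movie from 29.97 to 15 frames per second roughly halves the amount of data to process each second.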
Our Favorite Setting
We've found that, for most movies, the following parameters yield consistently good results:
- 320x240 frame size
- 30 frames per second
- Video tracks: Photo-JPEG codec, using a medium to high spatial quality setting
- Audio tracks: no compression
Summary
QuickTime's time model is somewhat different from the standard frame-based model, relying on a timescale value to determine the number of time values in a second of QuickTime media. The jit.movie object allows for playback navigation using both frame and timescale models. All editing functions in the jit.movie object use the timescale model to determine the extent of their effect.
Determining the ideal settings for a movie used in Jitter is an inexact science. Actual performance depends on a number of factors, including codec, dimensions, frame rate and media complexity. For best results on current hardware, we recommend using 320x240, 15 fps, Photo-JPEG compressed movies.