CPU-based calculation of the optical flow can take up to two days on the test workstation. The quality of the results varies widely, depending on the input video material, the painting style selected and the corresponding neural-style parameters. Abstract style images, such as Lissitzky’s PROUN works or Kandinsky’s Black and Violet (used for test purposes), quickly lead to results that look impressive at first glance. However, in monochrome areas such as blue skies, or where there is quick movement or motion blur, artefacts and shifts in style and colour can occur. The same problem arises with sketch-like style templates.
Kagemusha: colour distribution
The example with two consecutive film frames from Kagemusha shows that the transitions in the foreground and middle ground are very consistent, while colours and areas in the monochrome blue of the sky are distributed more or less at random. This can only be fixed by masking and replacing the sky in After Effects. For this purpose, the movement of the camera is tracked and transferred to a newly calculated still image of the sky. Similar problems occur in The Cabinet of Dr Caligari and Pan’s Labyrinth, where dark shadow areas in particular lead to inconsistent transitions. In some areas, the foregrounds and backgrounds must be calculated separately and recombined in After Effects.
The Cabinet of Dr Caligari: animated masks
For one minute of footage at a frame rate of 24 frames per second, a total of 1440 single images would have to be calculated. On the test system, this process would take around 20 minutes per FullHD image. The processing time per minute of footage would thus be 480 hours, or 20 days. There are several options to reduce this time. One is to use several Titan-X graphics cards in parallel for the GPU computation. Another is to accelerate the computation by lowering the quality of the style transfer, using another model (NIN-Imagenet) or another optimiser (adam instead of lbfgs). Justin Johnson has since introduced a quicker Torch implementation on GitHub, which he calls fast-neural-style. However, it was not available for use on the present clips.
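The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `render_time` and the `gpus` parameter are illustrative, and dividing by the number of graphics cards assumes ideal linear scaling, which real multi-GPU setups will not fully reach.

```python
def render_time(duration_s, fps, minutes_per_frame, gpus=1):
    """Return (frames, hours, days) needed to stylise a clip.

    Assumption: frames are independent and the work divides evenly
    across `gpus` cards (ideal scaling, not a measured value).
    """
    frames = duration_s * fps
    total_minutes = frames * minutes_per_frame / gpus
    hours = total_minutes / 60
    return frames, hours, hours / 24


# One minute of footage, 24 fps, 20 minutes per FullHD frame (figures from the text)
frames, hours, days = render_time(60, 24, 20)
print(frames, hours, days)  # 1440 480.0 20.0
```

Under the same idealised assumption, `render_time(60, 24, 20, gpus=4)` yields 120 hours, or five days.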
The examples shown here were calculated using images with a width of 800 pixels and then upscaled in After Effects (detail-preserving upscale) or VirtualDub (Lanczos / WarpResize). Upscaling with WarpResize in particular results in sharply defined edges. One disadvantage is that the brush strokes are scaled along with the image, which means the painting style is not always transferred true-to-scale. This issue can be concealed by reworking the scaled images in Photoshop, After Effects (ToonIt), Studio Artist or FotoSketcher using the corresponding painting styles and then superimposing them over the original images. The best results by far, however, come from calculating the images in original size, which is also the most computing-intensive method.
Line drawings for superimposing the original image
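The scaling issue described for the 800-pixel renders can be put into numbers. The following sketch assumes a FullHD target width of 1920 pixels and a uniform upscale; the function name `upscale_factors` is illustrative.

```python
def upscale_factors(render_width, target_width):
    """Return (linear_factor, pixel_ratio) for a uniform upscale.

    linear_factor: how much every brush stroke is enlarged.
    pixel_ratio:   how many more pixels a full-size render contains.
    """
    linear = target_width / render_width
    return linear, linear ** 2


# Assumption: 800 px renders are upscaled to a 1920 px wide FullHD frame
linear, pixels = upscale_factors(800, 1920)
print(round(linear, 2), round(pixels, 2))  # 2.4 5.76
```

Every brush stroke is therefore enlarged by a factor of about 2.4, while a render in original size has to process roughly 5.8 times as many pixels.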
The frame rate poses a similar problem, in that 15 fps has to be resampled to 24 fps. For older films, such as Battleship Potemkin or Metropolis, this is relatively unproblematic since frame duplication is hardly noticeable. For movement-intensive clips, such as Pan’s Labyrinth or Frida, the situation is different: the missing intermediate images have to be extracted or interpolated using tools such as Timewarp (After Effects) or Twixtor.
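The frame duplication case can be sketched as a simple index mapping. This is a minimal illustration, not what Timewarp or Twixtor do (those tools synthesise new intermediate images); the function name `duplication_map` is hypothetical.

```python
def duplication_map(src_fps, dst_fps, n_out):
    """For each output frame, return the index of the source frame shown.

    Each output frame reuses the most recent source frame, so some
    source frames appear twice (plain duplication, no interpolation).
    """
    return [(i * src_fps) // dst_fps for i in range(n_out)]


# 15 fps source resampled to 24 fps: 8 output frames cover 5 source frames
print(duplication_map(15, 24, 8))  # [0, 0, 1, 1, 2, 3, 3, 4]
```

In every group of five source frames, three are shown twice and two once. This uneven cadence is what becomes visible as stutter in movement-intensive clips.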