DHQ: Digital Humanities Quarterly
2020
Volume 14 Number 4

The Phenomenon of Interwar City Symphonies: A Combined Methodology of Digital Tools and Traditional Film Analysis Methods to Study Visual Motifs and Structural Patterns of Experimental-Documentary City Films

Abstract

The 1920s and 1930s saw the rise of the city symphony, a film phenomenon of experimental-documentary city films that took the modern metropolis as their protagonist and presented it in its multiple facets and kaleidoscopic nature. While Ruttmann’s Berlin. Die Sinfonie der Grosstadt (1927) and Vertov’s Chelovek s kinoapparatom (1929) gained canonical status and exert a considerable influence on filmmaking to this very day, the city symphony has remained a somewhat downplayed and neglected phenomenon in film history. Indeed, besides the handful of well-known titles, there have been more than eighty films made around the globe, many of which fell into oblivion and have received little or no scholarly attention at all. Moreover, even the canonical examples have never systematically and structurally been analyzed in detail in a comparative way. The result is the lack of a sophisticated and clear definition of the city symphony, based on both the canonical films and the greater city symphony corpus. My research sets in at this point, addressing the conspicuous absence of an analysis of city symphony characteristics. It does so by making use of a combined methodology of traditional film analysis methods and digital tools. This article reflects on the methodology applied and tools chosen and put into practice to study and analyze the visual material of city symphonies, which with their highly complex structure, dense imagery, and themes as well as their experimental techniques and striking editing patterns, are a perfect fit for a computational film analysis. Particularly, the digital film/video annotation software ELAN, the Cinemetrics method, a bar chart representation of shot lengths, and the grid visualizations of multiple film frames created with ImageJ play a role in this case study, which also broaches the aspect and benefits of manual digital analysis.

City Symphonies and a (Half‑)Blind Spot in Film History

This article is based on my PhD dissertation Rewinding the City Symphony: Historiography, Visual Motifs and Structural Patterns of Interwar City Symphony Films (Ghent University, 2018) and reflects on the methodology applied for the analysis of visual material. In this regard, what follows can be considered as a research report – with a special focus on the methods chosen and put into practice to study and analyze city symphonies, making use of digital humanities tools in a film studies research. Particularly, the digital film/video annotation software ELAN, the Cinemetrics method, a bar chart representation of shot lengths, and the grid visualizations of multiple film frames created as image montages with ImageJ play a role in this case study.
City symphonies are experimental-documentary city films of the 1920s and 1930s that take the modern metropolis as their protagonist and present it in its multiple facets and kaleidoscopic nature. Often described as cross-section or montage films (e.g. Kracauer, 1947; Weihsmann, 1997; Koeck and Roberts, 2010), they combine a great number of visual urban motifs and themes with a highly complex editing structure. The most famous examples are Walther Ruttmann’s Berlin. Die Sinfonie der Grosstadt (Berlin, Symphony of a Great City, 1927) and Dziga Vertov’s Chelovek s kinoapparatom (Man with a Movie Camera, 1929). To this day, these films exert a considerable influence on filmmaking and the cinematic language in general. Recent city films such as London Symphony (Alex Barrett, 2017), Symphony of Now (Johannes Schaff, 2018), or Mark Cousins’s I am Belfast (2015) and Stockholm My Love (2017) are just the most obvious indications for this impact and popularity.
However, despite the ongoing (or reviving) influence on the production of urban films and the canonical status of Ruttmann and Vertov’s films, the city symphony has remained a somewhat downplayed and neglected phenomenon in film history. On the one hand, it is considered a crucial part of canonical film history today and a basis for the fast-growing scholarship on city films in general (see e.g. Bordwell and Thompson, 2019; Nowell-Smith, 1997; Koeck and Roberts, 2010; Mazierska and Rascaroli, 2003); on the other hand, it has never really been more than “a colorful footnote to film history” [Kinik 2008, 16]. Indeed, besides the handful of well-known titles, which also include Manhatta (Paul Strand and Charles Sheeler, 1921) and Rien que les Heures (Nothing But Time, Alberto Cavalcanti, 1926), more than eighty such films were made around the globe in the interwar period.[1] Many of these films fell into oblivion and have received little or no scholarly attention at all. Hence, the city symphony phenomenon was much more complex, widespread and diverse than generally assumed and acknowledged in film historiography. Moreover, even the canonical examples have never been analyzed systematically, structurally, and in detail in a comparative way. While there have been studies on individual titles, publications putting the city symphony in the context of the broader field of the city film, and a few attempts at discussing a group of films together and highlighting some features (rough descriptions and short characterizations), a comprehensive analysis of shared visual and structural qualities and the central characteristics of the city symphony has never been made.[2] The result is that a sophisticated and clear definition of the city symphony, based on both the canonical films and the greater city symphony corpus, is still missing after decades of film studies.
This blind spot is striking since it does not concern obscure films from the edge of film history, but a vast international phenomenon and films that were highly celebrated at their time and – once more – whose popularity and influence are recognizable to this very day.
What constitutes the city symphony when we look at the elements shared by the canon films Berlin, Man with a Movie Camera, Rien que les Heures, and Manhatta as well as by the bigger corpus? What are the visual motifs and the formal and structural features of the city symphony? My research starts from these questions, addressing the conspicuous absence of an analysis of city symphony characteristics by examining the films themselves.

A Perfect Fit: City Symphonies and Computational Analysis Methods

With their highly complex structure, dense imagery and themes as well as their experimental techniques and striking editing patterns, city symphonies are a perfect fit for a computational film analysis. Through segmentation and annotation, such an analysis makes it possible to identify individual elements (such as motifs) structurally, precisely, and in detail, to break down the complexity of these films into separate component parts, and to see how different aspects are linked to each other. Moreover, city symphonies do not only share the fractured nature and overwhelming, chaotic impressions with the modern city[3] – their very subject matter – they also relate to the database logic of the computer. According to Manovich (2001), the database as a cultural form introduced the “correlate” to the linearity and cause-and-effect trajectory of the logic of the narrative and a new way of structuring our experiences, presenting the world as a list of items and refusing to order this list. Every item possesses the same significance as any other. In addition, Manovich explicitly describes Man with a Movie Camera as “perhaps the most important example of a database imagination in modern media art,” which anticipated the database logic of digital media in presenting an almost “linear printout [...] of a database” and “a catalog of subjects that one could expect to find in a city of the 1920s,” supplemented with “the most amazing catalog of film techniques.” [Manovich 2001, 241][4] Similarly (and following Manovich’s ideas), Cowan places Ruttmann’s Berlin and the city symphony form in general in the same tradition of the database logic, as it emphasizes “paradigmatic over syntagmatic relations, presenting the world as an inventory of possible choices rather than a causal chain of narrative events” [Cowan 2014, 60].
Moreover, apart from their form underlining and embracing the database logic,[5] both Man with a Movie Camera and Berlin were based on actual databases documenting city life. Vertov, as we can see in his film, arranged his film reels on shelves according to keywords, such as “city traffic,” “factory,” “machines,” and “market,” while Ruttmann used a card catalogue system for organizing his footage (see Figure 1 and Figure 2). In this regard, a computational study of city symphonies also comprises a self-reflexive moment as it shares its database logic with its very subject of research and aims to build a catalogue or database of city symphony features itself by looking at individual component parts of the films as equal items.
Figure 1. 
Vertov's film reels archive in Man with a Movie Camera
Figure 2. 
Example of Ruttmann's “Kartothek” used for Berlin, published in Film-Kurier (1926, September 11)
Finally, city symphonies are a perfect case study for exploring the possibilities and applications of computational methods for film analysis as they allow for a variety of research questions, including the issue of film style (editing patterns, experimental techniques etc.), questions of content (motifs, themes), and specific aspects concerning the representation of people and the city.
Figure 3. 
Analysis scheme for the city symphonies research

Film Analysis Step 1: ELAN

For the study of visual motifs and structural patterns of interwar city symphonies, I made use of the digital video annotation software ELAN.[6] More precisely, ELAN formed the center and very foundation of the bottom-up analysis, on which further analysis steps built in terms of data evaluation and visualizations (see Figure 3).
Created by the Language Archive of the Max Planck Institute for Psycholinguistics in the Netherlands, ELAN was originally developed for a community of linguists and communication scholars. Nevertheless, this free and open-source tool can also be used in film studies and for visual and structural analysis, as it belongs to a group of professional video annotation programs that support manual annotation tasks, based on a tiered or layered timeline approach to segmentation and annotation. Put differently, a media stream can be divided into multiple segments grouped in categories. ELAN supports this via horizontal rows (or tiers) organized vertically under each other. Each tier corresponds to an annotation type or category the user can define herself according to specific needs and research questions. Within the individual tiers, segments can be defined and annotated, which appear in a horizontal, time-based distribution on the particular tier (see Figure 4).[7]
Figure 4. 
ELAN interface with tiers and annotations (example Berlin)
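The tiered timeline model described above can be pictured as a small data structure. The following is only a minimal sketch: the class names, the helper `cross_section`, and the sample annotations are hypothetical, and nothing here reflects ELAN's internal file format or any ELAN API. Only the three tier names are taken from the analysis described in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start_ms: int      # segment start on the timeline, in milliseconds
    end_ms: int        # segment end (exclusive), in milliseconds
    annotation: str    # free text or a value from a predefined vocabulary


@dataclass
class Tier:
    name: str
    segments: list = field(default_factory=list)

    def at(self, time_ms):
        """Return the annotations of all segments that cover the given time."""
        return [s.annotation for s in self.segments
                if s.start_ms <= time_ms < s.end_ms]


def cross_section(tiers, time_ms):
    """Collect what every tier says about one moment of the film."""
    return {tier.name: tier.at(time_ms) for tier in tiers}


# Hypothetical sample annotations, for illustration only.
tiers = [
    Tier("shot list", [Segment(0, 2400, "shot 1"),
                       Segment(2400, 3100, "shot 2")]),
    Tier("imagery/visual shot elements", [Segment(0, 2400, "train, tracks"),
                                          Segment(2400, 3100, "telegraph poles")]),
    Tier("filmic techniques", [Segment(2400, 3100, "fast motion")]),
]

snapshot = cross_section(tiers, 2500)
```

Reading the tiers "vertically" at one point in time, as `cross_section` does, corresponds to what the ELAN interface shows when the cursor is placed on the timeline: all annotations that apply to the current fragment at once.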
As a single stand-alone platform, ELAN facilitates the analysis of visual material by providing the digital video file (the film), the timeline, segmentation, and annotations in one and the same place, thereby directly linking these elements to each other so that jumping to a point in the timeline (or in a tier) brings one to the corresponding fragment and vice versa. What you see is directly where and what you segment and annotate. Moreover, it provides a visualization of the film, its timeline, and the segments and annotations, thus making visible in the same platform the film picture and a visualization of its analysis.
There are other digital video annotation tools, including Anvil, Advene, and Lignes de temps, which follow an approach comparable to ELAN’s and allow for similar analysis operations and options.[8] However, a brief exercise with these four software programs demonstrated that ELAN was the most convenient and most suitable for my research endeavor. This concerned the media player with its playback rate possibilities, its media control panel, and viewing options as well as the handling and viewing possibilities of the timeline, tiers, and annotation sections. More precisely, ELAN makes it possible to speed up the playback rate of the video up to 200% and slow it down to 10%. Moreover, the media control panel supports jumping ahead or back in the video (and the timeline) in steps from one second down to one millisecond,[9] while the video image can also be enlarged and presented in a separate window, detached from the control panel and annotation sections. In addition, the tiers can be zoomed in and out, which also makes the detailed and precise segmentation and annotation of extremely short segments possible. Indeed, ELAN was also the most convenient of the four tested tools in how it facilitates the creation of segments and the adding of detailed annotations and comments. As the focus of the analysis in my case lay both on the visual content and on the structural patterns of city symphonies, for the latter of which montage plays a key role, a shot-based segmentation and annotation was the desired goal. In this regard, with its slow playback options and zoom-in display of the timeline and tiers, ELAN made it possible to identify and mark even extremely short takes of only a few frames (or even a single frame) and to analyze sections of extremely fast editing.
In fact, city symphonies are often marked by such fast and rhythmic editing, for which Berlin’s opening sequence with a train approaching the city and Man with a Movie Camera’s cross-cutting of telephone operators and a factory worker putting cigarettes into boxes are emblematic examples (see Figure 5 and Figure 6). ELAN’s option to set a shot change on the timeline linked to the picture (while the film is playing or paused) in a very detailed and controlled way made it possible to create a shot-based segmentation of these more challenging parts of city symphonies as well, parts that one can go back to and replay at every point on the timeline as often as necessary.[10]
Figure 5. 
ELAN screen shot of Berlin's opening train-ride sequence
Figure 6. 
ELAN screen shot of Man with a Movie Camera's telephone-operators-and-cigarette-boxes sequence (reel 4)
Further advantages of ELAN were the flexibility of defining multiple tiers according to one’s own research purposes, the option to use the same (self-defined) template of tiers as a blueprint for various films, the possibility of creating and using predefined vocabularies for individual tiers,[11] and the combination of these aspects. Another benefit is the tool’s search function, which makes it easy to find a specific fragment – without scrolling, clicking, or reeling through the film – and to get a list of all annotations, shots or segments including a particular search word. In this way, if the annotations have been made consistently, one can, for instance, find all the shots containing a traffic policeman in Berlin in a single search request (see Figure 7). Finally, ELAN’s complete and detailed user manual, its support of various digital video file formats and codecs, its export options, its computer system requirements, and its status as an actively supported and maintained tool contributed to the decision to work with this software.
Figure 7. 
ELAN screen shot of search results for shots including a traffic policeman in Berlin
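Why consistency in the annotation process matters for such a search can be illustrated with a short sketch. It is not ELAN's search implementation; the function `find_shots` and the sample annotations are invented for illustration and are not taken from the actual Berlin data set.

```python
def find_shots(annotations, term):
    """Return (shot number, annotation) pairs whose text contains the term."""
    needle = term.casefold()
    return [(shot, text) for shot, text in annotations
            if needle in text.casefold()]


# Invented sample annotations (shot number, annotated visual elements).
annotations = [
    (101, "tram, traffic policeman, pedestrians"),
    (102, "shop window, mannequins"),
    (103, "Traffic policeman, cars"),
    (104, "policeman on horse"),
]

hits = find_shots(annotations, "traffic policeman")
shot_numbers = [shot for shot, _ in hits]
```

With these sample data, the search term "traffic policeman" returns shots 101 and 103 but not shot 104. If the same figure had been annotated as "traffic cop" in some shots, those shots would be missed, which is the reason for working with a fixed vocabulary wherever possible.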
For the analysis of city symphonies, I manually segmented the four films Berlin, Man with a Movie Camera, Rien que les Heures, and Manhatta and added descriptions, words, and keywords to the segments in a number of tiers focusing on motifs and visual elements, bigger thematic sequences, people, the cityscape, specific locations, and filmic techniques such as fades, multiple exposures, split screens, and fast motion (see Figure 4).[12] As mentioned above, segmenting in this case meant a very fine-grained segmentation at shot level as the basis for adding annotations to the individual tiers.[13] Hence, I identified all shots and marked all shot changes,[14] and used the same self-defined tier template for segmenting and annotating all four films in order to make a structural and precise comparison of their data possible.[15]

Film Analysis Step 2: Word and Shot Lists, Excel Sheets, Cinemetrics, and ImageJ

After the segmenting and annotating operation in ELAN, a time-consuming activity that took four weeks for the four canon films and required a high level of concentration to guarantee consistency and accuracy, further analysis steps followed. Basically, there were two tasks that required a couple of additional steps. The four data sets from ELAN (each corresponding to one of the films) had to be merged into one, allowing for a direct comparison and combined data evaluation. Following my research questions, I was interested in the shared features of the city symphony films rather than in detailed analyses of individual titles. Moreover, this data merging into a combined data set of city symphony films should happen according to the foci on visual motifs and structural patterns.[16] Consequently, the analysis should result in two lists: one for visual motifs, one for structural patterns. In this regard, annotations from the tier “imagery/visual shot elements” were used for the motif analysis, while the very segmentation, the tier “shot list,” as well as the tier “filmic techniques,” and the timeline visualization in ELAN formed the basis for the structural analysis (see Figure 3). Nevertheless, it needs to be said that both parts of the analysis, lists, and sections also stand in dialogue with each other and cannot be split that strictly. Indeed, it is also not desirable to do so as visual motifs and structural patterns, content and form are interrelated.[17]

Motif Analysis

For the analysis of visual motifs, I exported the tier on visual elements from ELAN. More precisely, ELAN provides an export option called “List of Words,” for which one can select a single or several tiers. The software compiles a wordlist in alphabetic order from all annotations in the selected tier(‑s) in a text file, optionally accompanied by the occurrences of the words. In this way, I was able to get an overview of all annotated visual shot elements of the four canon films.
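The principle of such an alphabetic word list with occurrences can be sketched as follows. This is an illustration only and does not reproduce ELAN's own export logic: the function name, the assumption that elements within one annotation are separated by commas, and the sample annotations are all hypothetical.

```python
from collections import Counter


def word_list(annotations, separator=","):
    """Count in how many annotations (shots) each element occurs.

    Returns (element, occurrences) pairs in alphabetic order.
    """
    counts = Counter()
    for text in annotations:
        # A set ensures that an element named twice in one shot counts once.
        elements = {part.strip().lower() for part in text.split(separator)}
        elements.discard("")
        counts.update(elements)
    return sorted(counts.items())


# Invented annotations of one tier, one string per shot.
annotations = [
    "tram, pedestrians",
    "tram, cars",
    "street cleaning vehicle",
    "cars, Tram",
]

result = word_list(annotations)
```

Here `result` lists "cars" with two occurrences, "pedestrians" and "street cleaning vehicle" with one each, and "tram" with three. The counts say in how many shots an element was annotated, not how prominent it is within those shots.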
Moreover, the occurrences gave me an idea about important aspects. However, while an appearance of trams in at least 68 shots in Berlin hints at a rather important and recurring visual element, the amount of only one shot presenting a street cleaning vehicle does not necessarily lead to the opposite conclusion of a minor motif. In fact, theoretically, all trams could have been visible in the background of the frame, whereas the street cleaning vehicle in a single shot of three minutes could have been in the very center of attention. In this regard, the street cleaning vehicle would form an essential visual motif too, even though it occurs only in a single shot. This, of course, depends also on the way and detail of analysis and the guidelines the scholar defines and applies in the annotation process.[18] Nevertheless, the example demonstrates that the occurrences exported from ELAN can be considered as indications in terms of importance rather than as absolute entities. Indeed, quantitative and qualitative aspects have to go together, since the trained eye and viewing experience of the film scholar can identify nuances and decide about emphases, thereby enriching, correcting, or verifying results from a strictly formal analysis of visual shot elements.[19]
Based on the exported word lists of the four canon films, I clustered the individual elements per film to motif groups and added a number of further aspects from the tiers relating to questions of people, elements of the cityscape (e.g. streetscape, industrial site, specific building or monument), and greater narrative sequences and themes, such as commute, morning routines or lunch break. Subsequently, these processed word lists were merged manually into a master list of visual motifs, indicating if a specific motif identified in this bottom-up approach could be found in one, several, or all four canon films as shared feature of city symphonies (see Figure 8).
Figure 8. 
Excerpt of motifs list
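The merging step was done manually, but its logic can be expressed in a few lines. The sketch below assumes that each film's processed word list has already been reduced to a set of motif groups. The function names are hypothetical, and the film and motif labels are neutral placeholders, not findings of the analysis.

```python
def master_list(motifs_per_film):
    """Map each motif to the sorted list of films in which it was identified."""
    merged = {}
    for film, motifs in motifs_per_film.items():
        for motif in motifs:
            merged.setdefault(motif, set()).add(film)
    return {motif: sorted(films) for motif, films in sorted(merged.items())}


def shared_by_all(merged, film_count):
    """Return the motifs that occur in every film of the set."""
    return [motif for motif, films in merged.items()
            if len(films) == film_count]


# Placeholder data: which motif groups were identified in which film.
motifs_per_film = {
    "Film 1": {"motif A", "motif B", "motif C"},
    "Film 2": {"motif A", "motif B"},
    "Film 3": {"motif A", "motif C"},
    "Film 4": {"motif A"},
}

merged = master_list(motifs_per_film)
shared = shared_by_all(merged, film_count=len(motifs_per_film))
```

In this placeholder example only "motif A" is shared by all four films, while `merged["motif B"]` records that the motif occurs in "Film 1" and "Film 2" only.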

Structure Analysis

The analysis of structural patterns and filmic (montage) techniques was more complex than the motif analysis as it required further sub-steps and introduced two additional computational analysis tools: Cinemetrics (or better: the Cinemetrics approach) and ImageJ in combination with the plug-in ImageMontage. Before explaining what these two analysis methods comprise, there is another step to mention, which forms the foundation for both of these methods.
While for the motif analysis I exported alphabetic word lists from ELAN, for the structure analysis I exported shot lists with timecodes. In fact, when identifying shot boundaries on the timeline with the cursor during the segmentation process, ELAN notes the timecodes, which can be exported (via the “tab-delimited text” option and text files) into Excel spreadsheets as timecoded shot lists.[20] Hence, the fine-grained segmentation on shot level was the basis for this export option and, subsequently, for a film rhythm and montage analysis both in the Cinemetrics tradition and with the software ImageJ. Both of these tools rely on timecodes and shot durations, supplemented, in the case of ImageJ, by a precise indication of frame numbers.
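How a timecoded shot list with durations can be derived from a tab-delimited export is sketched below. The column layout (tier name, begin time in milliseconds, end time in milliseconds, annotation) is an assumption made for this example; the actual columns of an ELAN export depend on the options chosen in the export dialog. The function name and sample values are hypothetical.

```python
import csv
import io


def read_shot_list(tsv_text, tier_name="shot list"):
    """Parse tab-delimited rows into shot records with start time and duration.

    Assumed columns: tier name, begin (ms), end (ms), annotation.
    """
    shots = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 4 or row[0] != tier_name:
            continue  # skip empty lines and rows from other tiers
        begin_ms, end_ms = int(row[1]), int(row[2])
        shots.append({
            "shot": row[3],
            "start_ms": begin_ms,
            "duration_ms": end_ms - begin_ms,
        })
    return shots


# Invented sample export with three shots and one row from another tier.
sample = (
    "shot list\t0\t2400\tshot 1\n"
    "shot list\t2400\t3120\tshot 2\n"
    "filmic techniques\t2400\t3120\tfast motion\n"
    "shot list\t3120\t3160\tshot 3\n"
)

shots = read_shot_list(sample)
durations = [shot["duration_ms"] for shot in shots]
```

For the sample rows, the durations are 2400, 720, and 40 milliseconds. The last value corresponds to a single frame at an assumed rate of 25 frames per second, the kind of extremely short take that the fine-grained segmentation is meant to capture.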
In 2005, Yuri Tsivian and Gunars Civjans (later joined by Daria Khitrova) launched Cinemetrics, a software tool that supports the measurement of films by recording the length, (type), and number of shots and presenting the results of these measurements in bar charts as a visual representation of the rhythm and editing patterns of films.[21] In these charts, the shots are arranged according to their succession in the film on the x-axis, while the height of the individual bars on the y-axis visualizes the duration of each shot (see Figure 9). Such an analysis and representation of shot durations is especially valuable for city symphonies, since these films – as montage films – are concerned with both the tempo of the city and the rhythm of the films themselves, a rhythm created purely by visual means and by translating musical guidelines into the visual language of cinema.
Figure 9. 
Cinemetrics shot length and film rhythm analysis of Man with a Movie Camera
Figure 10. 
Shot length and rhythm analysis of Rien que les Heures in Excel (with ELAN data)
In this regard, I combined an analysis in the Cinemetrics tradition with data compiled in ELAN. The emphasis lies on tradition (or the Cinemetrics approach) since I did not use the Cinemetrics software itself, but took the data exported from ELAN, the start times and durations of shots, to compile comparable bar charts of shot lengths in Excel (see Figure 10). The motivation to work with Excel and ELAN instead of Cinemetrics was to combine the analysis work in one tool and avoid segmenting and analyzing the same material multiple times in different programs. Moreover, Excel in combination with ELAN guaranteed a higher degree of accuracy, as shot changes can be identified more precisely in ELAN than with the Cinemetrics tool, and all four films were analyzed and measured in exactly the same way.[22] In fact, it needs to be said that ELAN itself, in its timeline approach, offers a representation of rhythm and shot lengths, too, as it displays shots (or segments) in their respective length horizontally next to each other, split by vertical lines representing shot changes (see Figure 5 and Figure 6). However, the tool does not allow zooming out far enough to see an entire act or even the whole film.[23] The bar charts in Cinemetrics and Excel, on the other hand, support such a complete (over‑)view.
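The idea of such a shot-length chart can be illustrated without Excel or the Cinemetrics software. The following sketch prints one bar per shot as text and computes the average shot length. Unlike the charts described above, the bars run horizontally, with one line per shot in order of succession. The function name, the scale factor, and the shot durations are invented for illustration.

```python
def text_bar_chart(durations_s, scale=2.0):
    """Render one line per shot; bar length is proportional to shot duration."""
    lines = []
    for number, duration in enumerate(durations_s, start=1):
        # Every shot gets at least one character so that very short takes stay visible.
        bar = "#" * max(1, round(duration * scale))
        lines.append(f"{number:>3} {bar} {duration:.2f}s")
    return "\n".join(lines)


# Invented shot durations in seconds: two longer shots, a burst of fast cuts, a longer shot.
durations = [4.0, 2.5, 0.5, 0.25, 0.5, 3.0]

average_shot_length = sum(durations) / len(durations)
chart = text_bar_chart(durations)
print(chart)
```

Even in this crude form, the alternation between long bars and a cluster of very short ones makes an accelerating passage visible at a glance, which is what the bar charts are used for in the rhythm analysis.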
In the end, film rhythm and editing do not only rely on shot length, but also relate to the alternation between shot distances, movements within shots, camera movements, and the content of shots. As Vertov himself put it, montage is “the sum of various correlations,” including “the correlation of planes” (shot distances), “the correlation of movement within the frame, the correlation of light and shadow,” and “the correlation of recording speed” [Vertov 1984, 90–1].[24] In this regard, ImageJ with the plug-in ImageMontage provides a form of analysis and visualization that takes these elements into account and complements the graphical Cinemetrics analysis with a more figurative expression. More precisely, in its specific application developed by Manovich’s Software Studies Initiative,[25] this software makes it possible to combine a selection of individual film frames into a single image as a grid, that is, as a montage of images. As Olesen explains:

As a cinemetric tool, the ImageJ/ImagePlot software developed within Cultural Analysis distinguishes itself by processing films as image sets to create visualisations, instead of extracting metadata to produce reduced, statistical representations. ImageJ breaks down video files into sequences of separate images and seriates them according to specific image features in various visualization types. For example, the Montage visualisation type orders frames onto a grid according to their sequential order, in rows from left to right, enabling a quick, comprehensive overview of movements between shots.  [Olesen 2017, 170]

Indeed, by breaking down video files into separate still images, ImageJ recreates (more or less accurately) the individual frames of a film in digital form, comparable to their analogue equivalents on the film strip. Via a text file listing the file names of individual images among the total number of images of a film, the ImageMontage plug-in makes it possible to define a very specific selection of pictures to be compiled into such a montage visualization. For the analysis of city symphony structures, film rhythm, and editing patterns, I defined and produced three different types of image montages: a montage consisted either of all individual frames of a shot or sequence (see Figure 11) or of a selection of frames from a specific scene or sequence, presenting either one frame per second (see Figure 12) or one frame per shot (see Figure 13 and Figure 14).[26] Together, these three montage types, used depending on the length and complexity of the specific shot or sequence to be analyzed, facilitated a deeper and more nuanced study of film rhythm and different editing patterns by looking at the visual material of the films in a condensed or compact way.
Figure 11. 
Image Montage of Rien que les Heures's newspaper sequence (all frames of the shots, 31'10''-31'26'')
Figure 12. 
Image Montage of Berlin's newspaper headlines, rollercoaster, merry-go-round and spiral sequence (one frame per second, 46'55''-47'34'')
Figure 13. 
Image Montage of Manhatta's crowds in the city sequence (one frame per shot, 1'41''-2'59'')
Figure 14. 
Image Montage of Man with a Movie Camera's train ride sequence (one frame per shot, 10'16''-11'14'')
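The three selection types (all frames, one frame per second, one frame per shot) amount to three ways of choosing frame numbers, which are then written out as a list of file names. The sketch below shows this logic under several assumptions that are not taken from the article: frames are numbered from zero, the file name pattern `frame_000123.png` is hypothetical, the frame rate of 25 frames per second is only an example value, and the middle frame is used to represent each shot.

```python
def all_frames(first, last):
    """Every frame of a shot or sequence (inclusive frame numbers)."""
    return list(range(first, last + 1))


def one_per_second(first, last, fps):
    """The first frame of every second of the sequence."""
    return list(range(first, last + 1, round(fps)))


def one_per_shot(shot_bounds):
    """The middle frame of each shot, given (first, last) frame pairs."""
    return [(first + last) // 2 for first, last in shot_bounds]


def file_names(frames, pattern="frame_{:06d}.png"):
    """Turn frame numbers into file names (the pattern is an assumption)."""
    return [pattern.format(number) for number in frames]


# Example: a three-second sequence at an assumed 25 fps, cut into two shots.
per_second = one_per_second(0, 74, fps=25)
per_shot = one_per_shot([(0, 10), (11, 74)])

# Text content of a selection file, one file name per line.
selection_text = "\n".join(file_names(per_shot))
```

Which frame best represents a shot is a choice of its own: the first frame may still show the end of a dissolve, so a frame from the middle of the shot is used here as a neutral default.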
To be able to produce these image montages and define specific frames to be included in the grids, it was essential to convert ELAN’s timecodes of shots and shot changes into frame numbers corresponding to the numbers of individual images in the film (or video file respectively). In fact, had ELAN offered an option to define specific segments (or shots) not only in a time structure but also in frames,[27] this conversion step could have been skipped. Either way, the importance of ELAN and of segmentation as the basic step once more becomes obvious here.
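The conversion from timecodes to frame numbers is a simple calculation once the frame rate of the video file is known. The sketch below assumes timecodes of the form hh:mm:ss.mmm, zero-based frame numbering, and a constant frame rate; the value of 25 frames per second is only an example and has to be replaced by the actual rate of the file at hand. The function names are hypothetical.

```python
def timecode_to_ms(timecode):
    """Parse a timecode of the form 'hh:mm:ss.mmm' into milliseconds."""
    hours, minutes, seconds = timecode.split(":")
    total_seconds = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return round(total_seconds * 1000)


def ms_to_frame(time_ms, fps):
    """Convert milliseconds to a zero-based frame index at a constant frame rate."""
    return int(time_ms * fps // 1000)


# Example: a shot change at 31 minutes 10 seconds, assumed frame rate 25 fps.
start_ms = timecode_to_ms("00:31:10.000")
start_frame = ms_to_frame(start_ms, fps=25)
```

At 25 frames per second one frame lasts 40 milliseconds, so a timecode of 00:31:10.000 corresponds to frame 46750. For video files with a non-integer frame rate, results close to a frame boundary should be checked by eye against the extracted images.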
Finally, for the analysis of structural patterns, the timeline representation and visualization in ELAN itself was essential as well. Above, I have already mentioned the visualization of film rhythm and editing in the horizontal timeline approach. Moreover, the timeline and annotation options in various tiers make it possible to see how different elements are linked to each other and to detect relations, recurring elements, and patterns. In addition, the tier “filmic techniques” contributed to the identification of structural patterns as it gave an overview of experimental techniques and shot changes per film.
In the end, all these analysis steps together, from ELAN via Excel to Cinemetrics and ImageJ, made it possible to compile a master list of shared structural patterns of Berlin, Man with a Movie Camera, Rien que les Heures, and Manhatta (see Figure 15). However, as it has become obvious from the various analysis steps, this list was created in a less straightforward way than the master list of visual motifs and required a higher degree of interpretation.
Figure 15. 
Excerpt of structures list

From the Canon to the Broader Corpus of City Symphonies

In a final analysis step, the two master lists of visual motifs and structural patterns were expanded to the broader corpus of city symphony films. Regarding this greater corpus, I used the overview of identified shared characteristics of Berlin, Man with a Movie Camera, Rien que les Heures, and Manhatta to check, based on viewings, if and to what extent they could be found in these titles, too. This means that I did not start from a detailed in-depth close reading of each film in ELAN, as I did with the canon films, but used the canon features as a check list to analyze to what extent these features are applicable to a bigger group of films and thereby can be defined as characteristics for the interwar city symphony phenomenon in general.[28]

City Symphony Features, Motifs, and Structures (Results 1)

Through the methodology described above, I could indeed identify an extensive catalogue of shared visual motifs and structural patterns of interwar city symphonies and their presentation of the modern city.[29] In fact, while, on the one hand, I could underpin certain aspects emphasized in scholarship and film historiography, on the other hand, I could also revise other elements and mark them as misreadings or (over‑)hasty conclusions. Moreover, I could also add new findings. The result is a much more nuanced, detailed and sophisticated picture of the phenomenon of city symphonies, of which this article includes just a small fraction.[30]

Supporting Lights on Garbage-Minding, Crowds, and Rhythmic Montage

My analysis, for instance, confirmed the “garbage-minding” discussed by Kracauer (1960, p. 54) in relation to Berlin and Rien que les Heures, and I could identify it as a general feature of city symphonies.[31] I could do so by detecting and analyzing the application of this motif in the canon films, which does not only concern the actual depiction of waste and dirt,[32] but also includes (and overlaps with) the elements of street cleaning and waste collection as city activities that are presented especially as morning routines (see Figure 16). Moreover, I could find similar shots and scenes in the bigger city symphony corpus. At least twenty other titles include a focus on waste.[33] The same is true for the depiction of street cleaning activities.
Figure 16. 
Waste collection in Berlin (top row) as well as street cleaning and “garbage-minding” in Berlin (left), Man with a Movie Camera (middle), and Rien que les Heures (right)
A second feature often associated with city symphonies in the literature that I could support and strengthen through my analysis is the films’ focus on crowds (see e.g. Barsam, 1973; Weiss, 1995; Weihsmann, 1997; Dähne, 2013). While the look at the greater city symphony corpus revealed that crowds play an important role in at least forty-four of the sixty-seven titles studied, the ELAN analysis of the canon films made it possible to take a closer look at the construction of these crowds, that is, at how visual content and cinematic techniques create this motif together. In this regard, it is remarkable that there are not too many “real” crowd shots with masses of people. Instead, we can often speak of an accumulation technique, by which the impression of crowds is created visually through editing. In the case of Berlin, seventy-two shots display more than fifty people, which is about 6.5% of the entire film. The crowd-effect, however, arises significantly from the sum of street shots with a few people as well as smaller groups (see Figure 17).[34] Edited together, they form the city crowds and huge movements within the cityscape. The same is true for Man with a Movie Camera, which contains fifty-six “real” crowd shots (3.3% of the entire film), while presenting 705 shots with a single person (41.1% of the entire film). Put together by means of montage, these individuals merge into the urban crowd of the cinematically created new Soviet city, which combines the cityscapes of Moscow, Odessa, and Kiev.
Figure 17. 
Example of crowd construction by accumulation in Berlin (one frame per shot)
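The percentages quoted above can be read as simple proportions of annotated shots, although that reading is an assumption on my part. As a minimal sketch, and assuming each shot has received one people-count label, such shares could be computed as follows. The label names and the toy data are hypothetical illustrations, not the annotation vocabulary or the data of this study.

```python
from collections import Counter


def category_shares(labels):
    """Return the percentage of shots per label, rounded to one decimal."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Toy data: hypothetical labels for ten shots, not taken from any film.
toy_labels = (
    ["single person"] * 4
    + ["small group"] * 3
    + ["crowd"] * 1
    + ["no people"] * 2
)

shares = category_shares(toy_labels)
for label, share in shares.items():
    print(f"{label}: {share}%")
```

With real data, the list of labels would come from the exported annotation tier rather than being typed in by hand.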
In combination with the Cinemetrics approach and ImageJ, ELAN also allowed me to underline the symphonic form and rhythmic montage of city symphonies, described, among others, by Grierson (1933), Rotha (1936), Kracauer (1947), Jacobs (1949), Barsam (1973), Weihsmann (1997), and Dähne (2013). Indeed, the analysis of the canon films showed in detail how the well-calculated and -structured alternation of shot lengths and the varying duration of shots together with associative editing, analogous and contrasting montage, as well as cross-cutting strategies generate this rhythmic form (see Figures 5-6 and Figures 10-14). In addition, the visual elements of opening, closing, starting, stopping, arriving/entering, and leaving activities (motifs list) also greatly contribute to city symphonies’ rhythmic form and the general pulse of the city. There are, for example, shots of opening doors, gates, blinds, and rolling shutters in Rien que les Heures, Berlin, Manhatta, and Man with a Movie Camera, particularly presented in the morning. Moreover, there are scenes of people arriving at work, entering buildings, cafés, shops, elevators, telephone boxes, cars, busses, taxis, and various types of trains. The opposite activities of leaving and de-boarding are depicted in combination with the same elements and situations.
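The alternation of shot lengths described above can also be summarized numerically. The following is a minimal sketch, assuming a plain list of shot durations in seconds; it computes the average shot length and a moving average that smooths the sequence so that accelerations and decelerations of the cutting rate become visible. The function names and the toy durations are hypothetical and do not reproduce the Cinemetrics software or any measurement from the films.

```python
def average_shot_length(durations):
    """Mean duration of all shots, in the unit of the input (here seconds)."""
    return sum(durations) / len(durations)


def moving_average(durations, window):
    """Mean duration over each run of `window` consecutive shots."""
    return [
        sum(durations[i:i + window]) / window
        for i in range(len(durations) - window + 1)
    ]


# Toy data: eight invented shot durations that shorten and then lengthen.
toy_durations = [6.0, 5.0, 4.0, 2.0, 1.0, 1.0, 2.0, 4.0]

asl = average_shot_length(toy_durations)
trend = moving_average(toy_durations, window=3)

print(f"Average shot length: {asl} s")
print("Smoothed trend:", [round(value, 2) for value in trend])
```

A dip of the smoothed values below the overall average marks a passage of faster cutting; the window size is a free choice that trades detail for readability.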

The Myths of the Nature-Free Industrial City without Landmarks

I could also rectify certain aspects that had been introduced in the literature. For instance, intertitles and hired actors (or better: staged scenes) are not as unusual as often claimed (see e.g. Jacobs, 1949; Uricchio, 1982; Turvey, 2011).[35] However, while these points could be supported especially through the look at the greater city symphony corpus, the aspect of natural elements in the city as a focus in city symphonies could be identified and revised particularly through the act of doing the ELAN analysis and studying the compiled (and exported) data with regard to Berlin, Man with a Movie Camera, Rien que les Heures, and Manhatta. More specifically, during the activity of marking shots and adding notes to the visual elements in these shots, I noticed the recurrence of aspects such as clouds and sky, water and trees. A focus on these phenomena was further confirmed in the wordlists exported from ELAN and merged into the motifs master list. In addition, the step of checking the greater city symphony corpus also underlined the presence of and focus on these natural elements in city symphonies. There are at least thirty-six titles depicting aspects such as clouds, water and trees in the city. Moreover, the detailed ELAN analysis of the canon films showed that these elements often function as atmospheric shots and images of transition or introduction. Clouds and water also work as indicators of the passing of time. While this might not be too surprising, the close reading of the films also stressed both the aspect of urbanized natural elements and the coexistence of natural and urban aspects in the city. Indeed, especially water appears in the city as civilized or urbanized water – in relation to street cleaning, wet streets, gutters and sewerage as well as in combination with water fountains and, in the case of Man with a Movie Camera, the waterfall of the Dnieper electricity power station (see Figure 18). 
Here, a natural source is transformed into a modern and industrialized energy force (see also Roberts, 2000). Moreover, natural elements exist in the city, such as trees that often appear in street scenes at the line between sidewalk and street, especially in Vertov and Ruttmann’s urban universes.[36] In addition, natural elements exert an influence on urban structures and, in this way, form part of the urban environment, permeate urban structures, and become interwoven with the city. This is especially the case with wind, rain, and thunderstorm in Berlin, Man with a Movie Camera, and Rien que les Heures (see Figure 19). Natural elements and organic forces are thus introduced to the city in these films, which show that the modern and industrialized city is anything but nature-free. This stands in contrast to Horak (1995), who states that the role of nature in city symphonies is basically limited to its role in leisure-time activities, especially in the European films.[37] However, while natural elements, indeed, also appear in relation to sports and the recreation sector in many city symphonies, the ELAN analysis shows that they play a much more prominent and complex part – also (and especially) in the European films. They permeate the city just like the city permeates and urbanizes nature, and they coexist with urban structures.[38]
Figure 18. 
Urbanized Water in Rien que les Heures, Manhatta, Man with a Movie Camera, and Berlin
Figure 19. 
Wind in Berlin, Man with a Movie Camera, and Rien que les Heures
Another shortcoming of the literature and film historiography that my analysis could revise is the assumption that city symphonies exclude landmarks and monuments. Weihsmann (1997), for example, claims regarding Berlin that the film makes no reference to actual places. Balázs also points in that direction as he describes Berlin as a non-touristic film “with little use as a guide to a stranger arriving in the city for the first time” and sees both Ruttmann and Cavalcanti’s films as “mental associations” that have nothing to do with presenting reality or specific places [Balázs 2010, 162–163]. From discussions like these and the undeniable focus of city symphonies on everyday life (hence, they are non-touristic films),[39] the general assumption emerged that these films do not present landmarks, specific buildings, and monuments in favor of an abstract presentation of a generic cityscape.[40] Nevertheless, while city symphonies also present numerous rather unspecific street scenes and city views, recognizable landmarks and monuments are included in quite a number of these films (in at least twenty-four titles). I could specifically mark and analyze these landmarks in ELAN regarding Berlin, Man with a Movie Camera, Rien que les Heures, and Manhatta. In this context, it is remarkable that all four canon films not only present recognizable constructions like the Woolworth Building, Brooklyn Bridge, or Trinity Church (Manhatta), the Place de la Concorde with the Luxor Obelisk and the contours of the Eiffel Tower in the background (Rien que les Heures), the Berliner Dom, the Berliner Funkturm, or the Rotes Rathaus (Berlin), or the Bolshoi Theatre in Moscow (Man with a Movie Camera),[41] but also do so in the very first minutes. This can also be seen in the light of introducing the cityscape. As a sort of establishing shot, these images function as structuring devices of introducing and establishing space. 
They anchor the generic and often anonymous streets and squares that follow in a concrete and identifiable place.[42]

Horizontal and Vertical City Explorations and Temporal Phasing

Closely linked to the identification and revision of certain aspects as shortcomings or myths, I could finally also find aspects that have received little attention so far or have even been completely overlooked. For instance, by the close analysis of the canon films in ELAN in combination with the examination of the greater city symphony corpus, I could discover the prominence of harbors (and harbor cities)[43] and the extensive depiction of swimming as the most popular sports and leisure activity. Moreover, once again by the act of doing the analysis in ELAN as well as the resulting data and motifs list, I could identify an attention given to the depiction of wet streets (with their reflecting surfaces). I observed this first in Berlin (especially at night and in combination with electric lighting, neon signs, and vehicles’ headlights), but could also detect wet street scenes in the other canon films and the extended city symphony corpus.[44] Similarly, the focus on stairs came to light, as did their function of underlining the verticality of the city – its above and below, including subways, streets, bridges, viaducts, and tall buildings.
Some of the most interesting findings also resulted from the in-depth exploration of the structuring patterns. Indeed, while generally the city in city symphonies is perceived as chaotic and overwhelming, there is a highly organized structure to create and emphasize this effect, among others through cross-sectional and rhythmic editing.[45] However, to keep a balance with the turbulent and fragmented city impressions, city symphonies also apply a number of ordering strategies to structure the city experiences and keep things from becoming overly abstract. Literature has frequently (and often exclusively) referred to the day structure or dawn-to-dusk form (see e.g. Grierson, 1933; Rotha, 1936; Noguez, 1985; Weihsmann, 1997; MacDonald, 2001; Vogt, 2001; Dähne, 2013), but, as my analysis demonstrates, the ordering strategies are much more substantial and include both temporal and spatial structures.
Regarding spatial structuring patterns, I want to focus here on the horizontal and vertical establishment of space and city explorations identified and analyzed in ELAN in all four canon films.[46] The arrival-into-the-city subject is part of the horizontal explorations.[47] This is also true for rides through the city, which give the viewer an idea about the spatial dimensions of the city that unfolds in front of her eyes.[48] Moreover, Berlin, Manhatta, Rien que les Heures, and Man with a Movie Camera also establish the city in a vertical perspective. Above, I have already mentioned the visual motif of stairs that underlines verticality. Moreover, in a sort of zoom-in or cut-down effect by means of montage, Berlin presents high angle panorama shots of the area around the Berliner Dom after the initial arrival into the city – with each shot presenting the city from a slightly closer and more tilted perspective. Subsequently, Ruttmann shows shots of lower angles looking up at the clock tower of the Rotes Rathaus, followed by ground level shots of deserted streets. These images provide a certain overview of the cityscape and add to a sense of orientation.[49] Strand and Sheeler, too, present high-angle perspectives and panorama shots from a greater distance before they get closer to street level in shots of the Staten Island Ferry and crowds walking in the streets of lower Manhattan. The same is true for Cavalcanti, who first introduces the city by a high (and abstracted) perspective of maps before switching to the urban space of Paris and, subsequently, the borough of Montmartre on ground level (see Figure 20). Vertov, in contrast, does not really make use of such a cut-down effect, but, nevertheless, visually hints at verticality throughout his film – from a coal mine underground to a bridge and factory chimneys reaching high in the air. 
Moreover, he exemplarily explores urban space vertically and horizontally in a sequence in reel three, in which he presents images of a wildly tilting and panning camera showing streets, junctions, and squares intercut with images of a looking and blinking eye (see Figure 21). While this sequence reflects on human vision, it also demonstrates both the vertical and horizontal filmic depiction of the city. Moreover, Vertov elaborately makes use of extremely high and low camera angles as well as tilted shots, which further emphasize verticality. The ELAN analysis of shot distances, camera angles, perspectives, and montage techniques demonstrates that these cinematic devices are also part of the other canon films and their investigation and presentation of urban space.
Figure 20. 
Vertical city introductions in Rien que les Heures (top), Berlin (middle), and Manhatta (bottom)
Figure 21. 
Eye-and-urban-panning-and-tilting sequence in Man with a Movie Camera, excerpt with all frames of the shots
Concerning temporal structuring patterns going beyond the well-known chronological day-structure,[50] the ELAN analysis particularly showed the aspect of the linear organization of events and activities into phases of before, during, and after (see Figure 22). To be clear, this is not the overall rule, but it is applied deliberately in a selective way and in combination with the chaotic and dizzying city impressions.[51] More precisely, there are two forms of this structuring technique. The first one refers to linear development (or phasing) within a single (continuing) scene, sequence, or section – thus by means of linear editing that ELAN accentuates in its timeline visualization. Examples are the murder scene in Rien que les Heures and the sports-races-and-matches episode in Berlin, which Ruttmann explicitly structures into shots presenting starting moments, followed by shots presenting the middle part of competitions, followed by shots presenting the final moment of a marathon with the winner crossing the finish line. In Man with a Movie Camera the ambulance-and-fire-brigade-operations episode is presented with an introductory, an operational and an extroductory phase in one continuing sequence. The second form is the organization of events and activities into different phases stretched out over greater parts of the film. This is especially used by Ruttmann and Vertov. For instance, they both present the beginning, during and end of work spread out over their films, combined with the activity cluster of machines at rest, starting to move, moving, stopping to move and at rest again.[52] Moreover, Ruttmann displays the before, during and after of a thunderstorm, interrupted by (and joined with) both a suicide and a fire-brigade-operation sequence. 
Vertov, too, often presents the operations of the cameraman in a preparatory, an executing and a concluding phase, as he shows him, intercut with other scenes, riding to the location of the film shoot, the actual act of filming and the departure from the location. These elements that are paused and picked up later – not infrequently to underline simultaneity – could be examined in detail especially through the non-linear analysis of keywords and individual component parts of the films, which ELAN makes it possible to disassemble and reassemble in a highly structured way.
Figure 22. 
Temporal Phasing in Berlin (top), Man with a Movie Camera (middle), and Rien que les Heures (bottom)
Figure 23. 
Cross-sectional editing and thematic clustering in Berlin (cleaning activities)
Figure 24. 
Cross-sectional and associative editing in Rien que les Heures (men-sleeping-in-the-streets sequence, one frame per shot)
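The non-linear keyword analysis behind the temporal phasing described above can be illustrated with a small sketch. It assumes that each shot carries a shot number and a list of keywords, builds an index from keyword to shot numbers, and splits each keyword's shots into runs of consecutive shots; a keyword with more than one run marks an activity that is paused and picked up again later. The function names, the `max_gap` parameter, and the toy shots are hypothetical and do not reproduce ELAN's own functions or the annotation data of this study.

```python
from collections import defaultdict


def keyword_index(shots):
    """Map each keyword to the list of shot numbers in which it occurs."""
    index = defaultdict(list)
    for number, keywords in shots:
        for keyword in keywords:
            index[keyword].append(number)
    return dict(index)


def runs(shot_numbers, max_gap=1):
    """Split shot numbers into runs, breaking where the gap exceeds max_gap."""
    groups = []
    for number in sorted(shot_numbers):
        if groups and number - groups[-1][-1] <= max_gap:
            groups[-1].append(number)
        else:
            groups.append([number])
    return groups


# Toy data: invented shot numbers and keywords, not an excerpt from any film.
toy_shots = [
    (1, ["thunderstorm", "wind"]),
    (2, ["thunderstorm"]),
    (3, ["fire brigade"]),
    (4, ["fire brigade"]),
    (5, ["thunderstorm", "rain"]),
]

index = keyword_index(toy_shots)
storm_runs = runs(index["thunderstorm"])
print(storm_runs)  # two runs: the motif is interrupted and then resumed
```

Raising `max_gap` treats short interruptions as part of the same run, which may be useful when a sequence is intercut with only one or two unrelated shots.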

Database and Narrative

The temporal structures – both the linear organization of activities and events into phases and the chronological day structure – introduce, to a certain degree, narrative aspects to city symphonies. More precisely, coming back to Manovich’s logic of the database versus the logic of the narrative, we can conclude from the analysis that city symphonies, in fact, relate to both of these logics, combining, merging, and interweaving a catalogue of (unordered) equal items with elements of linearity and a certain cause-and-effect trajectory. Indeed, in their wide, extensive, and diverse visual material and the accumulation of numerous urban impressions and phenomena organized in cross-sections, city symphonies anticipate the logic of the database. They collect and present an enormous catalogue of urban subjects and cross-sections of the modern city and modern urban life, which I have analyzed (and put into a database myself) in the motif analysis. Cross-sectional editing strategies and thematic clustering, which evoke simultaneity across space, form part of this database logic or aesthetic, too[53] – hence the coming together of visual content and filmic techniques and structuring patterns (see Figure 23 and Figure 24). City symphonies present the modern city as an inventory. However, this is only one side since, as elaborated above, there is also the structuring of urban events and activities into phases, which introduces linearity and a certain chain of cause and effect, of before, during, and after. Moreover, the day structure also causes a chronological/linear development.[54] Therefore, in their highly complex structuring and in their extensive urban material, city symphonies anticipate and reflect on the issue of database and narrative and play on tensions between a catalogue of motifs versus a structured history based on chronology and linearity. 
We can consider these films as both databases and narratives.[55] The ELAN analysis supports a study of both logics in the combination of its linear/chronological timeline approach and its data production, listing and cataloguing options. In this regard, the digital analysis tool itself also relates to both logics of the database and the narrative.

Reflections on Digital Tools, Research Transparency, and Manual Digital Analysis (Results 2)

I would like to conclude this research report with a couple of reflective remarks that go beyond the city symphony focus and underline some benefits of digital tools like ELAN and Cinemetrics, which may be relevant for the entire field of film and media studies.
The first one concerns budget and research frameworks. My project did not provide specific funding for the use of digital tools or a close collaboration with software developers or IT engineers, who might have come up with a different solution for my research endeavor. In fact, the use of digital tools was not part of the initial research plan of the umbrella project “City Symphonies: Urban Modernity, Film, and Avant Garde (1920-1940)”, funded by the Special Research Funds of Ghent University, but came later on my own initiative.[56] Therefore, the application of ELAN and the Cinemetrics approach in combination with Excel sheets and ImageJ can be considered as a best practice for getting my analysis done within the available resources and timeframe.[57] As I only used free and open-source software (compatible with most computer operating systems) as well as programs that were provided as part of the work package for researchers at Ghent University,[58] this approach should be widely accessible and might be interesting also for other scholars with little funding. Moreover, with their variable, flexible, and open form, the tools I used are applicable and customizable for a great variety of research questions.
The second remark is about research transparency and visual argumentation. ELAN allows for highly structured annotations, segmentations and systematic annotation protocols, which can be considered as externalizations of the observations that normally take place inside the scholar’s mind during close analysis [Melgar et al. 2017]. Moreover, they can easily be shared with other researchers, who load them, together with the corresponding video files, in their own ELAN version on their computer. In this way, the analysis process and research, including various steps, collected data, findings, and conclusions, become more transparent, retraceable, and shareable. Moreover, the ELAN software and interface as well as the Cinemetrics concept provide visualization methods that support statements, which, with traditional film analysis methods, are harder to underpin, generally only by written explanations. In fact, these data visualizations become visual arguments themselves. This is also true for the image montages made with ImageJ. The film or media scholar receives new powerful tools to illustrate, underline, and fortify findings and observations in a direct visual way.
The final comment concerns a significant benefit of using a tool such as ELAN in a manual way, which might also make us rethink the notion of digital analysis. From the methodology described in this article, the impression might arise that I have mainly reproduced manual film analysis practices in a digital environment: I noted shot changes and added descriptions about visual and filmic aspects to this shot-based protocol. While one could indeed question the advantage of ELAN applied this way, there is the very activity of doing ELAN – the very experience of segmenting and annotating – that leads to an in-depth knowledge of the film and a level of hands-on research and learning that neither a traditional film analysis nor a digital analysis with automatic shot detection alone can provide.[59] This is probably most obvious for editing techniques and montage patterns, as the “ELAN experience” can, to a certain degree, be considered as a recreation or reconstruction of the film editing process, starting in reverse from the end product of the final film. In fact, in its design, timeline approach, segmentation, and annotation possibilities, ELAN resembles digital film and video editing platforms, such as Final Cut Pro or Avid Media Composer.[60] In this regard, manually segmenting and annotating in ELAN on the level of shots imitates or retraces the editing process (in digital form) and, in so doing, provides an empirical way of analyzing a film in its structures and motifs. We can speak of a reverse-engineering scholarly experience. More precisely, it is an experience that – as a hands-on activity – includes both eyes and hands (or better: fingers), which results in a more intimate interaction with the material instead of just watching the film. Simply speaking, by using your eyes and fingers in tandem, you get to see things and observe connections and interrelations between various elements you would not be able to see otherwise – or might easily miss. 
In this article, I have discussed a couple of findings regarding city symphonies that surfaced this way, such as the construction of crowds out of individuals and small groups, the prominence of wet streets and stairs, the interweaving of natural elements with urban structures or the adroit horizontal and vertical city explorations. Moreover, to be clear, the fingers aspect does not only concern the reverse editing activity of detecting shot boundaries, but also the descriptive part of adding keywords and describing shots, which contributes to the closer examination of the material and the in-depth knowledge of the films. This hands-on angle of using digital tools is an underestimated aspect in digital humanities as digital is increasingly understood as automatic as opposed to manual. Large bodies of automatically acquired data have become available and, as a result, the trend is distancing scholars from their material. However, as I have shown, especially the manual hands-on use of digital tools such as ELAN or Cinemetrics can be extremely beneficial in giving film and media scholars (and students!) an invaluable experience of more intimately interacting with their material. This might also lead to a rethinking of the idea of what digital analysis is or can be. In the end, the manual fingers-eyes use of ELAN can be considered digital in two respects and, if you will, in its purest way: It is digital because it produces numbers; it is also digital, etymologically, because you do it with your fingers.[61]

Appendix

Links to Tools:

Notes

[1] For a comprehensive survey of interwar city symphony films, see Jacobs, Kinik and Hielscher (2018). In fact, apart from Dähne (2013), this edited volume can be considered as the first book-length publication and overview on the topic.
[2] This conclusion, though, should not deny the importance of key texts such as Dähne (2013) and Weihsmann (1997), who made a start in changing this situation. This is also true for Bollerey and Föhl (2014).
[3] They create these impressions by cinematic means.
[4] Manovich (2013) also did pioneering work in visualizing Vertov’s films by digital means, as did Heftberger (2016) and the project “Digital Formalism”. In fact, my methodology builds on both Manovich and Heftberger’s research. Another relevant study, to which I will return later, is Olesen (2017).
[5] This happens, among others, through cross-sectional editing and thematic clustering. I will come back to this.
[6] I made use of ELAN 4.9.4. for Mac OS X. For the software and additional information see https://tla.mpi.nl/tools/tla-tools/elan/ (last consulted on November 29th 2018).
[7] In addition, it is possible to mark and define hierarchical relationships and dependencies between tiers and annotations, a feature I have not made use of in my study. In fact, this article discusses the way I used ELAN for my research endeavor, whereas there are more functionalities, including also the possibility to generate different queries or retrieve more complex annotation statistics. For a more general discussion of ELAN, see Melgar et al. (2017).
[8] For the individual tools, see http://www.anvil-software.org/, https://www.advene.org/, and https://www.iri.centrepompidou.fr/outils/lignes-de-temps/ (last consulted on November 29th 2018). For a further elaboration of ELAN and other digital video annotation tools for film analysis, see Melgar et al. (2017).
[9] This depends on the zoom factor of the timeline viewer. When the zoom factor is at its maximum (1000), the control buttons, which make it possible to go to the next or the previous pixel, will make the video jump ahead or back one millisecond. Of course, jumping from frame to frame would be more accurate, which corresponds to jumping ahead or back 41.67 milliseconds (in case of a framerate of 24 frames per second).
[10] This is also an advantage of ELAN’s computer-based analysis in contrast to an analogue film analysis. While an analysis of extremely fast editing is not impossible in an analogue way when working with the film material itself on a montage bench, or by using a VCR or DVD player (see Kolaja and Foster, 1965), ELAN facilitates such an analysis in a feasible and time-saving way. This, of course, could be made even more time-efficient if shot changes were detected automatically. In fact, some of the tools, including Lignes de temps, provide such an automatic or semi-automatic segmentation. However, to my knowledge, there is no satisfactory tool including a reliable and error-free automatic shot boundary detection at this moment. Lignes de temps, for example, fails when it comes to extremely short segments, as is often the case in experimental films. Moreover, there is the general issue of fades, since the software cannot detect these soft changes. In addition, there is a crucial benefit of manually segmenting, to which I will come later in more detail.
[11] I used the option of predefined vocabularies especially for filmic techniques and shot distances.
[12] I also looked into camera angles, shot distances, camera and object movements. However, this was rather done in a selective way.
[13] “Shot” is understood here as the basic unit of the film, being characterized by a unity of time and space, a continuously exposed and uncut piece of film – a series of film frames without cuts.
[14] Shot changes included hard cuts, dissolves, fade-ins and fade-outs.
[15] The digital video files used for the analysis (mpeg-4-films, codecs H.264, AAC, and QuickTime Text) had their origins in the following DVDs: Berlin, die Sinfonie der Großstadt and Melodie der Welt, Walther Ruttmann, 1927, 1929 (Edition Filmmuseum 39, 2008), Man with a Movie Camera, Dziga Vertov, 1929 (Image Entertainment, 1998), Masterworks of American Avant-Garde Experimental film (1920-1970), including 2K restoration of Manhatta, Paul Strand and Charles Sheeler, 1921 (Flicker Alley, 2015), and Avant-Garde 3: Experimental Cinema 1922-1954, including Rien que les Heures from the collection of the George Eastman House, Alberto Cavalcanti, 1926 (Kino International, 2009). Unfortunately, the Masters of Cinema/Eureka Blu-ray of Man with a Movie Camera (2016), which presents the new Lobster Film/EYE Filmmuseum restoration from 2014, the uncut, most complete, and full-frame version of Vertov’s film, had not yet been released when I started my analysis. It needs to be said that the digital video files as well as their source DVDs differ from the original 35mm film material in terms of absolute frame accuracy. As silent films were shown at rates between 16 and 24 frames per second, these original rates have to be “transformed” to the DVD standard of 25 frames per second (European PAL format) or 30 frames per second (American NTSC format); otherwise the films look sped up. This film-to-video transfer happens either by interpolating frames, selectively and consistently repeating frames to convert to the video frame rate, or (in the case of a DVD based on a progressive frame by frame digital scan) by flags in the DVD instructing the player to generate extra frames during playback. I won’t go into more detail here but underline that I am aware of the issue of frame (in‑)accuracy with regard to digital video files and that ideally, one would aim to use frame-accurate digital versions of the films. 
However, for the purpose of my analysis of city symphonies’ features, an absolute frame accuracy was not necessary. Therefore, I went with the “unclean” digital video files without aiming to eliminate the interpolated or doubled frames first – simply for pragmatic reasons. For a frame accurate process, see Jacobs and Fyfe (2016).
[16] This double focus was also related to a further research sub-question: if the city symphony can be considered as a film genre or a film cycle. I approached this issue by using Altman’s semantic/syntactic approach to film genre (1984) as a heuristic tool for the analysis, which resulted in a split of the analysis and city symphony features into visual motifs (the city symphony semantics) and structural patterns (the city symphony syntax).
[17] In the analysis scheme, this is represented by the orange double arrows (see Figure 3).
[18] In fact, the more detailed the level of annotating, the more difficult it becomes to identify greater motifs. In this regard, elements such as cars, trams, taxis and trains can be combined under the common denominator of modern means of transportation.
[19] Moreover, the combination of the tier “imagery/visual shot elements” with other tiers, especially the row on shot distances, gives indications about the role of visual elements in the picture. Further aspects include shot length, position in the entire film, within a specific sequence etc.
[20] More precisely, this “tab-delimited text” export option allows one to select specific tiers and present them in the resulting text file in separate columns per tier, including (if selected) columns for start time, end time and duration of segments or shots. The data and columns can easily be transferred into Excel spreadsheets by copying and pasting.
[21] See http://www.cinemetrics.lv/index.php (last consulted on February 6th 2017). On the Cinemetrics tool and method, see also Heftberger (2016) and Olesen (2017).
[22] Apart from the analysis tool, the Cinemetrics website actually also provides a growing database of films measured and analyzed with the Cinemetrics software by scholars all over the world. This database comprises shot length analyses of Man with a Movie Camera and Berlin. However, to guarantee comparability and accuracy, I used the data from all four films compiled in ELAN.
[23] At least, this is the case in version 4.9.4. for Mac OS X.
[24] He speaks about “the visual correlation of shots,” “the visual ‘interval’” and “movement between shots” that is montage. In fact, later he refers to the “montage battle.”
[25] Manovich developed the plug-in ImageMontage and the software extension ImagePlot for the open-source scientific visualization software ImageJ, which was originally introduced by the National Institute of Mental Health in the US. Following Olesen it “advanced the combination of modern computation techniques with microscopy and gained widespread success in a broad range of disciplines in the natural sciences” [Olesen 2017, 168]. See also http://lab.softwarestudies.com/2014/03/how-to-visualize-4512-instagram-selfies.html (last consulted on July 20th 2018).
[26] Normally the tenth frame was selected to avoid imprecisions in shot changes, fades etc. See also Heftberger (2016) and Manovich (2013) and their earlier analyses and visualizations of Vertov’s films.
[27] With one second consisting of 24 frames or fewer (in the case of silent films).
[28] The strong focus on the canon films as prototypical examples was particularly motivated within the greater objective of my dissertation: an examination and critical reflection of the city symphony historiography. Therefore, the analysis also takes historiography and the city symphony as the product it became throughout historiography as its center of focus: How can the city symphony be defined according to the construct it became in film historiography – and from there expanding to the greater corpus of these films, which have only marginally been mentioned in film history writings? Moreover, there was also a very pragmatic reason as the level of detail applied in the analysis of the canon films could not have been extended to a greater corpus of sixty to eighty-five titles.
[29] Moreover, by arriving at a list of visual motifs and structural patterns – or semantic and syntactic features – shared by a great number of films, I could also fortify the idea of considering the city symphony as a full-fledged film genre rather than a limited cycle. Obviously, not every feature is present in every film, but this is precisely an observation that can be made for every film that is considered in the context of a broader genre.
[30] For a complete overview and extensive discussion of city symphony features, see Hielscher (2018). Since my analysis focused on shared features of the canon films and the broader corpus (thus the city symphony phenomenon as a whole), it excluded the comprehensive examination of individual films and the filmmakers’ individual projects. This also means that individual intellectual arguments remained in the background, even though they form part of these films.
[31] When discussing the revealing functions and capacity of the film medium to make things visible that are usually overlooked, Kracauer refers to the “wealth of sewer grates, gutters, and streets littered with rubbish” in Berlin before turning to Cavalcanti, who “in his Rien que les Heures is hardly less garbage-minded” [Kracauer 1960, 54].
[32] Berlin, Man with a Movie Camera and Rien que les Heures all depict waste objects – quite often as atmospheric images. While Manhatta forms rather an exception, Strand and Sheeler’s film also depicts smoke as waste product of heating and the maritime industry.
[33] Georges Lacombe even uses waste, waste collection and recycling as main theme and structuring element in La Zone (1928).
[34] I identified these groups in ELAN by ascribing each shot to one of the following categories: (1) a single person visible in the shot, (2) two people, (3) a small group of three to ten people, (4) a larger group of up to fifty people, (5) a crowd of more than fifty people.
[35]  These texts particularly refer to Berlin and Man with a Movie Camera regarding the deliberate avoidance of intertitles. In terms of the refusal of staged scenes and hired actors, it is especially Vertov’s film that is highlighted, while Ruttmann’s film is often criticized exactly for the integration of staged scenes, most prominently by Kracauer (1947).
[36] This modern urban street construction aspect belongs to the modern metropolitan streetscape in general since Haussmann’s restructuring of Paris.
[37] More precisely, he opposes Manhatta and the American film avant-garde of the 1920s and 1930s with their longing for the city’s (re‑)unification with nature to the European city films of that time, which, according to Horak, proudly celebrate urbanism and the machine age and praise the urban environment “for its excitement, speed, and modernity, with few references to nature, beyond its role in leisure-time activities for Sunday picknickers” [Horak 1995, 35].
[38]  This becomes even more obvious in city symphonies like Regen (Joris Ivens and Mannus Franken, 1929) and Images d’Ostende (Henri Storck, 1929).
[39] Interestingly enough, this does not mean that they do not take tourism as a theme. All four canon films (as well as a handful of other titles) present hotels, guests and travelers. Also the famous arrival in the city, depicted in at least twenty-three films, can be seen in light of a touristic aspect.
[40] The prologue of Rien que les Heures might have also contributed to this assumption, in which Cavalcanti states in intertitles that all cities are the same, once we ignore their monuments, and that his film does not claim to synthesize any city as it is only a sequence of impressions of the passing of time. Moreover, the notion of the generic and unspecific makes sense also in light of city symphonies’ focus on streets and various means of transportation – places that Augé (1995) would later call non-places without identity in the postmodern city. City symphonies anticipate these later non-lieux of post-modernity. Nevertheless, it should be added that there are also some rare remarks on urban specificity and recognizable urban and architectural constructions (see e.g. Dähne, 2013 and MacDonald, 2001).
[41] In Berlin, buildings receive a definition and recognition as city markers also via written signs and boards on their façades, such as the Gloria Palast at Kurfürstendamm, the Hotel Excelsior, or the Pschorr-Haus at Potsdamer Platz. Vertov’s film also includes a couple of textual signifiers, by which buildings become identifiable, like the Lenin Club building and the Club Vladimir Ilyich Ulyanov (aka Lenin) at Odessa Station, the Kiev Proletarian Film Theatre the camera pans across, the All-Union newspaper building (the Izvestiya building), and the Bakhmetievsky district bus depot in Moscow. Going beyond buildings as specific markers in the city and taking a closer look at the street scenes in Man with a Movie Camera, it stands out that Vertov chose only five or six urban locations to which he returns repeatedly throughout his film – depicted in similar or (slightly) different camera positions. In this way, they become urban markers and specific locations too. One of these streets is Tverskaia Street in Moscow, the street with the Gorky banner spanning the street, which we see deserted in the morning and later with the traffic policeman doing his work. Nevertheless, the most recognizable location and landmark in Man with a Movie Camera remains the Bolshoi Theatre, which Vertov makes visually collapse by a split screen shot in the climax of the film, according to Tsivian a “symbolic destruction” [Tsivian 2004, 19]. He also presents it, however, in the first minutes of his film as an urban signifier.
[42] Nevertheless, even though Ruttmann, Cavalcanti, Vertov, Strand and Sheeler do include city markers and specific buildings, it needs to be said that they all exclude the most famous sights of their respective city/-ies at the time, including the Brandenburger Tor, the Statue of Liberty, the Louvre or Notre Dame de Paris, and Red Square with the Kremlin, St. Basil’s Cathedral, and Lenin’s Mausoleum. Indeed, it is not the touristic city the films focus on, but certain landmarks and monuments appear as part of the cities’ everyday life.
[43] Moreover, there is a great variation of cities thematized in city symphonies. While the literature generally speaks about metropolises such as New York, Paris, or Berlin, city symphonies of the extended corpus also focus on somewhat smaller, mid-sized cities, including Rotterdam, Porto, and Ostend. In addition, the greater city symphony corpus also demonstrates that Vertov’s concept of cinematically merging and spatially organizing the visual material of several cities into a newly created filmic city was not picked up. It was only once more realized in Les Nuits Électriques (1929), in which Eugène Deslaw combines material of nocturnal Berlin and Paris.
[44] Of course, this feature was most elaborately deployed by Joris Ivens and Mannus Franken in Regen (1929) as the entire film is structured around the visual impacts a rain shower has on the city. Regarding the canon films, Manhatta is an exception as wet streets cannot be identified due to the film’s high camera angles, extreme long and panorama shots and great distance to the streets on ground level. Moreover, the depiction of wet (reflecting) streets is present in virtually all films of the genre that include street scenes at night.
[45] Montage forms such as rhythmic, fast, associative, metaphorical and cross-sectional editing as well as experimental techniques like unusual camera angles, extremely short takes, kaleidoscopic images, split screens, multiple exposures and manipulations of speed increase and cinematographically (re‑)create the intoxicating and overwhelming impressions in the modern city. They underline particularly the frenetic pace, highly fragmented nature and the contrasting features of urban modernity and modern urban life. In so doing, city symphonies can be considered as visualizations of the theories of Benjamin (1999) and Simmel (1997) and their ideas about the modern urban shock experiences and the overstimulation of the senses.
[46] We can find those vertical and horizontal city explorations and establishments of space also in at least twenty other titles of the extended city symphony corpus.
[47] The city is spatially introduced by an arrival into that urban space. Berlin and Man with a Movie Camera present arrivals into the city by train; Manhatta starts with an arrival by a local ferry boat. Rien que les Heures presents a rather conceptual and vertical arrival into the city as in the prologue Cavalcanti shows maps (and miniature monuments), from which he cuts to a shot of the Place de la Concorde on street level. Moreover, he switches from the idea of general city life in the prologue to the specific portrait of Montmartre in the main part. These arrivals into the city, which often stand at the beginning of the films, are also a pragmatic way of getting a film on a topic as huge as the metropolis started.
[48]  There are rides through the city especially in Berlin and Man with a Movie Camera, such as in the horse-carriage-ride-through-the-city-and-filming sequence in Vertov’s film and nocturnal taxi rides in Ruttmann’s Berlin portrait.
[49] In this regard, they also go together with the aforementioned recognizable landmarks and monuments, which, too, add to such a sense of spatial orientation.
[50] Berlin, Man with a Movie Camera, Rien que les Heures and Manhatta show this linear structuring element, even though not all parts of the day are presented and made equally explicit in all four films.
[51] Manhatta is an exception as it does not apply this temporal structuring element. Moreover, it needs to be said that Rien que les Heures includes a certain linear narrative development as it shows several urban types and their doings in the course of one day presented in episodes with beginnings and endings that happen one after the other. It is, however, an experimental and fragmented narrative structure as the different narrative threads constantly interrupt each other and the characters (and their stories) remain underdeveloped.
[52] Of course, this also emphasizes the chronological day structure and is also interwoven with the aforementioned numerous shots depicting acts of opening and closing, arriving and entering, leaving and de-boarding, and (re‑)starting and stopping.
[53] Indeed, while film itself as a time-based medium presents pictures one after the other in chronological time, city symphonies employ strategies to evoke the idea of simultaneity resulting in cross-sections across space.
[54] In addition, there are also mini-narratives that further underline the logic of the narrative. In Man with a Movie Camera, for instance, there is the mini-thriller narrative of the cameraman getting stuck with his foot under the rails when the train is approaching or the story of the accident, the fire brigade and ambulance operations. An example from Berlin is the mini-narrative of the flâneuses in the streets (there is a sequence of similar-looking women), which turns into the story of a woman (a cocotte) and a man coupling.
[55] Manovich, in fact, also points toward the interweaving of both logics as he describes Man with a Movie Camera not only as a film anticipating the database logic but also as a work merging “database and narrative into a new form” [Manovich 2001, 241–243]. Besides the “catalog of subjects that one could expect to find in a city of the 1920s – running trams, city beach, movie theatres, factories ...,” and “the most amazing catalog of film techniques,” there is also the narrative of modern urban life and the “gradual process of discovery,” which, according to Manovich, “is film’s main narrative, and it is told through a catalog of discoveries.” Similarly, other city symphonies also include a narrative – the story of the city and modern urban life in a day – while they also present a cross-sectional database of modern urban material of the era between the wars.
[56] The fortunate encounters with other scholars from the field of media studies and information science, particularly L. Melgar and A. Heftberger, introduced me to various digital tools and inspired me to investigate benefits and options of applying those tools for my own research purposes.
[57] There are further interesting analysis methods and data evaluation/visualization types that could be very fruitful for a study of urban films. For example, a cloud or network representation could highlight relations between motifs or the manner in which specific features (re‑)appear in a number of films. Moreover, an online database project could be interesting, as it would provide access to findings and research results in an interactive and rather non-linear way.
[58] I refer here especially to the MS-Office package, including Word and Excel.
[59] No doubt, an automatic shot detection would be more time-efficient, but it would also eliminate this empirical aspect. As an important part of the analysis, the experience of segmenting and annotating in ELAN and the pure viewing observations are indicated in the center of the analysis scheme, contributing to both sides of the study of motifs and structures (see Figure 3).
[60]  Moreover, it also recalls the analogue film strip.
[61] I am very grateful to [reviewer 1] for the description of the hands-on recreation of the editing process in ELAN as a reverse-engineering scholarly experience, for highlighting the underrated aspect of manual digital analysis in digital humanities, and for reminding me of the etymological relation between “with your fingers” and “digital.”
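
As an illustration of the tab-delimited export described in note 20, the following minimal Python sketch shows how such a file could be read and summarized outside a spreadsheet. It is not the workflow used in this study, which relied on Excel. The column headers (start, end, duration, shot_distance) and the three sample rows are hypothetical and invented for the example; a real ELAN export will use different headers depending on the tiers and time formats selected.

```python
import csv
import io

# Invented sample rows standing in for a tab-delimited export.
# The headers are hypothetical and will differ from a real ELAN export.
SAMPLE_EXPORT = (
    "start\tend\tduration\tshot_distance\n"
    "0.0\t4.0\t4.0\tlong shot\n"
    "4.0\t6.0\t2.0\tclose-up\n"
    "6.0\t12.0\t6.0\tlong shot\n"
)


def read_shots(text):
    """Parse tab-delimited text into a list of dicts, one per shot."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    shots = []
    for row in reader:
        # Durations arrive as strings; convert them to seconds as floats.
        row["duration"] = float(row["duration"])
        shots.append(row)
    return shots


def average_shot_length(shots):
    """Mean duration of all shots, in seconds."""
    return sum(shot["duration"] for shot in shots) / len(shots)


def duration_by_tier_value(shots, tier):
    """Total screen time per annotation value of one tier column."""
    totals = {}
    for shot in shots:
        value = shot[tier]
        totals[value] = totals.get(value, 0.0) + shot["duration"]
    return totals


shots = read_shots(SAMPLE_EXPORT)
asl = average_shot_length(shots)
by_distance = duration_by_tier_value(shots, "shot_distance")
```

With an actual export, the sample string would be replaced by the contents of the exported text file, and the tier name passed to duration_by_tier_value would be whichever tier column was selected for export.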

Works Cited

Altman 1984 Altman, R. “A Semantic/Syntactic Approach to Film Genre”, Cinema Journal 23, 3 (1984): 6-18.
Augé 1995 Augé, M. Non-places: Introduction to an Anthropology of Supermodernity. Verso, London (1995).
Balázs 2010 Balázs, B. “The Spirit of Film”. In E. Carter (ed), Béla Balázs: Early Film Theory. “Visible Man” and “The Spirit of Film”, Berghahn Books, New York, Oxford (2010), pp. 91-230.
Barsam 1973 Barsam, R. Nonfiction Film. A Critical History. E.P. Dutton, New York (1973).
Benjamin 1999 Benjamin, W. The Arcades Project. Harvard University Press, Cambridge (1999).
Bollerey and Föhl 2014 Bollerey, F. and A. Föhl (eds). City Symphonies. Film Manifestos of Urban Experiences. Eselsohren. Journal of History of Art, Architecture and Urbanism, 1+2 (2014).
Bordwell and Thompson 2019 Bordwell, D. and K. Thompson. Film History. An Introduction. Fourth Edition. McGraw-Hill, New York (2019).
Cowan 2014 Cowan, M. Walter Ruttmann and the Cinema of Multiplicity. Avant-Garde, Advertising, Modernity. Amsterdam University Press, Amsterdam (2014).
Dähne 2013 Dähne, C. Die Stadtsinfonien der 1920er Jahre. Architektur zwischen Film, Fotografie und Literatur. Transcript, Bielefeld (2013).
Grierson 1933 Grierson, J. “Documentary (2): Symphonics”, Cinema Quarterly 1, 3 (1933): 135-139.
Heftberger 2016 Heftberger, A. Kollision der Kader. Dziga Vertovs Filme, die Visualisierung ihrer Strukturen und die Digital Humanities. Edition Text + Kritik, Munich (2016).
Hielscher 2018 Hielscher, E. Rewinding the City Symphony: Historiography, Visual Motifs and Structural Patterns of Interwar City Symphony Films. Ph.D. thesis, Ghent University (2018).
Horak 1995 Horak, J. “The First American Film Avant-Garde, 1919-1945”. In J. Horak (ed), Lovers of Cinema: The First American Film Avant-Garde, 1919-1945, University of Wisconsin Press, Madison (1995), pp. 14-66.
Jacobs 1949 Jacobs, L. “Avant-Garde Production in America”. In R. Manvell (ed), Experiment in the Film, Grey Walls Press, London (1949), pp. 113-152.
Jacobs and Fyfe 2016 Jacobs, L. and K. Fyfe. “Digital Tools for Film Analysis: Small Data”. In C. Acland and E. Hoyt (eds), The Arclight Guidebook to Media History and the Digital Humanities, REFRAME Books, Falmer (2016), pp. 250-251.
Jacobs et al. 2018 Jacobs, S., A. Kinik, and E. Hielscher (eds). The City Symphony Phenomenon. Cinema, Art, and Urban Modernity between the Wars. Routledge, New York (2018).
Kinik 2008 Kinik, A. Dynamic of the Metropolis: The City Film and the Spaces of Modernity. Ph.D. thesis, McGill University (2008).
Koeck and Roberts 2010 Koeck, R. and L. Roberts (eds). The City and the Moving Image. Urban Projections. Palgrave Macmillan, Basingstoke, New York (2010).
Kolaja and Foster 1965 Kolaja, J. and A. Foster. “‘Berlin, the Symphony of a City’ as a Theme of Visual Rhythm”, The Journal of Aesthetics and Art Criticism 23, 3 (1965): 353-358.
Kracauer 1947 Kracauer, S. From Caligari to Hitler. A Psychological History of the German Film. Princeton University Press, Princeton (1947).
Kracauer 1960 Kracauer, S. Theory of Film. The Redemption of Physical Reality. Oxford University Press, New York (1960).
MacDonald 2001 MacDonald, S. The Garden in the Machine. A Field Guide to Independent Films about Place. University of California Press, Berkeley, Los Angeles, London (2001).
Manovich 2001 Manovich, L. The Language of New Media. MIT Press, Cambridge (2001).
Manovich 2013 Manovich, L. “Visualizing Vertov”. Manovich.net (2013). Available at: http://manovich.net/content/04-projects/078-visualizing-vertov/74_article_2013_sm.pdf [last consulted June 5, 2018].
Mazierska and Rascaroli 2003 Mazierska, E. and L. Rascaroli. From Moscow to Madrid. Postmodern Cities, European Cinema. Tauris, London, New York (2003).
Melgar et al. 2017 Melgar, L., E. Hielscher, M. Koolen, C. Olesen, J. Noordegraaf, and J. Blom. “Film Analysis as Annotation: Exploring Current Tools and their Affordances”, The Moving Image, Special Issue: Digital Humanities and/in Film Archives, 2 (2017): 40-70.
Noguez 1985 Noguez, D. “Paris – Moscou – Paris. Paris et les symphonies de ville”. In P. Hillairet, C. Lebrat and P. Rollet (eds), Paris vu par le cinéma d’avant-garde 1923-1983, Paris Expérimental, Paris (1985), pp. 31-37.
Nowell-Smith 1997 Nowell-Smith, G. (ed). The Oxford History of World Cinema. Oxford University Press, London (1997).
Olesen 2017 Olesen, C. Film History in the Making. Film Historiography, Digitised Archives and Digital Research Dispositifs. Ph.D. thesis, University of Amsterdam (2017).
Roberts 2000 Roberts, G. The Man with the Movie Camera. Tauris, London, New York (2000).
Rotha 1936 Rotha, P. Documentary Film. Faber and Faber, London (1936).
Simmel 1997 Simmel, G. “The Metropolis and Mental Life [1903]”. In D. Frisby and M. Featherstone (eds), Simmel on Culture. Selected Writings, Thousand Oaks, London (1997), pp. 174-185.
Tsivian 2004 Tsivian, Y. “Dziga Vertov and His Time”. In Y. Tsivian (ed), Lines of Resistance: Dziga Vertov and the Twenties, Le Giornate del cinema muto, Gemona, Udine (2004), pp. 1-28.
Turvey 2011 Turvey, M. “City Symphony and Man with a Movie Camera”. In M. Turvey, The Filming of Modern Life. European Avant-Garde Film of the 1920s, MIT Press, Cambridge/Massachusetts (2011), pp. 135-162.
Uricchio 1982 Uricchio, W. Ruttmann's “Berlin” and the City Film to 1930. PhD dissertation. New York University, New York (1982).
Vertov 1984 Vertov, D. “From Kino-Eye to Radio-Eye”. In A. Michelson (ed), Kino-Eye: The Writings of Dziga Vertov, University of California Press, Berkeley (1984), pp. 85-92.
Vogt 2001 Vogt, G. Die Stadt im Film. Deutsche Spielfilme 1900-2000. Schüren, Marburg (2001).
Weihsmann 1997 Weihsmann, H. “The City in Twilight. Charting the Genre of the ‘City Film’ 1900-1930”. In F. Penz and M. Thomas (eds), Cinema and Architecture. Méliès, Mallet-Stevens, Multimedia, BFI, London (1997), pp. 8-27.
Weiss 1995 Weiss, P. Avantgarde Film. Suhrkamp, Frankfurt (1995).