Chapter 4 of Visual Grammar

So far we have been working on the theoretical framework for visual grammar. In this chapter I will outline a visual grammar that follows from the discussion in the previous chapters. For this I will use many ideas presented by Kress and Van Leeuwen in Reading Images. I will incorporate their ideas into the theoretical framework of this thesis and use them to formulate grammatical rules as discussed in chapter 2. I do not aim to present a complete and exhaustive grammar. I aim for a rudimentary visual grammar that can be used to classify images and verifiably detect visual means. This grammar will be the basis of a brief study of different types of posters that will be discussed in chapter 5. In the first section I will summarize the most important conclusions that can be drawn from this framework and reformulate these into clear conditions for visual grammar. In the following sections I will work out the various grammatical rules that form the visual grammar presented in this thesis. The last page of the appendices can be folded out and contains an overview of all the grammatical rules discussed, for easy reference.

4.1 Outlines

Visual language shares many crucial characteristics with verbal language. Both visual and verbal language can be said to employ signs and phrases as the building blocks of texts. These signs and phrases correspond to a conceptual structure that can be said to be the meaning of the text or image. The most elementary signs are finite and recurrent in function and meaning, but the possible combinations of signs are infinite (discrete infinity).

The possible signs of a language are greatly influenced by their medium of representation. This accounts for the differences between verbal and visual languages, as this difference in potential is projected onto the potential structures and representations made within the particular language (A-projection). At the same time the cognitive capacity of language users is also projected onto the language potential (C-projection). A particular language can be said to be more effective in representing certain concepts when the results of A-projection and C-projection overlap. For example, the visual medium is well suited to represent maps, because it allows for the use of a 2-dimensional space (A-projection) that is compatible with the conceptual structure of a map (C-projection), which is also 2-dimensional. Spoken language lacks this 2-dimensional space of expression and is therefore ill-suited to represent maps.

Every language organizes its meaningful elements on different structural levels. For verbal language these different levels are words, sentences, paragraphs and whole texts. For visual language these can be said to be visual signs, participants, and scenes. All these elements need to be articulated in some way, so they can be distinguished from one another. On the lowest structural level the influence of the medium of representation, A-projection, is the greatest. On higher structural levels the C-projection, and through C-projection also extra-semiotic influences, are stronger. Thus while it is not possible to represent time directly on a low structural level of a visual representation (because static visual representations lack a temporal dimension), it can be represented indirectly by using higher level structures: conventions. The lowest structural level is not the level of individual signs, but the level of means of articulation. This is more important for visual language than for verbal language because verbal conventions and the conventional use of signs are, in a way, stronger; verbal language relies less on A-projection due to its arbitrary nature, or conceptual reliance. Thus visual grammar should be based on the possibilities of articulation of the medium of representation. This will anchor visual grammar strongly on graphic means.

Cognitive studies of the brain suggest that the human language faculty is modular and processes information in a parallel fashion. This is an important reason for the development of the Modular Parallel Architecture (MPA) of the visual language faculty. In addition, such an architecture is well-suited to model the discrete infinity of language. The MPA forces a grammar to strictly distinguish the influences of the different modules within the architecture and formulate clear and strict mapping rules that allow for communication between the modules in the architecture (modular representation). These rules of mapping are in effect the grammatical rules visual grammar should consist of.

Information flows through the MPA in a parallel fashion and finally converges in the conceptual module. This allows for infinite possible combinations and can account for the ambiguity and ambivalence frequently encountered in any language, because the information passed to the conceptual module is not always complementary but often opposing or even contradictory.

The information added by each module is attached to the structural units of the grammar. This will give the articulation module in the MPA the dual function to pass the structural information to all other modules, and map meaningful means of articulation to the conceptual module. The role of the lexical module is to associate elements (at different structural levels) with lexical entries or concepts. The syntactical module attaches syntactical information that describes the structure of several sub-elements within a higher level structural unit.

For Karl Popper the value of a scientific theory, such as visual grammar, corresponds to its testability. The testability of a theory can be expressed in the number and severity of crucial experiments that can be formulated to test the theory. A high number of crucial experiments increases the value of the theory (Popper 1979: 10). As has been outlined in chapter 3, I will use genre classification to test the grammar presented here. Unfortunately, this method cannot really falsify the grammar, although it does say something about its usability. An additional test is required that might falsify the grammar. For this I will use a correlation test: the individual grammatical rules of this grammar should not correlate too much with other grammatical rules. A high correlation between two grammatical rules would indicate that these grammatical rules actually amount to the same structure.
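The correlation test can be illustrated with a small sketch. It assumes that each image in a corpus has been coded with a 1 or a 0 for every rule, depending on whether the rule was found to apply. The function names, the threshold of 0.8 and the data are all hypothetical and serve only to show the idea; they are not taken from the research discussed in chapter 5.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a rule that never varies is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def redundant_rule_pairs(observations, threshold=0.8):
    """Return rule pairs whose absolute correlation exceeds the threshold.

    observations maps a rule label to a list of 0/1 values, one per image.
    The threshold is an arbitrary assumption, not a value from the thesis.
    """
    flagged = []
    for rule_a, rule_b in combinations(sorted(observations), 2):
        r = pearson(observations[rule_a], observations[rule_b])
        if abs(r) > threshold:
            flagged.append((rule_a, rule_b, round(r, 2)))
    return flagged


# Invented codings for six images.
observations = {
    "R10": [1, 1, 0, 0, 1, 0],
    "R11": [1, 1, 0, 0, 1, 0],  # identical to R10, so suspiciously redundant
    "R13": [0, 1, 1, 0, 0, 1],
}
print(redundant_rule_pairs(observations))
```

A pair that is flagged in this way would be a candidate for merging, since the two rules may amount to the same structure.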

The following conditions for visual grammar can be distilled from the preceding discussion:

C1 Graphical means of articulation are the basis of visual grammar.
C2 A visual representation is organized in structural elements on different levels. All these elements are articulated and can be identified.
C3 Condition of discrete infinity: the number of elements of a visual grammar is finite, while the number of possible structures that can be formed with these elements is infinite.
C4 The formal structure of grammar should be based on the working of the human mind-brain. Modular parallel architecture reflects the working of the mind-brain and therefore serves as the formal structure of visual grammar.
C5 Condition of modular representation: grammatical rules need to be clearly attributed to the individual modules.
C6 Grammatical rules and their associated structures converge in the conceptual module. The information passed to the conceptual module by means of these grammatical rules or mappings can be complementary, opposing or contradictory.
C7 The usability of this visual grammar will be tested with genre classification: it should be possible to establish different grammatical patterns that are typical for visual genres.
C8 The grammatical rules need to be formulated clearly, objectively and in such a way that they can be easily tested individually.

The aim of this chapter is to set up a number of grammatical rules. In the following sections I will discuss and explain a couple of rules associated with different elements of visual grammar. However, the label 'rules' might have the wrong connotations. Grammatical rules cannot be used to attribute meaning to a particular structure. Meaning is established by a multitude of grammatical structures and their associated rules in an image. One rule may act against another. Correct use of rules should also be seen in this way. To someone who 'knows' the rules[1], one clear meaning will suggest itself from an image in which all the grammatical structures strengthen each other. Such an image is not 'correct' in any sense except that the grammatical rules have been applied correctly in this image. An image that confuses a viewer through inconsistent application of the grammatical rules is only 'incorrect' when it should have been clear in its meaning. The grammatical rules discussed here are therefore not rules of application, but rules of mapping: they suggest what information is passed to the Conceptual Module and what sort of meaning that information might suggest.

The rules presented below might not be universal rules. In fact quite a few will only be part of a visual language that belongs to Western culture. Since this thesis focuses on images from Western culture this is not a problem for me, although I would be very interested in the changes and additions needed for this set of rules to apply to a non-Western culture.

4.2 Articulation

The most basic means of articulation in visual representation are limited by the characteristics of the visual medium and the qualities of the human eye. In chapter 1 I already discussed visual means in some detail. In that chapter I put forward the X and Y dimensions, colour and value as the most basic forms of articulation, followed closely by shape, size, grain and orientation. In this section I will re-evaluate and expand these visual means of articulation.

The X and Y dimensions of an element are obviously very fundamental visual means of articulation. Elements in a visual representation are set apart first by having different positions within the picture plane. However, it might not always be practical to separate the X dimension from the Y dimension, even though some elements are only articulated in one dimension. I will use position to refer to both of these.

Colour and value form the next important group of means of articulation. Since value refers to greyscale, both colour and value can be grouped together under the heading colour. There are many different ways to describe a colour of which brightness, hue and saturation are the most conventional. However I do not think that such a division would be very practical in the light of this study.

Shape, size and orientation are means of articulation that usually cannot be used to distinguish elements by themselves, as this would also require the use of colour and position. However, they can be used to set up structural relations between different elements in an image. Thus two elements can be associated with each other because they share the same shape, size or orientation. Or their relation can be more complex, such as an element being twice as large as another element, etcetera.

Grain is more difficult to use as a means of articulation. Bertin used it to refer to a certain quality of a textured plane: the grain of the texture. Figure 4.1 shows differences in grain, while what figure 4.2 shows are not differences in grain according to Bertin (but rather differences in shape and orientation of the texture). In the light of this thesis I will use quality of texture, or simply texture, to refer to the differences featured in both figures 4.1 and 4.2. Texture, as a means of articulation, is not as basic as position or colour. It is a complex of any of the means of articulation discussed so far. I will use the label line-quality to refer to similar complexes of means of articulation that can be used to give lines different characters. Figure 4.3 features some differences in line-quality.

Figure 4.1 - Difference in grain

Figure 4.2 - Difference in texture

Figure 4.3 - Line quality

The means of articulation serve a double purpose: they are used to separate structural elements in a visual representation and can be used to add meaning to elements or sets of elements. The former information is mapped to the syntactic and lexical modules for further processing, while the latter information is mapped directly to the conceptual module. The latter is also of most interest to visual grammar and can be captured in three general rules (R7-R9 below) that are associated with the articulation-conceptual interface. These three rules formalize the most important and meaningful aspects of articulation.

R7 General Rule of Intuitive Correspondence: if participant PA is differently articulated than PB then this difference may correspond to a similar difference between concepts CA and CB.
R8 Quality of Articulation: if concept CA can be associated with the quality (texture or line-quality) of articulation of participant PB then CA preferably is associated with CB.
R9 Grouping by Articulation: if participants PA1-An share a characteristic in the way they are articulated then concepts CA1-An may be a group of Interordinates.
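Grouping by Articulation (R9) can be sketched in code as follows. This is only an illustration under simplifying assumptions: participants are reduced to a name plus two hypothetical articulation fields (colour and shape), and sharing a characteristic is taken to mean having exactly the same value for one of these fields. The class and function names are invented for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """A participant reduced to a few hypothetical means of articulation."""
    name: str
    colour: str
    shape: str


def group_by_articulation(participants, means):
    """Group participant names by one means of articulation (R9).

    Only groups with at least two members are returned, since a single
    participant cannot form a group of Interordinates.
    """
    groups = defaultdict(list)
    for participant in participants:
        groups[getattr(participant, means)].append(participant.name)
    return {value: names for value, names in groups.items() if len(names) > 1}


participants = [
    Participant("a", colour="red", shape="circle"),
    Participant("b", colour="red", shape="square"),
    Participant("c", colour="blue", shape="circle"),
]
print(group_by_articulation(participants, "colour"))  # {'red': ['a', 'b']}
print(group_by_articulation(participants, "shape"))   # {'circle': ['a', 'c']}
```

The same participants fall into different candidate groups depending on the means of articulation considered, which is why the rule only says that the concepts may be a group of Interordinates.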

These rules are rather general in character. I think that more detailed research into articulation would allow many more rules to be formulated. However, it is not my aim to be exhaustive in this respect.

4.3 Processes and Schemata

To flesh out the grammar I will draw on the work of Kress and Van Leeuwen. Their work Reading Images is an extensive account of various grammatical structures encountered in many types of images. However, when testing the theory of Kress and Van Leeuwen on the condition of modular representation (C5), their theory does not score well. Many of the described structures blend influences of articulation, syntax, lexical identification and conceptual structures. In order to set up a visual grammar that meets C5 these influences should be separated and analysed individually. When discussing the meaning of an image all these influences should be combined, but the grammatical structure of an image is best described with C5 in mind.

The fact that Kress and Van Leeuwen mix all these influences is not at all strange when Mark Johnson's The Body in the Mind is taken into account. Johnson argues that reasoning and conceptual structures are strongly affected by bodily experiences: "the centrality of human embodiment directly influences what and how things can be meaningful for us, the ways in which these meanings can be developed and articulated" (Johnson 1987: xix). When using the notions and words of this thesis, one can say that Johnson argues that most (and maybe ultimately all) conceptual patterns are subject to an A-projection that works through all possible languages; it is the limits and possibilities of our embodied experience that shape our thoughts. Since this experience is strongly focussed on touch and sight (both of which share a spatial as opposed to temporal organisation) it is only logical that this form of 'bodily A-projection' is more influential in visual language than in verbal language.

According to Johnson not propositional logic but 'image-schemata' are the fundamental structures of conceptual knowledge. These image-schemata are not real or mental images. They are abstract and not limited to visual properties alone. On image-schemata one can perform several mental operations that are akin to bodily movement or the result of bodily-movement. These operations serve to create order in the chaos of stimuli. He defines image schemata as a "recurrent pattern, shape, and regularity in, or of, these ongoing ordering activities" (ibid. 29).

'In-Out Orientation' and 'Force Gestalt' are two examples Johnson discusses extensively. In-Out Orientations deal with containment and describe how one thing is contained within another. This can be real, such as a 'cat in a box', or metaphorical, such as a 'weak spot in an argument'. In-Out Orientation implies a number of possible movements or operations, such as 'moving the weak spot out of an argument' (ibid. 30-37). Force Gestalt is akin to Kress and Van Leeuwen's Action Process. It describes how a force or "vector" acts upon an element or object, and distinguishes several roles that elements in a Force Gestalt configuration can have, such as trajectory, barrier, counterforce, etc. These elements can be arranged in typical configurations, which serve as templates for several types of representations: images, verbal texts and argument structure, to name but a few (ibid. 42-48). Figure 4.4 is a schematic representation of an In-Out Orientation. Figure 4.5 is a schematic representation of a typical Force Gestalt called 'compulsion' by Johnson[2].

Figure 4.4 - In-out orientation

Figure 4.5 - Compulsion

Image-schemata are closely akin to Kress and Van Leeuwen's processes. However, there are two important differences which I would like to point out:

  1. On the one hand, processes are, or at least seem to be, strict classifications: for Kress and Van Leeuwen a particular configuration will be either an Action Process or a Reaction Process[3], not something in between. Johnson's schemata, on the other hand, are much more flexible: the schemata are the most typical configurations, but representation structures can be encountered that deviate from the templates. A compulsion that lacks a clear trajectory is still a compulsion, albeit a less typical one.

  2. Processes are strongly attached to a particular meaning; image-schemata are not. Kress and Van Leeuwen's Actor is foremost a conceptual entity. Johnson's object (in figure 4.5) is a syntactical role to which several conceptual, and meaningful, roles can still be attached.

Kress and Van Leeuwen's theory can be easily changed with regard to these two points. Instead of focussing on the processes Kress and Van Leeuwen describe I will focus on the structural roles these processes consist of. In that way, Kress and Van Leeuwen's processes can easily become schemata, opening up many possibilities for the existence of other processes that are not described by Kress and Van Leeuwen but can be derived from their work or from other sources. I have tried to make the structural roles as syntactical as I could. This should allow their 'meaning' to be strengthened or weakened by the lexical meaning of the participants or the meaning of their articulation. Again this opens up many new possibilities, as was hinted at with the 'strong actor' and 'weak actor' examples in chapter 2 (figures 2.17 and 2.18). Of these, the 'strong actor' schema is likely to be more typical than the 'weak actor' schema[4].

The Syntactic Module in the MPA for the visual language faculty is responsible for the syntactic ordering of visual stimuli and for identifying any recurrent patterns in the stimuli or in the ordering process. Patterns need not be identical, only alike, to be recurrent. Most of these patterns are in a sense quite meaningless by themselves. An Action schema, in which an Actor, Vector and Goal are encountered in a particular configuration, only opens up a number of potential meanings. The combination with lexical information and articulation (hopefully) directs us to a particular meaning (or several particular meanings).

The idea of schemata also applies to modality. In chapter 3 I have already criticized Kress and Van Leeuwen's modality for being too restrictively oriented towards the four coding orientations. Again, the four coding orientations might be considered to be schemata or typical orientations. Many more typical and less typical orientations might also be found.

4.4 Narrative Structures

The narrative and conceptual structures, or 'procedures' as Kress and Van Leeuwen call them, form an important aspect of the visual grammar presented in Reading Images. The concepts introduced in chapters 2 and 3 of that book are valuable tools for the analysis of images. In chapter 2 of this thesis I have already presented some of this material. In this section and the next I will discuss this material in more detail, point out the flaws and present some solutions.

Narrative and conceptual structures are both meaningful constellations of participants. The types of participants present in the images and their interrelations determine the type of structure or process. Kress and Van Leeuwen's major distinction is between narrative and conceptual structures. This distinction can be easily made on the basis of the presence of a vector: an explicit or implicit line that is connected to one or more participants. If a vector is present the image is classified as narrative, and if a vector is absent the image is conceptual.

In chapter 2 I already discussed the most basic type of narrative structure, called an 'action process'. Other narrative structures are the non-transactional action process, the bi-directional transactional action, transactional reaction, non-transactional reaction, mental process and verbal process. All these different types of processes can be distinguished by the presence of different types of participants and vectors. Apart from the 'normal' vector that is typical for the action processes there is the eye-line vector that is typical for reaction processes. The eye-line vector is an implied vector that is formed by an eye-line; see, for example, figures 1.2 and 4.6. Even though eye-line vectors are only implied, the visual system of the human brain is quite sensitive to this particular type of line.

Figure 4.6 - Hamlet (J.J. Chris Lebau, c. 1915)

Verbal and mental processes are also distinguished by typical vectors. In this case the vector can be conventionally identified as attributing an utterance or a thought to a participant. Today the most typical examples are the speech and thought balloons of comics, but similar devices can also be encountered in medieval art (Kress and Van Leeuwen, 1996: 61-69). All narrative processes discussed by Kress and Van Leeuwen can be visually presented as in figure 4.7. In this figure the participants in the process have been given labels that correspond to the names given to them by Kress and Van Leeuwen. Thus a participant from which the reaction vector emerges in a reaction process is called a 'reactor'.

Figure 4.7 - Narrative processes

There are a number of problems with Kress and Van Leeuwen's approach to narrative and conceptual structures. Charles Forceville points out many of these problems: the very complicated and rigid hierarchy of different types of processes in which some options seem to have been omitted, the unclear and sometimes fuzzy categories on which this hierarchy is built, and in some instances flawed interpretations of some examples (Forceville 1999). Forceville claims these problems have to do with Kress and Van Leeuwen's social semiotic framework; he argues that Reading Images would have greatly benefited from a cognitive framework (ibid. 163). I do not entirely agree with Forceville's last claim, but I do think that he accurately points out the weak spots in Reading Images.

Forceville's solution to escape Kress and Van Leeuwen's rigid hierarchy is to interpret their structures as prototypes or prototypical schemata. This seems to be a good alternative and was already discussed to some extent in section 4.3. If we expand on this train of thought and take the classifications of the various types of participants in these structures as our basis, it becomes possible to construct many structures not discussed by Kress and Van Leeuwen. Thus if we take the Actors, Goals, Vectors, etc. as our point of departure then an infinite set of possible structures opens up. Examples are an Actor that acts on a group of Goals, or an Actor that acts as Phenomenon at the same time. The possibilities quickly become infinite.

In this light the structures or processes discussed by Kress and Van Leeuwen can still be seen as the most typical structures in which a specific class of participant occurs. Below is a list of participant roles and their descriptions, which I will call Visual Actants from now on[5]. This list is distilled from chapter 2 of Reading Images and will be expanded in the next section. Even after this expansion I do not claim it to be exhaustive.

Action Vector: Strong, implicit or explicit and usually diagonal lines are called vectors. When a vector is not a Reaction Vector it is an Action Vector. An Action Vector must be connected to an Actor, Object or both. The action itself might be represented by the Vector if it is explicit, or by the nature or characteristics of the Actor and Object.
Reaction Vector: A Vector that is formed by an eye-line is a Reaction Vector. Reaction Vectors must be connected to a Reactor, and may be connected to a Phenomenon. The nature of the reaction is usually represented by the Reactor.
Actor: An Actor is the participant from whom an Action Vector emerges. It is not uncommon for a participant to be Actor and Action Vector at the same time (as is the case with The British Used Guns, figure 2.4).
Object: The participant to which an Action Vector points is the Object. Kress and Van Leeuwen named this role Goal, but I prefer Object because usually it is quite literally the object of an action.
Reactor: A Reactor is a participant from which a Reaction Vector emerges. A Reactor is required to identify a Reaction Vector, for it must have the eye to form the eye-line.
Phenomenon: The participant to which a Reaction Vector points is the Phenomenon.

These visual actants form the core of the narrative processes described by Kress and Van Leeuwen. To incorporate them into my visual grammar I have formulated the following rules that capture the most basic structures, but can still be used to create many combinations (but I leave out speech and thought processes):

R10 Action: if participant PA is connected to Action Vector VB that emerges from PA then CA preferably is the Actor of the action represented by VB.
R11 Compulsion: if participant PA is connected to Action Vector VB that points to PB then CB preferably becomes the Object of the action represented by VB.
R12 Reaction: if participant PA is connected to Reaction Vector VB that emerges from PA then CA preferably becomes the Reactor to the Phenomenon represented by PC that is also connected to VB.
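The way rules R10 to R12 map vectors to structural roles can be sketched in code. The sketch assumes that the vectors in an image have already been identified and labelled as either action vectors or reaction (eye-line) vectors; the Vector class, its fields and the function name are hypothetical. Roles are collected in a set per participant, because a single participant can have several roles at once.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vector:
    """A vector with the participants it emerges from and points to."""
    kind: str                     # "action" or "reaction" (eye-line)
    source: Optional[str] = None  # participant the vector emerges from
    target: Optional[str] = None  # participant the vector points to


def suggest_roles(vectors):
    """Suggest Visual Actant roles per participant, following R10-R12."""
    roles = defaultdict(set)
    for vector in vectors:
        if vector.kind == "action":
            if vector.source:
                roles[vector.source].add("Actor")   # R10 Action
            if vector.target:
                roles[vector.target].add("Object")  # R11 Compulsion
        elif vector.kind == "reaction" and vector.source:
            # R12 Reaction: a Reaction Vector requires a Reactor,
            # while the Phenomenon is optional.
            roles[vector.source].add("Reactor")
            if vector.target:
                roles[vector.target].add("Phenomenon")
    return dict(roles)


# Invented example: PA acts on PB while PB looks at PA; PC acts on nothing.
vectors = [
    Vector("action", source="PA", target="PB"),
    Vector("reaction", source="PB", target="PA"),
    Vector("action", source="PC"),
]
for name, suggested in sorted(suggest_roles(vectors).items()):
    print(name, sorted(suggested))
```

In this example PA is suggested as both Actor and Phenomenon, a combination that falls outside the fixed processes of Kress and Van Leeuwen but follows directly from combining the roles.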

4.5 Conceptual Structures

An image that contains no vector is classified as a conceptual process. As with narrative processes, Kress and Van Leeuwen identify different types of conceptual processes. The first group is called classificational processes, which can be either a covert taxonomy, a single-levelled overt taxonomy or a multi-levelled overt taxonomy. In a classificational process at least one set of participants acts as subordinates to a represented or virtual superordinate. If the superordinate is represented the taxonomy is overt, while if it is only virtual it is covert. Covert taxonomies can be recognized by the apparent equality in representation of the subordinate participants (Kress and Van Leeuwen 1996: 79-88).

Analytical processes present a participant, the carrier, that can be analysed into its parts or possessive attributes. Maps, diagrams and various technical or scientific graphics are all different types of analytical processes. Analytical processes are further classified by their internal organisation: they can have a temporal dimension like most graphs have, be topographical or topological, be exhaustive or inclusive, conjoined or compounded, or simply be unstructured. The temporal, topographical and topological dimensions can have various levels of accuracy, which in fact is quite similar to the accuracy of means of articulation discussed in section 1.3 (ibid. 89-107).

Figure 4.8 - Sheer Magic advertisement (1987)

The last type of conceptual process is the symbolic process. Symbolic processes can be either attributive or suggestive. In an attributive symbolic process a represented symbolic attribute projects symbolic meaning onto a carrier. In a suggestive symbolic process there is no represented symbolic attribute; rather, the 'mood' or 'atmosphere' of the image projects symbolic meaning onto the participants in the image. An example that Kress and Van Leeuwen name is the "soft golden glow" which projects a sense of "softness" in a commercial ad (ibid. 110, see figure 4.8). Figure 4.9 presents the conceptual structures visually, similar to figure 4.7.

Figure 4.9 - Conceptual processes

From these conceptual processes a number of conceptual roles or visual actants can be distilled. These function in a similar way as the visual actants that were distilled from the narrative processes in the last section:

Superordinate: The Superordinate is a participant that forms the highest level participant in an Overt Taxonomy.
Subordinate: Subordinates are participants that are subject to a Superordinate in an Overt Taxonomy; these usually come in groups. In a multi-level taxonomy a participant may be both Superordinate and Subordinate.
Interordinate: Kress and Van Leeuwen use this label to refer to participants that are both Superordinate and Subordinate. I use it to refer to a 'subordinate' that is part of a Covert Taxonomy. Interordinates come in groups and their 'equality' is emphasized.
Carrier: A Participant that is formed by a number of other participants is a Carrier. A Carrier and its Attributes form an Analytical Process.
Attribute: Attributes are the participants that are part of a larger participant. Usually Attributes are sub-phrases (or sub-participants) of the larger Carrier.
Symbolic Carrier: A Symbolic Carrier is a participant that receives symbolic meaning from a Symbolic Attribute or in a Suggestive Symbolic Process.
Symbolic Attribute: The Symbolic Attribute is a participant that projects symbolic meaning onto a Symbolic Carrier. Unlike the 'normal' Attribute a Symbolic Attribute can, but need not, be part of its Carrier. A Symbolic Attribute can also be near its Carrier or connected to it by other means. Symbolic Carriers and Symbolic Attributes are harder to identify. In section 4.6 we shall see that this is due to the fact that there are no strong grammatical patterns for symbolism, but that symbolism is achieved by the combination of different rules.

Again the recognition of these actants can be formulated into grammatical rules. This works well enough for classificational and analytical processes. However, I feel that the symbolic process is largely the result of the convergence of several rules, such as Attribution by Proximity (R14 below) but also Quality of Articulation (R8, see section 4.2) and Lexical Connotation (R21, see section 4.7):

R13 Embedding: if participant PA is syntactically embedded within participant PB then concept CA is preferably embedded within concept CB. If so, CA is an Attribute of Carrier CB.
R14 Attribution by Proximity: if participant PA is close to participant PB then concept CA may be attributed to concept CB. If so, CA is an Attribute of Carrier CB.
R15 Syntactic Grouping: if participants PA1-An share syntactic qualities such as position then concepts CA1-An may be a group of Interordinates.
R16 Explicit Subordination: if participants PA1-An are syntactically embedded within PB then concepts CA1-An may be Subordinates of Superordinate CB.
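Embedding (R13) and Attribution by Proximity (R14) can be illustrated with a rough sketch. It rests on strong simplifying assumptions that are not part of the grammar itself: each participant is reduced to a rectangular bounding box, syntactic embedding is approximated by one box lying entirely inside another, and closeness is approximated by the distance between box centres staying under an arbitrary threshold. All names and numbers are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """A participant approximated by its bounding box (an assumption)."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, other):
        """True if the other box lies entirely inside this box."""
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.width <= self.x + self.width
                and other.y + other.height <= self.y + self.height)

    def centre(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def suggest_attributes(boxes, max_distance):
    """Return (attribute, carrier, rule) suggestions following R13 and R14."""
    suggestions = []
    for carrier in boxes:
        for attribute in boxes:
            if attribute is carrier:
                continue
            if carrier.contains(attribute):
                suggestions.append((attribute.name, carrier.name, "R13"))
            elif not attribute.contains(carrier):
                (ax, ay), (cx, cy) = attribute.centre(), carrier.centre()
                if hypot(ax - cx, ay - cy) <= max_distance:
                    suggestions.append((attribute.name, carrier.name, "R14"))
    return suggestions


# Invented layout: an eye inside a figure, and a rose next to the figure.
boxes = [
    Box("figure", 0, 0, 100, 100),
    Box("eye", 20, 20, 10, 10),
    Box("rose", 110, 40, 20, 20),
]
for suggestion in suggest_attributes(boxes, max_distance=75):
    print(suggestion)
```

Note that proximity is symmetric, so R14 suggests an attribution in both directions for the figure and the rose. Deciding which of the two is the Carrier is left to the other rules, in line with the idea that meaning results from the convergence of several rules.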

I would like to stress the fact that Visual Actants are not classes of participants, but classes of structural roles that participants can have. A single participant can have any combination of roles in this respect. Thus a participant can be the Actor of one action while at the same time being the Object of another action[6]. Also, a group of participants can have a single role, and thus form a single Visual Actant. The great advantage of this approach is that it becomes easier to describe complex structures that mix and combine different elements of Kress and Van Leeuwen's processes. This does also mean that the meaning that Kress and Van Leeuwen attribute to the various processes becomes less clear and distinct. However, I do not regard this as a downside of my approach. We have already seen that meaning is often constructed from different elements that can be contradictory. This approach allows for the analysis of double or contradictory meanings and creates a palette that can be used to sketch an infinite range of subtly realised meaningful structures.

For Kress and Van Leeuwen the only option for combining different processes seems to be embedding: a certain process is embedded within a more dominant, and usually visually larger, process (Kress and Van Leeuwen: 112-114). This is not always the case, and more complex combinations of different processes occur. Take for instance figure 4.10. In this image an interesting combination of several analytical and classificational processes can be found. The figures can be grouped in two directions: the horizontal rows form two distinct groups and there are also five vertically aligned pairs. The strong covert taxonomy that is the lower row is contrasted with the analytical qualities of the higher row. At the same time the vertically aligned pairs are partly analytical (their difference is stressed) and partly a taxonomy (their equality is stressed by the skin-colour, the main distinctive feature of the lower row). Although the rows are visually larger, I would not say that this grouping is more dominant than the vertical pairing. In this case different processes are not embedded but combined or superimposed to form a complex structure that may very well be unique to this design.

Figure 4.10 - World Empires (Gerd Arntz, 1940)

Participants are the most important structural elements of this kind of analysis. Therefore a clear definition or way of identifying participants is required. According to Kress and Van Leeuwen there are two different ways to identify participants. The first way is "formalistic, and grounded in the psychology of perception" (ibid. 47). Participants are identified by the forms they consist of or, to put it in my terms, by the way they are articulated. The second way to identify participants is by their semantic function, or structural role as I would call it (ibid. 48). Both ways are required; thus a participant is an articulated form to which at least one structural role can be attributed. This goes for all participants: even an implied vector is articulated in some way.

Figure 4.11 - Centrale Bond Transportarbeiders (Paul Schuitema, 1930)

A participant can be a phrase; a participant can consist of several lower level participants and elements. Thus a self-contained structure consisting of several participants can act as a participant on a higher level. Figure 4.11 is an example of such a structure. In this image the large figure in the centre is a participant. The figure consists of several elements: the head, the thick lines that make up his body and the crowd that fills his body[7]. Of these elements the crowd itself can and does act as a sub-participant. There are several reasons for this. The crowd can of course be identified as such. It also has the structural role of an Attribute with the figure as its Carrier. Finally, its different orientation from the central figure stresses this further (its 'vertical' axis goes from the bottom-right to the upper-left, as opposed to that of the central figure). We could pursue this even further by pointing out that the crowd itself is a Carrier with all the particular men in the crowd as its Attributes. Another possibility might be that the crowd is a Covert Taxonomy, where the crowd is only implied by the group of men, who would be Interordinates in that case.

4.6 Composition

According to Kress and Van Leeuwen the meaning of an image can be partly derived from the information value of the participants (Kress and Van Leeuwen: 183). The information value of an element in an image is the product of the position of the element in a typical compositional organization of the picture as a whole. The most dominant compositional organizations are left-right, top-bottom and centre-margin.

Figure 4.12 - Fortuyn and Wiegel Cartoon (Jos Collignon, 2002)

Left-right compositions are encountered most frequently. Elements on the left in this configuration are called the Given and correspond to the 'commonsensical', the 'self-evident' or the 'existing situation'. The elements on the right are called the New and correspond to the 'problematic', the 'contestable', 'the information at issue' or simply the 'new situation'. The Given and the New can closely correspond to 'before' and 'after' or 'first' and 'second' in verbal language (ibid. 186-188). Of course, the meaning attributed to the Given and the New is very conventional and, as Kress and Van Leeuwen point out, is closely aligned with the direction of writing[8]. This convention is quite strong, so strong even that I encountered a political cartoon in which the position of the Given and the New forced the cartoonist to set up a reading direction that starts at the right and ends at the left (see figure 4.12). The caption reads "just when you thought the worst had passed". The figure on the right is identified as Pim Fortuyn, at the time an emerging populist right-wing politician in the Netherlands. The figure on the left is identified as 'prime-minister' Wiegel, a retired conservative politician who never was prime minister but whose triumphant return to politics might be made possible by Pim Fortuyn. In this cartoon, the old politician Wiegel is accurately positioned as the Given, while newcomer Fortuyn is depicted as the New.

The information values of top and bottom are Ideal and Real, respectively. This configuration is commonly encountered in Western advertisements (see figure 4.13). This image features a strong element at the top, the scene of a man and woman enjoying their coffee, and a strong element at the bottom, the product being sold accompanied by some text. The scene is the Ideal: it makes an emotive appeal and shows us 'what might be' if we drink this particular brand of coffee. The lower part, on the other hand, is practical and informative; it shows us 'what is' (ibid. 193). In the case of this advertisement the appeal is highly emotive.

Figure 4.13 - Bushells Advertisement (1987)

According to Kress and Van Leeuwen, Centre-Margin configurations are less frequently encountered in Western culture, but can be found more often in children's drawings and Eastern art, although they also claim that this configuration is becoming more frequent in the West. Centre-Margin configurations set up a hierarchical composition in which the central elements represent the most important information and the Margin elements are dependent on or subordinate to the Centre (ibid. 203-206). Figure 4.14 is an example that features Centre Hierarchy. There are many groups of participants that surround the central participant 'Vietnam'. All these groups represent different aspects that can be associated with the central participant, which can be said to represent the war as a whole. The example is of course a satirical version of a film poster, a genre in which this type of configuration can be found quite often.

Figure 4.14 - Vietnam (Nordahl, 1968)

The information value of the participants is captured in the following grammatical rules:

R17 Horizontal Relation if participant PA is positioned left of participant PB then concept CA may come 'before' CB; CA is the Given of the New CB.
R18 Vertical Relation if participant PA is positioned above participant PB then concept CA may become the Ideal of the Real CB.
R19 Centre Hierarchy if participant PA is positioned in the centre of PB1-Bn then concept CA may become the Centre of Margins CB1-Bn.
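
To illustrate how these three rules could be applied mechanically, the following is a minimal sketch and not a definitive implementation. It assumes that every participant can be reduced to a bounding box in image coordinates with y growing downward, and that 'left of', 'above' and 'in the centre of' can be decided by comparing box centres. The class Participant, the three function names and the coordinates in the usage example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """A participant reduced to a bounding box (an assumption of this sketch)."""
    name: str
    x: float  # left edge
    y: float  # top edge; y grows downward
    w: float  # width
    h: float  # height

    @property
    def cx(self) -> float:
        return self.x + self.w / 2

    @property
    def cy(self) -> float:
        return self.y + self.h / 2


def horizontal_relation(a, b):
    """R17: if a is left of b, a may be the Given of the New b."""
    if a.cx < b.cx:
        return {"rule": "R17", "Given": a.name, "New": b.name}
    return None


def vertical_relation(a, b):
    """R18: if a is above b, a may be the Ideal of the Real b."""
    if a.cy < b.cy:
        return {"rule": "R18", "Ideal": a.name, "Real": b.name}
    return None


def centre_hierarchy(a, others):
    """R19: if a lies between the others on both axes, a may be their Centre."""
    if len(others) < 2:
        return None
    xs = [p.cx for p in others]
    ys = [p.cy for p in others]
    if min(xs) < a.cx < max(xs) and min(ys) < a.cy < max(ys):
        return {"rule": "R19", "Centre": a.name,
                "Margins": [p.name for p in others]}
    return None


# Made-up layout: a scene above a product, as in a top-bottom advertisement.
scene = Participant("scene", 0, 0, 100, 60)
product = Participant("product", 0, 60, 100, 40)
print(vertical_relation(scene, product))
```

All three rules are permissive ('may'), so each function returns only a candidate reading, or None when the condition is not met. Whether a candidate reading is actually realised still depends on the other rules and on the coding orientation.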

4.7 Grammatical Rules

The Lexical Module did not receive much attention in this thesis. Rules R20 to R22 are a few examples of lexical rules that can be added to the MPA. I feel these form a core of rules that is necessary to work with the MPA. Many more rules could be added to this list, but for now this core of lexical rules suffices.

R20 Lexical Iconicity if participant PA is an icon for concept CA then PA must correspond to CA.
R21 Lexical Connotation if the articulation of participant PA has the lexical meaning CB then CA preferably takes on the connotation CB.
R22 Lexical Grouping if concepts CA1-An of participants PA1-An share a lexical identity then CA1-An may be a group of Interordinates.

One aspect that has been discussed in this thesis cannot be found in these rules: modality. Modality influences the rules in a particular way: some coding orientations will change the strength of a rule. For instance, within the technical coding orientation of diagrams the General Rule of Intuitive Correspondence (R7) will become stronger. It may become a default rule instead of a permissive rule:

R23 General Rule of Intuitive Correspondence within the technical coding orientation if participant PA is articulated differently from PB then this difference preferably corresponds to a similar difference between concepts CA and CB.

Likewise, in a technical coding orientation rules R9, R14-16, R19 and R22 will become stronger, while rules R8 and R10-12 will become weaker. The natural coding orientation, on the other hand, will strengthen the rules that deal with 'narrative' Actants (R10-12) and may weaken some rules that deal with 'conceptual' Actants (most likely R13 and R16).
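
The effect of coding orientations on rule strength can be made concrete with a small sketch. It rests on several assumptions: that strength can be modelled as three discrete levels matching the wording 'may', 'preferably' and 'must'; that a coding orientation moves a rule by exactly one level; and that a rule cannot drop below the permissive level. The label NECESSARY, the names BASE, SHIFTS and adjusted, and the one-step model are hypothetical. Only rules whose strength is stated in this chapter are given a base level.

```python
from enum import IntEnum


class Strength(IntEnum):
    PERMISSIVE = 1  # worded with "may"
    DEFAULT = 2     # worded with "preferably"
    NECESSARY = 3   # worded with "must"; the label is an assumption


# Base strengths read off the wording of the rules in this chapter.
BASE = {
    "R7": Strength.PERMISSIVE,
    "R13": Strength.DEFAULT,
    "R14": Strength.PERMISSIVE,
    "R15": Strength.PERMISSIVE,
    "R16": Strength.PERMISSIVE,
    "R19": Strength.PERMISSIVE,
    "R22": Strength.PERMISSIVE,
}

# +1 = stronger, -1 = weaker. One-step shifts are an assumption, and the
# weakening of R13 and R16 is only "most likely" according to the text.
SHIFTS = {
    "technical": {"R7": +1, "R9": +1, "R14": +1, "R15": +1, "R16": +1,
                  "R19": +1, "R22": +1,
                  "R8": -1, "R10": -1, "R11": -1, "R12": -1},
    "natural": {"R10": +1, "R11": +1, "R12": +1, "R13": -1, "R16": -1},
}


def adjusted(orientation: str) -> dict:
    """Return the rule strengths in BASE as shifted by a coding orientation."""
    shifts = SHIFTS.get(orientation, {})
    lowest, highest = min(Strength), max(Strength)
    return {
        rule: Strength(max(lowest, min(highest, base + shifts.get(rule, 0))))
        for rule, base in BASE.items()
    }


print(adjusted("technical")["R7"].name)
```

In this model R7 moves from permissive to default within the technical coding orientation, which is the change that R23 spells out. Rules that are already permissive stay permissive when weakened, which is a limitation of the discrete model and not a claim about the grammar.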

The rules listed above can be said to be quite general rules for visual language within Western culture. Each sub-language, as discussed in chapter 3, is characterized by a different list. Specific rules can be added, such as the typical text-balloon structure encountered in graphic novels, comics and cartoons. Also, the strength of the rules can be slightly altered, similar to the effects of different coding orientations. In fact, the difference between coding orientations and sub-languages becomes slightly blurred by a strong correlation between them. In the next, and final, chapter I will investigate this correlation further by comparing different genres of images to the general rules above. Hopefully this will allow me to clearly point out the different patterns and structures that underlie these genres.


[1] Someone who 'knows' the rules does not need to know these rules consciously. Many people will have an intuitive understanding of visual grammar. Especially people who work with images a lot will acquire a certain level of what might be called 'visual literacy'.

[2] Figure 4.1 represents only one of three configurations that Johnson discusses. The object in figure 4.2 is not labelled or named in Johnson, it is my own addition.

[3] A Reaction process is another type of structure distinguished by Kress and Van Leeuwen. It is mentioned here as an example but will also be discussed in section 4.4.

[4] Note, however, that the 'weak actor' structure and the 'strong actor' structure are in this case not the result of syntactic analysis only. It is the result of the combination of syntactic information and a particular articulation, and must therefore be realized in the Conceptual Module; the syntactic structure does not make the actor weak. A 'weak actor' structure can also be realized by the combination of syntactic and lexical information if the actor can be identified as a person or thing that is known to be weak.

[5] The term 'actant' comes from the narrative theory of A.J. Greimas. In his theory an actant is a label for a typical role a character, object or place might have in a certain story, which is very close to the typical roles of participants in images (cf. Bal 1997: 196-202).

[6] The possibility of a participant that is both the Actor and the Goal of the same action is thought-provoking, but I could not find an example of this.

[7] I would not say that the label "30 000" is part of the figure, rather it is superimposed over it.

[8] In cultures where people write from right to left the positions of the Given and the New are reversed. What is more, from a conversation with someone from Israel, where most people are used both to languages written from right to left, such as Hebrew, and to languages written from left to right, such as English, I learned that this convention seems to be not very strong at all. It would be very interesting to test this phenomenon.
