Chapter 1 of Visual Grammar

This is a thesis about visual grammar. I assume that imagery is a form of language akin to spoken and written words. In this chapter I will try to support this assumption by examining the similarities between verbal language and imagery [1].

I am not the first to examine these similarities. They have been the focus of quite a few semiotic theories. Due to its linguistic background, semiotics initially focused on verbal language and representations before moving to languages outside the verbal domain. In fact, Saussure thought that the verbal sign was the best example of an “entirely arbitrary” sign [2] and that, by extension, verbal language “serves as a model for the whole of semiology, even though languages represent only one type of semiological systems” (Saussure, 1983: 68) [3]. However, a semiotic approach to visual representations has its problems. Some scholars see any attempt to approach images as language as a hostile takeover of visual art by the field of linguistics. This is by no means my intention. Although I do make extensive use of linguistic (and other) theories, I do so to set up a grammar that is very different from verbal grammar. I focus on the visual substance: the particular visual tricks and mechanics used by the visual designer. I do not try to subject visual grammar to a verbal one, nor do I wish to translate (or represent) visual representations to a verbal form for easier processing. My object is to create a model or basis for a visual grammar that shares some theoretical characteristics with contemporary linguistics, so that word and image can be studied and compared in a theoretical way and in unison. This grammar is not designed to study particular visual meaning, but visual language itself.

The existence of a visual language that is more or less similar to verbal language would justify a linguistic approach to visual representations. To find this common ground between words and images I will refer to Noam Chomsky and the fundamental characteristics of language he proposes. This exploration will help me to answer the two fundamental questions that are central to this chapter: what constitutes a language? And are the fundamental elements of a language present in visual representations?

1.1 The Characteristics of Language

Visual language is similar but not identical to verbal language. The media in which visual and verbal expressions are made are simply too different. Verbal language has a temporal dimension that most static images lack. Visual representations can exploit a two-dimensional plane in different and often more powerful ways. These are but a few examples of how the two differ. However, one of the essential elements of semiotics is the distinction between (individual) language-use and the system underlying language-use; between ‘parole’ and ‘langue’ (Saussure 1983: 13-14). Semiotics is primarily interested in the system underlying language-use. Thus a semiotic comparison of verbal and visual languages is above all a comparison of the systems underlying these languages. These systems can still be similar in competence and even similar in effect, even though the languages themselves are quite different.

The “abstract system underlying [linguistic] behavior, a system constituted by rules that interact to determine the form and intrinsic meaning of a potentially infinite number of sentences” is what Noam Chomsky calls a generative grammar which he equates with linguistic competence (Chomsky 1972: 71). A generative grammar consists of elements particular to one specific language and universal elements that are “conditions on the form and organization of any human language” (ibid: 71). A visual language should contain similar elements and meet the same conditions. It is this common ground that can justify a semiotic/linguistic approach to visual representations.

For Chomsky, the first important condition a grammar is required to meet is the (possibility of) correspondence between surface structure and deep structure. The surface structure is the structure formed by the elements and groups of elements that make up the sentence. It emerges from the subdivision of a sentence into separate phrases. These phrases are further subdivided until an elementary level is reached. The deep structure, on the other hand, is formed by ‘units of meaning’ or propositions. These propositions also form a constellation or structure. Consider the following sentences:

S1 Invisible God created the visible world.

S2 A wise man is honest.

On a superficial level sentence S1 consists of the subject ‘invisible God’, the object ‘the visible world’ and the action verb ‘created’ (see figure 1.1). Its deep structure consists of propositions: ‘God is invisible’, ‘God created the world’ and ‘the world is visible’. In this case the surface and deep structure appear to be quite similar, but this is not necessary. S2 has a surface structure that is quite similar to the surface structure of S1. Its deep structure, however, is different. The deep structure of S2 contains the propositions ‘a man is wise’ and ‘a man is honest’ but these are in a different constellation than the propositions in the first example (ibid. 16-17).

Figure 1.1 - Surface structure of S1

If visual representations are governed by a visual language, they also need to have a surface structure that can be related to a deep structure. In fact, many semioticians claim exactly this, even though they might use different words and concepts to describe it. For example, Umberto Eco’s syntactic and semantic systems resemble Chomsky’s surface and deep structure quite closely. The syntactic system is a “set of signals ruled by internal combinatory laws”, while the semantic system is a “set of ‘contents’”. Like the surface and the deep structure, the syntactic and the semantic systems are linked by a set of rules that Eco simply calls a ‘code’ (Eco 1976: 36-37). Interestingly, Eco’s first example of a syntactic structure, and many that follow, are visual ones.

Figure 1.2 - A still from Claire's Knee (Rohmer 1972)

Many people will agree that visual representations are in some way syntactically structured and thus have a surface structure. Although the visual syntactic structure might not share the level of complexity encountered in verbal language, this structure can be analysed in a way similar to verbal surface structure. For example, the scene in figure 1.2 above can be divided into a girl, a woman on a ladder and a man watching her. These syntactic units or ‘phrases’ can be further subdivided: the girl is carrying a basket, wearing a particular jacket, and has a peculiar haircut.

In visual representations individual signs take the place of words in verbal surface structures. Visual signs can also be grouped into phrases, or ‘participants’ as I will call the visual equivalent of phrases [4]. Visual surface structures correspond to deep structures in much the same way as the surface structures of sentences correspond to their deep structures. Sections 1.2 and 1.3 focus on the visual sign: how it can be identified and how it corresponds to mental concepts.

Chomsky’s second condition for his universal grammar is that language allows the speaker to make infinite use of finite means. According to Chomsky this explains man’s creativity in using language (Chomsky 1972: 17). This characteristic of language is often called discrete infinity (cf. Jackendoff 1997: 3). In other words, the potential use of verbal language is infinite. This is easily illustrated by the phrase ‘an old man’, which can be expanded to ‘a very old man’, which in turn can be expanded to ‘a very very old man’, and so on. Likewise, the number of possible words is also infinite; most Indo-European languages allow the speaker to create new words by combining two (or more) existing words. Given this infinite combinatory possibility of language and the human capability to understand newly encountered structures and words, and given the limited ability of the human brain to store grammatical rules and words, it becomes evident that a universal grammar has to consist of finite means to create and understand infinite utterances (Chomsky 1957: 18-24). To accurately model this characteristic of language Chomsky makes use of rules that are abstract and recurrent. A finite set of such rules can be combined in infinite ways, allowing a finite set of units to have an infinite number of possible configurations. Section 1.4 deals with discrete infinity in visual representations.
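The expansion of ‘an old man’ described above can be sketched with two small rules, one of which refers to itself. This is only a toy illustration in Python, not Chomsky's own formalism; the function names and the `depth` parameter are assumptions introduced here for the sake of the example.

```python
def adjective_phrase(depth):
    """Recursive rule: ADJP -> 'old' | 'very' ADJP."""
    if depth == 0:
        return "old"
    # The rule calls itself, so one finite rule covers every depth.
    return "very " + adjective_phrase(depth - 1)


def noun_phrase(depth):
    """Rule: NP -> article ADJP 'man', with the article agreeing with the next word."""
    modifier = adjective_phrase(depth)
    article = "an" if modifier[0] in "aeiou" else "a"
    return f"{article} {modifier} man"


phrases = [noun_phrase(n) for n in range(3)]
for phrase in phrases:
    print(phrase)
# an old man
# a very old man
# a very very old man
```

Three words and two rules are enough to produce a distinct phrase for every value of `depth`, which is the sense in which a finite means allows for infinite use.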

1.2 The Visual Sign

The sign is the elementary structural unit of semiotics. Saussure defined the sign as follows: “a sign is the combination of a concept and a sound pattern” (Saussure, 1983: 67). This terminology shows that Saussure had spoken verbal signs in mind when he formulated this definition. Saussure quickly replaced ‘concept’ and ‘sound pattern’ with signified and signifier respectively in order to remove the ambiguity that arises from the fact that ‘sign’ is commonly used to refer to the signifier only. Saussure argued that the labels ‘signified’ and ‘signifier’ “have the advantage of indicating the distinction which separates each from the other and both from the whole of which they are part” (ibid. 67).

The signifier is not the material aspect of the sign. It is the psychological impression of something material: the impression of sound, light or anything else the human senses are able to detect. Likewise the signified is also a psychological entity that is conventionally linked to the signifier (ibid. 66).

After this Saussure formulated two fundamental characteristics of the sign. The first characteristic is the sign’s arbitrary character: there exists no internal or necessary connection between the signifier and the signified (ibid. 67). That is to say, the connection between the signifier and the signified is unmotivated. This does not mean that the connection depends on the free choice of its users. In fact, users of signs are bound by many conventions that govern which signifier can be connected to which signified. This arbitrary or unmotivated nature can easily be illustrated. The sign ‘ox’ consists of the signifier ‘ox’ and the signified ‘ox’. It is quite conceivable to replace the signifier with something else. In fact, different languages use different signifiers for the same signified. Although Saussure does not rule out the possibility of non-arbitrary signs, he argues that at least in human language most signs are arbitrary. Even exclamations and onomatopoeic words can be shown to vary from one language to another (ibid. 68-69).

The second characteristic of the spoken sign is its linear character. Its auditory nature causes it to occupy “a certain temporal space and […] this space is measured in only one dimension” (ibid. 69-70). Of course this does not hold for all signs; Saussure himself points out that visual signs can exploit more than one dimension simultaneously. And it is quite possible to circumvent the linear character of the spoken sign. When written down, a text can exploit the dimensions of the page to some extent, and as we will see rhyme and rhythm can also set up ‘extra-linear’ relations in a written or spoken text. However, the linear character of the sign is worth mentioning because from this it follows that a sign needs to be articulated in contrast to what precedes and what follows (ibid. 70). Articulation not only serves to distinguish one sign from another, it also serves to distinguish signs from the non-signs that might surround them. Articulation, for Saussure, is not just a matter of pronunciation; it is the division of a set of stimuli into distinct and meaningful units. Articulation in this sense is always perceived articulation. It is dependent on the nature of the stimulus, the biological disposition of the senses and individual cognitive competence. When one listens to an utterance in an unfamiliar language it is very difficult, if not impossible, to detect where the words start and end.

According to Saussure the human language faculty consists mainly of its ability to construct a “system of distinct signs corresponding to distinct ideas” (ibid. 10). It is the human ability to articulate stimuli that for Saussure allows communication, even when the stimuli themselves are not articulated. Saussure’s famous statement that in “language itself there are only differences” (ibid. 166) should also be seen in this light, for differences in volume, timbre and pitch, ordered on a temporal plane are the most important means of articulation available to (spoken) human language.

I would like to propose that ‘temporal position’ is another means of articulation. Like timbre, pitch and volume, the temporal dimension of verbal language is exploited to create order in an otherwise unordered string of sounds. Therefore I see little difference between the temporal dimension of verbal language and the means of articulation by timbre, pitch and volume. Speed, rhythm and spacing of sound are all means of articulation that can be used because verbal language has a temporal dimension, or a linear character. Speed, rhythm and spacing are also used in written language. Punctuation, spacing, and the division of a text into paragraphs, sections and chapters can, to some extent, be used to the same effect. Timbre, pitch and volume, on the other hand, cannot be directly depicted in writing. These are means of articulation that are directly linked to the medium of sound. There exist, however, some typographical experiments and conventions that mimic some aspects of timbre, pitch and volume. An interesting example is internet etiquette, in which capitalising words or phrases is referred to as ‘shouting’.

The notion of articulation holds true for all types of signs, although the possible dimensions for articulation might vary. Static visual signs are articulated by differences in colour and brightness, amongst others. I will discuss these in detail in section 1.3 below. This articulation allows us to distinguish different signs spatially ordered in the picture plane. The most fundamental difference between visual and verbal signs lies not in the possible dimensions of articulation but in the potential means of articulation that these dimensions allow. Because the potential for articulation with volume, pitch and timbre is fundamentally different from the potential for articulation with colour and brightness, the verbal sign is fundamentally different from the visual sign. Written text is an interesting form of the use of signs in this regard. It employs graphical means of articulation to mimic phonological means of articulation. Although this goes well for spacing, it requires the use of typographical conventions or special signs to mimic, to some extent, the other means of articulation [5]. And even then, it does retain some of the characteristics of the visual medium of writing, as it does exploit some of these characteristics in a way that is unique to writing.

A sign does not consist of an articulated signifier alone; the signifier needs to be correlated to a signified, as well. The nature of the signifier is different from the nature of the signified. The nature of the signifier is for a large part determined by the medium and the grammatical system the signifier is part of. This is not the case for the signified, which is a mental concept. The signified can be related to many different signifiers: verbal, visual or anything else. It is the conceptual nature of the signified that allows it to be mediated by various signifiers of different natures. However, certain concepts are better expressed by signifiers of a certain type or within specific grammatical systems. To be more specific: the nature of the signifier and its grammatical system is reflected in what it signifies, therefore some signifiers or signifying systems are better suited to express certain concepts. A good example of this is the representation of statistical information. While a verbal representation (numbers) will communicate the information very accurately, a visual representation such as a bar chart will communicate the relations among the different values more effectively.

Figure 1.3 - A tree

Figure 1.3 is another illustration of this. It is a picture of a tree [6]; it has as its signified ‘tree’. Although the word ‘tree’ also refers to the same signified, it does so in a more ‘abstract’ way. The picture of a tree is more specific; it cannot be any tree, it has to be a tree with leaves and it is clearly not a palm-tree. The word ‘tree’ is indifferent to these nuances. The word ‘tree’ is very well equipped to express the abstract class of tree, while figure 1.3 does a very good job of showing this particular tree. Of course, the structure of figure 1.3 is many times more complex than the structure of the word ‘tree’, as it does not only show the tree but also many individual leaves and other qualities of this particular tree. But even when the visual tree is simplified, as is done in figure 1.4, it still cannot refer to all trees in the same way as the word can; figure 1.4 does not depict a palm-tree. Although a situation is conceivable where figure 1.4 can refer to a palm-tree, a fundamental difference remains between the word and the picture. The picture is not as unmotivated as the word; the picture is not as arbitrary. The picture is linked to the concept tree by a strong visual correspondence; only strong conventions can override this link.

Figure 1.4 - Another tree

I claim that the different means of articulation available to visual and verbal languages for a large part determine the difference in grammatical potential of visual and verbal languages. Different syntactic structures relate to conceptual structures differently, because their different character would activate different mental processes that build the structure. It is rare, if possible at all, for two different syntactic structures to relate to exactly the same conceptual structure.

1.3 Articulation

Jacques Bertin (1983) has studied what he calls the visual variables of the image, which in my opinion are closely related to the visual means of articulation. He distinguishes eight visual variables that can be used to differentiate and structure visual signs. Figure 1.5 is a visual overview of the variables (the X and Y dimensions count as two separate variables).

Figure 1.5 - The visual variables

All variables, except size and value, can also be used to strengthen the association between different elements of the image. For example, all red elements in a diagram will be thought of as belonging to one category that is different from the category to which, say, all blue elements in the same diagram belong. Likewise, all entries in the same row or column of a table will be grouped together.

An interesting aspect of Bertin’s study is that some of these variables are not only an effective means of differentiation and association, they can also be used to express meaning by the way they differ. For example, the visual variable of size is also very good at representing an order or ratio of difference. That is to say that it can communicate difference (a, b and c have different sizes), order (a is larger than b but smaller than c), and a distinct ratio in size (a is twice as large as b). The X and Y dimensions can also be used in this way, while according to Bertin value can only be used to signify a certain order, not a ratio (Bertin 1983: 186-188). These four functions of the visual variables (creating difference, association, order and ratio) can be said to be the structural functions of all forms of articulation.

All visual variables are visual means of articulation. For the X dimension, Y dimension, colour and value this is most evident. If one thinks of an image as a set of coloured spots (or pixels, or brushstrokes) in a two-dimensional plane then these four variables are the most fundamental means of visual articulation [7]. Shape, size, grain and orientation can be argued to be derived from location, colour and value; a difference in shape is the result of a different constellation of coloured dots. However, their visual effect can be so subtle that they function similarly to the four evident means of articulation. The visual effect of a pointillist painting of a tree is quite different from the effect of a naturalist painting of the same tree. In this example the main difference is made by the visual variable grain. Most photo editing programs today are equipped with functions that allow one to experiment with grain and orientation. This can have drastic effects on the image (see figures 1.6-1.8). The variation of shape, making an image appear to be curved, oblique or straight, also has different connotations (cf. Kress & Van Leeuwen: 83-84). Size, and especially the thickness of outlines, has some strong connotations too (see below). The structural characteristics of visual means of articulation (creating order, ratio, association and difference) partly explain the mimetic character of images. Natural images, or pictorial images, use these characteristics to establish a direct correspondence between the ‘real’ location, size, colour, etc. of the depicted objects and their representation on the picture plane.

Figure 1.6 - Canvas tree

Figure 1.7 - Monochrome tree

Figure 1.8 - Impressionist tree

The possibility of correspondence is further explored by Meyer Schapiro. He links the “qualities of the image-substance” to the “qualities of the represented objects”. I would say that the ‘qualities of the image substance’ are the result of the use of certain means of articulation. For example, the visible patching of pigment in Impressionist paintings is a specific way of visual articulation (a typical distribution of differences in colour which partly negates the surfaces of the depicted objects or people, but which enhances the perception of the different colours these consist of). Schapiro links it to “the general effect of luminosity and air”. Such correspondences go beyond the mimetic qualities of the painting and rather “constitute the poetry of the image, its musical aspect”. It is this kind of correspondence between the qualities of the visual means and the qualities of the depicted objects that causes an object to look more massive when depicted with a thicker outline, and more fragile when depicted with a thinner outline (in Innis, 1985: 223-224).

The link with poetry that Schapiro establishes can be further justified by an inspection of the mechanisms behind the poetic function as Roman Jakobson put these down in his article Linguistics and Poetics (in Innis, 1985). From this article one can derive that poetry exploits all means of articulation more extensively and effectively than non-poetic language. For example, Jakobson states that “only in poetry with its regular reiteration of equivalent units is the time of the speech flow experienced, as it is – to cite another semiotic pattern – with musical time” (ibid. 155). Thus poetry, when compared to non-poetic language, makes marked use of speed, rhythm and spacing, the temporal means of oral articulation. What is more, Jakobson repeatedly remarks that similarity in sound or rhythm causes words to be “drawn together in meaning” (ibid. 167) and he stresses the existence of ‘sound symbolism’, which I feel is quite similar to Schapiro’s thinner or thicker outlines:

Sound symbolism is an undeniably objective relation founded on a phenomenal connection between different sensory modes, in particular between the visual and the auditory experience. […] [W]hen, on testing, for example, such phonemic oppositions as grave versus acute we ask whether /i/ or /u/ is darker, some of the subjects may respond that this question makes no sense to them, but hardly one will state that /i/ is the darkest of the two (ibid. 169).

Thus, the means of articulation in both images and texts serve not only to differentiate signs within the image or text, but also to establish a correspondence between meaning and applied means of articulation. The nature of this correspondence might vary. It can be direct: the difference in means of articulation corresponds to a ‘real’ difference between the signifieds. An example of direct correspondence is the correspondence between the location of depicted objects on the plane of a natural image and their location in the depicted reality. It can be conventional: it is based on a strong convention that a certain difference in articulation can be correlated to a ‘real’ difference between the signifieds. A good example of this would be the highly conventional correspondence between the size or location and the abstract notion of value exploited in many diagrams. Bar charts are a good example of this: the height of the bars often refers to an abstract value such as monetary value, temperature or weight. Finally the correspondence can also be intuitive. Intuitive correspondence is based on the relation between the different sensory modes on which Jakobson founded his sound symbolism. A good visual example would be the use of ‘warm’ reds and ‘cold’ blues in coloured weather maps.

The difference between these types of correspondence is not clear-cut. For example, the correspondence between time and location in diagrams is conventional, but it can also be said to be at least partly intuitive.

The nature of correspondence may also vary in level of accuracy. While in figure 1.9 only the order in the Y dimension of the named cities corresponds directly to their geographical order, figure 1.10 is more accurate in the sense that it also shows that Amsterdam is closer to Paris than Paris is to Rome. This is similar to the distinction between the ability to show order and ratio that Bertin makes. Something similar can be done verbally. Not only can the sequence of the cities uttered be made to correspond to the sequence of cities encountered when travelling from north to south; by ‘spacing’ the names, the ratio between the intervals can also be made to correspond to the ratio of time required to travel from one place to the next.

Figure 1.9 - Ordered cities

Figure 1.10 - Cities on ratio
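The difference between a correspondence of order and a correspondence of ratio can be made concrete with a small sketch. It places the three cities on a vertical axis in two ways: by rank only, and by distance from the northernmost city. The latitudes are rounded, approximate values added here purely for illustration; they, and the function names, are assumptions and do not come from the figures themselves.

```python
# Approximate latitudes in degrees north (illustrative assumption).
CITY_LATITUDES = {"Amsterdam": 52.4, "Paris": 48.9, "Rome": 41.9}


def ordinal_positions(latitudes):
    """Equal steps from north to south: only the order is preserved."""
    ranked = sorted(latitudes, key=latitudes.get, reverse=True)
    return {city: rank for rank, city in enumerate(ranked)}


def ratio_positions(latitudes):
    """Offset from the northernmost city: order and ratio are both preserved."""
    north = max(latitudes.values())
    return {city: round(north - lat, 1) for city, lat in latitudes.items()}


ordinal = ordinal_positions(CITY_LATITUDES)
ratio = ratio_positions(CITY_LATITUDES)

print(ordinal)  # equal spacing between neighbouring cities
print(ratio)    # the gap Paris-Rome is larger than the gap Amsterdam-Paris
```

In the ordinal mapping every step has the same length, so nothing can be read from the spacing. In the ratio mapping the spacing itself carries information, which is what distinguishes the second arrangement of the cities from the first.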

The most striking differences between the effects of visual and verbal representation can be attributed to the difference in the possible articulation of visual and verbal signs. Verbal representations rely almost exclusively on the distinction of signs by difference. Although poetry makes extensive use of means of articulation which also serve other roles, this use remains quite weak and secondary in ‘natural’ language-use. Most visual signs, on the other hand, rely heavily on the meaning of their articulation, while the use of conventional, language-like signs is limited [8]. One might say that the means of articulation form a structure independent of syntactic structure. This structure of articulation also corresponds to the conceptual structure. Therefore the possibilities for correspondence between visual representations (including their syntactic and articulation structure) and conceptual structure differ from those of verbal representations. I will elaborate on this in chapter 2.

1.4 Discrete Infinity

To meet Chomsky’s second condition of his universal grammar (discrete infinity) visual signs need to be recurrent. Without a set of recurrent signs - and thus with an infinite set of signs - we would not be making infinite use of finite means. The idea that visual signs are recurrent has been actively opposed, especially from within the field of art. Richard Wollheim sums up the opposition and formulates one of the most fundamental critiques of the existence of recurrent visual signs. He argues that the value of art lies in the fact that each individual work of art is unique and therefore not recurrent. To abstract a piece of art in order to be able to fit it in an abstract model of grammar would do no justice to its individual qualities. And according to his view it is necessary to abstract visual signs if we are to come up with a finite set (in Thompson 1996).

One thing should become immediately clear from this critique. Wollheim appeals to the infinite nature of visual representations when he talks about the unique and non-recurrent nature of art. In fact the same can be said of language. The infinity of visual representations is not at issue. It is the discreteness of the units of visual language that needs to be addressed. In order to counter this critique we need to examine more closely what these ‘individual qualities’ consist of, and to what extent we need to abstract individual visual signs.

To start with the former, the individual qualities of a painting can easily lie in the unique combination and articulation of a limited set of visual signs. The real issue here is scope. At one point in his essay Wollheim considers whether paintings should be compared to whole sentences or parts of sentences (ibid. 25-27). I am inclined to say neither. Many paintings have the complexity and contain a number of signs that exceeds the complexity and number of signs of a sentence. A paragraph, poem, or even a whole text is what we are looking for if we are to look for a verbal equivalent of a painting [9]. Given a representation with a large number of signs, an artist has plenty of (in fact infinite) room for individual expression using only the combination and articulation of a limited set of signs.

To return to the abstraction of signs, we must take into account that signs are not material objects. We are dealing with a constellation of signifiers and signifieds, both of which are psychological entities. This can be partly explained by looking at perception and cognition. The signified is a psychological reflection of a material object, and our senses do distort and abstract it to some extent. This distortion takes place on a low, subconscious level. The mechanisms behind this have been well documented for the eye. Biological studies of the human eye have revealed that the eye itself is capable of gathering an enormous amount of information each second. This amount is approximately 10^7 bits. However, humans do not consciously process so much information; the number of ‘bits’ humans consciously process each second is 8 to 25 (Francke 1977). This indicates the tremendous reduction, processing and parsing of information that takes place between the eye and consciousness. This processing already starts in the eye itself, takes place in the nerves that connect the eye to the brain, and is completed in the brain itself.

One phenomenon that is responsible for this is lateral inhibition. Lateral inhibition dampens the stimulation of sensory cells that border sensory cells that are all highly stimulated, while enhancing the stimulation of cells that border cells that are not highly stimulated. In effect this phenomenon enhances the edges between areas that are highly stimulated and areas that are not (see figure 1.11). Figure 1.12 is an exaggerated example of this effect. Similar processes also occur in other locations in the eye, the optic nerve and the brain (Reisberg 2001: 36-39). Thus there is always a difference between the actual and the perceived object. Our senses add artificial structure to the stimuli. This might cause a unique object to be perceived less ‘uniquely’.

Figure 1.11 - Lateral inhibition

Figure 1.12 - The effects of lateral inhibition
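The edge-enhancing effect of lateral inhibition can be imitated with a very simple calculation on a row of cells. The sketch below is a deliberately crude model, not a description of the actual physiology: each cell's response is its own stimulation minus a share of the average stimulation of its two neighbours. The `strength` parameter and the sample values are assumptions chosen for illustration.

```python
def lateral_inhibition(stimulus, strength=0.5):
    """Return each cell's response after subtracting a share of its neighbours' mean input."""
    responses = []
    last = len(stimulus) - 1
    for i, value in enumerate(stimulus):
        # Cells at either end reuse their own value for the missing neighbour.
        left = stimulus[max(i - 1, 0)]
        right = stimulus[min(i + 1, last)]
        responses.append(value - strength * (left + right) / 2)
    return responses


# A dark area (1.0) next to a bright area (5.0).
stimulus = [1.0] * 4 + [5.0] * 4
response = lateral_inhibition(stimulus)
print(response)
# [0.5, 0.5, 0.5, -0.5, 3.5, 2.5, 2.5, 2.5]
```

Within each uniform area the response is flat, but the dark cell next to the edge responds less than its dark neighbours and the bright cell next to the edge responds more than its bright neighbours. The perceived contrast at the border is therefore larger than the contrast in the stimulus itself.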

To return to the structural level, the use of abstract signs is a necessary condition for human signification and communication. This is an idea that can be traced back a long time. It was already apparent in the work of the philosopher John Locke who first conceived of the term semiotics in the contemporary sense of the word. He noted that almost all words are general words that can indicate many particular things. According to Locke this is not “the effect of neglect or chance, but of reason and necessity” (Locke 1975: 408). The reason for this is made clear in the following passage:

Men making abstract Ideas, and settling them in their Minds with names annexed to them, do thereby enable themselves to consider Things, and discourse of them, as it were in bundles, for the easier and readier improvement, and communication of their Knowledge, which would advance but slowly were their words and thoughts confined only to Particulars (ibid. 420).

Antony Flew interprets this by stating that a language consisting only of particular signs would allow us to indicate what we want to speak about, but not what we want to say about those things (Flew, 1989: 433).

The units of visual representations can be, and often are, interpreted abstractly, although this is less obvious or less marked than in verbal representations. Figures 1.13 and 1.14 illustrate this abstract interpretation. These are billboard advertisements that were part of a governmental campaign to stimulate the acceptance of various social and ethnic groups in the Netherlands. The caption translates as ‘keeping an eye out for others can make a big difference’. There is a contradiction between the caption and the image itself, because the eyes of the girl and the man are covered by their scarf and hat respectively. During the time this campaign ran, the scarf was a profound symbol of Islamic dogmatism in the Netherlands. There were many debates at the time over wearing scarves in public schools, over whether public officials should be allowed to wear them, and over whether the scarf was a means of suppressing women. Religious tradition clashed with the ideals of the open, liberal society. It is this symbol that covers the girl’s eyes. Consequently, a more general interpretation would be that ‘the Islamic faith hinders integration into Dutch society’. Given the ‘neutral’ representation of the other people in the campaign, this is probably an unintended message that arises from a generalized (and abstracted) interpretation of the image. In a sense it is not a white-with-blue-swirls-patterned scarf that is represented in figure 1.13 but a symbolic scarf that stands for a whole set of religious traditions and ideas: a symbol that incidentally has a white-with-blue-swirls pattern.

Figure 1.13 - Muslim girl

Figure 1.14 - Dutch farmer

The intended message of this campaign is also symbolic (for I take the strong symbolic message discussed above to be unintended). To establish this symbolic reading, the pictures in the campaign include some hints. First, the girl and the man are easily recognized as typical examples of social groups. The other images in the campaign depict typical examples of Africans, Asians and youngsters. The marked background, angle and lighting add to the ‘salience’ of the depicted people and help the viewer to judge them as symbols for a whole group, as does the fact that their eyes are not visible, which is stressed by the caption [10]. Most importantly, these features are recurrent throughout the whole campaign, even though the signs themselves are different. They are recurrent because the mechanisms and their effects are consistent throughout the campaign: a distinct pattern is established. A similar recurrence of patterns can be found in many (groups of) pictures. Some of these patterns are even highly conventional and most definitely language-like. Examples of this can be found in the iconographic traditions of depicting particular characters from Bible stories and classical myths. A limited group of sign-types is used again and again, even though their individual expression might vary, and possibly vary more than that of their verbal counterparts.

To conclude this chapter, I think it is safe to say that there exists a common ground between verbal language and images, enough to speak of visual language. Visual language meets the criteria for language laid down by Chomsky. Visual language is structured similarly to verbal language. This structure is also correlated with a deep, conceptual structure. The elements of this structure (signs) can be used abstractly and recurrently, and can therefore be used to achieve discrete infinity. However, the differences between visual and verbal languages must not be forgotten. Verbal and visual language make use of different means of articulation, opening up a different potential for expression within each type of language.


[1] I use the predicate ‘verbal’ here, and throughout this thesis, to distinguish spoken and written language from visual language. Although Saussure and his initial followers were more interested in spoken than in written language, I will refer to both types of language when I use the word verbal.

[2] For Saussure the arbitrary nature of the sign is one of the sign’s most fundamental characteristics even though he agrees that not all signs share this characteristic. Also see section 1.2.

[3] Saussure spoke of semiology rather than semiotics. Semiology used to refer to the French school of semiotics, whereas semiotics referred to the American school. Today semiotics commonly refers to both.

[4] I have borrowed the term participant from Gunther Kress and Theo van Leeuwen, who use it to refer to distinct groups of signs that perform a single role within a visual representation (Kress & Van Leeuwen, 1996: 47-48). Also see section 4.4.

[5] One of my favorite examples of this is the use of emoticons on the internet. These are frequently used to convey irony or sarcasm where intonation would be used if the same text were spoken.

[6] Or maybe two trees. At first glance I am inclined to think that figure 1.3 depicts one tree. However, two stems are visible, while the roots are obscured by a low stone wall.

[7] Bertin distinguishes (gray)value from colour, which together comprise what we commonly call colour. However, it is conventional to distinguish between hue, brightness and saturation.

[8] Conventional signs should not be confused with the conventional articulation discussed above. Conventional signs tend to make use of articulation only to establish difference between the individual signs.

[9] Michel Foucault starts his book The Order of Things with what might conceivably be called a verbal equivalent of a painting (Las Meninas by Velázquez), which is 14 pages long!

[10] The characteristics by which an image can serve as a symbol have been noted before. Kress & Van Leeuwen, for example, describe four of them: salience, ‘markedness’, convention and explicit indication (1996: 108). These and similar mechanisms will be discussed in detail in chapter 4.
