Multimodal Text Development

Information Guide for Teachers


What Is a Multimodal Text?

In their article "Helping Teachers to Explore Multimodal Texts" (2010), Anstey and Bull stated, "Multimodal texts can be delivered via different media or technologies. They may be live, paper, or digital electronic" (What are multimodal texts? section, para. 3).
In Understanding Digital Literacies: A Practical Introduction, Jones and Hafner (2012) explained that due to technological developments in digital media, there is a shift in types of texts encountered, particularly on screens. Digital texts are becoming less textual and much more visual. Digital texts that incorporate various modes such as layout, formatting, font, color, images, graphics, animation, video, and sound are defined as being multimodal texts. This practice of combining multiple modes in a text is known as multimodality.

Affordances & Constraints of Multimodal Texts

Jones and Hafner (2012) explained that the affordances of multimodal texts are quite apparent. Audio and video may be embedded, unlike in traditional printed texts. Creativity is fostered by the ease of use of the various modes listed above and the abundant availability of online resources. Images in multimodal texts have a more direct effect on viewers because images are spatial/simultaneous, meaning that the viewer sees all of the information in an image at the same moment, with the different elements arranged in relationship to one another. However, this same affordance can also be a constraint: because an image can convey several messages at once, the viewer must discern among its possible meanings and integrate the information accordingly.

Designing the Visual Layout

Jones and Hafner (2012) discussed how designers of a multimodal text must make specific choices about the visual layout of their text, including the size and location of elements. Most multimodal texts have similar structures and are divided into three general regions.

  • Given information, or information that the viewer is presumed to already know, is located at the left of the page, and new information is located at the right.

  • Factual or real information is found at the bottom of a page, while ideal information or information that is hoped for or strived for is found at the top.

  • The very center of a multimodal text is reserved for the most important or dominant element of the page, with other elements or information surrounding it in the margins.

The Power of the Image

Jones and Hafner (2012) further explained that in a multimodal text, images and text have a reciprocal, interactive relationship, framing each other and providing contextualized information that helps clarify the intended meaning of the message.

  • Concurrence is where the message of the text and the image reinforce each other and share the same meaning.
  • Complementarity is where the text or the image gives slightly different information from the other. This new information helps to complete the details of the other mode, clarifying the meaning. Images can complement a text by explaining the how and why of a text, by providing additional information, or by restating or specifying what is in the text.
  • Divergence is where the mood and tone of an image differ from the mood and tone of the text. This technique is often meant to create a message of irony or humor.

The authors continued by sharing how images can be very powerful in eliciting strong emotions from the viewer and in creating an interpersonal relationship that greatly affects the viewer’s attitude toward the subject matter. Photographers or creators of images employ certain techniques for this purpose.

  • The first technique is involvement. Creators may engage or involve the viewer through the subject’s gaze, the distance of the shot from the viewer, or the particular camera angle. A demand image has the subject looking directly at the viewer, demanding their attention. An offer image has the subject looking away and invites reflection or thought.
  • The second technique is power relations, which focuses specifically on the angle of the camera. A shot taken from above makes the viewer feel more dominant, but a shot taken from below may make the subject of the image seem more powerful and dominant than the viewer.
  • The third technique is modality, which focuses on color saturation, contextualization, representation, depth, illumination, brightness, and similar features, all of which can be used to create tone and mood.

Videos incorporate the same design principles as the images discussed above, with the addition of pacing, sound effects, and music.

Online Language and Social Interaction

A New Frontier

In the video “How Is the Internet Changing Language Today?” (2010), linguist David Crystal enumerated the various methods used to communicate electronically in the 21st century. Cell phones, email, instant messaging, social networking sites, webpages, and blogs are examples of the modern technology people use to keep in touch. He made the point that whenever there have been advances in technology, the way the English language is used has changed. The same is true now as new forms of electronic communication become a part of daily routines and social culture. New styles of the English language are evolving for each new mode of communication. In the video, Crystal stated, “Language has become expressively richer as a result of the internet.”

Keeping in Touch

Of the numerous technologies available for communicating, which is best? Synchronous, asynchronous, and face-to-face interaction all have their place in our 21st century society.

Jones and Hafner (2012) shared the many affordances of synchronous and asynchronous interactions. These types of messages can be multilingual and multi-scriptural, using a mix of languages and linguistic features that allow for creativity, such as emoticons, shortened forms of words, acronyms, letter homophones, lexicalization of sounds, creative punctuation, and more. These digital communications allow for interaction between the person sending the message and the person receiving it. They are extremely convenient and can be answered in real time or at a later, more convenient time, with a lower transaction cost than a face-to-face interaction. A person can keep the message short and to the point, and there is no need for pleasantries or pretenses.

However, synchronous and asynchronous interactions have their constraints as well. The limited space or character count of many digital messages may be too restrictive to get one’s message across, or there may be a delay in the communication. Messages sent via digital communication can be misconstrued due to a lack of media richness, meaning the absence of facial expressions, gestures, and tone of voice, all of which help convey emotion, purpose, and the overall context of the message.

On the other hand, a face-to-face interaction is quite different because it offers the full contextualization of the message, with facial expressions, gestures, and tone of voice to help the receiver interpret the sender’s purpose and intent. However, many see the transaction cost of face-to-face interactions as being too high: one must make time for the interaction, exchange pleasantries, and monitor facial expressions, gestures, emotions, and attentiveness.

For the Record

Leaders in their fields share their opinions on the impact of digital communication and online language. What does the future hold?

David Crystal - Linguist

“Some people dislike texting. Some are bemused by it. Some love it. I am fascinated by it, for it is the latest manifestation of the human ability to be linguistically creative and to adapt language to suit the demands of diverse settings. In texting, what we are seeing, in a small way, is language in evolution.” (Crystal, 2008, p. 11)

Andrea Lunsford - Stanford University

" 'I think we're in the midst of a literacy revolution the likes of which we haven't seen since Greek civilization,' she says. For Lunsford, technology isn't killing our ability to write. It's reviving it—and pushing our literacy in bold new directions." (Thompson, 2009, para. 3)

John Humphrys - BBC

"It is the relentless onward march of the texters, the SMS (Short Message Service) vandals who are doing to our language what Genghis Khan did to his neighbours eight hundred years ago. They are destroying it: pillaging our punctuation; savaging our sentences; raping our vocabulary. And they must be stopped. This, I grant you, is a tall order. The texters have many more arrows in their quiver than we who defend the old way." (Humphrys, 2007, para. 15-17)

Adolescents and Multimodal Texts

Digital Native or Digitally Deprived?

Being born into the digital world does not automatically make a child a digital native. In her article “Shrek Meets Vygotsky: Rethinking Adolescents’ Multimodal Literacy Practices in Schools,” Mills (2010) explained the important role of socioeconomic class and ethnicity in a child’s digital literacy development. Mills shared research finding that children of middle-class families have more exposure to the Internet and family digital tools and therefore have better online and digital skills than children of working-class or low-income families. Research shows a similar pattern by ethnicity: Caucasian families ranked the highest in Internet usage, surpassing Hispanic, African American, Native American, and Asian families.

Assuming that a 21st-century learner is inherently a digital native can therefore be a gross misjudgment of a student’s digital skills. It is the same as assuming that a student from any English-speaking family will inherently have a vast, rich, formal vocabulary. Teachers know that a family’s socioeconomic background, and often its ethnicity, play a huge role in the development of a student’s vocabulary and language skills. Mills’ article supports the view that the same is true of digital literacy skills.

Implications for Schools

What is the best way for schools to incorporate and scaffold the teaching of multimodal literacy? Mills (2010) continued her article by sharing the different viewpoints of Dewey and Vygotsky.

"The current drive toward including the literacies of youth is reminiscent of Dewey...who emphasized learners' readiness to learn and the need to take into account the knowledge, competencies, and interests of the learner as the launching point of instruction. This view becomes problematic if all of education is situated in youths' out-of-school literacy experiences." (p. 37)

"Vygotsky convincingly argued that adults should bridge the distance between learners' current levels of understanding and levels that can be achieved through collaboration with experts and powerful artifacts. This principle resolves the tension between the multimodal and popular literacy practices of youths and school-sanctioned literacies." (p. 38)

Though students come with various backgrounds and skill levels, Mills (2010) explained that students are able to bring their own personal textual and multimodal experiences from their home and social worlds into the classroom and apply those experiences to new digital literacy activities and lessons. Schools must provide the expert resources that students need, such as teachers, books, and technologies, to help students become competent in the multimodal and digital literacy skills that they are not able to master on their own in their home and social worlds. Mills stated, “The goal of literacy education is to point youth in the right direction so that they can extend their current practices to a wider range of productive purposes” (p. 40).

Teaching in Action

Mills (2010) explained that it is up to teachers to serve as role models for students and as experts in multimodal and digital literacy practices. Students require direct instruction that introduces them to new digital skills, forms of communication, the creation of multimodal texts, and multimedia production.

Below, Scott DeWitt, an associate professor at Ohio State University, shared his insight into working with students on multimodality and composition.

Scott DeWitt on multimodality and engagement

Churches (2012) shared that Bloom's Digital Taxonomy is designed to help facilitate learning, not to teach how to use specific technologies. Rubrics should be designed to assess processes or end products. The embedded link below provides an in-depth look at each level of Bloom's Digital Taxonomy.

"Teachers of English need to do more than incorporate the out-of-school literacy practices, interests, and predilections of youth. They must also extend the range of multimodal practices with which students are conversant. Teachers can extend the multimodal literacies that are valued in youth networks to give students recognition in the global communications environment." (Mills, 2010, p. 42)


Anstey, M., & Bull, G. (2010). Helping teachers to explore multimodal texts. Curriculum & Leadership Journal: An Electronic Journal for Leaders in Education, 8(16). Retrieved from,31522.html?issueID=12141

Churches, A. (2012). Bloom’s digital taxonomy. Retrieved from

Crystal, D. (2008). Txting: frnd or foe? The Linguist, 47(6), 8-11. Retrieved from

Crystal, D. (2010). How is the internet changing language today? [Video file]. Retrieved from

Humphrys, J. (2007). I h8 txt msgs: How texting is wrecking our language. Mail Online. Retrieved from

Jones, R.H., & Hafner, C.A. (2012). Understanding digital literacies: A practical introduction. New York, NY: Routledge.

Mills, K. (2010). Shrek meets Vygotsky: Rethinking adolescents’ multimodal literacy practices in schools. Journal of Adolescent & Adult Literacy, 54(1), 35-45.

Scott DeWitt on Multimodality and Engagement [Video file]. Retrieved from

Smore. (n.d.). Retrieved from http//

Thompson, C. (2009). Clive Thompson on the new literacy. Wired Magazine. Retrieved from