MERLOT Journal of Online Learning and Teaching
Vol. 2, No. 4, December 2006
Learning by Tagging: The Role of Social Tagging in Group Knowledge Formation
Jude Yew
Doctoral Student
School of Information
University of Michigan
Ann Arbor, MI
USA
jyew@umich.edu
Faison P. Gibson
Assistant Professor
School of Business
Eastern Michigan University
Ypsilanti, MI
USA
fgibson@emich.edu
Stephanie D. Teasley
Research Associate Professor
School of Information
University of Michigan
Ann Arbor, MI
USA
steasley@umich.edu
Abstract
This
research presents a case study on the use of Social Tagging in
an undergraduate classroom at the University of Michigan during
the Fall 2005 semester. Students were between 20 and 22 years
of age. Students tagged their individual blog posts to
contribute to themes and conversations in an online learning
environment. Using content analysis of the blog posts and tags
as well as semi-structured interviews, the study examines the
role of online social tagging for tracking and aiding group
knowledge formation.
Introduction
This paper
presents a case study from an ongoing research project that
investigates knowledge and community formation in online
learning environments that employ social tagging. These learning
environments allow the user to organize and display online
content, such as blogposts and bookmarks, with meaningful
keywords or tags presented in a public and collaborative manner.
Such labeling of online content potentially allows the
individual learner and the community to use technology and
social conventions to organize knowledge, coordinate with
others, and facilitate the sensemaking efforts of the community
(Mathes, 2004).
This study
makes the argument that social tagging systems employed within a
learning community can both facilitate the process and provide
evidence of knowledge formation within the group. To investigate
this, we first put forward a theoretical argument for why social
tagging systems should be employed to facilitate the production
of group knowledge. We then present an analysis of an
undergraduate business school class’ online learning environment
that utilized social tagging.
The case for
social tagging
Tagging
describes the activity of marking online content with keywords,
called “tags”, as a way to organize content for future
navigation, filtering or search. Tags are not based on a
controlled vocabulary, but rather are left to the user’s wishes,
although, as shown in this study, group norms and social processes
can play a significant role in an individual’s choice of tags,
leading to fairly consistent assignment of specific tags (Mathes,
2004). This act of assigning tags to categorize an object is an
act of knowledge production, as it makes apparent the mental
models, or internal representations of knowledge, that one
associates with the object (Pauen, 2002). The argument being
made here is that by allowing students to associate keywords with
objects, we are enacting the associative structure of knowledge
formation (von Ahn & Dabbish, 2004). New knowledge is formed in
the allocation of tags, as the individual has to make sense of
the new object by associating it with prior understandings and
classification of objects. For instance, by categorizing a
digital photograph with the tag ‘vacation’, we are immediately
providing information about the content of the photograph
without actually having to view it. Also, the tag “vacation”
provides information to others about how we have contextualized
the photo. Thus, the use of tags can function both as a way to
facilitate the formation of new knowledge as well as to provide
evidence of how this knowledge evolves over time.
Tagging is
social because the tags are visible to the whole group with the
potential for influencing the tags adopted by each group member.
We believe that social tagging systems employed within a
learning community can facilitate knowledge formation within the
group. In addition, social tagging can provide evidence of
knowledge formation to both the group members and to
researchers/analysts. In a class, the tags used by an individual
student to categorize online content also function as a
“repository” of how that particular student made sense of and
assimilated the material being taught in the class (Argote,
1999; Weick, Sutcliffe & Obstfield, 2005). When tags are made
public and shared, other students in the class are able to tap
into the knowledge being formed by the individual student.
Students are able to view the tags used by others and employ
those tags to inform their own understanding, creating an
iterative learning loop (Russell, Stefik, Pirolli & Card, 1993).
Additionally, the tags employed by one member of the class can
“self-propagate” and become a “linguistic meme” that enables the
entire class to organize and coordinate their online discussion,
and in the process of doing so, establishes a common
understanding of the material being taught (Heath & Seidel,
undated).
Methodology
The setting
This study
took place in Business Information Technology 320 (BIT320), a
database and information class offered at the University of
Michigan. The class was offered to undergraduates aged 20 to 22
at the Business School, and a large part of the class was devoted
to group work in which students were expected to create information
databases based on the technologies taught within the syllabus.
BIT320 also used blogs and RSS (an XML format for syndicating
blog content) to create an online space where both the professor
and the students could share their knowledge. The class website
was dubbed the “Class Remix” to encourage participants to
improve upon, change, integrate, or otherwise “remix” the
group’s knowledge contributions similar to Lessig’s notions of
a remix culture (Koman, 2005). Participation in the Class Remix
was mandated through a class policy that stipulated five blogposts
per week, which were then aggregated on the site (available
on the web and pictured in Figure 1). Students were encouraged to
create a vibrant learning community where group knowledge was
built collectively by sharing relevant links, questions,
answers, and observations of the material taught in the class.
In this
environment, students could post about new ideas, or they could
effectively respond to the contributions of others by writing a
response in their own blog and linking back to the original
poster. In this way, conversations (initial post, comment,
response to comment, etc.) effectively occurred across student
blogs. When engaging in these sorts of conversations, students
were encouraged to reuse at least some of the tags that previous
posters had used, as well as to add any new tags they might
find relevant. In this way, whole conversations came to be
grouped by tag and were made findable by tag. A limitation of
the system was that once a post was tagged and saved, the tags
could not be changed.
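The grouping of conversations by tag described above can be sketched as a small inverted index. This is only an illustration in Python, not the Class Remix implementation: the `Post` and `TagIndex` classes, the `reply_tags` helper, and the sample posts are all hypothetical, and matching tags case-insensitively is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """A blog post whose tags are fixed once it is saved (hypothetical model)."""
    author: str
    title: str
    tags: frozenset


class TagIndex:
    """Maps each tag to the posts carrying it, so a conversation is findable by tag."""

    def __init__(self):
        self._by_tag = defaultdict(list)

    def add(self, post):
        # Assumption: tags are matched case-insensitively.
        for tag in post.tags:
            self._by_tag[tag.lower()].append(post)

    def posts_for(self, tag):
        return list(self._by_tag.get(tag.lower(), []))


def reply_tags(original, new_tags=()):
    """Reuse the original poster's tags and add any new ones the replier finds relevant."""
    return frozenset(original.tags) | frozenset(new_tags)


# Made-up posts: a question and a reply that reuses its tags.
index = TagIndex()
question = Post("Student A", "How do joins work?", frozenset({"classquestions", "SQL"}))
answer = Post("Student B", "Re: joins", reply_tags(question, {"Databases"}))
index.add(question)
index.add(answer)

# Both posts surface under the shared tag, in the order they were added.
print([post.title for post in index.posts_for("classquestions")])
```

Because the reply inherits the tags of the post it responds to, following a single tag retrieves the whole exchange, which is the behavior the class norm was meant to produce.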
Figure 1: Screen capture of class "remix" website (04/14/06)
Unlike more
orthodox and prescribed forms of classification, social tagging
allowed the users in the community to assign any
keyword/category to their contribution that they deemed
relevant. Various visualizations, such as the tag cloud
on the class website (highlighted in blue in the lower right corner of
Figure 1), helped members of the class stay aware of the
current and most frequently submitted topics/posts. The class
remix website can be seen as an archive of the students’
contributions, and can be used to document the students’
evolving understanding and the knowledge formation that took
place during the course.
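One common way to build a tag cloud is to scale each tag's font size by how often it is used. The sketch below assumes a logarithmic scale and illustrative point sizes; the paper does not say how the class website sized its cloud, so `cloud_sizes`, `min_pt`, and `max_pt` are hypothetical. The sample counts are taken from Table 2.

```python
import math


def cloud_sizes(counts, min_pt=10, max_pt=32):
    """Map positive tag counts to font sizes on a log scale (sizes are illustrative)."""
    lo = math.log(min(counts.values()))
    hi = math.log(max(counts.values()))
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # Weight runs from 0.0 (least used tag) to 1.0 (most used tag).
        weight = 0.0 if span == 0 else (math.log(n) - lo) / span
        sizes[tag] = round(min_pt + weight * (max_pt - min_pt))
    return sizes


# A few frequencies from Table 2.
counts = {"Technology": 280, "Opinionslug": 270, "XML": 38, "SQL": 22, "Weblogs": 17}
sizes = cloud_sizes(counts)
for tag, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{tag}: {size}pt")
```

A log scale keeps rarely used tags legible when a handful of tags dominate, as they did in this class.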
Data & Methodology
Data for
this study were composed of participants’ contributions to the
class remix website and in-person interviews. To better
understand the role of the remix site in the participants’
learning, content analysis was performed on the student blog
posts and the tags they employed to describe these posts.
Additionally, the students’ grades in the class and
semi-structured interviews with seven out of the eleven
participants in the class provided complementary data. In the
following section, the server log analysis, the key findings
generated by the interviews, and the content analysis of the
blogposts are reported.
Findings
Table 1
outlines the total number of blogposts made by each student in
the class during the term, the total number of tags that they
associated with their blogposts and the average number of tags
per blog post contributed to the class website.
The majority
of the students adhered to the instructor’s requirement that
they contribute five blogposts a week to the class website. With
the exception of three students, everyone in the class met the
term minimum of 65 blogposts (highlighted in Table 1 by the red line).
Source | Total Posts | Total Tags | Avg. Tags/Post
The Blogstar | 36 | 75 | 2.0833
Musings of William h | 41 | 72 | 1.7561
Matt’s Musings | 61 | 156 | 2.5574
jb's blog | 65 | 150 | 2.3077
zee124 | 66 | 124 | 1.8788
Shady Waters | 66 | 219 | 3.3182
Supriya | 66 | 146 | 2.2121
Pink Footsie | 68 | 154 | 2.2647
Tigerlily's Blog | 69 | 119 | 1.7246
Kevin’s Blog | 70 | 137 | 1.9571
SuperMatt | 72 | 230 | 3.1944
Blogonautic Solutions (instructor) | 74 | 198 | 2.6757
Table 1: Total blog posts and tags and avg. tags per post
(13 weeks x 5 blog posts/week = 65 minimum required posts)
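The arithmetic behind Table 1 can be reproduced with a short sketch. It uses a handful of rows from the table; the variable names are illustrative.

```python
WEEKS = 13
POSTS_PER_WEEK = 5
MINIMUM_POSTS = WEEKS * POSTS_PER_WEEK  # 65 posts over the term

# (source, total posts, total tags) for a few rows of Table 1
rows = [
    ("The Blogstar", 36, 75),
    ("Matt's Musings", 61, 156),
    ("jb's blog", 65, 150),
    ("Shady Waters", 66, 219),
    ("SuperMatt", 72, 230),
]

# Average tags per post is simply total tags divided by total posts.
averages = {source: round(tags / posts, 4) for source, posts, tags in rows}

# Sources that fell short of the term minimum.
below = [source for source, posts, _ in rows if posts < MINIMUM_POSTS]

for source, posts, tags in rows:
    status = "met" if posts >= MINIMUM_POSTS else "below"
    print(f"{source}: {averages[source]:.4f} tags/post, {status} the {MINIMUM_POSTS}-post minimum")
```

The computed averages match the Avg. Tags/Post column for these rows, for example 75 / 36 ≈ 2.0833 for The Blogstar.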
The
instructor’s purpose for stipulating a minimum requirement of
contributions was to encourage the students to fully utilize the
system, and to ensure sustained participation from the students.
The instructor’s rationale for mandating participation online is
illustrated in the following quote:
… This is
one of those things where initially people have some hesitation
… I mean there's just all that group anxiety that comes into
play and so you got to get over that hump, you got to get over
it early and just start making it happen. It’s also practice
(that) makes it better … (Inst1interview, 0:32:50)
As shown by
the Avg. Tags/Post column in Table 1, participants tended
to use more than one tag to describe the content of each blog
contribution, a common practice in this type of system (Kroski,
2005). Because of the great number of tags being employed, one
issue that emerges is that of the vocabulary problem (Furnas,
Landauer, Gomez & Dumais, 1987). This problem highlights the
issue that there are multiple ways to describe an object/idea
and that random pairs of people label an object similarly at
most 20% of the time (Furnas et al., 1987). Because of the
vocabulary problem, participants in the class had to
determine what the common vocabulary for describing their
blog posts should be. One student described how the group
made sense of multiple tags as follows:
So when you
have hundreds of tags, it's really the case that only a few of
them are important. And that was the case here. And so people
were able to figure that out, and that we had sort of themes. So
at any given point in time, maybe 10 tags would be important.
(Stud2 interview, 0:13:51)
This pattern
was reflected in the analysis of the server logs. In total 143
distinct tags were used 1780 times during the term. However, not
all tags were used equally. As indicated by the quote from
Student 2 above, a small number of keywords were
used more frequently than the others. Figure 2 highlights the ‘Long
Tail’, or exponential distribution, phenomenon (Anderson,
2004), in which a large proportion of the 143 keywords contributed
were used only once or twice.
Figure 2: Tag Frequency Distribution
Of the 20
most frequently used tags shown in Table 2, the top four tags
(highlighted in Table 2 below) were used at least three times
more frequently than the others.
Tag/Keyword | Frequency
Technology | 280
Opinionslug | 270
Classquestions | 183
Blogging | 145
Microsoft | 40
XML | 38
Internet | 38
Blog |
Remixing | 34
Project2 | 30
Databases | 29
NewInventions | 27
Project1 | 26
WordPress | 25
Google | 23
SQL | 22
ClassIssues | 22
DenaliFlavours | 21
Ipod | 20
Normalization | 17
Weblogs | 17
Table 2: Top 20 Distinct Tags by frequency used
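The concentration of use in the top four tags can be made concrete with a short calculation from the figures reported above (143 distinct tags, 1780 total uses, and the Table 2 frequencies). The variable names are illustrative.

```python
TOTAL_TAG_USES = 1780
DISTINCT_TAGS = 143

top_four = {"Technology": 280, "Opinionslug": 270, "Classquestions": 183, "Blogging": 145}
fifth_most_used = 40  # Microsoft, the next tag in Table 2

top_four_uses = sum(top_four.values())

# Share of all tag uses accounted for by the four most popular tags.
share = top_four_uses / TOTAL_TAG_USES

# Average number of uses for each of the remaining distinct tags.
remaining_mean = (TOTAL_TAG_USES - top_four_uses) / (DISTINCT_TAGS - len(top_four))

# How much more often the least used of the top four appears than the fifth tag.
ratio_to_fifth = min(top_four.values()) / fifth_most_used

print(f"Top four tags: {top_four_uses} uses ({share:.1%} of all uses)")
print(f"Mean uses per remaining tag: {remaining_mean:.2f}")
print(f"Fourth tag vs. fifth tag: {ratio_to_fifth:.2f}x")
```

By this count, four of the 143 tags account for 878 of the 1780 uses, or roughly half of all tagging activity, while the remaining 139 tags average about 6.5 uses each.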
By
investigating the timing of when certain tags were adopted and
their patterns of use, the formation of group knowledge and
convention can be represented. As shown in Table 3, the top
four tags were adopted by the students early on in the semester
and their continual use resulted in them becoming conventions
for the students in the class to talk about specific subjects in
their blog contributions.
Tag | Source | Earliest date published
Technology | Kevin’s blog | 09-11-2005
Opinionslug | Pink Footsie | 09-14-2005
Blogging | jb’s blog | 09-14-2005
Classquestions | Tigerlily’s blog | 09-15-2005
Table 3: Top 4 tags by source and earliest date published
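Finding who first used each tag, and when, amounts to keeping the earliest record per tag. The sketch below is a minimal illustration rather than the analysis procedure actually used: the record layout and the `first_uses` function are hypothetical, the Table 3 dates are read as MM-DD-YYYY (an assumption), and the two later reuses are made up.

```python
from datetime import date

# First uses from Table 3 (dates read as MM-DD-YYYY, an assumption),
# followed by two made-up later reuses for illustration.
records = [
    ("Technology", "Kevin's blog", date(2005, 9, 11)),
    ("Opinionslug", "Pink Footsie", date(2005, 9, 14)),
    ("Blogging", "jb's blog", date(2005, 9, 14)),
    ("Classquestions", "Tigerlily's blog", date(2005, 9, 15)),
    ("Opinionslug", "Hypothetical Student", date(2005, 9, 20)),
    ("Technology", "Hypothetical Student", date(2005, 10, 3)),
]


def first_uses(records):
    """Return {tag: (source, date)} for the earliest published use of each tag."""
    earliest = {}
    for tag, source, published in records:
        if tag not in earliest or published < earliest[tag][1]:
            earliest[tag] = (source, published)
    return earliest


adoption = first_uses(records)
for tag, (source, published) in sorted(adoption.items(), key=lambda item: item[1][1]):
    print(f"{published.isoformat()}  {tag:<15} first used by {source}")
```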
Other more
specific tags like SQL, XML and Databases were used only during
the part of the term where that subject was the most heavily
discussed in class. The instructor of the class described the
phenomenon as follows:
… a tag
winds out being a term or label that people introduce. They
introduced it to have a shorthand for referring to some
phenomenon. And then if they re-use this term at given points
in time, they're saying that phenomenon is there. And so what
winds up happening is you see that there are themes, and
basically these are recurring uses of tags. (Inst1 interview,
0:15:47)
The
formation of “themes” within the class suggests how social
tagging aids with the formation of group knowledge around
specific course content. The frequency of use of the top four
tags and the instructor’s comments support the claim that those
tags are functioning as artifacts/repositories of the shared
understanding between the individuals in the class (Argote,
1999). And because these tags have been used by every member of
the class at one point or another during the term, group
knowledge or shared understanding has been formed as a result of
the “learning loop” that occurs through their use (Russell et
al, 1993).
The
differential use of tags
Content
coding of the student interviews revealed that not all tags were
used in the same way. There were two kinds of tags: functional
tags (e.g. “opinionslug” or “classquestions”) and content tags
(e.g. “technology” and “XML”).
Functional
tags are labels that indicate some form of utility or function
to the members of the class. For example, the “classquestions”
tag was deliberately used by the instructor of the class as a
way to easily indicate and highlight questions or problems that
the students may be having with the material being taught. One
functional tag, “opinionslug”, was a keyword first coined by a
student, Pink Footsie. “Opinionslug” was used to indicate
contributions that were personal opinions or views on either the
content matter or administrative aspects of the class. According
to Student 2,
… at first
it was only Pink Footsie who used that ... cause she was the one
who invented it ... but then as we started reading more and
understanding what she meant by 'opinionslug' ... we definitely
all started using it ... but if you just started looking at this
(tag) you would probably have no idea what it was ... So it was
a kind of inner group understanding. (Stud2 Interview, 0:27:58)
From the
illustration of the use of the “opinionslug” tag, we can see
that an explicit purpose/function is signaled through its use
and it prepares the reader of the contribution to both
understand and react appropriately to what is being said in the
blogpost.
Another
example of a functional tag is “classquestions” which seemed to
be a term coined by Tigerlily’s blog but was actually
stipulated by the instructor to create threads of interaction
that could be retrieved by the students later on. Student 2
indicated that,
he (the
instructor) told us that if ever we had a class question we had
to call it "classquestion" ... and if you actually clicked on
classquestions you would actually see a stream. (Stud2
interview, 0:33:48)
The adoption
of tags to continue a thread of interaction was practiced by
Student 2, who explained that the popularity of certain tags
had to do with the fact that they highlighted interesting
threads of conversation:
It
definitely had to do with the fact that she (a classmate) would
have had to have an interesting enough post where I would reply
to it or I would make a post about her post ... and so then when
I was picking out my tags I would look at what she called it
...just because I am conscious of that and want to make sure
that you could find out stream of conversation ... if it was
something really boring that no one answered then it probably
wouldn't catch on. (Stud2, 0:29:26)
Thus we can
see that functional tags like “opinionslug” and “classquestions”
signaled an explicit purpose, and their high frequency of use
indicates that the convention of using these tags to
highlight the function of a blog post became a social norm
within the class.
In contrast,
content tags were topics that the class dealt with explicitly.
There was a certain amount of ambiguity in how content tags were
used and perceived by the students in the class. This ambiguity
could be because content tags embodied meanings that went beyond
the shared understandings of the students and have significance
outside of the class as well. An example of a content tag and
how it is used can be seen in Student 1’s comparison
of how her use of the “XML” tag differs from the “opinionslug”
tag:
Well with
XML it's harder ... if I had a question about XML and someone
answered it and put XML in the tags... it's fine but there's so
many different things to call it ... you know it could have been
about databases, it could have been about writing code ...
whereas with "opinionslug" it was very obvious you were going to
call opinionslug because you were basically preaching on your
opinions. (Stud1 interview, 0:30:40)
This
sentiment was shared by Student 3, who described using the content
tag “technology” as follows:
For example,
when I first started my blog, I was trying to come up with a
common thread to a lot of the things, so I use the word
"Technology" a lot in my blog. That's such a vague word you know
... And at the same time if I was just looking, or had a couple
of minutes to spend, then I would say, "give me something
interesting about ‘technology’ that's going on" and I wanted
that broad topic. (Stud3 interview, 0:26:30)
What is
highlighted by the student quotes is the issue of polysemy,
or the multiple meanings of words (Furnas et al., 1987).
Polysemy is a double-edged sword in the use of social tagging
systems. It would seem that popular content tags like
“technology” were deliberately used to signal the content of the
blog post and to appeal broadly to as many individuals as possible.
However, the problem with such tags is that they are also highly
ambiguous and often have to be paired with other terms, such
as “ipod” and “Microsoft”, to qualify their meaning. As a result, many
tags associated with blog posts tended to be used only once or
twice, occupying only a minute position in the tag cloud.
From the
analysis of how tags are used by the students, we can see that
it is much more difficult to base assertions of group knowledge
formation around popular or frequently used tags. What is shown
is that the students used tags according to a shared notion of
the tags’ function. Very often, tags were used to continue
threads of conversation and to signal the content of the
blogpost. As a result, the group knowledge that is formed around
the students’ use of tags does not necessarily represent their
understanding of the content but rather the shared understanding
of how the tags are used to signal norms of participation within
the class.
To further
explore how tags were used, content analysis of the text in the
students’ blog posts was conducted to determine the correlation
of ideas and concepts in the text of the students’ blog posts
with the tags that were used. However, it is obvious from the
previous section that keywords like “technology” were broad and
that the content analysis of the students’ blog post would not
necessarily reveal any correlation between the content of the
students’ contribution with the keywords chosen.
For example,
one particular blog post contributed by Matt’s Musings
was labeled with the following tags: “opinionslug”,
“technology”, and “blogging”. A word frequency analysis of the
text in the blog post found only one co-occurrence of the tags
with the content of the post: the word “technology”
appeared once in the body of the post. The
subject of the blog post was mainly cellular phone
technology in the US compared with other countries. So in general
the “Technology” tag only represented the post very broadly.
What is interesting to note is that functional tags such as
“opinionslug” tend not to co-occur in the body of the post as
they represent the function, not the content of the post. Again
this highlights the differentiation between the purpose and use
of content versus functional tags.
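The word frequency check described above can be approximated by counting how often each tag appears as a word in the body of a post. This is a rough sketch, not the content analysis procedure used in the study: the `tag_cooccurrence` function, the simple tokenizer, and the sample text are hypothetical.

```python
import re
from collections import Counter


def tag_cooccurrence(text, tags):
    """Count how many times each tag occurs as a whole word in the post text."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    # Counter returns 0 for words that never appear.
    return {tag: words[tag.lower()] for tag in tags}


# Made-up post text on the same general subject as the example above.
post_text = (
    "Cell phone technology in the US lags behind what I saw abroad. "
    "Why are our handsets so limited?"
)
post_tags = ["opinionslug", "technology", "blogging"]

matches = tag_cooccurrence(post_text, post_tags)
print(matches)
```

In this made-up example only the content tag appears in the text, while the functional tag “opinionslug” does not, mirroring the pattern reported for the post from Matt’s Musings.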
The idea of
a shared vocabulary is crucial to the formation of group
knowledge. Having a common language enables the processes of
establishing mutual beliefs and mutual assumptions in group
communication, processes that are essential to the formation of
a community (Clark & Brennan, 1991). As indicated in
the previous section, tags like “opinionslug” and
“classquestions” functioned as a way for the students to
communicate and interact with each other. They were a way for them
to signal the intentions of their contributions and to publicly
solicit and provide help to each other. Student 3 articulates
this sentiment in the following comment:
On the
occasions when I answered questions, which was rare, or when I
responded to somebody else's blog, I tried to use the same tags
that they (the other students) used when they wrote ... I would
intentionally try and incorporate those into my tags, and maybe
if it had to do with something else, also include the other tags
just to try to cover my bases so that somebody else could follow
the same kind of logic or thread-line, get to their blog and
then my blog. (Stud3 interview, 0:21:08)
Thus, the
tags proved useful to learning because they provided a common
vocabulary with which the students were able to interact with
each other. This aspect of interaction seemed to be the
predominant learning benefit that the students experienced
during the term.
It was these
interactions, made public on the class “remix” website through
the tags, that the students valued. For them, the system added a
new layer of social interactions on top of the physical
interactions that were going on during the class. Student 2
makes this point as follows:
I think that
this contributed to the class so much ... you know it made us
more friendly with each other ... we'd come in the next day and
we'd be like "Oh my god! Did you read what Student x wrote."
Literally, it was so nerdy but we did. And ... the professor
would start cracking jokes like "Student Y mis-spelled this word
in her blog" and he would mispronounce it during lecture on
purpose ... and we all got the joke cause we all read the blog.
It really contributed to the bonding and how we got along with
each other. (Stud2, 0:45:26)
The role of
blogging in learning
While the
focus of this study concentrated on the use of social tagging,
an important premise made was that group blogging might help
students learn. One way to explore this premise is to test the
extent to which blogging performance was correlated with
performance in other aspects of the class. Fortunately, the
case study provides data to perform this test. As part of the
grading process, the instructor computed a blog index for each
student (Table 4). This index consisted of the instructor’s
rating of the quality of each student’s overall blog output
multiplied by the total number of posts the student produced.
Quality was a function of the length and relevance of student
posts. The blog index was significantly correlated (r(9)
= .663, p < .05) with the students’
final grades less the blogging component of the course.
Examining the components of the blog index reveals that total
posts is significantly correlated with the grade in other
components of the course (r(9) = .692, p < .05).
However, the quality of posts is not significantly correlated
with the students’ final grade (r(9) = .383, p >
.05). These correlations suggest that students who interacted
more often, by posting blog contributions to the class remix
website, tended to achieve better performance.
Total Posts | Post Quality | Blog Index (Total posts * Post quality) | Final Grade less Blogging Component
72 | 1.75 | 126 | 63
68 | 1.50 | 102 | 63
66 | 1.50 | 99 | 57
61 | 1.50 | 91.5 | 56
72 | 1.25 | 90 | 60
66 | 1.25 | 82.5 | 57
65 | 1.25 | 81.25 | 58
69 | 1.00 | 69 | 55
66 | 1.00 | 66 | 63
36 | 1.25 | 45 | 53
41 | 1.00 | 41 |
Table 4: Class performance with blog index & final grade
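The blog index and the correlations reported above can be illustrated with a plain Pearson correlation. The sketch below assumes a hand-written `pearson_r` helper (not the authors' analysis code) and uses only the ten rows of Table 4 whose final grade is legible here; because the eleventh student's grade is missing from the table as reproduced, the values it prints should not be expected to match the reported r(9) statistics.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# (total posts, post quality, final grade less blogging component)
# Ten complete rows of Table 4; the row with the missing grade is left out.
rows = [
    (72, 1.75, 63), (68, 1.50, 63), (66, 1.50, 57), (61, 1.50, 56),
    (72, 1.25, 60), (66, 1.25, 57), (65, 1.25, 58), (69, 1.00, 55),
    (66, 1.00, 63), (36, 1.25, 53),
]

posts = [p for p, _, _ in rows]
quality = [q for _, q, _ in rows]
grades = [g for _, _, g in rows]

# Blog index as defined in the text: total posts multiplied by post quality.
blog_index = [p * q for p, q, _ in rows]

for label, values in [("blog index", blog_index), ("total posts", posts), ("post quality", quality)]:
    print(f"{label} vs. grade: r = {pearson_r(values, grades):.3f}")
```

The degrees of freedom in r(9) correspond to n - 2 for the eleven students in the class; with only ten rows here, the comparable figure would be 8.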
The reasons
for improved performance may be varied. For one, these measures
may all simply be correlated with underlying traits of the
learner such as diligence and intelligence. However, learning
in higher education is by its nature an intensely social
process. People communicate and process information
interactively. The blogging environment, along with the use of
social tagging, provided students with an environment that
offered greater opportunities to interact regarding class
material than could be afforded during the allotted class time.
Those who took advantage of this opportunity more often
performed better in other aspects of the class.
Discussion
The main
hypothesis of this study is that the use of social tagging can
aid with group knowledge formation in the classroom. The
findings indicate that social tagging enabled the process
of group knowledge formation as well as the labeling of that
content. Social tagging enabled the students in the class to not
only interact with each other through a shared vocabulary, but
also develop a set of common norms and practices. For instance,
the use of functional tags provided members of the class with a
means to indicate the purpose of their blogposts. Blogposts
tagged with “opinionslug” highlighted that the author would be
getting on his personal soapbox and airing his views. This
enabled other students to make a choice of either avoiding or
reading that particular posting, without the need to look at the
title or the body of the blogpost. Additionally, the use of the
tags was a way students kept track of their interactions with
each other. The class norm of using the same tags as the post
that one is responding to enabled students to identify and track
the interactions they had with each other.
Thus the
evidence presented by this analysis strongly shows that, through
the use of social tagging, the students built shared vocabulary
and norms for interacting with each other in the online learning
environment. This can be understood as the mechanism by which
group knowledge can begin to form. Rather than uncovering the
“what” of group knowledge (its content), this study
uncovered the “how” (its process).
References
Anderson, Chris. “The Long Tail.” Wired, 12.10 October 2004.
Retrieved on Oct. 13th, 2005 from
http://www.wired.com/wired/archive/12.10/tail.html.
Argote, L. (1999). “Organizational Learning: Creating,
Retaining, and Transferring Knowledge”. In, Organizational
Memory. Kluwer Academic Publishers, pp. 67-97.
Clark, H.
H., & Brennan, S. E. (1991). Grounding in Communication. In
Resnick, L. B., Levine, J. M., & Teasley, S. D. (Eds.)
Perspectives on Socially Shared Cognition (pp. 127-149),
Washington, DC: American Psychological Association.
Furnas, G. W., Landauer, T. K., Gomez, L. M., & Dumais, S. T.
(1987). "The vocabulary problem in human-system communication."
Communications of the Association for Computing Machinery,
30 (11), Nov 1987: 964-971.
Heath, Chip and Victor Seidel. (Undated) Language as a
coordinating mechanism: How linguistic memes help direct
appropriate action. Working paper,
http://www.si.umich.edu/ICOS/Linguisticmemes4.2.pdf
Koman, R.
(2005). Remixing Culture: An Interview with Lawrence Lessig.
Retrieved October 19th, 2005, from
http://www.oreillynet.com/pub/a/policy/2005/02/24/lessig.html.
Kroski, E. (2005). The Hive Mind: Folksonomies and User-Based
Tagging. Infotangle, December 7th, 2005. Retrieved on Jan. 2nd
2006 from
http://infotangle.blogsome.com/2005/12/07/the-hive-mind-folksonomies-and-user-based-tagging/
Mathes, Adam (2004). Folksonomies - Cooperative Classification
and Communication Through Shared Metadata, December, 2004.
Retrieved on Dec. 1, 2006 from
http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html
Pauen, S. (2002). "Biobehavioral Development, Perception, and
Action: Evidence for Knowledge–Based Category Discrimination in
Infancy". In Child Development Volume 73 Issue 4 (July/August
2002). Retrieved on 16th December 2005 from
http://www.blackwell-synergy.com/links/doi/10.1111/1467-8624.00454/abs/
Russell, D. M., Stefik, M. J., Pirolli, P., Card, S. K. (1993)
"Cost structure of sensemaking" Proceedings of the Conference on
Human Factors in Computing Systems - INTERACT '93 and CHI '93.
ACM, New York, NY, USA: 269-276.
von Ahn, L. and L. Dabbish (2004). Labeling Images with a
Computer Game. In, Proceedings of ACM CHI 2004, pp. 319-326.
Weick, K., Sutcliffe, K. & Obstfield, D. (2005) Organizing and
the Process of Sensemaking. Organizational Science, Vol.
16, No. 4, July – August 2005, pp. 409-421.
Manuscript received 31 Aug 2006; revision received 5 Dec 2006.
This work is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 2.5
License