Content analysis as a research method

Lecture



Content analysis (from the English "content") is a special, fairly rigorous method of qualitative and quantitative analysis of the content of documents, used to identify or measure the social facts and trends reflected in those documents. Its distinctive feature is that it examines documents in their social context.

Content analysis can be used as the main research method (for example, in studying the social orientation of a newspaper); as a parallel method, i.e. in combination with other methods (for example, in studying how effectively the media function); or as an auxiliary or control method (for example, when classifying answers to open-ended questions in questionnaires).

Not all documents can be the object of content analysis. The analyzed content must allow an unambiguous rule to be specified for reliably recording the desired characteristics (the principle of formalization), and the content elements of interest to the researcher must occur with sufficient frequency (the principle of statistical significance). Most often, the objects of content analysis are press reports, radio and television broadcasts, mass oral agitation and propaganda, meeting minutes, letters, orders, directives, etc., as well as data from free interviews and open-ended questionnaire questions.

There are three main areas of content analysis:

a) identifying what existed before the text and was in some way reflected in it (the text as an indicator of certain aspects of the object being studied: the surrounding reality, the author, or the addressee);

b) determining what exists only in the text as such (various characteristics of form: the language, structure, and genre of the message, the rhythm and tone of speech);

c) identifying what will exist after the text, i.e. after its perception by the addressee (evaluation of the various effects of exposure).

There are several stages in the development and practical application of content analysis. After the topic, tasks, and research hypotheses are formulated, the categories of analysis are determined, i.e. the most general, key concepts that correspond to the research tasks. The system of categories plays the role of the questions in a questionnaire and indicates which answers should be found in the text. In the practice of Soviet content-analytical research, a fairly stable system of categories developed, including sign, goals, values, theme, hero, author, genre, etc. Content analysis of mass media messages based on the paradigmatic approach is becoming more and more widespread; under this approach the studied features of the texts (the content of the problem, the causes of its occurrence, the problem-forming subject, the degree of tension of the problem, ways to solve it, etc.) are considered as a structure organized in a definite manner. Content analysis categories should be exhaustive (i.e. cover all parts of the content determined by the objectives of the study); mutually exclusive (the same parts should not belong to different categories); reliable (i.e. there should be no disagreement between coders about which parts of the content belong to a particular category); and relevant (i.e. correspond to the task and to the content under study).
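The reliability requirement above can be checked numerically. The following is a minimal sketch, assuming simple percent agreement between two coders as the measure (the lecture does not name a specific measure); the function name and the sample labels are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of content units that two coders assigned to the same category."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same units")
    if not coder_a:
        raise ValueError("no coded units")
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return matches / len(coder_a)


# Hypothetical category labels assigned by two coders to the same five fragments.
coder_1 = ["goal", "value", "hero", "theme", "goal"]
coder_2 = ["goal", "value", "theme", "theme", "goal"]

agreement = percent_agreement(coder_1, coder_2)
print(agreement)  # 0.8: the coders disagree on one fragment out of five
```

A low value suggests that the category definitions or the coder instructions need to be made more precise before the main coding begins.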

When choosing categories, two extremes must be avoided: categories that are too numerous and too fine-grained, which almost repeat the text, and categories that are too broad, which may lead to a simplified, superficial analysis. Sometimes it is also necessary to take into account elements that are missing from the text, since their absence may be significant.

After the categories are formulated, it is necessary to select the appropriate unit of analysis: the linguistic unit of speech or the element of content that serves in the text as an indicator of the phenomena of interest to the researcher. More complex types of content analysis usually operate with several units of analysis simultaneously rather than with one.

Units of analysis taken in isolation cannot always be interpreted correctly, so they are considered against the background of broader linguistic or content structures that define how the text is divided and within which the presence or absence of units of analysis is established. These broader structures are called contextual units. For example, for the unit of analysis "word", the contextual unit is "sentence".

Finally, it is necessary to establish the unit of count: a quantitative measure of the relationship between textual and extra-textual phenomena. The most common units of count are time and space (number of lines, area in square centimeters, minutes of broadcast time, etc.), the appearance of a feature in the text, and the frequency of its appearance (intensity).
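The three kinds of units can be illustrated with a small sketch. It assumes the unit of analysis is a word, the contextual unit is a sentence, and the unit of count is the frequency of appearance; the function name, the naive sentence splitting, and the sample text are hypothetical simplifications.

```python
import re


def count_in_sentences(text, keyword):
    """Return (total hits, sentences containing the keyword, number of sentences)."""
    # Naive split into contextual units (sentences) on ., ! and ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    total_hits = 0
    sentences_with_hit = 0
    for sentence in sentences:
        hits = len(pattern.findall(sentence))
        total_hits += hits
        if hits:
            sentences_with_hit += 1
    return total_hits, sentences_with_hit, len(sentences)


sample = "Reform is needed. The reform will take years, but reform pays off. Critics disagree."
result = count_in_sentences(sample, "reform")
print(result)  # (3, 2, 3): three mentions, found in two of the three sentences
```

Counting sentences that contain the word records mere appearance, while counting all mentions records intensity.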

The choice of the sources to be subjected to content analysis is also important. The sampling problem comprises the choice of the source, the number of messages, the dates of the messages, and the content to be examined. All these sampling parameters are determined by the objectives and scope of the study. Most often, content analysis is carried out on a one-year sample: for a study of meeting minutes, 12 sets of minutes are enough (one per month); for a study of media reports, 12-16 newspaper issues or television and radio broadcasts. Typically, a sample of media messages contains 200-600 texts.

A necessary condition for content-analytical research is the development of a content analysis table, the main working document with which the research is conducted. The type of table is determined by the research stage. Thus, while developing the system of categories, the analyst draws up a table that is a system of coordinated and subordinated categories of analysis. Such a table outwardly resembles a questionnaire: each category (question) implies a number of features (answers), according to which the content of the text is quantified. Such a table can be quite voluminous.

For registering the units of analysis, another table is compiled, the coding matrix:

Feature   Text
          1    2    3    ...  n    Σ
A         +
B         +    +
C         +    +
...
n
Σ
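A coding matrix of this kind can be assembled as follows. This is a minimal sketch: the features A, B, C and the coded texts are hypothetical, and each text is represented simply as the set of features the coder recorded in it.

```python
def build_coding_matrix(coded_texts, features):
    """Rows are features, columns are texts; 1 means the feature was recorded."""
    matrix = {
        feature: [1 if feature in found else 0 for found in coded_texts]
        for feature in features
    }
    # Row totals: in how many texts each feature was recorded.
    row_totals = {feature: sum(row) for feature, row in matrix.items()}
    # Column totals: how many features were recorded in each text.
    column_totals = [
        sum(matrix[feature][j] for feature in features)
        for j in range(len(coded_texts))
    ]
    return matrix, row_totals, column_totals


features = ["A", "B", "C"]
# Hypothetical coding results for three texts.
coded_texts = [{"A", "B", "C"}, {"B", "C"}, set()]

matrix, row_totals, column_totals = build_coding_matrix(coded_texts, features)
for feature in features:
    marks = " ".join("+" if cell else "." for cell in matrix[feature])
    print(feature, marks, row_totals[feature])
print("Σ", " ".join(str(total) for total in column_totals))
```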

If the sample is large enough (over 100 units), the coder, as a rule, works with a notebook of matrix sheets. If the sample is relatively small (up to 100 units), a two-dimensional or even multidimensional analysis can be performed. In this case, each text must have its own coding matrix. This work is very laborious and time-consuming, however, so with large samples the comparison of the features of interest to the researcher is carried out on a computer.

Sometimes a table is also needed at the stage of quantitative data processing. For example, in the contingency analysis developed by the American social psychologist C. Osgood, a so-called randomness (contingency) matrix is constructed:

Expected match: above the diagonal. Real match: below the diagonal.

        A      B      C      ...    n      Σ
A       -      0.15   0.02
B       0.05   -      0.06
C       0.08   0.12   -
...                          -
n                                   -
Σ                                          -

Such a matrix reveals how far the coincidence of each classification unit with all the others is random. For example, if unit A is found in 30% of the analyzed texts ($P_A = 0.3$) and unit B in 50% of the texts ($P_B = 0.5$), then the expected frequency of their joint occurrence is $P_{AB} = P_A \cdot P_B = 0.3 \cdot 0.5 = 0.15$. In fact, features A and B occurred together in only 5% of the texts (the real frequency is 0.05). By comparing the expected and real coincidences of features, it is possible to determine which of the observed dependencies are not random (for example, the table above shows that the joint appearance of units A and B is random, because the real match is lower than expected, while that of units B and C is not random, because the real match is higher than expected). The matrix can be used for different purposes: to trace whether coincidences of features are random or non-random when testing a hypothesis, to note stable and unstable pairs of features, which may be important for characterizing the activity of the sender of the information, etc.
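The comparison of expected and real coincidences can be sketched in a few lines. The code assumes each text is coded as a set of features; the 20 hypothetical texts are constructed so that A occurs in 30% of them, B in 50%, and both together in 5%, matching the worked example. The function name is an illustration, not part of Osgood's procedure.

```python
from itertools import combinations


def contingency(coded_texts, features):
    """For each pair of features return (expected, real) joint frequency."""
    n = len(coded_texts)
    # Share of texts in which each feature occurs.
    p = {f: sum(f in text for text in coded_texts) / n for f in features}
    pairs = {}
    for a, b in combinations(features, 2):
        expected = p[a] * p[b]  # joint frequency if A and B were independent
        real = sum((a in text and b in text) for text in coded_texts) / n
        pairs[(a, b)] = (expected, real)
    return p, pairs


# Hypothetical sample of 20 texts: A in 6, B in 10, both in 1.
coded_texts = [{"A", "B"}] + [{"A"}] * 5 + [{"B"}] * 9 + [set()] * 5

p, pairs = contingency(coded_texts, ["A", "B"])
expected, real = pairs[("A", "B")]
print(round(expected, 2), round(real, 2))  # 0.15 0.05
```

Here the real frequency is lower than the expected one, which corresponds to the A and B cell pair in the matrix above.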

An important condition of content analysis is the development of instructions for the coder: a system of rules and explanations for the person who will collect the empirical information by coding (registering) the specified units of analysis. The instructions set out the algorithm of the coder's actions accurately and unambiguously, give operational definitions of the categories and units of analysis and the rules for coding them, provide specific examples from the texts under study, stipulate how to proceed in disputed cases, etc.

The counting procedure in quantitative content analysis is, in general, similar to the standard methods of classification by selected groupings, ranking, and measurement of association. There are also special counting procedures for content analysis, for example, the formula for the Janis coefficient (C), designed to calculate the ratio of positive and negative (relative to the chosen position) evaluations, judgments, and arguments. When the number of positive evaluations exceeds the number of negative ones, the Janis coefficient is calculated by the formula

$$ C = \frac{f^2 - f n}{r t} $$

where f is the number of positive evaluations; n is the number of negative evaluations; r is the volume of the text content that is directly related to the problem being studied; t is the total volume of the analyzed text.

When the number of positive evaluations is less than the number of negative ones, the Janis coefficient is found by the formula

$$ C = \frac{f n - n^2}{r t} $$
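A minimal sketch of the calculation, assuming the commonly cited form of the Janis coefficient, C = (f² − f·n) / (r·t) when positive evaluations prevail and C = (f·n − n²) / (r·t) when negative ones prevail. The symbols follow the definitions given in the text (f positive, n negative, r relevant volume, t total volume); the function name and the numbers are hypothetical.

```python
def janis_coefficient(f, n, r, t):
    """Janis coefficient for f positive and n negative evaluations.

    r is the volume of content directly related to the problem,
    t is the total volume of the analyzed text (same units as r).
    """
    if r <= 0 or t <= 0:
        raise ValueError("r and t must be positive")
    if f > n:
        return (f * f - f * n) / (r * t)
    if f < n:
        return (f * n - n * n) / (r * t)
    return 0.0  # both formulas give zero when f equals n


# Hypothetical counts: 12 positive and 4 negative evaluations,
# 20 relevant units out of 40 units in total.
c_positive = janis_coefficient(12, 4, 20, 40)
c_negative = janis_coefficient(4, 12, 20, 40)
print(round(c_positive, 2), round(c_negative, 2))  # 0.12 -0.12
```

The sign of the result shows which kind of evaluation prevails, and its magnitude grows with the share of the text devoted to the problem.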

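There are also simpler ways of measuring. The specific weight of a category can be calculated using the formula

$$ \text{weight of a category} = \frac{\text{number of units of analysis recording the category}}{\text{total number of units of analysis}} $$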
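As an illustration, the weight of each category can be computed as a simple share. This sketch assumes that the weight is the number of units of analysis assigned to the category divided by the total number of coded units; the function name and the sample data are hypothetical.

```python
from collections import Counter


def category_weights(coded_units):
    """Share of all coded units of analysis that falls into each category."""
    if not coded_units:
        raise ValueError("no coded units")
    counts = Counter(coded_units)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Hypothetical category labels for four units of analysis.
weights = category_weights(["goal", "goal", "value", "hero"])
print(weights)  # {'goal': 0.5, 'value': 0.25, 'hero': 0.25}
```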

