GUIDELINES FOR
| Suburb   | Age 16-24 (%) | Age 25-34 (%) | Age 35-49 (%) | Age 50+ (%) | Total (%) |
| Suburb A | 10            | 8             | 6             | 4           | 28        |
| Suburb B | 9             | 10            | 6             | 5           | 30        |
| Suburb C | 8             | 7             | 4             | 4           | 23        |
| Suburb D | 6             | 6             | 5             | 2           | 19        |
| Total    | 33            | 31            | 21            | 15          | 100       |
In the above example, which covers four suburbs, 10% of the universe lives in Suburb A and is in the age group 16-24. When stratification by suburb and age is used, 10% of the sample will also be 16-24 years old and live in Suburb A, and so on. More variables can be added to the above matrix, in which case the number of cells will increase. The number of cells is the product of the number of categories used for each variable. In the above example, 4 suburbs and 4 age groups provide a matrix with 4 x 4 = 16 cells. Note that certain statistical minimum requirements apply: each cell should preferably contain 20 (and not fewer than 10) respondents, while the ratio between the largest and the smallest cell should not exceed 2:1. In other words, if the smallest cell has 10 respondents, then the largest one should not have more than 20.
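The proportional allocation described above can be sketched in a few lines of Python. This is only an illustration: the percentages are those of the suburb-by-age matrix, while the total sample of 500 and the function names are assumptions made for the example.

```python
# Universe percentages per (suburb, age group) cell, taken from the matrix above.
UNIVERSE_PCT = {
    "Suburb A": {"16-24": 10, "25-34": 8, "35-49": 6, "50+": 4},
    "Suburb B": {"16-24": 9, "25-34": 10, "35-49": 6, "50+": 5},
    "Suburb C": {"16-24": 8, "25-34": 7, "35-49": 4, "50+": 4},
    "Suburb D": {"16-24": 6, "25-34": 6, "35-49": 5, "50+": 2},
}


def allocate_sample(universe_pct, sample_size):
    """Give each cell the same share of the sample as it has of the universe."""
    return {
        suburb: {age: round(sample_size * pct / 100) for age, pct in ages.items()}
        for suburb, ages in universe_pct.items()
    }


def check_cells(allocation, minimum=10, max_ratio=2.0):
    """Return (smallest, largest, ok) for the minimum-size and 2:1 ratio guidelines."""
    sizes = [n for ages in allocation.values() for n in ages.values()]
    smallest, largest = min(sizes), max(sizes)
    ok = smallest >= minimum and largest / smallest <= max_ratio
    return smallest, largest, ok


SAMPLE_SIZE = 500  # hypothetical total sample
allocation = allocate_sample(UNIVERSE_PCT, SAMPLE_SIZE)
cell_count = sum(len(ages) for ages in allocation.values())  # 4 x 4 cells
smallest, largest, ok = check_cells(allocation)
print(cell_count, smallest, largest, ok)
```

With a sample of 500, the smallest cell (Suburb D, 50+) receives 10 respondents and the largest cells receive 50, so the check reports that the 2:1 guideline is not met for this illustrative matrix.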
5.2.3 Systematic sampling
Systematic sampling is another way to ensure that the sample is spread evenly across the entire population and is not biased towards certain sub-populations while under-representing others.
If a systematic sample of 1 000 has to be drawn from a universe of 10 000, the procedure is as follows:
Determine the sampling interval:
If 1 000 numbers have to be drawn from 10 000, every 10th number (universe divided by the sample, in this case 10 000 ÷ 1 000 = 10) has to be used to ensure that the sample is spread evenly/systematically between 1 and 10 000. The sampling interval in this case is 10.
Choose a random starting point:
For the above example, the sampling interval is 10, which means that every 10th number must be selected, starting from a random starting point between 1 and 10. This point is chosen as described under random sampling (put the numbers 1 to 10 in a box and draw one).
Selecting the sampling points:
Say that 7 is selected as the starting point; then numbers 7, 17, 27, 37 and so on through to 9 997 will be used, resulting in 1 000 sampling points, the intended sample.
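The three steps above (interval, random start, selection of points) can be sketched as follows. The function name and its parameters are assumptions made for illustration, and the sketch assumes that the universe is a whole multiple of the sample size.

```python
import random


def systematic_sample(universe_size, sample_size, start=None, rng=random):
    """Return the 1-based positions selected by systematic sampling."""
    if universe_size % sample_size != 0:
        raise ValueError("this sketch assumes the universe is a whole multiple of the sample")
    interval = universe_size // sample_size  # e.g. 10 000 / 1 000 = 10
    if start is None:
        start = rng.randint(1, interval)     # random starting point from 1 to interval
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return list(range(start, universe_size + 1, interval))


# The example from the text: starting point 7, interval 10.
points = systematic_sample(10_000, 1_000, start=7)
print(points[:4], points[-1], len(points))
```

Calling `systematic_sample(10_000, 1_000)` without `start` draws the starting point at random, which is what a real selection would do.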
Selecting the addresses:
In formal areas the Nielsen Geoframe is used to select the addresses to be included. All addresses in urban areas (total population 500+) in the universe are included in this computerized geographic information system (GIS). From this source it is possible to identify the required addresses (in our example, those that correspond to the above numbers 7, 17, etc.).
In rural areas with a population of less than 500 people, including the dispersed farming population, the sampling points are selected by using maps obtained from the Surveyor General’s office in Pretoria. The coordinates (North/South and East/West) of each sampling point are then provided to the interviewers as well as an electronic Global Positioning System (GPS) device that is used to guide them to the specific point. Strict instructions are given on how to select an address closest to each point. GPS technology is also used in squatter camps in urban areas where no identifiable physical addresses are available.
5.2.4 Cluster sampling
Cluster sampling is usually used for economic reasons. If any of the previous three methods is used alone, particularly if a large geographic area has to be covered, the sampling points will be spread individually across the entire area, which would increase the cost. Cluster sampling can be used in such cases to reduce the cost of travel and accommodation.
Cluster sampling means that fewer sampling points than the required number of addresses are selected, and that more than one address is selected in the vicinity of each point. Such points are referred to as clusters.
However, it should be remembered that cluster sampling reduces the precision with which the sample represents the universe.
5.2.5 Disproportionate sampling
When it is important to obtain a large enough sample for separate analysis of one or more small sub-populations, for instance to report separately for a community radio station with a relatively small footprint area, that area can be over-sampled.
Before reporting the results, the over-sampled area should be down-weighted to reflect its correct proportion in the total population. To retain statistical validity, the level of over-sampling should not exceed a ratio of 2:1, as was also mentioned under stratified sampling.
5.2.6 Sample size
There is a general misconception that the size of the sample is determined by the size of the universe and that large populations can only be researched by using large samples.
Statistically, there is no direct relationship between the size of the universe and the size of the sample required to estimate certain aspects of that universe accurately.
The size of the sample is determined by the following factors:
· The number of factors that can cause variation in the results. If it is, for instance, expected that everybody in the population will respond more or less the same, a small sample will accurately estimate this response. However, if it is expected that males and females will respond differently, a larger sample is required. If it is expected that males and females as well as people of different ages will behave differently, an even larger sample will be required.
· The level of detail to which the results will be reported. If the aim of the survey is only to report the behaviour of the total population, it can be done relatively accurately by using a small sample. If, however, it is important to report details of certain sub-populations separately, e.g. geographic regions, demographic sub-groups such as gender and age, a much larger sample is required.
· The level of accuracy required of the results. Because the margin of error decreases as the sample size increases (it is inversely proportional to the square root of the sample size), a large sample should provide more accurate results than a smaller one.
Given the above, the following minimum requirements are usually set:
· Each weighting cell should preferably contain at least 20 respondents.
· The ratio of weights should not be more than 2:1 if disproportionate sampling is used.
· Each reporting cell should have at least 40 respondents who responded positively to the question; in the case of radio audience estimates, at least 40 listeners.
5.3 Substitution
For a variety of reasons, not all the initially selected respondents will form part of the final sample. Some respondents will refuse to participate, whilst others might be difficult to contact. To compensate for this and to ensure that the final sample size and structure will be the same as the selected sample, substitution can be used. However, if the level of substitution is high and it is not controlled properly, it can bias the results. The following requirements are usually set:
· Substitutes must be selected by using a probability sampling technique, similar to the procedures which were used to select the initial respondents.
· Substitution should only be allowed when it is absolutely certain that the initially selected respondent cannot be included. This implies that every attempt should be made to include persons who are difficult to contact. For SAARF surveys, the original visit plus three call-backs on different days of the week and at different times of the day are required before substitution is allowed.
· When reporting the results, the level of substitution should be provided and, if it is high, the possible effect on the results should be indicated.
Finally, it is recommended that a statistician be consulted.
5.4 Questionnaire design
A questionnaire which is used to collect valid information is more than just a list of questions. Therefore, it is important that attention be given not only to the formulation of every question, but also to how the questions are arranged, to ensure a logical, simple, understandable and unbiased interview. Furthermore, it is important that the questionnaire be tested and improved using a mini-sample before it is used in the final survey. The following guidelines can be used:
· Use the language of the target group. This does not only imply that the interview should be conducted in the respondent's language of preference, but also that the level of the language should be simple and easily understandable. Researchers frequently tend to communicate using marketing language rather than common language.
· Keep the questions - and the questionnaire - as short as possible. Avoid asking ‘nice to know’ questions which do not really contribute to the aim of the survey.
· Bias through ‘leading’ questions must be avoided. Even a simple statement such as "I am doing this research on behalf of ..... (name of a radio station).." can bias the results.
· Care should be taken to avoid general research problems such as over-claiming on certain questions, under-claiming on others, the probability of a rotation effect, and many others. Consultation with experienced researchers in media audience research will help to reduce such possible biases.
5.5 Fieldwork Standards
Data collection is the most crucial aspect of all research, because mistakes made when collecting the information very often cannot be corrected afterwards. Therefore, it is important that:
1. Interviewers be properly selected during recruitment to ensure that they have the abilities required to do high-quality work;
2. All selected interviewers be trained properly in the basics of scientific data collection;
3. Interviewers be briefed in the application of every specific questionnaire, and that pilot interviews be done before they commence with the real interviews;
4. Strict control measures be applied to limit respondent errors, interviewer mistakes, misunderstanding of the questions and situation errors (for example, the telephone or door bell rings, food is burning, a baby is crying, etc.);
5. A minimum of 10% of all interviews be checked back, as is usual practice, to ensure a high level of accuracy of the results. During back-checking, the selection of the sampling point (and, where applicable, correct substitution) and of the correct respondent must be checked, as well as whether the information in the questionnaire has been recorded accurately. The check-backs should include the work of all interviewers.
5.6 Editing and weighting of the results
5.6.1 Data editing
In addition to controlling fieldwork standards and back-checking as outlined in Paragraph 5.5 above, manual and computerized editing should also be done to ‘clean’ the results. For this purpose, as many logic checks as possible are run on the raw unweighted data. Some examples follow:
· If there is no cell phone in your home, you are not supposed to claim to own a cell phone;
· If there is no water in your home or on your plot, you are not supposed to have a flush toilet;
· If you don’t have electricity, you cannot claim to have a microwave oven or other electric appliances;
· A male is not supposed to use sanitary protection.
If you run through all the questions in your survey and evaluate them critically to determine impossibilities, you will be astonished to see how many cases similar to the above examples you will find.
The final check normally is to make sure that the totals of all the response options add up to the total sample or, if there was no option for ‘don’t know/unsure’, that the responses are lower than the total sample size only within acceptable margins.
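Logic checks of the kind listed above lend themselves to a simple rule table. The following is a minimal sketch only; the field names (`owns_cell_phone`, `water_on_plot`, etc.) and the two records are hypothetical and would have to be mapped to the actual questionnaire variables.

```python
# Each rule: (description, function returning True when the record is impossible).
RULES = [
    ("owns a cell phone but no cell phone in home",
     lambda r: r["owns_cell_phone"] and not r["cell_phone_in_home"]),
    ("flush toilet but no water in home or on plot",
     lambda r: r["flush_toilet"] and not r["water_on_plot"]),
    ("microwave oven but no electricity",
     lambda r: r["microwave"] and not r["electricity"]),
]


def failed_checks(record):
    """Return the descriptions of all logic checks that this record fails."""
    return [description for description, is_impossible in RULES if is_impossible(record)]


# Hypothetical raw, unweighted records.
records = [
    {"id": 1, "owns_cell_phone": True, "cell_phone_in_home": True,
     "flush_toilet": True, "water_on_plot": True, "microwave": False, "electricity": True},
    {"id": 2, "owns_cell_phone": False, "cell_phone_in_home": False,
     "flush_toilet": False, "water_on_plot": False, "microwave": True, "electricity": False},
]

# Keep only the records that fail at least one check, keyed by respondent id.
flagged = {r["id"]: failed_checks(r) for r in records if failed_checks(r)}
print(flagged)
```

Keeping the rules in one list makes it easy to add further impossibilities as they are identified during a critical review of the questionnaire.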
5.6.2 Weighting of the data
It was stated in Paragraph 5.2.2 that ideally the sample must be designed in such a way that it reflects the characteristics of the population. In practice, even if this principle is applied, the realized sample will not be the same as the selected sample because of refusals, substitution and other factors. Therefore, weighting has to be applied to correct for any deviations from the characteristics of the population. This also applies when over-sampling is used, in which case the results have to be down-weighted to the correct proportions in the population.
The current SAARF RAMS® survey is post-weighted using the so-called cell weighting method. The simplified grid in Paragraph 5.2.2, using 4 suburbs and 4 age breaks, demonstrates the principle of cell weighting.
RAMS is currently weighted by the following variables, all interlaced (the number of categories for each variable is shown in brackets):
· Province: all nine (9)
· Community size: metro; city/large town; small town/village; rural (4)
· Gender: male; female (2)
· Age: 16-24; 25-34; 35-49; 50+ (4)
It was also mentioned in Paragraph 5.2.2 that the total number of cells in your grid is the product of the number of categories to be controlled. In the above case:
9 provinces x 4 community sizes x 2 genders x 4 age breaks = 288
(In reality there are a few more, as some provinces have more than one metro. Gauteng has the most, namely three: Pretoria, Johannesburg/Rand and Vaal.)
The weighting ratios are determined in a similar way to the sample ratios in Paragraph 5.2. The population in a cell is divided by the number of respondents in that cell to determine the weight that each respondent in that cell will receive.
Example (Metro: Johannesburg in Gauteng, male, 16-24)
The above example, namely 16-24 year old males in Johannesburg in Gauteng, is one of the cells. Take the number of 16-24 year old males in Johannesburg and divide it by the number of respondents in the same cell. The answer gives the weight that will be allocated to every male respondent aged 16-24 in Johannesburg.
As indicated in Paragraph 5.2.2, each cell must preferably have 20 but not fewer than 10 respondents, while the ratio of the largest to the smallest weight should not exceed 2:1. This means that the more cells are used, the larger the required sample, which is one of the important limitations of cell weighting.
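The cell weighting rule (population in the cell divided by respondents in the cell) and the two guidelines above can be illustrated with a short sketch. All population figures and respondent counts below are hypothetical and are not RAMS® data; only three of the cells are shown.

```python
from collections import Counter

# Hypothetical population counts per weighting cell (metro, gender, age).
POPULATION = {
    ("Johannesburg", "male", "16-24"): 300_000,
    ("Johannesburg", "male", "25-34"): 360_000,
    ("Johannesburg", "female", "16-24"): 310_000,
}

# Hypothetical realised sample: one cell key per respondent.
respondents = (
    [("Johannesburg", "male", "16-24")] * 20
    + [("Johannesburg", "male", "25-34")] * 30
    + [("Johannesburg", "female", "16-24")] * 25
)


def cell_weights(population, sample_cells):
    """Weight for each cell = population in the cell / respondents in the cell.

    A cell without respondents raises ZeroDivisionError in this sketch.
    """
    counts = Counter(sample_cells)
    weights = {cell: population[cell] / counts[cell] for cell in population}
    return weights, counts


weights, counts = cell_weights(POPULATION, respondents)
smallest_cell = min(counts.values())
weight_ratio = max(weights.values()) / min(weights.values())
meets_guidelines = smallest_cell >= 10 and weight_ratio <= 2.0
print(weights[("Johannesburg", "male", "16-24")], smallest_cell, weight_ratio, meets_guidelines)
```

In this illustration every 16-24 year old male respondent in Johannesburg represents 15 000 people, the smallest cell holds 20 respondents, and the largest weight is 1.25 times the smallest.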
For more sophisticated weighting procedures such as the reiterative method (RIM) weighting, the above limitation does not apply fully. Variables are separated into so-called RIMs, isolated from other variables, and weighted separately; after a number of independent iterations, all variables are either balanced or closely balanced. However, more than one variable can be included in a RIM, e.g. age and gender. Consequently, RIM weighting is a combination of interlaced and independent variables.
RIM weighting allows for the inclusion of more variables (such as LSMs) which are not included in the current RAMS® weighting matrix.
SAARF has already implemented this procedure in SAARF TAMS®, and possible implementation in RAMS® is under investigation.
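The iterative balancing described for RIM weighting can be illustrated with a generic raking (iterative proportional fitting) routine. This is a sketch of the general technique under stated assumptions, not the procedure actually used for SAARF TAMS®; the function names, the sample of 100 respondents and the target proportions are all hypothetical.

```python
def rim_weights(respondents, targets, iterations=50, tolerance=1e-6):
    """Rake weights so that each variable's weighted shares match its targets.

    respondents: list of dicts, e.g. {"gender": "male", "age": "16-34"}.
    targets: {variable: {category: target proportion}}, each summing to 1.
    Every category must occur in the sample, otherwise a division by zero occurs.
    """
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        largest_adjustment = 0.0
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            current = {category: 0.0 for category in shares}
            for person, w in zip(respondents, weights):
                current[person[variable]] += w
            # Scale each category so that its weighted share matches the target.
            factors = {c: shares[c] * total / current[c] for c in shares}
            weights = [w * factors[p[variable]] for p, w in zip(respondents, weights)]
            largest_adjustment = max(
                largest_adjustment, max(abs(f - 1.0) for f in factors.values())
            )
        if largest_adjustment < tolerance:  # all variables (closely) balanced
            break
    return weights


def weighted_share(respondents, weights, variable, category):
    """Weighted proportion of the sample falling in one category."""
    hit = sum(w for p, w in zip(respondents, weights) if p[variable] == category)
    return hit / sum(weights)


# Hypothetical sample of 100 respondents and hypothetical population targets.
sample = (
    [{"gender": "male", "age": "16-34"}] * 30
    + [{"gender": "male", "age": "35+"}] * 20
    + [{"gender": "female", "age": "16-34"}] * 20
    + [{"gender": "female", "age": "35+"}] * 30
)
TARGETS = {
    "gender": {"male": 0.48, "female": 0.52},
    "age": {"16-34": 0.60, "35+": 0.40},
}
weights = rim_weights(sample, TARGETS)
print(round(weighted_share(sample, weights, "gender", "male"), 4),
      round(weighted_share(sample, weights, "age", "16-34"), 4))
```

Here gender and age are treated as two independent RIMs; an interlaced RIM such as age within gender could be handled by adding a combined variable (for example a hypothetical `"gender_age"` key) to each respondent and to the targets.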
5.7 Different Audience Measures
The following audience measures are used in the RAMS® diary reports:
Average ¼-hour Audience:
The average ¼-hour audience is an arithmetical (ordinary) average across more than one ¼-hour. It is calculated by adding up the audiences of the quarter hours concerned and dividing by the number of quarter hours that were added. This measure is usually used to estimate advertising channel audiences (e.g. for time slots of 30 minutes, an hour, etc.), which can be used in determining advertising rates for different times of the day and days of the week. This figure is also used to estimate the potential audience which would be reached if an advertisement is placed in that time slot.
Net Audience:
The net audience reflects the number of different people who listened during a specified time period. In the advertising industry this is referred to as the ‘reach’ or ‘coverage’. To calculate the net reach, persons who listened during two or more quarter hours in the period under consideration are counted only once. The net audience is, thus, an unduplicated audience.
Gross audience:
The gross audience of a channel or programme is the sum of the relevant quarter hour audiences, irrespective of duplication of persons. The same person is counted once for each quarter hour during which he/she listened to the channel or programme. If this figure is divided by the net audience, the average duration of listening (in quarter hours) is obtained.
Cumulative audience:
When audiences are calculated across more than one day, for instance Monday to Friday, it can either be done by calculating the arithmetical average, or by calculating a cumulative audience. The cumulative audience is the net audience of the first day in the calculation, plus new listeners on the other days. It is, in other words, the net or unduplicated audience across more than one day.
The above audience measures are related. For instance, if there is no flow of audience during a time-slot, in other words if all persons who listened at the beginning still listen at the end, and no new listeners enter, the average ¼-hour audience and the net audience will be the same. If the net audience is twice as large as the average ¼-hour figure, then the average listener has listened for only half of the total time.
When audience estimates are quoted, it is essential that the measure used also be mentioned. The average Monday to Friday and Cumulative Monday to Friday audiences usually differ. Similarly, the average daily audience will differ from the net daily audiences; the average 7-day and cumulative 7-day audiences will differ, etc. Without knowing which measure is referred to, it is impossible to correctly interpret the data.
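The four measures defined above can be computed from diary data as in the following sketch. The tiny diary, the respondent identifiers and the function names are hypothetical, and unweighted respondent counts are used for simplicity, whereas published estimates would be based on weighted data.

```python
# Hypothetical diary for one station on one day: respondent -> quarter hours listened.
# Quarter hours are numbered 0 to 3 within a one-hour time slot.
DIARY = {
    "r1": {0, 1, 2, 3},
    "r2": {0, 1},
    "r3": {2},
    "r4": set(),  # did not listen during the slot
}
SLOT = [0, 1, 2, 3]


def quarter_hour_audiences(diary, slot):
    """Audience of each quarter hour in the slot."""
    return [sum(1 for listened in diary.values() if q in listened) for q in slot]


def gross_audience(diary, slot):
    """Sum of the quarter-hour audiences; a person counts once per quarter hour."""
    return sum(quarter_hour_audiences(diary, slot))


def average_quarter_hour_audience(diary, slot):
    """Gross audience divided by the number of quarter hours in the slot."""
    return gross_audience(diary, slot) / len(slot)


def net_audience(diary, slot):
    """Different people who listened at least once; duplication removed."""
    return sum(1 for listened in diary.values() if listened & set(slot))


def cumulative_audience(daily_listeners):
    """Unduplicated listeners across several days (day -> set of respondents)."""
    return len(set().union(*daily_listeners.values()))


gross = gross_audience(DIARY, SLOT)                    # 2 + 2 + 2 + 1 = 7
average = average_quarter_hour_audience(DIARY, SLOT)   # 7 / 4 = 1.75
net = net_audience(DIARY, SLOT)                        # r1, r2 and r3 = 3
average_duration = gross / net                         # quarter hours per listener

WEEK = {"Mon": {"r1", "r2", "r3"}, "Tue": {"r1", "r5"}}  # hypothetical daily net audiences
cumulative = cumulative_audience(WEEK)                 # r1, r2, r3 and r5 = 4
print(gross, average, net, cumulative)
```

The example also shows why the measure must be quoted: for the same two days the average daily net audience is 2.5, while the cumulative audience is 4.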
6. CALCULATION OF REACH, FREQUENCY AND GROSS RATING POINTS (GRPs)
The main aim of media planning is to select those stations which reach as many members of a specific target market as possible, a given number of times (called the frequency), in a certain time period (e.g. seven days). Research which does not provide a reach and frequency estimate will be of little value for buying and selling advertising time.
When a seven-day diary is used, the daily reach can be calculated for every single day (or quarter hour or combination of quarter hours in that day), and for any combination of days (or quarter hours or combination of quarter hours within the day or days) up to seven days. The frequency is the number of days on which a person listened at a given time/times. This information is also used to calculate how many people you will reach with a campaign on any given number of days and a selection of quarter hours on each day, as well as what the average frequency will be.
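The gross rating points or GRPs reflect the total number of opportunities to hear (the radio counterpart of opportunities to see, OTS), or potential exposures, created by a campaign. They can be calculated by multiplying the reach (expressed as a percentage of the target market) by the average frequency.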
When telephone interviewing is done and recall of only ‘yesterday' listening is used, the frequency of listening across more than one day cannot be calculated.
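For a seven-day diary, the reach, average frequency and GRPs of a campaign can be sketched as follows. The five respondents, the campaign days and the function name are hypothetical; each respondent is assumed to carry equal weight, and one spot per day at the diarised time is assumed.

```python
def reach_frequency_grps(days_listened, campaign_days):
    """Return (reach %, average frequency, GRPs) for one spot per campaign day.

    days_listened: respondent -> set of days on which they listened at the spot time.
    """
    sample = len(days_listened)
    exposures = [len(days & campaign_days) for days in days_listened.values()]
    reached = sum(1 for e in exposures if e > 0)
    reach_pct = 100 * reached / sample
    # Average frequency among those reached at least once.
    avg_frequency = sum(exposures) / reached if reached else 0.0
    return reach_pct, avg_frequency, reach_pct * avg_frequency


# Hypothetical target-market respondents and the days on which they listened.
DAYS_LISTENED = {
    "r1": {"Mon", "Tue", "Wed"},
    "r2": {"Mon"},
    "r3": set(),
    "r4": {"Tue", "Thu"},
    "r5": set(),
}
CAMPAIGN = {"Mon", "Tue", "Wed", "Thu", "Fri"}

reach, frequency, grps = reach_frequency_grps(DAYS_LISTENED, CAMPAIGN)
print(reach, frequency, grps)
```

Three of the five respondents are reached (60%), on average twice each, which gives 60 x 2 = 120 GRPs.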
7. MARGIN OF ERROR
All research for which samples are used to estimate the behaviour, attitudes, etc., of the population is subject to sampling (or statistical) errors. If probability samples such as those described in Paragraph 5.2 are used, the size (or margin) of the error can be calculated. The margin of error can be calculated for different levels of confidence, but in most research, including SAARF RAMS®, the 95% confidence level is used. If the margin of error is calculated at the 95% confidence level, it means that if 100 similar samples were drawn, the error for 95 of them would be within the relevant margin, whilst 5 could fall outside this figure.
Two variables determine the size of the margin of error, namely the size of the sample and the degree of unanimity of the response. The latter refers to the ratio of the proportion of the sample who responded positively (in this instance the listeners) and those who responded negatively (non-listeners).
The formula for calculating the error is:
$$ s = 1.96 \times \sqrt{\frac{p \times (100 - p)}{n}} $$
Where:
s = Standard Error (multiplied by 1.96 to give the margin of error at the 95% confidence level, in percentage points)
p = Penetration (% who listened)
n = Sample Size
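As a worked illustration of the formula, the following sketch computes the margin of error for a given penetration and sample size. The function name and the example figures (10% penetration in a sample of 1 000) are assumptions chosen for illustration.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Margin of error in percentage points at the 95% confidence level (z = 1.96).

    p: penetration as a percentage (0 to 100); n: sample size.
    """
    if not 0 <= p <= 100:
        raise ValueError("p must be a percentage between 0 and 100")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (100 - p) / n)


error = margin_of_error(10, 1000)   # 1.96 x sqrt(10 x 90 / 1000)
print(round(error, 2))
```

For a station with 10% penetration in a sample of 1 000, the margin is about 1.9 percentage points, so the estimate lies between roughly 8.1% and 11.9% at the 95% confidence level. Because n appears under the square root, quadrupling the sample only halves the margin.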
8. ETHICAL ASPECTS
As in many other countries, the South African Market Research Industry strives for a high standard of research, as well as to protect the interests of the different stake-holders. Most of the leading researchers are members of the Southern African Market Research Association (SAMRA). SAMRA is a professional association and has a code of conduct, to which its members subscribe.
The stakeholders are the general public who provide the information, the client who pays for it, and the researcher.
Furthermore, there is an umbrella body to which many of the large research providers belong: the Association of Market and Social Research Organisation (AMSRO).
Individual researchers who are members of SAMRA and research providers who are members of AMSRO are obliged, to the best of their ability, to ensure that the research practitioner(s) with whom they are associated and the people conducting research on their behalf adhere to this Code of Conduct. More information on SAMRA can be obtained from them at:
P O Box 1713
RANDBURG
2125
Tel: (011) 886 3771
Fax: (011) 886 9721
9. THE ROLE OF SAARF
The South African Advertising Research Foundation was founded in 1974 as a non-profit industry body. SAARF was formed because of a need in the marketing and advertising communities for a comprehensive, unbiased, reliable, regular and technically excellent media audience and products survey. Its purpose is to provide information about the population's use of the media, products, services and brands as well as their characteristics and demographic composition so as to enable reliable targeting for advertising purposes.
The data is in such a format that it is used, among others, for the buying and selling of advertising time and space in the media and for strategic editorial and programme planning.
Over the years, the SAARF All Media and Products Survey (SAARF AMPS®), the SAARF Radio Audience Measurements Survey (SAARF RAMS®) and the SAARF Television Audience Measurement Survey (SAARF TAMS®) have established themselves as reliable, valid and credible research vehicles.
Apart from commissioning the above surveys, SAARF® also assists media owners, advertisers and advertising agencies in a series of other areas such as offering training courses and the development of segmentation instruments such as the SAARF Universal Living Standards Measure (SAARF SU-LSM®).
SAARF's mission is to serve its members. Other interested persons are also invited to liaise with SAARF about any aspect relating to media audiences, products, services and brands as well as target marketing and segmentation tools.
10. SOURCES CONSULTED
a. AMPS Diary Technical Report, South African Advertising Research Foundation (SAARF, Johannesburg, 1994).
b. RAMS Diary Technical Report, South African Advertising Research Foundation (SAARF, Johannesburg, 2007).
c. Guidelines for Market Research, Advertising Research Foundation (ARF), New York, (August 2003).
d. SAMRA Yearbook, Southern African Market Research Association, Johannesburg, 2006.
Please check with saarf@saarf.co.za as most of the above publications can be consulted in the SAARF Library.
APPENDIX A
Examples of RAMS® diary page