<this is supplied as figEx1_1>
Reproduced from http://easyweb.easynet.co.uk/~philipdnoble/snow.html, courtesy of Philip Noble.
Some people say instantly ‘Oh it’s a picture of a man – so what?’. Many others (including me when I first encountered it) take a very long time to see a specific ‘thing’ in it. If you concentrate on the centre of the picture you should eventually see the top half of a man. If you imagine a beret right on top of the picture in the centre this would be correctly positioned on the man’s forehead and he would look a lot like Che Guevara. Many people have seen the picture as one of Christ with a long flowing beard. It could also be a cavalier. His face is lit as if from the right hand side and so there is a lot of shadow. If you have problems with it try looking at it with friends. Someone will spot it and help you to see the whole figure.
I won’t provide any precise detail on where it originates. It seems certain that it is a picture of ground and snow, possibly on a mountainside and probably taken in China or Japan. When I once claimed while teaching a class (as I had been told) that it was a mountain, a student told me it had been taken by her grandfather and that it was in fact snow on a hedge. I had no reason to distrust the student but I have no independent evidence. Many people, of course, have claimed a ‘miraculous’ sighting of Christ, but others have seen Che Guevara, a cavalier and even Dave Lee Travis! All are possible; you’ll have to decide for yourself.
The main point of the demonstration though is this. When the man finally pops out at you, you will never again be able to see the picture as just a load of black and white blobs. You will have constructed and maintained a ‘template’ – a best bet as to what the picture is of – and this will remain as an automatic reaction in your perceptual system. Most of the time, in science and in everyday life, when we approach visual (and other sensory) material, we have a ‘best bet’ all ready and we are not aware of the perceptual system’s operation of ‘calculating’ what sensory data represent in the world.
Give a meaning for, and an example of, the following words. Press ‘reveal’ to see some model answers. Hopefully your answers will be similar in meaning to these.
<’reveal’ buttons display the ‘answers’ highlighted in grey below>
Deduction |
To come to a conclusion through logical reasoning from premises. For example, if all dogs have worms and my pet is a dog then it has worms; if the number 42 bus goes either from George Circus or from Green Square but the bus stop is not at Green Square then it must go from George Circus. |
Empirical study |
A research programme that seeks to provide support for a hypothesis using observable data gathered fairly with a replicable and publicly described procedure. The study by Gabrenya, Wang and Latané (1985), described in Chapter 1, tested and supported the hypothesis that children in collectivist societies tend to work harder when in a group than when alone, contrary to the US finding that people tend to 'loaf' in groups compared with when alone. |
Falsifiability |
The principle of falsifiability is that any proposed theory must be set in terms that render it possible to disconfirm it (or crudely ‘disprove’ it). This doesn’t mean the theory must or will be proved wrong. After all, it might be true. The proposer of the theory just has to give others the means to show it to be false – in case it is false. For example, I might claim that I am holding an invisible cat. When you ask to stroke it I say it is also unfeelable (and of course, unsmellable, unhearable, etc.). This is a pretty useless and uninteresting theory, because no observation could ever show it to be false! |
Induction |
To reason from particular instances to a general conclusion. For example, all sheep I have so far seen have four legs, therefore I’m assuming that all sheep have four legs (but I could be wrong); most people with Asperger Syndrome (AS) I have so far come across have trouble holding eye contact, hence I’m assuming this is a central feature of AS. |
Sample |
A group of people selected as representative of a larger population. If something works on them we assume that it will also work on a larger range of people. This is called generalising our results – in the same way as a medical trial seeks to establish that drug A is effective in reducing the symptoms of illness B and then is used to support the administration of drug A to the wider patient population. We assume that if caffeine increases memory ability in our sample then it will do so for people in general. |
Disconfirming theories – a ‘lateral thinking’ problem
Pages 17–19 of the book discuss the attempt to disconfirm theories as a powerful aspect of scientific reasoning. One of the best ‘awkward’ problems I have come across is shown below. Read the problem and have a think before revealing the answer below.
Three philosophy professors (A, B and C) are applying for a prestigious chair of philosophy post. There is little to choose between them so the interview panel sets a logical reasoning task. The questioner gives the following instruction: ‘I am going to draw either a blue or a white spot on each of your foreheads. I will then reveal the spots to you all simultaneously. If you see a blue spot on another person’s head put your hand up. As soon as you think you can say what colour spot you have on your own forehead please speak up with your answer’. He proceeds to draw a blue spot on each forehead. When the spots are all revealed to the candidates each one, of course, puts up a hand. After a brief moment’s hesitation professor A lowers her arm and says ‘I must have a blue spot’. How did she work this out?
Problems like this one are sometimes included in the general group of ‘lateral thinking’ problems. However, you do not have to think ‘laterally’ or particularly creatively to get the answer. You do, however, have to kind of think upside down. Before rushing on to get the answer do try to think about how the professor knew what wasn’t true rather than how she knew what was true.
Answer
<hidden text>
The answer is that she conducted a theory disconfirmation task. She thought ‘What if I had a white spot? If I did then B would quickly see that C could only have their arm up because B must have a blue spot, since my own spot, which each of them can see, would be white. But neither of them did respond quickly (remember all three are excellent at logical thinking) therefore I must have a blue spot.’ Professor A got the job!
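The disconfirmation step can also be checked by brute force. The following is a minimal Python sketch, not part of the original exercise; the function names are hypothetical. It lists the colours a professor cannot rule out for their own spot, given the spots they can see and the hands that are raised.

```python
COLOURS = ("blue", "white")
PROFS = ("A", "B", "C")

def hands_up(spots):
    """Each professor raises a hand if at least one OTHER professor has a blue spot."""
    return {p: any(spots[q] == "blue" for q in PROFS if q != p) for p in PROFS}

def possible_own_colours(observer, spots):
    """Colours the observer cannot rule out for their own spot, given the
    spots they can see and the pattern of raised hands they observe."""
    observed_hands = hands_up(spots)
    candidates = []
    for colour in COLOURS:
        imagined = dict(spots, **{observer: colour})  # try each colour on own forehead
        if hands_up(imagined) == observed_hands:
            candidates.append(colour)
    return candidates

actual = {"A": "blue", "B": "blue", "C": "blue"}
what_if_a_white = dict(actual, A="white")  # professor A's 'what if' theory

print(possible_own_colours("B", what_if_a_white))  # ['blue']
print(possible_own_colours("B", actual))           # ['blue', 'white']
```

If A’s spot were white, B would be left with only one possibility and could answer at once. In the actual situation B is left with two possibilities, which is why nobody speaks immediately, and that hesitation is what lets A reject the ‘white spot’ theory.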
<end hidden text>
Creating variables to measure psychological constructs
In this exercise try to give at least one operationally defined measure to assess each psychological construct in the list below. Examples are provided if you click ‘reveal’ but these are not the ‘correct’ answers, just some possibilities to demonstrate strict measurement.
<‘Reveal answers’ buttons needed – answers are highlighted in grey>
Construct |
Example answer: |
Anxiety |
1. Total score on an anxiety scale which includes such items as: ‘I often lie awake thinking about tomorrow’s issues.’ The response scale might be ‘Strongly agree, Agree, Disagree, Strongly disagree’. 2. Person’s self-rating on a scale of 1 to 10 of their current level of anxiety (e.g., as they approach or think about a feared object). |
Conformity |
Difference between the number of beans a participant estimates are in a jar and the number they were told a previous group had agreed on. (The smaller the difference, the more they ‘conform’.) |
Assertiveness |
1. Participant completes a story which requires assertiveness from the main character to bring about a successful conclusion. Endings are coded according to a scheme on which raters are intensively trained. 2. Number of people going back to the cashier in a store after they have been deliberately short-changed. |
Stress |
1. Number of single days taken off sick in one year. 2. Total score on ‘hassles’ scale. 3. Increase in errors made as task demands are increased. |
Self-esteem |
Difference in number of points scored on self-assessment ‘as I am’ and ‘how I would like to be’. |
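An operational definition is strict enough if it can be written down as a scoring rule. As a rough illustration only, here is a Python sketch of three of the example measures above. The point values given to the response scale, the function names and the sample figures are all assumptions made for the sketch; they do not come from any published scale.

```python
# Assumed point values for the four-point response scale (not given in the text).
LIKERT_POINTS = {"Strongly agree": 4, "Agree": 3, "Disagree": 2, "Strongly disagree": 1}

def anxiety_total(responses):
    """Total score on an anxiety scale: the sum of item scores. All items are
    assumed to be worded so that agreement indicates higher anxiety."""
    return sum(LIKERT_POINTS[r] for r in responses)

def conformity_difference(estimate, group_figure):
    """Absolute difference between a participant's bean estimate and the figure
    attributed to the previous group; smaller values mean more conformity."""
    return abs(estimate - group_figure)

def self_esteem_discrepancy(as_i_am, as_i_would_like_to_be):
    """Points separating the 'as I am' score from the 'how I would like to be' score."""
    return as_i_would_like_to_be - as_i_am

print(anxiety_total(["Agree", "Strongly agree", "Disagree"]))  # 9
print(conformity_difference(480, 500))                          # 20
print(self_esteem_discrepancy(32, 45))                          # 13
```

Each function takes what was observed and returns a single number, so two researchers applying the same rule to the same responses must arrive at the same score.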
Identifying sample types
Match the appropriate term with the sampling method described.
<Ex2-2_Matching.txt>
The nature of experiments
This is a True/False quiz to test your knowledge of the advantages of the experiment as a research method.
<Ex3-1_truefalse.xsl>
Identifying experimental designs
In this short quiz you will need to read each research description and identify the specific experimental design.
<Ex3-2_MCQs.xls>
Tabatha and her validity threats
In this chapter of the book there is a description of a rather naff research project carried out by Tabatha. Here it is again. As you read this passage try to identify, and even name if possible, every threat to validity that she has either introduced or failed to control in her design. A list is provided in the answers below.
Tabatha feels she can train people to draw better. To do this, she asks student friends to be participants in her study, which involves training one group and having the other as a control. She tells friends that the training will take quite some time so those who are rather busy are placed in the control group and need only turn up for the test sessions. Both groups of participants are tested for artistic ability at the beginning and end of the training period, and improvement is measured as the difference between these two scores. The test is to copy a drawing of Mickey Mouse. A slight problem occurs in that Tabatha lost the original pre-test cartoon, but she was fairly confident that her post-test one was much the same. She also found the training was too much for her to conduct on her own so she had to get an artist acquaintance to help, after giving him a rough idea of how her training method worked.
Those in the trained group have had ten sessions of one hour and, at the end of this period, Tabatha feels she has got on very well with her own group, even though rather a lot have dropped out because of the time needed. One of the control group participants even remarks on how matey they all seem to be and that some members of the control group had noted that the training group seemed to have a good time in the bar each week after the sessions. Some of her trainees sign up for a class in drawing because they want to do well in the final test. Quite a few others are on an HND Health Studies course and started a module on creative art during the training, which they thought was quite fortunate.
The final difference between groups was quite small but the trained group did better. Tabatha loathes statistics so she decides to present the raw data just as they were recorded. She hasn’t yet reached the recommended reading on significance tests in her RUC self-study pack.
Answers: Possible threats to validity in the study:
<hidden text>
Name of threat |
Issue in text |
Non-equivalent groups |
Busy students go into the control group. |
Non-equivalent measures |
Different Mickey Mouse pre- and post-test; a form of construct validity threat. |
Non-equivalent procedures |
Training method not clearly and operationally defined for her artist acquaintance. |
Mortality |
More participants dropped out of the training group than from the control group. |
Rivalry |
Control group participants note how matey the training group seems to be; some training group participants sign up for an extra drawing class in order to do well. |
History effect |
Some participants in the training group receive creative art training on their new HND module. |
Statistical conclusion validity |
Not a misapplication of statistical analysis but no analysis at all! |
<end hidden text>
Spotting the confounding variables
A confounding variable is one that varies with the independent (or assumed causal) variable and is partly responsible for changes in the dependent variable, thus camouflaging the real effect. Try to spot the possible confounding variables in the following research designs. That is, look for a factor that might well have been responsible for the difference or correlation found, other than the one that the researchers assume is responsible. If possible, think of an alteration to the design that might eliminate the confounding factor. Possible factors will be revealed under each example.
a. Participants are given either a set of 20 eight-word sentences or a set of 20 sixteen-word sentences. They are asked to paraphrase each sentence. At the end of this task they are unexpectedly asked to recall key words that appeared in the sentences. The sixteen-word sentence group performed significantly worse. It is assumed that the greater processing capacity used in paraphrasing sixteen words left less capacity to store individual words.
Answer: <hidden text>It could be that the extra time taken by the sixteen-word task caused greater fatigue or confusion. <end hidden text>
b. Male and female dreams were recorded for a week and then analysed by the researcher who was testing the hypothesis that male dream content is more aggressive than female dream content.
Answer: <hidden text>The researcher knew the expected result, hence researcher expectancy is a possible cause of difference. Solution is to introduce a single blind. <end hidden text>
c. People who were fearful of motorway driving were given several sessions of anxiety reduction therapy involving simulated motorway driving. Compared with control participants who received no therapy, the therapy participants were significantly less fearful of motorway driving after a three-month period.
Answer: <hidden text>There was no placebo group. It could be that the therapy participants improved only because they were receiving attention. Need an ‘attention placebo’ group. <end hidden text>
d. After a two-year period depressed adolescents were found to be more obese than non-depressed adolescents and it was assumed that depression was the major cause of the obesity increase.
Answer: <hidden text>Depression will probably correlate with lowered physical activity and this factor may be responsible. Needs depressed adolescents to be compared with similarly inactive non-depressed adolescents. <end hidden text>
e. People regularly logging onto Chat ’n Share, an internet site permitting the sharing of personal information with others on a protected, one-to-one basis, were found to be more lonely after one year’s use than non-users. It was assumed that using the site was a cause of loneliness.
Answer: <hidden text>Those using the site had less time to spend interacting with other people off-line; need to compare with people spending equal time on other online activities. <end hidden text>
f. Participants are asked to sort cards into piles under two conditions. First they sort cards with attractive people on them, then they sort ordinary playing cards. The first task takes much longer. The researchers argue that the pictures of people formed an inevitable distraction, which delayed decision time.
Answer: <hidden text>Order effect! The researcher has not counter-balanced conditions. The participants may simply have learned to perform the task faster in the second condition through practice on the first. <end hidden text>
g. It is found that young people who are under the age limit for the violent electronic games they have been allowed to play are more aggressive than children who have only played games intended for their age group. It is assumed that the violent game playing is a factor in their increased aggression.
Answer: <hidden text>This is only a correlation and there may be a third causal variable that is linked to both variables. Perhaps the socio-economic areas in which children are permitted to play under age are also those areas where aggression is more likely to be a positive social norm. <end hidden text>
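The counterbalancing remedy mentioned for example f can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a prescribed procedure: the function name, the condition labels and the participant IDs are invented. Participants are shuffled and then alternately given the two orders, so that any practice effect is spread evenly across both conditions.

```python
import random

def counterbalance(participants, conditions=("people cards", "playing cards"), seed=None):
    """Randomly split participants so that half take the two conditions in the
    given order and half take them in the reverse order."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random allocation of people to the two orders
    forward = tuple(conditions)
    reverse = tuple(reversed(conditions))
    return {p: (forward if i % 2 == 0 else reverse) for i, p in enumerate(shuffled)}

orders = counterbalance([f"P{n}" for n in range(1, 9)], seed=1)
for participant, order in sorted(orders.items()):
    print(participant, "->", " then ".join(order))
```

With eight participants, four sort the people cards first and four sort the playing cards first, so a faster second sort can no longer be attributed to only one type of card.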
Some outlines of research studies are given below and your task is to decide which one of the following research designs each study used (some of which are taken from previous exercises). In the absence of specific information assume studies are conducted in a laboratory.
Research design |
Full description |
Lab experiment (true) |
True experiment conducted in a laboratory. |
Lab quasi |
Quasi experiment conducted in a laboratory. |
Lab non-experiment |
Non-experiment conducted in a laboratory. |
Field experiment (true) |
True experiment conducted in the field. |
Field quasi |
Quasi experiment conducted in the field. |
Field non-experiment |
Field research study, which is not an experiment. |
1. A researcher’s confederate sang identical songs on two separate days, one day dressed scruffily and the other day smartly dressed. Passers-by were asked to rate the busker’s performance on the two separate days.
Answer: <hidden text>Field quasi<end hidden text>
2. Participants were allocated at random to one of two conditions of an experiment. In one condition participants were asked to learn a list of 20 words with accompanying pictures. In the other condition, the participants were asked to learn the words without the pictures.
Answer: <hidden text>Lab experiment (true)<end hidden text>
3. Children in a nursery were randomly allocated either to a condition where they were shown a film in which several adults behaved quite aggressively or to a condition in which they were shown a nature film. Both groups were then observed for aggressive behaviour.
Answer: <hidden text>Field experiment (true)<end hidden text>
4. Male and female dreams were recorded at home by participants for a week and then analysed by a researcher who was testing the hypothesis that male dream content is more aggressive than female dream content.
Answer: <hidden text>Field non-experiment<end hidden text>
5. People attending a health clinic and who were fearful of motorway driving were given several sessions of anxiety reduction therapy involving simulated motorway driving. A control group was formed by people on a waiting list who had only recently applied for therapeutic help with the same problem. The therapy participants were significantly less fearful of motorway driving after a three-month period than the control group.
Answer: <hidden text> Field quasi<end hidden text>
6. People who had recently experienced a post-traumatic stress disorder were asked by a psychological researcher to undergo a battery of psycho-motor test trials. Compared with non-stressed participants they performed significantly worse.
Answer: <hidden text> Lab non-experiment<end hidden text>
7. Psychology students were invited to volunteer for a research study. Because the researcher did not want participants from one condition to discuss the procedure with participants in the other, he asked students from one course to detect stimuli under stressful conditions and students from the other course to do the same task under non-stressful conditions.
Answer: <hidden text>Lab quasi<end hidden text>
Defining some key terms used in the chapter
Can you give a meaning for the following terms? Click to see a model answer. Hopefully your answers will be similar in meaning to these.
<’Reveal’ to see the answers highlighted in grey>
Give a meaning for the following: |
Possible answer: |
Archival data |
Data that exist as records and which can be used to test hypotheses about human behaviour, e.g., crime or traffic accident data. |
Case study |
Study of one individual, group or organisation in depth. |
Coding |
Categorising behaviour according to pre-arranged criteria, often to make quantitative analysis possible. |
Diary method |
Data gathering by recording experiences or observations on a regular (often daily) basis. |
Inter-observer reliability |
Level of agreement between two or more trained observers of the same events. |
Naturalistic observation |
Observation carried out on behaviour as it occurs naturally in the person’s or animal’s own environment. |
Participant observation |
Observing as a participant in the observed group. |
Reactivity |
Tendency for people to behave differently because they know they are being observed. |
Structured observation |
Observation that is organised, where behaviour is strictly coded and where extraneous variables are controlled. |
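Inter-observer reliability, as defined above, is a level of agreement, so it can be expressed as a number. The Python sketch below is an illustration only. It computes simple proportion agreement and also Cohen’s kappa, a widely used chance-corrected index that is not named in the definitions above. The behaviour codes and observer data are invented.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Proportion of observation intervals that both observers coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both observers must code the same number of intervals")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Invented codes from two observers watching the same ten intervals.
observer_1 = ["play", "play", "fight", "talk", "play", "talk", "fight", "play", "talk", "play"]
observer_2 = ["play", "talk", "fight", "talk", "play", "talk", "play", "play", "talk", "play"]

print(percent_agreement(observer_1, observer_2))       # 0.8
print(round(cohens_kappa(observer_1, observer_2), 2))  # 0.67
```

Kappa is lower than raw agreement here because two observers who both use ‘play’ often would agree on some intervals even if they coded at random.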
Preparing an interview schedule
Prepare a set of questions for an interview investigating the issue of assisted suicide. Imagine that this is for a piece of qualitative research where you wish to explore the concept fully in terms of people’s attitudes to the issue. You particularly want to know how people rationalise their positions. Make sure that your questions cover a wide area of possible perspectives – look at the issue from different people’s points of view. After you have prepared your interview schedule as fully as you think you can, have a look at the points below (click the ‘reveal hints’ buttons) and see if you have covered all these areas, and perhaps produced some that I didn’t think of.
Hint 1 <hidden text>How many of your questions will produce short answers? (E.g., ‘do you believe in assisted suicide?’ or ‘would you ever assist someone to commit suicide?’ – these are closed questions and may only produce single answers of ‘yes’ or ‘no’.)<end hidden text>
Hint 2 <hidden text>Have you used prompts and probes to facilitate elaboration of shorter answers? (e.g., ‘If no, could you tell me why?’)<end hidden text>
Hint 3 <hidden text> Have you investigated:
<end hidden text>
Defining some key terms used in the chapter
Can you give a meaning for the following terms? Click to see a model answer. Hopefully your answers will be similar in meaning to these.
<‘Reveal’ to see the answers highlighted in grey>
Give a meaning for the following: |
Possible answer: |
Closed questions |
Questions to which the appropriate answer is one of a finite set, e.g., ‘do you believe in ghosts?’ (yes/no) or ‘what is your lucky number?’ |
Focus group |
Group selected for discussion of an issue because they share a common interest. |
Open questions |
Questions allowing the respondent to answer at length, e.g., ‘tell me what that was like’. |
Panel |
Group of people selected to represent a range of views and who are often consulted on a regular basis. |
Semi-structured interview |
Interview type in which the researcher has prepared a schedule of questions to ask but the order in which these will be presented is not fixed. The interviewer tries to keep the session as close to normal conversation as possible. If a respondent naturally produces a full enough answer, questioning on that item will end, otherwise prompts and probes may be employed. |
Survey |
A large-scale data gathering exercise where accurate sampling is very important since results are usually taken as indicative of the general population’s attitudes, behaviour, etc. |
Problematic items in psychological scales
Some proposed items for different kinds of psychological scales are listed below. In each case select the kind of error (from the list in the box below) that is being made with the item (see p. 207 of the book for explanations).
Leading question |
Ambiguous |
Technical terms |
(too) Complex |
(too) Emotive |
(too) Personal |
Double-barrelled |
Double negative |
Inappropriate scale |
1. Violent video games can have a negative effect on children’s socio-psychological development.
Answer: <hidden text> technical terms<end hidden text>
Explanation: <hidden text>Would respondents understand socio-psychological? <end hidden text>
2. Boxers earn a lot of money (in an attitude to boxing scale).
Answer:<hidden text>ambiguous<end hidden text>
Explanation: <hidden text>That boxers earn a lot of money is a fact but it gives no indication of a person’s views on boxing. Both sides will agree so the item has no discriminatory power. <end hidden text>
3. Boxing is barbaric and should be banned.
Answer: <hidden text> double-barrelled<end hidden text>
Explanation: <hidden text>May agree it is barbaric but argue against a ban. <end hidden text>
4. Hunters should not terrify poor defenceless little animals.
Answer: <hidden text>emotive<end hidden text>
Explanation: <hidden text> Could be an item in some scales but is a bit OTT on emotional tugging here. <end hidden text>
5. I thought the advertisement was:
Good Average Poor Very Poor
Answer:<hidden text>inappropriate scale<end hidden text>
Explanation: <hidden text>There are two negative ‘poor’ choices but only one positive choice. Also, what is meant by ‘average’? <end hidden text>
6. Have you ever suffered from a mental disorder?
Answer: <hidden text> too personal<end hidden text>
Explanation: <hidden text>Should not need to ask this and may not get an honest reply. Might be relevant in specialised research but would probably not be approached so bluntly. <end hidden text>
7. People have a natural tendency to learn through encountering problems, seeking information and testing hypotheses about the world and therefore education should be about providing resources for discovery rather than about top-down delivery and testing of knowledge.
Answer: <hidden text>too complex<end hidden text>
Explanation: <hidden text>Though the statement makes sense it may well need reading a few times and may tax some respondents with its vocabulary and length. <end hidden text>
8. Don’t you think the government will miss its child poverty targets?
Answer: <hidden text>leading question<end hidden text>
Explanation: <hidden text>Probably wouldn’t be as blatantly leading as this but even ‘Do you think…’ invites agreement. <end hidden text>
9. There are no grounds upon which a child should not be given a right to education.
Answer: <hidden text>double negative<end hidden text>
Explanation: <hidden text>Should be understood by most but double negatives do make respondents have to think twice. <end hidden text>
Defining some key terms used in the chapter
Can you give a meaning for the following terms? Click to see a model answer. Hopefully your answers will be similar in meaning to these.
<‘Reveal’ to see the answers highlighted in grey>
Chapter term |
Explanation |
External reliability |
The extent to which a psychological measure produces the same results when administered to the same people on different occasions. |
Factor analysis |
Process by which clusters of correlations are identified among many measured variables such that they can be taken as statistical evidence for the existence of psychological constructs. |
Internal reliability |
The internal consistency of a test, assessed by measuring whether people tend to score on each item in the same direction, and to the same strength, as they did on all other items. |
Psychometric test |
A measuring ‘instrument’ seen as a scientifically devised measure of a human characteristic or aspect of behaviour. |
Response acquiescence |
Tendency for people to agree with items that are positively worded. |
Standardisation |
Process of ‘fitting’ a psychological scale to a normal distribution and establishing statistical norms for specific groups of people. |
Validity |
Extent to which a psychological scale measures the construct that it was intended to measure. |
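The two kinds of reliability defined above are usually reported as coefficients. As a hedged illustration, the Python sketch below estimates external (test-retest) reliability with a Pearson correlation between two administrations, and internal reliability with Cronbach’s alpha. Alpha is one common index of internal consistency and is an assumption here, since the definitions above do not name a statistic. All scores are invented.

```python
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

def cronbach_alpha(item_scores):
    """item_scores holds one list per respondent, giving that person's score on every item."""
    k = len(item_scores[0])                        # number of items
    columns = list(zip(*item_scores))              # scores regrouped by item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Invented data: five people tested twice (external reliability).
first_testing = [12, 18, 25, 9, 30]
second_testing = [14, 17, 27, 10, 28]
print(round(pearson_r(first_testing, second_testing), 2))

# Invented data: five people answering a four-item scale (internal reliability).
responses = [[4, 3, 4, 4], [2, 2, 1, 2], [3, 3, 3, 4], [1, 2, 1, 1], [4, 4, 3, 4]]
print(round(cronbach_alpha(responses), 2))
```

Values near 1 on either coefficient indicate high reliability; what counts as acceptable depends on the purpose of the scale.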
Defining some key terms used in the chapter
Can you give a meaning for the following terms? Click to see a model answer.
<‘Reveal’ to see the answers highlighted in grey>
Give a meaning for the following: |
Possible answer: |
Cohort effect |
A cross-sectional study comparing a group of 15 year olds with a group of 9 year olds may be invalid and confounded because one group might have had a significantly different experience – for instance, the younger group may have experienced a significantly changed national curriculum that has introduced special emphasis on mathematics and language skills. |
Cross-generational problem |
A problem in longitudinal studies where one group, studied longitudinally in terms of social or cognitive development, may have experienced quite a different social environment from a previously studied group with which this group is being compared. |
Cross-sectional study |
Study in which a ‘snapshot’ is taken of different groups of people at the same time. For instance, very common would be a study of reading abilities in 7, 9 and 11 year olds at the same time (e.g., in February 2010). This could be extended so that the same ages are again studied two or three years later – a design known as a time-lag study. |
Cross-cultural study |
A study that attempts to compare effects in one culture with those in another, either in order to extend our knowledge of a psychological construct (has it the same strength and direction in culture B as it does in culture A), or to use the second culture as a level of an independent variable in order to test a hypothesis. For instance, culture B might use a language that has number terms only up to four or five and we can see whether language is important for memorising numbers of objects. |
Cross-lagged correlation |
Correlation of a variable at time 1 with another variable at time 2 or vice versa. For instance, researchers might compare measured levels of parental reading at time 1 with their child’s verbal abilities at time 2 and vice versa, along with the time 1 and time 2 correlations, in order to obtain a clearer picture of the causal effect of parental reading (i.e., do children do better if parents read, or do parents read if children are more verbal to start with?). |
Cultural relativity |
The belief that it is almost impossible to transfer psychological constructs and measures from one culture to another. Cultures can only be properly or sensibly understood by outsiders (i.e., researchers) through long-term study immersed in that culture (i.e., by living with them for several months if not years). |
Longitudinal study |
A study that follows the same group of people through a long period and periodically re-assesses psychological characteristics or skills. The aim is to observe changes in development along the way. |
Panel design |
A kind of longitudinal design where the same group is measured after a certain interval and perhaps over more intervals. |
Time-lag study |
A study of a specified group of people, repeated at long intervals, say every three years. E.g., a group of nine year olds might be assessed for attitudes to authority in 2010 and then another group of nine year olds will be similarly assessed in 2013 and comparisons made. |
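The cross-lagged correlation defined above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names, variable labels and all scores are invented, and a real analysis would also need significance testing. Two variables measured at time 1 and time 2 give two ‘lagged’ correlations whose sizes can then be compared.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

def cross_lagged(x_t1, y_t1, x_t2, y_t2):
    """Return the two cross-lagged correlations for variables x and y."""
    return {
        "x at time 1 with y at time 2": pearson_r(x_t1, y_t2),
        "y at time 1 with x at time 2": pearson_r(y_t1, x_t2),
    }

# Invented scores for six families: x = parental reading, y = child's verbal ability.
reading_t1 = [2, 5, 3, 8, 6, 4]
verbal_t1 = [40, 42, 45, 50, 44, 41]
reading_t2 = [3, 5, 3, 7, 6, 5]
verbal_t2 = [43, 52, 46, 60, 55, 49]

for label, r in cross_lagged(reading_t1, verbal_t1, reading_t2, verbal_t2).items():
    print(label, round(r, 2))
```

If the first correlation is clearly larger than the second, that pattern is more consistent with earlier parental reading leading to later verbal ability than with the reverse, although it does not establish cause on its own.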
Matching approaches to principles
Chapter 10 introduces several well-defined approaches to the collection and analysis of qualitative data that have developed over the last few decades. Below are brief descriptions of the principles of each approach. For each one, try to select the appropriate approach from the list below.
Grounded theory |
Interpretive phenomenological analysis |
Discourse analysis |
Thematic analysis |
Ethnography |
Action research |
Narrative analysis |
1. An approach that encourages the development of theory from patterns that emerge during the analysis of qualitative data; theories are not imposed on the data before they are gathered. Data are analysed until saturated.
Answer: <hidden text> grounded theory<end hidden text>
2. This approach holds that what people say is not a source of evidence for what they have in their heads or minds. It analyses speech as people’s ways of constructing their perceptions and memories of the world as they see it. Speech is used to construct one’s ‘stake’.
Answer: <hidden text>discourse analysis<end hidden text>
3. In this approach an organisation or a culture is studied intensively from within.
Answer: <hidden text> ethnography<end hidden text>
4. An approach that analyses text for themes and which can be data driven (theory emerges from the data) or top-down (testing hypotheses or seeking to confirm previous findings and theories). Highly versatile and not allied to any specific philosophical position.
Answer: <hidden text> thematic analysis<end hidden text>
5. This approach sees the role of psychological research as one of intervention to produce change for human benefit. An important aspect of the approach is the emphasis by the researcher on a collaborative project.
Answer: <hidden text> action research<end hidden text>
6. In this approach researchers try to access the perceptions and thoughts of the researched persons and to reflect and understand these in a way that is as close as possible to the way that the persons themselves interpret the world. The data analysis is usually a search for themes among interview data.
Answer: <hidden text> Interpretive phenomenological analysis<end hidden text>
7. This approach studies the ways in which people construct memories of their lives through stories. A central principle is that people generally establish their identity by constructing and re-telling these stories, even if only to themselves.
Answer: <hidden text> narrative analysis<end hidden text>
Defining some key terms used in the chapter
Can you give a meaning for the following terms? Click to see a model answer.
<‘Reveal’ to see the answers highlighted in grey>
Give a meaning for the following: |
|
Constructivism |
Theory that ‘facts’ in the world are social constructions, created through human interaction, e.g., reconstruction of memory. |
Endogenous research |
Research where a group collaboratively researches its own customs, history, social norms, etc. |
Feminist psychology |
A movement within psychology that emphasised women’s perspectives and the general absence of women from the mainstream psychological picture of human behaviour, except as contrasts; in particular feminist research drew on methods opposed to ‘masculinist’ quantification and emphasis on scientific instruments, emphasising qualitative approaches and questioning in human interaction. |
Paradigm |
The currently accepted model within any science, one that is likely to come under threat eventually from a new emergent paradigm as did Newtonian physics from Einstein’s models. ‘New Paradigm’ research claimed that ‘hard’, scientific psychology with quantitative methods was in need of radical change towards a more human and holistically oriented approach. |
Realism |
Belief that the world consists of facts that can be unambiguously discovered or demonstrated through empirical research. Only one ‘real world’ exists. |
Reflexivity |
Recognition that the researcher’s own position, and the research design or constructed research question, can influence the construction of knowledge claimed in a research project. |
Theoretical sampling |
A version of purposive sampling. People are selected for the research project on the basis of the research question and/or the ongoing data analysis and its implications in a qualitative project. |
Ethical issues in research designs
What are the main ethical issues involved in the following possible research designs? Please have a good think before you reveal the answers.
1. A researcher arranges for shoppers to be given either too much or too little change when making a purchase in a department store. A record is taken of how many return to the cash desk and those that do are asked to complete a short questionnaire and are then debriefed as to the purpose of the study.
Answer: <hidden text> The ‘participants’ in this study were not able to give their informed consent before participating. They have been mildly delayed by having to return to the cash desk but are also under undue pressure to then complete the questionnaire. <end hidden text>
2. Participants are given a general intelligence test and are then given false feedback about their performance. They are told either that they did very well and significantly above average or that they did rather poorly and significantly below average.
Answer: <hidden text>There is a possible issue of some psychological harm in that some participants are told they have produced poor intelligence scores. OK – the effect is short-lived, but psychologists have to weigh the knowledge gained from the experiment against both the perhaps mild distress caused to the participants and the effect this has on the credibility and trustworthiness of psychologists in general, in the public view. <end hidden text>
3. In-depth semi-structured interviews are conducted with seven middle managers in an organisation where there has been some discord between middle and senior management. The researcher has been contracted to highlight possible causes of resentment and reasons for frustration that have been expressed quite widely. The researcher publishes a full report including demographics of the participants, three of whom are women, one of whom is Asian.
Answer: <hidden text>There is a problem here with anonymity. It will be easy for the senior managers to identify the sole Asian woman. In cases like these full information has to be compromised in order to preserve privacy and to protect individuals whose lives could be seriously affected by disclosure. <end hidden text>
4. Participants volunteer for an experiment and are first shown slides that have a theme of sweets, nuts or beans. They are then asked to put their hands into bags which contain either jelly beans, peanuts or kidney beans. The researcher is interested in whether the slides influence the participant’s identification of the items in the bag.
Answer: <hidden text>The description does not make clear whether the participants were asked before participating whether they might suffer from any allergies. Most importantly there is a risk of an anaphylactic reaction from the peanuts. This then is in contravention of the principle of not putting participants at any physical risk. <end hidden text>
5. A researcher conducting an experiment is quite attracted to one of the participants. At the end of the session, when the experiment is over, he asks her for a date.
Answer: <hidden text>The researcher is in a position of special power over the participant and should not exploit this by mixing professional activity with personal life. The two might of course meet up somehow outside the professional context but making this approach in the context of the experiment puts the psychologist at risk of contravening professional ethics. <end hidden text>
Defining some key terms used in the chapter
Can you give a meaning for the following terms? Click to see a model answer.
Analytic induction |
In Grounded Theory the emergent theory is modified by the addition of new cases and the movement is from particulars to a general overview. |
Analytic procedure |
The method used in a specific qualitative project, usually described under this heading in the method section of the report. |
Coding unit |
The type of category that is to be used in a content analysis approach to qualitative data analysis. It might be decided to work at the level of individual words, or with certain phrases or, higher still, with certain sets of meanings (e.g., each time the prime minister refers to greed among highly paid executives, no matter what terms are used in the description). |
Content analysis |
Coding of qualitative data often into units so that quantitative analysis can be performed. |
Idiographic |
View of individual as unique with unique un-measurable characteristics. |
Nomothetic |
Quantitative approach that looks to measure human characteristics but still claims individual can be unique in having unrepeated combination of characteristics at different levels. |
Inductive analysis |
Working with qualitative data so that theory emerges from instance within it. This approach is contrasted with the hypothetico-deductive method of looking for pre-reasoned patterns in the data. |
Respondent validation or ‘member checking’ |
Sharing an early analysis and interpretation of data with the people who were participants in the study so that they can verify, disagree with or shed new light on the assumptions and conclusions that are being made. |
Triangulation |
Checking one’s analysis and conclusions from qualitative data with another perspective – e.g., a different researcher’s analysis or using member checking. |
The early part of Chapter 13 introduces levels of measurement and first talks of categorical and measured variables. We then look at the traditional division of scales into four types, nominal, ordinal, interval and ratio. In fact, when attempting the analysis of data and trying to decide which statistical treatment is appropriate you will never need to decide whether data are at a ratio level and you will rarely come across ordinal level data. There are two major decisions most of the time: first, whether your variable for analysis is categorical or measured; and second, if measured, whether it can safely be treated as interval level data or whether you should employ tests that are appropriate for ordinal level data. We deal with these distinctions here but they will be put into practice when deciding whether data are suitable for parametric testing as described in Chapter 19.
Identifying categorical and measured variables
Decide in each case below the type of variable for which data have been recorded: Categorical or Measured. Some questions have further explanations that can be found by clicking on the reveal button.
1. Numbers of people who are extroverted or introverted
Answer:<hidden text>categorical<end hidden text>
Further explanation:<hidden text>The numbers of people counted are on an interval scale but for each participant we have only a category; never confuse the frequencies with the measurement method used for each person/case.<end hidden text>
2. Scores on an extroversion scale
Answer:<hidden text>measured<end hidden text>
3. Number of words recalled from a learned 20 item list
Answer:<hidden text> measured<end hidden text>
4. Whether people stopped at a red traffic light or not
Answer: <hidden text> categorical<end hidden text>
5. Grams of caffeine administered to participants
Answer: <hidden text> measured<end hidden text>
Further explanation:<hidden text>Grams are units on a clearly measured scale. However, we could be conducting an experiment where we give 0 grams, 50 grams or 200 grams to participants, in which case we would be using three categories; it is always possible to use an interval scale but create categories like these. If, for instance, we recorded in each case only whether participants solved a problem or not we would have a 3 x 2 cross-tabs table for a χ² analysis – see Chapter 18.<end hidden text>
6. Number of errors made in completing a maze
Answer: <hidden text> measured<end hidden text>
7. Whether people were recorded as employed, self-employed, unemployed or retired
Answer: <hidden text> categorical<end hidden text>
8. Number of cigarettes smoked per day
Answer: <hidden text> measured<end hidden text>
9. Whether people smoked none, 1–15, 16–30, or more than 30 cigarettes per day
Answer:<hidden text>categorical<end hidden text>
Further explanation:<hidden text>Here again a measured variable has been reduced to a set of categories. <end hidden text>
10. Number of aggressive responses recorded by an observer of one child
Answer:<hidden text>measured<end hidden text>
11. Whether a child was recorded as strong aggressive or moderate aggressive or non-aggressive
Answer:<hidden text>categorical<end hidden text>
Further explanation:<hidden text>The same thing could have happened here too. <end hidden text>
Finding descriptive statistics
For those able to use IBM SPSS or any other spreadsheet software that will find descriptive statistics the file psychology test scores.sav (SPSS) or psychology test scores.xls (Excel) contains data on 132 cases that you can work with. For those working by hand this is rather a lot of data so I have provided a smaller data set below for you.
<Please provide link to data files provided>
psychology test scores.sav <Psychology-test-2-scores.sav>
psychology test scores.xls <psychology-test-scores-small-set>
Working on SPSS or equivalent with the psychology test score data find the:
Mean Answer: <hidden text>37.02<end hidden text>
Median Answer:<hidden text>38<end hidden text>
Mode Answer: <hidden text>39<end hidden text>
Range Answer: <hidden text> 38<end hidden text> Note: <hidden text>1 has been added to the SPSS answer for the range (37) for the reasons given on p. 352 of the book. We assume the range runs from the lower end of the lower interval to the upper end of the upper interval. <end hidden text>
Semi-interquartile range Answer: <hidden text>3.5<end hidden text> Note: <hidden text> Find the semi-interquartile range in SPSS by selecting Analyze/Descriptive Statistics/Frequencies and using the Statistics button to select quartiles. The output will call these ‘percentiles’ but the 25th and 75th will be provided so you can take the difference between these two and halve it. <end hidden text>
Standard deviation Answer: <hidden text>6.83<end hidden text>
Click on each item to see the correct answer.
For those working by hand here is a simpler data set:
14 15 16 18 18 19 21 22 22 22 23 23 23 24 24 25 25 26 27 27 35 38 39
Try to calculate the same statistics (and read the notes above about calculations):
Mean Answer: <hidden text>23.65<end hidden text>
Median Answer: <hidden text>23<end hidden text>
Mode Answer: <hidden text>22<end hidden text>
Range Answer: <hidden text>27<end hidden text>
Semi-interquartile range Answer: <hidden text>3.5<end hidden text> Note: <hidden text> Excel gives slightly different answers for quartiles and percentiles and hence the semi-interquartile range value will be different – 2.75<end hidden text>
Standard deviation Answer: <hidden text>6.58<end hidden text>
Click on each item to see the correct answer.
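The note about SPSS and Excel giving different quartiles comes down to where each program places the 25th and 75th percentiles in the ordered data. Here is a rough sketch in Python for the hand-calculation data set. It assumes that the `exclusive` method of `statistics.quantiles` behaves like the SPSS default and the `inclusive` method like Excel's QUARTILE function; that mapping is my reading, not something stated in the book.

```python
from statistics import median, quantiles

# Smaller data set from the exercise above (already in ascending order).
scores = [14, 15, 16, 18, 18, 19, 21, 22, 22, 22, 23, 23,
          23, 24, 24, 25, 25, 26, 27, 27, 35, 38, 39]


def semi_interquartile_range(data, method):
    """Half the distance between the 25th and 75th percentiles."""
    q1, _q2, q3 = quantiles(data, n=4, method=method)
    return (q3 - q1) / 2


# 'exclusive' places a percentile at position p * (n + 1) in the ordered data.
siqr_exclusive = semi_interquartile_range(scores, "exclusive")
# 'inclusive' places it at position 1 + p * (n - 1).
siqr_inclusive = semi_interquartile_range(scores, "inclusive")

print(median(scores), siqr_exclusive, siqr_inclusive)
```

The two methods give 3.5 and 2.75 respectively, which are the two semi-interquartile range values quoted in the answer and its note.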
A bar chart
Students on an organisational psychology course have taken part in an experiment in which they first conducted an interview while being observed by a visiting lecturer. Half the students were told the visitor was an expert in human relations; half of this group were given positive feedback by the visitor while the other half were given negative feedback. The other half of the students were told their visiting observer was simply ‘an academic’ and the same two types of feedback were given by the visitor to this group. Students were then asked to rate their own interview performance on a scale of 1 (very poor) to 10 (excellent). The results are displayed in the combined bar chart below.
<Ex14_Fig1 here>
Please describe the results as accurately as you can (no specific numerical values are required) and offer some possible explanation of the findings.
Answer <hidden text>In general positive feedback has greater effect than negative feedback. In addition there appears to be a greater effect from the expert than from the academic. However, there also seems to be an interaction in that the academic’s negative feedback appears to have had a greater lowering effect than that of the expert. Perhaps the students receiving expert negative feedback would rationalise that the expert would be particularly harsh and have therefore discounted some of the feedback. <end hidden text>
A histogram
<Ex14-2fig1 here>
A psychology lecturer has given her students a class test where the maximum mark possible is 50. The histogram above shows the distribution of the test score data. The median of this distribution is 38:
1. Was the test easy or hard?
Answer<hidden text>Easy. The distribution is negatively skewed and shows a ‘ceiling effect’ with many scores near the top end of the scale.<end hidden text>
2. Why is the mean lower than the median?
Answer<hidden text>Because the distribution is negatively skewed and therefore there are more extreme low scores in the tail pulling the mean (37.02) lower than the median (38). <end hidden text>
3. What is the modal category of scores?
Answer<hidden text>38–40<end hidden text>
Note: The data for this histogram are contained in the file used in the Chapter 13 exercises.
z scores
A reading ability scale has a mean of 40 and a standard deviation of 10 and scores on it are normally distributed.
1. What reading score does a person get who has a z score of 1.5?
Answer<hidden text>55<end hidden text>
2. If a person has a raw score of 35 what is their z score?
Answer<hidden text>-0.5<end hidden text>
3. How many standard deviations from the mean is a person achieving a z score of 2.5?
Answer<hidden text>2.5 above the mean<end hidden text>
4. What percentage of people score above 50 on the test?
Answer<hidden text>15.87%<end hidden text>
5. What percentage of people score below 27?
Answer<hidden text>9.68%<end hidden text>
6. What is the z score and raw score of someone on the 68th percentile?
Answer<hidden text>The 68th percentile is the point where 18% (or .18) of scores lie between the mean and z. z is .47 and this is 4.7 above 40 = 44.7 <end hidden text>
7. At what percentile is a person who has a raw score of 33?
Answer<hidden text>24th (24.2%)<end hidden text>
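If you would like to check your table look-ups for the z score questions, the sketch below uses `NormalDist` from Python's standard library in place of the printed normal distribution table. The helper names `raw_from_z` and `z_from_raw` are my own, and because the values are exact rather than read from a table they can differ slightly from the answers in the last decimal place.

```python
from statistics import NormalDist

# Reading ability scale from the exercise: mean 40, standard deviation 10.
reading = NormalDist(mu=40, sigma=10)
standard_normal = NormalDist()  # mean 0, standard deviation 1


def raw_from_z(z):
    """Convert a z score to a raw reading score."""
    return reading.mean + z * reading.stdev


def z_from_raw(raw):
    """Convert a raw reading score to a z score."""
    return (raw - reading.mean) / reading.stdev


print(raw_from_z(1.5))                  # question 1
print(z_from_raw(35))                   # question 2
print(100 * (1 - reading.cdf(50)))      # question 4: percentage above 50
print(100 * reading.cdf(27))            # question 5: percentage below 27
z_68 = standard_normal.inv_cdf(0.68)    # question 6: z at the 68th percentile
print(z_68, raw_from_z(z_68))
print(100 * reading.cdf(33))            # question 7: percentile for a score of 33
```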
Standard error
1. If a sample of 30 people produces a mean target detection score of 17 with a standard deviation of 4.5, what is our best estimate of the standard error of the sampling distribution of similar means?
Answer: <hidden text> <insert Ex15-2_Image.png> <end hidden text>
2. Using the result of question 1, find the 95% confidence interval for the population mean.
Answer: <hidden text> 15.39 to 18.61<end hidden text>
Explanation: <hidden text>For 95% limits z must be -1.96 to +1.96; 1.96 x the se = 1.96 x 0.82 = 1.61
Hence we have 95% confidence that the true mean lies between 17 ± 1.61 <end hidden text>
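The arithmetic behind these two answers can be sketched in a few lines of Python. This follows the explanation above in using z = 1.96 for the 95% limits (not a t value), and the variable names are simply my own labels.

```python
from math import sqrt
from statistics import NormalDist

n = 30        # sample size
mean = 17.0   # sample mean
sd = 4.5      # sample standard deviation

# Standard error of the mean: standard deviation divided by root n.
standard_error = sd / sqrt(n)

# z value cutting off 2.5% in each tail (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)
margin = z_crit * standard_error
ci_lower, ci_upper = mean - margin, mean + margin

print(round(standard_error, 2), round(ci_lower, 2), round(ci_upper, 2))
```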
One- or two-tailed tests
In each case below decide whether the research prediction permits a one-tailed test or whether a two-tailed test is obligatory.
1. There will be a difference between imagery and rehearsal recall scores.
Answer: <hidden text>two-tailed<end hidden text>
2. Self-confidence will correlate with self-esteem
Answer: <hidden text>two-tailed<end hidden text>
3. Extroverts will have higher comfort scores than introverts
Answer: <hidden text>one-tailed<end hidden text>
4. Children on the anti-bullying programme will improve their attitude to bullying compared with the control group
Answer: <hidden text>one-tailed<end hidden text>
5. Children on the anti-bullying programme will differ from the control group children on empathy
Answer: <hidden text>two-tailed<end hidden text>
6. Anxiety will correlate negatively with self-esteem
Answer: <hidden text>one-tailed<end hidden text>
7. Participants before an audience will make more errors than participants alone
Answer: <hidden text>one-tailed<end hidden text>
8. Increased caffeine will produce a difference in reaction times
Answer: <hidden text>two-tailed<end hidden text>
Type I and Type II errors
Please answer true or false for each item.
<Ex16-2_truefalse.xsl>
z values and significance
In the chapter we looked at a value of z and found the probability that a z that high or higher would be produced at random under the null hypothesis. We do that by taking the probability remaining to the right of the z value on the normal distribution in Appendix table 2 (if the z is negative we look at the other tail as in a mirror). Following this process, in the table below enter the exact value of p that you find from Appendix table 2. Don’t forget that with a two-tailed test we use the probabilities at both ends of the distribution. Enter your value with a decimal point and four decimal places exactly as in the table. Decide whether a z of this value would be declared significant with p ≤ .05.
| | z value | One or Two tailed | p = | Significant? |
| a | 0.78 | One | | |
| b | 1.97 | Two | | |
| c | 2.56 | Two | | |
| d | -2.24 | Two | | |
| e | 1.56 | One | | |
| f | -1.82 | Two | | |
Answers:
<hidden text>
| | z value | One or Two tailed | p = | Significant? |
| a | 0.78 | One | .2177 | No |
| b | 1.97 | Two | .0488 | Yes |
| c | 2.56 | Two | .0104 | Yes |
| d | -2.24 | Two | .0250 | Yes |
| e | 1.56 | One | .0594 | No |
| f | -1.82 | Two | .0688 | No |
<end hidden text>
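The look-up procedure described above (take the area beyond z, mirror negative values, and use both tails for a two-tailed test) can be sketched as a short Python function. `p_from_z` is a name I have made up for illustration, and the standard library's `NormalDist` stands in for Appendix table 2.

```python
from statistics import NormalDist


def p_from_z(z, tails):
    """Probability of a z at least this extreme under the null hypothesis."""
    upper_tail = 1 - NormalDist().cdf(abs(z))   # mirror negative z values
    return upper_tail * tails                   # both ends if two-tailed


rows = [("a", 0.78, 1), ("b", 1.97, 2), ("c", 2.56, 2),
        ("d", -2.24, 2), ("e", 1.56, 1), ("f", -1.82, 2)]

for label, z, tails in rows:
    p = p_from_z(z, tails)
    print(label, f"{p:.4f}", "Yes" if p <= 0.05 else "No")
```

For the two-tailed rows the exact values can differ from the table-based answers in the fourth decimal place, because the printed table rounds the one-tailed area before it is doubled. The significance decisions are the same either way.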
Here are the data sets, in SPSS and in MS Excel, for the results that are calculated by hand in this chapter of the book. The related t, unrelated t and single sample t data sets are all contained in the Excel file t test data sheets.xls. The files in SPSS are unrelated t sleep data.sav, related t imagery data.sav and single sample t test data.sav.
<please enter links to these data files here>
unrelated t sleep data.sav
related t imagery data.sav
single sample t test data.sav
t test data sheets.xls
The data files for the non-parametric tests are linked below. The excel file nonparametric test data.xls contains the data for the Mann-Whitney, Wilcoxon and Sign test calculations. The SPSS files are, respectively, mannwhitney stereotype data.sav, wilcoxon module ratings data.sav and sign test therapy data.sav.
<please enter links to these data files here>
mannwhitney stereotype data.sav
wilcoxon module ratings data.sav
sign test therapy data.sav
nonparametric test data.xls
t tests on further data sets
<please provide links to the following files>
t test scenario 1 data.sav
t test scenario 2 data.sav
t test scenario 3 data.sav
t test scenario data.xls
Data sets are provided here that correspond with the three research designs described below. Your first task is to identify which type of t test should be performed on the data for each design: unrelated t test, related t test, or single sample t test.
Scenario 1: Participants are asked to solve one set of anagrams in a noisy room and then solve an equivalent set in a quiet room. The prediction is that participants will perform worse in the noisy room. Data are given in seconds.
Answer: <hidden text> related t test <end hidden text>
Scenario 2: A sample of children is selected from a ‘free’ school where the educational policy is radically different from the norm and where students are allowed to attend classes when they like and are also involved in deciding what lessons will be provided by staff. It is suspected their IQ scores may be lower than the average.
Answer: <hidden text> single sample t test <end hidden text>
Scenario 3: One group of children is asked to complete a scale concerning attitudes to people with disabilities. A second group of children is shown a film about the experiences of people with disabilities and then asked to complete the attitude scale a week later. The research is trying to show that changes in attitude last beyond the limits of the typical short-term laboratory experiment.
Answer: <hidden text> Unrelated t test <end hidden text>
Now conduct the appropriate test on each data set and give a full report of the result including: t value, df, p value (either exact or in the ‘p less than …’ format), 95% confidence limits for the mean difference and effect size.
Answers:
<hidden text>
Scenario 1 (related t)
Scenario 2 (single sample t )
The difference between the sample mean and the population mean was small (-2.32, 95% CI: -6.26 to 1.62, Cohen’s d = 0.15).
Note that here the known population standard deviation of 15 points has been used, so d is 2.32/15 = 0.15
Scenario 3 (unrelated t)
Note: Effect size is calculated using <equation supplied as EqnChap_17.tif> where s is the mean standard deviation for the two groups (sample sizes are equal).
<end hidden text>
Non-parametric tests on the scenario data sets
Select below the appropriate non-parametric tests that can be used on the Scenario 1 and 3 data from the t test exercises. In one scenario more than one appropriate test can be selected.
Scenario 1 (Anagrams in noisy and quiet rooms)
Wilcoxon Mann-Whitney Sign test
Answer: <hidden text>Wilcoxon and Sign test<end hidden text>
Scenario 3 (Control and film groups’ attitudes towards disabled people)
Wilcoxon Mann-Whitney Sign test
Answer: <hidden text>Mann-Whitney<end hidden text>
Now conduct the appropriate test on each data set and give a full report of the result including: T or U, appropriate N values, p value (either exact or in the ‘p less than …’ format) and effect size.
Answers:
<hidden text>
Scenario 1: (Wilcoxon)
The differences between time taken to solve anagrams in the noisy room and time taken in the quiet room were ranked according to size for each participant. A Wilcoxon T analysis on the difference ranks showed a rank total of 139 where noisy room times were higher than quiet room times and a rank total of 71 where quiet room times were higher. Hence, quiet room times were generally lower than noisy room times but this difference was not significant, T (N = 20) = 71, p = .204. The estimated effect size was small to moderate, r = 0.28.
Scenario 1 (Sign test)
For each participant the difference between noisy room and quiet room time was found and the sign of this difference recorded. The 13 cases where quiet room score was less than noisy room score were contrasted with the 7 cases where the difference was in the opposite direction using a sign test analysis. The difference was found not to be significant with S (N = 20) = 7, p = .263.
Scenario 3: Mann-Whitney
The children’s disability attitude scores were ranked as one group. The rank total for the control group was 339.5 whereas the total for the film trained group was 480.5. Using a Mann-Whitney analysis significance was very nearly achieved with U (N = 40) = 129.5, p = .056. The effect size was moderate, r = 0.3.
<end hidden text>
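Two of the reported values can be checked by hand from the figures in the answers, and the sketch below does that arithmetic in Python. The function names are my own. The sign test p is the two-tailed binomial probability of 7 or fewer signs out of 20, and U is found from each group's rank total. For U I have assumed 20 children in each group, which fits N = 40 but is not stated separately for each group.

```python
from math import comb


def sign_test_p(n_less_frequent, n_total):
    """Two-tailed binomial probability for the sign test (chance = .5)."""
    tail = sum(comb(n_total, k) for k in range(n_less_frequent + 1))
    return min(1.0, 2 * tail / 2 ** n_total)


def mann_whitney_u(rank_sum, n_group):
    """U for one group from its rank total: R - n(n + 1) / 2."""
    return rank_sum - n_group * (n_group + 1) / 2


# Scenario 1: 7 of the 20 differences went against the predicted direction.
p_sign = sign_test_p(7, 20)
print(round(p_sign, 3))

# Scenario 3: rank totals 339.5 and 480.5, assuming 20 children per group.
u_control = mann_whitney_u(339.5, 20)
u_film = mann_whitney_u(480.5, 20)
print(min(u_control, u_film))   # the smaller U is the one reported
```

A useful check is that the two U values add up to the product of the two group sizes (here 20 x 20 = 400).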
A 2 x 2 chi-square analysis
Individual passers-by, approaching a pedestrian crossing, are targeted by observers who record whether the person crosses against the red man under two conditions, when no one at the crossing disobeys the red man and when at least two people disobey. The results are recorded in the table below.
| | No jaywalker | At least two jaywalkers | Total |
| Target disobeys light | 16 | 27 | 43 |
| Target obeys light | 43 | 33 | 76 |
| Total | 59 | 60 | 119 |
1. Calculate the expected frequencies for a chi-square analysis. Copy the table below and enter your results.
| | No jaywalker | At least two jaywalkers | Total |
| Target disobeys light | | | 43 |
| Target obeys light | | | 76 |
| Total | 59 | 60 | 119 |
2. Now conduct the chi-square analysis. The data set has not been supplied here since the data are so simple. However, if using SPSS don’t forget to weight cases as described on p. 501 of the book. You need a variable called jaywalkers with two values, ‘none’ and ‘two’. You need a second variable, obeys, with two values ‘no’ and ‘yes’. Make your datasheet show one case for each possible combination and enter the appropriate data into a third column called count. Then select Data/Weight cases and drop the variable count into the weight cases box to the right.
Now enter your result into the spaces below. In each case use three places of decimals and don’t worry if you’re a fraction out. This could be because of rounding decimals in your calculations.
| χ² (1, N = 119) | |
| p value | |
Answers:
<hidden text>
1.
| | No jaywalker | At least two jaywalkers | Total |
| Target disobeys light | 21.3 | 21.7 | 43 |
| Target obeys light | 37.7 | 38.3 | 76 |
| Total | 59 | 60 | 119 |
2.
| χ² (1, N = 119) | 4.122 |
| p value | .042 |
<end hidden text>
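For anyone working without SPSS, here is a minimal Python sketch of the same calculation. Each expected frequency is row total x column total / N, and χ² sums (O - E)²/E over the four cells with no continuity correction. The p value line relies on the fact that with one degree of freedom χ² is a squared z, so it will not work for larger tables.

```python
from math import erfc, sqrt

# Observed frequencies: rows = target disobeys / target obeys,
# columns = no jaywalker / at least two jaywalkers.
observed = [[16, 27],
            [43, 33]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for a cell = row total x column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

# With df = 1, chi-square is a squared z, so p is the two-tailed normal area.
p_value = erfc(sqrt(chi_square / 2))

print([[round(e, 1) for e in row] for row in expected])
print(round(chi_square, 3), round(p_value, 3))
```

Note that the expected frequencies in each row must add up to that row's observed total (43 and 76), which is a quick way to check that you have entered them in the right cells.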
A loglinear analysis
Suppose that the research in Exercise 18.1 is extended to include an extra condition of five or more jaywalkers and to include a new variable of gender. The table below gives fictitious data for such an observational study. Conduct a loglinear analysis on the data outlining all significant results in your results report.
Males
| | No jaywalker | At least two jaywalkers | Five or more jaywalkers | Total |
| Target disobeys light | 21 | 25 | 38 | 84 |
| Target obeys light | 38 | 35 | 22 | 95 |
| Total | 59 | 60 | 60 | 179 |
Females
| | No jaywalker | At least two jaywalkers | Five or more jaywalkers | Total |
| Target disobeys light | 22 | 23 | 29 | 74 |
| Target obeys light | 39 | 37 | 31 | 107 |
| Total | 61 | 60 | 60 | 181 |
Answer:
<hidden text>
A three-way backward elimination loglinear analysis was performed on the frequency data in the table above produced by combining frequencies for jaywalkers, obedience and gender. One-way effects were not significant, likelihood ratio χ² (4) = 5.40, p = .248; two-way effects were significant, likelihood ratio χ² (5) = 11.968, p = .035; the three-way effect was not significant, χ² (2) = 1.789, p = .409. Only the jaywalkers x obedience interaction was significant, χ² (2) = 10.845, p = .004. More people crossed against the light when there were more jaywalkers present.
<end hidden text>
Scatter plots
Have a look at the scatter plots below and select a description in terms of strength (weak, moderate, strong) and direction (positive, negative or curvilinear).
Figure 1
|
<Ex19-1_Fig1.png here>
Figure 2
<Ex19-1_Fig2 here>
Figure 3
<Ex19-1_Fig3 here>
Answers
<hidden text>
Figure 1: strong, positive
Figure 2: moderate, negative
Figure 3: strong, curvilinear
<end hidden text>
Calculating Pearson’s and Spearman’s correlations
You’ll need these data sets for this exercise
correlation.sav <link to correlation.sav>
correlation.xls <link to correlation.xls>
The data set in the file correlation.sav (SPSS) or correlation.xls (Excel) is for you to use to calculate Pearson’s r and Spearman’s ρ (two-tailed) either by hand or using SPSS or a spreadsheet programme. Copy the table below and enter your results, using either p = or p ≤.
Don’t worry if your answer is out by a small amount as this might be due to rounding errors.
| Pearson’s r = | | p = | p ≤ |
| Spearman’s ρ = | | p = | p ≤ |
Answers:
<hidden text>
(please note negative values for correlations)
| Pearson’s r = | -.48 | p = 0.005 | p ≤ 0.01 |
| Spearman’s ρ = | -.492 | p = 0.004 | p ≤ 0.01 |
<end hidden text>
A few questions on correlation
1. Jarrod wants to correlate scores on a general health questionnaire with the subject that students have chosen for their first degree. Why can’t he?
Answer:
<hidden text>
First degree choice is a categorical variable.
<end hidden text>
2. Amy wants to correlate people’s scores on an anxiety questionnaire with their status – married or not married. Can she?
Answer:
<hidden text>
Yes, she can use the point biserial correlation coefficient (though better to conduct a difference test e.g., unrelated t).
<end hidden text>
3. As the number in a sample increases, does the critical value required for significance with p ≤ .05 increase or decrease?
Answer:
<hidden text>
Decreases.
<end hidden text>
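To see what the point biserial coefficient in question 2 amounts to, here is a small sketch in Python. The anxiety scores and the 1/0 coding of marital status are invented purely for illustration, and `pearson_r` is my own helper. The point is that the point biserial correlation is just Pearson's r calculated with the dichotomous variable coded as two numbers.

```python
from math import sqrt

# Hypothetical data for illustration only: anxiety scores, and marital status
# coded 1 = married, 0 = not married.
anxiety = [12, 15, 11, 18, 20, 14, 22, 19, 17, 21]
married = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Point biserial correlation = Pearson's r with the 0/1 coded variable.
r_pb = pearson_r(married, anxiety)
print(round(r_pb, 3))
```

With these made-up scores the coefficient is negative because the group coded 1 has the lower mean; swapping the codes would only change the sign.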
More multiple regression practice
The data set for the multiple regression analysis conducted in the book is called multiple regression data (book).sav and a link to this file is provided below.
multiple regression data (book).sav <link to multiple-regression-data-(book).sav>
multiple regression data (book).xls <link to multiple-regression-data-(book).xls>
A further exercise in multiple regression can be performed using the file multiple regression exercise.sav, which is also provided below. Imagine here that an occupational psychologist has measured ambition, work attitude and absences over the last year and used these to predict productivity over the last three months. If there is good predictive power the set of tests might be used in the selection process for new employees.
multiple regression exercise.sav <link to multiple-regression-exercise.sav>
multiple regression exercise.xls <link to multiple-regression-exercise.xls>
In this exercise please perform the multiple regression analysis in SPSS if you have the programme and then answer the following multiple choice questions:
<Ex19-4_MCQs>
Calculating one-way unrelated ANOVA
You will need these data sets for this exercise
1-way unrelated anova ex.sav <link to 1-way-unrelated-anova-ex.sav>
1-way unrelated anova ex.xls <link to 1-way-unrelated-anova-ex.xls>
These are fictitious data supposedly collected from an experiment in which participants are given (with their permission) either Red Bull (a high caffeine drink), Diet Coke (moderate caffeine) or decaffeinated Coke (no caffeine, i.e., control group). They are then asked to complete a maze task where they have to trace round a maze to find the exit as quickly as possible.
Carry out a one-way ANOVA analysis on the data either in SPSS, using a spreadsheet programme or even by hand and make a full report of results. Include the use of a Tukey b post-hoc test if possible. If you are calculating by hand you could conduct simple effect t tests between two samples at a time and adjust alpha accordingly.
Answer:
<hidden text>
F (2, 34) = 6.661, p = .004 (or < .01)
Scores in the Red Bull group are significantly higher than scores in the caffeine-free group. This is shown by the Tukey b test, which places the Red Bull and caffeine-free samples in different subsets (non-homogeneous), or by t tests. The simple t (22) is 3.76 or calculated by the Bonferroni method t (22) is 3.65. Either way this is highly significant (p < .01).
<end hidden text>
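If you take the by-hand route with pairwise t tests, the 'adjust alpha accordingly' step can be sketched as below. This is only an illustration: it assumes an overall alpha of .05 and a simple Bonferroni division by the number of pairs, and the function name is made up for the example.

```python
from itertools import combinations

# Group labels taken from the exercise description.
groups = ["Red Bull", "Diet Coke", "Decaffeinated Coke"]


def bonferroni_alpha(n_groups, family_alpha=0.05):
    """Per-comparison alpha when every pair of groups is tested once.

    family_alpha = 0.05 is an assumption, not a value given in the exercise.
    """
    n_comparisons = n_groups * (n_groups - 1) // 2
    return family_alpha / n_comparisons


pairs = list(combinations(groups, 2))
adjusted = bonferroni_alpha(len(groups))

for first, second in pairs:
    print(f"{first} vs {second}: test at alpha = {adjusted:.4f}")
```

With three groups there are three pairs, so each t test is judged against .05 / 3 rather than .05.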
Interpreting SPSS results for a one-way ANOVA
Shown below is the SPSS output after a one-way ANOVA has been performed on data where patients leaving hospital have been treated in three different ways, 1. traditionally (the control group), 2. with extra information (leaflet and video) given as they leave hospital and 3. with this information and a home visit from a health professional. The scores represent an assessment of their quality of recovery after three months. Have a go at answering the multiple-choice questions that appear below.
Test of homogeneity of variances
Score
Levene Statistic | df1 | df2 | Sig. |
5.191 | 2 | 36 | .010 |
ANOVA
Score
 | Sum of Squares | df | Mean Square | F | Sig. |
Between Groups | 30.974 | 2 | 15.487 | 5.771 | .007 |
Within Groups | 96.615 | 36 | 2.684 | | |
Total | 127.590 | 38 | | | |
Score
Tukey B
Type of post-op care | N | Subset for alpha = 0.05 | |
 | | 1 | 2 |
Trad. care | 13 | 5.3846 | |
Trad. care + inform. | 13 | 6.7692 | 6.7692 |
Trad. care + inform. + visit | 13 | | 7.5385 |
Means for groups in homogeneous subsets are displayed.
<Ex20-2_MCQs>
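If you want to see how the columns of the ANOVA table relate to one another, the short sketch below recomputes the mean squares and F from the sums of squares and degrees of freedom in the output. The variable and function names are my own; only the numbers come from the table.

```python
# Values copied from the SPSS ANOVA table for the hospital care example.
ss_between, df_between = 30.974, 2
ss_within, df_within = 96.615, 36


def mean_square(sum_of_squares, df):
    """A mean square is a sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


ms_between = mean_square(ss_between, df_between)
ms_within = mean_square(ss_within, df_within)

# F is the ratio of between-groups to within-groups mean squares.
f_ratio = ms_between / ms_within

# The 'Total' row is the sum of the two rows above it.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}, F = {f_ratio:.3f}")
```

Small differences in the last decimal place are to be expected because the table values are already rounded.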
The features of post hoc tests
Try the ‘matching’ quiz – match the test with the appropriate description.
<Ex20-3_Matching.txt>
The Jonckheere trend test
On p. 590 of the book there is a description of the Jonckheere trend test, which directs the reader here for the means of calculation. Below is a table of fictitious (and very minimal) data upon which we will conduct the test. This will tell us whether there is a significant trend for scores to increase across the three conditions from left to right. Assume that participants have been given information about a fictitious person including one criterion piece – in condition 1 that the person doesn’t care about global warming, in condition 2 no information about the person’s attitude is given, and in condition 3 the person cares a lot about global warming. The scores in the columns are the participants’ ratings of how likely they are to like the person.
Experimental conditions
Participant | 1. Doesn’t care | Values to right | 2. No information | Values to right | 3. Cares |
A | 3 | 7 | 2 | 4 | 10 |
B | 5 | 7 | 7 | 3 | 8 |
C | 6 | 7 | 9 | 2 | 7 |
D | 3 | 7 | 8 | 2 | 11 |
Totals: | | 28 | | 11 | |
Procedure | Calculation |
1. For each score count the number of scores that exceed it to the right. Start at the left-hand column. | See the table above. Example: The score of 5 for participant B in the ‘Doesn’t care’ column is exceeded by 7, 9 and 8 in the next column and 10, 8, 7 and 11 in the right hand column, making 7 scores in all. |
2. Add the two count columns. | See the table above (‘Totals’). |
3. Add the two totals and call this value X. | X = 28 + 11 = 39 |
OK that was the easy part. Now things are rather tricky when we want to check if our value of X is significant. There are tables for this test but they only go up to n = 10 in each condition and you have to have the same number in each condition – a rare circumstance. We need to enter our value of X then into the following equation (take a deep breath):
<equation supplied as EqnChap_20 (1).tif>
Σninj means multiply every possible pair of sample sizes and add the products. If we had sample sizes of 4, 6 and 7 this would mean we found (4 x 6) + (4 x 7) + (6 x 7). In our case here though this is just (4 x 4) + (4 x 4) + (4 x 4) = 48.
Σ(n²) would be 4² + 6² + 7² but in our case is 3 x 4² = 48
Σ(n³) would be 4³ + 6³ + 7³ but for us it is 3 x 4³ = 192
N is the total sample size, i.e., 3 x 4 = 12
The top of the equation is 2X – Σninj – 1 = 78 – 48 – 1 = 29 and, underneath, N²(2N + 3) = 144 x 27 = 3888.
Our z value then is = 29/√[1/18 x (3888 – 144 – 384)] = 29/√(3360/18) = 29/√186.67 = 29/13.66 = 2.12 <first part of the equation supplied as EqnChap_20 (2).tif>
A z value of 2.12 cuts off .0170 of the area of the normal distribution at either end (check in Appendix table 2 of the book) and this means that our overall p is 2 x .0170 for a two-tailed test = .0340 so we have a significant trend!
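If you would rather let a computer do the counting, the whole calculation can be sketched in a few lines. Treat this as an illustration only: the form of the z equation used here, (2X – Σninj – 1) divided by the square root of (1/18)[N²(2N + 3) – 3Σ(n²) – 2Σ(n³)], is inferred from the worked figures rather than copied from the equation image, and the function names are my own. The cube term is computed as 3 x 4³ = 192, which gives a z of about 2.12.

```python
from math import sqrt
from statistics import NormalDist

# Ratings from the data table, ordered by predicted trend (lowest first).
conditions = [
    [3, 5, 6, 3],    # 1. Doesn't care
    [2, 7, 9, 8],    # 2. No information
    [10, 8, 7, 11],  # 3. Cares
]


def jonckheere_x(groups):
    """For every score, count the scores in later groups that exceed it."""
    total = 0
    for i, group in enumerate(groups):
        later = [s for g in groups[i + 1:] for s in g]
        total += sum(1 for score in group for other in later if other > score)
    return total


def jonckheere_z(groups):
    """Normal approximation; the formula is an assumption (see the text above)."""
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    sum_pairs = sum(
        sizes[i] * sizes[j]
        for i in range(len(sizes))
        for j in range(i + 1, len(sizes))
    )
    sum_sq = sum(n ** 2 for n in sizes)
    sum_cube = sum(n ** 3 for n in sizes)
    numerator = 2 * jonckheere_x(groups) - sum_pairs - 1
    variance = (n_total ** 2 * (2 * n_total + 3) - 3 * sum_sq - 2 * sum_cube) / 18
    return numerator / sqrt(variance)


x = jonckheere_x(conditions)
z = jonckheere_z(conditions)
p_two_tailed = 2 * (1 - NormalDist().cdf(z))
print(f"X = {x}, z = {z:.2f}, two-tailed p = {p_two_tailed:.3f}")
```

Ties are not counted as 'exceeding', which matches the hand count in the table (participant B's 7 in the middle column is not exceeded by the 7 in the right hand column).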
Calculating two way unrelated ANOVA on a new data set
The data set used to calculate the example of a two-way unrelated ANOVA in this chapter is provided below and is named two way unrelated (book).sav. An Excel file with the same name is also provided.
Two way unrelated (book).sav <link to two-way-unrelated-(book).sav>
Two way unrelated (book).xls <link to Two-way-unrelated-(book).xls>
The data set provided below (two-way unrelated ex) is one of fictitious data from a research project on leadership styles. Each participant has an LPC score, which stands for ‘least preferred co-worker’. People with high scores on this variable are able to get along with and accept relatively uncritically even those workers whom they least prefer to interact with. Such people make good leaders when situations at work are difficult (they are ‘people oriented’). By contrast low LPC people make good task leaders and are particularly effective when working conditions are good but tend to do poorly as leaders when conditions are a little difficult.
The variables in the file are sitfav with levels of highly favourable and moderately favourable (work situation) and lpclead with levels of high and low being the categories of high and low LPC scorers. Hence in these results we would expect to find an interaction between situation favourability and LPC leadership category. High LPC people should do well in moderately favourable conditions whereas low LPC people should do well in highly favourable conditions. Let’s see what the spoof data say. Conduct a two-way unrelated ANOVA analysis, including relevant means and standard deviations, and checking for homogeneity of variance and for effect sizes and power for each test.
2 way unrelated ex.sav <link to 2way-unrelated-ex.sav>
2 way unrelated ex.xls <link to Two-way-unrelated-ex.xls>
The answers I got are revealed when you select the button below.
<Hidden text>
The main effect for LPC leadership is not significant (overall one type of leader did no better than the other), F(1, 20) = 0.381, p = .544. The main effect for situation was also not significant (leadership performances overall were similar for highly and moderately favourable conditions), F(1, 20) = 1.5, p = .366. However, there was a significant interaction between situation and leadership type. High LPC leaders (M = 5.33, SD = 1.03) scored lower than low LPC leaders (M = 6.5, SD = 1.64) in highly favourable conditions, whereas they scored higher (M = 7.0, SD = 1.41) than low LPC leaders (M = 5.5, SD = 1.05) in moderately favourable conditions, F(1, 20) = 6.214, p = .022. Levene’s test for homogeneity of variance was not significant so homogeneity was assumed. Partial η² for the interaction was .237 with power estimated at .660.
<end hidden text>
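A handy check on your own output is that partial eta squared can be recovered from an F value and its two degrees of freedom, because it equals the effect sum of squares divided by the effect plus error sums of squares. The sketch below applies that identity to the interaction reported in the answer; the function name is my own and nothing here is SPSS syntax.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    Uses SS_effect / (SS_effect + SS_error), rewritten in terms of F and df.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction from the two-way unrelated exercise: F(1, 20) = 6.214
interaction = partial_eta_squared(6.214, 1, 20)
print(f"partial eta squared = {interaction:.3f}")
```

If the value you calculate this way disagrees with the effect size in your printout, it is worth rechecking which F and which error term you have read off.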
Interpreting an SPSS output for a two-way unrelated analysis
Here is part of the SPSS output data for a quasi-experiment in which participants were grouped according to their attitude towards students. This is the ‘attitude group’ variable in the display below. Each group was exposed to some information about a fictitious person including their position on reintroducing government grants to students. Participants were later asked to rate the person on several characteristics including ‘liking’. It can be assumed for instance that participants who were pro students would show a higher liking for someone who wanted to introduce grants than someone who didn’t. Study the printout and try to answer the questions below.
Levene's test of equality of error variances
Dependent Variable: liking
F | df1 | df2 | Sig. |
2.757 | 5 | 41 | .031 |
Tests of between-subjects effects
Dependent Variable: liking
Source | Type III sum of squares | df | Mean square | F | Sig. |
Corrected Model | 114.601a | 5 | 22.920 | 7.947 | .000 |
Intercept | 1880.558 | 1 | 1880.558 | 652.033 | .000 |
information | 3.670 | 2 | 1.835 | .636 | .534 |
attitudegroup | 15.953 | 1 | 15.953 | 5.531 | .024 |
information * attitudegroup | 93.557 | 2 | 46.778 | 16.219 | .000 |
Error | 118.250 | 41 | 2.884 | | |
Total | 2135.000 | 47 | | | |
Corrected total | 232.851 | 46 | | | |
a. R Squared = .492 (Adjusted R Squared = .430)
<Ex21-2_MCQs>
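For anyone curious where the footnote values under the between-subjects table come from, the sketch below rebuilds R squared, adjusted R squared and the interaction F from the sums of squares and degrees of freedom shown in the output. The variable names are my own, and the adjusted R squared line uses the usual ratio of error mean square to total mean square, which I am assuming is what SPSS reports here.

```python
# Values copied from the 'Tests of between-subjects effects' table.
ss_model = 114.601
ss_error, df_error = 118.250, 41
ss_corrected_total, df_corrected_total = 232.851, 46
ss_interaction, df_interaction = 93.557, 2

# Proportion of the corrected total variation accounted for by the model.
r_squared = ss_model / ss_corrected_total

# Adjusted R squared compares mean squares instead of sums of squares
# (assumed to be the formula behind the footnote).
ms_error = ss_error / df_error
ms_total = ss_corrected_total / df_corrected_total
adjusted_r_squared = 1 - ms_error / ms_total

# Each F in the table is that effect's mean square over the error mean square.
f_interaction = (ss_interaction / df_interaction) / ms_error

print(f"R squared = {r_squared:.3f}")
print(f"Adjusted R squared = {adjusted_r_squared:.3f}")
print(f"F (interaction) = {f_interaction:.3f}")
```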
The data sets used to calculate the repeated measures examples in this chapter are provided below.
<Please supply a link to the following data files:>
Repeated measures (book).sav
Repeated measures (book).xls
Mixed (book).sav
Mixed (book).xls
2way repeated (book).sav
2way repeated (book).xls
Calculating a one-way repeated measures ANOVA example
You will need the following data set to complete this exercise:
repeated measures 1-way ex.sav <link to repeated-measures-1-way-ex.sav>
repeated measures 1-way ex.xls <link to repeated-measures-1-way-ex.xls>
The file repeated measures 1-way ex.sav (SPSS) or repeated measures 1-way ex.xls (Excel) contains data for a fictitious study in which new employees were assessed for efficiency in their similar jobs after one month, six months and twelve months. Calculate the one-way repeated measures results and compare with the answer given below. The repeated measures variable is contained in the columns entitled efficiency1, efficiency6 and efficiency12. In SPSS use the General linear model menu item. Don’t forget to employ Mauchly’s test for sphericity (check the Options button).
Answer
<hidden text>
The means (and standard deviations) of the efficiency scores after 1 month, 6 months and 12 months respectively were M = 38.3 (3.91), M = 41.5 (6.51) and M = 46.1 (6.92). The means differed significantly with F(2, 30) = 8.247, p = .001, effect size (η²) = .355. Mauchly’s test was not significant, p = .49.
<end hidden text>
Calculating a two-way mixed design ANOVA example
You will need this data set to carry out this exercise:
repeated measures mixed.sav <link to repeated-measures-mixed.sav>
repeated measures mixed.xls <link to repeated-measures-mixed.xls>
In this exercise you can tackle a two-way mixed design where there is one repeated measures factor (efficiency from the last exercise) and one between groups factor. This new factor is one of training.
Imagine the new employees in the last exercise were randomly divided into a group that received no training, one that received training and one that received training and some team building exercises early on in their employment at the company. You need the file repeated measures mixed.sav (SPSS) or repeated measures mixed.xls (Excel).
Conduct the two-way analysis and see if you get the same findings as the report below. Ignore the column headed ‘graduate’ for now. Make sure you inspect the table of means (by asking for Descriptive statistics under the Options button in SPSS). You need the efficiencyaverage variable to calculate overall efficiency means for each training group.
Answer
<hidden text>
There was a main effect for efficiency with the means rising from M = 44.8, SD = 5.91 at one month, through M = 45.8, SD = 7.26 at six months to M = 47.8, SD = 6.80 at twelve months. F(2, 90) = 3.824, p = .025, effect size (partial η²) = .078.
There was a main effect for training with means of M = 42.3, SD = 3.41 for the untrained group, M = 47.1, SD = 4.31 for the trained group and M = 49.0, SD = 4.41 for the trained and team building group. F(2, 45) = 11.564, p < .001, effect size (partial η²) = .993.
The interaction was not significant. Sphericity was at an acceptable level (p = .299). Levene’s test for homogeneity of variance was significant for efficiency1 so equality of variances was not assumed for this variable.
<end hidden text>
If you’re really feeling adventurous you could try the three-way mixed ANOVA that is produced by including the factor of graduate. This tells us whether the participant was a graduate or not. I have only provided brief details of results below but enough to let you see you’ve performed the analysis correctly.
Answer:
<hidden text>
Main effect efficiency F(2, 84) = 4.018, p = .022, η² = .087
Main effect training F(2, 42) = 15.433, p < .001, η² = .424
Interaction efficiency x training not significant
Interaction efficiency x graduate not significant
Interaction training x graduate significant F(2, 42) = 7.708, p = .001, η² = .268
(overall, graduates did better than non-graduates if not trained or trained only, but with team building too they did worse!).
Three-way interaction efficiency x training x graduate just significant F(4, 84) = 2.492, p = .049, η² = .106 (it seems that for team building and training, graduates improved more across the three times than non-graduates and, for training only, non-graduates improved more than graduates).
<end hidden text>
Calculation of a two-way repeated measures ANOVA
You will need this data set to carry out this exercise:
2-way repeated-ex.sav <link to 2way-repeated-ex.sav>
2-way repeated-ex.xls <link to 2way-repeated-ex.xls>
The data set these files are based on is an experiment where participants undergo the Stroop experience. Stroop was the psychologist responsible for demonstrating the dramatic effect that occurs when people are asked to name the colour of the ink in which words are written – there is a big problem if the word whose colour you are naming is a different colour word (e.g., red written in green – an ‘incongruent’ colour word)! People take much longer to name the ink colour of a set of such words than they do to name the colours of ‘congruent’ words (colour words written in the ink colour of the word they spell).
A further refinement of the experiment, based on a theory of sub-vocal speech when reading, is the prediction that words that sound like colour words (such as ‘shack’ or ‘crown’) should also produce some interference, if incongruent, thus lengthening times to name ink colours. The Stroop factor of this experiment then involves three conditions: naming the ink colour of congruent words; naming the ink colour of incongruent words sounding like colour words; and naming the ink colour of incongruent colour words.
In the imaginary experiment here we have introduced a second factor, which is that people perform the three Stroop tasks both alone and then in front of an audience. The data are presented as a 2 x 3 repeated measures design so there are six columns of raw data, the numbers being number of seconds to read the list of words. Control is naming the ink colour of congruent words, rhyme uses words sounding like incongruent colour words and colour uses incongruent colour words. The end part of each variable refers to the audience conditions, alone if no audience and aud with an audience observing.
Remember that in SPSS you have to name the two repeated measures factors then carefully select columns when asked to define the levels of each variable. If you enter the repeated measures variable names as first ‘stroop’, then ‘audience’ you will be asked to identify variables in the order stroop 1, audience 1, stroop 1, audience 2 and so on, so that’s controlalone, controlaud, rhymealone … and so on. You will need the three extra mean columns when looking at the differences related to the Stroop main effect.
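The order in which SPSS asks for the columns is simply every combination of the two factors with the second-named factor changing fastest. A tiny sketch makes the pattern visible. Note that only controlalone, controlaud and rhymealone are named in the text above; the remaining three column names are built on the same pattern and are my assumption about what the file contains.

```python
from itertools import product

# Levels in the order they would be defined in SPSS.
stroop_levels = ["control", "rhyme", "colour"]
audience_levels = ["alone", "aud"]

# With 'stroop' named first, 'audience' cycles fastest:
# (stroop 1, audience 1), (stroop 1, audience 2), (stroop 2, audience 1), ...
column_order = [stroop + audience for stroop, audience in product(stroop_levels, audience_levels)]

for position, name in enumerate(column_order, start=1):
    print(position, name)
```

If you named ‘audience’ first instead, swapping the two arguments to product would show the alternative order.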
Carry out the two-way analysis, remembering to check Mauchly’s sphericity statistic and to ask for descriptive statistics so you can see the mean of each level of each variable.
Answer:
<hidden text>
The main effect for Stroop is basically massive (as it nearly always is). The overall means for the three conditions were control M = 44.5 (SD = 14.54), rhyme M = 58.9 (SD = 14.82) and colour M = 97.7 (SD = 20.47). F(2, 18) = 34.873, p < .001, partial η² = .795
There was no effect for audience and the interaction stroop x audience was not significant. Sphericity was not a problem.
<end hidden text>
Questions on SPSS results for a two-way ANOVA
The table below shows part of the SPSS output for a two-way ANOVA calculation. Extroverts and introverts (factor extint) have been asked to perform a task more than once during the day to see whether extroverts improve through the day and introverts worsen.
 | df | F | p | Effect size η² |
Performance | 2 | 3.795 | .026 | .073 |
Performance x extint | 2 | 23.225 | .000 | .326 |
Error (performance) | 96 | | | |
Extint | 1 | .026 | .872 | .001 |
Error | 48 | | | |
<Ex22-4_MCQs>
The Page trend test
On p. 631 of the book there is a short description of the Page trend test, which is used when you have three or more related sets of data and you want to see whether they follow a trend across conditions. We’ll use the (very minimal) data below as an example. Imagine children have been tested for reading improvement on three successive occasions. We want to see if there is significant improvement.
Reading scores for four children tested three times
Score | Rank | Score | Rank | Score | Rank |
3 | 2 | 2 | 1 | 10 | 3 |
5 | 1 | 7 | 2 | 8 | 3 |
6 | 1 | 9 | 3 | 7 | 2 |
3 | 1 | 8 | 2 | 11 | 3 |
Totals: | Ra = 5 | | Rb = 8 | | Rc = 11 |
Step 1. First we calculate a statistic, L: <equation supplied as EqnChap_22 (1).tif> where Rk is the total of each rank column and k is the predicted order of that column. For instance when k is 1 the total is 5 and the predicted order of that column was 1 (we expect children to be lowest here).
Hence L = (5 x 1) + (8 x 2) + (11 x 3) = 54
Step 2. Again there are tables for Page but only going up to N = 10. For any value of N we can use the formula: <equation supplied as EqnChap_22 (2).tif> where n is the sample size and k is the number of conditions, so here we get: [(12 x 54) – (36 x 16)]/√(36 x 8 x 4) = 72/33.94 = 2.12
1.96 is the critical value for z at .05, two-tailed so this would be a significant trend.
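The two steps can also be sketched in code. As before this is an illustration rather than the book's own method: the z formula, [12L – 3nk(k + 1)²] divided by the square root of nk²(k² – 1)(k + 1), is inferred from the worked figures (36 x 16 and 36 x 8 x 4) rather than taken from the equation image, the function names are my own, and the ranking helper assumes no child has tied scores.

```python
from math import sqrt

# Each row is one child; columns are the three test occasions in predicted order.
scores = [
    [3, 2, 10],
    [5, 7, 8],
    [6, 9, 7],
    [3, 8, 11],
]


def rank_row(row):
    """Rank one child's scores from 1 (lowest). Assumes no ties within the row."""
    ordered = sorted(row)
    return [ordered.index(value) + 1 for value in row]


def page_l(data):
    """Sum of (rank total for each column x predicted position of that column)."""
    ranks = [rank_row(row) for row in data]
    totals = [sum(column) for column in zip(*ranks)]
    return sum(total * position for position, total in enumerate(totals, start=1))


def page_z(l_value, n, k):
    """Normal approximation for Page's L (formula assumed, see the text above)."""
    numerator = 12 * l_value - 3 * n * k * (k + 1) ** 2
    denominator = sqrt(n * k ** 2 * (k ** 2 - 1) * (k + 1))
    return numerator / denominator


l_value = page_l(scores)
z = page_z(l_value, n=len(scores), k=len(scores[0]))
print(f"L = {l_value}, z = {z:.2f}")
```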
Identifying simple two-condition designs
Some two-condition research designs are outlined below. Your job is to read the information (all of it!) and decide which test it is most appropriate to use. You should read the criteria for selecting tests contained in the first part of Chapter 23 before attempting the exercise. The tests that are possible are listed in the table below. Select parametric tests unless there is information contrary to their use.
Related t |
Unrelated t |
Single sample t |
Mann-Whitney U |
Wilcoxon T |
Pearson correlation |
Spearman correlation |
Chi-square |
Sign test |
1. Children are classified as high or low text-message senders and researchers investigate whether their scores on a reading test differ significantly.
Answer: <hidden text>unrelated t <end hidden text>
2. The same children as in (1) are recorded as extroverts or introverts with the enquiry being: do extroverts send more texts than introverts?
Answer: <hidden text>chi-square <end hidden text>
3. Students are tested for self-esteem before and after the exam period to see whether there is a significant change in self-esteem.
Answer: <hidden text>related t <end hidden text>
4. A researcher believes that stress has an effect on physical health and so measures people’s stress levels with a questionnaire and records the number of times they have visited the doctor with minor ailments over the past two years. She believes stress levels will predict number of visits.
Answer: <hidden text>Pearson correlation <end hidden text>
5. A class of school pupils is asked to solve a set of simple maths problems, each working on their own and with a prize for the best performance. They are then asked to solve similar problems, but this time they are told they are working as a group and will receive a prize if they beat other groups. The dependent variable is the difference between their two performances and it is found that these scores are very different from a normal distribution in terms of kurtosis.
Answer: <hidden text>Wilcoxon <end hidden text>
6. The same researcher as in (4) tests the hypothesis that stress affects self-esteem and expects higher stress to produce lowered self-esteem scores. In this project she finds the self-esteem scores are heavily skewed and cannot remove this with a transformation.
Answer: <hidden text>Spearman’s correlation <end hidden text>
7. A sample of people is found who have just completed their second degree. A researcher is interested in whether their second degree category is better than their first degree category. Since degree grades are categorical the only record is whether the second degree was better or worse than the first.
Answer: <hidden text>Sign test<end hidden text>
8. Participants are divided into two groups. One group is asked to doodle (by filling in letters) while listening to a guest list of names invited to a party. A control group does the same task without doodling. It is predicted that the doodle group will perform better when asked to recall as many names as possible. The variances of the two groups are very different and there are quite different numbers of participants in each group.
Answer: <hidden text>Mann-Whitney U <end hidden text>
Identifying ANOVA designs
From the following brief descriptions of research designs try to identify the ANOVA type with factors and levels. For instance, an answer might be one-way, unrelated or 2 x 3 x 2 mixed.
One-way | | Unrelated |
2 x 3 | | Repeated measures |
3 x 3 | | Mixed |
2 x 3 x 2 | | |
3 x 4 | | |
1. Participants are asked to rate a fictitious person having been told they are either pro-hanging, anti-hanging or neutral.
Answer: <hidden text> One-way unrelated <end hidden text>
2. Participants are presented with both positive and negative traits for later recall and are induced into either a depressed, neutral or elated mood.
Answer: <hidden text> 2 x 3 mixed <end hidden text>
3. All participants are asked to name colours of colour patches, non-colour words and colour words.
Answer: <hidden text> One-way repeated measures<end hidden text>
4. Male and female clients experience either psychoanalysis, behaviour modification or humanistic therapy and effects are assessed.
Answer: <hidden text>2 x 3 unrelated <end hidden text>
5. Participants are given either coffee, alcohol or a placebo and are all asked to perform a visual monitoring task under conditions of loud, moderate, intermittent and no noise.
Answer: <hidden text>3 x 4 mixed <end hidden text>
6. Older or younger participants are asked to use one of three different memorising methods.
Answer: <hidden text>2 x 3 unrelated <end hidden text>
7. Extroverts and introverts are tested at weekly intervals and are asked to perform an energetic and then a dull task after being given first a stimulant, then a tranquiliser and then a placebo.
Answer: <hidden text>2 x 2 x 3 mixed <end hidden text>
8. Participants perform tasks in front of an audience and when alone. They are first asked to sort cards into three piles, then four piles and finally five piles.
Answer: <hidden text>2 x 3 repeated measures <end hidden text>
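The naming convention used in these answers is mechanical enough to write down as a small function: list the number of levels of each factor, then say whether the factors are all between groups (unrelated), all within participants (repeated measures) or a mixture. The sketch below is just one way of expressing that rule; the function and its input format are my own invention.

```python
def describe_design(factors):
    """Name an ANOVA design.

    factors: list of (number_of_levels, is_repeated) tuples, one per factor.
    """
    repeated_flags = [is_repeated for _, is_repeated in factors]
    if all(repeated_flags):
        kind = "repeated measures"
    elif not any(repeated_flags):
        kind = "unrelated"
    else:
        kind = "mixed"

    if len(factors) == 1:
        return f"One-way {kind}"
    levels = " x ".join(str(n_levels) for n_levels, _ in factors)
    return f"{levels} {kind}"


# Design 1: one between-groups factor with three levels.
print(describe_design([(3, False)]))
# Design 2: trait type (2 levels, repeated) by mood (3 levels, between groups).
print(describe_design([(2, True), (3, False)]))
# Design 8: audience (2 levels) by number of piles (3 levels), both repeated.
print(describe_design([(2, True), (3, True)]))
```

The hard part of the exercise, of course, is deciding from the description which factors are repeated; the function only does the labelling once you have made that decision.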
Identifying problematic report statements
In the extracts from students’ psychology practical reports below try to describe what is dubious about the statement before checking the answers.
1. An experiment to see whether giving people coffee, decaffeinated coffee or water will have an effect on their memory of 20 items in a list.
Answer: <hidden text>Far too long-winded and could be stripped nicely down to: ‘The experimental effect of caffeine on recall memory’. <end hidden text>
2. The design was an experiment using different types of drink …
Answer: <hidden text>What kind of experiment? It’s true that we might be told later what kinds of drink were used but why not just explicitly state the levels of the independent variable straight away? <end hidden text>
3. 20 students were selected at random and asked …
Answer: <hidden text>Hardly likely that they were selected truly at random. Probably ‘haphazardly’. <end hidden text>
4. Materials used were a distraction task, a questionnaire, mirrors …
Answer: <hidden text>Never list materials, use normal prose description. <end hidden text>
5. The results were tested with a t test …
Answer: <hidden text>Which results? Even in the simplest studies there are always several ways in which the data could be tested. We could, for instance, test the difference between standard deviations rather than means. Usually though the reader needs to know explicitly which means were tested – there are usually more than just two, and anyway ‘results’ is just vague. <end hidden text>
6. Miller (2008) stated that “There is no such thing as a loving smack. The term is an oxymoron. No child feels love as they are being beaten or slapped.”
Answer: <hidden text>No page number for the quotation. <end hidden text>
7. The result proved that …
Answer: <hidden text>We never use ‘prove’ in psychology, or in most practical sciences for that matter. Findings usually support a hypothesis or theory, or they challenge it. <end hidden text>