Albert et al. (2018). Effect of Clustering Illusion during the Interaction with a Visual Analytics Environment. Research data of the 2017 study.
Creator: Dietrich, Albert
Contributor: Dietrich, Albert; Bedek, Michael; Huszar, Luca; Nussbaumer, Alexander
Funding: European Union 7th Framework Programme (FP7/2007-2013) under grant agreement no. FP7-IP-608142 https://cordis.europa.eu/project/rcn/188614_de.html
Title: Effect of Clustering Illusion during the Interaction with a Visual Analytics Environment. Research data of the 2017 study
Year of Publication: 2018
Citation: Albert, D., Bedek, M., Huszar, L., & Nussbaumer, A. (2018). Effect of Clustering Illusion during the Interaction with a Visual Analytics Environment. Research data of the 2017 study [Translated Title] (Version 1.0.0) [Data and Documentation]. Trier: Center for Research Data in Psychology: PsychData of the Leibniz Institute for Psychology ZPID. https://doi.org/10.5160/psychdata.dhat17ef10
Clustering illusion is a cognitive bias defined as the tendency to see patterns where none exist (Gilovich, 1991; Gilovich, Vallone, & Tversky, 1985). This tendency can be observed when people interpret patterns or trends in random distributions. In the context of the VALCRI (Visual Analytics for Sense-making in CRiminal Intelligence analysis) project, eight cognitive biases have been identified which may influence the decision-making process of analysts. Assessment methods exist for other cognitive biases, but not for the clustering illusion. Based on the study of Cook and Smallman (2007), who studied how cognitive biases affect a JIGSAW (Joint Intelligence Graphical Situation Awareness Web) system, a task that makes it possible to detect the clustering illusion in a visual analytics environment was created. The task was as follows: Participants interacted with a selected set of tools from a visual analytics environment. These tools showed the spatial and chronological distribution of crime incidents in two city districts of Birmingham. In each city district, there were 30 crime incidents. A 2×2 design of “random vs. pattern condition” and “interactive vs. static condition” was used to detect the influence of patterns and of the level of interaction on the participants' decision-making: In the random condition, the crime incidents were randomly selected from a large set of incidents. In the pattern condition, the incidents were selected so that there were increases or decreases over time and a spatial concentration of incidents in one of the two city districts. In the interactive condition, participants were allowed to interact with the tools to inspect the incidents from different perspectives. In the static condition, participants were asked to inspect the incidents as shown on the screen without interacting with the tools.
After inspecting the incidents for ten minutes, the participants were asked (i) to decide whether they would increase police presence in city district A or in city district B, (ii) to rate the certainty of their decision, (iii) to state whether their decision was based on the data or on patterns and trends in the data, and if so, (iv) whether they could justify their decision. The univariate analysis of variance showed no significant difference between the random and pattern conditions or between the interactive and static conditions, and no interaction. A significant correlation between the certainty of the decision and justifying the decision with facts (r = .364, p < .001) was found.
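The reported correlation relates a certainty rating to a yes/no justification variable. The following is a minimal sketch of how such a coefficient could be computed, assuming a Pearson (point-biserial) correlation; the record does not state which coefficient or software was used, and the function names are hypothetical.

```python
from math import sqrt


def justification_to_indicator(code):
    """Map the codebook values (1 = yes, 2 = no) to 1/0; anything else becomes None."""
    return {1: 1, 2: 0}.get(code)


def pearson_r(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

With the yes/no variable recoded to 1/0, a positive coefficient means that higher certainty goes together with being able to justify the decision. Observations with a missing value in either variable would have to be dropped before calling `pearson_r`.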
Justification of decision (Begründung der Entscheidung)
1 "yes (ja)"
2 "no (nein)"
999 "missing value (Fehlender Wert)"
The highest educational level of the participants (Höchste abgeschlossene Ausbildung)
2 "Higher School Certificate"
3 "University degree"
9 "missing value (Fehlender Wert)"
Number of task (Aufgabennummer)
1 "Data in Coventry is ordered in pattern, data in Wolverhampton ordered randomly (Daten in Coventry sind in Mustern, die Daten in Wolverhampton zufällig geordnet)"
2 "Data in Wolverhampton is ordered in pattern, data in Coventry ordered randomly (Daten in Wolverhampton sind in Mustern, die Daten in Coventry zufällig geordnet)"
3 "Data in both districts are ordered randomly (Daten sind zufällig geordnet)"
4 "Data in both districts are ordered randomly (Daten sind zufällig geordnet)"
1 "In the morning 4 - 10 o'clock (In der Früh 4 - 10 Uhr)"
2 "During the day 10 - 16 o'clock (Tagsüber 10 - 16 Uhr)"
3 "In the evening 16 - 22 o'clock (Am Abend 16 - 22 Uhr)"
4 "At night 22 - 4 o'clock (During the night 22 - 4 Uhr)"
999 "missing value (Fehlender Wert)"
Chosen day of the week - Sunday (Ausgewählter Wochentag - Sonntag)
Justification of the decision - open answer (coded) (Begründung der Entscheidung - offene Antwort (kodiert))
1 "time and number of cases in relation"
2 "time distribution"
3 "spatial and time distribution"
4 "spatial distribution"
6 "attempt to create a hypothesis"
7 "statement about tendencies"
8 "statement about distribution"
9 "couldn't answer"
999 "missing value (Fehlender Wert)"
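Note that the missing-value code differs between variables: 999 in most of them, but 9 for the educational level, while 9 is a valid category ("couldn't answer") in the coded open answer. A minimal sketch of a per-variable recoding step follows. The variable names are hypothetical, because the codebook lists labels and value codes but not the column names used in the data file.

```python
from typing import Optional

# Missing-value codes as listed in the codebook; the keys are hypothetical
# variable names chosen for illustration only.
MISSING_CODES = {
    "justification": {999},
    "education": {9},
    "time_of_day": {999},
    "justification_coded": {999},
}


def recode_missing(variable: str, value: int) -> Optional[int]:
    """Return None for a declared missing code, otherwise the value unchanged."""
    if value in MISSING_CODES.get(variable, set()):
        return None
    return value


def recode_row(row: dict) -> dict:
    """Apply recode_missing to every variable of one participant record."""
    return {name: recode_missing(name, value) for name, value in row.items()}
```

Keeping the codes per variable avoids treating the valid category 9 of the coded open answer as missing.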
The following hypotheses were tested:
Participants will be more certain of their answers and better able to justify them in the pattern condition than in the random condition.
Participants will be more certain of their answers and better able to justify them in the interactive condition than in the static condition, because they could, in principle, falsify the clustering illusion.
Research Design: Experimental design, Repeated Measurement Design, Laboratory Experiment; single measurement
The study took place in an examination room of the Graz University of Technology. First, all participants filled out an informed consent form. After that, they were introduced to the Visual Analytics for Sense-making in CRiminal Intelligence analysis (VALCRI) platform by means of a short user manual. The participants had ten minutes to become familiar with the user interface of the platform. This introduction phase was followed by a general instruction about the tasks. The participants had ten minutes to complete each task. The instructor reminded participants of the time limit after five and after nine minutes. After each task, a short break was given. At the end of the fourth task, the participants filled out a short demographic questionnaire. The order of the static and interactive conditions and of the four tasks was balanced. The experiment took one hour, and all participants received a 10 euro expense allowance.
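The record states that the order of the tasks and conditions was balanced but not how. One common way to do this is a cyclic Latin square, sketched below as an assumption rather than as the procedure actually used; the function names and the rule for alternating the first condition are hypothetical.

```python
def latin_square(items):
    """Cyclic Latin square: every item occurs once in every row and column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(n_participants, tasks=(1, 2, 3, 4)):
    """Assign each participant a task order and the condition to start with."""
    square = latin_square(list(tasks))
    schedule = []
    for p in range(n_participants):
        task_order = square[p % len(square)]
        # Alternate the starting condition after every full pass through the square.
        first_condition = "static" if (p // len(square)) % 2 == 0 else "interactive"
        schedule.append((p + 1, task_order, first_condition))
    return schedule
```

For 32 participants, this scheme uses each of the four task orders eight times and starts half of the participants in each condition.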
Material and Setting:
For this study, the VALCRI (http://valcri.org/about-valcri/) platform was used. This platform aims to support the work of law-enforcement agencies. It was designed to visualise large amounts of data about criminal incidents. First, 240 crime incidents in Birmingham's city districts Coventry and Wolverhampton from the last 6 months (January 2017 to July 2017) were randomly selected. After this pre-selection, four examples with 60 crime incidents, 30 per district, were created.
Two of the examples were selected randomly to show no temporal and spatial patterns. The other two examples were created with a noticeable temporal and spatial pattern in one of the two city districts. All of the selected data sets included the following tools: The Location tool, the Time tool and the List tool. The Location tool indicates the spatial distribution of crime incidents, the Time tool shows the temporal distribution of crime incidents and the List tool provides some details about the incidents. These tools have been selected because they are the most relevant ones for the daily work of analysts.
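For the two random examples, the selection step can be pictured as drawing 30 incidents per district from the pre-selected pool. The sketch below assumes a pool of records with hypothetical `id` and `district` fields; it does not reproduce the actual selection procedure and does not cover the construction of the pattern examples.

```python
import random


def draw_random_example(pool, per_district=30, seed=None):
    """Draw `per_district` incidents without replacement for each district."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    example = []
    for district in ("Coventry", "Wolverhampton"):
        candidates = [inc for inc in pool if inc["district"] == district]
        example.extend(rng.sample(candidates, per_district))
    return example
```

`random.Random.sample` raises a `ValueError` if a district has fewer candidates than requested, so an undersized pool is detected rather than silently producing a smaller example.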
After each task, the participants were asked (i) to decide in which district they would increase the police presence, (ii) to rate how sure they were of their decision, (iii) to state whether their decision was based on the data and, if so, (iv) whether they could justify their decision.
Data Collection Method:
Data collection in the presence of an experimenter – Specialized Apparatuses or Measuring Instruments, namely VALCRI Software
Population: Primarily university students
Survey Time Period:
Sample: Convenience sample
68.75% female subjects (n=22), 31.25% male subjects (n=10)
Subject Recruitment: The participants were recruited via different social media channels and the internal mail system of the University of Graz. The participants received a 10 euro expense allowance.
Sample Size: 32 individuals
Bedek, M., Nussbaumer, A., Hillemann, E-C., & Albert, D. (2017). A Framework for Measuring Imagination in Visual Analytics Systems. In J. Brynielsson (Ed.), Proceedings of the European Intelligence and Security Informatics Conference (pp. 151-154). Los Alamitos, Washington, Tokyo: IEEE Publications. doi: 10.1109/EISIC.2017.31
Cook, M. B., & Smallman, H. S. (2007). Visual evidence landscapes: Reducing bias in collaborative intelligence analysis. In Human Factors and Ergonomics Society (Ed.), Proceedings of the 51st Human Factors and Ergonomics Society Annual Meeting (pp. 303-307). Los Angeles, CA: SAGE Publications.
Gilovich, T. (1991). How we know what isn't so. The fallibility of human reason in everyday life. New York: The Free Press.
Gilovich, T., Vallone, R., & Tversky, A. (1985). The hot hand in basketball: On the misperception of random sequences. Cognitive Psychology, 17(3), 295-314.
Nussbaumer, A., Verbert, K., Hillemann, E-C., Bedek, M., & Albert, D. (2016). A Framework for Cognitive Bias Detection and Feedback in a Visual Analytics Environment. In J. Brynielsson & F. Johansson (Eds.), Proceedings of the European Intelligence and Security Informatics Conference (EISIC) (pp. 148-151). Los Alamitos, Washington, Tokyo: IEEE Publications.