Books’ Impact in Digital Social Reading: Towards a Conceptual and Methodological Framework

panel / roundtable
Authorship
  1. Federico Pianzola

    University of Groningen, The Netherlands

  2. Marco Viviani

    University of Milano-Bicocca, Italy

  3. Alessandro Fossati

    University of Milano-Bicocca, Italy

  4. Peter Boot

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)

  5. Olivia Fialho

    Huygens Institute for the History of the Netherlands; Utrecht University, The Netherlands

  6. Marijn Koolen

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)

  7. Julia Neugarten

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)

  8. Willem Robert Van Hage

    Netherlands eScience Center, The Netherlands

  9. Simone Rebora

    Università degli Studi di Verona (University of Verona)

  10. J. Berenike Herrmann

    Universität Basel (University of Basel)

  11. Thomas C. Messerli

    Universität Basel (University of Basel)

  12. Annett Jorschick

    Universität Basel (University of Basel)

  13. Srishti Sharma

    Independent scholar

Work text


The aim of this panel is to debate the challenges and opportunities offered by online reviews for measuring the impact that books can have on readers (Boot and Koolen, 2020). The focus is specifically on cultural and linguistic specificity; we will therefore compare insights from the analysis of Korean, English, Italian, German, and Dutch reviews.
Digital social reading platforms – like Goodreads, LovelyBooks, or Naver Books – host millions of reviews and thus offer unique possibilities for research into literature, reading, and reader response (Rebora et al., 2021; Walsh and Antoniak, 2021). Computational tools are especially relevant, given the large amount of available data, but finding associations between textual features, cultural conventions (e.g. genre), and cognitive, affective, and aesthetic responses is not a straightforward task (Koolen et al., 2020; Pianzola et al., 2020).
By comparing research done with different platforms, datasets, and languages, we aim to improve the methods we employ, in a dialogue involving both data-driven insight and theoretical reflection on literature and readers. Questions that we will address are: What aspects of a book’s impact on readers can reviews help us measure? What are the limitations of online book reviews for studying impact? How do we know to what extent review texts reflect actual reading experiences? What are unwanted, confounding influences (e.g. reviewers projecting a favourable self-image, socially desired responses, aspects of identity formation, fake reviews)? How do online book reviews differ from experimentally controlled gathering of reader responses (lab studies, questionnaires, psychologically validated scales) (Lendvai et al., 2020)? How do platforms for reviewing and social interaction around books influence reviewers and their perceptions? How do reviewers compare to other readers?
To answer such questions, we will present four case studies dealing with different languages and cultures, followed by an open discussion of the results and methods, reflecting on their generalizability, efficacy, and limitations.

Cross-cultural and Multilingual Book Reading and Reviewing: Building and Analyzing a Dataset for English, Italian, and Korean
Fossati, A., Pianzola, F., Viviani, M.
In this project, we analyze the differences between English-, Italian-, and Korean-speaking readers with regard to the impact that a book has on them, that is, the attitudes that reviewers from different cultures display when sharing their opinions about books on a digital platform. To this aim, we scraped reviews from the biggest reviewing platforms for each language: Amazon.com (270k reviews collected) and Goodreads (247k) for English (999 books), Amazon.it (93k) and Anobii (64k) for Italian (975 books), and Naver Books (40k) and Yes24 (67k) for Korean (900 books). We sampled one retail and one non-retail platform for each language because one of our goals was to reproduce the results of Dimitrov et al. (2015) and Newell et al. (2016) about differences in readers’ behavior (Fig. 1).

Fig. 1: Average rating of 3,000 books in English, Italian, and Korean, taken from retail (blue) and non-retail (yellow) platforms. Reproduction of the results obtained for English by Newell et al. (2016).
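As a rough illustration of the comparison in Fig. 1, the following Python sketch computes average ratings per platform type from a table of scraped reviews. It is a minimal sketch: the file name and the columns "language", "platform_type", "book_id", and "rating" are illustrative assumptions, not the project's actual schema.

    import pandas as pd

    # Hypothetical dump of the scraped reviews, one row per review
    reviews = pd.read_csv("reviews.csv")

    # Mean rating per book on each platform type...
    per_book = (
        reviews.groupby(["language", "platform_type", "book_id"])["rating"]
        .mean()
        .reset_index()
    )

    # ...then averaged per language, contrasting retail and non-retail
    print(per_book.groupby(["language", "platform_type"])["rating"].mean())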

We are providing an important cross-cultural and multilingual resource and an analytical contribution to this kind of research. We present the construction process and preliminary analyses of this multilingual corpus, which is designed both to contain a common set of books (about a thousand for each language, with their respective reviews) across all three languages, and to highlight reading preferences in terms of genres (Children, Romance, Sci-fi, Thriller, and Fantasy) and authors peculiar to each language/culture (Fig. 2). This kind of dataset is necessary – but so far unavailable – to implement analyses that could reliably explore the impact that books have on both Western and Asian readers.

Fig. 2: Average sentiment scores for books belonging to 5 different genres. Values are computed using a transformer-based multilingual model (XLM-R) specifically fine-tuned for sentiment analysis of book reviews.
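A comparable multilingual scoring can be sketched with an off-the-shelf XLM-R sentiment checkpoint; the model below is a publicly available stand-in, since the abstract does not name the authors' fine-tuned checkpoint.

    from transformers import pipeline

    # Public XLM-R sentiment model, used here as a stand-in for the
    # authors' model fine-tuned on book reviews
    clf = pipeline("text-classification",
                   model="cardiffnlp/twitter-xlm-roberta-base-sentiment")

    # One review per language; a single multilingual model scores all three
    for review in ["A moving, unforgettable novel.",
                   "Un libro davvero noioso.",
                   "정말 감동적인 책이었어요."]:
        print(clf(review))  # e.g. [{'label': 'positive', 'score': 0.9...}]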

Reading Impact in Online Book Reviews: Challenges and Prospects
Boot, P., Fialho, O., Koolen, M., Neugarten, J., Van Hage, W.R.
What is the impact of fiction? In the Impact and Fiction project we investigate this question by measuring impact in a corpus of 500k+ online book reviews and relating it to high-level features (mood, topic, style, narrative) computationally extracted from a corpus of the novels discussed in these reviews. In predicting impact, we also take reader features into account. We draw from a growing tradition of studies on the impact of reading fiction, including the phenomenological and experimental tradition (e.g. Miall and Kuiken, 1995; Kuijpers, 2014; Fialho, 2012, 2019), studies of literary evaluation (e.g. Von Heydebrand and Winko, 1996), of newspaper criticism (Linders, 2012), of online reviews across sites and book genres (Koolen et al., 2020; Newell et al., 2016), of the influence of reader gender on reviews (Thelwall and Bourrier, 2019), and of self-presentation in social media (e.g. Hollenbaugh, 2021).
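The prediction step can be pictured with a minimal sketch like the following, assuming a precomputed table that joins impact scores with book and reader features; every file and column name here is hypothetical, not the project's actual pipeline.

    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    data = pd.read_csv("reviews_with_features.csv")  # hypothetical table

    # Book-level features plus a reader feature, predicting the impact
    # measured in the review text
    X = data[["mood_score", "topic_diversity", "style_complexity",
              "reader_review_count"]]
    y = data["impact_score"]

    print(cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2").mean())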

Despite the wealth of research in these domains, there are many open questions on the impact of fiction in the context of book reviews. Are existing typologies of reading experiences applicable to the context of book reviews? Can impact be reliably measured from reviews? How do reviewer characteristics and textual features (e.g., genre, perspective) affect impact? Do genre effects influence review content? Are book reviews mostly about books, or do they primarily reflect the self-image readers want to present to the world? What forms of reflection occur in book reviews? In this presentation, we will offer a series of reflections on these issues.

Dealing with messy data: A methodological solution for analysing unbalanced social media datasets
Rebora, S., Herrmann, J. B., Messerli, T., Jorschick, A.
In the study of naturalistic social media data, standard solutions for dealing with “messy”, high-frequency data in hypothesis-testing statistics are still missing. This paper contributes to a solution for the issues of hypothesis testing on (a) large-scale and (b) unbalanced datasets. Building on the development of deep learning classifiers for the recognition of evaluative language and sentiment (Rebora et al., 2022), we present a possible methodological groundwork for the study of book impact at a large scale.
Our project focuses on German book reviews published on the LovelyBooks platform (~1.3M reviews). Fig. 3 (evaluative language) and Fig. 4 (sentiment) show the application of the two classifiers to the corpus; a sketch of the per-review computation follows the figures.

Fig. 3: Proportion of evaluative language per review

Fig. 4: Proportion of sentiment per review (green = positive; red = negative)
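The per-review proportions plotted in Fig. 3 and 4 could be derived along the following lines; the checkpoint name and label are placeholders for the fine-tuned classifiers of Rebora et al. (2022), which are not named in this abstract.

    from transformers import pipeline

    # Placeholder name for the fine-tuned evaluative-language classifier
    clf = pipeline("text-classification", model="my-org/eval-language-german")

    def evaluative_proportion(sentences):
        # Share of a review's sentences classified as evaluative
        labels = [clf(s)[0]["label"] for s in sentences]
        return sum(label == "EVALUATIVE" for label in labels) / len(labels)

    print(evaluative_proportion(["Der Roman ist großartig.",
                                 "Er spielt im Jahr 1920."]))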

In order to deal with the issue of a “messy” big-data corpus, we evaluated advanced statistical strategies. As ANOVA-style tests tend to inflate Type I error rates, we applied linear mixed-effects models (Winter and Grice, 2019), using a subsection of our dataset (30,000 reviews from the most popular genres). Book GENRE, RATING, and total number of reviews by user (NRU) were the independent variables predicting the proportion of evaluative language. Table 1 shows significant main effects for NRU, RATING, and GENRE, and a significant interaction effect for NRU and GENRE. Figure 6 shows an interaction effect of GENRE, RATING, and USERTYPE (high vs. low NRU), with high-NRU users deviating for the novel GENRE. As such dynamics might often be missed in data analysis, our case study advocates for the use of advanced statistical modeling; a sketch of such a model follows the figures below.

Table 1: ANOVA type III table of main effects and interactions. Significant effects are in bold.

Rating vs. evaluative language
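A minimal sketch of such a mixed-effects model in Python, using statsmodels (analyses following Winter and Grice (2019) are typically run in R with lme4; the data file and variable names here are illustrative):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("reviews_sample.csv")  # hypothetical 30k-review sample

    # Fixed effects GENRE, RATING, and USERTYPE with their interactions;
    # random intercepts per user account for repeated reviews by one user
    model = smf.mixedlm("prop_evaluative ~ genre * rating * usertype",
                        df, groups=df["user_id"])
    print(model.fit().summary())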

A preregistered analysis investigating the relation between emotions in books and reviews
Sharma, S., Pianzola, F.
In this paper we look at the emotional impact a book can have, focusing on the relation between emotions in books and emotions in reviews. We test the psychological theory known as the “framing effect” (Tversky and Kahneman, 1981) – which states that the audience’s response is influenced by the way in which information is presented to them – and operationalize it as a computational text analysis task. To do this, we use a dataset of 450 books, divided across 9 genres, with more than 5 million English reviews. We conduct sentiment analysis of three different components: the average book sentiment, the average review sentiment of the corresponding book, and the emotional story arc of each book (Reagan et al., 2016; Jockers, 2017). We compare three different methods – DistilBERT (Sanh et al., 2020), a dictionary-based model testing different lexica (Mohammad and Turney, 2013), and SentiArt, a vector space model (Jacobs and Kinder, 2019) – and reflect on their accuracy and interpretability in the context of a DH project, rather than as a general NLP task. This is among the first studies to quantitatively investigate the relations between a story’s sentiment and reader response (Jacobs and Kinder, 2019; Pianzola et al., 2020), and it uses both state-of-the-art machine learning and hypothesis-testing statistics.
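As an illustration of the story-arc component, a simplified Syuzhet-style sketch: window the book’s text, score each window with an off-the-shelf DistilBERT sentiment pipeline, and smooth the series. Window and smoothing sizes are arbitrary choices here, not the study’s preregistered parameters.

    import numpy as np
    from transformers import pipeline

    clf = pipeline("sentiment-analysis")  # default DistilBERT SST-2 model

    def story_arc(text, window=200, smooth=5):
        # Signed sentiment per window of `window` words, then smoothed
        words = text.split()
        chunks = [" ".join(words[i:i + window])
                  for i in range(0, len(words), window)]
        scores = []
        for chunk in chunks:
            res = clf(chunk, truncation=True)[0]
            scores.append(res["score"] if res["label"] == "POSITIVE"
                          else -res["score"])
        kernel = np.ones(smooth) / smooth  # moving average
        return np.convolve(scores, kernel, mode="valid")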

Bibliography

Dimitrov, S. et al. (2015). Goodreads versus Amazon: The Effect of Decoupling Book Reviewing and Book Selling. Proceedings of the International AAAI Conference on Web and Social Media, 9(1), pp. 602-05. https://ojs.aaai.org/index.php/ICWSM/article/view/14662

Fialho, O. (2012). Self-Modifying Experiences in Literary Reading: A Model for Reader Response. PhD Dissertation, University of Alberta. https://era.library.ualberta.ca/items/94ecceb5-56a7-4601-b1e5-bc2117d12e01

Fialho, O. (2019). What is literature for? The role of transformative reading. Cogent Arts & Humanities, 6(1), special issue “The place of the cognitive in literary studies”, Kukkonen, K., Kuzmičová, A., Ledet Christiansen, S., and Polvinen, M. (eds.). https://doi.org/10.1080/23311983.2019.1692532

Hollenbaugh, E. E. (2021). Self-Presentation in Social Media: Review and Research Opportunities. Review of Communication Research, 9, pp. 80-98.

Jacobs, A. M. and Kinder, A. (2019). Computing the Affective-Aesthetic Potential of Literary Texts. AI, 1(1), pp. 11-27. https://doi.org/10.3390/ai1010002

Jockers, M. (2017). Introduction to the Syuzhet Package. The Comprehensive R Archive Network. https://cran.r-project.org/web/packages/syuzhet/vignettes/syuzhet-vignette.html (accessed 29 May 2018).

Koolen, M., Boot, P., and Van Zundert, J. (2020). Online Book Reviews and the Computational Modelling of Reading Impact. Computational Humanities Research 2020. http://ceur-ws.org/Vol-2723/long13.pdf

Kuijpers, M. (2014). Absorbing stories: The effects of textual devices on absorption and evaluative responses. PhD Dissertation, Utrecht University.

Linders, Y. (2012). Argumentation in Dutch literary criticism 1945-2005. In Perry, C. and Szurawitzki, M. (eds.), Sprache und Kultur im Spiegel der Rezension. Frankfurt am Main: Peter Lang, pp. 261-68.

Miall, D. S. and Kuiken, D. (1995). Aspects of literary response: A new questionnaire. Research in the Teaching of English, 29(1), pp. 37-58.

Mohammad, S. and Turney, P. (2013). Crowdsourcing a Word-Emotion Association Lexicon. Computational Intelligence, 29(3), pp. 436-65. https://doi.org/10.1111/j.1467-8640.2012.00460.x

Newell, E. et al. (2016). To Buy or to Read: How a Platform Shapes Reviewing Behavior. In Proceedings of the Tenth International AAAI Conference on Web and Social Media (ICWSM 2016), pp. 643-46.

Pianzola, F., Rebora, S., and Lauer, G. (2020). Wattpad as a Resource for Literary Studies: Quantitative and Qualitative Examples of the Importance of Digital Social Reading and Readers’ Comments in the Margins. PLoS ONE, 15(1). https://doi.org/10.1371/journal.pone.0226708

Rebora, S., Messerli, T. C., and Herrmann, J. B. (2022). Towards a Computational Study of German Book Reviews: A Comparison between Emotion Dictionaries and Transfer Learning in Sentiment Analysis. DHd 2022: Kulturen des digitalen Gedächtnisses. 8. Tagung des Verbands “Digital Humanities im deutschsprachigen Raum” (DHd 2022), Potsdam. https://doi.org/10.5281/zenodo.6328141

Reagan, A. J. et al. (2016). The Emotional Arcs of Stories Are Dominated by Six Basic Shapes. EPJ Data Science, 5, 31. https://doi.org/10.1140/epjds/s13688-016-0093-1

Sanh, V. et al. (2020). DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter. arXiv:1910.01108 [cs]. http://arxiv.org/abs/1910.01108 (accessed 1 June 2020).

Thelwall, M. and Bourrier, K. (2019). The reading background of Goodreads book club members: a female fiction canon? Journal of Documentation, 75(5), pp. 1139-61.

Tversky, A. and Kahneman, D. (1981). The Framing of Decisions and the Psychology of Choice. Science, 211(4481), pp. 453-58. https://doi.org/10.1126/science.7455683

Von Heydebrand, R. and Winko, S. (1996). Einführung in die Wertung von Literatur: Systematik, Geschichte, Legitimation. Paderborn: Schöningh.

Winter, B. and Grice, M. (2019). Independence and generalizability in linguistics. OSF. https://osf.io/zdrpc/ (accessed 24 September 2021).


Conference Info

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/


Organizers: ADHO