Publication

Retrieving, classifying and analysing narrative commentary in unstructured (glossy) annual reports published as PDF files

dc.contributor.author: El-Haj, Mahmoud
dc.contributor.author: Alves, Paulo
dc.contributor.author: Rayson, Paul
dc.contributor.author: Walker, Martin
dc.contributor.author: Young, Steven
dc.date.accessioned: 2020-03-05T17:05:39Z
dc.date.available: 2020-03-05T17:05:39Z
dc.date.issued: 2019
dc.description.abstract: We provide a methodological contribution by developing, describing and evaluating a method for automatically retrieving and analysing text from digital PDF annual report files published by firms listed on the London Stock Exchange (LSE). The retrieval method retains information on document structure, enabling clear delineation between narrative and financial statement components of reports, and between individual sections within the narratives component. Retrieval accuracy exceeds 95% for manual validations using a random sample of 586 reports. Large-sample statistical validations using a comprehensive sample of reports published by non-financial LSE firms confirm that report length, narrative tone and (to a lesser degree) readability vary predictably with economic and regulatory factors. We demonstrate how the method is adaptable to non-English language documents and different regulatory regimes using a case study of Portuguese reports. We use the procedure to construct new research resources including corpora for commonly occurring annual report sections and a dataset of text properties for over 26,000 U.K. annual reports. [pt_PT]
dc.description.version: info:eu-repo/semantics/publishedVersion [pt_PT]
dc.identifier.citation: El-Haj, M., Alves, P., Rayson, P., Walker, M., Young, S. (2019). Retrieving, classifying and analysing narrative commentary in unstructured (glossy) annual reports published as PDF files. Accounting and Business Research, 50(1), 6-34 [pt_PT]
dc.identifier.doi: 10.1080/00014788.2019.1609346 [pt_PT]
dc.identifier.eid: 85076386957
dc.identifier.eissn: 2159-4260
dc.identifier.issn: 0001-4788
dc.identifier.uri: http://hdl.handle.net/10400.14/29853
dc.identifier.wos: 000479561200001
dc.language.iso: eng [pt_PT]
dc.peerreviewed: yes [pt_PT]
dc.publisher: Taylor & Francis [pt_PT]
dc.relation: ES/J012394/1 [pt_PT]
dc.relation: ES/K002155/1 [pt_PT]
dc.relation: ES/R003904/1 [pt_PT]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Annual reports [pt_PT]
dc.subject: Textual analysis [pt_PT]
dc.subject: Unstructured documents [pt_PT]
dc.subject: Narrative reporting [pt_PT]
dc.title: Retrieving, classifying and analysing narrative commentary in unstructured (glossy) annual reports published as PDF files [pt_PT]
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.endPage: 34 [pt_PT]
oaire.citation.issue: 1 [pt_PT]
oaire.citation.startPage: 6 [pt_PT]
oaire.citation.title: Accounting and Business Research [pt_PT]
oaire.citation.volume: 50 [pt_PT]
person.familyName: Alves
person.familyName: Young
person.givenName: Paulo
person.givenName: Steven Eric
person.identifier.ciencia-id: 8C12-EBAE-B385
person.identifier.orcid: 0000-0002-9421-854X
person.identifier.orcid: 0000-0002-4168-6200
person.identifier.scopus-author-id: 55206239400
person.identifier.scopus-author-id: 35243795800
rcaap.rights: openAccess [pt_PT]
rcaap.type: article [pt_PT]
relation.isAuthorOfPublication: 9d156779-61a9-4249-b112-5832c42094f4
relation.isAuthorOfPublication: 8afb8657-6342-40c0-bc41-be9c73bfde68
relation.isAuthorOfPublication.latestForDiscovery: 9d156779-61a9-4249-b112-5832c42094f4

Files

Original bundle
Name:
19845394.pdf
Size:
2.47 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
3.44 KB
Format:
Item-specific license agreed upon to submission