Conceptual and technological grounds for big data modeling in Digital Humanities: textual corpora in the history of science as a case study
Extensive digitization projects conducted since the 1990s have made large textual corpora readily available, including through remote access. More recently, new computing tools have begun to be developed that afford novel approaches to theoretical studies and big data modeling, and these are likely to allow for more accurate standards for the organization and classification of knowledge. The present proposal is part of a larger international collaboration that aims to make work with documents relevant to the history of science more effective. More specifically, the aim of the present project is to develop tools that locate and recognize concepts common to texts held in large databases dedicated to history of science research, and that trace how those concepts change over time, for the purpose of indexing and classification.
Keywords: History of science; Digital Humanities; Organization and classification of knowledge; Textual corpora; Data mining and modeling; Computational linguistics