Manipulating and processing masses of digital data is never a purely technical activity. It requires an interpretative and exploratory outlook - already well known in the social sciences and the humanities - to convey intelligible results from data analysis algorithms and create new knowledge.
Big Data is based on a multi-year inquiry conducted at Proxem, a software publisher specializing in big data processing. The book examines how data scientists explore, interpret and visualize our digital traces to make sense of them and to produce new knowledge. Grounded in epistemology and in science and technology studies, Big Data offers a reflection on data in general, and on how they help us to better understand reality and decide on our daily actions.
Introduction vii
Chapter 1. From Trace to Web Data: An Ontology of the Digital Footprint 1
1.1. The epistemology of the cultural sciences 7
1.2. The footprint in evidential sciences 9
1.3. The log or activity history 14
1.4. The digital footprint as a web log 18
1.5. The intentionality of digital footprints 20
1.6. Data as theoretically-loaded footprints 24
Chapter 2. Toward an Epistemic Continuity Anchored in the Cultural Sciences 29
2.1. Digital technology in the cultural sciences 31
2.2. Field and corpus: two modes of access to reality 34
2.3. Virtual methods, a reconstruction of access to the field 38
2.4. The challenges of the technical revolution of the text 48
2.5. From the web as an object to the web as a corpus 59
2.6. Conclusion 69
Chapter 3. The Status of Computation in Data Sciences 71
3.1. Making data computable 73
3.2. The field of computability 77
3.3. Computational thinking 81
3.4. Computation in the natural sciences 87
3.5. From exploratory analysis to data mining 98
3.6. The institutional and theoretical melting pot of data science 107
3.7. The contribution of artificial intelligence 115
3.8. Conclusion 122
Chapter 4. A Practical Big Data Use Case 125
4.1. Presentation of the case study 126
4.2. Customer experience and coding of feedback 131
4.3. From the representative approach to the big data project 134
4.4. Data preparation 137
4.5. Design of the coding plan 140
4.6. The constitution of linguistic resources 143
4.7. Constituting the coding plan 148
4.8. Visibility of the language activity 153
4.9. Storytelling and interpretation of the data 155
4.10. Conclusion 161
Chapter 5. From Narratives to Systems: How to Shape and Share Data Analysis 165
5.1. Two epistemic configurations 166
5.2. The genesis of systems 172
5.3. Conclusion 183
Chapter 6. The Art of Data Visualization 187
6.1. Graphic semiology 187
6.2. Data cartography 198
6.3. Representation as evidence 203
6.4. The visual language of design in system configuration 207
6.5. Materialization and interpretation of recommendations 214
Chapter 7. Knowledge and Decision 219
7.1. Big data, a pragmatic epistemology? 220
7.2. Toward gradual validity of knowledge 227
7.3. Deciding, knowing and measuring 233
Conclusion 239
References 243
Index 257