Text Mining and the Social Sciences

Alan Maloney, Project Manager, Online Products

If you wanted to read every book and journal article that SAGE published in 2012, and you read 24/7 at an average pace starting right now, you wouldn’t be done until February 2016. And that’s just one publisher among many. What do you do when there is simply more information than you – or anyone – could ever read and digest?

Predictably, when something is beyond human capability, we get a machine to do it – and that’s the principle behind using text and data mining to discover information hidden in massive bodies of content. This is not really news in scientific, medical and technical publishing, where researchers have been using computers and natural language processing for decades to discover insights like protein interactions and drug side-effects. But most of the books and journals that SAGE published in 2012 are from the humanities and social sciences, where text and data mining techniques are more experimental (and more interesting).

It’s an exciting time for text and data mining, with exponentially more content being made available online, and text and data mining tools becoming more and more sophisticated. But having a computer read and summarise text for you still has its challenges, especially in the humanities and social sciences. A knowledgeable human may tell you that Hamlet is a story of revenge and encourage you to check out Edward II, but a computer might see the word ‘gravity’ and recommend a work by Isaac Newton instead. Is an author talking about orange the colour, orange the fruit or Orange the company? That’s why we have to know our content really well – so that we can teach a computer the rules and exceptions it needs to make sense of things for the reader. Text mining will never give perfect results, but when you need an insight into hundreds of thousands of documents, it’s a good start.
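To make the idea of "rules and exceptions" a little more concrete, here is a minimal sketch in Python of how a computer might be taught to tell the three oranges apart. It is an illustration only, not a description of how SAGE's tools work: the cue words in SENSE_CUES, the function name disambiguate_orange and the capitalisation rule are all assumptions invented for this example.

```python
import re

# Hypothetical context cues for each sense of "orange". A real system would
# draw these from knowledge of the content, not from a short hand-typed list.
SENSE_CUES = {
    "colour": {"paint", "bright", "shade", "red", "yellow", "dress"},
    "fruit": {"juice", "peel", "eat", "citrus", "tree", "vitamin"},
    "company": {"mobile", "network", "telecom", "contract", "customers"},
}


def disambiguate_orange(sentence):
    """Guess which 'orange' a sentence means, or return None if unsure."""
    tokens = re.findall(r"[A-Za-z]+", sentence)

    # Exception (an assumption for this sketch): a capitalised "Orange" that
    # is not the first word of the sentence is treated as the company name.
    if "Orange" in tokens[1:]:
        return "company"

    # Rule: count how many cue words for each sense appear in the sentence.
    words = {token.lower() for token in tokens}
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}

    # Pick the sense with the most evidence; with no evidence, give no answer.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate_orange("She drank a glass of orange juice"))
print(disambiguate_orange("He signed a contract with Orange last year"))
```

A sketch like this fails in obvious ways (a sentence that simply begins with "Orange", for instance), which is exactly why knowing the content well matters when writing the rules.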

SAGE wants to be a champion of text and data mining in the social sciences. This year we used text mining techniques to apply hundreds of thousands of keywords to our book content in SAGE Knowledge, as well as recommend related documents in SAGE Research Methods and SAGE Journals. None of the links were made by a human.

Over the coming months, SAGE will be rolling out a number of new and experimental enhancements to its online books and journals, so watch this space. And in the meantime, if you have any thoughts on how SAGE can do more in this area, do get in touch!

About SAGE Publications

Founded in 1965, SAGE is the world’s leading independent academic and professional publisher. Known for our commitment to quality and innovation, SAGE has helped inform and educate a global community of scholars, practitioners, researchers, and students across a broad range of subject areas. With over 1200 employees globally from principal offices in Los Angeles, London, New Delhi, Singapore, and Washington DC, our publishing programme includes more than 640 journals and over 800 books, reference works and databases a year in business, humanities, social sciences, science, technology and medicine. Believing passionately that engaged scholarship lies at the heart of any healthy society and that education is intrinsically valuable, SAGE aims to be the world’s leading independent academic and professional publisher. This means playing a creative role in society by disseminating teaching and research on a global scale, the cornerstones of which are good, long-term relationships, a focus on our markets, and an ability to combine quality and innovation. Leading authors, editors and societies should feel that SAGE is their natural home: we believe in meeting the range of their needs, and in publishing the best of their work. We are a growing company, and our financial success comes from thinking creatively about our markets and actively responding to the needs of our customers.