Digital libraries that store scientific publications are becoming increasingly central to the research process. They are used not only for traditional tasks, such as finding and storing research outputs, but also as a source for discovering new research trends or evaluating research excellence. With the current growth of scientific publications deposited in digital libraries, it is no longer sufficient to provide only access to content. To aid research, it is especially important to leverage the potential of text and data mining technologies to improve how research is done.
This workshop aims to bring together people from different backgrounds who:
The topics of the workshop will be organised around the following themes:
Topics of interest relevant to theme 1 include, but are not limited to:
Topics of interest relevant to theme 2 include, but are not limited to:
Topics of interest relevant to theme 3 include, but are not limited to:
We would like to invite the workshop participants to make use of the CORE publications dataset, which contains a large volume of research publications from a wide variety of research areas. The dataset contains not only full texts, but also an enriched version of the publications' metadata. This dataset provides a framework for developing and testing methods and tools addressing the workshop topics. Use of this dataset is not mandatory, but it is encouraged. The dataset is available through the CORE portal.
The workshop on Mining Scientific Publications aims to bring together researchers, digital library developers and practitioners from government and industry to address the current challenges in the domain of mining scientific publications.
The 1st International Workshop on Mining Scientific Publications was held in conjunction with JCDL 2012. The 2nd run of this workshop was held in conjunction with JCDL 2013. The 3rd run was associated with DL 2014 in London. The 4th run took place together with JCDL 2015. Finally, the 5th run of this workshop was associated with JCDL 2016. All runs of the workshop have been extremely successful in terms of attracting submissions and participants from leading institutions in the area, including Cambridge University, Microsoft, British Library, Elsevier, National Library of Medicine, Library of Congress, University of Pennsylvania (CiteSeerX), Know-Center Graz, University of Athens (OpenAIRE project) and Mendeley.
We plan this workshop as a full-day event. The workshop is organized this year for the sixth time (the five previous workshops were held in association with JCDL or DL) and is planned to take place yearly. The workshop will consist of two invited talks, a series of presentations followed by a short discussion, a short group-work session dedicated to addressing specific issues in the field, and a final round-table discussion at the end of the day. Workshop participants will also be encouraged to visit and experience demonstrations presented during coffee breaks. In the evening, participants will have the opportunity to attend an informal dinner.
We invite submissions related to the workshop's topics. Long papers should not exceed 8 pages and short papers should not exceed 4 pages in the ACM style. Furthermore, we welcome demo presentations of systems or methods. A demonstration submission should consist of a description, at most two pages long, of the system, method or tool to be demonstrated. All submissions will be uploaded to EasyChair for peer review.
Papers should be submitted using the EasyChair system provided here:
Successful submissions will be published as a special issue of the D-Lib journal. See the previous proceedings here
All submissions will be peer-reviewed and meta-reviewed by members of the Programme Committee. Each submission will be assigned a score and the best submissions will be selected. The process will be the same as in previous years.
This year, we have applied to publish accepted short and full papers in the ACM International Conference Proceedings Series (ICPS). We are currently awaiting ACM's decision on the matter.
The proceedings published as special issues in previous years are available at:
D-Lib July/August 2012 contents
D-Lib September/October 2013 contents
D-Lib November/December 2014 contents
D-Lib November/December 2015 contents
D-Lib September/October 2016 contents
Sunday, 23rd April 2017 11:59 (Hawaii time) - Submission deadline
Friday, 5th May 2017 11:59 (Hawaii time) - Extended Submission deadline
Thursday, 18th May 2017 - Notification of acceptance
Monday, 12th June 2017 - Camera-ready
Monday, 19th June 2017 - Workshop
9:00-9:10 | Introduction |
9:10-9:45 | Keynote talk | Towards a more efficient, less painful discovery of scientific research findings | Waleed Ammar |
9:45-10:05 | Long paper | Analyzing Semantic Concept Patterns to Detect Academic Plagiarism | Norman Meuschke, Nicolas Siebeck, Moritz Schubotz and Bela Gipp |
10:05-10:20 | Short paper | Investigating Convolutional Networks and Domain-Specific Embeddings for Semantic Classification of Citations | Anne Lauscher, Goran Glavas, Simone Paolo Ponzetto and Kai Eckert |
10:20-10:40 | Long paper | AppTechMiner: Mining Applications and Techniques from Scientific Articles | Mayank Singh, Soham Dan, Sanyam Agarwal, Pawan Goyal and Animesh Mukherjee |
10:40-11:10 | Break |
11:10-11:30 | Long paper | Word importance-based similarity of documents metric (WISDM) | Viktor Botev, Kaloyan Marinov and Florian Schäfer |
11:30-11:45 | Short paper | Audience Based View of Publication Impact | Robert Patton, Drahomira Herrmannova, Christopher Stahl, Jack Wells and Thomas Potok |
11:45-12:05 | Long paper | Multi-level mining and visualization of scientific text collections. Exploring a bilingual scientific repository | Pablo Accuosto, Francesco Ronzano, Daniel Ferrés and Horacio Saggion |
12:05-12:20 | Demo paper | Content Analytics Toolbench (CAT): a flexible single point of access for content enhancement and data analytics across massive corpora | Ron Daniel and Michael Lauruhn |
12:20-12:40 | Long paper | Rapid Tagging and Reporting for Functional Language Extraction in Scientific Articles | Mahmood Ramezani, Vijay Kalivarapu, Stephen Gilbert, Sarah Huffman, Elena Cotos and Annette O'Connor |
12:40-13:00 | Invited talk | Towards effective research recommender systems | Petr Knoth |
13:00-14:00 | Lunch |
14:00-14:35 | Keynote talk | Viziometrics: building a figure-centric search engine for the scholarly literature | Jevin West |
14:35-14:55 | Long paper | HyPRec: a Weighted Hybrid Approach for Scientific Paper Recommendation | Anas Alzoghbi, Mostafa M. Mohamed, Omar Nada, Ibrahim Alshibani, Victor Anthony Arrascue Ayala and Georg Lausen |
14:55-15:10 | Short paper | Comparing citation numbers between articles at two stages of a Model Organism Database curation workflow | Michael Lauruhn and Gillian Millburn |
15:10-15:30 | Long paper | Methods for Synthesis of Funding Agency & Publisher Data | Monica Ihli |
15:30-16:00 | Break |
16:00-16:20 | Long paper | Geographical Distribution of Biomedical Research in the USA | Yingjun Guan, Jing Du and Vetle Torvik |
16:20-16:35 | Demo paper | Iris.AI - Science Assistant | Viktor Botev |
16:35-16:50 | Short paper | A Discipline-Enriched Dataset for Tracking the Computational Turn of European Universities | Federico Nanni and Giulia Paci |
16:50-17:00 | Closing |
Petr Knoth, Knowledge Media Institute, The Open University, UK
Robert Patton, Oak Ridge National Laboratory, USA
Drahomira Herrmannova, Oak Ridge National Laboratory, USA
David Pride, Knowledge Media Institute, The Open University, UK
Anita Khadka, Knowledge Media Institute, The Open University, UK
Iana Atanassova, CRIT, Université de Bourgogne Franche-Comté, France
Joeran Beel, Trinity College, University of Dublin, Ireland
Marc Bertin, Paris-Sorbonne University, France
Pável Calado, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Tanmoy Chakraborty, University of Maryland, USA
Aristotelis Charalampous, KMi, The Open University, UK
Daniel Duma, University of Edinburgh, UK
Shang Gao, Oak Ridge National Laboratory, USA
Christopher G. Harris, SUNY Oswego, USA
Saeed Ul Hassan, Information Technology University, Pakistan
Antoine Isaac, Europeana & VU University Amsterdam, The Netherlands
Roman Kern, Graz University of Technology, Austria
Martin Klein, Los Alamos National Laboratory, USA
Birger Larsen, Aalborg University Copenhagen, Denmark
Paolo Manghi, ISTI-CNR, Italy
Bruno Martins, Instituto Superior Técnico, Universidade de Lisboa, Portugal
Philipp Mayr, GESIS - Leibniz Institute for the Social Sciences, Germany
Peter Mutschke, GESIS - Leibniz Institute for the Social Sciences, Germany
Franco Maria Nardini, ISTI-CNR, Italy
Francesco Osborne, KMi, The Open University, UK
John X. Qiu, Oak Ridge National Laboratory/University of Tennessee, USA
Eloy Rodrigues, Universidade do Minho, Portugal
Angelo Antonio Salatino, KMi, The Open University, UK
Pavel Smrz, Brno University of Technology, Czech Republic
Mike Thelwall, University of Wolverhampton, UK
Vetle Torvik, University of Illinois, USA
Michael T. Young, Oak Ridge National Laboratory, USA
University of Toronto
College View Ave, Toronto
Canada