Unrestricted Text and Data Mining with allofPLOS

Published on 1 December 2017, by Thérèse Hameau

Content mining, machine learning, text and data mining (TDM) and data analytics all refer to the process of obtaining information from machine-read material. Machine-learning approaches can, far faster than a human could, analyze data, metadata and text content; find structural similarities between research problems in unrelated fields; and synthesize content from thousands of articles to suggest directions for further research. Given the continually expanding volume of peer-reviewed literature, the value of TDM should not be underestimated. Text and data mining is a useful tool for developing new scientific insights and new ways to understand the story told by the published literature.

...

No Restrictions, No Conditions: allofPLOS

With more than 200,000 fully Open Access research articles available for content mining, PLOS can help advance the discussion and application of content mining through real-world experience. Through our API we provide article text and metadata in a single XML file, formatted according to the Journal Article Tag Suite (JATS), the National Information Standards Organization (NISO) standard tag suite for archiving and exchanging journal article content.
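To make the single-file JATS layout concrete, here is a minimal sketch of pulling the DOI, title and abstract out of one article using only Python's standard library. The sample document is hypothetical and heavily trimmed; the element names (`front`, `article-meta`, `article-id`, `title-group`, `article-title`, `abstract`) are standard JATS, but the function name `extract_metadata` and the placeholder DOI are assumptions made for illustration, not part of any PLOS tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed JATS document, for illustration only.
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.0000001</article-id>
      <title-group>
        <article-title>An <italic>Example</italic> Article</article-title>
      </title-group>
      <abstract><p>First abstract paragraph.</p><p>Second paragraph.</p></abstract>
    </article-meta>
  </front>
  <body><sec><title>Introduction</title><p>Body text.</p></sec></body>
</article>"""


def extract_metadata(jats_xml: str) -> dict:
    """Return the DOI, title and abstract text of one JATS article."""
    root = ET.fromstring(jats_xml)
    meta = root.find("front/article-meta")
    if meta is None:
        raise ValueError("no <article-meta> element found")

    doi = meta.find("article-id[@pub-id-type='doi']")
    title = meta.find("title-group/article-title")
    paragraphs = meta.findall("abstract/p")

    return {
        "doi": doi.text if doi is not None else None,
        # itertext() keeps text nested inside inline markup such as <italic>.
        "title": "".join(title.itertext()).strip() if title is not None else None,
        "abstract": " ".join("".join(p.itertext()).strip() for p in paragraphs),
    }


if __name__ == "__main__":
    print(extract_metadata(SAMPLE_JATS))
```

Because text and metadata live in the same file, one parse is enough to reach both; real articles carry many more elements than this sample, so production code would need to handle missing or repeated fields more carefully.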

The new allofPLOS project is a step forward in giving researchers easier opportunities for new discoveries and for illuminating non-obvious connections between data, research articles and fields of study. allofPLOS provides the content of every PLOS article (excluding Figures and Supplemental Data) in JATS XML format, together with XML parsing tools. By bundling tags, content and parsing tools, we hope to simplify and streamline the process for those wanting to experiment with content mining and TDM tools.
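As a rough illustration of the kind of experiment this is meant to enable, the sketch below counts in how many articles each word appears, given a collection of JATS XML strings. It uses only the Python standard library rather than the allofPLOS parsing tools, whose interface is not described here; the helper names `body_text` and `document_frequency` and the three tiny sample articles are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature corpus; real JATS articles are far larger.
ARTICLES = [
    "<article><body><p>Gene expression in yeast.</p></body></article>",
    "<article><body><p>Yeast growth and gene regulation.</p></body></article>",
    "<article><body><p>Protein folding.</p></body></article>",
]


def body_text(jats_xml: str) -> str:
    """Return all text under <body>, crudely joined with spaces."""
    body = ET.fromstring(jats_xml).find("body")
    if body is None:
        return ""
    return " ".join(body.itertext())


def document_frequency(articles) -> Counter:
    """Count, for each lowercase word, the number of articles containing it."""
    counts = Counter()
    for xml_text in articles:
        # A set ensures each word is counted at most once per article.
        words = set(re.findall(r"[a-z]+", body_text(xml_text).lower()))
        counts.update(words)
    return counts


if __name__ == "__main__":
    for word, n in document_frequency(ARTICLES).most_common(3):
        print(word, n)
```

The tokenization here is deliberately naive (ASCII letters only, no stop-word removal); it is a starting point for exploring a corpus, not a substitute for a proper TDM pipeline.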

...

The information