Our paper "CoBERT: Scientific Collaboration Prediction via Sequential Recommendation" has been accepted at NeuRec@ICDM 2021
01.10.2021: Our paper "CoBERT: Scientific Collaboration Prediction via Sequential Recommendation" has been accepted at the Second International Workshop on Neural Recommender Systems (NeuRec) @ ICDM-21.
In this work, we reformulate the co-author prediction task from a classical link prediction problem on a co-author graph into a sequential recommendation problem. To do so, we adapt the sequential recommendation model BERT4Rec to this setting; a minimal sketch of the sequence construction is shown below. Finally, we experiment with different methods for adding further meta-information about authors and publications to the model.
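The following sketch illustrates the reformulation described above: an author's publication history is turned into a chronologically sorted sequence of co-authors, which a sequential recommender can then be trained on. This is only an illustrative sketch, not the paper's actual preprocessing; the `Publication` structure and its field names are assumptions.

```python
# Illustrative sketch (not the paper's code): build a chronologically
# sorted co-author sequence for one focal author.
from dataclasses import dataclass
from typing import List


@dataclass
class Publication:
    year: int            # publication year, used for chronological sorting
    authors: List[str]   # all authors of the paper


def coauthor_sequence(focal_author: str, pubs: List[Publication]) -> List[str]:
    """Return the focal author's co-authors, ordered by paper date."""
    sequence = []
    for pub in sorted(pubs, key=lambda p: p.year):
        # every other author of the paper becomes an item in the sequence
        sequence.extend(a for a in pub.authors if a != focal_author)
    return sequence


# Usage: the last items are the most recent collaborators; a sequential
# recommender is trained to predict the next ones.
pubs = [
    Publication(2019, ["A. Smith", "B. Jones", "C. Lee"]),
    Publication(2021, ["A. Smith", "D. Kim"]),
]
print(coauthor_sequence("A. Smith", pubs))  # ['B. Jones', 'C. Lee', 'D. Kim']
```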
Abstract:
Collaborations are an important factor for scientific success, as joint work leads to results individual scientists cannot easily reach on their own. Recommending collaborations automatically can alleviate the time-consuming and tedious search for potential collaborators. Usually, such recommendation systems rely on graph structures modeling co-authorship of papers and content-based relations such as similar paper keywords. Models are then trained to estimate the probability of links between authors in these graphs.
In this paper, we argue that the order of papers is crucial for reliably predicting future collaborations, yet it is not considered by graph-based recommendation systems. We thus propose to reformulate collaboration recommendation as a sequential recommendation task: we aim to predict the next co-author in a chronologically sorted sequence of an author's collaborators. We introduce CoBERT, a BERT4Rec-inspired model that predicts the sequence's next co-author and thus a potential collaborator. Since the order of co-authors within a single paper matters less than the overall paper order, we leverage positional embeddings encoding paper positions instead of co-author positions in the sequence. Additionally, we inject content features about every paper and its co-authors. We evaluate CoBERT on two datasets consisting of papers from the field of Artificial Intelligence and the journal PlosOne. We show that CoBERT can outperform graph-based methods and BERT4Rec when predicting the co-authors of the next paper. We make our code and data available.
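As a rough illustration of the paper-position idea mentioned in the abstract, the sketch below shows co-authors of the same paper sharing a single positional index, so the positional embeddings encode paper order rather than token order. This is a hedged, PyTorch-style sketch under assumed embedding sizes and tensor shapes, not the published implementation.

```python
# Minimal sketch of paper-position embeddings (assumed shapes, not the
# paper's actual model): authors from the same paper share one position.
import torch
import torch.nn as nn

num_authors, num_positions, dim = 10_000, 64, 128
author_emb = nn.Embedding(num_authors, dim)
paper_pos_emb = nn.Embedding(num_positions, dim)

# Example sequence: three co-authors of paper 0 followed by two of paper 1.
author_ids = torch.tensor([[17, 42, 256, 99, 7]])
paper_positions = torch.tensor([[0, 0, 0, 1, 1]])  # shared index per paper

# Input representation: author embedding + paper-position embedding,
# which would then feed a BERT4Rec-style transformer encoder (omitted).
x = author_emb(author_ids) + paper_pos_emb(paper_positions)
print(x.shape)  # torch.Size([1, 5, 128])
```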