Projects

Pre-review to Peer review | Pitfalls of Automating Reviews using Large Language Models

Abstract Large Language Models are versatile general-task solvers and their capabilities can truly assist people with scholarly peer review as $\textit{pre-review}$ agents if not fully autonomous $\textit{peer-review}$ agents. While incredibly beneficial automating academic peer-review as a concept raises concerns surrounding safety, research integrity and validity of the academic peer-review process. Majority of the studies performing a systematic evaluation of frontier LLMs generating reviews across science disciplines miss the mark on addressing the alignment/misalignment question and never place emphasis on assessing the effect of reviews on post-publication outcomes $\textbf{Citations}$, $\textbf{hit-papers}$, $\textbf{Novelty}$, and $\textbf{Disruption}$. We present an experimental study gathering ground-truth reviewer rating scores from OpenReview and utilizing various frontier open-weight LLMs ($\textbf{Gemma-3 27b, Qwen-3 32b, Phi-4, Olmo2-32b}$, and $\textbf{Llama 3.3 70b}$) to generate reviews of the manuscript to gauge the safety and reliability of involving languages models in the scientific review pipeline. Our effort to connect the safety and reliability of using LLMs in academic peer-review with post-publication outcomes makes it easier to highlight the potential and pitfalls of automating peer-reviews using language models and gives us a pathway for making the process agentic. We open-source our dataset $D_{LMRSD}$ to help the research community expand of safety-framework of automating scientific reviews. ...

Sciscinet-v2, large-scale integrated datalake for science of science

About Sciscinet-v2 SciSciNet-v2 is a refreshed update to SciSciNet which is a large-scale, integrated dataset designed to support research in the science of science domain. It combines scientific publications with their network of relationships to funding sources, patents, citations, and institutional affiliations, creating a rich ecosystem for analyzing scientific productivity, impact, and innovation. Linkages The entity relationship diagram for SciSciNet-v2. The dataset includes papers, authors, affiliations, and fields as the main entities in the center, with linkages to other tables capturing data from NSF, NIH, clinical trials, USPTO, EUPTO, NewsFeed, and Nobel Laureates on the left and the aggregation of main entity tables on the right. ...

ReSci-Agent | Agentic PeerReview

Core motivation The entire process of a peer-review is a long-horizon task where the cognitive abilities of the human are put to test, and during this multi-hop process contextual understanding of the scientific work along with internal priors about the subject dictate the effectiveness of the review. Thematically, this peer-review process maps well as a task for most frontier LLMs. To be more specific, the peer-review agent must be multi-modal, and equipped with tool use (internet search, run code) to capture different aspects of the peer-review. ...

Large language models for Scientometrics

About Large Language Models: The capabilities of Large Language Models (LLM’s) to process data from different modalities and excel at different tasks ranging from information extraction, question and answering, math, coding, and recently reasoning simply shows the potential of this technology. Intuitively the complexities of training these models on different datasets/data mixes, opting different architectural choices, choosing different alignment strategies [1] seemingly could suggest picking a specific model for each task, but LLM’s are geared towards being considered as general task solvers. ...

Navigating the Landscape of Reproducible Research, A Predictive Modeling Approach

Abstract The reproducibility of scientific articles is central to the advancement of science. Despite this importance, evaluating reproducibility remains challenging due to the scarcity of ground truth data. Predictive models can address this limitation by streamlining the tedious evaluation process. Typically, a paper’s reproducibility is inferred based on the availability of artifacts such as code, data, or supplemental information, often without extensive empirical investigation. To address these issues, we utilized artifacts of papers as fundamental units to develop a novel, dual-spectrum framework that focuses on author-centric and external-agent perspectives. We used the author-centric spectrum, followed by the external-agent spectrum, to guide a structured, model-based approach to quantify and assess reproducibility. We explored the interdependencies between different factors influencing reproducibility and found that linguistic features such as readability and lexical diversity are strongly correlated with papers achieving the highest statuses on both spectrums. Our work provides a model-driven pathway for evaluating the reproducibility of scientific research. ...

Influence of Reproducibility on Scientific Impact

IRSI Reproducibility is an important feature of science; experiments are retested, and analyses are repeated.we examine a myriad of features in scholarly articles published in computer science conferences and journals and model how they influence scientific impact. Reproducibility Spectrum The author-centric framework focuses on acknowledging availability, accessibility, and quality of the artifact available within scientific document to signal satisfying prerequisites to reproduce a paper. The Author-Centric framework within the spectrum includes, $$A_i = A_{PWA}, A_{PUNX}, A_{PAX}$$ Papers without artifacts, Papers with artifacts that aren’t permanantly archived, and Papers with artifacts that are permanantly archived. ...

Reproducibility Signals in Science, A preliminary analysis

Abstract Reproducibility is an important feature of science; experiments are retested, and analyses are repeated. Trust in the findings increases when consistent results are achieved. Despite the importance of reproducibility, significant work is often involved in these efforts, and some published findings may not be reproducible due to oversights or errors. In this paper, we examine a myriad of features in scholarly articles published in computer science conferences and journals and test how they correlate with reproducibility. We collected data from three different sources that labeled publications as either reproducible or irreproducible and employed statistical significance tests to identify features of those publications that hold clues about reproducibility. We found the readability of the scholarly article and accessibility of the software artifacts through hyperlinks to be strong signals noticeable amongst reproducible scholarly articles. ...

Laying Foundations to Quantify the “Effort of Reproducibility”

Abstract Why are some research studies easy to reproduce while others are difficult? Casting doubt on the accuracy of scientific work is not fruitful, especially when an individual researcher cannot reproduce the claims made in the paper. There could be many subjective reasons behind the inability to reproduce a scientific paper. The field of Machine Learning (ML) faces a reproducibility crisis, and surveying a portion of published articles has resulted in a group realization that although sharing code repositories would be appreciable, code bases are not the end all be all for determining the reproducibility of an article. Various parties involved in the publication process have come forward to address the reproducibility crisis and solutions such as badging articles as reproducible, reproducibility checklists at conferences (NeurIPS, ICML, ICLR, etc.), and sharing artifacts on OpenReview come across as promising solutions to the core problem. The breadth of literature on reproducibility focuses on measures required to avoid ir-reproducibility, and there is not much research into the effort behind reproducing these articles. In this paper, we investigate the factors that contribute to the easiness and difficulty of reproducing previously published studies and report on the foundational framework to quantify effort of reproducibility. ...

A Brief Survey on Representation Learning based Graph Dimensionality Reduction Techniques

Abstract Dimensionality reduction techniques map data represented on higher dimensions onto lower dimensions with varying degrees of information loss. Graph dimensionality reduction techniques adopt the same principle of providing latent representations of the graph structure with minor adaptations to the output representations along with the input data. There exist several cutting edge techniques that are efficient at generating embeddings from graph data and projecting them onto low dimensional latent spaces. Due to variations in the operational philosophy, the benefits of a particular graph dimensionality reduction technique might not prove advantageous to every scenario or rather every dataset. As a result, some techniques are efficient at representing the relationship between nodes at lower dimensions, while others are good at encapsulating the entire graph structure on low dimensional space. We present this survey to outline the benefits as well as problems associated with the existing graph dimensionality reduction techniques. We also attempted to connect the dots regarding the potential improvements to some of the techniques. This survey could be helpful for upcoming researchers interested in exploring the usage of graph representation learning to effectively produce low-dimensional graph embeddings with varying degrees of granularity. ...

Early indicators of scientific impact: Predicting citations with altmetrics

Abstract Identifying important scholarly literature at an early stage is vital to the academic research community and other stakeholders such as technology companies and government bodies. Due to the sheer amount of research published and the growth of ever-changing interdisciplinary areas, researchers need an efficient way to identify important scholarly work. The number of citations a given research publication has accrued has been used for this purpose, but these take time to occur and longer to accumulate. In this article, we use altmetrics to predict the short-term and long-term citations that a scholarly publication could receive. We build various classification and regression models and evaluate their performance, finding neural networks and ensemble models to perform best for these tasks. We also find that Mendeley readership is the most important factor in predicting the early citations, followed by other factors such as the academic status of the readers (e.g., student, postdoc, professor), followers on Twitter, online post length, author count, and the number of mentions on Twitter, Wikipedia, and across different countries. ...