CHIMERA: Extraction models
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("noystl/scibert_token_classifier")
model = AutoModelForTokenClassification.from_pretrained("noystl/scibert_token_classifier")
```

This Hugging Face repository contains a fine-tuned allenai/scibert_scivocab_uncased model trained to extract recombination examples from scientific abstracts, as described in the paper CHIMERA: A Knowledge Base of Idea Recombination in Scientific Literature. The model performs the information-extraction task of identifying recombination examples within scientific text. For detailed usage instructions and for reproducing the paper's results, please refer to the accompanying GitHub repository.
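A token classifier emits one BIO-style label per token, so the per-token predictions usually need to be merged back into contiguous text spans. A minimal sketch of that post-processing, assuming illustrative label names (`B-RECOMB`/`I-RECOMB` are hypothetical here; the model's actual label set is defined in its config):

```python
# Sketch: merge token-level BIO predictions into labeled text spans.
# The label names below are illustrative assumptions, not the model's
# actual label set.

def merge_bio_spans(tokens, labels):
    """Group consecutive B-/I- tagged tokens into (text, label) spans."""
    spans = []
    current = None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- tag always starts a new span.
            if current:
                spans.append(current)
            current = {"label": label[2:], "tokens": [token]}
        elif label.startswith("I-") and current and current["label"] == label[2:]:
            # An I- tag continues the open span of the same type.
            current["tokens"].append(token)
        else:
            # "O" (or a mismatched I- tag) closes any open span.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(" ".join(s["tokens"]), s["label"]) for s in spans]

tokens = ["We", "combine", "graph", "networks", "with", "attention"]
labels = ["O", "O", "B-RECOMB", "I-RECOMB", "O", "B-RECOMB"]
print(merge_bio_spans(tokens, labels))
# → [('graph networks', 'RECOMB'), ('attention', 'RECOMB')]
```

In practice the `transformers` pipeline can do similar grouping for you via its aggregation options; the sketch above just makes the span-merging logic explicit.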
Non-Default Hyperparameters
- per_device_train_batch_size: 1
- max_steps: 500
- weight_decay: 0.1
- learning_rate: 6e-5

BibTeX
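The non-default hyperparameters above map directly onto Hugging Face `TrainingArguments`; a minimal sketch, where the output directory name and every unlisted argument are assumptions (left at library defaults):

```python
from transformers import TrainingArguments

# Sketch: only the non-default hyperparameters from this model card are set.
# "scibert_token_classifier_out" and all other settings are assumptions.
args = TrainingArguments(
    output_dir="scibert_token_classifier_out",
    per_device_train_batch_size=1,
    max_steps=500,
    weight_decay=0.1,
    learning_rate=6e-5,
)
```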
```bibtex
@misc{sternlicht2025chimeraknowledgebaseidea,
      title={CHIMERA: A Knowledge Base of Idea Recombination in Scientific Literature},
      author={Noy Sternlicht and Tom Hope},
      year={2025},
      eprint={2505.20779},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.20779},
}
```
Quick Links
Base model
allenai/scibert_scivocab_uncased
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

# This is a token-classification model, so use the "token-classification"
# task rather than "feature-extraction"
pipe = pipeline("token-classification", model="noystl/scibert_token_classifier")
```