SpanBERT: Code for using and evaluating SpanBERT

Uploaded by: 42116604 | Upload time: 2023-04-07 01:26:38 | File size: 387KB | File type: ZIP
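SpanBERT

This repository contains the code and models for the SpanBERT paper.

Requirements

Apex: please use an earlier commit of Apex.

Pre-trained models

We release both the base and large cased SpanBERT models. The base and large models have the same model configuration as BERT, but they differ in the masking scheme and the training objectives (see our paper for more details).

Base: 12 layers, hidden …, … heads, 110M parameters
Large: 24 layers, hidden 1024, 16 heads, 340M parameters

These models have the same format as the BERT models, so you can easily replace them with our SpanBERT models. If you would like to use our code, the model paths are already hard-coded in the code :)

Model              SQuAD 1.1 (F1)   SQuAD 2.0 (F1)   Coref (avg. F1)   TACRED (F1)
BERT (base)        88.5*            76.5*            73.1              67.7
SpanBERT (base)    92.4*            83.6*            77.4              68.2
BERT (large)       91.3             83.3             77.1              …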
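The description above says that SpanBERT differs from BERT in its masking scheme but does not spell the scheme out. As a rough illustration only, the sketch below masks contiguous spans of tokens instead of isolated tokens. The function name `mask_spans` and all default values (`mask_ratio`, `max_span_len`, `p`) are assumptions made for this example and are not taken from this page or from the repository's `masking.py`; the real implementation may differ, for instance in how it handles word boundaries or token replacement.

```python
import random


def mask_spans(tokens, mask_ratio=0.15, max_span_len=10, p=0.2,
               mask_token="[MASK]", seed=None):
    """Mask contiguous spans until about mask_ratio of the tokens are masked.

    Illustrative sketch: every parameter default here is an assumption.
    Returns the masked token list and the sorted masked positions.
    """
    rng = random.Random(seed)
    n = len(tokens)
    budget = max(1, round(n * mask_ratio))  # number of tokens to mask
    masked = set()
    attempts = 0
    # Bound the number of attempts so the loop always terminates.
    while len(masked) < budget and attempts < 100 * n:
        attempts += 1
        # Sample a span length from a geometric distribution, clipped
        # at max_span_len (shorter spans are more likely).
        length = 1
        while length < max_span_len and rng.random() > p:
            length += 1
        # Never exceed the remaining masking budget.
        length = min(length, budget - len(masked))
        start = rng.randrange(0, n - length + 1)
        span = range(start, start + length)
        # Reject spans that overlap an already masked position.
        if any(i in masked for i in span):
            continue
        masked.update(span)
    out = [mask_token if i in masked else t for i, t in enumerate(tokens)]
    return out, sorted(masked)


if __name__ == "__main__":
    sentence = "an american football game to determine the champion".split()
    masked_tokens, positions = mask_spans(sentence, seed=1)
    print(masked_tokens)
    print(positions)
```

Masking whole spans rather than single tokens forces the model to predict several adjacent tokens from the surrounding context, which is the intuition behind a span-based scheme; the rejection of overlapping spans simply keeps the masked count equal to the budget.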

Resource details

SpanBERT: Code for using and evaluating SpanBERT (92 files, 387KB)

SpanBERT-master/
  pretraining/
    fairseq/
      models/
        fairseq_model.py  (7.04KB)
        hf_bert.py  (38.71KB)
        fairseq_encoder.py  (1.45KB)
        fairseq_incremental_decoder.py  (3.18KB)
        __init__.py  (4.20KB)
        distributed_fairseq_model.py  (2.91KB)
        fairseq_decoder.py  (2.01KB)
        pair_bert.py  (42.04KB)
      options.py  (18.57KB)
      optim/
        nag.py  (2.52KB)
        fairseq_optimizer.py  (3.16KB)
        adam.py  (5.49KB)
        bert_adam.py  (7.94KB)
        __init__.py  (1.70KB)
        sgd.py  (1.03KB)
        fp16_optimizer.py  (6.39KB)
        adagrad.py  (1.12KB)
        lr_scheduler/
          cosine_lr_scheduler.py  (4.37KB)
          polynomial_decay_schedule.py  (2.76KB)
          reduce_lr_on_plateau.py  (1.71KB)
          __init__.py  (1.30KB)
          fairseq_lr_scheduler.py  (1.40KB)
          fixed_schedule.py  (2.35KB)
          inverse_square_root_schedule.py  (2.92KB)
          triangular_lr_scheduler.py  (2.57KB)
      meters.py  (3.73KB)
      tokenizer.py  (4.41KB)
      utils.py  (15.94KB)
      tasks/
        fairseq_task.py  (5.94KB)
        span_bert.py  (7.63KB)
        __init__.py  (2.32KB)
      data/
        dictionary.py  (7.28KB)
        iterators.py  (8.09KB)
        fairseq_dataset.py  (1.70KB)
        masking.py  (12.40KB)
        __init__.py  (874B)
        span_bert_dataset.py  (19.21KB)
        indexed_dataset.py  (8.63KB)
        data_utils.py  (6.08KB)
        no_nsp_span_bert_dataset.py  (9.81KB)
      multiprocessing_pdb.py  (1.01KB)
      distributed_utils.py  (4.57KB)
      criterions/
        fairseq_criterion.py  (1.68KB)
        cross_entropy.py  (2.43KB)
        label_smoothed_cross_entropy.py  (3.12KB)
        mlm_loss.py  (2.67KB)
        __init__.py  (1.61KB)
        bert_loss.py  (3.39KB)
        spanbert_loss.py  (3.91KB)
        composite_loss.py  (2.89KB)
        mlm_nsp_sbo_loss.py  (4.57KB)
      __init__.py  (512B)
      trainer.py  (14.85KB)
      legacy_distributed_data_parallel.py  (4.76KB)
      progress_bar.py  (6.91KB)
      modules/
        grad_multiply.py  (550B)
        bidirectional_multihead_attention.py  (5.97KB)
        learned_positional_embedding.py  (1.39KB)
        sinusoidal_positional_embedding.py  (3.71KB)
        downsampled_multihead_attention.py  (9.69KB)
        beamable_mm.py  (1.84KB)
        multihead_attention.py  (14.08KB)
        __init__.py  (1.23KB)
        adaptive_softmax.py  (7.41KB)
        highway.py  (1.82KB)
        scalar_bias.py  (996B)
        adaptive_input.py  (2.39KB)
        adaptive_inputs.py  (2.39KB)
    train.py  (14.19KB)
    distributed_train.py  (1.85KB)
    multiprocessing_train.py  (2.93KB)
    dict.txt  (374.44KB)
    preprocess.py  (11.07KB)
    .gitignore  (1.21KB)
    README.md  (2.12KB)
    bpe_tokenize.py  (2.08KB)
  LICENSE  (18.88KB)
  CONTRIBUTING.md  (1.22KB)
  code/
    run_glue.py  (39.67KB)
    pytorch_pretrained_bert/
      tokenization.py  (16.60KB)
      __init__.py  (646B)
      optimization.py  (7.84KB)
      file_utils.py  (8.98KB)
      modeling.py  (58.67KB)
    run_tacred.py  (26.49KB)
    download_finetuned.sh  (264B)
    run_mrqa.py  (41.85KB)
    mrqa_official_eval.py  (4.87KB)
    run_squad.py  (49.73KB)
  requirements.txt  (581B)
  CODE_OF_CONDUCT.md  (243B)
  README.md  (5.96KB)
