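A neural machine translation model for translating Azerbaijani into English. In this project, I worked out how to develop a neural machine translation system that translates Azerbaijani into English. I used a dataset of Azerbaijani-to-English terms intended as the basis for language-learning flashcards. The dataset is available from the ManyThings.org website, with examples drawn from the Tatoeba Project. Once the text data has been cleaned, it is ready for modeling. I used an encoder-decoder LSTM model for this problem. In this architecture, the input sequence is encoded by a front-end model called the encoder and then decoded word by word by a back-end model called the decoder. The model is trained with the efficient Adam variant of stochastic gradient descent and minimizes the categorical loss function, because the prediction problem is framed as multi-class classification. A plot of the model is also created to give another perspective on the model configuration. Next, the model is trained. Each epoch takes about 30 seconds on modern CPU hardware; no GPU is required. We can then repeat this for every source phrase in the dataset and compare the predictions with the expected target phrases in English.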
2021-09-25 21:42:45 1.55MB tensorflow neural-machine-translation Python
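The word-by-word decoding and the comparison against expected English phrases described in the entry above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's code: the helper names (`decode_sequence`, `exact_match_rate`), the toy vocabulary, the made-up probability vectors, and the use of index 0 as padding are all assumptions introduced here.

```python
def argmax(values):
    """Return the index of the largest value in a non-empty sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def decode_sequence(step_probs, index_to_word):
    """Map each time step's class probabilities to its most likely word.

    Each element of step_probs is one probability vector over the target
    vocabulary (the multi-class prediction for that position).
    Index 0 is assumed to be padding and ends the phrase.
    """
    words = []
    for probs in step_probs:
        idx = argmax(probs)
        if idx == 0:  # hypothetical padding index: stop decoding
            break
        words.append(index_to_word[idx])
    return " ".join(words)


def exact_match_rate(predictions, targets):
    """Fraction of predicted phrases that equal the expected phrase exactly."""
    if not targets:
        return 0.0
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


# Toy vocabulary and fake model outputs, made up for illustration only.
index_to_word = {1: "i", 2: "am", 3: "here"}
step_probs = [
    [0.1, 0.7, 0.1, 0.1],  # most likely class 1 -> "i"
    [0.1, 0.1, 0.6, 0.2],  # most likely class 2 -> "am"
    [0.0, 0.1, 0.2, 0.7],  # most likely class 3 -> "here"
    [0.9, 0.0, 0.1, 0.0],  # most likely class 0 -> padding, stop
]

predicted = decode_sequence(step_probs, index_to_word)
print(predicted)
print(exact_match_rate([predicted], ["i am here"]))
```

Exact match is a deliberately strict score chosen here for brevity; a corpus-level metric such as BLEU is the more common choice for translation quality.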
Machine Translation (Natural Language Processing, NLP)
2021-08-20 01:37:51 2.19MB machine-translation natural-language-processing NLP
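mini seq2seq: a minimal Seq2Seq model with attention for neural machine translation in PyTorch. This implementation focuses on the following features: a modular structure that can be reused in other projects; minimal code for readability; full utilization of batches and GPU. This implementation relies on [...] to minimize dataset management and preprocessing. Model description: Encoder: bidirectional GRU. Decoder: GRU with attention mechanism. Attention: [...]. Requirements: GPU and CUDA; Python 3; PyTorch; torchtext; spaCy; NumPy; Visdom (optional). Download the tokenizers by running: python -m spacy download de python -m spacy download en References: based on the following implementations [...]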
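The attention step that the decoder in the entry above relies on can be illustrated with a small sketch. It is written in plain Python rather than PyTorch so that it runs without dependencies, and it uses dot-product scoring purely as an assumption: the entry does not say which scoring function the repository uses. The function names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Scores each encoder state against the current decoder state
    (dot product, an assumption made for this sketch), normalizes the
    scores with softmax, and builds the context vector as the weighted
    sum of the encoder states.
    """
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Toy encoder outputs (one vector per source token) and a decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)
print(context)
```

In a GRU decoder, the context vector would typically be combined with the previous target word's embedding before the next recurrent step; that wiring is omitted here.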
The field of machine translation has recently been energized by the emergence of statistical techniques, which have brought the dream of automatic language translation closer to reality. This class-tested textbook, authored by an active researcher in the field, provides a gentle and accessible introduction to the latest methods and enables the reader to build machine translation systems for any language pair. It provides the necessary grounding in linguistics and probabilities, and covers the major models for machine translation: word-based, phrase-based, and tree-based, as well as machine translation evaluation, language modeling, discriminative training and advanced methods to integrate linguistic annotation. The book reports on the latest research and outstanding challenges, and enables novices as well as experienced researchers to make contributions to the field. It is ideal for students at undergraduate and graduate level, or for any reader interested in the latest developments in machine translation. PHILIPP KOEHN is a lecturer in the School of Informatics at the University of Edinburgh. He is the scientific coordinator of the European EuroMatrix project and is also involved in research funded by DARPA in the USA. He has also collaborated with leading companies in the field, such as Systran and Asia Online. He implemented the widely used decoder Pharaoh, and is leading the development of the open source machine translation toolkit Moses.
2019-12-21 21:02:03 5.45MB machine-translation Cambridge-University-Press koehn