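Documentation for software and source code that performs entity recognition and semantic annotation on text.
2023-03-15 22:29:18 283KB information-extraction unstructured Chinese analysis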
Chinese text classification corpus
2023-03-04 20:51:30 113.53MB Chinese-text-classification
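Chinese personal-name corpus (Chinese-Names-Corpus), a by-product of the hobby project "萌名 NameMoe" (a naming tool built on corpus techniques). A mobile web beta of NameMoe is also available to try. Updated irregularly; entries are only removed, never added. Suitable for Chinese word segmentation and person-name recognition. Please do not repackage and upload this corpus to other sites to earn points; if you have already uploaded it, please remove it. Thank you!
Common Chinese names (Chinese_Names_Corpus): 1.2 million entries. Source: extracted from a name corpus on the order of hundreds of millions of entries. Cleaning: cleaned, though a small number of bad cases remain. A name generator has been added.
Ancient Chinese names (Ancient_Names_Corpus): 250,000 entries. Source: compiled from multiple name dictionaries. Cleaning: cleaned.
Chinese family names (Chinese_Family_Name): 1,000 entries. Source: extracted from a name corpus on the order of hundreds of millions of entries. Cleaning: cleaned.
Chinese forms of address (Chinese_Relationship): 5,000 entries, address-term roots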
2023-02-23 16:26:55 17.62MB corpus names dataset dict
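A corpus of 58k carefully selected comments extracted from Reddit, human-annotated with 27 emotion categories or neutral. It ships with train, test, and validation splits; the test set contains 5,427 examples and the validation set contains 5,426. The emotion categories are admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, and surprise.
2022-12-18 18:28:28 17.6MB corpus dataset comments deep-learning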
Multi-Genre NLI corpus. This is the 1.0 release of the Multi-Genre NLI corpus. License information and a detailed description of the corpus are included in the accompanying PDF.
2022-12-12 11:29:17 108.37MB dataset deep-learning corpus
Multilingual lyrics classification dataset with more than 290,000 labeled lyric samples.
2022-12-12 11:29:17 102.53MB dataset corpus lyrics classification
Speech corpus, part 1 (TRAIN/DR1). TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:48 40.05MB audio-dataset
Speech corpus, part 2 (TRAIN/DR2). TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:48 80.61MB audio-dataset
Speech corpus, part 3 (TRAIN/DR3). TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:47 80.92MB audio-dataset
Speech corpus, part 4 (TRAIN/DR4). TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:47 74.42MB audio-dataset