spark-iforest: Isolation Forest on Spark

Uploader: 42099116 | Uploaded: 2022-05-03 16:15:24 | File size: 46KB | File type: ZIP
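Spark-iForest

Isolation Forest (iForest) is an effective model that focuses on isolating anomalies. iForest models the data with tree structures: compared with normal points, anomalies are isolated closer to the root of an iTree. The iForest model computes an anomaly score that measures how abnormal each data instance is; the higher the score, the more abnormal the instance. For more details on iForest, see papers [1] and [2].

We designed and implemented a distributed iForest on Spark that is trained with model-wise parallelism and predicts on new datasets with data-wise parallelism. It is implemented in the following steps:

1. Sample data from the dataset. Data instances are sampled and grouped for each iTree. As noted in the paper, the number of samples used to build each tree is usually small (256 by default). We can therefore construct a sampled pair RDD in which each row key is a tree index and each row value is the group of data instances sampled for that tree.
2. Train and build each iTree in parallel with a map operation, then collect all iTrees to construct the iForest model.
3. Using the collected iForest model…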
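The steps described above can be illustrated with a small single-process sketch. This is not the repository's Scala implementation: it is plain Python, a dict stands in for the sampled pair RDD (tree index to sampled instances), and all function names (`sample_groups`, `train_forest`, `anomaly_score`) are hypothetical. The scoring formula and the height limit are taken from the iForest papers, on the assumption that the project follows them; the description itself only says that a higher score means a more abnormal instance.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n > 2:
        return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n
    return 1.0 if n == 2 else 0.0

def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # Simplification: stop if the chosen feature is constant here.
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def path_length(tree, x, depth=0):
    """Depth at which x lands, plus c(size) for the unexpanded leaf."""
    if "size" in tree:
        return depth + c(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)

def sample_groups(data, num_trees, max_samples, rng):
    """Step 1: map each tree index to its own sample (stand-in for the pair RDD)."""
    k = min(max_samples, len(data))
    return {t: rng.sample(data, k) for t in range(num_trees)}

def train_forest(groups, rng):
    """Step 2: build one tree per group; on Spark this would be a map over groups."""
    forest = []
    for _, pts in sorted(groups.items()):
        max_depth = math.ceil(math.log2(max(len(pts), 2)))
        forest.append(build_tree(pts, 0, max_depth, rng))
    return forest

def anomaly_score(forest, x, sample_size):
    """Step 3: s(x) = 2 ** (-E[h(x)] / c(sample_size)); higher is more abnormal."""
    mean_h = sum(path_length(t, x) for t in forest) / len(forest)
    return 2.0 ** (-mean_h / c(sample_size))

# Toy data: a Gaussian cluster around the origin plus one distant point.
rng = random.Random(42)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
data.append((10.0, 10.0))

psi = min(256, len(data))  # samples per tree, 256 as in the description
groups = sample_groups(data, num_trees=100, max_samples=psi, rng=rng)
forest = train_forest(groups, rng)

inlier_score = anomaly_score(forest, (0.0, 0.0), psi)
outlier_score = anomaly_score(forest, (10.0, 10.0), psi)
print(inlier_score, outlier_score)
```

Because each tree only needs its own small sample, the trees can be built independently of one another, which is what makes the model-wise parallel training described above possible; scoring only reads the finished forest, so it can be applied to each record independently.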

Resource details

spark-iforest: Isolation Forest on Spark (23 files, 46KB)

spark-iforest-master/
    .gitignore (40B)
    data/
        anomaly-detection/
            breastw.csv (19.66KB)
    src/
        main/
            scala/
                org/
                    apache/
                        spark/
                            ml/
                                iforest/
                                    IForest.scala (32.94KB)
                                    IFNode.scala (466B)
                            examples/
                                ml/
                                    IForestExample.scala (2.37KB)
        test/
            resources/
                log4j.properties (575B)
            scala/
                org/
                    apache/
                        spark/
                            ml/
                                iforest/
                                    IForestSuite.scala (8.96KB)
                            SparkFunSuite.scala (2.37KB)
    .travis.yml (67B)
    LICENSE (11.08KB)
    pom.xml (5.02KB)
    README.md (10.79KB)
    python/
        setup.py (1.20KB)
        .gitignore (244B)
        LICENSE (11.08KB)
        README.md (684B)
        requiremets.txt (35B)
        pyspark_iforest/
            __init__.py (0B)
            ml/
                __init__.py (0B)
                iforest.py (11.00KB)
                util.py (1.57KB)
            example/
                iforest_example.py (1.35KB)
                __init__.py (0B)
