Locality-Sensitive Hashing
2021-10-01 21:17:00 212KB Original LSH
The proliferation of information housed in computerized domains makes it vital to find tools that search these resources efficiently and effectively. Ordinary retrieval techniques are inadequate because sorting such data is simply impossible. Consequently, proximity searching has become a fundamental computational task in a variety of application areas. Similarity Search focuses on the state of the art in developing index structures for searching metric spaces. Part I of the text describes the major theoretical principles and provides an extensive survey of specific techniques for a wide range of applications. Part II concentrates on approaches designed specifically for searching large collections of data. After describing the most popular centralized disk-based metric indexes, it presents approximation techniques as a way to significantly speed up search at the cost of some imprecision in query results. Finally, scalable and distributed metric structures are discussed.
2021-08-10 16:00:00 11.61MB Similarity Search Metric Space
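To make the idea of proximity searching in a metric space concrete, here is a minimal sketch in Python. It is not taken from the book: the names `euclidean`, `range_query`, and `PivotTable` are hypothetical, the data is assumed to be small tuples of floats, and the Euclidean distance stands in for any metric. The sketch contrasts a brute-force range query with one that uses precomputed object-to-pivot distances and the triangle inequality to skip some distance computations.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric."""
    return math.dist(a, b)


def range_query(data: Sequence[Point], query: Point, radius: float,
                dist: Metric = euclidean) -> List[Point]:
    """Brute force: compute the distance from the query to every object."""
    return [x for x in data if dist(query, x) <= radius]


class PivotTable:
    """Hypothetical helper: stores the distance from each object to each pivot."""

    def __init__(self, data: Sequence[Point], pivots: Sequence[Point],
                 dist: Metric = euclidean) -> None:
        self.data = list(data)
        self.pivots = list(pivots)
        self.dist = dist
        # table[i][j] = distance from object i to pivot j, computed once.
        self.table = [[dist(x, p) for p in self.pivots] for x in self.data]

    def range_query(self, query: Point, radius: float) -> Tuple[List[Point], int]:
        """Return the matching objects and the number of objects whose
        distance to the query actually had to be computed."""
        q_to_p = [self.dist(query, p) for p in self.pivots]
        hits: List[Point] = []
        computed = 0
        for x, row in zip(self.data, self.table):
            # Triangle inequality: |d(q, p) - d(x, p)| <= d(q, x).
            # If the lower bound exceeds the radius for any pivot,
            # x cannot be in the result and its distance is skipped.
            if any(abs(qp - xp) > radius for qp, xp in zip(q_to_p, row)):
                continue
            computed += 1
            if self.dist(query, x) <= radius:
                hits.append(x)
        return hits, computed
```

The filtered query returns the same objects as the brute-force one. How many distance computations it saves depends on the choice of pivots and on the data distribution, so any speed-up should be measured rather than assumed.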
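Deep hashing. DeepHash is a lightweight deep-learning-to-hash library that implements state-of-the-art deep hashing/quantization algorithms. We will keep implementing more representative deep hashing models based on what we have released. In particular, we welcome other researchers to contribute deep hashing models to this toolkit based on our framework, and we will announce such contributions in this project. The implemented models include: DQN: Yue Cao, Mingsheng Long, Jianmin Wang, Han Zhu, Qingfu Wen, AAAI Conference on Artificial Intelligence (AAAI), 2016; DHN: Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao, AAAI Conference on Artificial Intelligence (AAAI), 2016; DVSQ: Yue Cao, Mingsheng Long, Jianmin Wang, Shichen Liu, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017; DCH: Yue Cao, Mingsheng Long, Bin Liu, Jianmin Wang, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018; DTQ: Bin Liu, Yue Cao, Mingsheng Long, Jianmin Wang, Jingdong Wang, ACM Multimedia (ACM MM), 2018. Note: DTQ and DCH have been updated, while DQN, DHN, and DVSQ may already be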
Part I Metric Searching in a Nutshell
Overview
1. FOUNDATIONS OF METRIC SPACE SEARCHING
  1 The Distance Searching Problem
  2 The Metric Space
  3 Distance Measures
    3.1 Minkowski Distances
    3.2 Quadratic Form Distance
    3.3 Edit Distance
    3.4 Tree Edit Distance
    3.5 Jaccard’s Coefficient
    3.6 Hausdorff Distance
    3.7 Time Complexity
  4 Similarity Queries
    4.1 Range Query
    4.2 Nearest Neighbor Query
    4.3 Reverse Nearest Neighbor Query
    4.4 Similarity Join
    4.5 Combinations of Queries
    4.6 Complex Similarity Queries
  5 Basic Partitioning Principles
    5.1 Ball Partitioning
    5.2 Generalized Hyperplane Partitioning
    5.3 Excluded Middle Partitioning
    5.4 Extensions
  6 Principles of Similarity Query Execution
    6.1 Basic Strategies
    6.2 Incremental Similarity Search
  7 Policies for Avoiding Distance Computations
    7.1 Explanatory Example
    7.2 Object-Pivot Distance Constraint
    7.3 Range-Pivot Distance Constraint
    7.4 Pivot-Pivot Distance Constraint
    7.5 Double-Pivot Distance Constraint
    7.6 Pivot Filtering
  8 Metric Space Transformations
    8.1 Metric Hierarchies
      8.1.1 Lower-Bounding Functions
    8.2 User-Defined Metric Functions
      8.2.1 Searching Using Lower-Bounding Functions
    8.3 Embedding Metric Space
      8.3.1 Embedding Examples
      8.3.2 Reducing Dimensionality
  9 Approximate Similarity Search
    9.1 Principles
    9.2 Generic Algorithms
    9.3 Measures of Performance
      9.3.1 Improvement in Efficiency
      9.3.2 Precision and Recall
      9.3.3 Relative Error on Distances
      9.3.4 Position Error
  10 Advanced Issues
    10.1 Statistics on Metric Datasets
      10.1.1 Distribution and Density Functions
      10.1.2 Distance Distribution and Density
      10.1.3 Homogeneity of Viewpoints
    10.2 Proximity of Ball Regions
    10.3 Performance Prediction
    10.4 Tree Quality Measures
    10.5 Choosing Reference Points
2. SURVEY OF EXISTING APPROACHES
  1 Ball Partitioning Methods
    1.1 Burkhard-Keller Tree
2019-12-21 20:21:18 11.65MB similarity search lookup metric-space methods