A GENETIC PROGRAMMING APPROACH TO RECORD DEDUPLICATION
This article discusses how genetic programming (GP) can be used for record deduplication, and proposes a GP-based approach to the problem.
Keywords: Genetic Programming, DBMS, Duplication, Optimisation.
Published (Last): 9 March 2008
International Journal of Engineering and Computer Science. Starting from the non-duplicate record set, two different classifiers are used. One of them, a Weighted Component Similarity Summing (WCSS) classifier, is used to tell duplicate records apart from non-duplicate records. A genetic programming (GP) approach to record deduplication is then presented.
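As a rough illustration of what a weighted component similarity summing classifier does, the sketch below scores a record pair as a weighted sum of per-field similarities and compares the score with a threshold. The field names, the weights, the threshold, and the use of `difflib.SequenceMatcher` as the field similarity are all assumptions made for this example; the article does not specify them.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1] (assumption: any measure could be used)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def wcss_similarity(rec1: dict, rec2: dict, weights: dict) -> float:
    """Weighted sum of per-field similarities; weights are assumed to sum to 1."""
    return sum(w * field_similarity(rec1[f], rec2[f]) for f, w in weights.items())


def is_duplicate(rec1: dict, rec2: dict, weights: dict, threshold: float = 0.85) -> bool:
    """Declare a pair duplicate when the weighted score reaches the threshold."""
    return wcss_similarity(rec1, rec2, weights) >= threshold


# Hypothetical weights and records, for illustration only.
WEIGHTS = {"name": 0.5, "city": 0.25, "zip": 0.25}
rec_a = {"name": "John Smith", "city": "Boston", "zip": "02101"}
rec_b = {"name": "Jon Smith", "city": "Boston", "zip": "02101"}
rec_c = {"name": "Maria Garcia", "city": "Lisbon", "zip": "99999"}

print(is_duplicate(rec_a, rec_b, WEIGHTS))
print(is_duplicate(rec_a, rec_c, WEIGHTS))
```

Fixed weights keep the sketch short. A real classifier of this kind would need some way of choosing the weights and the threshold for the data at hand.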
Unsupervised Duplication Detection (UDD) can, for a given query, effectively identify duplicates among the query result records of different web databases.
Suresh Babu
The aim is to create a flexible and effective method that uses data mining algorithms.
A Genetic Programming Approach for Record Deduplication
The existing system aims at providing an Unsupervised Duplication Detection (UDD) method, which can be used to identify and remove duplicate records from different data stores.
Since record deduplication is a time-consuming task even for small repositories, the aim is to develop a method that finds a suitable combination of the best pairs of attribute and similarity function, thus yielding a deduplication function that maximizes performance while using only a small representative portion of the corresponding data for training.
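The training idea described above can be sketched as a very small genetic programming loop: candidate deduplication functions are expression trees over attribute similarities, and each tree is scored by its F1 on a small labelled sample. Everything concrete here is an assumption made for illustration: the attribute names, the operator set, the fixed decision boundary of 1.0, the selection scheme, the population settings, and the four training pairs are not taken from the article.

```python
import random
from difflib import SequenceMatcher

ATTRS = ["name", "city"]  # hypothetical attributes
OPS = {"+": lambda x, y: x + y, "*": lambda x, y: x * y, "max": max}


def sim(a, b):
    """Stand-in similarity function in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def random_tree(rng, depth=2):
    """Grow a random tree whose leaves are attribute names or float constants."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(ATTRS) if rng.random() < 0.7 else round(rng.uniform(0, 2), 2)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, r1, r2):
    """Evaluate a tree on a record pair."""
    if isinstance(tree, float):
        return tree
    if isinstance(tree, str):
        return sim(r1[tree], r2[tree])
    op, left, right = tree
    return OPS[op](evaluate(left, r1, r2), evaluate(right, r1, r2))


def f1(tree, pairs, boundary=1.0):
    """Fitness: F1 of the tree's predictions on labelled pairs."""
    tp = fp = fn = 0
    for r1, r2, label in pairs:
        pred = evaluate(tree, r1, r2) >= boundary
        tp += pred and label
        fp += pred and not label
        fn += (not pred) and label
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def mutate(tree, rng):
    """Replace a randomly chosen subtree with a fresh random one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))


def crossover(a, b, rng):
    """Graft a random subtree of b into a random position of a."""
    donor = b
    while isinstance(donor, tuple) and rng.random() < 0.5:
        donor = rng.choice(donor[1:])
    if not isinstance(a, tuple) or rng.random() < 0.3:
        return donor
    op, left, right = a
    if rng.random() < 0.5:
        return (op, crossover(left, b, rng), right)
    return (op, left, crossover(right, b, rng))


def evolve(pairs, generations=15, pop_size=30, seed=0):
    """Evolve trees; the best half survives each generation unchanged."""
    rng = random.Random(seed)
    pop = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda t: f1(t, pairs), reverse=True)
        history.append(f1(pop[0], pairs))
        parents = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    best = max(pop, key=lambda t: f1(t, pairs))
    return best, f1(best, pairs), history


# Small hypothetical training sample: (record, record, is_replica).
TRAIN = [
    ({"name": "John Smith", "city": "Boston"}, {"name": "Jon Smith", "city": "Boston"}, True),
    ({"name": "Maria Garcia", "city": "Lisbon"}, {"name": "Maria Garcia", "city": "Lisboa"}, True),
    ({"name": "John Smith", "city": "Boston"}, {"name": "Maria Garcia", "city": "Lisbon"}, False),
    ({"name": "Wei Chen", "city": "Austin"}, {"name": "Ana Silva", "city": "Porto"}, False),
]

best_tree, best_f1, history = evolve(TRAIN)
print(best_tree, best_f1)
```

Because the best half of each generation is carried over unchanged, the best fitness recorded in `history` never decreases. A practical system would evaluate the evolved function on held-out data rather than on the training sample alone.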
Vol 2 No 06
Chitra Devi, S.
Several systems that rely on the integrity of their data in order to offer high-quality services, such as digital libraries and e-commerce brokers, may be affected by the existence of duplicate, quasi-replica, or near-duplicate entries in their repositories.
The approach combines several different pieces of evidence, each pairing an attribute with a similarity function and extracted from the data content, to produce a deduplication function that is able to identify whether two or more entries in a repository are replicas or not.
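One way to picture a deduplication function built from attribute and similarity-function pairs is as a small expression tree whose leaves are those pairs and whose value is compared with a boundary. The sketch below is only an illustration under assumptions: the evidence names (`e_name`, `e_addr`), the two similarity functions, the example tree, and the boundary value are hypothetical and not taken from the article.

```python
from difflib import SequenceMatcher


def edit_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a: str, b: str) -> float:
    """Token-set overlap in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Each piece of evidence pairs an attribute with a similarity function.
EVIDENCE = {
    "e_name": ("name", edit_ratio),
    "e_addr": ("address", jaccard),
}


def evaluate(tree, rec1, rec2):
    """Evaluate a tree of ("+" or "*", left, right), evidence names, and constants."""
    if isinstance(tree, (int, float)):
        return float(tree)
    if isinstance(tree, str):
        attr, similarity = EVIDENCE[tree]
        return similarity(rec1[attr], rec2[attr])
    op, left, right = tree
    lv, rv = evaluate(left, rec1, rec2), evaluate(right, rec1, rec2)
    return lv + rv if op == "+" else lv * rv


def are_replicas(tree, rec1, rec2, boundary):
    """A pair is flagged as replicas when the tree's value reaches the boundary."""
    return evaluate(tree, rec1, rec2) >= boundary


# Hypothetical function: 2 * sim(name) + sim(address).
TREE = ("+", ("*", 2.0, "e_name"), "e_addr")
r1 = {"name": "John Smith", "address": "12 Main Street"}
r2 = {"name": "Wei Chen", "address": "7 Oak Avenue"}

print(are_replicas(TREE, r1, r1, boundary=2.0))
print(are_replicas(TREE, r1, r2, boundary=2.0))
```

Representing the function as a tree is what lets a genetic programming search recombine and mutate candidate functions piece by piece.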
But the optimization of ho is less.