Handling Redundancy in Data Integration

  • Redundant data often occur when multiple databases are integrated
    • Object identification: The same attribute or object may have different names in different databases
    • Derivable data: One attribute may be a “derived” attribute in another table, e.g., annual revenue
  • Redundant attributes can often be detected by correlation analysis and covariance analysis
  • Careful integration of data from multiple sources can help reduce or avoid redundancies and inconsistencies, and improve mining speed and quality
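
The correlation check mentioned above can be sketched roughly as follows. This is a minimal illustration, not a prescribed method: the table contents, the attribute names (`monthly_revenue`, `annual_revenue`, `employees`), the helper `redundant_pairs`, and the 0.95 threshold are all assumptions made for the example.

```python
from math import sqrt

def covariance(xs, ys):
    """Population covariance of two equally long numeric columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    std_x = sqrt(covariance(xs, xs))
    std_y = sqrt(covariance(ys, ys))
    if std_x == 0 or std_y == 0:
        return 0.0
    return covariance(xs, ys) / (std_x * std_y)

def redundant_pairs(table, threshold=0.95):
    """List attribute pairs whose |correlation| meets the (assumed) threshold."""
    names = list(table)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(table[a], table[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs

# Hypothetical integrated table: annual_revenue is derivable from
# monthly_revenue (12 x monthly), so the two should be flagged.
table = {
    "monthly_revenue": [10.0, 12.0, 9.0, 15.0, 11.0],
    "annual_revenue": [120.0, 144.0, 108.0, 180.0, 132.0],
    "employees": [5.0, 3.0, 8.0, 4.0, 9.0],
}

pairs = redundant_pairs(table)
print(pairs)
```

A high absolute correlation only suggests that one attribute may be redundant; whether to drop it is still a judgment call, since correlation does not imply that one attribute is derived from the other.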
