Data Integration

  • Data integration:
    • Combines data from multiple sources into a coherent store
  • Schema integration:
    • Integrate metadata from different sources, e.g., A.cust-id ≡ B.cust-#
  • Entity identification problem:
    • Identify the same real-world entity across multiple data sources, e.g., Bill Clinton = William Clinton
  • Detecting and resolving data value conflicts
    • For the same real-world entity, attribute values from different sources may differ
    • Possible reasons: different representations or different scales, e.g., metric vs. British units
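
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed method: the two sources, the records, the height attributes, the alias table, and all function names are hypothetical, and real entity identification usually needs far more than a first-name lookup.

```python
# Hypothetical records from two sources; every name and value here is made up.
source_a = [{"cust-id": 1, "name": "Bill Clinton", "height_cm": 188.0}]
source_b = [{"cust-#": 1, "name": "William Clinton", "height_in": 74.0}]

# Schema integration: map source-specific attribute names to shared ones.
SCHEMA_MAP = {
    "A": {"cust-id": "customer_id"},
    "B": {"cust-#": "customer_id"},
}

# Entity identification: an assumed alias table for first names.
ALIASES = {"bill": "william"}

CM_PER_INCH = 2.54


def to_shared_schema(record, source):
    """Rename attributes using SCHEMA_MAP; unmapped names are kept as-is."""
    mapping = SCHEMA_MAP[source]
    return {mapping.get(key, key): value for key, value in record.items()}


def canonical_name(name):
    """Lowercase the name and replace a known first-name alias."""
    parts = name.lower().split()
    parts[0] = ALIASES.get(parts[0], parts[0])
    return " ".join(parts)


def normalize_units(record):
    """Convert a height in inches to centimetres so values are comparable."""
    out = dict(record)
    if "height_in" in out:
        out["height_cm"] = round(out.pop("height_in") * CM_PER_INCH, 1)
    return out


def integrate(records_by_source):
    """Group records from all sources by canonical entity name."""
    merged = {}
    for source, records in records_by_source.items():
        for record in records:
            rec = normalize_units(to_shared_schema(record, source))
            entity = merged.setdefault(canonical_name(rec["name"]), [])
            entity.append(rec)
    return merged


def has_conflict(records, attribute, tolerance=0.5):
    """Report whether a numeric attribute still disagrees after normalization."""
    values = [r[attribute] for r in records if attribute in r]
    return bool(values) and max(values) - min(values) > tolerance


integrated = integrate({"A": source_a, "B": source_b})
for entity, records in integrated.items():
    print(entity, records, "conflict:", has_conflict(records, "height_cm"))
```

Here both records end up under one entity, and the apparent height conflict (188 vs. 74) disappears once the values share a unit; a conflict that survives normalization would still have to be resolved by some chosen policy.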
