Normalization: other languages

  • Normalization of variant written forms, such as dates

    • 7月30日 vs. 7/30
    • Japanese use of kana vs. Chinese characters
  • Tokenization and normalization may depend on the language, and so are intertwined with language detection

  • Crucial: Need to “normalize” indexed text as well as query terms into the same form
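
The last point can be illustrated with a minimal Python sketch, assuming a toy inverted index. The names `normalize`, `build_index`, and `search`, the date rule that rewrites 7月30日 as 7/30, and the whitespace tokenization are all illustrative assumptions, not something the slide prescribes. Kana vs. Chinese character variation is not handled here, since it generally requires a dictionary or morphological analyzer.

```python
import re
import unicodedata

# Hypothetical rule: rewrite Japanese-style dates such as "7月30日" as "7/30".
_JA_DATE = re.compile(r"(\d{1,2})月(\d{1,2})日")


def normalize(text: str) -> str:
    """Map text to one canonical form; used for both documents and queries."""
    # NFKC folds compatibility variants, e.g. full-width digits to ASCII digits.
    text = unicodedata.normalize("NFKC", text)
    text = _JA_DATE.sub(r"\1/\2", text)
    return text.casefold()


def build_index(docs: dict) -> dict:
    """Build a toy inverted index from normalized, whitespace-split tokens."""
    index = {}
    for doc_id, text in docs.items():
        # Whitespace splitting is a simplification; real Japanese text
        # needs a proper tokenizer.
        for token in normalize(text).split():
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Look up a single-term query after applying the same normalization."""
    return index.get(normalize(query), set())


docs = {1: "Meeting 7月30日", 2: "Meeting 7/30"}
index = build_index(docs)
```

Because the same `normalize` function runs at index time and at query time, a search for either 7月30日 or 7/30 finds both documents. If only one side were normalized, the two date forms would never match.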

