Statistical and knowledge-based methods
- What are statistical methods?
  - The knowledge in the system is based on large amounts of data.
  - The system learns the knowledge.
  - The knowledge may be procedural, as well as declarative.
  - What is learned is relatively unstructured.
- What are knowledge-based methods?
  - The knowledge in the system is based on a theory of language or language processing.
  - The system is programmed to have the knowledge.
  - The knowledge is almost always in a declarative form.
  - The knowledge may be highly structured.
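
The contrast can be made concrete with a toy sketch. This is an illustration only, not a real system: the rule table, the corpus, and the names `PLURAL_RULES`, `pluralize`, `train_bigrams`, and `most_likely_next` are all hypothetical. The first half is knowledge-based (a programmer writes declarative rules); the second half is statistical (the "knowledge" is just counts learned from data).

```python
from collections import Counter

# Knowledge-based: hand-written, declarative rules (hypothetical and incomplete;
# e.g. they get "day" wrong because vowel + "y" is not handled).
PLURAL_RULES = [
    ("y", "ies"),  # city -> cities
    ("s", "ses"),  # bus -> buses
    ("", "s"),     # default: cat -> cats
]

def pluralize(noun):
    """Apply the first rule whose suffix matches the noun."""
    for suffix, replacement in PLURAL_RULES:
        if noun.endswith(suffix):
            stem = noun[: len(noun) - len(suffix)]
            return stem + replacement
    return noun

# Statistical: nothing about language is programmed in; the system only counts
# which word follows which in the data it is given.
TOY_CORPUS = ["the dog barks", "the dog sleeps", "the cat sleeps"]

def train_bigrams(corpus):
    """Count, for each word, the words that immediately follow it."""
    counts = {}
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts.setdefault(prev, Counter())[nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    following = counts.get(word)
    if not following:
        return None
    return following.most_common(1)[0][0]
```

For example, `pluralize("city")` applies the first rule, while `most_likely_next(train_bigrams(TOY_CORPUS), "the")` depends entirely on what the corpus happens to contain. The rule table is structured and inspectable; the count table is relatively unstructured, as noted above.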
Statistical methods: what's hard?
- What makes something difficult or inappropriate for statistical methods?
  - It's not really a statistical phenomenon; the rules are all-or-none and inferrable from a small number of examples.
  - It is inherently structural, so statistical methods are not very effective at learning it.
  - There's not enough data to learn it.
- Examples
  - Morphological analysis and generation
  - Sentence generation
  - Some aspects of syntax (especially agreement)
  - Logic
Knowledge-based methods: what's hard?
- What makes something difficult or inappropriate for knowledge-based methods?
  - It's only really a tendency, not all-or-none or rigid.
  - People aren't aware of it, or it's difficult to put into words.
  - It's very complex, involving many exceptions and context-specific rules.
- Examples
  - Categories and categorization
  - Word sense
  - Cross-lingual word and phrase correspondences
  - Collocations
  - Constituent order(?)
  - Valency(?)
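
Collocations show why a tendency is easier to count than to write down as a rule. One common way to score them, used here as an assumption rather than as the method these notes prescribe, is pointwise mutual information (PMI): how much more often two words occur together than chance would predict. The token list and the names `TOKENS` and `make_pmi` are hypothetical, and a corpus this small only illustrates the arithmetic.

```python
import math
from collections import Counter

# Hypothetical toy data; a real estimate needs a large corpus.
TOKENS = ("strong tea is strong tea and "
          "powerful computer is powerful computer").split()

def make_pmi(tokens):
    """Build a PMI scorer for adjacent word pairs from a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(x, y):
        joint = bigrams[(x, y)]
        if joint == 0:
            return None  # pair never observed; PMI is undefined here
        p_xy = joint / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        return math.log2(p_xy / (p_x * p_y))

    return pmi

pmi = make_pmi(TOKENS)
```

In this data `pmi("strong", "tea")` is positive, while `pmi("strong", "computer")` is `None` because the pair never occurs. No rule says that tea is "strong" rather than "powerful"; the preference only shows up in the counts.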