Presentation on a proposed Active Learning method for reducing the number of annotated training examples a machine learning algorithm requires, since annotating examples is slow and expensive.

The usual way to train an NLP algorithm is to randomly select a portion of the data to annotate. The proposed method instead annotates a small random sample to start, then selects each subsequent sample to annotate based on the algorithm's output for that sample having a low ''prediction margin'':

'''Prediction Margin = confidence for the best class - confidence for the second-best class'''

In other words, the margin measures how sure the algorithm is that its best answer is correct; a low margin marks an ambiguous example, which makes it an informative one to annotate next.
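A minimal sketch of margin-based selection, assuming a scikit-learn-style classifier with a <code>predict_proba</code> method (the data, seed set, and batch size below are hypothetical stand-ins, not details from the presentation):

<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data: a small annotated seed set and a large unlabelled pool.
X_seed = rng.random((20, 5))
y_seed = rng.integers(0, 3, size=20)           # e.g. a three-class task
X_pool = rng.random((1000, 5))

# Train an initial model on the randomly chosen seed set.
model = LogisticRegression(max_iter=1000).fit(X_seed, y_seed)

# Prediction margin: confidence of the best class minus the second best.
probs = np.sort(model.predict_proba(X_pool), axis=1)   # ascending per row
margins = probs[:, -1] - probs[:, -2]

# Annotate the lowest-margin (most ambiguous) samples next.
batch_size = 10                                # hypothetical batch size
query_indices = np.argsort(margins)[:batch_size]
</syntaxhighlight>

In a full active-learning loop this selection step would alternate with annotation and retraining until performance stops improving.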

The presentation showed that, in general (though not for every example), the Active Learning method needed fewer annotated examples to reach a high level of confidence.