#ml

Beautiful and systematic derivation showing how and why negative sampling works

Negative sampling is a great technique to estimate the softmax especially when the calculation of the partition function is intractable. It's used in word2vec, and many other models such as node2vec.



Goldberg Y, Levy O. word2vec Explained: deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv [cs.CL]. 2014. Available: http://arxiv.org/abs/1402.3722
 
 
Back to Top