Month: December 2017

Char2vec – Character embeddings for word similarity

Most of my applied data science work is in text-heavy domains where the objects are small, there isn’t a clear “vocabulary”, and most of the tasks focus on similarity. My go-to tool is almost always cosine similarity, although other tools such as Levenshtein distance or character n-grams also feature heavily. The reason these tools are …
