Embeddings as Dirichlet counts: Attention is the tip of the iceberg.

Summary: Imagine words as points on a giant map inside a computer's brain. Even though we speak or write one word at a time, computers understand language by looking at the distances between these points on the map. This paper argues that these "word maps" make sense because language connects to a huge system of ideas in our minds. It also shows that these points act like running counts that help estimate how often certain words show up together.

Tags

Ice Cover