NLP-C1-W4: Machine Translation and Document Search
https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/4
Machine Translation Key Ida
Let be a matrix with each row as an English word, let be a matrix with each row as the corresponding French word. The translation matrix is the matrix that minimizes the squared Frobenius norm of the difference:
can be obtained by iterative update via gradient:
Using K-nearest Neighborhood for Translation
Once we have the matrix, we can transform an English word into a French word vector space. However, we need to use the K-nearest neighbor search to identify exactly which word to translate.
Locality Sensitive Hashing for Faster Search
We can define hyperplanes with a normal vector matrix . Each word can be represented with or as below or above the hyperlane . The final has value of each word can be calculated as
So, overall, we can first determine the number of hash buckets . This can be decided by considering how many items we want in each bucket and dividing the total number of items by the desired item count per bucket. Then, generate hyperplanes based on the determined number of bucket . For each set of hyperplanes, if two items have the same hash value, they are in the same bucket. We can repeat this process many times to improve accuracy.
Approximate K-nearest Neighborhood
Use multiple sets of hyperplanes to determine the final set of vectors that are potential candidates of nearest neighbors.
Flow:
Generate multiple hash tables
Given a target vector, hash it, and grab all items that are in the hash tables across all hash tables.
Run the nearest neighbor search on the reduced candidate list.
Code Example
// Some code
num_dimensions = 2
num_planes = 3
random_panes_matrix = np.random.normal(size = (num_palnes, num_dimensions ))
def side_of_plane_matrix(P,v):
dotproduct = np.dot(P,v.T)
sign_of_dot_product = np.sign(dotproduct)
return sign_of_dot_product
num_planes_matrix = side_of_plane_matrix(random_panes_matrix , v)
# then use \sum_{i=0}^n 2^i * w_i to calculate the hash value
Completed Notebook
Manual PCA
Machine Translation and Nearest Neighbor Search Notebook
LSH and approximate neighborhood
Manual Gradient Descent
Last updated