This month, we released Facebook AI Similarity Search (Faiss), a library that allows us to quickly search for multimedia documents that are similar to each other, a challenge where traditional query search engines fall short. We've built nearest-neighbor search implementations for billion-scale data sets that are some 8.5x faster than the previously reported state of the art, along with the fastest k-selection algorithm on the GPU known in the literature. This lets us break some records, including the first k-nearest-neighbor graph constructed on 1 billion high-dimensional vectors.

Traditional databases are made up of structured tables containing symbolic information. For example, an image collection would be represented as a table with one row per indexed photo. Each row contains information such as an image identifier and descriptive text. Rows can be linked to entries from other tables as well, such as an image with people in it being linked to a table of names.

AI tools, like text embedding (word2vec) or convolutional neural net (CNN) descriptors trained with deep learning, generate high-dimensional vectors. These representations are much more powerful and flexible than a fixed symbolic representation, as we'll explain in this post. Yet traditional databases that can be queried with SQL are not adapted to these new representations. First, the huge inflow of new multimedia items creates billions of vectors.
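To make the idea of similarity search over vectors concrete, here is a minimal brute-force sketch in plain Python. It is not Faiss's API or its algorithm: the function names (`l2_distance`, `knn_search`), the toy three-dimensional "embeddings", and the choice of Euclidean distance are all assumptions made for illustration. It compares the query against every stored vector and keeps the k closest, which is the exhaustive baseline that becomes impractical once a collection reaches billions of vectors.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_search(
    database: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    """Return up to k (distance, index) pairs, nearest first.

    Every database vector is scored against the query, so the cost grows
    linearly with both the number of vectors and their dimension.
    """
    scored = (
        (l2_distance(vec, query), idx) for idx, vec in enumerate(database)
    )
    # Keeping only the k smallest distances is the "k-selection" step.
    return heapq.nsmallest(k, scored)


# Hypothetical toy data: each row stands in for one item's descriptor.
database = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [5.0, 5.0, 5.0],
]
query = [0.9, 0.1, 0.0]

neighbors = knn_search(database, query, k=2)
print([idx for _, idx in neighbors])  # prints [1, 0]
```

Unlike a SQL lookup, the query here does not need to match any stored row exactly: the result is a ranking by distance, so the closest items are returned even when nothing is identical to the query.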