Vectors: Difference between revisions

From Freephile Wiki
In computing, a vector is a one-dimensional array data structure. The [https://vectors.nlpl.eu/ Nordic Language Processing Laboratory], [https://www.mn.uio.no/ifi/english/research/groups/ltg/ Language Technology Group] at the University of Oslo, Norway publishes research tools that help visualize how word vectors work in LLMs. For instance, here's the [https://vectors.nlpl.eu/explore/embeddings/en/MOD_enwiki_upos_skipgram_300_2_2021/cat_NOUN/ vector for cat] showing word relationships using '''English Wikipedia''' as the training corpus. (Click the link that says "Show the raw vector" to see the full numerical word vector.)


[[wp:Vector_database|Vector databases]], and graph databases with vector search such as [[wp:Neo4j|Neo4j]], have been important for quite some time. They are even more important now that [[Artificial Intelligence]] is mainstream. A vector database is a collection of data stored as mathematical representations (vectors). Vector databases make it possible for computer programs to draw comparisons, identify relationships, and understand context. They enable '''Semantic Search''', which is search based on meaning rather than exact text matching. Semantic search has been around for decades, but the tagging and ontologies it once relied on have largely given way to LLM-based approaches. Vector databases, in turn, support advanced artificial intelligence (AI) programs such as large language models (LLMs).
[[Vector database]]s are those that are specifically designed to work with vector datasets and data types.
 
There are many open source vector databases<ref>mw:Vector_database#Implementations</ref> such as Apache Cassandra, [[Elasticsearch]], Meilisearch and MongoDB. ([[MariaDB]]<cite>https://mariadb.org/amazon-mariadb-vector/</cite> and [[PostgreSQL]]<cite>https://github.com/pgvector/pgvector</cite> also appear to offer vector capability.) One interesting open source vector database is Memgraph. Memgraph uses the same Cypher query language as Neo4j. However, it is written in C++ and integrates better with Python than Neo4j, which is Java-based. An interesting case study is how NASA is building a People Knowledge Graph with LLMs and Memgraph<cite>https://www.theregister.com/2025/05/07/nasa_people_memgraph/</cite>. {{#ev:youtube|https://www.youtube.com/watch?v=xqJhzuWAGtA}}
 


{{References}}
[[Category:Artificial Intelligence]]
[[Category:Database]]

Latest revision as of 11:58, 10 May 2025

Disambiguation: The term wp:Vector has many different definitions depending on the context.

In biology, a vector is an organism, often an insect or animal, that transmits a disease-causing agent (pathogen) from one host to another. In molecular biology, a vector is a DNA molecule (often a plasmid or virus) that is used as a vehicle to carry a particular DNA segment into a host cell as part of a cloning or recombinant DNA technique. The vector typically assists in replicating and/or expressing the inserted DNA sequence inside the host cell.[1] So, while understanding vectors in biology is important for disease prevention, a vector in domains such as math or computing is something altogether different.

Computing

In computing, a vector is a one-dimensional array data structure. The Nordic Language Processing Laboratory, Language Technology Group at the University of Oslo, Norway publishes research tools that help visualize how word vectors work in LLMs. For instance, here's the vector for cat showing word relationships using English Wikipedia as the training corpus. (Click the link that says "Show the raw vector" to see the full numerical word vector.)
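
As a rough illustration of how word vectors can express word relationships, the sketch below compares a few toy vectors using cosine similarity, one common way to measure how closely two vectors point in the same direction. The four-dimensional values and the names toy_vectors and cosine_similarity are made up for this example; real word vectors are learned from a training corpus and typically have hundreds of dimensions.

```python
import math

# Toy 4-dimensional vectors. The numbers are invented for illustration only;
# they are NOT taken from any real embedding model.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "dog": [0.8, 0.9, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9, 0.8],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])

# With these toy values, "cat" scores closer to "dog" than to "car".
print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

A score near 1 means the two vectors point in nearly the same direction, which is how related words end up "close" to each other.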

Vector databases are those that are specifically designed to work with vector datasets and data types.
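
To make the idea concrete, here is a minimal sketch of the core operation a vector database provides: store vectors, then return the stored items most similar to a query vector. TinyVectorStore is a hypothetical in-memory class written for this example, not the API of any real product, and the vectors are invented. Production systems generally use specialized indexes rather than the simple linear scan shown here.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self._items = []  # list of (label, vector) pairs

    def add(self, label, vector):
        """Store a vector under a label."""
        self._items.append((label, list(vector)))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, query, k=2):
        """Return the labels of the k stored vectors most similar to the query."""
        scored = [(self._cosine(query, vec), label) for label, vec in self._items]
        scored.sort(reverse=True)  # highest similarity first
        return [label for _, label in scored[:k]]


store = TinyVectorStore()
# Invented toy vectors; a real system would store learned embeddings.
store.add("cat", [0.9, 0.8, 0.1, 0.0])
store.add("dog", [0.8, 0.9, 0.2, 0.1])
store.add("car", [0.1, 0.0, 0.9, 0.8])

# A made-up query vector that resembles the "animal" vectors above.
results = store.search([1.0, 0.7, 0.0, 0.0], k=2)
print(results)
```

The search compares meaning as encoded in the numbers rather than matching text, so the query never has to contain the stored labels.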

References