What is a vector in machine learning?

In machine learning, a vector is a fundamental mathematical object: an ordered list of numbers, typically used to encode a data point, a set of features, or a model's parameters. Algorithms such as linear regression, neural networks, and support vector machines operate on vectors directly, transforming them, combining them with weights, and comparing them by distance or similarity.

A vector is essentially an ordered array of numerical values, often arranged in a single row (row vector) or a single column (column vector). Each value in the vector corresponds to a specific dimension or feature. For example, in a 2-dimensional space, a vector could represent a point with x and y coordinates, like [x, y].

Key Characteristics:

  • Dimensionality: The number of elements in a vector defines its dimension. A vector with n elements is called an n-dimensional vector.
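  • Operations: Common operations include addition, subtraction, scalar multiplication, and the dot product. These underlie both model training and the geometric view of data; a linear model's prediction, for example, is the dot product of a weight vector and a feature vector, plus a bias term.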
  • Usage: Vectors are used to represent input data, weights in neural networks, embeddings, and more.
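The characteristics above can be sketched in a few lines of plain Python, treating a list of numbers as a vector. The helper names (add, subtract, scale, dot) are illustrative and not taken from any library; in practice a numerical library such as NumPy would normally be used instead.

```python
def _check_same_dimension(u, v):
    """Raise an error if the two vectors have different dimensions."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")


def add(u, v):
    """Element-wise sum of two vectors."""
    _check_same_dimension(u, v)
    return [a + b for a, b in zip(u, v)]


def subtract(u, v):
    """Element-wise difference of two vectors."""
    _check_same_dimension(u, v)
    return [a - b for a, b in zip(u, v)]


def scale(c, u):
    """Multiply every element of a vector by the scalar c."""
    return [c * a for a in u]


def dot(u, v):
    """Dot product: sum of element-wise products."""
    _check_same_dimension(u, v)
    return sum(a * b for a, b in zip(u, v))


u = [1, 2, 3]
v = [4, 5, 6]

print(len(u))          # dimension of u: 3
print(add(u, v))       # [5, 7, 9]
print(subtract(u, v))  # [-3, -3, -3]
print(scale(2, u))     # [2, 4, 6]
print(dot(u, v))       # 32
```

Addition, subtraction, and the dot product are only defined here for vectors of equal dimension, which is why each helper checks the lengths first.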

Example:

Suppose you want to represent a house for a real estate price prediction model. You might use a feature vector like:

[1500, 3, 2, 1] 

Here, this 4-dimensional vector could represent:

  • 1500: square footage
  • 3: number of bedrooms
  • 2: number of bathrooms
  • 1: whether it has a garage (1 for yes, 0 for no)

This vector is then fed into a machine learning model to predict the house's price.
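To make "fed into a model" concrete, here is a minimal sketch that assumes the model is a simple linear one. The weights and bias below are made-up numbers chosen for illustration, not values fitted to any real data, and predict_price is a hypothetical helper name.

```python
def predict_price(features, weights, bias):
    """Linear model: dot product of weights and features, plus a bias."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same dimension")
    return sum(w * x for w, x in zip(weights, features)) + bias


# Feature vector from the example:
# [square footage, bedrooms, bathrooms, has garage]
house = [1500, 3, 2, 1]

# Hypothetical weights (price contribution per unit of each feature)
# and bias; a real model would learn these from training data.
weights = [150.0, 10000.0, 5000.0, 8000.0]
bias = 20000.0

print(predict_price(house, weights, bias))  # 293000.0
```

Each weight lines up with one position in the feature vector, which is why the order of the elements must stay the same for every house passed to the model.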

Many machine learning applications, especially those built on embeddings, need to store and search very large collections of high-dimensional vectors. For such needs, services like Tencent Cloud's Vector Database (Tencent Cloud VectorDB) are designed to store, index, and query high-dimensional vectors, which suits AI applications like recommendation systems, semantic search, and image retrieval. These services help manage the computational demands of handling millions or even billions of vectors.
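The kind of query such a service answers can be illustrated with a brute-force similarity search. This sketch is not the Tencent Cloud VectorDB API; it is a plain Python illustration with hypothetical names (cosine_similarity, top_k) and made-up item vectors. It scans every stored vector, whereas a vector database typically relies on an index to avoid a full scan.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def top_k(query, items, k):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Made-up 3-dimensional vectors standing in for stored embeddings.
items = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.9, 0.1, 0.0],
    "item_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

print(top_k(query, items, 2))  # ['item_a', 'item_b']
```

The cost of this approach grows linearly with the number of stored vectors, which is the scaling problem that dedicated vector indexing is meant to address.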