
Showing content from https://developers.google.com/machine-learning/crash-course/numerical-data/feature-vectors below:

Numerical data: How a model ingests data using feature vectors | Machine Learning


Until now, we've given you the impression that a model acts directly on the rows of a dataset; however, models actually ingest data somewhat differently.

For example, suppose a dataset provides five columns, but only two of those columns (b and d) are features in the model. When processing the example in row 3, does the model simply grab the contents of the highlighted two cells (3b and 3d) as follows?

Figure 1. Not exactly how a model gets its examples.

In fact, the model ingests an array of floating-point values called a feature vector. You can think of a feature vector as the floating-point values that make up one example.

Figure 2. Closer to the truth, but not realistic.
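The idea in Figure 2 can be sketched in a few lines of Python. This is only an illustration: the dataset contents, the column names a through e, and the helper to_feature_vector are assumptions made up for the example, not part of the course material.

```python
# Hypothetical dataset with five columns (a-e); the values are invented.
dataset = [
    {"a": 1, "b": 4.0, "c": "north", "d": 120, "e": True},
    {"a": 2, "b": 7.5, "c": "south", "d": 95, "e": False},
    {"a": 3, "b": 2.25, "c": "east", "d": 310, "e": True},
]

# Only columns b and d are features in this (assumed) model.
FEATURE_COLUMNS = ("b", "d")


def to_feature_vector(row, feature_columns=FEATURE_COLUMNS):
    """Return the selected columns of one example as a list of floats."""
    return [float(row[name]) for name in feature_columns]


# Row 3 of the dataset is at index 2.
feature_vector = to_feature_vector(dataset[2])
print(feature_vector)
```

Note that the vector holds only the feature columns, in a fixed order, and that every entry is a float even when the raw cell held an integer.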

However, feature vectors seldom use the dataset's raw values. Instead, you must typically process the dataset's values into representations that your model can better learn from. So, a more realistic feature vector might look something like this:

Figure 3. A more realistic feature vector.

Wouldn't a model produce better predictions by training from the actual values in the dataset than from altered values? Surprisingly, the answer is no.

You must determine the best way to represent raw dataset values as trainable values in the feature vector. This process is called feature engineering, and it is a vital part of machine learning. The most common feature engineering techniques are:

Normalization: Converting numerical values into a standard range.
Binning (also referred to as bucketing): Converting numerical values into buckets of ranges.
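As a rough sketch of these two techniques, the following Python applies each one to a small list of values. It assumes min-max scaling to the range 0 to 1 as the normalization method, which is only one of several possible choices, and the function names, sample values, and bucket boundaries are all hypothetical.

```python
from bisect import bisect_right


def min_max_normalize(values):
    """Scale values linearly into the range [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bucketize(values, boundaries):
    """Return the bucket index of each value.

    boundaries must be sorted. Bucket 0 holds values below boundaries[0];
    a value equal to a boundary falls into the higher bucket.
    """
    return [bisect_right(boundaries, v) for v in values]


raw = [12.0, 20.0, 28.0, 36.0]        # invented raw feature values
normalized = min_max_normalize(raw)    # each value now lies in [0.0, 1.0]
buckets = bucketize(raw, [15.0, 30.0]) # three buckets: <15, 15 to <30, >=30

print(normalized)
print(buckets)
```

The bucket indices here are integers, so they would still need to be represented as floating-point values before they go into a feature vector.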

This unit covers normalizing and binning. The next unit, Working with categorical data, covers other forms of preprocessing, such as converting non-numerical data, like strings, to floating-point values.

Every value in a feature vector must be a floating-point value. However, many features are naturally strings or other non-numerical values. Consequently, a large part of feature engineering is representing non-numerical values as numerical values. You'll see a lot of this in later modules.
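To make this concrete, here is a minimal sketch of one widely used way to represent strings as floating-point values, namely one-hot encoding. It is an assumed example rather than the method this unit prescribes, and the helper names and the color values are hypothetical.

```python
def build_vocabulary(values):
    """Map each distinct string to an index, in sorted order."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


def one_hot(value, vocabulary):
    """Return a float vector with 1.0 at the value's index and 0.0 elsewhere."""
    vector = [0.0] * len(vocabulary)
    vector[vocabulary[value]] = 1.0
    return vector


colors = ["red", "green", "blue", "green"]  # invented string feature
vocabulary = build_vocabulary(colors)
encoded = [one_hot(color, vocabulary) for color in colors]

for color, vector in zip(colors, encoded):
    print(color, vector)
```

Each string becomes a short list of floats, so it can sit in a feature vector alongside numerical features.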


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-01-02 UTC.
