The ML.ADVANCED_WEIGHTS function

This document describes the ML.ADVANCED_WEIGHTS function, which lets you see the underlying weights that a linear or binary logistic regression model uses during prediction, along with the p-value and standard error associated with each weight. ML.ADVANCED_WEIGHTS is an extended version of ML.WEIGHTS for linear and binary logistic regression models.
You can only use ML.ADVANCED_WEIGHTS on linear and binary logistic regression models that are trained with the following option settings:
CALCULATE_P_VALUES value is TRUE.
CATEGORY_ENCODING_METHOD value is DUMMY_ENCODING.
L1_REG value is 0.
It's common to require standard errors or p-values for the regression coefficients or other estimated quantities of penalized regression methods, such as L1-regularized regression. In principle, such standard errors can be calculated, for example by using the bootstrap. In practice, this calculation isn't done, for reasons that the authors of the R package explain.
Multiclass logistic regression models aren't supported.
Syntax

ML.ADVANCED_WEIGHTS(
  MODEL `PROJECT_ID.DATASET.MODEL`,
  STRUCT(
    [STANDARDIZE AS standardize]))

Arguments

ML.ADVANCED_WEIGHTS takes the following arguments:
PROJECT_ID: your project ID.
DATASET: the BigQuery dataset that contains the model.
MODEL: the name of the model.
STANDARDIZE: a BOOL value that specifies whether the model weights should be standardized to assume that all features have a mean of zero and a standard deviation of one. Standardizing the weights allows the absolute magnitude of the weights to be compared to each other. The default value is FALSE.

Output

ML.ADVANCED_WEIGHTS returns the following columns:
processed_input: a STRING value that contains the name of the feature column. The value of this column is the name of the feature column that's provided in the query_statement clause used during model training. If the feature is non-numeric, then there are multiple rows with the same processed_input value, one for each category of the feature.
category: a STRING value that contains the category name if the column identified in the processed_input value is non-numeric. Returns a NULL value for numeric columns.
weight: a FLOAT64 value that contains the weight of each feature.
standard_error: a FLOAT64 value that contains the standard error of the weight.
p_value: a FLOAT64 value that contains the p-value of the weight, tested against the null hypothesis that the weight is zero. The p-value for feature $j$ is calculated using the following formula:
$$ p(j) = 2 * (1 - stats.norm.cdf(abs(\hat\beta_j), loc=0, scale=\sigma_j)) $$
where $\hat\beta_j$ is the weight of feature $j$ after training and $\sigma_j$ is its standard error.
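The p-value formula can be reproduced outside BigQuery as a sanity check. The following is a minimal sketch, not BigQuery ML's implementation: it uses Python's standard-library statistics.NormalDist in place of the scipy-style stats.norm.cdf call in the formula, and the function name two_sided_p_value and the input values are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(weight: float, standard_error: float) -> float:
    """Two-sided p-value for the null hypothesis that the weight is zero.

    Mirrors 2 * (1 - stats.norm.cdf(abs(weight), loc=0, scale=standard_error)).
    """
    # Normal distribution centered at 0 (the null hypothesis) whose spread
    # is the weight's standard error.
    null_distribution = NormalDist(mu=0.0, sigma=standard_error)
    return 2.0 * (1.0 - null_distribution.cdf(abs(weight)))


# Hypothetical inputs, not taken from any real model.
p_large = two_sided_p_value(weight=0.0, standard_error=1.0)   # weight equals zero
p_small = two_sided_p_value(weight=1.96, standard_error=1.0)  # roughly 0.05
print(p_large, p_small)
```

Only the ratio of the weight to its standard error matters here, so a weight of 3.92 with a standard error of 2.0 yields the same p-value as a weight of 1.96 with a standard error of 1.0.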
If the TRANSFORM clause was used in the CREATE MODEL statement that created the model, ML.ADVANCED_WEIGHTS outputs the weights of the TRANSFORM output features. The weights are denormalized by default, with the option to get normalized weights, exactly like models that are created without TRANSFORM.
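To see why standardized weights are easier to compare than raw ones, consider the textbook relationship between the two for a linear model. The sketch below is an illustration under stated assumptions, not a description of BigQuery ML internals: the helper standardize_weights, the feature names, the data, and the use of the population standard deviation are all hypothetical choices.

```python
from statistics import fmean, pstdev


def standardize_weights(intercept, weights, feature_columns):
    """Re-express a linear model in terms of z-scored features.

    For each feature x with mean m and standard deviation s,
    w * x == (w * s) * ((x - m) / s) + w * m, so the standardized weight
    is w * s and the w * m terms are absorbed into the intercept.
    """
    means = [fmean(column) for column in feature_columns]
    # Assumption: population standard deviation; the source does not say
    # which estimator BigQuery ML uses.
    stdevs = [pstdev(column) for column in feature_columns]
    std_weights = [w * s for w, s in zip(weights, stdevs)]
    std_intercept = intercept + sum(w * m for w, m in zip(weights, means))
    return std_intercept, std_weights, means, stdevs


# Hypothetical features on very different scales.
age = [20.0, 30.0, 40.0, 50.0]
income = [20000.0, 40000.0, 60000.0, 80000.0]
raw_intercept, raw_weights = 1.0, [0.5, 0.001]

std_intercept, std_weights, means, stdevs = standardize_weights(
    raw_intercept, raw_weights, [age, income]
)

# Both parameterizations give the same prediction for the first row.
raw_pred = raw_intercept + raw_weights[0] * age[0] + raw_weights[1] * income[0]
z_scores = [(age[0] - means[0]) / stdevs[0], (income[0] - means[1]) / stdevs[1]]
std_pred = std_intercept + sum(w * z for w, z in zip(std_weights, z_scores))
print(raw_weights, std_weights, raw_pred, std_pred)
```

In this made-up data the raw weight for age is larger than the raw weight for income, but the order reverses once both features are on the same scale, which is the kind of comparison the STANDARDIZE argument is meant to support.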
Permissions

You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions in order to run ML.ADVANCED_WEIGHTS.
Limitations

The total cardinality of training features must be less than 1,000. This limitation comes from the constraints on computing p-values and standard errors when the CALCULATE_P_VALUES option is set to TRUE during model training.
Examples

The following examples demonstrate ML.ADVANCED_WEIGHTS with and without standardization.
Without standardization

The following example retrieves weight information from mymodel in mydataset. The dataset is in your default project.
The query returns the weights associated with each dummy-encoded category for the input column input_col.
SELECT
  *
FROM
  ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`,
    STRUCT(FALSE AS standardize))

Note: Because un-standardizing the standard error for the intercept column is computationally expensive, the standard error and p-value for the intercept aren't provided when STANDARDIZE is FALSE. If you need the standard error and p-value for the intercept, set the STANDARDIZE argument to TRUE.

With standardization
The following example retrieves weight information from mymodel in mydataset. The dataset is in your default project.
The query retrieves standardized weights, which assume all features have a mean of 0 and a standard deviation of 1.0.
SELECT
  *
FROM
  ML.ADVANCED_WEIGHTS(MODEL `mydataset.mymodel`,
    STRUCT(TRUE AS standardize))
Last updated 2025-08-07 UTC.