Stay organized with collections Save and categorize content based on your preferences.
Generate text by using a Gemini model and the ML.GENERATE_TEXT functionThis tutorial shows you how to create a remote model that's based on the gemini-2.0-flash
model, and then how to use that model with the ML.GENERATE_TEXT
function to extract keywords from and perform sentiment analysis on movie reviews from the bigquery-public-data.imdb.reviews
public table.
To run this tutorial, you need the following Identity and Access Management (IAM) roles:
roles/bigquery.admin
).roles/resourcemanager.projectIamAdmin
).These predefined roles contain the permissions required to perform the tasks in this document. To see the exact permissions that are required, expand the Required permissions section:
Required permissionsbigquery.datasets.create
bigquery.connections.*
bigquery.config.*
resourcemanager.projects.getIamPolicy
and resourcemanager.projects.setIamPolicy
bigquery.jobs.create
bigquery.models.create
bigquery.models.getData
bigquery.models.updateData
bigquery.models.updateMetadata
You might also be able to get these permissions with custom roles or other predefined roles.
CostsIn this document, you use the following billable components of Google Cloud:
To generate a cost estimate based on your projected usage, use the pricing calculator.
New Google Cloud users might be eligible for a
free trial.
For more information about BigQuery pricing, see BigQuery pricing in the BigQuery documentation.
For more information about Vertex AI pricing, see the Vertex AI pricing page.
Before you beginIn the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.Verify that billing is enabled for your Google Cloud project.
Enable the BigQuery, BigQuery Connection, and Vertex AI APIs.
Create a BigQuery dataset to store your ML model.
ConsoleIn the Google Cloud console, go to the BigQuery page.
In the Explorer pane, click your project name.
Click more_vert View actions > Create dataset.
On the Create dataset page, do the following:
For Dataset ID, enter bqml_tutorial
.
For Location type, select Multi-region, and then select US (multiple regions in United States).
Leave the remaining default settings as they are, and click Create dataset.
To create a new dataset, use the bq mk
command with the --location
flag. For a full list of possible parameters, see the bq mk --dataset
command reference.
Create a dataset named bqml_tutorial
with the data location set to US
and a description of BigQuery ML tutorial dataset
:
bq --location=US mk -d \ --description "BigQuery ML tutorial dataset." \ bqml_tutorial
Instead of using the --dataset
flag, the command uses the -d
shortcut. If you omit -d
and --dataset
, the command defaults to creating a dataset.
Confirm that the dataset was created:
bq ls
Call the datasets.insert
method with a defined dataset resource.
{ "datasetReference": { "datasetId": "bqml_tutorial" } }BigQuery DataFrames
Before trying this sample, follow the BigQuery DataFrames setup instructions in the BigQuery quickstart using BigQuery DataFrames. For more information, see the BigQuery DataFrames reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up ADC for a local development environment.
Create the remote modelCreate a remote model that represents a hosted Vertex AI model:
In the Google Cloud console, go to the BigQuery page.
In the query editor, run the following statement:
CREATE OR REPLACE MODEL `bqml_tutorial.gemini_model` REMOTE WITH CONNECTION DEFAULT OPTIONS (ENDPOINT = 'gemini-2.0-flash');
The query takes several seconds to complete, after which the model gemini_model
appears in the bqml_tutorial
dataset in the Explorer pane. Because the query uses a CREATE MODEL
statement to create a model, there are no query results.
Perform keyword extraction on IMDB movie reviews by using the remote model and the ML.GENERATE_TEXT
function:
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement to perform keyword extraction on five movie reviews:
SELECT ml_generate_text_result['candidates'][0]['content'] AS generated_text, * EXCEPT (ml_generate_text_result) FROM ML.GENERATE_TEXT( MODEL `bqml_tutorial.gemini_model`, ( SELECT CONCAT('Extract the key words from the text below: ', review) AS prompt, * FROM `bigquery-public-data.imdb.reviews` LIMIT 5 ), STRUCT( 0.2 AS temperature, 100 AS max_output_tokens));
The output is similar to the following, with non-generated columns omitted for clarity:
+----------------------------------------+-------------------------+----------------------------+-----+ | generated_text | ml_generate_text_status | prompt | ... | +----------------------------------------+-------------------------+----------------------------+-----+ | {"parts":[{"text":"## Key words:\n\n* | | Extract the key words from | | | **Negative sentiment:** \"terribly | | the text below: I had to | | | bad acting\", \"dumb story\", \"not | | see this on the British | | | even a kid would enjoy this\", | | Airways plane. It was | | | \"something to switch off\"\n* | | terribly bad acting and | | | **Context:** \"British Airways plane\" | | a dumb story. Not even | | | \n* **Genre:** \"movie\" (implied)... | | a kid would enjoy this... | | +----------------------------------------+-------------------------+----------------------------+-----+ | {"parts":[{"text":"## Key words:\n\n* | | Extract the key words from | | | **Movie:** The Real Howard Spitz\n* | | the text below: This is | | | **Genre:** Family movie\n* | | a family movie that was | | | **Broadcast:** ITV station, 1.00 am\n* | | broadcast on my local | | | **Director:** Vadim Jean\n* | | ITV station at 1.00 am a | | | **Main character:** Howard Spitz, | | couple of nights ago. | | | a children's author who hates... | | This might be a strange... | | +----------------------------------------+-------------------------+----------------------------+-----+
The results include the following columns:
generated_text
: the generated text.ml_generate_text_status
: the API response status for the corresponding row. If the operation was successful, this value is empty.prompt
: the prompt that is used for the sentiment analysis.bigquery-public-data.imdb.reviews
table.Optional: Instead of manually parsing the JSON returned by the function, as you did in the previous step, use the flatten_json_output
argument to return the generated text and the safety attributes in separate columns.
In the query editor, run the following statement:
SELECT * FROM ML.GENERATE_TEXT( MODEL `bqml_tutorial.gemini_model`, ( SELECT CONCAT('Extract the key words from the text below: ', review) AS prompt, * FROM `bigquery-public-data.imdb.reviews` LIMIT 5 ), STRUCT( 0.2 AS temperature, 100 AS max_output_tokens, TRUE AS flatten_json_output));
The output is similar to the following, with non-generated columns omitted for clarity:
+----------------------------------------+----------------------------------------------+-------------------------+----------------------------+-----+ | ml_generate_text_llm_result | ml_generate_text_rai_result | ml_generate_text_status | prompt | ... | +----------------------------------------+----------------------------------------------+-------------------------+----------------------------+-----+ | ## Keywords: | | | Extract the key words from | | | | | | the text below: I had to | | | * **Negative sentiment:** | | | see this on the British | | | "terribly bad acting", "dumb | | | Airways plane. It was | | | story", "not even a kid would | | | terribly bad acting and | | | enjoy this", "switch off" | | | a dumb story. Not even | | | * **Context:** "British | | | a kid would enjoy this... | | +----------------------------------------+----------------------------------------------+-------------------------+----------------------------+-----+ | ## Key words: | | | Extract the key words from | | | | | | the text below: This is | | | * **Movie:** The Real Howard Spitz | | | a family movie that was | | | * **Genre:** Family movie | | | broadcast on my local | | | * **Broadcast:** ITV, 1.00 | | | ITV station at 1.00 am a | | | am | | | couple of nights ago. | | | - ... | | | This might be a strange... | | +----------------------------------------+----------------------------------------------+-------------------------+----------------------------+-----+
The results include the following columns:
ml_generate_text_llm_result
: the generated text.ml_generate_text_rai_result
: the safety attributes, along with information about whether the content is blocked due to one of the blocking categories. For more information about the safety attributes, see Configure safety filters.ml_generate_text_status
: the API response status for the corresponding row. If the operation was successful, this value is empty.prompt
: the prompt that is used for the keyword extraction.bigquery-public-data.imdb.reviews
table.Perform sentiment analysis on IMDB movie reviews by using the remote model and the ML.GENERATE_TEXT
function:
In the Google Cloud console, go to the BigQuery page.
In the query editor, run the following statement to perform sentiment analysis on five movie reviews:
SELECT ml_generate_text_result['candidates'][0]['content'] AS generated_text, * EXCEPT (ml_generate_text_result) FROM ML.GENERATE_TEXT( MODEL `bqml_tutorial.gemini_model`, ( SELECT CONCAT( 'perform sentiment analysis on the following text, return one the following categories: positive, negative: ', review) AS prompt, * FROM `bigquery-public-data.imdb.reviews` LIMIT 5 ), STRUCT( 0.2 AS temperature, 100 AS max_output_tokens));
The output is similar to the following, with non-generated columns omitted for clarity:
+--------------------------------------------+-------------------------+----------------------------+-----+ | generated_text | ml_generate_text_status | prompt | ... | +--------------------------------------------+-------------------------+----------------------------+-----+ | {"parts":[{"text":"## Sentiment Analysis: | | perform sentiment analysis | | | Negative \n\nThis text expresses a | | on the following text, | | | strongly negative sentiment towards the | | return one the following | | | movie. Here's why:\n\n* **Negative | | negative: I had to see | | | like \"terribly,\" \"dumb,\" and | | this on the British | | | \"not even\" to describe the acting... | | Airways plane. It was... | | +--------------------------------------------+-------------------------+----------------------------+-----+ | {"parts":[{"text":"## Sentiment Analysis: | | perform sentiment analysis | | | Negative \n\nThis review expresses a | | on the following text, | | | predominantly negative sentiment towards | | return one the following | | | the movie \"The Real Howard Spitz.\" | | categories: positive, | | | Here's why:\n\n* **Criticism of the film's | | negative: This is a family | | | premise:** The reviewer finds it strange | | movie that was broadcast | | | that a film about a children's author... | | on my local ITV station... | | +--------------------------------------------+-------------------------+----------------------------+-----+
The results include the same columns documented for Perform keyword extraction.
appspot.com
URL, delete selected resources inside the project instead of deleting the whole project.If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-14 UTC.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-14 UTC."],[[["This tutorial demonstrates creating a remote model based on the `gemini-1.5-flash-002` model in Vertex AI."],["The `ML.GENERATE_TEXT` function is utilized with the remote model to analyze movie reviews from the `bigquery-public-data.imdb.reviews` public table, covering keyword extraction and sentiment analysis."],["Creating and using the remote model requires specific IAM permissions for dataset creation, connection creation, model creation, and inference execution."],["Costs are incurred for BigQuery ML data processing and Vertex AI service calls, with details available on the pricing calculator."],["The document provides step-by-step instructions for creating a dataset, creating a connection, granting permissions, creating the remote model, performing keyword extraction, and sentiment analysis, along with cleanup procedures."]]],[]]
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4