A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://datastation.multiprocess.io/docs/tutorials/Query_AWS_Athena_with_DataStation.html below:

Querying AWS Athena with DataStation

Querying AWS Athena with DataStation

Requires DataStation 0.8.0+.

Database initialization [Optional]

If you want to follow along with this tutorial verbatim, create two S3 buckets. One is for storing data we will query. The other is for storing the query result. Athena requires you to give it a bucket for storing results.

My two buckets are called s3://datastation-tests and s3://datastation-test-results respectively. You'll have to pick your own names.

Data

Create a csv and upload it to the bucket for storing data we will query:

$ cat users.csv
name,age
Garry,43
Mina,39
Karl,50
Nile,41
Emma,52
$ aws s3 cp users.csv s3://datastation-tests

Swap out datastation-tests with the name of your bucket.

Athena definition

Now go to Athena and create a new table and database.

The create table statement for this file is:

CREATE EXTERNAL TABLE `basic_users`(
  `name` string, 
  `age` int)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://datastation-tests'
TBLPROPERTIES (
  'has_encrypted_data'='false', 
  'transient_lastDdlTime'='1646067804')
API access

Finally, make sure you have an AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY with all relevant permissions.

If you are creating an account solely for read-only access to Athena, you can copy these permissions and modify them for own buckets:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabase",
                "s3:GetObject",
                "athena:GetWorkGroup",
                "athena:StartQueryExecution",
                "glue:GetTables",
                "athena:StopQueryExecution",
                "athena:GetQueryExecution",
                "athena:GetQueryResults",
                "s3:ListBucket",
                "glue:GetTable"
            ],
            "Resource": [
                "arn:aws:glue:us-east-1:693079658475:table/testdata/*",
                "arn:aws:glue:us-east-1:693079658475:database/testdata",
                "arn:aws:glue:us-east-1:693079658475:catalog",
                "arn:aws:s3:::datastation-tests",
                "arn:aws:s3:::datastation-test-results",
                "arn:aws:s3:::datastation-test-results/*",
                "arn:aws:s3:::datastation-tests/*",
                "arn:aws:athena:*:693079658475:workgroup/primary"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucketMultipartUploads",
                "s3:AbortMultipartUpload",
                "s3:ListBucketVersions",
                "s3:CreateBucket",
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:GetBucketLocation",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::datastation-test-results",
                "arn:aws:s3:::datastation-test-results/*"
            ]
        }
    ]
}
Data source setup

Now in DataStation, create a new data source in the left sidebar.

Give it a nice name so you easily can find it later. And select Athena in the Vendor dropdown.

Fill in database as testdata, output bucket as the bucket you created for query results, and region, access key id and secret access key according to your region and user.

Panel setup

Now create a new panel and select the Database type.

Run a query

Enter the following query and hit play!

SELECT * FROM basic_users;

Graph the results

There is a graph button below the query panel, beside the New Panel button. Click it to generate a graph panel pointed at the query panel.

About this page

See an error or want to add a clarification? This page is generated from this file on Github.

Last edited Mar 21, 2022.


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4