# GCP Connector

## GcpConnector
Instantiate a GCP connector.
Parameters:

Name | Type | Description | Default
---|---|---|---
credential_file | str | Credential JSON file | required
proxy | str | Proxy address | ''
Source code in honeydew/gcp.py
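A minimal instantiation sketch, assuming the package is importable as `honeydew.gcp`; the key-file name is a placeholder:

```python
def make_connector(credential_file, proxy=""):
    """Build a GcpConnector from a service-account JSON key file.

    Sketch only: assumes honeydew is installed and the caller supplies
    a real credential file path.
    """
    # Imported inside the function so the sketch stays self-contained.
    from honeydew.gcp import GcpConnector
    return GcpConnector(credential_file=credential_file, proxy=proxy)


# connector = make_connector("service-account.json")
```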
### `bq_export_table_to_gcs(project_id, dataset_id, table_id, gcs_uri, format='CSV', delimiter=',', enable_compression=True, compression='GZIP', overwrite=True, region='northamerica-northeast1')`

Export a BigQuery table to Google Cloud Storage (GCS).
Parameters:

Name | Type | Description | Default
---|---|---|---
project_id | str | Project ID | required
dataset_id | str | Dataset ID | required
table_id | str | Table ID | required
gcs_uri | str | GCS URI destination. Example: 'gs://my-bucket/my-dir/tickets-20220101-*.csv.gz' | required
format | str | File format (CSV, JSON, Avro, Parquet) | 'CSV'
delimiter | str | CSV delimiter character | ','
enable_compression | bool | Compress the exported files if True | True
compression | str | Compression format. Reference: https://cloud.google.com/bigquery/docs/exporting-data#export_formats_and_compression_types | 'GZIP'
overwrite | bool | Overwrite the GCS URI destination if True | True
region | str | Region in which to run the export job | 'northamerica-northeast1'
Returns:

Name | Type | Description
---|---|---
result | result | Iterator of row data. Reference: https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.job.QueryJob.html?highlight=job%20result#google.cloud.bigquery.job.QueryJob.result
Source code in honeydew/gcp.py
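A usage sketch, assuming `connector` is an instantiated `GcpConnector`; the project, dataset, table, and bucket names are placeholders:

```python
def export_tickets(connector):
    # Export the tickets table to sharded, gzip-compressed CSV files.
    # The '*' wildcard lets BigQuery split a large table across files.
    return connector.bq_export_table_to_gcs(
        project_id="my-project",
        dataset_id="my_dataset",
        table_id="tickets",
        gcs_uri="gs://my-bucket/exports/tickets-20220101-*.csv.gz",
        format="CSV",
        delimiter=",",
        enable_compression=True,
        compression="GZIP",
        overwrite=True,
        region="northamerica-northeast1",
    )
```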
### `bq_query_non_dql(project_id, query)`

Submit a non-Data Query Language (non-DQL) statement to BigQuery, e.g. CREATE, DROP, TRUNCATE, INSERT, UPDATE, DELETE.
Parameters:

Name | Type | Description | Default
---|---|---|---
project_id | str | Project ID | required
query | str | SQL query | required
Returns:

Name | Type | Description
---|---|---
result | result | Iterator of row data. Reference: https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.job.QueryJob.html?highlight=job%20result#google.cloud.bigquery.job.QueryJob.result
Source code in honeydew/gcp.py
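A sketch of a non-DQL call, assuming `connector` is an instantiated `GcpConnector`; the dataset and table names are placeholders:

```python
def truncate_staging(connector):
    # Run a DDL/DML statement (here TRUNCATE) against BigQuery.
    return connector.bq_query_non_dql(
        project_id="my-project",
        query="TRUNCATE TABLE my_dataset.staging_tickets",
    )
```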
### `bq_query_to_dataframe(project_id, query, timeout=3600, method=1)`

Submit a query to BigQuery and store the result in a pandas DataFrame.
Parameters:

Name | Type | Description | Default
---|---|---|---
project_id | str | Project ID | required
query | str | SQL query | required
timeout | int | Query timeout in seconds | 3600
method | int | API used to run the query (1: google-cloud-bigquery, 2: pandas-gbq) | 1
Returns:

Name | Type | Description
---|---|---
result | DataFrame | Query result as a pandas DataFrame
Source code in honeydew/gcp.py
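A query-to-DataFrame sketch, assuming `connector` is an instantiated `GcpConnector`; the dataset, table, and column names are placeholders:

```python
def load_daily_tickets(connector):
    # Fetch one day's tickets into a pandas DataFrame via the
    # google-cloud-bigquery client (method=1).
    query = (
        "SELECT id, status "
        "FROM my_dataset.tickets "
        "WHERE created_date = '2022-01-01'"
    )
    return connector.bq_query_to_dataframe(
        project_id="my-project",
        query=query,
        timeout=600,   # fail faster than the 3600 s default
        method=1,
    )
```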
### `gcs_download_objects_with_pattern(project_id, bucket_id, blob_prefix, destination_dir_path, printout=True)`

Download multiple objects that share a prefix pattern from Google Cloud Storage (GCS).
Parameters:

Name | Type | Description | Default
---|---|---|---
project_id | str | Project ID | required
bucket_id | str | Bucket ID | required
blob_prefix | str | Blob prefix pattern to download. Example: 'gcs-directory/tickets-20220101-' | required
destination_dir_path | str | Local destination directory path. Example: '/my-directory' | required
printout | bool | Print each file name while downloading if True | True
Source code in honeydew/gcp.py
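A sketch for fetching every object under a prefix, assuming `connector` is an instantiated `GcpConnector`; the bucket and path names are placeholders:

```python
def download_ticket_shards(connector):
    # Download all exported shards that share the date prefix.
    return connector.gcs_download_objects_with_pattern(
        project_id="my-project",
        bucket_id="my-bucket",
        blob_prefix="exports/tickets-20220101-",
        destination_dir_path="/tmp/tickets",
        printout=True,
    )
```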
### `gcs_download_single_file(project_id, bucket_id, source_blob_path, destination_path)`

Download a single object from Google Cloud Storage (GCS).
Parameters:

Name | Type | Description | Default
---|---|---|---
project_id | str | Project ID | required
bucket_id | str | Bucket ID | required
source_blob_path | str | Path of the source object. Example: 'gcs-directory/my-filename.txt' | required
destination_path | str | Local destination path. Example: '/my-directory/my-filename.txt' | required
Source code in honeydew/gcp.py
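A single-object download sketch, assuming `connector` is an instantiated `GcpConnector`; the bucket and file names are placeholders:

```python
def download_summary(connector):
    # Fetch one object from GCS to a local path.
    return connector.gcs_download_single_file(
        project_id="my-project",
        bucket_id="my-bucket",
        source_blob_path="reports/summary.csv",
        destination_path="/tmp/summary.csv",
    )
```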
### `gcs_upload_single_file(project_id, bucket_id, local_file, destination_blob)`

Upload a single object to Google Cloud Storage (GCS).
Parameters:

Name | Type | Description | Default
---|---|---|---
project_id | str | Project ID | required
bucket_id | str | Bucket ID | required
local_file | str | Local source file. Example: '/local-directory/my-filename.txt' | required
destination_blob | str | Destination blob in the GCS bucket. Example: 'gcs-directory/my-filename.txt' | required
Returns:

Name | Type | Description
---|---|---
result | str | Returns 'OK' when successful
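An upload sketch, assuming `connector` is an instantiated `GcpConnector`; the bucket and file names are placeholders:

```python
def upload_summary(connector):
    # Push a local file into the GCS bucket.
    result = connector.gcs_upload_single_file(
        project_id="my-project",
        bucket_id="my-bucket",
        local_file="/tmp/summary.csv",
        destination_blob="reports/summary.csv",
    )
    return result  # 'OK' on success, per the table above
```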