Package tilda.utils.gcp
Class BQHelper
- java.lang.Object
-
- tilda.utils.gcp.BQHelper
-
public class BQHelper extends java.lang.Object
-
-
Nested Class Summary
Modifier and Type Class Description
static interface
BQHelper.BQCSVExporter
-
Field Summary
Modifier and Type Field Description
protected static org.apache.logging.log4j.Logger
LOG
-
Constructor Summary
Constructor Description
BQHelper()
-
Method Summary
Modifier and Type Method Description
static com.google.cloud.bigquery.Job
checkJobDone(com.google.cloud.bigquery.BigQuery bq, java.lang.String jobId)
Returns a job if it's done, null otherwise.
static void
copyComments(com.google.cloud.bigquery.BigQuery bq, java.lang.String srcDatasetName, java.lang.String srcTableName, java.lang.String destDatasetName, java.lang.String destTableName)
static boolean
createDataset(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName)
static boolean
createTable(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, com.google.cloud.bigquery.Schema schema)
static void
createTable(com.google.cloud.bigquery.Schema schema, Connection C)
Creates a table in a JDBC destination (based on the Connection) from a BQ Schema definition.
static boolean
deleteTable_BE_CAREFUL_WITH_THIS_ONE(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
Use this method carefully as it will delete/drop the table without an easy way to restore the data.
static void
exportCSV(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, BQHelper.BQCSVExporter BQE)
static com.google.cloud.bigquery.BigQuery
getBigQuery(java.lang.String dataProjectName)
Given the environment variable name passed in (default is FileUtil._DEFAULT_ENV_VAR_NAME), looks up the value, which points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json".
static com.google.cloud.bigquery.BigQuery
getBigQuery(java.lang.String envVariable, java.lang.String dataProjectName)
Given the environment variable name passed in (default is FileUtil._DEFAULT_ENV_VAR_NAME), looks up the value, which points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json".
static com.google.cloud.bigquery.Schema
getBQSchemaFromDB(Connection C, java.lang.String SchemaName, java.lang.String TableViewName)
static com.google.cloud.bigquery.Schema
getBQSchemaFromMeta(TableViewMeta tvm)
static com.google.cloud.bigquery.Schema
getBQSchemaFromTilda(java.lang.String SchemaName, java.lang.String TableViewName, java.lang.String outputMapName)
static com.google.cloud.bigquery.Schema
getBQTableSchema(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
static JobCostDetails
getJobCostInBytesCents(com.google.cloud.bigquery.BigQuery bq, java.lang.String jobId)
Returns the billed bytes and cost in cents for a given job, or null if the job could not be located or an exception occurred internally (see logs).
static java.time.LocalDate[]
getRepeatableFieldLocalDate(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName)
static java.lang.String[]
getRepeatableFieldString(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName)
static java.lang.String
getSchemaColumns(com.google.cloud.bigquery.Schema schema)
static com.google.cloud.bigquery.Table
getTable(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
static com.google.cloud.bigquery.TableDataWriteChannel
getTableWriterChannel(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, java.lang.String format, com.google.cloud.bigquery.Schema schema, boolean truncate)
Sets up a Writer to a BQ Table in the appropriate format.
static com.google.cloud.bigquery.Schema
getTildaBQSchema(java.lang.String SchemaName, java.lang.String TableViewName)
static java.time.LocalDate
getTimestampFieldAsLocalDate(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName)
static com.google.cloud.bigquery.Job
launchQuery(com.google.cloud.bigquery.BigQuery bq, java.lang.String q)
static java.util.List<com.google.cloud.bigquery.Dataset>
lookupDatasets(com.google.cloud.bigquery.BigQuery bq)
static java.util.List<com.google.cloud.bigquery.Table>
lookupTables(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName)
static JobResults
runQuery(com.google.cloud.bigquery.BigQuery bq, java.lang.String q)
static JobResults
runQuery(com.google.cloud.bigquery.BigQuery bq, java.lang.String q, long maxQueryResults)
static boolean
tableExists(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
static boolean
truncateTable_BE_CAREFUL_WITH_THIS_ONE(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
Use this method carefully as it will truncate all contents of a table without an easy way to restore the data.
static boolean
undeleteTable(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, long epochMillis)
Undeletes a table based on a snapshot time.
-
-
-
Method Detail
-
getBigQuery
public static com.google.cloud.bigquery.BigQuery getBigQuery(java.lang.String dataProjectName) throws java.io.FileNotFoundException, java.io.IOException
Given the environment variable name passed in (default is FileUtil._DEFAULT_ENV_VAR_NAME), looks up the value, which points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json". 'xxx' can be anything (and is optional) but is generally the first few characters of the original key file. This lets team members use different keys for the same project while keeping file names specific to the actual key file generated by GCP.
- Parameters:
dataProjectName
- Returns:
- An authenticated BigQuery instance
- Throws:
java.io.FileNotFoundException
java.io.IOException
-
getBigQuery
public static com.google.cloud.bigquery.BigQuery getBigQuery(java.lang.String envVariable, java.lang.String dataProjectName) throws java.io.FileNotFoundException, java.io.IOException
Given the environment variable name passed in (default is FileUtil._DEFAULT_ENV_VAR_NAME), looks up the value, which points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json". 'xxx' can be anything (and is optional) but is generally the first few characters of the original key file. This lets team members use different keys for the same project while keeping file names specific to the actual key file generated by GCP.
- Parameters:
envVariable
dataProjectName
- Returns:
- An authenticated BigQuery instance
- Throws:
java.io.FileNotFoundException
java.io.IOException
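The lookup described for both getBigQuery overloads can be sketched with the standard library alone. This is not the BQHelper implementation: the class KeyFileLookup and its methods are hypothetical, and the sketch assumes the file name starts with the project name, which the pattern in the text implies but does not spell out.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Illustrative only: mimics the documented key-file lookup, not the actual BQHelper code. */
class KeyFileLookup {
    static final String SUFFIX = ".key.bq.json";

    /**
     * Returns the lexicographically first name of the form "<project>.key.bq.json"
     * or "<project>.<xxx>.key.bq.json". The "<project>" prefix is an assumption.
     */
    static Optional<String> matchKeyFile(List<String> fileNames, String projectName) {
        return fileNames.stream()
                .filter(n -> n.endsWith(SUFFIX))
                .filter(n -> {
                    String stem = n.substring(0, n.length() - SUFFIX.length());
                    return stem.equals(projectName) || stem.startsWith(projectName + ".");
                })
                .sorted()
                .findFirst();
    }

    /** Lists the directory named by the environment variable and applies matchKeyFile. */
    static Optional<File> findKeyFile(String envVariable, String projectName) {
        String dir = System.getenv(envVariable);
        if (dir == null) return Optional.empty();
        String[] names = new File(dir).list(); // null when dir is not a readable directory
        if (names == null) return Optional.empty();
        return matchKeyFile(Arrays.asList(names), projectName).map(n -> new File(dir, n));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("other.key.bq.json", "myproj.ab12.key.bq.json");
        System.out.println(matchKeyFile(names, "myproj"));
    }
}
```

Keeping the name matching separate from the directory listing makes the optional 'xxx' segment easy to check without touching the file system.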
-
createDataset
public static boolean createDataset(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName)
- Parameters:
bq
datasetName
- Returns:
-
lookupDatasets
public static java.util.List<com.google.cloud.bigquery.Dataset> lookupDatasets(com.google.cloud.bigquery.BigQuery bq) throws java.lang.Exception
- Parameters:
bq
- Returns:
- Throws:
java.lang.Exception
-
lookupTables
public static java.util.List<com.google.cloud.bigquery.Table> lookupTables(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName) throws java.lang.Exception
- Parameters:
bq
datasetName
- Returns:
- Throws:
java.lang.Exception
-
getTable
public static com.google.cloud.bigquery.Table getTable(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName) throws java.lang.Exception
- Parameters:
bq
datasetName
tableName
- Returns:
- Throws:
java.lang.Exception
-
checkJobDone
public static com.google.cloud.bigquery.Job checkJobDone(com.google.cloud.bigquery.BigQuery bq, java.lang.String jobId) throws java.lang.Exception
Returns a job if it's done, null otherwise.
- Parameters:
bq
jobId
- Returns:
- Throws:
java.lang.Exception
java.lang.InterruptedException
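Because checkJobDone returns null until the job has finished, callers will typically poll it. A minimal sketch of such a loop follows; DonePoller and pollUntilDone are hypothetical names, and a plain Supplier stands in for the BQHelper call so the example runs without BigQuery.

```java
import java.util.function.Supplier;

/** Illustrative polling helper; the names here are hypothetical, not part of BQHelper. */
class DonePoller {
    /**
     * Calls check until it returns a non-null value or maxAttempts is reached.
     * Returns that value, or null if every attempt returned null or the thread was interrupted.
     */
    static <T> T pollUntilDone(Supplier<T> check, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = check.get();
            if (result != null) return result;
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for a call to BQHelper.checkJobDone(bq, jobId): "done" on the third call.
        String job = pollUntilDone(() -> ++calls[0] < 3 ? null : "job-done", 10, 10L);
        System.out.println(job + " after " + calls[0] + " calls");
    }
}
```

Since checkJobDone declares throws java.lang.Exception, a real Supplier would have to catch that exception and translate it, for example into a RuntimeException.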
-
runQuery
public static JobResults runQuery(com.google.cloud.bigquery.BigQuery bq, java.lang.String q)
-
runQuery
public static JobResults runQuery(com.google.cloud.bigquery.BigQuery bq, java.lang.String q, long maxQueryResults)
-
launchQuery
public static com.google.cloud.bigquery.Job launchQuery(com.google.cloud.bigquery.BigQuery bq, java.lang.String q)
-
getJobCostInBytesCents
public static JobCostDetails getJobCostInBytesCents(com.google.cloud.bigquery.BigQuery bq, java.lang.String jobId)
Returns the billed bytes and cost in cents for a given job, or null if the job could not be located or an exception occurred internally (see logs).
- Parameters:
bq
jobId
- Returns:
- A JobCostDetails holding the billed bytes and the cost in cents, or null
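The relation between billed bytes and cost in cents can be illustrated with plain arithmetic. This sketch assumes pricing is quoted per TiB and takes the price as a caller-supplied, hypothetical parameter; how BQHelper actually derives the cost is not stated here, and the QueryCost class is not part of the library.

```java
/** Illustrative arithmetic only; BQHelper's JobCostDetails may be computed differently. */
class QueryCost {
    static final double BYTES_PER_TIB = 1024.0 * 1024.0 * 1024.0 * 1024.0;

    /** Converts billed bytes to cents at a caller-supplied price per TiB (hypothetical parameter). */
    static double costInCents(long billedBytes, double centsPerTiB) {
        return billedBytes / BYTES_PER_TIB * centsPerTiB;
    }

    public static void main(String[] args) {
        long halfTiB = 1L << 39; // 2^39 bytes
        System.out.println(costInCents(halfTiB, 500.0));
    }
}
```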
-
getTableWriterChannel
public static com.google.cloud.bigquery.TableDataWriteChannel getTableWriterChannel(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, java.lang.String format, com.google.cloud.bigquery.Schema schema, boolean truncate) throws java.lang.Exception
Sets up a Writer to a BQ Table in the appropriate format. Once obtained, you can write data to the writer, and when done, you can get the Job and check for completion with:
boolean success = JobHelper.completeJob(writer.getJob());
- Parameters:
bq
datasetName
tableName
format - either 'csv' or 'jsonl' only
- Returns:
- Throws:
java.lang.Exception
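Writing to the returned channel can be sketched against java.nio.channels.WritableByteChannel, which TableDataWriteChannel is understood to implement. The ChannelWriter class below is hypothetical, and an in-memory channel stands in for the BigQuery one so the example runs on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: writes newline-delimited records (e.g. 'jsonl') to any byte channel. */
class ChannelWriter {
    /** Writes each line plus '\n' as UTF-8 and returns the number of bytes written. */
    static long writeLines(WritableByteChannel channel, List<String> lines) {
        long total = 0;
        try {
            for (String line : lines) {
                ByteBuffer buf = ByteBuffer.wrap((line + "\n").getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    total += channel.write(buf); // a single write may be partial
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long written = writeLines(Channels.newChannel(sink), Arrays.asList("{\"id\":1}", "{\"id\":2}"));
        System.out.println(written + " bytes written");
    }
}
```

With BQHelper, the channel obtained from getTableWriterChannel would take the place of the in-memory one; it would then be closed before the job is retrieved and checked with JobHelper.completeJob(writer.getJob()), as described above.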
-
getTildaBQSchema
public static com.google.cloud.bigquery.Schema getTildaBQSchema(java.lang.String SchemaName, java.lang.String TableViewName) throws java.lang.Exception
- Throws:
java.lang.Exception
-
getBQSchemaFromTilda
public static com.google.cloud.bigquery.Schema getBQSchemaFromTilda(java.lang.String SchemaName, java.lang.String TableViewName, java.lang.String outputMapName) throws java.lang.Exception
- Throws:
java.lang.Exception
-
getBQSchemaFromDB
public static com.google.cloud.bigquery.Schema getBQSchemaFromDB(Connection C, java.lang.String SchemaName, java.lang.String TableViewName) throws java.lang.Exception
- Throws:
java.lang.Exception
-
getBQSchemaFromMeta
public static com.google.cloud.bigquery.Schema getBQSchemaFromMeta(TableViewMeta tvm)
-
getSchemaColumns
public static java.lang.String getSchemaColumns(com.google.cloud.bigquery.Schema schema)
-
getBQTableSchema
public static com.google.cloud.bigquery.Schema getBQTableSchema(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
-
copyComments
public static void copyComments(com.google.cloud.bigquery.BigQuery bq, java.lang.String srcDatasetName, java.lang.String srcTableName, java.lang.String destDatasetName, java.lang.String destTableName)
-
exportCSV
public static void exportCSV(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, BQHelper.BQCSVExporter BQE) throws java.lang.Exception
- Throws:
java.lang.Exception
-
createTable
public static boolean createTable(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, com.google.cloud.bigquery.Schema schema)
-
truncateTable_BE_CAREFUL_WITH_THIS_ONE
public static boolean truncateTable_BE_CAREFUL_WITH_THIS_ONE(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
Use this method carefully as it will truncate all contents of a table without an easy way to restore the data. Do note that BigQuery does create shadow versions of tables for up to 24 or 48h, so technically, data COULD be recovered if acted upon quickly. See https://cloud.google.com/bigquery/docs/samples/bigquery-undelete-table for documentation.
- Parameters:
bq
datasetName
tableName
- Returns:
-
deleteTable_BE_CAREFUL_WITH_THIS_ONE
public static boolean deleteTable_BE_CAREFUL_WITH_THIS_ONE(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
Use this method carefully as it will delete/drop the table without an easy way to restore the data. Do note that BigQuery does create shadow versions of tables for up to 24 or 48h, so technically, data COULD be recovered if acted upon quickly. See https://cloud.google.com/bigquery/docs/samples/bigquery-undelete-table for documentation.
- Parameters:
bq
datasetName
tableName
- Returns:
-
undeleteTable
public static boolean undeleteTable(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, long epochMillis)
Undeletes a table based on a snapshot time. As per the documentation at https://cloud.google.com/bigquery/docs/samples/bigquery-undelete-table, one cannot undelete a table that has been re-created since. BQ will also track the most recent snapshot as per the milliseconds provided.
- Parameters:
bq
datasetName
tableName
epochMillis
- Returns:
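The linked Google sample addresses a past state of a table through a snapshot decorator of the form tableName@epochMillis. The sketch below only builds that identifier and an epochMillis value. SnapshotDecorator and its methods are hypothetical, and whether undeleteTable builds the identifier the same way is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative only: helpers for choosing and formatting a snapshot time. */
class SnapshotDecorator {
    /** Formats the snapshot decorator, e.g. "orders@1700000000000". */
    static String snapshotTableId(String tableName, long epochMillis) {
        return tableName + "@" + epochMillis;
    }

    /** Returns the epoch milliseconds of the instant lying 'ago' before 'reference'. */
    static long millisBefore(Instant reference, Duration ago) {
        return reference.minus(ago).toEpochMilli();
    }

    public static void main(String[] args) {
        // A point in time one hour ago, e.g. shortly before an accidental delete.
        long epochMillis = millisBefore(Instant.now(), Duration.ofHours(1));
        System.out.println(snapshotTableId("orders", epochMillis));
    }
}
```

A value computed this way could be passed as the epochMillis argument of undeleteTable, provided the chosen instant falls before the delete and within the recovery window mentioned for the delete and truncate methods.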
-
tableExists
public static boolean tableExists(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
-
getRepeatableFieldString
public static java.lang.String[] getRepeatableFieldString(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName) throws java.lang.Exception
- Throws:
java.lang.Exception
-
getRepeatableFieldLocalDate
public static java.time.LocalDate[] getRepeatableFieldLocalDate(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName) throws java.lang.Exception
- Throws:
java.lang.Exception
-
getTimestampFieldAsLocalDate
public static java.time.LocalDate getTimestampFieldAsLocalDate(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName) throws java.lang.Exception
- Throws:
java.lang.Exception
-
createTable
public static void createTable(com.google.cloud.bigquery.Schema schema, Connection C)
Creates a table in a JDBC destination (based on the Connection) from a BQ Schema definition.
- Parameters:
schema
C
-
-
-