Class BQHelper


  • public class BQHelper
    extends java.lang.Object
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static interface  BQHelper.BQCSVExporter  
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.logging.log4j.Logger LOG  
    • Constructor Summary

      Constructors 
      Constructor Description
      BQHelper()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static com.google.cloud.bigquery.Job checkJobDone​(com.google.cloud.bigquery.BigQuery bq, java.lang.String jobId)
      Returns a job if it's done, null otherwise.
      static void copyComments​(com.google.cloud.bigquery.BigQuery bq, java.lang.String srcDatasetName, java.lang.String srcTableName, java.lang.String destDatasetName, java.lang.String destTableName)  
      static boolean createDataset​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName)  
      static boolean createTable​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, com.google.cloud.bigquery.Schema schema)  
      static void createTable​(com.google.cloud.bigquery.Schema schema, Connection C)
      Creates a table in a JDBC destination (based on the Connection) from a BQ Schema definition.
      static boolean deleteTable_BE_CAREFUL_WITH_THIS_ONE​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
      Use this method carefully as it will delete/drop the table without an easy way to restore the data.
      static void exportCSV​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, BQHelper.BQCSVExporter BQE)  
      static com.google.cloud.bigquery.BigQuery getBigQuery​(java.lang.String dataProjectName)
      Looks up the default environment variable (FileUtil._DEFAULT_ENV_VAR_NAME), whose value points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json".
      static com.google.cloud.bigquery.BigQuery getBigQuery​(java.lang.String envVariable, java.lang.String dataProjectName)
      Given the environment variable name passed in (default is FileUtil._DEFAULT_ENV_VAR_NAME), looks up its value, which points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json".
      static com.google.cloud.bigquery.Schema getBQSchemaFromDB​(Connection C, java.lang.String SchemaName, java.lang.String TableViewName)  
      static com.google.cloud.bigquery.Schema getBQSchemaFromMeta​(TableViewMeta tvm)  
      static com.google.cloud.bigquery.Schema getBQSchemaFromTilda​(java.lang.String SchemaName, java.lang.String TableViewName, java.lang.String outputMapName)  
      static com.google.cloud.bigquery.Schema getBQTableSchema​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)  
      static JobCostDetails getJobCostInBytesCents​(com.google.cloud.bigquery.BigQuery bq, java.lang.String jobId)
      Returns the billed bytes and cost in cents for a given job, or null if the job could not be located or an exception occurred internally (see logs).
      static java.time.LocalDate[] getRepeatableFieldLocalDate​(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName)  
      static java.lang.String[] getRepeatableFieldString​(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName)  
      static java.lang.String getSchemaColumns​(com.google.cloud.bigquery.Schema schema)  
      static com.google.cloud.bigquery.Table getTable​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)  
      static com.google.cloud.bigquery.TableDataWriteChannel getTableWriterChannel​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, java.lang.String format, com.google.cloud.bigquery.Schema schema, boolean truncate)
      Sets up a Writer to a BQ Table in the appropriate format.
      static com.google.cloud.bigquery.Schema getTildaBQSchema​(java.lang.String SchemaName, java.lang.String TableViewName)  
      static java.time.LocalDate getTimestampFieldAsLocalDate​(com.google.cloud.bigquery.FieldValueList row, java.lang.String fielName)  
      static com.google.cloud.bigquery.Job launchQuery​(com.google.cloud.bigquery.BigQuery bq, java.lang.String q)  
      static java.util.List<com.google.cloud.bigquery.Dataset> lookupDatasets​(com.google.cloud.bigquery.BigQuery bq)  
      static java.util.List<com.google.cloud.bigquery.Table> lookupTables​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName)  
      static JobResults runQuery​(com.google.cloud.bigquery.BigQuery bq, java.lang.String q)  
      static JobResults runQuery​(com.google.cloud.bigquery.BigQuery bq, java.lang.String q, long maxQueryResults)  
      static boolean tableExists​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)  
      static boolean truncateTable_BE_CAREFUL_WITH_THIS_ONE​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName)
      Use this method carefully as it will truncate all contents of a table without an easy way to restore the data.
      static boolean undeleteTable​(com.google.cloud.bigquery.BigQuery bq, java.lang.String datasetName, java.lang.String tableName, long epochMillis)
      Undeletes a table based on a snapshot time.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        protected static final org.apache.logging.log4j.Logger LOG
    • Constructor Detail

      • BQHelper

        public BQHelper()
    • Method Detail

      • getBigQuery

        public static com.google.cloud.bigquery.BigQuery getBigQuery​(java.lang.String dataProjectName)
                                                              throws java.io.FileNotFoundException,
                                                                     java.io.IOException
        Looks up the default environment variable (FileUtil._DEFAULT_ENV_VAR_NAME), whose value points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json". 'xxx' can be anything (and is optional) but is generally the first few characters of the original key file. This lets a team use different keys for the same project while keeping file names specific to the actual key file generated by GCP.
        Parameters:
        dataProjectName -
        Returns:
        An authenticated BigQuery instance
        Throws:
        java.io.FileNotFoundException
        java.io.IOException
      • getBigQuery

        public static com.google.cloud.bigquery.BigQuery getBigQuery​(java.lang.String envVariable,
                                                                     java.lang.String dataProjectName)
                                                              throws java.io.FileNotFoundException,
                                                                     java.io.IOException
        Given the environment variable name passed in (default is FileUtil._DEFAULT_ENV_VAR_NAME), looks up its value, which points to a path, and then, based on the project name, looks up the file "[SERVICE_ACCOUNTS_CREDENTIALS_PATH]/.xxx.key.bq.json". 'xxx' can be anything (and is optional) but is generally the first few characters of the original key file. This lets a team use different keys for the same project while keeping file names specific to the actual key file generated by GCP.
        Parameters:
        envVariable -
        dataProjectName -
        Returns:
        An authenticated BigQuery instance
        Throws:
        java.io.FileNotFoundException
        java.io.IOException
      • createDataset

        public static boolean createDataset​(com.google.cloud.bigquery.BigQuery bq,
                                            java.lang.String datasetName)
        Parameters:
        bq -
        datasetName -
        Returns:
      • lookupDatasets

        public static java.util.List<com.google.cloud.bigquery.Dataset> lookupDatasets​(com.google.cloud.bigquery.BigQuery bq)
                                                                                throws java.lang.Exception
        Parameters:
        bq -
        Returns:
        Throws:
        java.lang.Exception
      • lookupTables

        public static java.util.List<com.google.cloud.bigquery.Table> lookupTables​(com.google.cloud.bigquery.BigQuery bq,
                                                                                   java.lang.String datasetName)
                                                                            throws java.lang.Exception
        Parameters:
        bq -
        datasetName -
        Returns:
        Throws:
        java.lang.Exception
      • getTable

        public static com.google.cloud.bigquery.Table getTable​(com.google.cloud.bigquery.BigQuery bq,
                                                               java.lang.String datasetName,
                                                               java.lang.String tableName)
                                                        throws java.lang.Exception
        Parameters:
        bq -
        datasetName -
        tableName -
        Returns:
        Throws:
        java.lang.Exception
      • checkJobDone

        public static com.google.cloud.bigquery.Job checkJobDone​(com.google.cloud.bigquery.BigQuery bq,
                                                                 java.lang.String jobId)
                                                          throws java.lang.Exception
        Returns a job if it's done, null otherwise.
        Parameters:
        bq -
        jobId -
        Returns:
        The job if it is done, null otherwise
        Throws:
        java.lang.Exception
        java.lang.InterruptedException
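
Because checkJobDone returns null while the job is still running, a caller would typically poll it. The following is a minimal sketch of such a loop, not part of BQHelper: the class name PollUntilDone, the poll interval, and the timeout are hypothetical, and a generic Callable stands in for a call such as () -> BQHelper.checkJobDone(bq, jobId).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of BQHelper.
final class PollUntilDone {
    /**
     * Calls check repeatedly until it returns a non-null value,
     * sleeping intervalMillis between attempts. Gives up with a
     * TimeoutException once timeoutMillis has elapsed.
     */
    static <T> T poll(Callable<T> check, long intervalMillis, long timeoutMillis) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            T result = check.call();
            if (result != null) {
                return result; // done: hand the result back to the caller
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new TimeoutException("Not done after " + timeoutMillis + " ms");
            }
            Thread.sleep(intervalMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for checkJobDone: returns null twice, then a value.
        int[] calls = {0};
        String job = poll(() -> ++calls[0] < 3 ? null : "job-done", 10, 1000);
        System.out.println(job + " after " + calls[0] + " checks");
    }
}
```

With the real method, the Callable would wrap the checkJobDone call and T would be com.google.cloud.bigquery.Job.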
      • runQuery

        public static JobResults runQuery​(com.google.cloud.bigquery.BigQuery bq,
                                          java.lang.String q)
      • runQuery

        public static JobResults runQuery​(com.google.cloud.bigquery.BigQuery bq,
                                          java.lang.String q,
                                          long maxQueryResults)
      • launchQuery

        public static com.google.cloud.bigquery.Job launchQuery​(com.google.cloud.bigquery.BigQuery bq,
                                                                java.lang.String q)
      • getJobCostInBytesCents

        public static JobCostDetails getJobCostInBytesCents​(com.google.cloud.bigquery.BigQuery bq,
                                                            java.lang.String jobId)
        Returns the billed bytes and cost in cents for a given job, or null if the job could not be located or an exception occurred internally (see logs).
        Parameters:
        bq -
        jobId -
        Returns:
        A JobCostDetails holding the billed bytes and the cost in cents, or null
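
How the cost is derived from the billed bytes is not shown on this page. As a rough illustration only, the sketch below converts billed bytes to cents with a per-TiB price passed in by the caller. The class name QueryCostSketch and the price used in main are placeholders, not values taken from BQHelper or from any price list.

```java
// Hypothetical illustration, not the BQHelper implementation.
final class QueryCostSketch {
    // Bytes in one tebibyte (2^40).
    static final double BYTES_PER_TIB = 1024.0 * 1024 * 1024 * 1024;

    /** Converts billed bytes to a cost in cents, given a price in cents per TiB. */
    static double costInCents(long billedBytes, double centsPerTiB) {
        return billedBytes / BYTES_PER_TIB * centsPerTiB;
    }

    public static void main(String[] args) {
        double placeholderCentsPerTiB = 500.0; // assumption: replace with your actual rate
        long billedBytes = 1L << 39;           // half a TiB
        System.out.println(costInCents(billedBytes, placeholderCentsPerTiB) + " cents");
    }
}
```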
      • getTableWriterChannel

        public static com.google.cloud.bigquery.TableDataWriteChannel getTableWriterChannel​(com.google.cloud.bigquery.BigQuery bq,
                                                                                            java.lang.String datasetName,
                                                                                            java.lang.String tableName,
                                                                                            java.lang.String format,
                                                                                            com.google.cloud.bigquery.Schema schema,
                                                                                            boolean truncate)
                                                                                     throws java.lang.Exception
        Sets up a Writer to a BQ Table in the appropriate format. Once obtained, you can write data to the writer and, when done, get the Job and check for completion with boolean success = JobHelper.completeJob(writer.getJob());
        Parameters:
        bq -
        datasetName -
        tableName -
        format - either 'csv' or 'jsonl' only
        Returns:
        Throws:
        java.lang.Exception
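
TableDataWriteChannel is a java.nio WritableByteChannel, so rows are written to it as encoded bytes. The sketch below shows one way to write newline-terminated rows to any WritableByteChannel. The class name ChannelWriteSketch and the writeLine helper are hypothetical, and an in-memory channel stands in for the one returned by getTableWriterChannel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of BQHelper.
final class ChannelWriteSketch {
    /** Writes one row followed by a newline, encoded as UTF-8. */
    static void writeLine(WritableByteChannel channel, String line) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((line + "\n").getBytes(StandardCharsets.UTF_8));
        // A channel may accept fewer bytes than offered, so loop until drained.
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the BigQuery write channel.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (WritableByteChannel ch = Channels.newChannel(sink)) {
            writeLine(ch, "1,alice");
            writeLine(ch, "2,bob");
        }
        System.out.print(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With a real writer, closing the channel is what finalizes the load, after which the job can be checked as described above.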
      • getTildaBQSchema

        public static com.google.cloud.bigquery.Schema getTildaBQSchema​(java.lang.String SchemaName,
                                                                        java.lang.String TableViewName)
                                                                 throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getBQSchemaFromTilda

        public static com.google.cloud.bigquery.Schema getBQSchemaFromTilda​(java.lang.String SchemaName,
                                                                            java.lang.String TableViewName,
                                                                            java.lang.String outputMapName)
                                                                     throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getBQSchemaFromDB

        public static com.google.cloud.bigquery.Schema getBQSchemaFromDB​(Connection C,
                                                                         java.lang.String SchemaName,
                                                                         java.lang.String TableViewName)
                                                                  throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getBQSchemaFromMeta

        public static com.google.cloud.bigquery.Schema getBQSchemaFromMeta​(TableViewMeta tvm)
      • getSchemaColumns

        public static java.lang.String getSchemaColumns​(com.google.cloud.bigquery.Schema schema)
      • getBQTableSchema

        public static com.google.cloud.bigquery.Schema getBQTableSchema​(com.google.cloud.bigquery.BigQuery bq,
                                                                        java.lang.String datasetName,
                                                                        java.lang.String tableName)
      • copyComments

        public static void copyComments​(com.google.cloud.bigquery.BigQuery bq,
                                        java.lang.String srcDatasetName,
                                        java.lang.String srcTableName,
                                        java.lang.String destDatasetName,
                                        java.lang.String destTableName)
      • exportCSV

        public static void exportCSV​(com.google.cloud.bigquery.BigQuery bq,
                                     java.lang.String datasetName,
                                     java.lang.String tableName,
                                     BQHelper.BQCSVExporter BQE)
                              throws java.lang.Exception
        Throws:
        java.lang.Exception
      • createTable

        public static boolean createTable​(com.google.cloud.bigquery.BigQuery bq,
                                          java.lang.String datasetName,
                                          java.lang.String tableName,
                                          com.google.cloud.bigquery.Schema schema)
      • truncateTable_BE_CAREFUL_WITH_THIS_ONE

        public static boolean truncateTable_BE_CAREFUL_WITH_THIS_ONE​(com.google.cloud.bigquery.BigQuery bq,
                                                                     java.lang.String datasetName,
                                                                     java.lang.String tableName)
        Use this method carefully as it will truncate all contents of a table without an easy way to restore the data. Note that BigQuery keeps shadow versions of tables for up to 24 or 48h, so data could technically be recovered if you act quickly. See https://cloud.google.com/bigquery/docs/samples/bigquery-undelete-table for documentation.
        Parameters:
        bq -
        datasetName -
        tableName -
        Returns:
      • deleteTable_BE_CAREFUL_WITH_THIS_ONE

        public static boolean deleteTable_BE_CAREFUL_WITH_THIS_ONE​(com.google.cloud.bigquery.BigQuery bq,
                                                                   java.lang.String datasetName,
                                                                   java.lang.String tableName)
        Use this method carefully as it will delete/drop the table without an easy way to restore the data. Note that BigQuery keeps shadow versions of tables for up to 24 or 48h, so data could technically be recovered if you act quickly. See https://cloud.google.com/bigquery/docs/samples/bigquery-undelete-table for documentation.
        Parameters:
        bq -
        datasetName -
        tableName -
        Returns:
      • undeleteTable

        public static boolean undeleteTable​(com.google.cloud.bigquery.BigQuery bq,
                                            java.lang.String datasetName,
                                            java.lang.String tableName,
                                            long epochMillis)
        Undeletes a table based on a snapshot time. As per the documentation (https://cloud.google.com/bigquery/docs/samples/bigquery-undelete-table), one cannot undelete a table that has been re-created since. BQ will also track the most recent snapshot as per the milliseconds provided.
        Parameters:
        bq -
        datasetName -
        tableName -
        epochMillis -
        Returns:
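
The linked BigQuery sample addresses a snapshot by appending "@" and the epoch milliseconds to the table name. The sketch below shows how a caller might compute a snapshot time shortly before a deletion and what the resulting snapshot identifier looks like. The class and method names are hypothetical, and whether undeleteTable builds the identifier exactly this way is an assumption not confirmed by this page.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of BQHelper.
final class SnapshotTimeSketch {
    /** Epoch milliseconds for the moment that lies 'age' before 'reference'. */
    static long epochMillisBefore(Instant reference, Duration age) {
        return reference.minus(age).toEpochMilli();
    }

    /** Snapshot decorator form used by the BigQuery undelete sample: table@epochMillis. */
    static String snapshotTableId(String tableName, long epochMillis) {
        return tableName + "@" + epochMillis;
    }

    public static void main(String[] args) {
        // Example: the table was deleted a moment ago, so target 5 minutes back.
        long epochMillis = epochMillisBefore(Instant.now(), Duration.ofMinutes(5));
        System.out.println(snapshotTableId("my_table", epochMillis));
    }
}
```

The epochMillis value computed this way is what would be passed as the last argument of undeleteTable.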
      • tableExists

        public static boolean tableExists​(com.google.cloud.bigquery.BigQuery bq,
                                          java.lang.String datasetName,
                                          java.lang.String tableName)
      • getRepeatableFieldString

        public static java.lang.String[] getRepeatableFieldString​(com.google.cloud.bigquery.FieldValueList row,
                                                                  java.lang.String fielName)
                                                           throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getRepeatableFieldLocalDate

        public static java.time.LocalDate[] getRepeatableFieldLocalDate​(com.google.cloud.bigquery.FieldValueList row,
                                                                        java.lang.String fielName)
                                                                 throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getTimestampFieldAsLocalDate

        public static java.time.LocalDate getTimestampFieldAsLocalDate​(com.google.cloud.bigquery.FieldValueList row,
                                                                       java.lang.String fielName)
                                                                throws java.lang.Exception
        Throws:
        java.lang.Exception
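
The BigQuery Java client exposes TIMESTAMP values as microseconds since the Unix epoch (FieldValue.getTimestampValue()). The sketch below shows one way to turn such a value into a LocalDate. The class name is hypothetical, and the time zone is an explicit parameter because this page does not say which zone getTimestampFieldAsLocalDate uses.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration, not the BQHelper implementation.
final class TimestampToDateSketch {
    /** Converts microseconds since the Unix epoch to the calendar date in the given zone. */
    static LocalDate toLocalDate(long epochMicros, ZoneId zone) {
        Instant instant = Instant.EPOCH.plus(epochMicros, ChronoUnit.MICROS);
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        long micros = 1_700_000_000_000_000L; // 2023-11-14T22:13:20Z
        System.out.println(toLocalDate(micros, ZoneOffset.UTC));
        // The same instant falls on the next calendar day nine hours east of UTC.
        System.out.println(toLocalDate(micros, ZoneOffset.ofHours(9)));
    }
}
```

The choice of zone matters near midnight, so callers comparing dates across systems may want to confirm which zone the helper applies.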
      • createTable

        public static void createTable​(com.google.cloud.bigquery.Schema schema,
                                       Connection C)
        Creates a table in a JDBC destination (based on the Connection) from a BQ Schema definition.
        Parameters:
        schema -
        C -