HappyBase Table

Google Cloud Bigtable HappyBase table module.

class google.cloud.happybase.table.Table(name, connection)

Bases: object

Representation of Cloud Bigtable table.

Used for adding, reading, and deleting data.

Parameters:
  • name (str) – The name of the table.
  • connection (Connection) – The connection which has access to the table.
batch(timestamp=None, batch_size=None, transaction=False, wal=<object object>)

Create a new batch operation for this table.

This method returns a new Batch instance that can be used for mass data manipulation.

Parameters:
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch) that all mutations will be applied at.
  • batch_size (int) – (Optional) The maximum number of mutations to allow to accumulate before committing them.
  • transaction (bool) – Flag indicating if the mutations should be sent transactionally or not. If transaction=True and an error occurs while a Batch is active, then none of the accumulated mutations will be committed. If batch_size is set, the mutation can’t be transactional.
  • wal (object) – Unused parameter (to be passed to the created batch). Provided for compatibility with HappyBase, but irrelevant for Cloud Bigtable since it does not have a Write Ahead Log.
Return type:

Batch

Returns:

A batch bound to this table.
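The interaction between batch_size and transaction can be pictured with a small in-memory model. This is only a sketch of the accumulation rule described above, not the library's Batch class. The name SketchBatch, its pending and sent lists, the choice of ValueError, and counting each put as one mutation are all assumptions made for illustration.

```python
class SketchBatch:
    """Toy model of a batch: collects mutations and flushes them in groups."""

    def __init__(self, batch_size=None, transaction=False):
        if batch_size is not None and transaction:
            # The documentation states that a size-limited batch cannot be
            # transactional. The exception type is this sketch's choice.
            raise ValueError("batch_size and transaction are mutually exclusive")
        self.batch_size = batch_size
        self.transaction = transaction
        self.pending = []  # mutations accumulated so far
        self.sent = []     # each entry stands for one simulated request

    def put(self, row, data):
        # Simplification: every put counts as a single mutation.
        self.pending.append(("put", row, data))
        if self.batch_size is not None and len(self.pending) >= self.batch_size:
            self.send()

    def send(self):
        # Commit whatever has accumulated as one simulated request.
        if self.pending:
            self.sent.append(list(self.pending))
            self.pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None and self.transaction:
            # Transactional batch: an error discards all accumulated mutations.
            self.pending.clear()
        else:
            self.send()
        return False  # never swallow the exception
```

With batch_size=2, the third put in a row leaves one mutation pending and one simulated request of two mutations already sent.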

cells(row, column, versions=None, timestamp=None, include_timestamp=False)

Retrieve multiple versions of a single cell from the table.

Parameters:
  • row (str) – Row key for the row we are reading from.
  • column (str) – Column we are reading from; of the form fam:col.
  • versions (int) – (Optional) The maximum number of cells to return. If not set, returns all cells found.
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch). If specified, only cells with a timestamp at or before this value are returned.
  • include_timestamp (bool) – Flag to indicate if cell timestamps should be included with the output.
Return type:

list

Returns:

List of values in the cell (with timestamps if include_timestamp is True).
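How versions, timestamp, and include_timestamp combine can be sketched on an in-memory list of cells. This is an illustration only: select_cells, the (value, timestamp) pair layout, and the newest-first ordering are assumptions of the sketch, not documented behaviour of the library.

```python
def select_cells(cells, versions=None, timestamp=None, include_timestamp=False):
    """Filter a list of (value, timestamp_ms) pairs the way cells() is described.

    Hypothetical helper: it works on plain Python data and never contacts
    Cloud Bigtable.
    """
    # Assumption: results are ordered newest first.
    ordered = sorted(cells, key=lambda cell: cell[1], reverse=True)
    if timestamp is not None:
        # Keep only cells at or before the requested timestamp.
        ordered = [cell for cell in ordered if cell[1] <= timestamp]
    if versions is not None:
        # Cap the number of cells returned.
        ordered = ordered[:versions]
    if include_timestamp:
        return ordered
    return [value for value, _ in ordered]
```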

counter_dec(row, column, value=1)

Atomically decrement a counter column.

This method atomically decrements a counter column in row. If the counter column does not exist, it is automatically initialized to 0 before being decremented.

Parameters:
  • row (str) – Row key for the row we are decrementing a counter in.
  • column (str) – Column we are decrementing a value in; of the form fam:col.
  • value (int) – Amount to decrement the counter by. (If negative, this is equivalent to increment.)
Return type:

int

Returns:

Counter value after decrementing.

counter_get(row, column)

Retrieve the current value of a counter column.

This method retrieves the current value of a counter column. If the counter column does not exist, this function initializes it to 0.

Note

Application code should never store a counter value directly; use the atomic counter_inc() and counter_dec() methods for that.

Parameters:
  • row (str) – Row key for the row we are getting a counter from.
  • column (str) – Column we are reading the counter from; of the form fam:col.
Return type:

int

Returns:

Counter value (after initializing / incrementing by 0).

counter_inc(row, column, value=1)

Atomically increment a counter column.

This method atomically increments a counter column in row. If the counter column does not exist, it is automatically initialized to 0 before being incremented.

Parameters:
  • row (str) – Row key for the row we are incrementing a counter in.
  • column (str) – Column we are incrementing a value in; of the form fam:col.
  • value (int) – Amount to increment the counter by. (If negative, this is equivalent to decrement.)
Return type:

int

Returns:

Counter value after incrementing.
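The arithmetic of the counter methods (a missing counter starts at 0, and a negative amount reverses the direction) can be modelled with a plain dictionary. The sketch below captures only that arithmetic. It is not atomic and does not talk to Cloud Bigtable, and the names sketch_counter_inc, sketch_counter_dec, and store are hypothetical.

```python
def sketch_counter_inc(store, row, column, value=1):
    """Add value to the counter at (row, column), treating a missing one as 0."""
    key = (row, column)
    store[key] = store.get(key, 0) + value
    return store[key]


def sketch_counter_dec(store, row, column, value=1):
    """Decrementing is incrementing by the negated amount."""
    return sketch_counter_inc(store, row, column, -value)
```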

counter_set(row, column, value=0)

Set a counter column to a specific value.

Note

Be careful using this method. It can be useful for setting the initial value of a counter, but it defeats the purpose of using atomic increment and decrement.

Parameters:
  • row (str) – Row key for the row we are setting a counter in.
  • column (str) – Column we are setting a value in; of the form fam:col.
  • value (int) – Value to set the counter to.
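Cloud Bigtable's increment operation works on cells that hold an 8-byte big-endian signed integer, so a counter's initial value is normally written in that form. The section above does not state the encoding, so treat the following as a sketch under that assumption; pack_counter and unpack_counter are hypothetical helper names.

```python
import struct

# 64-bit signed integer, big-endian byte order.
_INT64 = struct.Struct(">q")


def pack_counter(value):
    """Encode an int as the 8 raw bytes stored in a counter cell."""
    return _INT64.pack(value)


def unpack_counter(raw):
    """Decode the 8 raw bytes of a counter cell back into an int."""
    return _INT64.unpack(raw)[0]
```

A value written in any other form (for example the text b"5") would not be a valid counter under this assumption.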
delete(row, columns=None, timestamp=None, wal=<object object>)

Delete data from a row in this table.

This method deletes the entire row if columns is not specified.

Note

This method will send a request with a single delete mutation. In many situations, batch() is a more appropriate method to manipulate data since it helps combine many mutations into a single request.

Parameters:
  • row (str) – The row key where the delete will occur.
  • columns (list) –

    (Optional) Iterable containing column names (as strings). Each column name can be either

    • an entire column family: fam or fam:
    • a single column: fam:col
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch) that the mutation will be applied at.
  • wal (object) – Unused parameter (to be passed to a created batch). Provided for compatibility with HappyBase, but irrelevant for Cloud Bigtable since it does not have a Write Ahead Log.
families()

Retrieve the column families for this table.

Return type:

dict

Returns:

Mapping from column family name to garbage collection rule for a column family.
put(row, data, timestamp=None, wal=<object object>)

Insert data into a row in this table.

Note

This method will send a request with a single “put” mutation. In many situations, batch() is a more appropriate method to manipulate data since it helps combine many mutations into a single request.

Parameters:
  • row (str) – The row key where the mutation will be “put”.
  • data (dict) – Dictionary containing the data to be inserted. The keys are column names (of the form fam:col) and the values are strings (bytes) to be stored in those columns.
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch) that the mutation will be applied at.
  • wal (object) – Unused parameter (to be passed to a created batch). Provided for compatibility with HappyBase, but irrelevant for Cloud Bigtable since it does not have a Write Ahead Log.
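Two details of these parameters are easy to get wrong: timestamps are integers in milliseconds (not seconds), and data maps fam:col names to byte strings. The helpers below are a minimal sketch of preparing both. The names to_millis and build_put_data are hypothetical, and encoding every value as UTF-8 text is an assumption of the sketch.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(moment):
    """Convert a timezone-aware datetime to integer milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - _EPOCH) // timedelta(milliseconds=1)


def build_put_data(family, fields):
    """Build a data dict with fam:col keys and UTF-8 encoded byte values."""
    return {
        f"{family}:{name}": str(value).encode("utf-8")
        for name, value in fields.items()
    }
```

The results would then be passed as the data and timestamp arguments of put().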
regions()

Retrieve the regions for this table.

Warning

Cloud Bigtable does not give information about how a table is laid out in memory, so this method does not work. It is provided simply for compatibility.

Raises:

NotImplementedError always
row(row, columns=None, timestamp=None, include_timestamp=False)

Retrieve a single row of data.

Returns the latest cell in each requested column (or in every column if columns is not specified). If timestamp is set, “latest” means the latest cell up to that timestamp.

Parameters:
  • row (str) – Row key for the row we are reading from.
  • columns (list) –

    (Optional) Iterable containing column names (as strings). Each column name can be either

    • an entire column family: fam or fam:
    • a single column: fam:col
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch). If specified, only cells with a timestamp before this value are returned.
  • include_timestamp (bool) – Flag to indicate if cell timestamps should be included with the output.
Return type:

dict

Returns:

Dictionary containing all the latest column values in the row.
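The three accepted column name forms (fam, fam:, and fam:col) can be told apart by splitting on the first colon. The helper below is a sketch of that interpretation; split_column is a hypothetical name, and using None to mean "the whole family" is a convention of the sketch.

```python
def split_column(name):
    """Split a column name into (family, qualifier).

    The qualifier is None when the name selects an entire column family,
    that is, for the forms "fam" and "fam:".
    """
    family, _, qualifier = name.partition(":")
    if not family:
        raise ValueError(f"column name {name!r} has no column family")
    return family, (qualifier or None)
```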

rows(rows, columns=None, timestamp=None, include_timestamp=False)

Retrieve multiple rows of data.

All optional arguments behave the same in this method as they do in row().

Parameters:
  • rows (list) – Iterable of the row keys for the rows we are reading from.
  • columns (list) –

    (Optional) Iterable containing column names (as strings). Each column name can be either

    • an entire column family: fam or fam:
    • a single column: fam:col
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch). If specified, only cells with a timestamp at or before this value are returned.
  • include_timestamp (bool) – Flag to indicate if cell timestamps should be included with the output.
Return type:

list

Returns:

A list of pairs, where the first element of each pair is the row key and the second is a dictionary of the filtered values for that row.

scan(row_start=None, row_stop=None, row_prefix=None, columns=None, timestamp=None, include_timestamp=False, limit=None, **kwargs)

Create a scanner for data in this table.

This method returns a generator that can be used for looping over the matching rows.

If row_prefix is specified, only rows with row keys matching the prefix will be returned. If row_prefix is given, row_start and row_stop cannot be used.

Note

Both row_start and row_stop can be None to specify the start and the end of the table respectively. If both are omitted, a full table scan is done. Note that this usually results in severe performance problems.

The keyword argument filter is also supported (beyond column and row range filters supported here). HappyBase / HBase users will have used this as an HBase filter string. (See the Thrift docs for more details on those filters.) However, Google Cloud Bigtable doesn’t support those filter strings so a RowFilter should be used instead.

The arguments batch_size, scan_batching and sorted_columns are allowed (as keyword arguments) for compatibility with HappyBase. However, they will not be used in any way, and will cause a warning if passed. (The batch_size determines the number of results to retrieve per request. The HBase scanner defaults to reading one record at a time, so this argument allows HappyBase to increase that number. However, the Cloud Bigtable API uses HTTP/2 streaming so there is no concept of a batched scan. The sorted_columns flag tells HBase to return columns in order, but Cloud Bigtable doesn’t have this feature.)

Parameters:
  • row_start (str) – (Optional) Row key where the scanner should start (includes row_start). If not specified, reads from the first key. If the table does not contain row_start, it will start from the next key after it that is contained in the table.
  • row_stop (str) – (Optional) Row key where the scanner should stop (excludes row_stop). If not specified, reads until the last key. The table does not have to contain row_stop.
  • row_prefix (str) – (Optional) Prefix to match row keys.
  • columns (list) –

    (Optional) Iterable containing column names (as strings). Each column name can be either

    • an entire column family: fam or fam:
    • a single column: fam:col
  • timestamp (int) – (Optional) Timestamp (in milliseconds since the epoch). If specified, only cells with a timestamp at or before this value are returned.
  • include_timestamp (bool) – Flag to indicate if cell timestamps should be included with the output.
  • limit (int) – (Optional) Maximum number of rows to return.
  • kwargs (dict) – Remaining keyword arguments. Provided for HappyBase compatibility.
Raises:

ValueError if limit is set but non-positive, or if row_prefix is used with row_start / row_stop; TypeError if a string filter is used.
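The row range rules (row_start inclusive, row_stop exclusive, row_prefix as a shorthand for a range, and a positive limit) can be sketched over a sorted in-memory list of row keys. This models only key selection and is not how the library performs a scan. The names prefix_to_range and sketch_scan are hypothetical, and the sketch raises ValueError for invalid arguments.

```python
import bisect


def prefix_to_range(prefix):
    """Return (start, stop) such that start <= key < stop matches the prefix.

    stop is None when the prefix has no upper bound (empty or all 0xff bytes).
    """
    stripped = prefix.rstrip(b"\xff")
    if not stripped:
        return prefix, None
    # Increment the last byte that can still be incremented.
    return prefix, stripped[:-1] + bytes([stripped[-1] + 1])


def sketch_scan(sorted_keys, row_start=None, row_stop=None,
                row_prefix=None, limit=None):
    """Yield the keys a scan would visit, given a sorted list of bytes keys."""
    if limit is not None and limit < 1:
        raise ValueError("limit must be positive")
    if row_prefix is not None:
        if row_start is not None or row_stop is not None:
            raise ValueError("row_prefix cannot be combined with row_start or row_stop")
        row_start, row_stop = prefix_to_range(row_prefix)
    # bisect_left on both ends gives an inclusive start and an exclusive stop.
    first = 0 if row_start is None else bisect.bisect_left(sorted_keys, row_start)
    last = len(sorted_keys) if row_stop is None else bisect.bisect_left(sorted_keys, row_stop)
    for count, key in enumerate(sorted_keys[first:last]):
        if limit is not None and count >= limit:
            return
        yield key
```

Because sketch_scan is a generator, its argument checks run when iteration begins rather than when it is called.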

google.cloud.happybase.table.make_ordered_row(sorted_columns, include_timestamp)

Make a row dict for sorted Thrift column results from scans.

Warning

This function is provided only for HappyBase compatibility and does not actually work.

Parameters:
  • sorted_columns (list) – List of TColumn instances from Thrift.
  • include_timestamp (bool) – Flag to indicate if cell timestamps should be included with the output.
Raises:

NotImplementedError always

google.cloud.happybase.table.make_row(cell_map, include_timestamp)

Make a row dict for a Thrift cell mapping.

Warning

This function is provided only for HappyBase compatibility and does not actually work.

Parameters:
  • cell_map (dict) – Dictionary with fam:col strings as keys and TCell instances as values.
  • include_timestamp (bool) – Flag to indicate if cell timestamps should be included with the output.
Raises:

NotImplementedError always