The current signature on the Bigtable Table read_row() method has a default of filter_=None:
    def read_row(self, row_key, filter_=None):
        ...
When a cell value has been updated multiple times, this default returns the full time series, with a timestamp for each value, which can slow down reads in a non-obvious way.
In the current Python API the cells property on row_data (PartialRowData) returns a deep copy of the cells on every access, which compounds the performance issue.
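As a caller-side workaround, a filter can be passed explicitly so that only the newest version of each column comes back. The following is a minimal sketch, assuming the `CellsColumnLimitFilter` class from `google.cloud.bigtable.row_filters`; the helper name `read_row_latest` is hypothetical and is not part of the library.

```python
def read_row_latest(table, row_key, filter_=None):
    """Read one row, defaulting to the newest cell per column.

    Hypothetical helper: ``table`` is expected to expose
    ``read_row(row_key, filter_=...)`` like the Bigtable ``Table`` class.
    """
    if filter_ is None:
        # Imported lazily so the client library is only needed when the
        # default is actually used.
        from google.cloud.bigtable.row_filters import CellsColumnLimitFilter

        # Keep at most one cell (one version) per column.
        filter_ = CellsColumnLimitFilter(1)
    return table.read_row(row_key, filter_=filter_)
```

A caller that wants the full or partial time series would pass its own `filter_`, which the helper forwards unchanged.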
    @property
    def cells(self):
        """Property returning all the cells accumulated on this partial row.

        :rtype: dict
        :returns: Dictionary of the :class:`Cell` objects accumulated. This
                  dictionary has two-levels of keys (first for column families
                  and second for column names/qualifiers within a family). For
                  a given column, a list of :class:`Cell` objects is stored.
        """
        return copy.deepcopy(self._cells)
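The effect of the deep copy can be illustrated without a Bigtable connection. The sketch below uses a hypothetical stand-in class, `FakeRowData`, that mirrors only the property shown above; it is not the library's class, and the cell contents are made-up placeholder bytes.

```python
import copy


class FakeRowData:
    """Stand-in for PartialRowData; mirrors only the deep-copying property."""

    def __init__(self, cells):
        self._cells = cells

    @property
    def cells(self):
        return copy.deepcopy(self._cells)


row = FakeRowData({"cf1": {b"col": [b"v3", b"v2", b"v1"]}})

# Every attribute access builds a fresh copy of the whole nested structure.
first = row.cells
second = row.cells
copies_are_distinct = first is not second

# Binding the property once avoids repeated copies when reading many columns.
cells = row.cells
newest = cells["cf1"][b"col"][0]
```

Code that writes `row.cells[...]` inside a loop pays for one full copy per iteration, and the cost grows with the number of versions stored per cell.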
Consider:
- Giving read_row() a default filter that retrieves only the most recent value of each cell unless the full or partial time series is requested.
- Allowing a ColumnFamily to implicitly or explicitly limit cells to only one value (no time series).
- Adding a cell_value(column_family_id, column, index=0) method to row_data (PartialRowData) to allow more efficient retrieval of a single cell value.
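The last suggestion could look roughly like the sketch below. `RowDataSketch` and the `Cell` namedtuple are hypothetical stand-ins for PartialRowData and the library's Cell class, and the sketch assumes each column's list is ordered newest first, so that `index=0` is the most recent value.

```python
from collections import namedtuple

# Stand-in for the library's Cell class; only the fields used here.
Cell = namedtuple("Cell", ["value", "timestamp"])


class RowDataSketch:
    """Stand-in for PartialRowData holding {family: {qualifier: [Cell, ...]}}."""

    def __init__(self, cells):
        self._cells = cells

    def cell_value(self, column_family_id, column, index=0):
        """Return one cell's value without copying the whole row.

        Raises KeyError for an unknown family or column and IndexError
        when ``index`` is past the number of stored versions.
        """
        return self._cells[column_family_id][column][index].value


row = RowDataSketch({"cf1": {b"col": [Cell(b"new", 2), Cell(b"old", 1)]}})
latest = row.cell_value("cf1", b"col")
previous = row.cell_value("cf1", b"col", index=1)
```

Because the lookup goes straight to one list element, its cost does not depend on how many other columns or versions the row holds.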