Closed
Description
Is your feature request related to a problem? Please describe.
When you want to get all rows in a table (or just a preview), it's more expensive to run a `SELECT *` query than the alternatives.
Describe the solution you'd like
- If there is no whitespace in the query text, assume it's a table ID. Call `list_rows` (possibly with the BigQuery Storage API option, if that's set).
- To allow for a preview, add a `--max_results` option to the `%%bigquery` magic. If not set, get all rows. If set, download at most `max_results` rows.
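A rough sketch of the dispatch I have in mind, not a proposed implementation. The helper names `is_table_id` and `run_cell` are hypothetical, and stripping leading/trailing whitespace before the check is my assumption (cell bodies usually end with a newline). `client` is expected to behave like a `google.cloud.bigquery.Client`, whose `list_rows` accepts a table ID string and a `max_results` argument.

```python
import re


def is_table_id(query_text: str) -> bool:
    """Heuristic from this issue: no whitespace in the text means table ID.

    Assumption: surrounding whitespace is stripped first, since a cell
    body normally ends with a newline.
    """
    stripped = query_text.strip()
    return bool(stripped) and re.search(r"\s", stripped) is None


def run_cell(client, query_text, max_results=None):
    """Hypothetical dispatcher for the %%bigquery magic.

    `client` is assumed to be a google.cloud.bigquery.Client (or any
    object with compatible `list_rows` and `query` methods).
    """
    text = query_text.strip()
    if is_table_id(text):
        # No query job is created; rows are read directly from the table.
        return client.list_rows(text, max_results=max_results)
    # Anything containing whitespace is treated as a normal query.
    return client.query(text).result(max_results=max_results)
```

With `max_results=None` both branches return all rows, which matches the "if not set, get all rows" behavior described above.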
Describe alternatives you've considered
- Add a special `%%bigquery preview` command. I prefer checking whether the query text is just a table name, as that will be more consistent with pandas-gbq. Also, we'd prefer to limit the `%%bigquery` magic to just "queries". I think entering just a table ID as query text is clearly different from `SELECT *` (an anti-pattern), while the intention remains clear.
- Client-side query processing to automatically detect `SELECT * [LIMIT N]` queries. I think it could make sense to parse queries client-side in just the `%%bigquery` magic (or opt in via a `QueryJobConfig` option), but this is much more complex than checking whether there is any whitespace in the query text.
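To illustrate why the client-side detection alternative is more involved, here is a minimal regex-based sketch. The function name `parse_select_star` and the pattern are my own illustration, not existing library code. It handles only the simplest form of the query and ignores comments, multiple statements, and anything beyond an optional `LIMIT`.

```python
import re

# Matches only: SELECT * FROM <table> [LIMIT N] [;]
# The table may be wrapped in backticks. This is a simplification,
# not a real SQL parser.
_SELECT_STAR = re.compile(
    r"""^\s*SELECT\s+\*\s+FROM\s+
        `?(?P<table>[\w.\-]+)`?
        (?:\s+LIMIT\s+(?P<limit>\d+))?
        \s*;?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_select_star(sql):
    """Return (table_id, limit) for a bare SELECT * query, else None."""
    match = _SELECT_STAR.match(sql)
    if match is None:
        return None
    limit = match.group("limit")
    return match.group("table"), (int(limit) if limit is not None else None)
```

Even this narrow case needs care around quoting and trailing semicolons, whereas the whitespace check is a one-liner.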
Additional context
Related feature requests in pandas-gbq:
- List rows without a query: "ENH: Read BigQuery query table without a query" (python-bigquery-pandas#266)
- Limit rows returned: "Add option for limiting rows of retrieved of results" (python-bigquery-pandas#102)