BigQuery: load_table_from_dataframe Not working with string columns #9007

Closed
@milonimrod

Environment details

Running on Ubuntu 16.04 with Python 2.7.14 (google-cloud-bigquery==1.17.0)

Code example

I'm running the code example shown here.

I created the following table:
(screenshot of the table schema)

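import pandas as pd
from google.cloud import bigquery

# Assumed setup: the issue does not show how bq_client was created.
bq_client = bigquery.Client()
dataset_ref = bq_client.dataset('rwr')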
table_ref = dataset_ref.table("monty_python")
records = [
    {"title": "The Meaning of Life", "release_year": 1983},
    {"title": "Monty Python and the Holy Grail", "release_year": 1975},
    {"title": "Life of Brian", "release_year": 1979},
    {"title": "And Now for Something Completely Different", "release_year": 1971},
]
# Optionally set explicit indices.
# If indices are not specified, a column will be created for the default
# indices created by pandas.
index = ["Q24980", "Q25043", "Q24953", "Q16403"]
dataframe = pd.DataFrame(records, index=pd.Index(index, name="wikidata_id"))

job = bq_client.load_table_from_dataframe(dataframe, table_ref, location="US")

job.result()  # Waits for table load to complete.

Inserting the same rows with the following code works:

bq_client.insert_rows_json(table_ref, dataframe.to_dict(orient='records'))

Stack trace

---------------------------------------------------------------------------
BadRequest                                Traceback (most recent call last)
<ipython-input-105-5021150a87a3> in <module>()
     15 job = bq_client.load_table_from_dataframe(dataframe, table_ref, location="US")
     16 
---> 17 job.result()  # Waits for table load to complete.

/home/nimrodm/miniconda/envs/garage/lib/python2.7/site-packages/google/cloud/bigquery/job.pyc in result(self, timeout, retry)
    731             self._begin(retry=retry)
    732         # TODO: modify PollingFuture so it can pass a retry argument to done().
--> 733         return super(_AsyncJob, self).result(timeout=timeout)
    734 
    735     def cancelled(self):

/home/nimrodm/miniconda/envs/garage/lib/python2.7/site-packages/google/api_core/future/polling.pyc in result(self, timeout)
    125             # pylint: disable=raising-bad-type
    126             # Pylint doesn't recognize that this is valid in this case.
--> 127             raise self._exception
    128 
    129         return self._result

BadRequest: 400 Provided Schema does not match Table tr:rwr.monty_python. Field title has changed type from STRING to BYTES
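
The error reports that the title field changed from STRING to BYTES. On Python 2.7, plain "..." literals are byte strings, so a likely cause is that the title column is serialized as bytes during the load. The sketch below is one possible workaround, not a confirmed fix: it decodes byte-string values to text before the DataFrame is built. The helper name `decode_text_fields` is hypothetical and is not part of google-cloud-bigquery.

```python
def decode_text_fields(records, encoding="utf-8"):
    """Return copies of the record dicts with byte-string values decoded to text.

    Hypothetical helper for illustration; not a google-cloud-bigquery API.
    """
    decoded = []
    for record in records:
        decoded.append({
            # On Python 2, `bytes` is an alias of `str`, so plain literals match here.
            key: value.decode(encoding) if isinstance(value, bytes) else value
            for key, value in record.items()
        })
    return decoded


# Byte-string titles stand in for Python 2 plain "..." literals.
records = [
    {"title": b"The Meaning of Life", "release_year": 1983},
    {"title": b"Life of Brian", "release_year": 1979},
]
clean_records = decode_text_fields(records)
```

The resulting `clean_records` list can be passed to `pd.DataFrame(...)` in place of `records`, and the rest of the example stays the same. Writing the literals as `u"..."` should have the same effect. Another option that may apply in later library versions is passing an explicit schema through `bigquery.LoadJobConfig`; that is not verified here against 1.17.0.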

Labels

api: bigquery, type: docs
