Description
Using a Python gunicorn/Flask service to pull streaming data and publish it to Pub/Sub, running in a pod on GKE, I receive errors regardless of batch size settings (including the default).
With both the pubsub and pubsub_v1 PublisherClient, certain sets of data and default or larger batch sizes (>1000) produce:
"Retrying due to 504 Deadline Exceeded, sleeping 0.0s ..."
until, ultimately, I receive the following error in the done callback I added:
Deadline of 60.0s exceeded while calling functools.partial(<function _wrap_unary_errors.<locals>.error_remapped_callable at 0x7f869c25ae50>
In other cases there are no logs at all; the gunicorn worker just restarts silently. With smaller batch sizes (in the low hundreds), I typically see no indication of an error, but the worker still fails and restarts silently, and does so in under 60s.
This occurs regardless of whether I'm running many publish workers or just one.
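To show where that error surfaces: publisher.publish() returns a future, and the callback sees the failure only when it calls .result(). This is a minimal sketch of that pattern using a stdlib Future in place of the real publish future (the actual client needs credentials and a topic); the message ID and error text are made up for illustration.

```python
# Illustration of the done-callback pattern: a stdlib Future stands in
# for the future returned by publisher.publish().
from concurrent.futures import Future

results = []

def on_done(fut):
    # Publish futures resolve to a message ID on success; calling
    # .result() after a failure re-raises the exception (e.g. the
    # DeadlineExceeded seen in the logs above).
    try:
        results.append(("ok", fut.result()))
    except Exception as exc:
        results.append(("error", type(exc).__name__))

ok = Future()
ok.add_done_callback(on_done)
ok.set_result("msg-id-123")  # simulated successful publish

failed = Future()
failed.add_done_callback(on_done)
failed.set_exception(TimeoutError("Deadline of 60.0s exceeded"))  # simulated timeout

print(results)
```

With this wiring, a silent worker restart before the callback fires would explain the cases where no error is ever logged.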
The GKE cluster is on a shared private VPC and is using workload identity.
It does seem to occur only with specific datasets, but there is no clear reason why those datasets would cause an error like this.
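For reference, this is roughly how the publisher is configured (a configuration sketch, not the exact service code; the batch values shown are placeholders for the sizes mentioned above):

```python
from google.cloud import pubsub_v1

# Batch settings control when a batch is flushed; the errors occur
# with the defaults as well as with larger max_messages values.
batch_settings = pubsub_v1.types.BatchSettings(
    max_messages=1000,      # flush after this many messages
    max_bytes=1024 * 1024,  # ... or this many bytes
    max_latency=0.05,       # ... or this many seconds
)
publisher = pubsub_v1.PublisherClient(batch_settings=batch_settings)
topic_path = publisher.topic_path("my-project", "my-topic")  # placeholder names

future = publisher.publish(topic_path, data=b"payload")
future.add_done_callback(lambda f: print(f.exception() or f.result()))
```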
This is the output of pip3 freeze:
cachetools==4.1.0
certifi==2020.4.5.2
chardet==3.0.4
click==7.1.2
Flask==1.1.2
gevent==20.6.2
google-api-core==1.20.1
google-auth==1.17.2
google-cloud-bigquery==1.25.0
google-cloud-core==1.3.0
google-cloud-pubsub==1.6.0
google-resumable-media==0.5.1
googleapis-common-protos==1.52.0
greenlet==0.4.16
grpc-google-iam-v1==0.12.3
grpcio==1.29.0
gunicorn==20.0.4
idna==2.9
itsdangerous==1.1.0
Jinja2==2.11.2
MarkupSafe==1.1.1
protobuf==3.12.2
pyasn1==0.4.8
pyasn1-modules==0.2.8
pytz==2020.1
requests==2.23.0
rsa==4.6
six==1.15.0
urllib3==1.25.9
Werkzeug==1.0.1
zope.event==4.4
zope.interface==5.1.0