Connection fields
| Field | Required | Description |
|---|---|---|
projectId | yes | The Google Cloud project ID that holds your BigQuery datasets |
datasetId | yes | The default dataset for the connection (definitions can override this in SQL) |
serviceAccountJson | yes | A service-account key JSON, pasted in full |
Creating a service account
- In the Google Cloud console, open IAM & Admin → Service Accounts for the project that holds your data.
- Create service account. Name it something descriptive —
traffical-readerworks. - Grant the roles below.
- Open the new service account, go to Keys → Add key → JSON. Download the file.
- Paste the entire JSON into the Traffical connection form.
IAM roles
The pipeline needs read access to your source data plus read + write access to a Traffical-owned dataset where it creates intermediate tables.| Scope | Role | Why |
|---|---|---|
| Project | roles/bigquery.jobUser | Run query jobs |
| Source datasets | roles/bigquery.dataViewer | Read your assignment / fact tables |
| Traffical dataset | roles/bigquery.dataEditor | Create, write, delete, and drop tables inside the Traffical dataset |
dataViewer only on the specific source datasets, and dataEditor only on the Traffical dataset:
Cost
BigQuery charges per byte scanned. The pipeline reportsbytesScanned for every query — visible in Pipeline → Runs in the dashboard. Some practical guidance:
- Always scope by date in your fact/assignment definitions. Use
{{start_date}}and{{end_date}}to restrict scans to the pipeline’s window. AWHERE event_time BETWEEN '{{start_date}}' AND '{{end_date}}'on a partitioned table scans only the relevant partitions. - Partition on time. A partitioned table is dramatically cheaper to scan when the pipeline filters by
event_timeorassigned_at. - Cluster on user/entity columns. Clustering on
user_idmakes joins between assignments and facts cheaper. - Avoid
SELECT *. BigQuery is columnar — project only the columns you need.
Recommended schema layout
A typical assignment table:Example definitions
Assignment definition SQL:Network access
BigQuery is HTTPS-only and Traffical reaches it directly via the BigQuery REST API. No firewall or peering setup is needed for the standard configuration. If your BigQuery is restricted via VPC Service Controls, you’ll need to add Traffical’s egress IPs to the perimeter ingress policy. The IPs are listed in Settings → Warehouse.Gotchas
- Service account expiry. Service account JSON keys don’t expire automatically, but rotating them periodically is a good practice. After rotation, paste the new JSON into the connection and re-save.
- Multi-region datasets. If your dataset is multi-region (
EU,US), the pipeline still works — queries run in the dataset’s region. - Slot reservations. If you use BigQuery slot reservations rather than on-demand billing, Traffical’s queries consume slots from whatever reservation the service account inherits.
- Streaming buffer. Recently-inserted streaming rows may take a few minutes to appear in queries. If your application writes events to BigQuery and the pipeline runs immediately afterwards, those events might be missed in the first run. The next run picks them up.