JSON to BigQuery Schema Generator.
Generate BigQuery table schemas and dbt model stubs from sample JSON. See field counts, nested record totals, and copy schema JSON — all client-side, free to use.
Options
Paste JSON to generate schema
Provide a JSON object, array of objects, or JSONL sample. Press ⌘↵ to regenerate, then switch to the dbt model tab for a staging SQL stub.
How to use the JSON to BigQuery Schema Generator
- Paste a representative JSON object or array of records so the generator can infer BigQuery column types from real sample values.
- Review field counts and nested record totals, then toggle strict nulls or descriptions to match your table contract.
- Switch between BigQuery schema JSON and a dbt staging model stub when you need warehouse DDL or transformation scaffolding.
- Copy or download the output for CREATE TABLE statements, streaming inserts, or dbt project bootstrap work.
What is BigQuery Schema Generation?
BigQuery requires a specific JSON schema format to define table structures before load jobs, streaming inserts, or dbt models can run reliably. This tool analyzes your JSON data and generates the correct BigQuery schema with proper data types such as STRING, INTEGER, FLOAT, BOOLEAN, TIMESTAMP, and RECORD, along with field modes like NULLABLE, REQUIRED, and REPEATED. It handles complex payloads with nested objects and arrays, automatically inferring whether arrays contain primitives or nested records. The generator also produces a basic dbt model stub so analytics engineers can start staging transformations without hand-writing every column. All processing happens in your browser, which means sensitive sample payloads never leave your machine.
Frequently asked questions
What BigQuery data types are supported?+
The generator supports all common BigQuery types including STRING, INTEGER, FLOAT, BOOLEAN, TIMESTAMP, DATE, TIME, BYTES, and RECORD for nested objects. Types are inferred from JSON values in your sample, so strings that look like timestamps become TIMESTAMP and integer-like numbers become INTEGER. When samples disagree, the tool surfaces conflicts and lets you choose a resolution strategy before finalizing schema output.
How does it handle nested objects and arrays?+
Nested objects are emitted as RECORD fields with their own nested fields array, preserving structure for BigQuery table definitions and downstream SQL. Arrays of primitives use REPEATED mode on the inferred scalar type, while arrays of objects become REPEATED RECORD nodes with child fields. The field and nested-record counters help you estimate schema complexity before deploying to production datasets.
What's the difference between NULLABLE and REQUIRED?+
NULLABLE allows null or missing values in a column, while REQUIRED means every row must provide a non-null value for that field. With strict nulls enabled, fields that never contain null across your analyzed records are marked REQUIRED, which is useful when sample JSON closely matches production completeness. Use REQUIRED carefully because BigQuery rejects inserts that violate mode constraints.
Can I use this for BigQuery streaming inserts?+
Yes. The generated schema works for batch loads, streaming inserts, and schema evolution discussions with data engineering teams. Ensure your production payloads follow the same nesting and repeated-field patterns as the sample JSON you analyzed here. If upstream APIs change shape frequently, regenerate from fresh samples and compare with JSON Diff before altering live tables.
What does the dbt model stub include?+
The dbt stub is a starter staging model with source(), renamed CTE, and select lines mapped from inferred schema fields, including basic casts for TIMESTAMP and DATE columns. It is intentionally generic so you can replace source names, materialization, and repeated-field unnest logic for your project conventions. Treat it as scaffolding rather than production-ready SQL without review.
What if my JSON has inconsistent types?+
The generator detects hard type conflicts across records, such as a field being an object in one row and a string in another, then prompts you to resolve them before output is finalized. You can cast conflicting fields to STRING for maximum compatibility or keep RECORD and ignore invalid entries depending on data quality expectations. Warnings also highlight softer issues like mixed timestamp and plain string values.