Back to Blog

The Data Analyst's JSON Toolkit: Essential Skills and Resources

JSON has become unavoidable. APIs return it. Databases export it. Log files use it. Marketing platforms dump it. As a data analyst, you need to work with JSON whether you like it or not.

This guide covers everything you need: understanding the format, handling common patterns, and getting JSON data into your analysis tools efficiently.

Part 1: Understanding JSON (5-Minute Primer)

The Basic Structure

JSON is text that represents structured data. It has two main building blocks:

Objects (curly braces) - key-value pairs:

{"name": "Alice", "age": 30, "active": true}

Arrays (square brackets) - ordered lists:

["apple", "banana", "cherry"]

These can nest inside each other:

{
  "user": {
    "name": "Alice",
    "orders": [
      {"id": 1, "amount": 99.99},
      {"id": 2, "amount": 149.99}
    ]
  }
}

Reading JSON Like an Analyst

When you get a JSON file, you're looking for:

  1. The pattern - Is it an array of records? An object with nested data?
  2. The structure - How deep is the nesting?
  3. The data types - Numbers, strings, dates, nulls?
  4. The relationships - Are there IDs linking different parts?

Common JSON Patterns You'll Encounter

Pattern 1: Flat Array (Easy)

[
  {"id": 1, "name": "Product A", "price": 29.99},
  {"id": 2, "name": "Product B", "price": 49.99}
]

This maps directly to Excel rows.

Pattern 2: Nested Objects (Medium)

{
  "data": [
    {"id": 1, "user": {"name": "Alice", "email": "alice@example.com"}}
  ]
}

Need to flatten: user.name, user.email.

Pattern 3: Arrays Within Records (Harder)

{
  "orders": [
    {
      "id": 1,
      "items": [
        {"product": "A", "qty": 2},
        {"product": "B", "qty": 1}
      ]
    }
  ]
}

One order has multiple items. Flattening creates items[0].product, items[1].product, etc.

Pattern 4: Mixed Nesting (Complex)

{
  "results": {
    "page": 1,
    "total": 100,
    "data": [
      {"id": 1, "tags": ["urgent", "new"], "meta": {"source": "web"}}
    ]
  }
}

Metadata at root level, arrays inside records, nested objects.


Part 2: Your JSON Toolkit

Tool 1: Quick Browser Conversion

For one-time conversions where speed matters:

JsonExport (jsonexport.com)

  • Paste JSON, get Excel
  • Auto-flattens nested structures
  • Privacy-first (data stays in browser)
  • Best for: Ad-hoc analysis, sensitive data

When to use: You have a JSON file, you need it in Excel, you want to start analyzing in 30 seconds.

Tool 2: Power Query (Excel Native)

For recurring data sources you'll access repeatedly:

Power Query (built into Excel)

  • Import from file or URL
  • Transform and reshape data
  • Refresh to get updated data
  • Best for: Weekly reports, API endpoints

When to use: Same data source every week, need one-click refresh.

Quick setup:

  1. Data → Get Data → From File → From JSON
  2. Expand nested columns in the editor
  3. Load to worksheet
  4. Click "Refresh All" to update later

Tool 3: Python + Pandas

For large files or automation:

import pandas as pd

# Read JSON
df = pd.json_normalize(data, record_path=['items'], meta=['order_id'])

# Export to Excel
df.to_excel('output.xlsx', index=False)

When to use: Files over 10MB, complex transformations, automated pipelines.

Tool 4: jq (Command Line)

For quick inspection and filtering:

# View structure
cat data.json | jq 'keys'

# Extract specific field
cat data.json | jq '.users[].email'

# Filter records
cat data.json | jq '.items | map(select(.price > 100))'

When to use: Quickly check structure, extract specific fields, command-line workflows.

Tool 5: Online Viewers

For visual exploration:

  • JSON Viewer (Chrome extension) - Format JSON in browser
  • JSON Crack - Visualize as graph
  • JSONPath Online - Test extraction queries

When to use: Understanding unfamiliar JSON structure before conversion.


Part 3: Common Analyst Workflows

Workflow 1: API Export to Excel Report

Scenario: Marketing asks for a report from the HubSpot API.

Steps:

  1. Export data via API or dashboard (usually JSON)
  2. Open in browser or text editor to understand structure
  3. Use JsonExport to flatten and convert
  4. Build pivot tables in Excel
  5. Create charts for the report

Time: 10 minutes vs. 2 hours of manual copying.

Workflow 2: Database Dump Analysis

Scenario: Developer gives you a MongoDB export.

Steps:

  1. Check file size (determines tool choice)
  2. Under 50MB: JsonExport (in-browser)
  3. Over 100MB: Python or split file
  4. Flatten nested documents
  5. Analyze in Excel or BI tool

Key insight: MongoDB exports often have deeply nested structures. Auto-flattening saves hours.

Workflow 3: Log File Investigation

Scenario: Something broke, need to analyze error logs.

Steps:

  1. Export relevant log entries (usually JSON lines format)
  2. Convert each line to a record
  3. Flatten to Excel
  4. Filter by timestamp, error type, user
  5. Identify patterns

Tip: JSON Lines format (one JSON object per line) needs slight preprocessing—wrap in array brackets.

Workflow 4: Comparing Two Data Sources

Scenario: Reconcile API data with database export.

Steps:

  1. Export both sources
  2. Flatten both to Excel
  3. Use VLOOKUP or Power Query merge
  4. Identify mismatches

Key: Ensure both have a common ID field before joining.


Part 4: Handling Tricky Situations

Problem: Null Values

JSON uses null for missing data:

{"name": "Alice", "phone": null}

Excel behavior: Shows as empty cell (usually fine) or "null" text (needs cleaning).

Fix: In Excel, Find and Replace: null → (empty)

Problem: Date Formats

JSON dates are often ISO strings:

{"created": "2024-01-15T14:30:00.000Z"}

Excel behavior: May import as text.

Fix: =DATEVALUE(LEFT(A2, 10)) extracts just the date.

Problem: Arrays as Values

Some fields contain arrays:

{"tags": ["urgent", "review", "marketing"]}

After flattening: Becomes tags[0], tags[1], tags[2] columns.

Alternative: If you want one row per tag, use Power Query's "Expand to New Rows" option.

Problem: Inconsistent Structure

Real-world JSON often has inconsistent fields:

[
  {"id": 1, "name": "A", "discount": 10},
  {"id": 2, "name": "B"}
]

Result: Second record has no discount.

Handling: Most tools create the column, show blank for missing values. This is usually fine.

Problem: Double-Encoded Strings

Sometimes JSON contains escaped JSON:

{"data": "{\"name\": \"Alice\", \"age\": 30}"}

That data field is a string containing JSON, not actual JSON.

Fix: JsonExport auto-detects and unescapes these. Power Query needs a custom step.


Part 5: Best Practices

Before You Start

  1. Check the file size - Determines which tool to use
  2. Preview the structure - Understand nesting before converting
  3. Identify the key field - What uniquely identifies each record?

During Conversion

  1. Flatten thoughtfully - Sometimes you want separate tables, not one giant flat table
  2. Watch for arrays - Decide whether to expand columns or rows
  3. Validate a sample - Check a few records match the source

After Conversion

  1. Clean column headers - Rename data.user.profile.name to user_name
  2. Check data types - Ensure numbers are numbers, dates are dates
  3. Document your process - Notes for next time or your replacement

Part 6: Building Your Skills

Level 1: Basic Competency

  • Understand JSON syntax
  • Use JsonExport or similar for conversions
  • Build basic Excel reports from JSON data

Level 2: Intermediate

  • Power Query for recurring sources
  • Handle nested structures confidently
  • Write basic Python for large files

Level 3: Advanced

  • jq for command-line processing
  • Automate with scripts
  • Design data pipelines from JSON sources

Resources to Learn More

JSON Fundamentals:

  • MDN Web Docs: JSON Guide
  • JSON.org specification

Power Query:

  • Microsoft's official Power Query documentation
  • "M is for Data Monkey" (book)

Python for Data:

  • pandas documentation (json_normalize especially)
  • "Python for Data Analysis" by Wes McKinney

Quick Reference Card

When to Use What

SituationTool
Quick one-time conversionJsonExport
Repeating weekly reportPower Query
File over 100MBPython
Sensitive dataJsonExport (client-side)
Complex transformationsPython
View structure quicklyJSON Viewer extension

Common JSON Paths

  • .data or .results - Often wraps the actual records
  • .items or .records - Common array names
  • .meta or .pagination - Metadata about the response
  • .id or ._id - Unique identifiers

Excel Formulas for Cleaned JSON Data

// Extract date from ISO string
=DATEVALUE(LEFT(A2, 10))

// Combine flattened name fields
=CONCAT(B2, " ", C2)

// Handle null text
=IF(A2="null", "", A2)

Conclusion

JSON doesn't have to be intimidating. With the right tools and a systematic approach, you can get any JSON data into your analysis workflow quickly.

Start simple: JsonExport for quick conversions. Learn Power Query for recurring sources. Add Python when you hit the limits.

The goal isn't to become a JSON expert—it's to spend less time on data wrangling and more time on actual analysis.


Related Guides: