The Data Analyst's JSON Toolkit: Essential Skills and Resources
JSON has become unavoidable. APIs return it. Databases export it. Log files use it. Marketing platforms dump it. As a data analyst, you need to work with JSON whether you like it or not.
This guide covers everything you need: understanding the format, handling common patterns, and getting JSON data into your analysis tools efficiently.
Part 1: Understanding JSON (5-Minute Primer)
The Basic Structure
JSON is text that represents structured data. It has two main building blocks:
Objects (curly braces) - key-value pairs:
{"name": "Alice", "age": 30, "active": true}
Arrays (square brackets) - ordered lists:
["apple", "banana", "cherry"]
These can nest inside each other:
{
"user": {
"name": "Alice",
"orders": [
{"id": 1, "amount": 99.99},
{"id": 2, "amount": 149.99}
]
}
}
Reading JSON Like an Analyst
When you get a JSON file, you're looking for:
- The pattern - Is it an array of records? An object with nested data?
- The structure - How deep is the nesting?
- The data types - Numbers, strings, dates, nulls?
- The relationships - Are there IDs linking different parts?
Common JSON Patterns You'll Encounter
Pattern 1: Flat Array (Easy)
[
{"id": 1, "name": "Product A", "price": 29.99},
{"id": 2, "name": "Product B", "price": 49.99}
]
This maps directly to Excel rows.
Pattern 2: Nested Objects (Medium)
{
"data": [
{"id": 1, "user": {"name": "Alice", "email": "alice@example.com"}}
]
}
Need to flatten: user.name, user.email.
Pattern 3: Arrays Within Records (Harder)
{
"orders": [
{
"id": 1,
"items": [
{"product": "A", "qty": 2},
{"product": "B", "qty": 1}
]
}
]
}
One order has multiple items. Flattening creates items[0].product, items[1].product, etc.
Pattern 4: Mixed Nesting (Complex)
{
"results": {
"page": 1,
"total": 100,
"data": [
{"id": 1, "tags": ["urgent", "new"], "meta": {"source": "web"}}
]
}
}
Metadata at root level, arrays inside records, nested objects.
Part 2: Your JSON Toolkit
Tool 1: Quick Browser Conversion
For one-time conversions where speed matters:
JsonExport (jsonexport.com)
- Paste JSON, get Excel
- Auto-flattens nested structures
- Privacy-first (data stays in browser)
- Best for: Ad-hoc analysis, sensitive data
When to use: You have a JSON file, you need it in Excel, you want to start analyzing in 30 seconds.
Tool 2: Power Query (Excel Native)
For recurring data sources you'll access repeatedly:
Power Query (built into Excel)
- Import from file or URL
- Transform and reshape data
- Refresh to get updated data
- Best for: Weekly reports, API endpoints
When to use: Same data source every week, need one-click refresh.
Quick setup:
- Data → Get Data → From File → From JSON
- Expand nested columns in the editor
- Load to worksheet
- Click "Refresh All" to update later
Tool 3: Python + Pandas
For large files or automation:
import pandas as pd
# Read JSON
df = pd.json_normalize(data, record_path=['items'], meta=['order_id'])
# Export to Excel
df.to_excel('output.xlsx', index=False)
When to use: Files over 10MB, complex transformations, automated pipelines.
Tool 4: jq (Command Line)
For quick inspection and filtering:
# View structure
cat data.json | jq 'keys'
# Extract specific field
cat data.json | jq '.users[].email'
# Filter records
cat data.json | jq '.items | map(select(.price > 100))'
When to use: Quickly check structure, extract specific fields, command-line workflows.
Tool 5: Online Viewers
For visual exploration:
- JSON Viewer (Chrome extension) - Format JSON in browser
- JSON Crack - Visualize as graph
- JSONPath Online - Test extraction queries
When to use: Understanding unfamiliar JSON structure before conversion.
Part 3: Common Analyst Workflows
Workflow 1: API Export to Excel Report
Scenario: Marketing asks for a report from the HubSpot API.
Steps:
- Export data via API or dashboard (usually JSON)
- Open in browser or text editor to understand structure
- Use JsonExport to flatten and convert
- Build pivot tables in Excel
- Create charts for the report
Time: 10 minutes vs. 2 hours of manual copying.
Workflow 2: Database Dump Analysis
Scenario: Developer gives you a MongoDB export.
Steps:
- Check file size (determines tool choice)
- Under 50MB: JsonExport (in-browser)
- Over 100MB: Python or split file
- Flatten nested documents
- Analyze in Excel or BI tool
Key insight: MongoDB exports often have deeply nested structures. Auto-flattening saves hours.
Workflow 3: Log File Investigation
Scenario: Something broke, need to analyze error logs.
Steps:
- Export relevant log entries (usually JSON lines format)
- Convert each line to a record
- Flatten to Excel
- Filter by timestamp, error type, user
- Identify patterns
Tip: JSON Lines format (one JSON object per line) needs slight preprocessing—wrap in array brackets.
Workflow 4: Comparing Two Data Sources
Scenario: Reconcile API data with database export.
Steps:
- Export both sources
- Flatten both to Excel
- Use VLOOKUP or Power Query merge
- Identify mismatches
Key: Ensure both have a common ID field before joining.
Part 4: Handling Tricky Situations
Problem: Null Values
JSON uses null for missing data:
{"name": "Alice", "phone": null}
Excel behavior: Shows as empty cell (usually fine) or "null" text (needs cleaning).
Fix: In Excel, Find and Replace: null → (empty)
Problem: Date Formats
JSON dates are often ISO strings:
{"created": "2024-01-15T14:30:00.000Z"}
Excel behavior: May import as text.
Fix: =DATEVALUE(LEFT(A2, 10)) extracts just the date.
Problem: Arrays as Values
Some fields contain arrays:
{"tags": ["urgent", "review", "marketing"]}
After flattening: Becomes tags[0], tags[1], tags[2] columns.
Alternative: If you want one row per tag, use Power Query's "Expand to New Rows" option.
Problem: Inconsistent Structure
Real-world JSON often has inconsistent fields:
[
{"id": 1, "name": "A", "discount": 10},
{"id": 2, "name": "B"}
]
Result: Second record has no discount.
Handling: Most tools create the column, show blank for missing values. This is usually fine.
Problem: Double-Encoded Strings
Sometimes JSON contains escaped JSON:
{"data": "{\"name\": \"Alice\", \"age\": 30}"}
That data field is a string containing JSON, not actual JSON.
Fix: JsonExport auto-detects and unescapes these. Power Query needs a custom step.
Part 5: Best Practices
Before You Start
- Check the file size - Determines which tool to use
- Preview the structure - Understand nesting before converting
- Identify the key field - What uniquely identifies each record?
During Conversion
- Flatten thoughtfully - Sometimes you want separate tables, not one giant flat table
- Watch for arrays - Decide whether to expand columns or rows
- Validate a sample - Check a few records match the source
After Conversion
- Clean column headers - Rename
data.user.profile.nametouser_name - Check data types - Ensure numbers are numbers, dates are dates
- Document your process - Notes for next time or your replacement
Part 6: Building Your Skills
Level 1: Basic Competency
- Understand JSON syntax
- Use JsonExport or similar for conversions
- Build basic Excel reports from JSON data
Level 2: Intermediate
- Power Query for recurring sources
- Handle nested structures confidently
- Write basic Python for large files
Level 3: Advanced
- jq for command-line processing
- Automate with scripts
- Design data pipelines from JSON sources
Resources to Learn More
JSON Fundamentals:
- MDN Web Docs: JSON Guide
- JSON.org specification
Power Query:
- Microsoft's official Power Query documentation
- "M is for Data Monkey" (book)
Python for Data:
- pandas documentation (json_normalize especially)
- "Python for Data Analysis" by Wes McKinney
Quick Reference Card
When to Use What
| Situation | Tool |
|---|---|
| Quick one-time conversion | JsonExport |
| Repeating weekly report | Power Query |
| File over 100MB | Python |
| Sensitive data | JsonExport (client-side) |
| Complex transformations | Python |
| View structure quickly | JSON Viewer extension |
Common JSON Paths
.dataor.results- Often wraps the actual records.itemsor.records- Common array names.metaor.pagination- Metadata about the response.idor._id- Unique identifiers
Excel Formulas for Cleaned JSON Data
// Extract date from ISO string
=DATEVALUE(LEFT(A2, 10))
// Combine flattened name fields
=CONCAT(B2, " ", C2)
// Handle null text
=IF(A2="null", "", A2)
Conclusion
JSON doesn't have to be intimidating. With the right tools and a systematic approach, you can get any JSON data into your analysis workflow quickly.
Start simple: JsonExport for quick conversions. Learn Power Query for recurring sources. Add Python when you hit the limits.
The goal isn't to become a JSON expert—it's to spend less time on data wrangling and more time on actual analysis.
Related Guides: