Ultra-fast streaming JSON Lines (JSONL/NDJSON) for Python.
```shell
pip install fastjsonl

# For zstd support:
pip install "fastjsonl[zstd]"
```
Features:

- Transparent streaming compression, auto-detected by extension: `.gz`, `.bz2`, `.xz`, `.lzma`, and `.zst` (zstd gives the best speed/compression ratio; requires the `zstandard` extra)
- Full orjson option flags (e.g. pretty-print, sort keys, numpy/UUID/dataclass support)
- A simple `load()` / `dump()` API
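To illustrate how extension-based auto-detection can work, here is a simplified sketch (not fastjsonl's actual internals; `open_stream` is a hypothetical helper name):

```python
import bz2
import gzip
import lzma
from pathlib import Path

def open_stream(path, mode="rt"):
    """Open a file, transparently (de)compressing based on its suffix."""
    openers = {
        ".gz": gzip.open,
        ".bz2": bz2.open,
        ".xz": lzma.open,
        ".lzma": lzma.open,
    }
    suffix = Path(path).suffix
    if suffix == ".zst":
        # zstandard is an optional dependency: pip install "fastjsonl[zstd]"
        import zstandard
        return zstandard.open(path, mode)
    # Fall back to plain open() for uncompressed files
    return openers.get(suffix, open)(path, mode)
```

Because every opener shares the same file-like interface, the rest of the reader/writer code never needs to know which compression is in play.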
pip install fastjsonl
# For zstd support:
pip install fastjsonl[zstd]
```
fastjsonl/
├── src/                    # Source code (src layout prevents accidental imports during dev)
│   └── fastjsonl/
│       ├── __init__.py
│       ├── __version__.py  # Single source of truth: __version__ = "0.1.0"
│       └── core.py         # FastJSONL class + load/dump functions
├── tests/                  # pytest suite
│   ├── test_load.py
│   └── test_dump.py
├── benchmarks/             # asv or simple scripts comparing to orjsonl, jsonlines, manual orjson
├── .github/
│   └── workflows/
│       ├── ci.yml          # Tests + lint on push/PR
│       └── release.yml     # Build & publish on tags
├── pyproject.toml          # All config: metadata, build-system, deps, tools
├── README.md               # Detailed usage, benchmarks, install, why faster
├── LICENSE                 # MIT (common for perf libs) or Apache-2.0
├── CHANGELOG.md            # Keep a Changelog style or conventional commits
├── .gitignore              # Standard Python + Rust/C extensions if any
└── MANIFEST.in             # If needed for non-Python files (rare with pyproject.toml)
```
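A minimal `pyproject.toml` along these lines could tie the layout together (illustrative values only; the build backend and version path are assumptions, and the `zstd` optional-dependency name matches the install extra above):

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "fastjsonl"
dynamic = ["version"]
description = "Ultra-fast streaming JSON Lines (JSONL/NDJSON) for Python"
license = "MIT"
requires-python = ">=3.9"
dependencies = ["orjson"]

[project.optional-dependencies]
zstd = ["zstandard"]

[tool.hatch.version]
path = "src/fastjsonl/__version__.py"
```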
```python
from fastjsonl import load, dump
import orjson

# Read compressed streaming (auto-detects .gz/.zst/etc.)
for record in load("huge_logs.jsonl.zst"):
    print(record["timestamp"], record["level"])

# Write with zstd level 5 (good speed/ratio balance)
data = [{"id": i, "value": f"test_{i}"} for i in range(100_000)]
# Note: OPT_INDENT_2 would break the one-record-per-line format, so use
# line-safe flags such as OPT_SORT_KEYS when writing JSONL.
dump(data, "output.jsonl.zst", compression="zstd", level=5, option=orjson.OPT_SORT_KEYS)
```
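For intuition, a stripped-down `load()`/`dump()` pair is just a generator over parsed lines plus a compact writer. This sketch uses the stdlib `json` for portability (the real library would use orjson) and skips compression handling; `load_plain` and `dump_plain` are hypothetical names:

```python
import json

def load_plain(path):
    """Minimal JSONL reader: yield one parsed record per non-empty line."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank or trailing lines
                yield json.loads(line)

def dump_plain(records, path):
    """Minimal JSONL writer: one compact JSON document per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, separators=(",", ":")) + "\n")
```

Streaming via a generator is what keeps memory flat: only one record is ever materialized at a time, no matter how large the file.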
A hypothetical APL-inspired mode could expose columnar, vectorized access on top of the same files:

```python
from fastjsonl import load_apl  # hypothetical APL-inspired mode

data = load_apl("huge.jsonl.zst")         # returns an "APL-like array" proxy (lazy, chunked)
timestamps = data["timestamp"]            # array select, no loop
high_values = data[data["value"] > 1000]  # vectorized filter
avg = data["value"].sum() / len(data)     # APL-style reduction: (+/values) ÷ ⍴values
```
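Such a mode could be backed by a columnar (dict-of-lists) layout. A toy sketch, with all names hypothetical and eager rather than lazy/chunked:

```python
class ColumnView:
    """Toy columnar proxy illustrating the hypothetical APL-like mode."""

    def __init__(self, records):
        # Pivot list-of-dicts into dict-of-lists (columnar layout)
        self.columns = {k: [r[k] for r in records] for k in records[0]}

    def __getitem__(self, key):
        if isinstance(key, str):
            return self.columns[key]  # whole-column select, no loop in user code
        # Otherwise treat key as a boolean mask and filter row-wise
        return ColumnView([
            {k: col[i] for k, col in self.columns.items()}
            for i, keep in enumerate(key) if keep
        ])

records = [{"id": i, "value": i * 10} for i in range(5)]
view = ColumnView(records)
mask = [v > 20 for v in view["value"]]
filtered = view[mask]
print(filtered["value"])  # [30, 40]
```

The columnar pivot is the key design choice: selections and reductions then run over contiguous lists instead of re-walking every record dict.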